
PySpark - Senior Engineer @ Iris Software


Job Description

  • Strong proficiency in Python and PySpark/Apache Spark
  • Solid understanding of RDDs, Spark SQL, and Spark performance tuning
  • Experience in writing optimized ETL/ELT pipelines
  • Experience with SQL and relational databases (PostgreSQL, MySQL, Oracle, etc.)
  • Exposure to Big Data ecosystems (Hadoop, Hive, HDFS)
  • Familiarity with batch and streaming data processing
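As a rough illustration of the Spark SQL and performance-tuning skills listed above, a minimal sketch (table name, paths, and the 128 MB-per-partition rule of thumb are illustrative assumptions, not from the posting):

```python
def target_partitions(total_bytes, partition_bytes=128 * 1024 * 1024):
    """Rule-of-thumb partition count: aim for roughly 128 MB per output file."""
    return max(1, -(-total_bytes // partition_bytes))  # ceiling division


def run_sample_job(spark, input_path, output_path, input_size_bytes):
    """Illustrative Spark SQL + tuning steps; assumes an existing SparkSession."""
    df = spark.read.parquet(input_path)
    df.createOrReplaceTempView("events")  # hypothetical table name
    daily = spark.sql(
        "SELECT event_date, COUNT(*) AS n FROM events GROUP BY event_date"
    )
    daily = daily.cache()  # cache an intermediate result that is reused downstream
    # Control the number of output files to avoid many tiny partitions.
    daily.coalesce(target_partitions(input_size_bytes)) \
         .write.mode("overwrite").parquet(output_path)
```

`target_partitions` is a plain heuristic; in practice partition sizing also depends on cluster cores and `spark.sql.shuffle.partitions`.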
Good to Have
  • AWS / Azure / GCP (preferred)
    • AWS services such as S3, EMR, Glue, Redshift
  • Version control using Git
  • Experience with CI/CD pipelines
  • Basic familiarity with Docker and workflow schedulers (Airflow preferred)
  • Knowledge of Databricks
Responsibilities:
  • Design, develop, and maintain data pipelines using PySpark and Python.
  • Process and transform large structured and unstructured datasets in distributed environments.
  • Optimize Spark jobs for performance, scalability, and reliability.
  • Develop reusable data transformation frameworks and utilities.
  • Integrate data from multiple sources including relational, NoSQL, and streaming systems.
  • Perform data quality checks, validations, and error handling.
  • Collaborate with data analysts, data scientists, and upstream/downstream teams.
  • Support deployment and monitoring of data pipelines in production environments.
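The data-quality responsibility above can be sketched framework-agnostically. The column names and null-rate threshold below are illustrative assumptions; in PySpark the same checks would typically run as DataFrame filters (e.g. `df.filter(col("id").isNull()).count()`):

```python
def null_rate(rows, column):
    """Fraction of rows where `column` is missing (None). `rows` are dicts."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)


def validate_batch(rows, required=("id", "event_date"), max_null_rate=0.01):
    """Return a list of failed checks; an empty list means the batch passes."""
    failures = []
    for col in required:
        rate = null_rate(rows, col)
        if rate > max_null_rate:
            failures.append(f"{col}: null rate {rate:.2%} exceeds {max_null_rate:.2%}")
    return failures
```

A pipeline would call `validate_batch` before writing, routing failing batches to an error path instead of silently loading them.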
Mandatory Competencies
  • Big Data - PySpark
  • Data Science and Machine Learning - Apache Spark
  • Programming Language - Python - Apache Airflow
  • Database - PostgreSQL
  • Big Data - Hive
  • Big Data - Hadoop
  • Big Data - HDFS
  • Database Programming - SQL
  • Behavioral - Communication and collaboration
 

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Platform Engineer
Employment Type: Full time

Contact Details:

Company: Iris Software
Location(s): Noida, Gurugram



Keyskills: PySpark, Hive, data pipelines, SQL, Docker, Git, PostgreSQL, data science, Spark, GCP, Apache Spark, MySQL, Hadoop, big data, ETL, Apache Airflow, Azure, S3, Python, Oracle, streaming data, machine learning, ELT, NoSQL, data quality, AWS


Salary: ₹ Not Disclosed


Iris Software Inc.