
Azure Data Engineer @ Infogain

 Azure Data Engineer

Job Description

  • Analyze existing Hadoop, Pig, and Spark scripts from Dataproc and refactor them into Databricks-native PySpark.
  • Implement data ingestion and transformation pipelines using Delta Lake best practices.
  • Apply conversion rules and templates for automated code migration and testing.
  • Conduct data validation between legacy and migrated environments (schema, count, and data-level checks).
  • Collaborate on developing AI-driven tools for code conversion, dependency extraction, and error remediation.
  • Ensure best practices for code versioning, error handling, and performance optimization.
  • Participate in UAT, troubleshooting, and post-migration validation activities.
Technical Skills
  • Core: Python, PySpark, SQL
  • Databricks: Delta Lake, Unity Catalog, Databricks Workflows, MLflow (basic understanding)
  • GCP: Dataproc, BigQuery, GCS, Composer/Airflow, Cloud Functions
  • Data Engineering: Hadoop, Hive, Pig, Spark SQL
  • Automation: Experience with migration utilities or AI-assisted code transformation tools
  • CI/CD: Git, Jenkins, Terraform (preferred)
  • Validation: Data comparison utilities (Delta-to-Delta, DataFrame diffing, schema validation)
Preferred Experience
  • 5-8 years in data engineering or big data application development.
  • Hands-on experience migrating Spark or Hadoop workloads to Databricks.
  • Familiarity with Delta architecture, data quality frameworks, and GCP cloud integration.
  • Exposure to GenAI-based tools for automation or code refactoring is a plus.
Experience
  • 6-8 Years

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time

Contact Details:

Company: Infogain
Location(s): Noida, Gurugram



Key skills: hive, continuous integration, python, scala, pyspark, microsoft azure, apache pig, code versioning tools, machine learning, data engineering, application development, sql, dataproc, gen, git, automation, spark, gcp, jenkins, terraform, hadoop, bigquery, sqoop, big data


Salary: ₹ Not Disclosed

Infogain

Infogain is a Silicon Valley-headquartered company with software platform engineering and deep domain expertise in the travel, retail, insurance, and high-technology industries. We accelerate the delivery of digital customer engagement systems using digital technologies such as cloud, mic...