Design and implement ETL pipelines using PySpark and AWS Glue (a sketch of this kind of job follows this list).
Develop and optimize data processing frameworks for large-scale datasets.
Work extensively with AWS services such as Glue, Lambda, ECS, S3, DynamoDB, and CloudWatch.
Build and maintain data ingestion and transformation workflows.
Develop and manage Python-based automation and data transformation scripts.
Collaborate with cross-functional teams to ensure data availability, quality, and performance.
Good to have: Develop and integrate RESTful APIs for data access and service communication.
Troubleshoot and optimize data solutions for performance and cost efficiency.
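For illustration, a minimal PySpark sketch of the kind of ETL pipeline described above: read raw data from S3, clean and type-normalize it, and write partitioned Parquet back out. The bucket paths, column names, and app name are hypothetical, not part of the posting:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical S3 locations -- illustrative only.
SOURCE_PATH = "s3://example-raw-bucket/orders/"
TARGET_PATH = "s3://example-curated-bucket/orders/"

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw JSON records from S3.
raw = spark.read.json(SOURCE_PATH)

# Transform: drop malformed rows, normalize types, derive a partition column.
# "order_id" and "order_ts" are assumed column names for this sketch.
cleaned = (
    raw.dropna(subset=["order_id", "order_ts"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet back to S3 for downstream consumers.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(TARGET_PATH)

spark.stop()
```

The same logic could run as an AWS Glue job with the Glue boilerplate (GlueContext, Job) wrapped around it; plain PySpark is shown here so the sketch stays self-contained.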
Preferred candidate profile
Strong proficiency in Python programming.
Hands-on experience with PySpark for distributed data processing.
Deep understanding of, and hands-on exposure to, AWS services such as:
AWS Glue (ETL development)
AWS Lambda (serverless data processing; a sketch follows this list)
ECS / EKS (containerized workloads)
DynamoDB (NoSQL database)
S3 (data storage and management)
Experience with data ingestion, transformation, and orchestration.
Familiarity with API concepts: request/response models, RESTful design, and JSON handling.
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
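For illustration, a minimal sketch of a serverless pattern combining the services listed above: a Lambda handler that registers newly landed S3 objects in a DynamoDB table via boto3. The S3 trigger wiring, table name, and attribute names are hypothetical:

```python
import json
import urllib.parse

import boto3

# Hypothetical DynamoDB table -- illustrative only.
TABLE = boto3.resource("dynamodb").Table("example-file-registry")

def handler(event, context):
    """Record metadata for each object landing in the raw bucket.

    Assumes the function is wired to an S3 put-object notification.
    """
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # S3 notification keys are URL-encoded; decode before use.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        size = record["s3"]["object"].get("size", 0)

        # Register the new object so downstream jobs can discover it.
        TABLE.put_item(Item={
            "object_key": key,
            "bucket": bucket,
            "size_bytes": size,
        })

    return {
        "statusCode": 200,
        "body": json.dumps({"processed": len(event["Records"])}),
    }
```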
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Back End Developer
Employment Type: Full time