Experience with DBT (Data Build Tool) for data transformation, modular data modeling, testing, and ELT pipeline development.
Hands-on experience with big data technologies, including Apache Spark, Hadoop ecosystem (HDFS, YARN, Hive), and distributed data processing frameworks.
Strong experience in SQL programming, performance tuning, query optimization, and relational data model analysis across large-scale datasets.
Experience in designing and delivering complex, large-volume data warehouse and data lake applications.
Strong technical expertise in design (Mapping Specifications, HLD, LLD) and development (coding, unit testing) using Ab Initio and modern data processing frameworks.
Experience with data partitioning, bucketing, and file formats (Parquet, ORC, Avro) in big data environments.
Familiarity with Spark optimizations (caching, partition tuning, broadcast joins) and performance troubleshooting.
Understanding of ETL vs ELT architectures and modern data pipeline design patterns.
Familiarity with data lake architectures, batch and stream processing, and scalable data pipelines.
Excellent communication skills, with the ability to engage both business and technical stakeholders.
Participate in code reviews to ensure adherence to coding standards, performance optimization, and best practices across ETL and big data platforms.
Good to have: knowledge of Ab Initio.
Roles & Responsibilities:
Analyze, design, implement, and maintain high-volume, multi-terabyte, 24/7 data warehouse and data lake ETL/ELT applications.
Develop logical and physical data models and perform advanced ETL/ELT development using Ab Initio, Apache Spark, and DBT.
Build and optimize scalable data pipelines using Spark and Hadoop ecosystem tools, ensuring high performance and reliability.
Implement data transformation, cleansing, and validation logic to ensure high data quality and consistency.
Work with large datasets stored in distributed systems and optimize processing using appropriate data structures and algorithms.
Act as a technical contributor on complex data engineering projects involving large datasets and multiple team members.
Develop and maintain functional and technical documentation for ETL pipelines, data models, and workflows.
Lead the design and development of data warehouse and data lake solutions using modern architectures (Lambda/Kappa, medallion architecture, etc.).
Work with business users to translate requirements into system flows, data flows, and data mappings, delivering scalable and efficient solutions.
Lead design reviews, create architecture and design artifacts, capture feedback, and drive improvements in system design.
Ensure best practices in data governance, data quality, metadata management, and performance optimization across all platforms.
Collaborate with cross-functional teams including data analysts, architects, and DevOps teams for end-to-end data solution delivery.
Job Requirements:
4-6 years of experience in Analytics Software Engineering with expertise in BigQuery, Spark, and Snowflake.
Strong understanding of the SQL programming language for querying large datasets.
Experience with workflow orchestration tools such as Apache Airflow or similar technologies.
Job Classification
Industry: Telecom / ISP
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Analytics - Other
Role: Data Science & Analytics - Other
Employment Type: Full time