Job Description
AI/ML Data Engineer
This role is designed as hybrid, with the expectation that you will work from an HPE office 2 days per week on average.
What you'll do:
Responsibilities:
Develop, maintain, and optimize data pipelines, workflows, and the Feature Store to ensure seamless data ingestion and transformation as part of a scalable data solution.
- Design, architect, develop, and implement data engineering pipelines with attention to performance and scalability, including data storage and processing.
- Implement advanced data transformations and quality checks to ensure data accuracy, completeness, security, and consistency.
- Integrate data from diverse sources for ingestion, transformation, and storage, using AWS S3 and possibly Snowflake as a SQL data warehouse.
- Create and implement advanced data models and schemas, and follow data governance and data management best practices.
What you need to bring:
Qualification and Desired Experiences:
- 5+ years of data analysis and engineering experience
- Bachelor's degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field
- Working knowledge of API- or stream-based data extraction processes such as the Salesforce API and Bulk API, and hands-on experience in web crawling
Primary Tech skills:
- Advanced web crawling and scraping methods and tools
- Building end-to-end data engineering pipelines for semi-structured and unstructured data (text, all kinds of simple and complex table structures, images, video, and audio)
- Python, PySpark, SQL, RDBMS
- Data transformation (ETL/ELT) activities
- SQL data warehouse (e.g. Snowflake): working knowledge, preferably administration
Secondary Tech skills:
- Databricks
- Familiarity with AWS services: S3, Glue, EMR, EC2, RDS, monitoring, and IAM
- Kafka, Spark Kafka Streaming
- Workflow automation (e.g. using GitHub Actions)
- Performing root cause analysis (RCA)
Personal Skills:
- Ability to collaborate cross-functionally and build sound working relationships at all levels of the organization
- Ability to handle sensitive information with keen attention to detail and accuracy, and a passion for data handling ethics
- Effective time management and the ability to solve complex technical problems creatively while anticipating stakeholder needs and helping meet or exceed expectations
- Comfortable with ambiguity and changing conditions when assessing stakeholder needs
- Self-motivated and innovative; confident working independently, but also an excellent team player with a growth-oriented mindset
Disclaimer: This job posting has been aggregated from an external source. Role details, content, and availability are subject to change. Applicants are advised to confirm the latest information directly on the company website before applying.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Machine Learning Engineer
Employment Type: Full time
Contact Details:
Company: Hewlett Packard
Location(s): Bengaluru
Keyskills:
TCP
Automation
Data analysis
RDBMS
Workflow
Monitoring
SQL
Python
Salesforce
Recruitment