What would your typical day look like?
As part of this role, you will:
Architect End-to-End Agentic AI Systems: Lead the design, solutioning, and deployment of multi-agent LLM systems, RAG architectures, and autonomous workflows capable of solving complex enterprise challenges.
Bridge AI & Data Engineering: Architect and optimize large-scale data lake environments, distributed computing frameworks, and high-throughput real-time streaming pipelines explicitly designed to feed and power GenAI applications.
Drive Business Unit Alignment & Tech Solutioning: Partner closely with business unit leaders and executive stakeholders to translate vague business challenges into concrete, high-impact AI/DE roadmaps and technical solutions.
Productionize & Scale MLOps/LLMOps: Establish and govern scalable MLOps and CI/CD pipelines to deploy, monitor, and continuously retrain agentic systems, effectively managing challenges like hallucination, bias, latency, and costs.
Lead & Mentor Across Disciplines: Formally lead and mentor cross-functional technical teams comprising Data Engineers, AI Researchers, and Data Scientists, fostering technical excellence and cross-disciplinary innovation.
Evaluate & Innovate: Continuously explore, prototype, and benchmark emerging open-source and proprietary tools within the evolving GenAI and Big Data landscapes to maintain a competitive technical edge.
Who are we looking for?
11+ years of overall technical experience in the Data Science, Analytics, and Data Engineering spaces.
5+ years of hands-on experience specifically within the Hadoop/Spark ecosystem or large-scale cloud data architecture.
3+ years of leadership experience managing, mentoring, and scaling high-performing, cross-functional technical teams.
AI & Generative AI Technical Depth: Advanced expertise in LLMs, prompt engineering, RAG, and fine-tuning (LoRA, PEFT). Proven experience building and deploying autonomous LLM Agents using frameworks like LangChain, LlamaIndex, CrewAI, or AutoGen. Strong foundations in Deep Learning, ML, PyTorch, or TensorFlow.
Data Engineering & Architecture Depth: Hands-on mastery of the Big Data ecosystem (HDFS, Hive, Kafka, Spark, Scala), modern platforms (Databricks, Snowflake), and Data Lake design. Proficient in NoSQL/Vector databases (Cassandra, Pinecone, Milvus) and optimizing distributed computing for massive AI workloads.
Cloud, Infrastructure & MLOps: Proven experience architecting large-scale cloud solutions (AWS, Azure, or GCP) with robust MLOps/DataOps pipelines, containerization (Docker, Kubernetes), and microservices-driven backend APIs (FastAPI/Django).

Key skills: Data Engineering, Generative AI, Natural Language Processing, Python