Job Description
As a Sr. Manager, Cloud AI Engineer, you will own the cloud environments, backend services, CI/CD pipelines, LLMOps tooling, and observability infrastructure that enable fast, reliable, and secure AI product delivery. You will partner with AI engineers, data scientists, and the enterprise platform team to ensure DecisionIQ's infrastructure is production-grade, scalable, and compliant. This role is for someone who thrives at the intersection of cloud engineering, backend development, and AI/ML operations, ensuring that the platform beneath the product is rock-solid.
ROLE RESPONSIBILITIES
1) Cloud Infrastructure & Backend Engineering
- Design, build, and operate cloud-native environments on AWS/Azure for DecisionIQ: compute, storage, networking, and security
- Implement Infrastructure as Code (Terraform, Ansible, Helm) for reproducible, version-controlled environments
- Develop and maintain backend services, APIs, and data pipelines that power DecisionIQ's AI features
- Define and enforce architecture patterns: containerization (Kubernetes/Docker), secrets management, identity and access management, and security guardrails
- Optimize cloud resource utilization, cost, and performance across all DecisionIQ workloads
2) LLMOps/AI/ML Operations
- Build and operate MLOps/LLMOps pipelines: experiment tracking, model registry, model deployment, monitoring, and rollback
- Deliver LLM-as-a-Service capabilities for the team: API key management, usage optimization, cost tracking, and guardrails
- Operationalize AI workflows including data pipelines, feature stores, model packaging, and inference serving
- Partner with data scientists and AI engineers to accelerate model-to-production cycles with standardized tooling (MLflow, LangChain, Langfuse, SageMaker, Bedrock)
3) CI/CD, Testing & Observability
- Own the CI/CD strategy: design and maintain automated pipelines (GitHub Actions) for build, test, and deployment across all services and ML workloads
- Establish observability and reliability: implement monitoring (OpenTelemetry, Prometheus, Grafana, ELK), define SLOs, and drive operational excellence (DORA metrics, MTTR)
- Champion DevSecOps: embed security controls, automated compliance checks, and supply-chain integrity into all pipelines
- Support test automation integration: ensure CI/CD pipelines enforce quality gates, coverage targets, and release readiness criteria
4) Team Contribution & Engineering Maturity
- Contribute to a highly effective engineering team; mentor colleagues on cloud, DevOps, and LLMOps best practices
- Drive engineering maturity through design docs, architecture decision records (ADRs), IaC reviews, and secure coding practices
- Partner with enterprise Data and AI Platform teams to align DecisionIQ infrastructure with corporate standards and shared services
- Stay current with emerging cloud and AI operations technologies; evaluate and adopt tools that improve delivery velocity and reliability
This role covers a broad spectrum of skills, and we encourage you to apply even if you meet the qualifications only partially.
BASIC QUALIFICATIONS
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 7+ years in platform engineering, DevOps, cloud infrastructure, or backend engineering roles
- Strong hands-on experience with AWS or Azure; Kubernetes/Docker; Terraform/Ansible/Helm
- Proficiency in Python and scripting (Bash); experience with backend service development and RESTful APIs
- Hands-on experience with CI/CD pipelines (GitHub Actions, Jenkins) and Git-based workflows
- Hands-on experience with LLMOps and ML frameworks (LangChain, MLflow, Langfuse) and cloud AI services (SageMaker, Bedrock, Azure AI Foundry)
- Expertise in observability and reliability tools (OpenTelemetry, Prometheus, Grafana, ELK) and secrets management (Vault, AWS Secrets Manager)
- Demonstrated experience with DevSecOps and secure SDLC practices
- Familiarity with cloud-based analytics ecosystems (AWS, Snowflake)
- Hands-on experience working in Agile teams
Job Classification
Industry: Pharmaceutical & Life Sciences
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time
Contact Details:
Company: Pfizer
Location(s): Mumbai
Key Skills:
Supply chain
Computer science
CCSP
Networking
Coding
Packaging
Monitoring
Analytics
SDLC
Python