Site Reliability Engineer - Datadog @ HUCON Solutions

Site Reliability Engineer - Datadog

HUCON Solutions
5 - 10 years
Hyderabad
24 days ago

Job Description

Role Overview

We are looking for a highly skilled Site Reliability Engineer (SRE) to build, operate, and continuously improve highly available, scalable, and observable platforms running on bare-metal Kubernetes clusters and Google Kubernetes Engine (GKE), with Datadog for monitoring and observability.

The ideal candidate brings deep Kubernetes expertise, strong cloud-native experience on GCP, and a passion for reliability, automation, and operational excellence. This role works closely with application, platform, and architecture teams to ensure production systems are resilient, secure, and performant at scale.

Experience Range - 4 to 12 years

Locations: Chennai, Hyderabad, Noida and Gurgaon only

Mode of work: Work from office only

Key Responsibilities

Design, operate, and support Kubernetes platforms across bare-metal clusters and GKE

Ensure high availability, scalability, performance, and reliability of production systems

Implement and manage GitOps-based deployment workflows using tools like Argo CD

Build, maintain, and optimize CI/CD pipelines using tools such as GitHub Actions, Harness, CircleCI, or equivalent

Deploy and manage applications using Helm, including canary and progressive delivery strategies

Hands-on experience with cloud-based monitoring and observability tools such as Datadog

Implement comprehensive observability using Prometheus, Grafana, Loki, and Tempo

Proactively monitor systems, troubleshoot incidents, and perform root cause analysis (RCA)

Partner with development teams to improve service reliability, scalability, and operational maturity

Provision and manage cloud infrastructure on Google Cloud Platform (GCP)

Automate infrastructure and platform operations using Infrastructure as Code (IaC) and scripting

Drive continuous improvements in resilience, automation, and operational efficiency

Required Skills & Qualifications

Strong hands-on experience with Kubernetes architecture and administration

Experience managing both bare-metal Kubernetes clusters and Google Kubernetes Engine (GKE)

Solid understanding of Google Cloud Platform (GCP) services and networking concepts

Proven experience with GitOps practices and tools such as Argo CD

Proficiency with CI/CD tools (GitHub Actions, Harness, CircleCI, or similar)

Practical experience with:

Helm
Canary / progressive deployments

Strong expertise in observability and monitoring:

Prometheus
Grafana
Loki
Tempo

Experience with Terraform for infrastructure provisioning

Understanding of modern API technologies such as GraphQL

Familiarity with API management platforms (Apigee Edge, Apigee X)

Knowledge of CDN and edge services (e.g., Akamai)

Good to Have

Working knowledge of Java (Spring Boot) and/or Node.js frameworks

Understanding of microservices architecture and service-to-service communication

Experience with Ansible or similar configuration management tools

Exposure to hybrid or multicloud environments

Experience in performance tuning and cost optimization on GCP

Understanding of Kubernetes and cloud security best practices

SRE experience aligned with SLIs, SLOs, and error budgets

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Platform Engineer
Employment Type: Full time

Contact Details:

Company: HUCON Solutions
Location(s): Hyderabad

Key skills: Kubernetes Cluster, Spring Boot, Java, Site Reliability Engineering, Datadog, GCP, Cloud Platform Development, Ansible, Prometheus, Helm, Grafana

₹ Not Disclosed

HUCON Solutions

Hucon Solutions India Pvt. Ltd. Hucon Solutions is an integrated HR service provider for corporates across India. We are backed by a good ERP and enough experience in HR and related activities. It has helped generate career opportunities for more than a million individuals in India. Hucon Solu...
