Site Reliability Engineer @ AICS Consultancy


Site Reliability Engineer

AICS Consultancy
10 - 16 years
Bengaluru
6 years ago

Job Description

Opportunity for SRE (Site Reliability Engineer) in Bangalore

Positions open: 03

Experience: 10-16 years

Location: Bangalore

This position is with one of our esteemed clients, a Bangalore-based provider of cloud-based products and services.

Job Duties and Responsibilities

The Site Reliability Engineer (SRE) will be responsible for both uplifting and maintaining our evolving technology platforms, infrastructure and technology controls. The role includes both oversight of production operations of our systems and development/engineering of solutions to maximize system reliability and automation. The role will address three dimensions:

Tools Coverage: Assess the tools coverage and ensure sufficient monitoring is in place to enable mature observability and data-driven decision making
Process: Define processes, procedures, guardrails and best practices, and educate Engineering teams on them
Culture: Inculcate the culture of high-performing teams and adopt ways of working influenced by SRE

The role will work with a global team responsible for a mission-critical business function, and will partner with Infrastructure, DevOps and Core practices (such as Security, Identity, ProdOps, Cloud platform and Tools) teams to identify and implement automation opportunities that drive down toil, reduce technical debt and improve system reliability.

Key Responsibilities:

Own the Infrastructure and APM, and work with DevOps teams to build, release, monitor and run the services to improve service reliability
Write software to automate API-driven tasks at scale and contribute to the product codebase in Java, JS, React, Node, Go and Python
Work with Ansible, Puppet, Chef, Terraform or another config management / orchestration suite, know where it breaks, work towards fixing it and explore new alternatives
Define and accelerate implementation of support processes, tools and best practices
Maintain services once they are live by measuring and monitoring availability, latency and overall system reliability
Handle cross-team performance issues end to end: identify the cause, determine the areas of improvement and drive the resulting actions to closure
Baseline the performance and maturity of DevOps processes, tools maturity & coverage, metrics, technology and engineering practices
Define, measure and improve Reliability Metrics (SLO/SLI), Observability (Monitoring, Logging, Tracing solutions) and Ops processes (Incident, Problem Mgmt), and streamline and automate release management
Be a strong believer in automation, bringing sustained continuous improvement by automating toil and runbooks and by improving the ability of applications to auto-heal, leading to improved reliability

Experience to Include

Knowledge in one or more of the following key areas: Ops maturity (performance testing, monitoring, operations - SIP), APM, Performance Benchmarking, Software Design and lifecycle (planning - discovery to provision), Infosec (including compliance, security)
Good understanding of, and implementation experience with, the 12-factor App principles
Experience in building monitoring/metrics & alerting tools (APM tools) and custom dashboards for each application stack in each supported environment
Expertise with Python-related Technologies and Frameworks
Experience with Unix/Linux OS internals and administration or Networking, and SME on at least one Cloud computing infrastructure: GCP / Azure / AWS
Familiarity with handling:

Containerization: Kubernetes, Docker, Rancher, etc.
Kafka, Yarn, Elastic Search
Source code management and implementation of Security best practices
Tech Stack - Python, Falcon, Elastic Search, MongoDB, AWS (SQS S3), Map Reduce
Data science (AI/ ML) and analytics to be able to predict failures / operational issues

Be a subject matter expert, able to upskill / cross skill engineering teams on SRE principles, tools and execution
Troubleshoot, debug, and diagnose operational issues and drive them to closure
Monitor the health of Dish-Sling services, and define as well as track reliability metrics

Skills - Requirements

The successful candidate will have the following attributes/qualifications:

Bachelor's/Master's degree and 10+ years of Development and Operations related experience and/or training; or an equivalent combination of education and experience
Relevant experience as SRE would be an added advantage
Good understanding of how to uplift maturity (App Engineering practices & Ops)
Understanding of software delivery lifecycles, particularly Agile/Lean & DevOps
Proven experience in handling large-scale and growing infrastructure across Data Centres and heterogeneous Cloud platforms
Experience as a service owner managing large, geographically diverse groups of stakeholders
Ability to work with a creative, fast-growing engineering team and motivate them to deliver their best work
History of driving innovation

Please apply with your updated resume at su*************r@gm**l.com

Job Classification

Industry: Other
Functional Area: Other
Role Category: Other
Role: Other
Employment Type: Full time

Education

Under Graduation: Any Graduate in Any Specialization
Post Graduation: Post Graduation Not Required, Any Postgraduate in Any Specialization
Doctorate: Doctorate Not Required, Any Doctorate in Any Specialization

Contact Details:

Company: ICS Consultancy Service Pvt. Ltd.
Location(s): Bengaluru


Keyskills: cloud kubernetes python ops devops SRE site reliability engineer linux elastic APM unix docker


₹ 20,00,000 - 35,00,000 P.A
