Job Description:
Job Title: Senior Site Reliability Engineer
Corporate Title: AVP
Location: Bangalore, India
Role Description
We are seeking a Site Reliability Engineer for Observability platforms in the Bank to enhance, scale, and modernise our enterprise observability capability.
This role focuses on owning and evolving Observability and Monitoring tools across the Bank, driving a shift towards OpenTelemetry (OTel)-based telemetry standardisation.
The successful candidate will contribute to automation, AI adoption, and observability-by-design practices to improve reliability, scalability, and developer experience.
What we’ll offer you
As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:
Best in class leave policy.
Gender-neutral parental leave
100% reimbursement under childcare assistance benefit (gender neutral)
Sponsorship for industry-relevant certifications and education
Employee Assistance Program for you and your family members
Comprehensive Hospitalization Insurance for you and your dependents
Accident and term life insurance
Complimentary health screening for those aged 35 and above
Your key responsibilities
Tools Reliability Governance:
Own the availability, performance, and resilience of the Observability tool stack in the Bank
Act as administrator of the tool stack, ensuring platforms effectively support enterprise monitoring requirements
Drive standardisation of telemetry using OpenTelemetry (OTel) across Metrics, Events, Logs, and Traces (MELT)
Define and implement telemetry collection, enrichment, and routing strategies using OTel collectors and pipelines
Identify and implement automation and self-healing for common issues, and adopt AI practices to enhance tool availability and user experience
Own the Incident and Problem Management framework (severity, escalation, response, and resolution):
Ensure quick incident response, containment, and service restoration
Perform deep root cause analysis and deliver permanent resolutions
Oversee major incidents and proactively identify systemic risks
Identify and eliminate audit and control risks
Align with and adhere to SRE best practices:
Provide frameworks, playbooks, and automation capabilities
Conduct reliability reviews and implement and improve SLO/SLI tracking
Maintain and govern error budgets
Promote observability-by-design principles across application and platform teams
Strong SRE / production engineering experience
Expertise in SLOs, error budgets, incident governance, and modern observability practices
Experience with distributed systems, GCP, Kubernetes, and OpenShift
Leverage OTel-driven telemetry insights to improve reliability and proactive issue detection
Strong understanding of risk, audit, and compliance (financial services preferred)
Own and evolve the Observability platform ecosystem: ITRS Geneos, New Relic (SaaS), Netcool, Grafana (KDB), and OTel-based telemetry pipelines
Your skills and experience
Strong experience as administrator of at least two of the observability tools: ITRS Geneos, New Relic (SaaS), Netcool, Grafana (KDB)
Strong understanding of MELT concepts and modern Observability architectures
Hands-on experience with OpenTelemetry (OTel):
Application and infrastructure instrumentation (auto and manual)
OTel collectors, exporters, and telemetry pipelines
Integration of OTel with tools such as Grafana and New Relic
Understanding of vendor-agnostic telemetry frameworks
Hands-on experience working on Unix servers (Windows Server would be an added benefit), Google Cloud, and OpenShift
Strong hands-on experience in a scripting language such as shell, Bash, or Python. Experience with Ansible playbooks and Terraform will be beneficial
Experience with Oracle and MSSQL databases; KDB knowledge will be an added advantage
How we’ll support you
Training and development to help you excel in your career.
Coaching and support from experts in your team.
A culture of continuous learning to aid progression.
A range of flexible benefits that you can tailor to suit your needs.
About us and our teams
Please visit our company website for further information:
https://www.db.com/company/company.html
We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively.
Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group.
We welcome applications from all people and promote a positive, fair and inclusive work environment.