Talent.com
OM Bank - Site Reliability Engineer

Old Mutual, ZA
13 days ago
Job description

Job title : OM Bank - Site Reliability Engineer

Job Location : Gauteng, Cape Town

Deadline : January 30, 2026

Job Description

  • OM Bank is currently looking for a site reliability engineer to join the OM Bank platform team. The candidate will be responsible for maintaining the OM Bank platform, including first-line support for the platform’s technical services and managing service outages through the incident management process.

KEY RESULT AREAS

  • First-line support for all services that comprise the platform
  • Managing the incident management process for production incidents, including detection, triage, resolution, and driving continuous improvement
  • Maintaining the production readiness scorecard defined in Terraform, ensuring its checks work as expected, and adding new checks to the scorecard workflow
  • Creating and maintaining monitors in Datadog that improve observability across the platform
  • Engagement with the wider OM Bank product and build team to ensure alignment to the observability standards defined by the platform team
  • Designing and implementing enhancements to the platform that contribute towards reducing MTTR (mean time to recovery)
  • Designing and implementing automation initiatives including self-service capabilities
  • Implementing Service Level Indicators & Objectives for the platform
  • Implementing and maintaining Datadog dashboards for the platform
  • Defining and maintaining baseline monitors to be used by product teams
  • Maintaining the observability repository that contains all service definitions and observability-related configurations
  • Maintaining the feature flagging repository that contains all feature flag definitions for product teams
  • Maintaining PagerDuty definitions and overall administration
  • Fine-tuning monitors to ensure alerts are triggered appropriately
  • Leading an action center during a production incident, fostering collaboration across the bank to resolve the outage
  • Advising product and platform teams on engineering best practices to ensure services are built with observability and scalability from the start
  • Maintaining overall platform health by monitoring key metrics
  • Maintaining and extending the SRE API, written in Python and deployed to Kubernetes

ROLE REQUIREMENTS

  • Bachelor’s degree in computer science, electrical or electronic engineering, information technology, or a relevant field
  • 7+ years of software and platform engineering experience building and supporting scalable services
  • 3-5 years' experience writing infrastructure as code (Terraform, AWS CDK, CloudFormation)
  • Solid experience using observability platforms like Datadog
  • Experience with microservices architecture and RESTful APIs
  • Solid Kubernetes experience, demonstrating end-to-end deployment and maintenance of clusters, including designing and building the infrastructure as code required to deploy the cluster and the cloud resources that support it
  • Experience with Kubernetes custom resource management and deployment
  • Solid experience deploying Kubernetes resources using Helm charts
  • Experience in fine-tuning Kubernetes HPA (Horizontal Pod Autoscaler) configs
  • Moderate experience using the Go / Python programming languages
  • Solid experience using GitOps and general Git-based operations
  • Solid infrastructure as code background, demonstrating experience in designing, implementing and maintaining IaC design patterns that manage large-scale cloud environments
  • Solid AWS experience, displaying advanced understanding of cloud architecture and maintaining distributed systems
  • Experience maintaining queuing systems like AWS SQS and event streaming platforms like Kafka
  • Experience supporting mobile applications

Skills

  • Action Planning, Application Development, Business Process Design, Computer Literacy, Data Management, Data Modeling, Evaluating Information, Identifying Customer Needs, Information Technology (IT) Support, Market Analysis, Oral Communications, Product Development, Technical Support, Technical Troubleshooting, Test Case Management, User Requirements Documentation, Web Development

Competencies

  • Business Insight
  • Collaborates
  • Courage
  • Cultivates Innovation
  • Decision Quality
  • Drives Results
  • Ensures Accountability
  • Manages Complexity

Closing Date

  • 01 November 2025