Company Description
What started in 2007 with a pizza and a plan has grown into a fast-moving SaaS company empowering manufacturers, distributors, and wholesalers to thrive in complex B2B commerce.
Our mission is simple: help businesses build stronger relationships through seamless digital commerce.
At Sana Commerce, you’ll join a team that’s bold, growth-oriented, and customer-obsessed, where every engineer has real ownership and impact.
About the role
We’re looking for a Senior Site Reliability Engineer (SRE) to enhance the reliability, performance, and scalability of our global e-commerce platform.
You’ll play a critical role in building resilient systems, automating infrastructure, and driving observability across Azure and Kubernetes environments.
This is a hands‑on engineering role where you’ll blend reliability strategy with real‑world execution, keeping today’s systems healthy while shaping the ones we’ll depend on tomorrow.
Job Description
What you’ll do
- Lead incident response and postmortems: drive investigations, document learnings, and implement permanent fixes to prevent recurrence.
- Manage and optimize Azure Kubernetes environments: own cluster configurations, performance, cost control, and security best practices.
- Build observability systems: develop dashboards, alerts, and metrics using Dynatrace, Honeycomb, Elasticsearch, Grafana/Kibana, and Azure Monitor (KQL).
- Automate for resilience: write reliable scripts in PowerShell, Bash, Python, or C#, embedding logging, rollback, and version control.
- Implement Infrastructure-as-Code: design and maintain Terraform, Bicep, or ARM templates to standardize and automate deployments.
- Optimize system performance: identify bottlenecks through deep monitoring, dump analysis, and right-sizing of cloud resources.
- Collaborate across engineering teams: integrate reliability principles into CI/CD pipelines and the broader SDLC.
- Participate in on‑call rotations: lead during critical incidents, ensuring lasting fixes and operational excellence.
Qualifications
What you’ll bring
- 5+ years in SRE, DevOps, or Cloud Infrastructure roles with experience in large-scale, distributed systems.
- Strong Azure and Kubernetes expertise (production‑level).
- Proven ability in observability engineering using Dynatrace, Honeycomb, Elastic, Grafana/Kibana, or Azure Monitor.
- Skilled in PowerShell, Bash, Python, or C#, with an automation-first mindset.
- Proficient in Infrastructure-as-Code (Terraform, Bicep, ARM).
- Solid grasp of TCP/IP, networking fundamentals, and performance tuning.
- Strong communicator able to translate complex technical findings into clear, actionable insights.
Certifications preferred:
- Microsoft Certified: Azure Administrator Associate
- Certified Kubernetes Administrator (CKA)
Why you’ll love working here
- Impact from day one – Join a scale‑up where your ideas shape how global businesses operate online.
- Continuous learning – Access structured onboarding (rated 9.1/10 by previous hires), mentorship, and a feedback culture.
- Hybrid flexibility – Work from our Cape Town office 3 days per week and from home 2 days.
- Career growth – Expand your technical and leadership scope in a company built for long‑term success.
Our values
At Sana Commerce, our values drive everything we do:
- Champions of Our League – We deliver lasting success, balancing quick wins and long‑term value.
- Supercharge Our Customers – We help our customers lead and succeed.
- Determined to Grow – We embrace feedback and challenges to raise the bar.
- Bold Together – We take risks, collaborate deeply, and support each other.
Ready to build reliability that scales?
Apply now and help shape the foundation of our next‑generation SaaS platform.