Job title : Platform Engineer (AWS, GitHub Actions, Heroku CI) (JHB)
Job Location : Gauteng, Johannesburg
Deadline : October 30, 2025
ENVIRONMENT :
A provider of cutting-edge Financial Tools in Joburg seeks the technical expertise of a Platform Engineer to manage Heroku pipelines, CI / CD, review apps, and production environments. You will also operate Celery workers and queues, monitor health, and handle missed task check-ins; manage Cloudflare for DNS, edge security, and performance optimisation; and collaborate with Developers to streamline workflows and educate on secure coding practices. The ideal candidate must have 3+ years’ experience operating production apps on Heroku, AWS, DigitalOcean, or similar; hands-on CI / CD pipeline experience with GitHub Actions, Heroku CI, or equivalent; solid Git fundamentals; and monitoring and incident response experience with Sentry, Papertrail (or similar), logs, and uptime / performance dashboards.
DUTIES :
Reliability & Operations –
- Own uptime, performance, and monitoring for all production applications.
- Manage Heroku pipelines, CI / CD, review apps, and production environments.
- Operate Celery workers and queues, monitor health, and handle missed task check-ins.
- Define and track service level objectives (SLOs) (availability, latency, task success rate).
- Maintain runbooks, a centralised wiki for incident response, and lead post-mortems.
- Run periodic disaster recovery drills and coordinate Penetration Tests.
Platform Engineering –
- Keep environments current (Heroku stacks, Postgres / Redis versions, DO / AWS base images).
- Manage daily backups and ensure restore tests and disaster recovery runbooks are in place.
- Standardise infrastructure (Terraform or scripts for DO / AWS; app.json for Heroku).
- Manage Cloudflare for DNS, edge security, and performance optimisation.
- Tune performance (DB indices, query optimisation, cache usage, Celery queue design).
- Optimise infrastructure costs across Heroku, DigitalOcean, and AWS.
Developer Experience & CI / CD –
- Maintain CI pipelines with type checking, linting, and security scanning.
- Enforce test coverage and automate deploy checks (smoke tests, migration health, error budgets).
- Support Developers with tooling for local / staging environments and build self-service dashboards (e.g., Celery queue status).
- Collaborate with Developers to streamline workflows and educate on secure coding practices.
Security & Compliance –
- Own vulnerability management and dependency patching cadence.
- Manage access reviews, secrets, MFA / SSO, and enforce least-privilege IAM policies.
- Implement encryption for data at rest and in transit (e.g., S3 server-side encryption).
- Contribute evidence and responses for security questionnaires and SOC 2 audits.
- Maintain a “security pack” with architecture, sub-processors, and DR / backup processes.
Monitoring & Alerting –
- Configure Sentry ownership rules, Cron Monitors, and release health.
- Centralise metrics / logs (Heroku metrics, Papertrail, Sentry, APM, Prometheus / New Relic).
- Set up alerts on golden signals (latency, errors, traffic, saturation) and avoid alert fatigue.
- Conduct capacity planning and track resource usage trends.
Vendor & External Services –
- Evaluate and manage vendor relationships (e.g., Mailgun, Twilio) to ensure service level agreements (SLAs) and performance.
- Assess new tools / services to enhance platform capabilities (e.g., observability, security).
- Track costs, security posture, and integration quality for all third-party services.
REQUIREMENTS :
Must-Haves –
- Cloud Infrastructure Management : 3+ years’ experience operating production apps on Heroku, AWS, DigitalOcean, or similar.
- CI / CD pipelines : Hands-on experience with GitHub Actions, Heroku CI, or equivalent; solid Git fundamentals.
- Monitoring & incident response : Experience with Sentry, Papertrail (or similar), logs, and uptime / performance dashboards.
- Security Fundamentals : Understanding of IAM, encryption in transit / at rest, MFA / SSO, and secure configuration practices.
- Disaster recovery & backups : Experience implementing and operating automated backups, restore testing, and writing / maintaining incident runbooks.
- Communication & collaboration : Ability to document processes clearly and work closely with Developers in a small team.
Strong Plus –
- Infrastructure as Code & automation : Experience with Terraform, Docker, or equivalent tooling.
- Asynchronous workloads : Familiarity with Celery, Redis, or other task queues and message brokers.
- Scaling & cost optimisation : Capacity planning, performance tuning, and managing infra spend.
- Compliance frameworks : Exposure to SOC 2, GDPR, or supporting client security questionnaires.
- Incident management : Participation in on-call rotations, leading post-mortems, or serving as incident commander.
Nice-to-Haves –
- Certifications (AWS Certified DevOps Engineer, CKS, or equivalent).
- Proficiency in Python; familiarity with Django / Flask.
- Experience with DNS / CDN / edge security (e.g., Cloudflare).
- Observability platforms (Prometheus, Grafana, New Relic).
- Static analysis and code quality tools (mypy, Bandit, SonarQube).
- Prior exposure to multi-tenant SaaS environments.