Purpose of the Role
The Senior Data Engineer is the architect and builder of Dis-Chem Life's data foundation, creating the infrastructure that turns raw information into a strategic asset. This role goes far beyond moving data from A to B: it is about designing high-performance, future-proof systems that make data accurate, accessible, and truly powerful.
By developing best-in-class data pipelines, warehouse systems, architecture, and governance frameworks, the Senior Data Engineer enables the entire organisation, from the actuarial, data science, and analytics teams to general operations, to work with clean, structured, and reliable datasets at scale while protecting our customers' data privacy as stipulated in the POPI Act.
The role involves solving hard engineering problems: building resilient ingestion frameworks, handling messy and complex source systems, optimising cloud architecture for cost and performance, and ensuring that every downstream user can focus on insight and innovation rather than data wrangling.
The ultimate purpose is to build and continuously evolve a scalable, intelligent data platform that grows with Dis-Chem Life's ambitions, fuels advanced analytics and modelling, unlocks automation, and sets a new benchmark for how data drives customer intelligence and operational excellence in the South African insurance industry.
Summary of the Role
The Senior Data Engineer is responsible for designing, implementing, and maintaining the core technical solutions that keep Dis-Chem Life's data platform running at peak performance. This includes building scalable and resilient data ingestion frameworks, integrating complex source systems, and optimising cloud architecture for both performance and cost efficiency. The role requires deep hands-on experience with modern data engineering tools, ETL / ELT processes, workflow orchestration, and cloud platforms. Strong problem-solving skills, precision, and the ability to collaborate seamlessly with analytics, AI, and automation teams are essential. The Senior Data Engineer continuously drives improvements in data processes and platform efficiency, ensuring the organisation can rely on high-quality, reliable data to make faster, smarter, and more impactful decisions.
Benefits
- Competitive salary
- Direct and significant influence over building the company's data backbone, as we are still in the early stages of development
- Exposure to advanced analytics and AI projects with real-world business impact
- Access to modern cloud, orchestration, and automation technologies
- Hybrid working model with flexibility and autonomy
- Work with interesting datasets spanning health data, customer behaviour, payments, retail spend, and more
Key Responsibilities
Build & Maintain Data Pipelines, Architecture, and Software
- Design, develop, optimise, and monitor scalable ETL / ELT pipelines and warehouse systems.
- Implement, monitor, and maintain reporting and analytics software.
- Architect robust, future-proof data infrastructure to support advanced analytics, AI, and automation.
- Ensure performance, reliability, and security across all data systems.
Ensure Data Quality, Reliability & Accessibility
- Implement rigorous data quality validation, monitoring, and governance to guarantee data integrity (an illustrative sketch follows this list).
- Deliver clean, well-structured datasets that downstream teams can confidently use.
- Minimise time spent on data cleaning and wrangling for actuaries, data scientists, and operational BI analysts.
Enable AI, Analytics & Automation
- Prepare AI-ready datasets with consistency, scalability, and timeliness in mind.
- Collaborate with data scientists to build feature pipelines for predictive modelling.
- Support advanced automation workflows and real-time data requirements.
Scale Data Architecture
- Design and optimise best-in-class data architecture capable of handling increasing data volumes and complexity.
- Leverage cloud-native solutions to enable rapid scaling, flexibility, and cost efficiency.
- Continuously enhance data infrastructure performance and reduce operational costs.
Handle Complex Engineering Challenges
- Own the technical work of data ingestion, transformation, and orchestration.
- Solve challenging engineering problems to allow teams to focus on insights, models, and decisions.
- Act as the go-to expert for ensuring data is accessible, accurate, and usable.
Collaboration & Knowledge Sharing
- Work closely with analysts, actuaries, and data scientists to understand evolving data needs.
- Document data flows, definitions, and system processes to ensure transparency and reusability.
- Mentor colleagues and promote best-practice data engineering across the organisation.
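To give a concrete picture of the data quality validation work described above, the sketch below shows one way such a check might look in Python with pandas; the dataset, column names, and thresholds are hypothetical and purely illustrative, not a description of Dis-Chem Life's actual schemas.

```python
import pandas as pd

# Hypothetical columns and thresholds, chosen only for illustration.
REQUIRED_COLUMNS = {"policy_id", "customer_id", "premium", "effective_date"}

def validate_policy_extract(df: pd.DataFrame) -> list[str]:
    """Return a list of data quality issues found in a policy extract."""
    issues = []
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    if df["policy_id"].duplicated().any():
        issues.append("duplicate policy_id values")
    if df["premium"].lt(0).any():
        issues.append("negative premium values")
    # Flag any required column whose null rate exceeds a 1% threshold.
    null_rates = df[list(REQUIRED_COLUMNS)].isna().mean()
    for col, rate in null_rates[null_rates > 0.01].items():
        issues.append(f"{col}: {rate:.1%} nulls exceeds the 1% threshold")
    return issues
```

In practice a check like this would run inside the pipeline and block or quarantine a failing load, rather than leaving downstream actuarial, data science, or BI teams to discover the problem.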
Soft Skills
- Obsessed with clean, high-quality data and how it drives better models / decisions
- Collaborative mindset, thriving at the intersection of engineering and analytics
- Strong communicator, able to explain complex engineering choices to non-technical users
- Detail-driven but pragmatic, balancing precision with speed in delivery
- Curious, innovative, and always seeking ways to improve
Technical Skills
- Data Architecture - design and implementation of scalable, maintainable data systems, defining data flows, and establishing architectural patterns for enterprise-scale solutions
- Advanced SQL - extraction, transformation, and optimisation
- Python Programming - strong skills (pandas, PySpark) for data pipelines and scientific workflows
- Big Data Frameworks - hands-on experience with at least one major framework (Hadoop, Spark, Kafka, Elastic Stack, or Databricks)
- Database Expertise - proficiency across industry-standard types, including relational (PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra); understanding of less common types such as time-series (InfluxDB, TimescaleDB) and graph databases (Neo4j)
- Data Modelling - dimensional modelling, normalisation, star / snowflake schemas, and designing efficient data structures for analytical workloads
- Data Lifecycle Management - end-to-end data management including ingestion, storage, processing, archival, retention policies, and data quality monitoring throughout the pipeline
- Data Science Integration - familiarity with feature stores and model-serving pipelines
- ETL / ELT Tools - hands-on experience with tools like dbt, Windmill, Airflow, or Fivetran (a minimal orchestration sketch follows this list)
- Cloud Platforms - experience with AWS, Azure, or GCP and modern warehouses (Snowflake, BigQuery, Redshift)
- Streaming Data - knowledge of real-time data processing (Kafka, Spark, Flink)
- Infrastructure Management - experience with Docker, Kubernetes, container orchestration, and managing scalable data infrastructure deployments is advantageous
- APIs & Integrations - understanding of APIs, integrations, and data interoperability
- Version Control - Git and CI / CD practices for production data pipelines
- Data Governance - familiarity with governance and compliance (POPIA, FAIS)
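As a small illustration of the workflow orchestration mentioned in the ETL / ELT Tools item, the following is a minimal Airflow DAG sketch; the DAG name, schedule, and task callables are hypothetical placeholders, and the `schedule` argument assumes Airflow 2.4+.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical placeholder callables; real tasks would hold ingestion,
# transformation, and load logic for the relevant source system.
def extract():
    print("pull raw data from a source system")

def transform():
    print("clean and reshape the extract")

def load():
    print("load the result into the warehouse")

with DAG(
    dag_id="example_daily_etl",      # illustrative name only
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Run extract, then transform, then load, once per day.
    extract_task >> transform_task >> load_task
```

In a production pipeline each task would be idempotent and monitored so that failures can be retried safely without duplicating data.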
Experience
- 5 - 7 years in a Data Engineering or related technical role
- Proven ability to deliver clean, scalable pipelines supporting analytics and AI
- Hands-on work with cloud-native and warehouse systems
- Experience collaborating with Data Science teams to deliver AI / ML-ready datasets
- Exposure to regulated industries (insurance / finance) advantageous
Qualifications
- Bachelor's degree in Data Engineering, Computer Science, Information Systems, or a related field
- Cloud certifications (AWS, Azure, GCP) or Data Engineering credentials preferred
- Advanced SQL and Python certifications are advantageous