Talent.com
Data Engineer (Hadoop Ecosystem)

Praesignis
Johannesburg, Gauteng, South Africa
5 days ago
Overview

Job title: Data Engineer (Hadoop Ecosystem)

Job location: Johannesburg, Gauteng

Deadline: November 22, 2025

Job Description

We are seeking a skilled Data Engineer to design and develop scalable data pipelines that ingest raw, unstructured JSON data from source systems and transform it into clean, structured datasets within the Hadoop-based data platform. The ideal candidate will play a critical role in enabling data availability, quality, and usability by engineering the movement of data from the Raw Layer to the Published and Functional Layers.

Responsibilities

  • Design, build, and maintain robust data pipelines to ingest raw JSON data from source systems into the Hadoop Distributed File System (HDFS).
  • Transform and enrich unstructured data into structured formats (e.g., Parquet, ORC) for the Published Layer using tools like PySpark, Hive, or Spark SQL.
  • Develop workflows to further process and organize data into Functional Layers optimized for business reporting and analytics.
  • Implement data validation, cleansing, schema enforcement, and deduplication as part of the transformation process.
  • Collaborate with Data Analysts, BI Developers and Business Users to understand data requirements and ensure datasets are production-ready.
  • Optimize ETL/ELT processes for performance and reliability in a large-scale distributed environment.
  • Maintain metadata, lineage and documentation for transparency and governance.
  • Monitor pipeline performance and implement error handling and alerting mechanisms.

Technical Skills & Experience

  • 3+ years of experience in data engineering or ETL development within a big data environment.
  • Strong experience with Hadoop ecosystem tools: HDFS, Hive, Spark, YARN, and Sqoop.
  • Proficiency in PySpark, Spark SQL, and HQL (Hive Query Language).
  • Experience working with unstructured JSON data and transforming it into structured formats.
  • Solid understanding of data lake architectures: Raw, Published, and Functional layers.
  • Familiarity with workflow orchestration tools like Airflow, Oozie, or NiFi.
  • Experience with schema design, data modeling, and partitioning strategies.
  • Comfortable with version control tools (e.g., Git) and CI/CD processes.

Nice to Have

  • Experience with data cataloging and governance tools (e.g., Apache Atlas, Alation).
  • Exposure to cloud-based Hadoop platforms like AWS EMR, Azure HDInsight, or GCP Dataproc.
  • Experience with containerization (e.g., Docker) and/or Kubernetes for pipeline deployment.
  • Familiarity with data quality frameworks (e.g., Deequ, Great Expectations).