Job Title / Designation: Data Engineer – Data Lake & Analytics Platform

Number of Vacancies: 1
Functional Area: Data Engineering / Data Analytics
Location of Job: Kampala, Uganda
Salary Offered: Negotiable
Industry: IT Software & Services / Systems Integration
Qualification: Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field.
Contact Person: HR Manager (careeruganda@sybyl.com)

Role Overview

The Data Engineer will be responsible for designing, building, and maintaining the core data infrastructure within the organization’s Data Fabric environment. The role focuses on developing robust ETL/ELT pipelines using Sparkflow or similar tools, optimizing data lake storage, and ensuring reliable data delivery for downstream analytics in Power BI.

Working closely with AI/ML engineers, software teams, and business analysts, the Data Engineer will transform raw data into structured, production-ready datasets that support advanced analytics, reporting, and machine learning initiatives.

Responsibilities

Data Fabric Management

  • Design and optimize data lake structures and schemas within the Data Fabric environment.

  • Ensure scalable architecture capable of handling large volumes of structured and unstructured data.

Pipeline Development

  • Develop, test, and deploy scalable ETL/ELT pipelines using Sparkflow or Apache Spark.

  • Build reusable data components and standardized transformation frameworks to accelerate data delivery.

  • Implement pipelines that support both large batch processing workloads and near real-time data ingestion.

  • Automate pipeline operations to ensure reliability, monitoring, and maintainability.

  • Optimize data access patterns and storage formats to ensure high performance for both Spark processing and Power BI querying.

  • Manage the full data lifecycle, including ingestion, storage, retention, and archival to support analytics and AI workloads.

Data Quality & Governance

  • Define and enforce data quality standards to ensure accuracy, completeness, and consistency across the data platform.
  • Maintain metadata documentation, lineage tracking, and cataloging to support governance and audit requirements.
  • Ensure all data engineering solutions comply with applicable data privacy regulations and cybersecurity policies.
  • Support the implementation of data governance frameworks across the organization.

Stakeholder & BI Support

  • Collaborate with business analysts and data consumers to translate reporting and analytics requirements into technical data models.
  • Design and maintain curated data layers that serve as optimized data sources for Power BI dashboards.
  • Provide engineered datasets and features that support AI/ML model development and training.
  • Work closely with analytics teams to ensure semantic models and datasets support enterprise reporting needs.

Technical Competencies

  • Hands-on experience with Power BI, including semantic model design, DAX, DirectQuery and Import modes, and enterprise data source integration.
  • Strong proficiency in Python and/or Scala for data transformation, pipeline development, and automation.
  • Solid understanding of data lake design principles, including Delta Lake or Apache Iceberg table formats, schema evolution, and medallion architecture (Bronze, Silver, Gold layers).
  • Experience working with data cataloging and metadata management tools such as Apache Atlas, Alation, or similar platforms.
  • Strong knowledge of SQL, NoSQL databases, and modern data warehousing concepts, including dimensional modeling.

Required Qualifications & Experience

  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related discipline.
  • Minimum of 3 years of practical experience in data engineering, including building and deploying production data pipelines.
  • Proven experience with Data Fabric environments or comparable Hadoop- or S3-based data platforms.
  • Strong expertise in Apache Spark or Sparkflow (PySpark or Scala) for distributed data processing.
  • Advanced SQL capabilities and experience optimizing data models for BI tools such as Power BI.
  • Proficiency in Python for data transformation, scripting, and pipeline automation.
  • Deep understanding of distributed computing, data warehousing architectures, and data modeling techniques.

Preferred Skills

  • Experience with MLOps or DataOps frameworks such as Apache Airflow, Docker, or CI/CD automation pipelines.
  • Familiarity with real-time streaming technologies such as Kafka or Amazon Kinesis.
  • Experience working in regulated industries such as banking, financial services, or government systems.
  • Professional certifications related to HPE Ezmeral, Microsoft Power BI, or Cloud Data Engineering platforms.
 
 
