Hadoop Data Engineer

DGTL Performance

BI at the service of businesses.

About the Company

DGTL Performance is a digital services company specializing in data and business intelligence. Founded in 2018, the company serves clients in France and abroad, offering services such as subcontracting, pre-hire placement, recruitment, and business outsourcing. With a focus on proximity, fair pricing, and ethical responsibility, DGTL supports its clients in optimizing their data strategies and solutions.

About the Role

The Data Engineer (Spark/Hadoop) will play a crucial role in managing and optimizing big data pipelines. The position involves working with cutting-edge technologies across the Hadoop ecosystem, with a focus on high-performance batch processing, data transformations, and system optimization. The engineer will join a dynamic team, collaborating with clients in the insurance sector and ensuring the effective implementation of data solutions.

Responsibilities

  • Lead and mentor the team, providing guidance on both technical and functional aspects.
  • Analyze and simplify complex technical concepts for easier understanding and communication.
  • Manage and optimize Spark-based data pipelines, ensuring high efficiency and performance.
  • Work on Hadoop ecosystem components such as Hive, HDFS, YARN, and HBase.
  • Implement data quality measures, data transformations, and mapping processes.
  • Apply Agile methodologies (SAFe) and tools like JIRA in daily operations.
  • Support data governance practices and ensure effective data lifecycle management.
  • Collaborate with teams in the insurance industry (life insurance and provident insurance).
  • Ensure optimal performance of batch processing in Spark (illustrated in the sketch after this list).
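
For context, the sketch below illustrates the kind of Spark batch work the role describes: a minimal Java job that reads from HDFS, applies a basic data-quality filter and transformation, and tunes partitioning before writing back. All paths, column names, and configuration values are hypothetical placeholders chosen for illustration; they are not taken from the vacancy.

```java
// Illustrative sketch only. Paths, columns, and config values are hypothetical.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;

public class PolicyBatchJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("policy-batch-job")
                // Tune shuffle parallelism to the cluster size (placeholder value).
                .config("spark.sql.shuffle.partitions", "200")
                .getOrCreate();

        // Read raw source data from HDFS (hypothetical path).
        Dataset<Row> policies = spark.read().parquet("hdfs:///data/raw/policies");

        // Basic data-quality filter and a simple column transformation.
        Dataset<Row> cleaned = policies
                .filter(col("policy_id").isNotNull())
                .withColumn("premium_eur", col("premium").cast("double"));

        // Repartition on a key before writing to limit small files and skew.
        cleaned.repartition(col("policy_id"))
                .write()
                .mode("overwrite")
                .parquet("hdfs:///data/curated/policies");

        spark.stop();
    }
}
```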

Required Skills

  • Java: Strong experience in Java programming for data solutions.
  • Spark: Deep expertise in Spark and its optimization for data processing.
  • Hadoop Ecosystem: Excellent knowledge of Hadoop components, including Hive, HDFS, YARN, and HBase.
  • Databases: Experience with both relational and NoSQL databases.
  • CI/CD Tools: Familiarity with Git, Jenkins, and Kubernetes for continuous integration and deployment.
  • Data Governance: Strong understanding of data governance and management practices.

Preferred Qualifications

  • Experience with streaming systems like Spark Streaming and Kafka.
  • Familiarity with MapR, Oozie, and Airflow.
  • Expertise in data modeling for analytical and transactional systems.
  • Knowledge of best practices for data governance and lifecycle management.
