Senior Data Engineer (Java and Hadoop)

ResearchGate

We connect the world of science and make research open to all.

About the Company

ResearchGate is the professional network for scientists and researchers, created by scientists to foster scientific collaboration and drive progress for a better world. It is the largest professional network serving over 25 million members from more than 193 countries. ResearchGate’s mission is to connect the world of science and make research open to all by providing scientists with the tools and support needed to engage with the most active scientific community globally. The company embraces a remote-first culture with flexible and mobile working options and is based in Berlin.

About the Role

The Senior Data Engineer will collaborate with Data Scientists to bring machine learning (ML) systems into production. The role involves owning the ongoing maintenance and improvement of ML-related components, with a focus on recommendation-related products. The engineer will develop, implement, and maintain Java- and Python-based services (REST, queue, and batch-based), build efficient batch and stream processing pipelines to handle large-scale data workloads, and create robust, evolvable solutions with attention to quality of service and data integrity.

Responsibilities

  • Build infrastructure enabling workflows involving large datasets and/or ML models in production using distributed computing and big data processing technologies
  • Take ownership of technical design, monitoring, and maintenance of Java and Python microservices
  • Proactively identify and evaluate opportunities for new data products and automation
  • Assess impact, risks, and technical/data feasibility of potential new data products or automation initiatives in collaboration with engineering and product teams
  • Contribute to developing the overall ML strategy for the function or business unit
  • Support team development by preparing and running trainings, mentoring colleagues, participating in hiring processes, and engaging in employer branding initiatives

Required Skills

  • Expert knowledge of Java (5+ years of experience) with working knowledge of Python, especially in ML ecosystems
  • Experience designing and implementing batch and streaming data pipelines
  • Proven track record developing microservices with a strong understanding of REST principles
  • Familiarity with DevOps tools such as Docker and Kubernetes
  • Proficiency in SQL; experience with BigQuery is a plus
  • Experience working in cloud services environments
  • Excellent command of English with strong communication skills

Preferred Qualifications

  • Experience with a wide range of data technologies including queue-based integration (Kafka, ActiveMQ), NoSQL databases (e.g., MongoDB), and Big Data tools (Hadoop ecosystem, Flink)
  • Experience applying large language model (LLM)-based approaches in production systems
  • Experience with recommendation or search applications

The ideal candidate excels at building maintainable, efficient, and scalable software, enjoys deep-diving into challenging problems, is skilled in optimizing code for performance and stability, thrives in collaborative agile environments, and is motivated by ResearchGate’s mission to change science for the better.
