Big Data/Hadoop Developer

HSTechnologies LLC

Value Driven Solutions Delivered

Job Summary:

We’re looking for a skilled professional to lead the design, architecture, and development of scalable machine learning components and services. You’ll play a key role in enabling advanced analytics by managing large-scale data ingestion and infrastructure.

Key Responsibilities:

  • Design and develop systems for large-scale machine learning applications.

  • Manage data ingestion from a variety of sources, including SAP and non-SAP systems such as iEnergy and MDMS.

  • Maintain and enhance a big data platform tailored to the organization’s needs.

  • Recommend the right tools, frameworks, and architectures for batch and real-time data processing.

  • Design solutions at the cluster level and develop enterprise-grade applications with thorough unit testing.

  • Build Talend data pipelines that move files from source systems to SFTP, and from SFTP into Hadoop landing zones.

  • Create automated ingestion frameworks in Talend to synchronize data between Hadoop and SAP HANA in both directions.

  • Execute complex data queries using techniques such as bucketing, partitioning, joins, and sub-queries (an illustrative query sketch appears after this list).

  • Develop advanced big data applications using both functional and object-oriented programming techniques.

  • Use Spark with Scala to perform complex data transformations through the DataFrame and Dataset APIs (see the transformation sketch after this list).

  • Build standalone Spark/Scala apps that analyze error logs from multiple sources and perform data validations.

  • Write build scripts with tools such as Apache Maven, Ant, and SBT, and deploy code through Jenkins as part of a CI/CD pipeline (a minimal SBT build sketch follows the list).

  • Develop complex workflow jobs in Redwood and configure job scheduling systems for Hadoop, Hive, Sqoop, and Spark.

  • Monitor and troubleshoot pipeline jobs and configure Redwood Scheduler properties.

  • Create Kafka producers to publish real-time streaming data (see the producer sketch after this list).

  • Mentor junior engineers and contribute to team knowledge sharing.

  • Document technical and functional specifications in line with company standards.

  • Perform real-time data validation and cleanup using Spark, Spark Streaming, and Scala (see the streaming sketch after this list).
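
Illustrative Code Sketches:

The sketches below are minimal, hypothetical examples of the kinds of work described above. All table names, topics, file paths, hostnames, and versions are assumptions for illustration, not details of this role.

First, a sketch of a complex query run through Spark SQL against a partitioned Hive table, combining partition pruning, a join, and an IN sub-query (the meter_reads/customers tables and their columns are invented):

    // Hypothetical sketch: querying a partitioned Hive table via Spark SQL.
    import org.apache.spark.sql.SparkSession

    object QuerySketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("query-sketch")
          .enableHiveSupport()  // needed to read Hive-managed tables
          .getOrCreate()

        val result = spark.sql(
          """SELECT r.meter_id, r.kwh, c.region
            |FROM meter_reads r
            |JOIN customers c ON r.customer_id = c.customer_id
            |WHERE r.read_date = '2025-01-01'  -- prunes partitions on read_date
            |  AND r.customer_id IN (SELECT customer_id FROM customers WHERE region = 'TX')
            |""".stripMargin)

        result.show(20)
        spark.stop()
      }
    }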
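
Next, a sketch of a DataFrame-to-Dataset transformation; the MeterRead record shape and the input/output paths are purely illustrative:

    // Hypothetical sketch of DataFrame and Dataset transformations in Spark/Scala.
    import org.apache.spark.sql.{SparkSession, functions => F}

    case class MeterRead(meterId: String, readDate: String, kwh: Double)

    object TransformSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("transform-sketch").getOrCreate()
        import spark.implicits._

        // Untyped DataFrame API: aggregate daily usage per meter.
        val daily = spark.read.parquet("/landing/meter_reads")
          .groupBy($"meterId", $"readDate")
          .agg(F.sum($"kwh").as("kwh"))

        // Typed Dataset API: filter with fields checked at compile time.
        val heavyUsers = daily.as[MeterRead].filter(_.kwh > 100.0)

        heavyUsers.write.mode("overwrite").parquet("/curated/heavy_users")
        spark.stop()
      }
    }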
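
A minimal SBT build definition of the sort referenced above; the library versions and the Provided scope (Spark supplied by the cluster at runtime) are assumptions:

    // Hypothetical build.sbt for a Spark/Scala project.
    name := "bigdata-sketch"
    version := "0.1.0"
    scalaVersion := "2.12.18"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-sql"     % "3.3.2" % Provided, // ships with the cluster
      "org.apache.kafka"  % "kafka-clients" % "3.3.2"
    )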
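
A sketch of a minimal Kafka producer in Scala using the standard Java client (kafka-clients); the broker address, topic, and payload are assumptions:

    // Hypothetical Kafka producer publishing one JSON event.
    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}

    object ProducerSketch {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092")
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
          "org.apache.kafka.common.serialization.StringSerializer")
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
          "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        try {
          // send() is asynchronous; flush() forces delivery before close.
          producer.send(new ProducerRecord("meter-events", "meter-42", """{"kwh": 1.3}"""))
          producer.flush()
        } finally {
          producer.close()
        }
      }
    }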
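
Finally, a sketch of real-time validation and cleanup, shown here with Spark Structured Streaming (used in place of the older DStream API); the topic, schema, and paths are assumptions:

    // Hypothetical streaming job: parse JSON from Kafka, drop malformed or impossible rows.
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, from_json}
    import org.apache.spark.sql.types.{DoubleType, StringType, StructType}

    object StreamCleanupSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("stream-cleanup").getOrCreate()
        val schema = new StructType().add("meterId", StringType).add("kwh", DoubleType)

        val cleaned = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "meter-events")
          .load()
          .select(from_json(col("value").cast("string"), schema).as("r"))
          .select("r.*")
          .filter(col("meterId").isNotNull && col("kwh") >= 0) // validation rules

        cleaned.writeStream
          .format("parquet")
          .option("path", "/curated/meter_events")
          .option("checkpointLocation", "/checkpoints/meter_events")
          .start()
          .awaitTermination()
      }
    }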

Required Qualifications:

  • Bachelor’s degree in Computer Science or a related field.

  • Experience with technologies such as Cloudera Hadoop (CDH), Cloudera Manager, Informatica BDM, HDFS, YARN, MapReduce, Hive, Impala, Kudu, Sqoop, Spark, Kafka, HBase, Teradata, Tableau, Kerberos, Active Directory, Sentry, TLS/SSL, Linux/RHEL, Unix, Windows, SBT, Maven, Jenkins, Oracle, MS SQL Server, Shell scripting, Eclipse IDE, Git, and SVN.

  • Strong analytical and problem-solving skills.

  • Ability to assess complex issues, analyze related data, and develop effective solutions.

To Apply:

If you’re interested in a dynamic, fast-paced work environment and enjoy challenging and impactful projects, please send your resume to:

HSTechnologies LLC
2801 W Parker Road, Suite #5
Plano, TX 75023
Email: [email protected]
