Working with a simple aim: to keep all our consultants and customers happy
About the Company
Centraprise is a leading provider of IT consulting and services, focused on driving innovation and delivering high-quality solutions to clients. The company is committed to fostering growth and collaboration, creating a work environment that encourages creativity and problem-solving to meet industry demands.
About the Role
A highly skilled Hadoop Developer is needed to join a dynamic team. With 5+ years of experience in data warehousing and Hadoop ecosystems, the ideal candidate will have a solid understanding of Cloudera, Hadoop, and Big Data technologies such as Hive, Spark, Python, and Scala. The role involves working with a Big Data implementation in a production environment, applying the latest tools within the Hadoop ecosystem, and solving complex technical challenges. Proficiency in Python, Unix shell scripting, Autosys, and SQL is required, with a strong focus on performance tuning and query optimization.
Key Responsibilities
- Design, develop, and implement Big Data solutions using Hadoop, Spark, Hive, and related technologies.
- Collaborate with multiple technology teams to ensure successful project delivery.
- Manage scheduling, change controls, and delivery timelines to meet project requirements.
- Optimize and tune complex SQL queries for performance.
- Assist with the development and implementation of best practices for data warehousing and data pipeline management.
- Facilitate clear communication between technical teams and stakeholders.
- Identify risks and resolve critical path issues to maintain project momentum.
Required Skills and Experience
- 5+ years of experience in data warehousing and Big Data technologies.
- Minimum of 4 years of experience with Cloudera and the Hadoop ecosystem (Hive, Spark, etc.).
- Proficiency in Python, Scala, SQL, and Unix shell scripting.
- Experience with Autosys and knowledge of Agile methodologies.
- Strong expertise in query optimization and performance tuning.
- Familiarity with distributed systems and relational databases.