Job description / Role
• Define the architecture and scope of, and deliver, various Big Data solutions.
• Build and maintain large-scale data lake deployments that power analytics
• Build and maintain large-scale deployments of Elasticsearch
• Convert healthcare data to multiple formats while keeping the required compliance controls in place
• Support other teams by providing guidance on data modelling, data usage, and processing, and on how they can best leverage the platform
• Build scalable data pipelines to ingest data from a variety of data sources, identify critical data elements and define data quality rules.
• Leverage Spark/Hadoop ecosystem knowledge to design and develop capabilities to deliver innovative and improved data solutions.
• Provide insights on areas of improvement, including data governance, best practices, and large-scale processing
• Support bug fixing and performance analysis across the data pipeline
• Collaborate, coach, and mentor colleagues in an energetic, growing team
• Complete end-to-end ownership from requirements to go-live
• Embrace ambiguity and prioritize the right items
• Manage stakeholders and drive business goals
• Take on challenges and step outside your comfort zone
Qualifications and Experience:
• 4+ years of experience as a software engineer, with strong skills in at least one programming language (preferably Scala, Java, or Python)
• 1+ years of experience with MLOps
• 1+ years of experience as a Big Data Engineer or in a similar role
• 1+ years of experience with Hadoop and/or Spark
• Expertise in distributed systems and in designing and implementing for reliability, availability, scalability, and performance
• Proven experience with cloud technologies such as object/blob storage, MapReduce, and infrastructure as code
• Creative and innovative approach to problem-solving
• Experience with CI/CD using Jenkins, Terraform, or related technologies
• Familiarity with containerization platforms such as Docker and Kubernetes
• Experience with real-time data processing using Kafka, Spark Streaming, or similar technologies
• Experience working with Delta Lake, Parquet, Elasticsearch, and NoSQL datastores
• Experience working with Hive, Presto or other querying frameworks.
• A high level of attention to detail and the ability to produce accurate and consistent engineering documentation
• A desire to contribute to and maintain the company values and culture, and the ability to work collaboratively in cross-cultural teams
About the Company
Group 42 is an Abu Dhabi based artificial intelligence (AI) and cloud computing company, uniquely positioned in the national ecosystem to develop and deploy holistic and scalable AI solutions.
• Industry Solutions: an experienced team of data scientists and engineers based in Abu Dhabi.
• Fundamental AI Research: through our subsidiary, the Inception Institute of Artificial Intelligence (IIAI), on AI, big data, and machine learning.
• Cloud Computing Infrastructure: the largest and most powerful cloud computing capability in the region.
• Multidisciplinary and diverse team.
G42 has an active and extensive partnership network, connecting leading international organizations that complement our ecosystem and support our vision. Our partnerships range from strategic teaming agreements and joint ventures to direct investment by G42.