Ref: SP766-02

Job description / Role

Job Type
Full Time
Job Location
Abu Dhabi, UAE
Nationality
Any Nationality
Salary
Not Specified
Gender
Not Specified
Arabic Fluency
Not Specified
Job Function
Administration & Secretarial
Company Industry
Government & Public Sector

AIOps / MLOps Engineer
GovDigital AI Factory is at the forefront of technological innovation across the Abu Dhabi government sector, building cutting-edge AI-powered solutions that radically transform how internal and public-facing government services operate. We are building the next generation of intelligent applications, seamlessly embedding AI to empower individuals and enhance efficiency across the ADGEs. We foster a culture of intellectual curiosity, deep collaboration, and a relentless pursuit of impactful solutions. We are passionate about leveraging the transformative power of Artificial Intelligence and Machine Learning to solve complex national challenges and are seeking a highly talented and driven AIOps/MLOps Engineer to join our purpose-driven and pioneering team.

As an AIOps/MLOps Engineer at GovDigital AI Factory, you will be a critical enabler of our AI vision, responsible for designing, building, and maintaining the infrastructure, tools, and processes that ensure the reliability, scalability, performance, and security of our AI/ML systems throughout their lifecycle. You will work closely with Data Scientists, Data Engineers, and Software Engineers to streamline the development, deployment, and monitoring of both traditional and generative AI models, ensuring seamless integration into the Abu Dhabi government's technology landscape.

What You'll Do:
• Design and implement robust and scalable MLOps pipelines for the end-to-end lifecycle of AI/ML models, from experimentation and training to deployment, monitoring, and governance, ensuring efficiency and reproducibility.
• Develop and maintain the AI/ML infrastructure leveraging cloud-native technologies (primarily Azure) and open-source tools, including compute resources, data storage, and networking, optimized for AI/ML workloads.
• Build and manage CI/CD pipelines for AI/ML models and related code, automating testing, validation, and deployment processes to ensure rapid and reliable delivery of AI capabilities.
• Implement comprehensive monitoring and observability solutions for AI/ML systems, tracking model performance, data drift, infrastructure health, and application logs to proactively identify and resolve issues.
• Develop and integrate AIOps capabilities to automate incident detection, root cause analysis, and remediation for AI/ML infrastructure and applications, improving system uptime and reducing manual intervention.
• Establish and enforce best practices for MLOps, including version control, experiment tracking (e.g., MLflow), model registry, deployment strategies (e.g., A/B testing, canary deployments), and security protocols, aligned with government regulations.
• Collaborate closely with Data Scientists, AI Engineers, and Data Engineers to understand their needs and provide them with the necessary tools and infrastructure to accelerate their research and development efforts.
• Work with Software Engineers to seamlessly integrate AI/ML models into existing government applications and services, ensuring scalability, reliability, and performance.
• Implement and manage data governance and lineage solutions for AI/ML datasets and models, ensuring data quality, compliance, and auditability within the government framework.
• Automate infrastructure provisioning and management using Infrastructure-as-Code (IaC) tools (e.g., Terraform, ARM templates) to ensure consistency and scalability.
• Evaluate and adopt new AIOps/MLOps tools and technologies to continuously improve our AI/ML platform and processes.
• Troubleshoot and resolve issues related to AI/ML infrastructure, pipelines, and deployments in production environments, ensuring minimal disruption to critical government services.
• Document all aspects of the AI/ML infrastructure and MLOps processes clearly and concisely for both technical and non-technical stakeholders.

Requirements:

What You'll Bring:
• Bachelor's or Master's degree in Computer Science, Software Engineering, or a related technical field with a strong focus on automation and system reliability.
• At least 3 years of hands-on experience in building and managing infrastructure and pipelines for machine learning applications in a production environment.
• Strong understanding of the AI/ML lifecycle and the challenges associated with deploying and maintaining AI systems at scale.
• Proven experience with cloud platforms, especially Microsoft Azure, and their AI/ML services (e.g., Azure Machine Learning, Azure Kubernetes Service, Azure Data Factory).
• Extensive experience with containerization technologies (Docker) and orchestration frameworks (Kubernetes).
• Strong scripting and automation skills using languages such as Python, Bash, or PowerShell.
• Experience with CI/CD tools (e.g., Azure DevOps, Jenkins, GitLab CI) and Infrastructure-as-Code (IaC) tools (e.g., Terraform, ARM templates).
• Experience with monitoring and logging tools (e.g., Azure Monitor, Prometheus, Grafana, ELK stack) and the principles of observability.
• Familiarity with MLOps platforms and tools (e.g., MLflow, Kubeflow).
• Understanding of data governance and security best practices in a cloud environment.
• Excellent problem-solving and troubleshooting skills with a systematic approach to identifying and resolving issues in complex systems.
• Strong collaboration and communication skills, with the ability to work effectively with Data Scientists, AI Engineers, Data Engineers, and Software Engineers.
• A proactive and automation-first mindset with a passion for building reliable and efficient AI systems.
• Familiarity with AIOps concepts and tools for intelligent incident management and automation.

Bonus Points For:
• Experience with specific AIOps platforms or tools.
• Experience with deploying and managing generative AI models in production.
• Knowledge of security best practices for AI/ML systems in a government context.
• Experience with performance tuning and optimization of AI/ML infrastructure and pipelines.
• Certifications in relevant cloud platforms or DevOps/MLOps technologies.
• Experience with data lineage and data quality tools.

About the Company

Welcome to Cloud June: Where Innovation Meets Possibility

Cloud June stands at the forefront of software, IT services, and transformative solutions, revolutionising industries by unlocking the true potential of businesses. As a trailblazing powerhouse, we empower companies to ideate, plan, and implement digital solutions for unprecedented growth.

Our innovative suite of offerings is meticulously engineered to guide businesses across industries through the challenges and opportunities presented by the digital age. As your strategic partner, Cloud June is dedicated to navigating the ever-evolving landscape of digital transformation with a focus on groundbreaking technologies and an unwavering commitment to excellence, propelling businesses toward success in an era defined by technological acceleration.
