ROLE SUMMARY
As Manager – AI Engineering, you will lead a multidisciplinary team of AI Engineers and Data Scientists delivering GenAI-powered solutions, agentic AI systems, LLM-integrated applications, machine learning models, and retrieval-enabled intelligent products. This role combines technical leadership, people management, delivery accountability, and product ownership. You will be responsible for ensuring that the team delivers scalable, secure, supportable, and business-relevant AI and ML solutions from discovery through production operation and continuous improvement.
This is not a lightweight line-management role. The successful candidate must be technically strong enough to guide senior engineers, data scientists, and technical leads, make sound architectural and implementation trade-offs, and raise the engineering bar of the team. You are expected to provide hands-on technical direction across LLM-powered systems, agent orchestration, RAG, ML model development, MLOps, cloud-native application delivery, integrations, observability, and SDLC discipline, while also building a culture of accountability, ownership, experimentation rigor, and engineering excellence.
You will work closely with Product Managers, Solution Architects, Lead AI Engineers, Lead Data Scientists, Platform teams, DevOps, Security, and business stakeholders to translate business goals into implementable engineering work and durable product outcomes.
KEY RESPONSIBILITIES
Lead and mentor a team of AI Engineers and Data Scientists across varying experience levels, including senior engineers and technical leads, ensuring strong delivery quality, modeling rigor, engineering discipline, and technical growth
Own end-to-end delivery of GenAI, agentic AI, and ML-enabled solutions, from discovery, design, and backlog shaping through development, testing, deployment, monitoring, and continuous improvement
Guide architectural and design decisions across LLM-powered systems, embeddings pipelines, retrieval-augmented generation (RAG), agent orchestration, tool-enabled workflows, ML model pipelines, APIs, and cloud-native application services
Oversee the design, development, evaluation, and operationalization of machine learning models for predictive, classification, recommendation, anomaly detection, forecasting, optimization, or other business use cases where applicable
Ensure strong practices across feature engineering, experimentation, validation, model performance assessment, explainability, drift awareness, and MLOps readiness
Translate business and product priorities into executable engineering and data science plans, technical workstreams, model delivery milestones, and sustainable release outcomes
Review and challenge implementation choices to ensure systems and models are scalable, secure, observable, cost-aware, and maintainable
Partner with Product Managers and engineering leadership to prioritize work, manage technical dependencies, and align delivery with roadmap goals
Establish strong SDLC and build-own-operate practices within the team, including design reviews, code reviews, model reviews, automated testing, release readiness, production support, reliability improvement, and technical debt management
Drive reuse and productivity by scaling frameworks, shared components, prompt templates, orchestration patterns, feature templates, modeling utilities, evaluation frameworks, and internal accelerators across the team
Promote a culture of responsible AI and operational excellence, emphasizing security, token and cost governance, model safety, quality, observability, reproducibility, and supportability
Coordinate with cloud platform, DevOps, Security, Integration, Architecture, and data teams to ensure enterprise readiness of all deployments
Own hiring, onboarding, coaching, performance development, and growth plans for both AI Engineering and Data Science team members
Track and communicate KPIs related to delivery, GenAI adoption, model quality, latency, business impact, reliability, experimentation outcomes, and production performance
Act as the primary technical and delivery escalation point for the team, helping remove blockers and resolve design, modeling, execution, and operational issues
REQUIRED QUALIFICATIONS
10+ years of experience in software engineering, AI / ML engineering, data science, solution engineering, or technology delivery, including strong experience building and operating production-grade intelligent systems
4 to 5+ years of experience delivering or leading AI / ML / GenAI / LLM-powered solutions in enterprise or product environments
Proven experience leading multidisciplinary teams or technical pods delivering LLM-powered products, agentic AI workflows, machine learning models, or AI-enabled application capabilities, with accountability for both technical quality and delivery outcomes
Strong technical depth in Python, modern backend engineering, machine learning solution delivery, API-first architectures, microservices, distributed systems, and cloud-native application delivery
Strong hands-on or design-level experience with LLM platforms and orchestration frameworks such as Azure OpenAI, Azure AI Studio, Semantic Kernel, LangChain, AutoGen, or equivalent platforms used for enterprise GenAI delivery
Strong experience designing or guiding implementations involving retrieval-augmented generation (RAG), embeddings pipelines, vector search, grounding strategies, and retrieval optimization using platforms such as Azure AI Search, Pinecone, Weaviate, FAISS, or equivalent
Strong understanding of machine learning model development, including feature engineering, model training, validation, tuning, evaluation, performance interpretation, and production-readiness considerations
Practical experience guiding or reviewing MLOps practices, including experiment tracking, model versioning, deployment automation, CI/CD for ML, monitoring, drift detection, retraining readiness, and reproducibility
Experience building and deploying AI- and ML-enabled cloud-native services using technologies such as Azure Functions, Azure Container Apps, FastAPI, Docker, Azure DevOps, GitHub, GitHub Actions, Kubernetes / AKS, Azure Machine Learning, Databricks, MLflow, or equivalent engineering and deployment platforms
Strong understanding of CI/CD, containerization, deployment automation, secure delivery practices, and operational readiness for AI-driven and ML-enabled systems
Knowledge of Model Context Protocol (MCP), agent-to-agent (A2A) interaction models, memory / context management approaches, and other distributed AI coordination patterns
Practical experience with observability and operational tooling such as Application Insights, Azure Monitor, OpenTelemetry, Log Analytics, Datadog, New Relic, or equivalent platforms, including monitoring of reliability, latency, cost, runtime behavior, and model / workflow health
Strong understanding of agentic AI implementation patterns, including multi-step orchestration, tool calling, context management, and workflow decomposition
Experience integrating AI- and ML-enabled solutions with REST APIs, enterprise systems, workflow platforms, event-driven services, or downstream business applications
Demonstrated ability to translate business and product needs into scalable, secure, and maintainable AI / ML engineering solutions, while guiding teams on implementation trade-offs, experimentation choices, and delivery sequencing
Strong SDLC ownership mindset across design, build, testing, deployment, support, reliability improvement, model lifecycle management, and long-term maintainability
Proven ability to raise engineering and data science quality through code reviews, model reviews, design guidance, architectural mentoring, coaching of senior engineers and data scientists, and reinforcement of reusable patterns and standards
Strong people leadership capability, including coaching, feedback, performance management, capability development, and fostering accountability and engineering excellence
Strong collaboration and communication skills, with the ability to work effectively across engineering, data science, product, platform, architecture, DevOps, and business stakeholders
PREFERRED QUALIFICATIONS
Experience leading implementations involving agentic AI workflows, multi-agent coordination, tool-enabled automation, reusable orchestration abstractions, or structured task delegation patterns
Experience with AI observability, prompt safety, runtime guardrails, hallucination mitigation, evaluation frameworks, quality monitoring, and enterprise governance practices for GenAI systems
Familiarity with broader enterprise AI and ML platforms such as Microsoft AI Foundry, Azure Machine Learning, PromptFlow, MLflow, Databricks, or equivalent AI / ML lifecycle and experimentation ecosystems
Experience leading or supporting teams working on forecasting, optimization, recommender systems, anomaly detection, classification, NLP, or hybrid ML + GenAI solutions
Experience contributing to reusable GenAI accelerators, internal SDKs, orchestration templates, prompt frameworks, feature templates, evaluation patterns, modeling utilities, or shared engineering utilities
Familiarity with enterprise integration landscapes involving SAP, ServiceNow, API management layers, workflow systems, event buses, and business process platforms
Experience with cost-aware AI and ML delivery, including token usage visibility, model selection trade-offs, compute efficiency, scaling considerations, and engineering productivity optimization
Ability to communicate complex technical and analytical decisions clearly to both engineers and non-technical stakeholders, and to represent team direction confidently in leadership forums
Experience operating in a build-own-operate product environment
Knowledge of responsible AI, AI quality engineering, governance-by-design, model risk awareness, and compliance-aware delivery in enterprise environments