Senior AI/ML Engineer

Chennai, India | Senior | Posted 2026-05-12

Don't apply into the void — reach the hiring manager

ResuMail finds the recruiters and hiring managers behind this Senior AI/ML Engineer role at Zocket, drafts a personalised outreach email, and schedules the send — so your application actually gets seen.


About this role

We are looking for a Senior AI/ML Engineer who can take AI features from prototype to production with confidence. You will own the full lifecycle of our LLM-powered systems, from benchmarking model and pipeline performance, to hardening the stack for scale, to shipping it live to real users.

This role sits at the intersection of applied LLM/GenAI work and MLOps, and is critical to how quickly and reliably we can put new AI capabilities in front of customers. You will work closely with product, backend, and design to make sure what we ship is fast, accurate, cost-efficient, and observable in production.

What you will do

• Design, build, and ship LLM-powered features end-to-end, including RAG pipelines, agentic workflows, prompt orchestration, and fine-tuning where it makes sense.
• Define and run benchmarking frameworks for our AI applications: latency, throughput, accuracy, hallucination rate, cost per request, and quality regressions across model and prompt changes.
• Establish offline evals (golden sets, LLM-as-judge, human-in-the-loop) and online evals (A/B tests, shadow traffic, canary releases) before any model or prompt goes live.
• Take models and pipelines to production: containerize, deploy, autoscale, and monitor inference services with clear SLOs for latency, error rate, and cost.
• Build the MLOps backbone: CI/CD for models and prompts, versioning, feature stores where needed, observability (traces, metrics, logs), and rollback paths.
• Optimize inference performance and cost: batching, caching, quantization, distillation, model routing, and choosing the right managed vs. self-hosted trade-offs.
• Partner with product to translate fuzzy product asks into measurable AI quality bars, and own the “is this good enough to ship?” decision with data behind it.
• Mentor other engineers on LLM best practices, eval rigor, and production readiness.

What we are looking for

• 3–6 years of engineering experience, with a meaningful portion spent shipping ML or AI systems to production (not just notebooks or POCs).
• Strong hands-on experience with LLMs and GenAI: at least one production system using OpenAI / Anthropic / open-source models, plus practical experience with RAG, embeddings, vector stores, and prompt engineering.
• Solid MLOps foundation: model serving (FastAPI, vLLM, Triton, SageMaker, or similar), containerization (Docker, Kubernetes), and at least one cloud (AWS, GCP, or Azure).
• Demonstrated ability to benchmark systems rigorously: you can talk concretely about how you measured a model’s quality and performance, what you optimized, and what you knowingly traded off.
• Strong Python skills; comfortable with PyTorch or TensorFlow, and with frameworks like LangChain, LlamaIndex, or equivalent (or a clear point of view on why not to use them).
• Good engineering discipline: testing, code review, clear API design, and the instinct to add observability before it is needed.
• Comfortable owning the path to production: you have taken something live, watched it break, and fixed it, and you do not need a separate team to do that for you.

Bonus points

• Experience fine-tuning or post-training open-source models (LoRA/QLoRA, DPO, RLHF).
• Worked with multimodal models (image, video, or audio generation/understanding).
• Built or contributed to an internal eval harness or LLM observability tooling.
• Experience with high-QPS, low-latency inference at consumer scale.
• Open-source contributions or technical writing in the AI/ML space.

What success looks like in your first 6 months

• You have shipped at least one customer-facing AI feature to production, fully owned by you.
• A benchmarking and evals framework is in place, run on every model or prompt change, with results visible to the team.
• Production AI services have clear SLOs, dashboards, and alerting, and you can answer “what is this costing us per 1,000 requests?” without thinking.
• The team’s velocity on shipping AI features has measurably increased because of the infrastructure and patterns you put in place.

Why Zocket

Zocket is building the AI layer for marketing, letting any business create high-performing ads, creatives, and campaigns in minutes instead of weeks. AI is not a side project here; it is the product. You will work on systems that thousands of businesses use every day, with a tight team, fast feedback loops, and a real mandate to ship.

How to apply

Send your resume and links to anything you have shipped (GitHub, projects, papers, demos) to careers@zocket.com. If you have taken an AI system from prototype to production and have a story about how you knew it was ready, lead with that.

How to get this job at Zocket

Don't rely on the portal. Cold applications for a role like Senior AI/ML Engineer land in a pile of hundreds. A direct, personalised message to the hiring manager or a referrer is the fastest way in.
Find the right person. ResuMail surfaces the actual recruiters and hiring managers at Zocket — not a generic careers inbox.
Send tailored outreach. ResuMail drafts an email personalised to your resume and this role, then paces and schedules sends so you stay out of spam.
Follow up. One polite nudge after 5–7 days roughly doubles reply rates — scheduled for you.

Reach Zocket's hiring managers today.

Free to start. No credit card. Built for Indian job seekers.
