SDE II - ML/AI Engineer

Jaipur, India | Mid | Posted 2026-04-16

Don't apply into the void — reach the hiring manager

ResuMail finds the recruiters and hiring managers behind this SDE II - ML/AI Engineer role at Auriga IT, drafts a personalised outreach email, and schedules the send — so your application actually gets seen.

About this role

Job Summary

We’re seeking a hands-on GenAI & Computer Vision Engineer with 3–5 years of experience delivering production-grade AI solutions. You must be fluent in the core libraries, tools, and cloud services listed below, and able to own end-to-end model development, from research and fine-tuning through deployment, monitoring, and iteration. In this role, you’ll tackle domain-specific challenges like LLM hallucinations, vector search scalability, real-time inference constraints, and concept drift in vision models.

Key Responsibilities

Generative AI & LLM Engineering
Fine-tune and evaluate LLMs (Hugging Face Transformers, Ollama, LLaMA) for specialized tasks
Deploy high-throughput inference pipelines using vLLM or Triton Inference Server
Design agent-based workflows with LangChain or LangGraph, integrating vector databases (Pinecone, Weaviate) for retrieval-augmented generation
Build scalable inference APIs with FastAPI or Flask, managing batching, concurrency, and rate-limiting

Computer Vision Development
Develop and optimize CV models (YOLOv8, Mask R-CNN, ResNet, EfficientNet, ByteTrack) for detection, segmentation, classification, and tracking
Implement real-time pipelines using NVIDIA DeepStream or OpenCV (cv2); optimize with TensorRT or ONNX Runtime for edge and cloud deployments
Handle data challenges (augmentation, domain adaptation, semi-supervised learning) and mitigate model drift in production

MLOps & Deployment
Containerize models and services with Docker; orchestrate with Kubernetes (KServe) or AWS SageMaker Pipelines
Implement CI/CD for model/version management (MLflow, DVC), automated testing, and performance monitoring (Prometheus + Grafana)
Manage scalability and cost by leveraging cloud autoscaling on AWS (EC2/EKS), GCP (Vertex AI), or Azure ML (AKS)

Cross-Functional Collaboration
Define SLAs for latency, accuracy, and throughput alongside product and DevOps teams
Evangelize best practices in prompt engineering, model governance, data privacy, and interpretability
Mentor junior engineers on reproducible research, code reviews, and end-to-end AI delivery

Required Qualifications

You must be proficient in at least one tool from each category below:
LLM Frameworks & Tooling: Hugging Face Transformers, Ollama, vLLM, or LLaMA
Agent & Retrieval Tools: LangChain or LangGraph; RAG with Pinecone, Weaviate, or Milvus
Inference Serving: Triton Inference Server; FastAPI or Flask
Computer Vision Frameworks & Libraries: PyTorch or TensorFlow; OpenCV (cv2) or NVIDIA DeepStream
Model Optimization: TensorRT; ONNX Runtime; Torch-TensorRT
MLOps & Versioning: Docker and Kubernetes (KServe, SageMaker); MLflow or DVC
Monitoring & Observability: Prometheus; Grafana
Cloud Platforms: AWS (SageMaker, EC2/EKS) or GCP (Vertex AI, AI Platform) or Azure ML (AKS, ML Studio)
Programming Languages: Python (required); C++ or Go (preferred)

Additionally:
Bachelor’s or Master’s in Computer Science, Electrical Engineering, AI/ML, or a related field
3–5 years of professional experience shipping both generative and vision-based AI models in production
Strong problem-solving mindset; ability to debug issues like LLM drift, vector index staleness, and model degradation
Excellent verbal and written communication skills

Typical Domain Challenges You’ll Solve

LLM Hallucination & Safety: Implement grounding, filtering, and classifier layers to reduce false or unsafe outputs
Vector DB Scaling: Maintain low-latency, high-throughput similarity search as embeddings grow to millions
Inference Latency: Balance batch sizing and concurrency to meet real-time SLAs on cloud and edge hardware
Concept & Data Drift: Automate drift detection and retraining triggers in vision and language pipelines
Multi-Modal Coordination: Seamlessly orchestrate data flow between vision models and LLM agents in complex workflows

About Company

Hi there! We are Auriga IT. We power businesses across the globe through digital experiences, data and insights. From the apps we design to the platforms we engineer, we're driven by an ambition to create world-class digital solutions and make an impact. Our team has been part of building the solutions for the likes of Zomato, Yes Bank, Tata Motors, Amazon, Snapdeal, Ola, Practo, Vodafone, Meesho, Volkswagen, Droom and many more.

We are a group of people who just could not leave our college life behind, and the inception of Auriga was solely based on a desire to keep working together with friends and enjoying the extended college life.

Who has not dreamt of working with friends for a lifetime? Come join in.

Our website: https://aurigait.com/

How to get this job at Auriga IT

Don't rely on the portal. Cold applications for a role like SDE II - ML/AI Engineer land in a pile of hundreds. A direct, personalised message to the hiring manager or a referrer is the fastest way in.
Find the right person. ResuMail surfaces the actual recruiters and hiring managers at Auriga IT — not a generic careers inbox.
Send tailored outreach. ResuMail drafts an email personalised to your resume and this role, then paces and schedules sends so you stay out of spam.
Follow up. One polite nudge after 5–7 days roughly doubles reply rates — scheduled for you.

Reach Auriga IT's hiring managers today.

Free to start. No credit card. Built for Indian job seekers.
