About Impact Analytics
Impact Analytics™ (Series D Funded) delivers AI-native SaaS solutions and consulting services that help companies maximize profitability and customer satisfaction through deeper data insights and predictive analytics. With a fully integrated, end-to-end platform for planning, forecasting, merchandising, pricing, and promotions, Impact Analytics empowers companies to make smarter decisions based on real-time insights rather than relying on last year’s inputs to forecast and plan this year’s business. Powered by over one million machine learning models, Impact Analytics has been leading AI innovation for a decade, setting new benchmarks in forecasting, planning, and operational excellence across the retail, grocery, manufacturing, and CPG sectors. In 2025, Impact Analytics is at the forefront of the Agentic AI revolution, delivering autonomous solutions that enable businesses to adapt in real time, optimize operations, and drive profitability without manual intervention.
Role Summary
The AI Lead owns technical execution for a defined domain within the company’s applied AI platform — spanning our GenAI-powered reasoning and analytics platform, our agentic AI authoring and runtime platform, and emerging multimodal / computer vision workloads. You translate a prioritised backlog and architectural guidelines into shipped, production-grade AI systems.
This is a hands-on engineering leadership role. You write and review production code, drive design decisions within your domain, mentor 3–5 engineers, and partner with the AI Architect and Engineering Manager on cross-cutting platform concerns. You do not define the organisation’s reference architecture end-to-end — you implement and evolve it within your domain, flag issues upward, and own quality within your squad.
Key Responsibilities:
1. Platform Implementation & Domain Ownership
Implement and evolve features across the GenAI reasoning and agentic AI platform in alignment with the reference architecture defined by the AI Architect.
Own the technical design and delivery of your squad’s workstreams — including data flow, API contracts, component boundaries, and integration points with upstream/downstream systems.
Write and review production-quality Python code; enforce code standards, conduct PR reviews, and maintain a clean, well-tested codebase.
Participate in build-vs-buy-vs-open-source evaluations for your domain; provide structured trade-off analysis to the AI Architect, who makes the final decision.
Contribute to Architecture Decision Records (ADRs) for decisions within your scope; review ADRs from peer squads.
2. Agentic AI Systems
Build and ship agent workflows (ReAct, planner–executor, reflection, human-in-the-loop checkpoints) using the chosen orchestration framework (LangGraph / LlamaIndex Workflows / Semantic Kernel or equivalent).
Implement tool and MCP (Model Context Protocol) integrations for connecting agents to enterprise systems (CRM, ITSM, REST/GraphQL APIs) with scoped auth (OAuth2/OBO), sandboxing, and rate limiting.
Own agent evaluation coverage for your squad: task-completion benchmarks, tool-call success rates, trajectory evals, and cost-per-task regression tests.
Implement agent memory patterns (short-term conversation context, long-term episodic/semantic memory) with TTL and retrieval policies defined in collaboration with the AI Architect.
3. GenAI & RAG Systems
Implement and iterate on RAG pipelines: hybrid retrieval (BM25 + dense + rerankers), query decomposition, GraphRAG for knowledge-graph workloads, and citation grounding.
Execute on the fine-tune vs. prompt vs. in-context-learning decision set by the AI Architect; own fine-tuning runs (LoRA/QLoRA/PEFT) end-to-end when required.
Apply context engineering patterns: prompt contracts, structured outputs (JSON Schema, Pydantic, function-calling), constrained decoding, output validators, and fallback chains.
Implement grounding and hallucination-mitigation mechanisms: retrieval confidence scoring, “I don’t know” paths, claim-level validators.
4. Multimodal & Computer Vision (Contextual Scope)
Where the product roadmap requires it, build VLM-first pipelines (GPT-4V class, Gemini, Qwen-VL) and integrate CV outputs (OCR, document understanding, layout parsing) as tools inside the agentic platform.
Run benchmarks on open-weight VLMs (Qwen-VL, InternVL) vs. closed VLM APIs for your squad’s use cases; report findings with cost-latency-accuracy data.
5. LLMOps, Evaluation & Observability
Instrument telemetry for your squad’s features: token cost, latency p50/p95, tool-call success rates, retrieval hit rates, cache hit rates — per-tenant where applicable.
Maintain continuous evaluation coverage: hallucination, faithfulness, jailbreak resistance, and drift detection for all features in your scope; treat prompts as code (versioned, reviewed, rolled back).
Operate and extend the LLMOps toolchain (LangSmith / Langfuse / Arize / Ragas / DeepEval / Promptfoo) as established by the platform team.
6. Responsible AI, Security & Governance
Implement AI security controls for your squad’s features: prompt-injection defense, PII detection and redaction, RBAC on agent tools, and secrets handling for tool auth.
Ensure features in your scope meet compliance requirements (SOC 2 Type II, GDPR, DPDP) as defined by the platform governance team.
Apply responsible AI guardrails: topic filters, PII egress controls, usage-policy enforcement at gateway level.
7. Inference Economics & Performance
Profile and optimise inference for your squad’s workloads: batching, KV-cache utilisation, speculative decoding, and quantisation (AWQ, GPTQ, INT4) under guidance from the AI Architect.
Make SLM-vs-LLM routing recommendations for individual capabilities (e.g., Phi-3.5 / Qwen-2.5-7B vs. frontier API) with supporting cost-latency-quality data.
Implement and instrument caching layers (prompt cache, semantic cache, retrieval cache) with invalidation policies.
8. Technical Leadership
Mentor 3–5 engineers (senior and mid-level) through design reviews, PR feedback, pair programming, and career conversations.
Run squad-level design sessions; surface cross-cutting concerns to the AI Architect and Engineering Manager proactively.
Partner with Product, SRE, and Security on sprint-level execution, incident triage, and feature flag / release decisions.
Participate as a technical interviewer for senior-engineer and specialist roles on the AI team.
Required Qualifications
8+ years of software engineering experience, with 3+ years shipping production AI/ML systems and 1+ year delivering LLM-powered applications (RAG, agents, fine-tuning) at meaningful scale.
Hands-on proficiency in at least one agent orchestration framework (LangGraph, Semantic Kernel, CrewAI, AutoGen, DSPy) — with clear opinions on their limitations.
Production experience with vector stores (pgvector, Pinecone, Weaviate, Milvus, Qdrant), embedding models (open + proprietary), and hybrid retrieval with rerankers.
Working knowledge of fine-tuning (LoRA/QLoRA/PEFT) and inference optimisation (vLLM, quantisation, batching) — implementation-level, not just conceptual.
Solid MLOps / LLMOps foundations: experiment tracking, CI/CD for ML, prompt versioning, and evaluation pipelines (LLM-as-judge, RAG triad metrics, adversarial suites).
Cloud experience on at least one of AWS (Bedrock, SageMaker), Azure (AI Foundry, Azure OpenAI), or GCP (Vertex AI) — including cost management, not just API usage.
Python proficiency at senior-engineer level; working knowledge of PyTorch and the HuggingFace stack.
Comfortable with LLM evaluation methodologies: faithfulness / answer-relevance / context-relevance, agent trajectory evals, and human-in-the-loop eval workflows.
Strong communication — can write a clear design doc that both a product manager and a senior engineer find useful, and can defend technical trade-offs in sprint planning.
Preferred Qualifications
Experience building features for a multi-tenant GenAI SaaS product (not a pilot or internal tool).
Hands-on with MCP (Model Context Protocol), A2A, or equivalent agent-tool interoperability standards.
Exposure to VLMs / multimodal systems (GPT-4V, Gemini, Qwen-VL, InternVL, LLaVA) and VLM fine-tuning.
Experience with small language models (SLMs) and on-prem / edge deployment (Phi, Gemma, Qwen, Llama 3.1-8B class).
Familiarity with graph databases (Neo4j, Neptune) for GraphRAG and knowledge-graph workloads.
Public contributions to AI open-source, technical blog posts, or conference talks are a plus.
Graduate degree (MS) in CS, ML, or a related field; shipped systems outweigh credentials.
What This Role Is NOT
A people-manager role — you lead through technical mentorship and delivery, not through headcount. An Engineering Manager owns squad delivery and people processes.
A research scientist role — research-to-production translation is the job; pure research is not.
A prompt engineer role — prompting is a tool you use daily, not your primary deliverable.
A data science role — statistical modelling matters, but systems engineering is the centre of gravity.
What we offer
An opportunity to be part of some of the best enterprise SaaS products to be built out of India
Opportunities to quench your thirst for problem-solving, experimenting, learning, and implementing innovative solutions
A flat, collegial work environment with a work-hard, play-hard attitude
A platform for rapid growth if you are willing to try new things without fear of failure.
Remuneration in line with best-in-class industry standards, plus generous health insurance cover
Our accolades include:
Ranked #72 in America’s Most Innovative Companies list in 2023 alongside companies like Microsoft, Tesla, Apple, IBM, etc.
Ranked as one of America’s fastest-growing companies by Financial Times for four consecutive years: 2020-2023.
Ranked as one of America’s fastest-growing private companies by Inc 5000 for six consecutive years: 2018-2023.
Recognized in multiple Gartner reports, including Market Guides and Hype Cycle, spanning assortment, merchandising, forecasting, algorithmic retailing, and Unified Price, Promotion, and Markdown Optimization Applications.
Featured as one of the top 25 ML startups to watch by Forbes in 2019.
Ranked as one of North America’s fastest-growing technology companies by Deloitte for two consecutive years: 2019 and 2020.