We’re hiring a Full Stack AI Engineer to build AI-native products end to end: applications, agents, RAG/GraphRAG, NL2SQL, evals, and observability. You’ll turn LLM capabilities into reliable, user-facing features that are measurable, debuggable, and safe in production.
What you’ll do
Build and own full-stack AI features across frontend, backend, and data layers for web applications.
Design agentic workflows (single-agent and multi-agent, including agent-to-agent/A2A) that can plan, route, call tools, and coordinate to complete complex tasks.
Implement and refine RAG pipelines, including retrieval strategies, chunking, embeddings, reranking, and hybrid search across multiple data sources.
Design and operate GraphRAG-style retrieval on top of knowledge graphs to support multi-hop reasoning and relationship-heavy use cases.
Build NL2SQL / NL2DB capabilities that convert natural language into safe, validated queries against SQL databases, warehouses, or analytics systems.
Define and manage tool interfaces and MCP-style capability layers so agents can call internal APIs, SaaS tools, and data services with proper contracts and permissions.
Create evaluation pipelines for prompts, agents, RAG, GraphRAG, NL2SQL, and tool use, including regression tests, LLM-as-judge scoring, and human review loops.
Instrument AI systems with traces, logs, metrics, and structured events so you can debug failures, track versions, and understand behavior across the entire request path.
Build dashboards and alerts to monitor quality, latency, cost, and safety signals for AI features in production.
Collaborate with product, design, data, and platform teams to move from prototype to production while adding guardrails, fallbacks, and human-in-the-loop flows where needed.
Continuously experiment with new models, prompting techniques, and architectures, then distill what works into reusable patterns and libraries for the team.
Required qualifications
Hands-on experience shipping LLM-based features (agents, RAG, tool calling, or NL2SQL) into production.
Strong full-stack engineering experience with modern web stacks (e.g., TypeScript/React/Next.js plus Python/Node.js).
Solid backend fundamentals: REST/GraphQL APIs, relational databases, caching, and cloud deployment (AWS, GCP, or Azure, with Docker and CI/CD).
Experience designing, measuring, and improving AI behavior using evals, metrics, and real user feedback.
Ability to work closely with product teams, own projects end to end, and make pragmatic tradeoffs between quality, speed, and cost.
Tech stack
Area: Example tools
Frontend: React, Next.js, TypeScript, Tailwind
Backend: Python, FastAPI, Node.js, PostgreSQL, Redis
AI orchestration: LangChain, LangGraph, Semantic Kernel, custom agent frameworks
Retrieval: Pinecone, Weaviate, FAISS, Elasticsearch, hybrid search
Graph / GraphRAG: Neo4j, graph stores, entity linking, knowledge graph pipelines
Evals: LangSmith, DeepEval, custom benchmark suites, human review workflows
Observability: Langfuse, Arize, W&B, OpenTelemetry, custom dashboards
Infra: AWS, Docker, Kubernetes, GitHub Actions