
We are a small, serious AI engineering company. We don't have a long career ladder or a large HR team. What we have is interesting work, a high bar, and a culture where technical depth is genuinely valued — not performed.

The people who thrive here are engineers and researchers who have gone deep on hard problems, who care about whether their systems actually work in production, and who find it unsatisfying to ship things they cannot measure and explain.

We are remote-friendly for senior roles, headquartered in Toronto, and interested in global talent. We believe in timezone overlap, asynchronous-first communication, and writing over meetings wherever possible.

What we value

The traits of people who thrive here

Depth over breadth

We respect engineers who have gone genuinely deep on hard problems. Being a generalist is useful. Being an expert at something — truly expert, with scars from production failures — is rare and valuable here.

Production discipline

You believe in tests, monitoring, documentation, and rollback plans. You have seen what happens when these are absent at 3am and you have learned the right lessons from it.

Evaluation-first thinking

You define success before you start building. You are skeptical of demos. You insist on measuring what matters in the real deployment environment, not just on convenient benchmarks.

Intellectual honesty

You say what you think, including when a proposed approach will not work. You update your views when evidence changes. You prefer being right over being comfortable.

Ownership mentality

You do not consider a task done when the code is merged. You watch the dashboards after deployment. You care about whether the system is actually delivering value to the people using it.

Research & engineering focus

What we are working on

Research

LLM Alignment & Post-Training

Fine-tuning, DPO, RLHF, preference data collection, and alignment techniques applied at practical scale.
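As an illustration of the kind of work this involves, the per-pair DPO objective can be sketched in a few lines. This is a minimal pure-Python version for a single preference pair, not our training code; the argument names are illustrative.

```python
import math

def dpo_loss(policy_chosen_lp, policy_rejected_lp,
             ref_chosen_lp, ref_rejected_lp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed token log-probability of the full
    response under the trainable policy or the frozen reference model.
    """
    chosen_ratio = policy_chosen_lp - ref_chosen_lp        # log-ratio, preferred response
    rejected_ratio = policy_rejected_lp - ref_rejected_lp  # log-ratio, rejected response
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)): shrinks as the policy favours the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At zero margin the loss is log 2; it falls as the policy learns to prefer the chosen response relative to the reference model.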

Engineering

Inference Optimisation

Quantisation, speculative decoding, continuous batching, and serving throughput at the GPU level.
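For a flavour of the quantisation side, here is a deliberately simplified sketch of symmetric per-tensor INT8 quantisation in pure Python. Real pipelines operate on tensors with per-channel scales and calibration; this only shows the core idea.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantisation: w ~= scale * q, q in [-127, 127]."""
    # `or 1.0` guards against an all-zero tensor (scale would be 0)
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the quantised values."""
    return [x * scale for x in q]
```

The round trip loses at most half a quantisation step per weight, which is the tradeoff formats like GPTQ and AWQ work to minimise where it matters.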

Applied

RAG & Retrieval Systems

Hybrid retrieval, embedding model adaptation, reranking architectures, and knowledge graph integration.
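One common building block in hybrid retrieval is reciprocal rank fusion, which merges the rankings of lexical and dense retrievers without score calibration. A minimal sketch, assuming each retriever returns an ordered list of document ids:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked doc-id lists (e.g. BM25 + dense retrieval) by RRF score."""
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            # Each list contributes 1/(k + rank); k damps the head of the ranking
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF uses only ranks, it sidesteps the problem that BM25 scores and cosine similarities live on incompatible scales.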

Systems

Agentic AI Architectures

Multi-agent orchestration, planning, reliable tool use, and autonomous workflow automation in enterprise environments.
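The human-in-the-loop pattern this involves can be sketched as a plan executor that gates sensitive steps behind an approval callback. This is an illustrative skeleton with made-up step and tool names, not a production orchestrator:

```python
def run_plan(plan, tools, approve):
    """Execute a planned tool sequence with human-in-the-loop checkpoints.

    `plan` is a list of steps like {"tool": name, "args": {...},
    "requires_approval": bool}; `approve` asks a human (or a policy)
    before any gated step runs.
    """
    results = []
    for step in plan:
        if step.get("requires_approval") and not approve(step):
            # Gated step declined: record it and move on rather than fail
            results.append({"tool": step["tool"], "status": "skipped"})
            continue
        output = tools[step["tool"]](**step.get("args", {}))
        results.append({"tool": step["tool"], "status": "ok", "output": output})
    return results
```

In enterprise settings the interesting work is in what this sketch omits: retries, partial-failure recovery, and deciding which steps genuinely need a human.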

Platform

LLM Evaluation & Observability

Automated evaluation pipelines, hallucination detection, golden dataset management, and production monitoring.
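At its simplest, an automated evaluation gate compares model outputs against a golden dataset and blocks deployment below a quality bar. A minimal exact-match sketch (real gates use task-appropriate metrics, not string equality):

```python
def evaluation_gate(predictions, golden, threshold=0.9):
    """Return (passed, accuracy): block a deployment unless accuracy
    on the golden dataset meets the threshold."""
    correct = sum(p == g for p, g in zip(predictions, golden))
    accuracy = correct / len(golden)
    return accuracy >= threshold, accuracy
```

Wired into CI, a gate like this turns "the demo looked good" into a measurable, enforceable release criterion.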

Infrastructure

AI-Ready Data Engineering

Feature stores, streaming pipelines, Lakehouse architectures, and data quality systems built for ML consumption.
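A data quality system in this sense can be as simple as a batch gate that rejects feature data before it pollutes training. A toy sketch over rows-as-dicts, checking only null rates (real systems also check schemas, ranges, and drift):

```python
def null_rates(rows, columns):
    """Per-column null rate for a batch of feature rows (list of dicts)."""
    return {
        col: sum(1 for row in rows if row.get(col) is None) / len(rows)
        for col in columns
    }

def quality_gate(rows, columns, max_null_rate=0.01):
    """Reject a batch before it reaches training if any column is too sparse."""
    failures = {c: r for c, r in null_rates(rows, columns).items()
                if r > max_null_rate}
    return not failures, failures
```

The design point is that the gate runs upstream of the feature store, so a bad batch fails loudly at ingestion instead of silently degrading a model weeks later.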

Open positions

Current openings

01 Senior Machine Learning Engineer
Senior · Full-Time · Remote / Hybrid
The role
  • Design, train, evaluate, and deploy ML models that solve real business problems in production environments with no shortcuts on quality.
  • Architect end-to-end ML pipelines with full ownership: data ingestion, feature engineering, training, evaluation, serving, and monitoring.
  • Build MLOps infrastructure: experiment tracking, model registries, automated evaluation gates, canary deployments, and drift detection.
  • Collaborate closely with data engineers to design feature stores and training pipelines that treat data quality as a first-class constraint.
What you bring
  • 5+ years building production ML systems across the full lifecycle — not only model training but deployment, monitoring, and iteration.
  • Deep PyTorch and scikit-learn experience; comfort with model debugging, profiling, and systematic optimisation.
  • Strong understanding of ML evaluation design: metrics selection, offline vs. online evaluation, statistical rigour in A/B testing.
  • Production debugging mindset: logging, tracing, and alerting are as natural as writing model code.
02 LLM Engineer
Mid–Senior · Full-Time · Remote
The role
  • Build and optimise production LLM pipelines: prompt systems, structured outputs, multi-step orchestration, and tool-augmented reasoning.
  • Fine-tune open-weight models using LoRA, QLoRA, and instruction tuning techniques on domain-specific datasets with rigorous evaluation.
  • Design and evaluate RAG pipelines with hybrid retrieval, reranking, semantic caching, and hallucination detection layers.
  • Build LLMOps tooling: prompt versioning, output monitoring, evaluation automation, and cost attribution dashboards.
What you bring
  • Hands-on experience fine-tuning or deploying open-weight LLMs — LLaMA, Mistral, Qwen, Gemma — in production, not only in notebooks.
  • Practical knowledge of PEFT methods: LoRA, QLoRA, DoRA, and the nuances of adapter merging and quantisation.
  • Experience building production RAG systems: chunking strategies, embedding model selection, reranking, and real retrieval quality evaluation.
  • Clear understanding of LLM inference tradeoffs: batching, KV cache, quantisation formats, and latency vs. throughput tuning.
03 Research Scientist — Applied ML
Senior · Full-Time · Remote
The role
  • Drive applied ML research that translates directly into production system improvements — research for deployment, not publication alone.
  • Design and run post-training experiments: RLHF, DPO, constitutional AI, and preference dataset curation at practical scale.
  • Build evaluation frameworks that measure model quality in ways that genuinely predict real-world performance.
  • Collaborate with ML engineers to bridge research insights and production engineering constraints — both directions.
What you bring
  • PhD or equivalent research depth in machine learning, NLP, or AI systems — demonstrated through shipped systems, not only papers.
  • Deep understanding of transformer architectures, attention mechanisms, and training dynamics at scale.
  • Experience with preference learning, RLHF, or alignment techniques beyond theoretical familiarity.
  • Strong empirical instincts: you know how to design experiments that answer the right questions efficiently.
04 AI Infrastructure Engineer
Senior · Full-Time · Remote / Hybrid
The role
  • Design and operate GPU-accelerated inference infrastructure for LLMs and traditional ML models at enterprise scale.
  • Optimise serving throughput and latency using vLLM, TGI, Triton, and targeted CUDA-level optimisations.
  • Build and maintain model quantisation pipelines: INT4, INT8, GPTQ, AWQ, and GGUF formats for different deployment targets.
  • Design production observability: GPU utilisation dashboards, latency profiling, cost attribution, and automated alerting.
What you bring
  • Deep experience with GPU infrastructure: CUDA programming, multi-GPU topologies, NVLink, and memory management.
  • Production experience with vLLM, TGI, or Triton Inference Server serving real traffic.
  • Strong Kubernetes skills for ML inference workloads, including autoscaling strategies for variable-demand LLM traffic.
  • Clear mental model of model quantisation tradeoffs and how different formats perform across hardware targets.
05 Applied AI Engineer
Mid–Senior · Full-Time · Remote
The role
  • Build production AI applications and agentic systems integrating LLMs, retrieval, tool use, and enterprise APIs.
  • Implement multi-agent workflows with memory, planning, and human-in-the-loop checkpoints for enterprise environments.
  • Design and ship backend AI services: reliable APIs, rate limiting, auth, error handling, and observability built in from day one.
What you bring
  • Production experience shipping AI-powered backend services — not prototypes — with real users and real SLAs.
  • Practical LangChain, LangGraph, or equivalent agentic framework experience in actual deployments.
  • Strong Python skills and comfort with async systems, database integration, and API design.
  • Good judgment on when AI is the right solution and when simpler approaches work better.

Hiring process

Straightforward and respectful of your time

STEP 01

Application review

We review every application within one week. We look for demonstrated experience with production AI — not credential lists.

STEP 02

Technical conversation

45 minutes with a senior engineer. We discuss real technical problems and past systems you have built. No whiteboard trivia.

STEP 03

Technical assessment

A take-home exercise relevant to the role. We pay for your time. We do not ask for unpaid days of work.

STEP 04

Final interview

A conversation about your technical vision, how you think about tradeoffs, and whether we are the right fit for each other.

Don't see the right role?

We always want to hear from exceptional ML engineers, research scientists, and AI infrastructure builders — even when we have no specific role posted. Send us your work: GitHub, papers, technical writing, or a description of the hardest system you have built.

Get in touch