LangSmith
LLM observability, testing & evaluation — by LangChain
LLM tracing + evals + prompt management + datasets. Framework-agnostic (works without LangChain). Free Developer tier; Plus at $39/seat/month; Enterprise with a self-host option.
Pricing
| Tier | Price | Notes |
|---|---|---|
| Developer (Free) | Free | Free forever. 5,000 traces/month, 14-day retention, 1 seat, basic evaluations. |
| Plus | $39/seat/mo | 10k base traces/month included ($2.50 per 1k overage). Full evaluations, custom dashboards, email support. |
| Enterprise | Custom | Self-host option, SSO, custom retention, dedicated support. |
Limits
| Tier | Metric | Value | Notes |
|---|---|---|---|
| — | Annotation queues | Unlimited human-in-the-loop review queues (Plus and above) | Annotation |
| — | Evaluation engines | LLM-as-judge + custom Python + offline batch | Eval methods |
| — | Framework agnostic | Works with any LLM stack, not just LangChain | Framework support |
| — | Prompt canvas | Prompt Hub with versioning + collaboration | Prompt management |
| — | Base retention | 14 days | Base trace retention |
| — | Extended retention | 400 days | Extended trace retention option |
| Plus | Base trace overage | $2.50 per 1k traces | 14-day retention |
| Plus | Extended trace overage | $5.00 per 1k traces | 400-day retention |
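To make the Plus overage rows concrete, a worked example: 2 seats sending 50,000 base traces in a month costs 2 × $39 + 40 × $2.50 = $178. The sketch below is a hypothetical helper (not part of any LangSmith SDK) that encodes the table's numbers; it assumes extended-retention traces are billed from the first trace.

```python
# Hypothetical helper, not part of any LangSmith SDK: prices a month on
# the Plus tier from the table above ($39/seat, 10k base traces included,
# $2.50 per extra 1k base traces, $5.00 per 1k extended-retention traces).
def estimate_plus_bill(seats: int, base_traces: int, extended_traces: int = 0) -> float:
    seat_cost = seats * 39.0
    base_overage = max(base_traces - 10_000, 0) / 1_000 * 2.50
    # Assumption: extended-retention traces are billed from the first trace.
    extended_cost = extended_traces / 1_000 * 5.00
    return seat_cost + base_overage + extended_cost

print(estimate_plus_bill(seats=2, base_traces=50_000))  # 178.0
```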
Features
- Alerts — Threshold alerts on latency, cost, eval metrics.
- Annotation Queues — Human-review workflows for trace quality rating.
- Custom Dashboards — Aggregate metrics dashboards per project/tag.
- Datasets — Collect examples → use as eval sets or training data.
- Evaluations — LLM-as-judge, embedding similarity, custom Python evaluators, offline batch evals (see the SDK sketch after the interfaces table). · docs
- LangChain Integration — Auto-trace any LangChain/LangGraph run with a single env var.
- LangGraph Integration — First-class trace + eval for LangGraph agents.
- LLM Tracing — Automatically traces every LLM call + tool call + chain step (sketch after this list). · docs
- OpenTelemetry Export — Export traces as OTLP to Datadog/Honeycomb/etc.
- Playground — Test prompts + models inline before deploying.
- Prompt Canvas — Visual prompt editor with live test + eval.
- Prompt Hub — Public + private prompt library with versioning. · docs
- Self-Hosted (Enterprise) — Docker + k8s deployment in your infra.
- Threads + Sessions — Group traces into conversational sessions.
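A minimal tracing sketch with the Python SDK, assuming the `LANGSMITH_TRACING` and `LANGSMITH_API_KEY` environment variables described in the docs (older SDK versions used `LANGCHAIN_TRACING_V2`). `traceable` and `wrap_openai` are documented `langsmith` entry points; the model name and prompt are illustrative.

```python
# Minimal sketch: trace a plain OpenAI call, no LangChain required.
# Assumes LANGSMITH_TRACING=true and LANGSMITH_API_KEY are set in the
# environment (older SDK versions used LANGCHAIN_TRACING_V2).
from langsmith import traceable
from langsmith.wrappers import wrap_openai
from openai import OpenAI

client = wrap_openai(OpenAI())  # completions made via this client are traced

@traceable  # records this function as a parent run around the LLM call
def answer(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

print(answer("Summarize what a trace captures."))
```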
Developer interfaces
| Slug | Name | Kind | Version |
|---|---|---|---|
| cli | LangSmith CLI | cli | 0.x |
| dashboard | LangSmith Dashboard | other | — |
| mcp | LangSmith MCP | mcp | — |
| rest-api | LangSmith REST API | rest | v1 |
| sdk-node | langsmith-js | sdk | 0.x |
| sdk-python | langsmith-python | sdk | 0.x |
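To show what the `langsmith-python` SDK above looks like in use, here is a hedged sketch of the datasets → evaluations loop from the feature list. `Client.create_dataset`, `Client.create_examples`, and `langsmith.evaluation.evaluate` are documented entry points; the dataset name, target function, and `exact_match` evaluator are illustrative.

```python
# Sketch of the dataset -> offline eval loop; names are illustrative.
from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# 1. Collect examples into a dataset.
ds = client.create_dataset(dataset_name="qa-smoke-test")
client.create_examples(
    inputs=[{"question": "What is LangSmith?"}],
    outputs=[{"answer": "An LLM observability platform."}],
    dataset_id=ds.id,
)

# 2. A custom Python evaluator: compare run output to the reference.
def exact_match(run, example):
    return {
        "key": "exact_match",
        "score": int(run.outputs["answer"] == example.outputs["answer"]),
    }

# 3. Batch-evaluate a target function over the dataset.
def target(inputs: dict) -> dict:
    return {"answer": "An LLM observability platform."}  # stand-in for your app

evaluate(target, data="qa-smoke-test", evaluators=[exact_match])
```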
Compare LangSmith with
- LangSmith vs Anthropic API
- LangSmith vs AssemblyAI
- LangSmith vs Deepgram
- LangSmith vs ElevenLabs
- LangSmith vs Google Gemini API
- LangSmith vs Groq
- LangSmith vs OpenAI API
- LangSmith vs Replicate
Staxly is an independent catalog of developer platforms. Outbound links to LangSmith are plain references to their official pages. Pricing is verified at publication time — reconfirm on the vendor site before buying.