Staxly
Replicate

Run and fine-tune AI models in the cloud — pay-per-second GPU

Run thousands of open-source AI models (FLUX, Stable Diffusion, LLMs) through an API, billed per second of GPU time. The Cog framework packages your own models so you can deploy and fine-tune them.

Pricing

Tier | Price | Notes
Pay-as-you-go | Free | Per-second GPU billing. No minimum. Public models billed by processing time or tokens.
Enterprise | Custom | Custom. Dedicated capacity, private deployments, SOC2, HIPAA on request.

Limits

Metric | Value | Notes
cpu small | $0.000025/sec (1 vCPU, 2GB) | CPU small
cpu standard | $0.000100/sec (4 vCPU, 8GB) | CPU standard
fast boot fine tunes | Only active processing time billed | Fine-tune billing
gpu a100 80gb | $0.001400/sec (~$5.04/hr) | Nvidia A100 80GB
gpu h100 80gb | $0.001525/sec (~$5.49/hr) | Nvidia H100 80GB
gpu l40s 48gb | $0.000975/sec (~$3.51/hr) | Nvidia L40S
gpu t4 | $0.000225/sec (~$0.81/hr) | Nvidia T4
model claude sonnet | $3/M input + $15/M output tokens (Claude 3.7 Sonnet) | Token-billed example
model flux pro | $0.04 per output image (FLUX 1.1 Pro) | Image model example
private model billing | Dedicated hardware billed for setup + idle + active time | Private model billing
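
To make the per-second and per-token figures above concrete, here is a minimal Python sketch that turns them into cost estimates. The rates are copied from the listing. The function names and the dictionary are hypothetical helpers for illustration and are not part of any Replicate SDK. Actual bills may differ, for example private models are also billed for setup and idle time.

```python
# Per-second rates in USD, copied from the listing above.
# The dictionary and function names are illustrative, not a Replicate API.
RATES_PER_SEC = {
    "cpu small": 0.000025,
    "cpu standard": 0.000100,
    "gpu t4": 0.000225,
    "gpu l40s 48gb": 0.000975,
    "gpu a100 80gb": 0.001400,
    "gpu h100 80gb": 0.001525,
}


def hourly_rate(hardware: str) -> float:
    """Convert a per-second rate to an hourly rate, rounded to cents."""
    return round(RATES_PER_SEC[hardware] * 3600, 2)


def run_cost(hardware: str, seconds: float) -> float:
    """Estimate the cost of `seconds` of billed processing time."""
    return RATES_PER_SEC[hardware] * seconds


def token_cost(input_tokens: int, output_tokens: int,
               usd_per_m_input: float = 3.0,
               usd_per_m_output: float = 15.0) -> float:
    """Estimate a token-billed call; defaults are the Claude 3.7 Sonnet example rates."""
    return (input_tokens / 1e6) * usd_per_m_input + (output_tokens / 1e6) * usd_per_m_output


if __name__ == "__main__":
    print(hourly_rate("gpu a100 80gb"))
    print(run_cost("gpu a100 80gb", 30))
    print(token_cost(2_000, 500))
```

Multiplying each per-second rate by 3600 reproduces the hourly figures quoted in the listing, which is a quick way to sanity-check the numbers before budgeting.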

Features

Developer interfaces

Slug | Name | Kind | Version
cog | Cog (package models) | cli | 0.x
sdk-go | replicate-go | sdk | 1.x
mcp | Replicate MCP | mcp | -
sdk-node | replicate (Node) | sdk | 1.x
sdk-python | replicate-python | sdk | 1.x
rest-api | Replicate REST API | rest | v1
webhooks | Webhooks | other | -
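
As a rough illustration of how the REST API and webhooks entries fit together, the sketch below builds (but does not send) a request that creates a prediction and asks for a webhook callback on completion. It assumes the v1 endpoint `https://api.replicate.com/v1/predictions`, bearer-token authentication, and the `webhook` and `webhook_events_filter` body fields; none of these are stated in this listing, so confirm them against the vendor documentation. The function name, version string, and webhook URL are placeholders.

```python
import json
import urllib.request

# Assumed endpoint for creating predictions; check the vendor docs.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token, version, model_input, webhook_url=None):
    """Build a POST request for a prediction without sending it.

    `version` is a model version identifier and `model_input` is a dict of
    model-specific inputs. Both are supplied by the caller.
    """
    body = {"version": version, "input": model_input}
    if webhook_url:
        # Assumed fields: ask for a callback only when the run completes.
        body["webhook"] = webhook_url
        body["webhook_events_filter"] = ["completed"]
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_prediction_request(
        token="YOUR_API_TOKEN",            # placeholder
        version="MODEL_VERSION_ID",        # placeholder
        model_input={"prompt": "a lighthouse at dusk"},
        webhook_url="https://example.com/replicate-hook",  # placeholder
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`. In practice the official SDKs listed above wrap this call, so a hand-built request is mainly useful for understanding what goes over the wire.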

Staxly is an independent catalog of developer platforms. The link to Replicate above may be an affiliate link — Staxly may earn a commission if you sign up through it, at no extra cost to you. Pricing is verified at publication time — reconfirm on the vendor site before buying.