# inference.club

> A community-run inference network. Members run agents on their own hardware that expose local AI model servers. Other members call those models through a single OpenAI-compatible API.

## What it is

inference.club is a peer-to-peer inference network with an OpenAI-compatible API surface. You can:

- **Use** models that other members (or you) are serving, via standard OpenAI SDKs
- **Serve** models from your own hardware by running `inference-club-agent` and registering it

## Base URL

```
https://api.inference.club/v1
```

Auth: `Authorization: Bearer `

## Supported modalities

| Endpoint | Modality |
|---|---|
| POST /v1/chat/completions | LLM chat (streaming supported) |
| POST /v1/completions | Legacy text completions |
| POST /v1/audio/transcriptions | Speech-to-text (STT/ASR) |
| POST /v1/audio/speech | Text-to-speech (TTS) |
| POST /v1/images/generations | Image generation |
| POST /v1/images/edits | Image editing (multipart) |
| POST /v1/music/generations | Music generation (ACE-Step) |
| POST /v1/videos/generations | Video generation (LTX-2) |
| POST /v1/voice/generations | Voice cloning (Dia, multi-speaker) |
| POST /v1/3d/generations | 3D mesh generation |

## Model discovery

`GET /v1/models` returns each model's `service_type` (llm/stt/tts/image/music/video/mesh), `input_modalities`, `output_modalities`, `supported_features`, and `context_length`, which is enough for a client to select the right model for a task without hardcoding names.

## Async jobs

Add `"async": true` to any JSON-bodied request to queue it instead of blocking. Returns 202 with a job id. Poll `GET /v1/jobs/`. Supported: chat/completions, completions, images/generations, videos/generations, music/generations, audio/speech.

## Batches

`POST /v1/batches` accepts up to 256 async requests in one call.

## Workflows

`POST /v1/workflows/runs` runs a DAG of inference steps.
Step kinds:

- inference (one job)
- map (fan-out)
- transform (inline data)
- collect (gather fan-out outputs)
- gate (human approval pause)

Steps template on each other via `{{ steps..output. }}`. Curated templates available at `GET /v1/workflows/templates`.

## Voice cloning

`POST /v1/voice/generations` takes a `[S1]`/`[S2]`-tagged dialogue script and optional voice sample IDs, then produces cloned speech via Dia. Voice samples live in the user's library (`/api/inference/voice-samples/`).

## Routing

Requests route to the first online provider on the caller's account that serves the requested model and service type. No load balancing. `404 no_provider` when nothing matches; `502 upstream_error` when the provider's local server fails.

## Docs

- Full docs: https://inference.club/docs
- Quickstart for AI agents: https://inference.club/docs/quickstart-agents
- API reference: https://inference.club/docs/api/overview
- Interactive API explorer: https://inference.club/api-reference
- OpenAPI spec: https://api.inference.club/openapi.json
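
## Example: model selection and async polling

The model discovery and async job sections above can be sketched on the client side as follows. This is a minimal illustration, not the official client. The helper names (`select_model`, `as_async`, `poll_job`) are hypothetical, and the sketch assumes that each model entry carries an OpenAI-style `id` field and that a job object exposes a `status` field with terminal values such as `completed` or `failed`. None of those details are confirmed here, so check them against the OpenAPI spec. HTTP calls are left to the caller so the sketch runs without network access.

```python
import time
from typing import Callable, Iterable, Optional


def select_model(models: Iterable[dict], service_type: str,
                 required_features: Iterable[str] = ()) -> Optional[str]:
    """Return the id of the first model with the given service_type
    whose supported_features include every required feature."""
    needed = set(required_features)
    for model in models:
        if model.get("service_type") != service_type:
            continue
        if needed <= set(model.get("supported_features") or []):
            return model.get("id")  # assumption: OpenAI-style "id" field
    return None


def as_async(payload: dict) -> dict:
    """Return a copy of a JSON request body with "async": true added."""
    return {**payload, "async": True}


def poll_job(fetch_job: Callable[[], dict], interval: float = 1.0,
             max_attempts: int = 60,
             done_states: frozenset = frozenset({"completed", "failed"})) -> dict:
    """Call fetch_job() until the job reports a terminal status.

    fetch_job is supplied by the caller and should perform the job GET.
    The "status" field and the state names are assumptions.
    """
    for attempt in range(max_attempts):
        job = fetch_job()
        if job.get("status") in done_states:
            return job
        if attempt < max_attempts - 1:
            time.sleep(interval)  # wait before the next poll
    raise TimeoutError("job did not finish within max_attempts polls")
```

In practice, `models` would come from the parsed `GET /v1/models` response, and `fetch_job` would wrap an authenticated request for the job id returned with the 202 response. Injecting `fetch_job` keeps the polling loop independent of any particular HTTP library.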