아직 한국어로 제공되지 않아 영어로 표시합니다.

Batches

A batch lets you submit up to 256 async inference requests in a single API call. All items are validated up front — if any item is malformed, the entire batch is rejected before anything is created. Once accepted, each item becomes an independent async job that runs as providers have capacity.

`POST /v1/batches`

curl -X POST https://api.inference.club/v1/batches \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "label": "Q2 product shots",
    "requests": [
      {
        "endpoint": "/v1/images/generations",
        "body": { "model": "flux-dev", "prompt": "a minimalist desk lamp" }
      },
      {
        "endpoint": "/v1/images/generations",
        "body": { "model": "flux-dev", "prompt": "a wooden bookshelf with plants" }
      },
      {
        "endpoint": "/v1/chat/completions",
        "body": {
          "model": "qwen3-8b",
          "messages": [{ "role": "user", "content": "Write a product description for a lamp." }]
        }
      }
    ]
  }'

Response: 202 Accepted

{
  "id": "7",
  "label": "Q2 product shots",
  "status": "PENDING",
  "total": 3,
  "queued": 3,
  "processing": 0,
  "processed": 0,
  "failed": 0,
  "created": 1718312400
}

Request body

Field	Type	Notes
`requests`	array	Required. 1–256 items.
`requests[].endpoint`	string	Required. One of the supported endpoints (see below).
`requests[].body`	object	Required. The inference body for that endpoint (same shape as a direct call, minus `async`).
`label`	string	Optional. A human-readable name for the batch.

Supported endpoints

Endpoint	Inference type
`/v1/chat/completions`	LLM
`/v1/completions`	LLM
`/v1/images/generations`	IMAGE
`/v1/videos/generations`	VIDEO
`/v1/music/generations`	MUSIC
`/v1/audio/speech`	TTS

File-upload endpoints are not batch-submittable.

`GET /v1/batches`

List your batches, newest first (up to 50).

curl https://api.inference.club/v1/batches \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

`GET /v1/batches/<id>`

Get batch status, per-status job counts, and a link to each job.

curl https://api.inference.club/v1/batches/7 \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

{
  "id": "7",
  "label": "Q2 product shots",
  "status": "PROCESSING",
  "total": 3,
  "queued": 0,
  "processing": 1,
  "processed": 2,
  "failed": 0,
  "jobs": [
    { "id": "43", "status": "PROCESSED", "inference_type": "IMAGE" },
    { "id": "44", "status": "PROCESSING", "inference_type": "IMAGE" },
    { "id": "45", "status": "PROCESSED", "inference_type": "LLM" }
  ]
}

Batch status reflects the aggregate:

Batch status	Meaning
`PENDING`	All jobs still queued.
`PROCESSING`	At least one job is running or queued.
`DONE`	All jobs have finished (some may have failed).

Cancel a batch

POST /v1/batches/<id>/cancel cancels every QUEUED or PROCESSING job in the batch.

curl -X POST https://api.inference.club/v1/batches/7/cancel \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

Returns the updated batch object.

Individual jobs

Each item in a batch is a regular async job accessible via GET /v1/jobs/<id>. Results (including media URLs) are on the individual job objects, not the batch.

Errors

`type`	When	HTTP
`invalid_request`	`requests` is missing, empty, or an item is malformed	400
`too_large`	More than 256 items	400
`async_disabled`	Async processing is not enabled on this server	503