Batches

A batch lets you submit up to 256 async inference requests in a single API call. All items are validated up front: if any item is malformed, the entire batch is rejected before anything is created. Once accepted, each item becomes an independent async job that runs as providers have capacity.
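
Because a batch is capped at 256 items and is rejected as a whole when the cap is exceeded, a larger workload has to be split on the client before submission. A minimal sketch of that split follows; the helper name `chunk_requests` is hypothetical and not part of the API.

```python
from typing import Iterator

MAX_BATCH_ITEMS = 256  # per-batch limit stated above


def chunk_requests(requests: list, size: int = MAX_BATCH_ITEMS) -> Iterator[list]:
    """Yield consecutive slices of `requests`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(requests), size):
        yield requests[start:start + size]
```

Each yielded slice can then be sent as its own `requests` array in a separate POST /v1/batches call.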

POST /v1/batches

curl -X POST https://api.inference.club/v1/batches \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "label": "Q2 product shots",
    "requests": [
      {
        "endpoint": "/v1/images/generations",
        "body": { "model": "flux-dev", "prompt": "a minimalist desk lamp" }
      },
      {
        "endpoint": "/v1/images/generations",
        "body": { "model": "flux-dev", "prompt": "a wooden bookshelf with plants" }
      },
      {
        "endpoint": "/v1/chat/completions",
        "body": {
          "model": "qwen3-8b",
          "messages": [{ "role": "user", "content": "Write a product description for a lamp." }]
        }
      }
    ]
  }'

Response: 202 Accepted

{
  "id": "7",
  "label": "Q2 product shots",
  "status": "PENDING",
  "total": 3,
  "queued": 3,
  "processing": 0,
  "processed": 0,
  "failed": 0,
  "created": 1718312400
}
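
For callers who prefer Python over curl, the same request could be assembled with the standard library roughly as below. This is a sketch: `build_batch_request` is a hypothetical helper, the host is copied from the curl example, and the function only builds the request object without sending it.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.inference.club"  # host taken from the curl example


def build_batch_request(api_key: str, requests: list, label: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/batches request."""
    payload = {"requests": requests}
    if label is not None:
        payload["label"] = label  # label is optional
    return urllib.request.Request(
        f"{API_BASE}/v1/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; on success the response body should parse into a batch object like the one shown above.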

Request body

| Field | Type | Notes |
| --- | --- | --- |
| requests | array | Required. 1–256 items. |
| requests[].endpoint | string | Required. One of the supported endpoints (see below). |
| requests[].body | object | Required. The inference body for that endpoint (same shape as a direct call, minus async). |
| label | string | Optional. A human-readable name for the batch. |

Supported endpoints

| Endpoint | Inference type |
| --- | --- |
| /v1/chat/completions | LLM |
| /v1/completions | LLM |
| /v1/images/generations | IMAGE |
| /v1/videos/generations | VIDEO |
| /v1/music/generations | MUSIC |
| /v1/audio/speech | TTS |

File-upload endpoints are not batch-submittable.
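
Since one malformed item rejects the whole batch, it can be worth checking the documented rules locally before submitting. The sketch below mirrors only the rules stated on this page (item count, supported endpoint, object body); `find_batch_problems` is a hypothetical helper, and the server may well apply stricter checks to each body.

```python
SUPPORTED_ENDPOINTS = {
    "/v1/chat/completions",
    "/v1/completions",
    "/v1/images/generations",
    "/v1/videos/generations",
    "/v1/music/generations",
    "/v1/audio/speech",
}
MAX_BATCH_ITEMS = 256


def find_batch_problems(requests: object) -> list:
    """Return human-readable problems; an empty list means the local checks passed."""
    if not isinstance(requests, list) or not requests:
        return ["requests must be a non-empty array"]
    problems = []
    if len(requests) > MAX_BATCH_ITEMS:
        problems.append(f"too many items: {len(requests)} > {MAX_BATCH_ITEMS}")
    for i, item in enumerate(requests):
        if not isinstance(item, dict):
            problems.append(f"requests[{i}] must be an object")
            continue
        endpoint = item.get("endpoint")
        if not isinstance(endpoint, str) or endpoint not in SUPPORTED_ENDPOINTS:
            problems.append(f"requests[{i}].endpoint is not a supported endpoint")
        if not isinstance(item.get("body"), dict):
            problems.append(f"requests[{i}].body must be an object")
    return problems
```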

GET /v1/batches

List your batches, newest first (up to 50).

curl https://api.inference.club/v1/batches \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

GET /v1/batches/<id>

Get batch status, per-status job counts, and a link to each job.

curl https://api.inference.club/v1/batches/7 \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

Response:

{
  "id": "7",
  "label": "Q2 product shots",
  "status": "PROCESSING",
  "total": 3,
  "queued": 0,
  "processing": 1,
  "processed": 2,
  "failed": 0,
  "jobs": [
    { "id": "43", "status": "PROCESSED", "inference_type": "IMAGE" },
    { "id": "44", "status": "PROCESSING", "inference_type": "IMAGE" },
    { "id": "45", "status": "PROCESSED", "inference_type": "LLM" }
  ]
}

Batch status reflects the aggregate:

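| Batch status | Meaning |
| --- | --- |
| PENDING | All jobs are still queued; none has started. |
| PROCESSING | At least one job has started, and at least one is still running or queued. |
| DONE | All jobs have finished (some may have failed). |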
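
A client that needs to wait for completion can poll the batch until its status is DONE. The loop below is a sketch under assumptions: `fetch_batch` is a caller-supplied function that performs GET /v1/batches/<id> and returns the parsed JSON, and the interval and timeout defaults are arbitrary rather than recommended values.

```python
import time
from typing import Callable


def wait_for_batch(
    fetch_batch: Callable[[str], dict],
    batch_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the batch reports DONE, then return the final batch object."""
    deadline = time.monotonic() + timeout_s
    while True:
        batch = fetch_batch(batch_id)
        if batch["status"] == "DONE":
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {batch_id} still {batch['status']} after {timeout_s}s")
        sleep(interval_s)
```

DONE only means that no job is still queued or running, so the caller should still inspect the `failed` count on the returned object.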

Cancel a batch

POST /v1/batches/<id>/cancel cancels every QUEUED or PROCESSING job in the batch.

curl -X POST https://api.inference.club/v1/batches/7/cancel \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

Returns the updated batch object.
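
To preview which jobs a cancel call would touch, the rule above can be applied to a batch object fetched from GET /v1/batches/<id>. This is an illustrative sketch; `jobs_affected_by_cancel` is a hypothetical helper and assumes the `jobs` array shape shown in the status example.

```python
CANCELLABLE_STATUSES = {"QUEUED", "PROCESSING"}  # statuses named in the cancel rule


def jobs_affected_by_cancel(batch: dict) -> list:
    """Return the ids of jobs that are still QUEUED or PROCESSING."""
    return [
        job["id"]
        for job in batch.get("jobs", [])
        if job["status"] in CANCELLABLE_STATUSES
    ]
```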

Individual jobs

Each item in a batch is a regular async job accessible via GET /v1/jobs/<id>. Results (including media URLs) are on the individual job objects, not the batch.
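
Because results live on the job objects, gathering the output of a finished batch means one job lookup per item. A minimal sketch, assuming `fetch_job` is a caller-supplied function that performs GET /v1/jobs/<id> and returns the parsed JSON (the shape of a job object is not described on this page, so it is passed through untouched):

```python
from typing import Callable


def collect_job_results(batch: dict, fetch_job: Callable[[str], dict]) -> dict:
    """Fetch the job object for every PROCESSED job in the batch, keyed by job id."""
    return {
        job["id"]: fetch_job(job["id"])
        for job in batch["jobs"]
        if job["status"] == "PROCESSED"
    }
```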

Errors

| type | When | HTTP |
| --- | --- | --- |
| invalid_request | requests is missing, empty, or an item is malformed | 400 |
| too_large | More than 256 items | 400 |
| async_disabled | Async processing is not enabled on this server | 503 |
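
One way a client might react to these error types is sketched below. The suggested actions are an interpretation of the table, not documented behavior, and the function takes the `type` string directly because the layout of the error response body is not specified here; `next_step_for_error` is a hypothetical name.

```python
def next_step_for_error(error_type: str) -> str:
    """Map a documented batch error type to a suggested client action."""
    actions = {
        # The batch was rejected up front, so no jobs exist yet.
        "invalid_request": "fix the payload and resubmit; nothing was created",
        "too_large": "split the work into batches of at most 256 items",
        # Assumption: this is a server configuration issue, not a transient one.
        "async_disabled": "async processing is off on this server; resubmitting unchanged will not help",
    }
    return actions.get(error_type, "unrecognized error type; inspect the full response")
```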