Aún no disponible en Español — mostrando en inglés.

Async jobs

Any JSON-bodied inference endpoint accepts an "async": true field. When present, the request is queued as a job rather than blocking until the upstream provider finishes. The response is 202 Accepted with a job envelope.

curl https://api.inference.club/v1/images/generations \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-dev",
    "prompt": "a lighthouse at dusk",
    "async": true
  }'
{
  "id": "42",
  "object": "inference.job",
  "status": "QUEUED",
  "inference_type": "IMAGE",
  "model": "flux-dev",
  "created": 1718312400,
  "queued_at": "2026-06-13T12:00:00Z"
}

The async flag is stripped from the body before it reaches a provider — the provider never sees it.

Supported modalities

The async path supports: chat/completions, completions, images/generations, videos/generations, music/generations, and audio/speech.

File-upload endpoints (audio/transcriptions, images/edits, voice/generations) are synchronous only.

Job statuses

StatusMeaning
QUEUEDWaiting for a free capacity slot on a provider.
PROCESSINGA provider has claimed and is running the job.
PROCESSEDFinished successfully. Result is available.
FAILEDThe job failed (provider error, no provider, or max retries exhausted).
CANCELEDCanceled by the caller before it completed.

GET /v1/jobs

List your async jobs, newest first.

curl https://api.inference.club/v1/jobs \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

Query parameters:

ParameterDescription
statusFilter by status (e.g. QUEUED, FAILED).
active=1Only return QUEUED or PROCESSING jobs.
limitMax results, default 50.

Response:

{
  "data": [
    {
      "id": "42",
      "object": "inference.job",
      "status": "PROCESSED",
      "inference_type": "IMAGE",
      "model": "flux-dev",
      "created": 1718312400,
      "result_url": "https://api.inference.club/api/inference/assets/99/"
    }
  ]
}

GET /v1/jobs/<id>

Get a single job's current status, result, and any media assets.

curl https://api.inference.club/v1/jobs/42 \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

A completed image job returns a result_url; a completed LLM job returns the choices array inside result.

POST /v1/jobs/<id>/cancel

Cancel a QUEUED or PROCESSING job. Has no effect if the job has already completed.

curl -X POST https://api.inference.club/v1/jobs/42/cancel \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"
{ "id": "42", "status": "CANCELED", "canceled": true }

POST /v1/jobs/<id>/retry

Re-queue a FAILED or CANCELED job with the same payload. The job goes back to QUEUED and gets a fresh attempt.

curl -X POST https://api.inference.club/v1/jobs/42/retry \
  -H "Authorization: Bearer $INFERENCE_CLUB_KEY"

Returns the updated job envelope (202 Accepted) or 409 Conflict if the job is in a non-retryable state.

Idempotency

Pass an Idempotency-Key: <uuid> header when submitting a job to deduplicate retries from your client. A second submission with the same key returns the existing job instead of creating a new one.

Dashboard

Live job state is visible at Dashboard → Queue. The queue page shows a summary badge for active work and renders each workflow run as an SVG DAG.

Worker availability

If async is not enabled on the server, all async endpoints return 503 Service Unavailable with "type": "async_disabled". The queue summary endpoint (GET /api/inference/queue/summary) exposes an async_enabled flag and a worker_stalled indicator — if jobs are queued but the dispatcher hasn't ticked in the last 60 seconds, the frontend shows a warning.