## Documentation Index
Fetch the complete documentation index at: https://docs.getimpala.ai/llms.txt
Use this file to discover all available pages before exploring further.
## Prerequisites

Impala will provide these values to you ahead of time:

- `BASE_URL` — the hostname of your Impala endpoint
- `JOB_ID` — an identifier reserved for your account (required when creating a batch)
## Step 1 — Prepare your input file (JSONL)

The input file must be JSONL (one JSON object per line). Each line represents one request to the inference endpoint you want to target (e.g. `/v1/chat/completions`).
Example `input.jsonl` for chat completions:
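A minimal two-request illustration, assuming the standard OpenAI batch input format (`custom_id`, `method`, `url`, `body`); the model name is a placeholder:

```jsonl
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "your-model", "messages": [{"role": "user", "content": "Hello!"}]}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "your-model", "messages": [{"role": "user", "content": "What is JSONL?"}]}}
```

Each `custom_id` must be unique within the file — it is how you will match results back to requests in Step 5.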
## Step 2 — Upload the input file

Upload your JSONL file using the OpenAI-compatible Files API. The response includes an `id` — you'll use it as `input_file_id` in the next step.
## Step 3 — Create the batch

Create the batch with `POST /v1/batches`, supplying the following fields:
| Field | Value |
|---|---|
| input_file_id | The id returned by Step 2 |
| endpoint | One of /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/responses — must match the url field in each JSONL line |
| completion_window | "unlimited" |
| job_id | The Job ID supplied to you by Impala (job-xxxxxxxx) |
The response includes an `id` for the batch (for example, `batch-xyz789`). You'll use it to poll for status in the next step.
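Assembled from the table above, the request body might look like this (the IDs are illustrative):

```json
{
  "input_file_id": "file-abc123",
  "endpoint": "/v1/chat/completions",
  "completion_window": "unlimited",
  "job_id": "job-xxxxxxxx"
}
```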
## Step 4 — Poll for status

The batch's `status` will move through:

`validating` → `in_progress` → `finalizing` → `completed` (or `failed` / `expired`)
When the batch is `completed`, the response includes:

- `output_file_id` — a file containing successful responses
- `error_file_id` — a file containing failed requests (only present if any requests failed)
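A polling sketch, assuming an `OpenAI` client pointed at your Impala endpoint; the poll interval is our choice:

```python
import time

# States after which the batch will not change further
TERMINAL_STATUSES = {"completed", "failed", "expired"}


def is_terminal(status):
    """True once the batch has finished (successfully or not)."""
    return status in TERMINAL_STATUSES


def wait_for_batch(client, batch_id, poll_seconds=30):
    """Poll GET /v1/batches/{batch_id} until the batch reaches a terminal state."""
    while True:
        batch = client.batches.retrieve(batch_id)
        if is_terminal(batch.status):
            return batch
        time.sleep(poll_seconds)
```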
## Step 5 — Download results

Download `output_file_id` (and `error_file_id`, if present) with `GET /v1/files/{file_id}/content`. Output lines are not guaranteed to be in input order; match each result back to its request via `custom_id`.
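A sketch of downloading and indexing the results, assuming an `OpenAI` client pointed at your Impala endpoint; the indexing helper is our addition:

```python
import json


def index_by_custom_id(jsonl_text):
    """Map each result line back to the request that produced it."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        results[obj["custom_id"]] = obj
    return results


def download_results(client, output_file_id):
    """Fetch GET /v1/files/{file_id}/content and index the lines."""
    content = client.files.content(output_file_id)
    return index_by_custom_id(content.text)
```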
## Using the OpenAI Python SDK

The endpoint is OpenAI-compatible. Point `base_url` at your Impala endpoint.

`job_id` is required by Impala. Since the OpenAI SDK doesn't expose a `job_id` parameter, pass it via `extra_body`.
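A minimal end-to-end sketch with the OpenAI Python SDK; the environment variable names and placeholder endpoint are our assumptions:

```python
import os


def batch_create_kwargs(input_file_id, job_id):
    """Arguments for client.batches.create; job_id rides in extra_body
    because the SDK has no first-class parameter for it."""
    return {
        "input_file_id": input_file_id,
        "endpoint": "/v1/chat/completions",
        "completion_window": "unlimited",
        "extra_body": {"job_id": job_id},
    }


def run():
    from openai import OpenAI  # lazy import so the helper above is importable alone

    client = OpenAI(
        base_url=os.environ["IMPALA_BASE_URL"],  # BASE_URL from the prerequisites
        api_key=os.environ["IMPALA_API_KEY"],    # assumed auth scheme
    )
    # Upload the input file (Step 2)
    uploaded = client.files.create(file=open("input.jsonl", "rb"), purpose="batch")
    # Create the batch (Step 3)
    batch = client.batches.create(
        **batch_create_kwargs(uploaded.id, os.environ["IMPALA_JOB_ID"])
    )
    print(batch.id, batch.status)


if __name__ == "__main__":
    run()
```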
## API reference

- `POST /v1/files` — upload
- `GET /v1/files/{file_id}` — metadata
- `GET /v1/files/{file_id}/content` — download
- `POST /v1/batches` — create
- `GET /v1/batches/{batch_id}` — status
- `GET /v1/batches` — list

