xai-docs/latest/content · Jun 27, 00:17 UTC

pages/developers/advanced-api-usage/batch-api.md

Advanced API Usage

Batch API

The Batch API lets you process large volumes of requests asynchronously with reduced pricing and higher rate limits. For pricing details, see Batch API Pricing. If you need lower latency on real-time requests instead, see Priority Processing.

What is the Batch API?

When you make a standard API call to Grok, you send a request and wait for an immediate response. This approach is perfect for interactive applications like chatbots, real-time assistants, or any use case where users are waiting for a response.

The Batch API takes a different approach. Instead of processing requests immediately, you submit them to a queue where they're processed in the background. You don't get an instant response—instead, you check back later to retrieve your results.

Key differences from real-time API requests:

| | Real-time API | Batch API |
|---|---|---|
| Cost | Standard pricing | Reduced pricing (see Batch API Pricing) |
| Rate limits | Per-minute limits apply | Requests don't count towards rate limits |
| Response time | Immediate (seconds) | Typically within 24 hours* |
| Use case | Interactive, real-time | Background processing, bulk jobs |

* Processing time: Most batch requests complete within 24 hours, though processing time may vary depending on system load and batch size. Completion time is best effort and not guaranteed.

[!NOTE]

You can also create, monitor, and manage batches through the xAI Console. The Console provides a visual interface for tracking batch progress and viewing results.

When to use the Batch API

The Batch API is ideal when you don't need immediate results and want to reduce your API costs:

  • Running evaluations and benchmarks - Test model performance across thousands of prompts
  • Processing large datasets - Analyze customer feedback, classify support tickets, extract entities
  • Content moderation at scale - Review backlogs of user-generated content
  • Document summarization - Process reports, research papers, or legal documents in bulk
  • Data enrichment pipelines - Add AI-generated insights to database records
  • Scheduled overnight jobs - Generate daily reports or prepare data for dashboards

How it works

The Batch API workflow consists of four main steps:

  1. Create a batch - A batch is a container that groups related requests together
  2. Add requests - Submit your inference requests to the batch queue
  3. Monitor progress - Poll the batch status to track completion
  4. Retrieve results - Fetch responses for all processed requests

Let's walk through each step.

Step 1: Create a batch

A batch acts as a container for your requests. Think of it as a folder that groups related work together—you might create separate batches for different datasets, experiments, or job types.

When you create a batch, you receive a batch_id that you'll use to add requests and retrieve results.

curl -X POST https://api.x.ai/v1/batches \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "name": "customer_feedback_analysis"
  }'
from xai_sdk import Client

client = Client()

# Create a batch with a descriptive name
batch = client.batch.create(batch_name="customer_feedback_analysis")
print(f"Created batch: {batch.batch_id}")

# Store the batch_id for later use
batch_id = batch.batch_id
// Create a batch with a descriptive name
const response = await fetch("https://api.x.ai/v1/batches", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.XAI_API_KEY}`,
  },
  body: JSON.stringify({ name: "customer_feedback_analysis" }),
});
const batch = await response.json();
console.log(`Created batch: ${batch.batch_id}`);

// Store the batch_id for later use
const batchId = batch.batch_id;

Step 2: Add requests to the batch

With your batch created, you can now add requests to it. Each request will be processed asynchronously.

With the xAI SDK, adding batch requests is simple: use chat.create() for text, image.prepare() for images, video.prepare() for videos, or video.prepare_extension() for video extensions, then pass them as a list. You can also upload a JSONL file if you prefer.

Important: Assign a unique batch_request_id to each request. This ID lets you match results back to their original requests, which becomes important when you're processing hundreds or thousands of items. If you don't provide an ID, we generate a UUID for you. Using your own IDs is useful for idempotency (ensuring a request is only processed once) and for linking batch requests to records in your own system.
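
One way to get IDs that are both unique and reproducible is to derive them from the record itself, so resubmitting the same record yields the same batch_request_id. The sketch below is a minimal illustration using only the standard library. The helper names (`make_batch_request_id`, `assign_ids`) and the `prefix_hash` ID layout are assumptions for illustration, not part of the xAI SDK, and any length or character limits on IDs are not covered here.

```python
import hashlib
import json


def make_batch_request_id(record: dict, prefix: str = "req") -> str:
    """Derive a stable ID from a record's content (hypothetical helper)."""
    # Serialize with sorted keys so the same record always hashes the same way.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}_{digest}"


def assign_ids(records: list[dict], prefix: str = "req") -> dict[str, dict]:
    """Map each generated ID to its record, rejecting duplicate records."""
    by_id: dict[str, dict] = {}
    for record in records:
        request_id = make_batch_request_id(record, prefix)
        if request_id in by_id:
            raise ValueError(f"duplicate record for batch_request_id {request_id}")
        by_id[request_id] = record
    return by_id


records = [
    {"ticket": 101, "text": "The product exceeded my expectations!"},
    {"ticket": 102, "text": "Shipping took way too long."},
]
ids = assign_ids(records, prefix="feedback")
```

Keeping the `ids` mapping around makes it straightforward to join results back to the source records once the batch finishes.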

from xai_sdk import Client
from xai_sdk.chat import system, user
from xai_sdk.tools import web_search, x_search, mcp

client = Client()

batch_requests = []

# Chat completion with tools
chat = client.chat.create(
    model="grok-4.3",
    batch_request_id="chat_001",
    tools=[web_search(), x_search()],
)
chat.append(system("Analyze market sentiment from recent news and posts."))
chat.append(user("What is the current sentiment around TSLA stock?"))
batch_requests.append(chat)

# Image generation
image_req = client.image.prepare(
    prompt="A sleek modern laptop on a minimalist desk",
    model="grok-imagine-image-quality",
    batch_request_id="img_001",
)
batch_requests.append(image_req)

# Image edit
image_edit_req = client.image.prepare(
    prompt="Add a rainbow in the background",
    model="grok-imagine-image-quality",
    image_url="https://picsum.photos/800",
    batch_request_id="img_edit_001",
)
batch_requests.append(image_edit_req)

# Video generation
video_req = client.video.prepare(
    prompt="A product rotating on a turntable with dramatic lighting",
    model="grok-imagine-video",
    batch_request_id="vid_001",
)
batch_requests.append(video_req)

# Video edit
video_edit_req = client.video.prepare(
    prompt="Make it slow motion",
    model="grok-imagine-video",
    video_url="https://lorem.video/cat_360p_3s",
    batch_request_id="vid_edit_001",
)
batch_requests.append(video_edit_req)

# Video extension
video_ext_req = client.video.prepare_extension(
    prompt="The camera slowly pans to reveal a sunset behind the mountains",
    model="grok-imagine-video",
    video_url="https://lorem.video/cat_360p_3s",
    duration=6,
    batch_request_id="vid_ext_001",
)
batch_requests.append(video_ext_req)

# Remote MCP
mcp_chat = client.chat.create(
    model="grok-4.3",
    batch_request_id="mcp_001",
    tools=[mcp(server_url="https://mcp.deepwiki.com/mcp")],
)
mcp_chat.append(user("What does the xai-sdk-python repo do?"))
batch_requests.append(mcp_chat)

# Add all requests to the batch
client.batch.add(batch_id=batch.batch_id, batch_requests=batch_requests)
print(f"Added {len(batch_requests)} requests to batch")
curl -X POST https://api.x.ai/v1/batches/{batch_id}/requests \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "batch_requests": [
      {
        "batch_request_id": "feedback_001",
        "batch_request": {
          "responses": {
            "input": [
              {"role": "system", "content": "Classify the sentiment as positive, negative, or neutral."},
              {"role": "user", "content": "The product exceeded my expectations!"}
            ],
            "model": "grok-4.3"
          }
        }
      },
      {
        "batch_request_id": "feedback_002",
        "batch_request": {
          "responses": {
            "input": [
              {"role": "system", "content": "Classify the sentiment as positive, negative, or neutral."},
              {"role": "user", "content": "Shipping took way too long."}
            ],
            "model": "grok-4.3"
          }
        }
      }
    ]
  }'
const batchRequests = [];

// Chat completion with tools (uses "responses" endpoint for server-side tool support)
batchRequests.push({
  batch_request_id: "chat_001",
  batch_request: {
    responses: {
      model: "grok-4.3",
      tools: [{ type: "web_search" }, { type: "x_search" }],
      input: [
        { role: "system", content: "Analyze market sentiment from recent news and posts." },
        { role: "user", content: "What is the current sentiment around TSLA stock?" },
      ],
    },
  },
});

// Image generation
batchRequests.push({
  batch_request_id: "img_001",
  batch_request: {
    image_generation: {
      prompt: "A sleek modern laptop on a minimalist desk",
      model: "grok-imagine-image-quality",
    },
  },
});

// Image edit
batchRequests.push({
  batch_request_id: "img_edit_001",
  batch_request: {
    image_edit: {
      prompt: "Add a rainbow in the background",
      model: "grok-imagine-image-quality",
      image: { url: "https://picsum.photos/800", type: "image_url" },
    },
  },
});

// Video generation
batchRequests.push({
  batch_request_id: "vid_001",
  batch_request: {
    video_generation: {
      prompt: "A product rotating on a turntable with dramatic lighting",
      model: "grok-imagine-video",
    },
  },
});

// Video edit
batchRequests.push({
  batch_request_id: "vid_edit_001",
  batch_request: {
    video_generation: {
      prompt: "Make it slow motion",
      model: "grok-imagine-video",
      video: { url: "https://lorem.video/cat_360p_3s" },
    },
  },
});

// Video extension
batchRequests.push({
  batch_request_id: "vid_ext_001",
  batch_request: {
    video_extension: {
      prompt: "The camera slowly pans to reveal a sunset behind the mountains",
      model: "grok-imagine-video",
      video: { url: "https://lorem.video/cat_360p_3s" },
      duration: 6,
    },
  },
});

// Remote MCP
batchRequests.push({
  batch_request_id: "mcp_001",
  batch_request: {
    responses: {
      model: "grok-4.3",
      tools: [{ type: "mcp", server_label: "deepwiki", server_url: "https://mcp.deepwiki.com/mcp" }],
      input: [{ role: "user", content: "What does the xai-sdk-python repo do?" }],
    },
  },
});

// Add all requests to the batch
const response = await fetch(`https://api.x.ai/v1/batches/${batchId}/requests`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.XAI_API_KEY}`,
  },
  body: JSON.stringify({ batch_requests: batchRequests }),
});
if (!response.ok) throw new Error(`Failed to add requests: ${await response.text()}`);
console.log(`Added ${batchRequests.length} requests to batch`);
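
When the inputs come from a dataset rather than hand-written literals, the REST body can be generated programmatically. The following sketch builds the same `batch_requests` structure shown in the curl example from a mapping of IDs to texts. The function name `build_sentiment_requests` is hypothetical, and the sketch only constructs the JSON body; sending it is left to whichever HTTP client you use.

```python
import json


def build_sentiment_requests(items: dict[str, str], model: str = "grok-4.3") -> dict:
    """Build the JSON body for POST /v1/batches/{batch_id}/requests."""
    system_prompt = "Classify the sentiment as positive, negative, or neutral."
    batch_requests = []
    for request_id, text in items.items():
        batch_requests.append({
            "batch_request_id": request_id,
            "batch_request": {
                "responses": {
                    "model": model,
                    "input": [
                        {"role": "system", "content": system_prompt},
                        {"role": "user", "content": text},
                    ],
                }
            },
        })
    return {"batch_requests": batch_requests}


payload = build_sentiment_requests({
    "feedback_001": "The product exceeded my expectations!",
    "feedback_002": "Shipping took way too long.",
})
body = json.dumps(payload)  # serialized body, ready to send
```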

Step 3: Monitor batch progress

After adding requests, they begin processing in the background. Since batch processing is asynchronous, you need to poll the batch status to know when results are ready.

The batch state includes counters for pending, successful, failed, and cancelled requests. Poll periodically until num_pending reaches zero, which indicates that every request has been processed (successfully, with an error, or cancelled).

# Check batch status
curl https://api.x.ai/v1/batches/{batch_id} \
  -H "Authorization: Bearer $XAI_API_KEY"

# Response includes state with request counts:
# {
#   "state": {
#     "num_requests": 100,
#     "num_pending": 25,
#     "num_success": 70,
#     "num_error": 5
#   }
# }
import time
from xai_sdk import Client

client = Client()

# Poll until all requests are processed
print("Waiting for batch to complete...")
while True:
    batch = client.batch.get(batch_id=batch.batch_id)
    
    pending = batch.state.num_pending
    completed = batch.state.num_success + batch.state.num_error
    total = batch.state.num_requests
    
    print(f"Progress: {completed}/{total} complete, {pending} pending")
    
    if pending == 0:
        print("Batch processing complete!")
        break
    
    # Wait before polling again (avoid hammering the API)
    time.sleep(5)
// Poll until all requests are processed
console.log("Waiting for batch to complete...");
const interval = setInterval(async () => {
  const response = await fetch(
    `https://api.x.ai/v1/batches/${batchId}`,
    { headers: { Authorization: `Bearer ${process.env.XAI_API_KEY}` } }
  );
  const batch = await response.json();

  const { num_pending, num_success, num_error, num_requests } = batch.state;
  const completed = num_success + num_error;
  console.log(`Progress: ${completed}/${num_requests} complete, ${num_pending} pending`);

  if (num_requests > 0 && num_pending === 0) {
    clearInterval(interval);
    console.log("Batch processing complete!");
  }
  // Wait before polling again (avoid hammering the API)
}, 5000);
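
The polling loops above use a fixed 5-second interval. Because a batch may take up to 24 hours, a long-running job might prefer an interval that grows over time and a hard timeout. The sketch below is one possible approach and is not part of the xAI SDK: `wait_for_batch`, `fetch_state`, and the delay parameters are hypothetical names, and `fetch_state` is assumed to return the `state` counters as a plain dict.

```python
import time
from typing import Callable


def wait_for_batch(
    fetch_state: Callable[[], dict],
    initial_delay: float = 5.0,
    max_delay: float = 300.0,
    timeout: float = 24 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until num_pending is zero, doubling the delay after each poll."""
    delay = initial_delay
    waited = 0.0
    while True:
        state = fetch_state()
        # Guard against an empty batch, as the JavaScript example does.
        if state["num_requests"] > 0 and state["num_pending"] == 0:
            return state
        if waited >= timeout:
            raise TimeoutError(
                f"batch still has {state['num_pending']} pending requests"
            )
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # exponential backoff with a cap


# Simulated status responses stand in for real API calls.
fake_states = iter([
    {"num_requests": 100, "num_pending": 25, "num_success": 70, "num_error": 5},
    {"num_requests": 100, "num_pending": 0, "num_success": 95, "num_error": 5},
])
final_state = wait_for_batch(lambda: next(fake_states), sleep=lambda seconds: None)
```

Injecting `sleep` keeps the helper testable without real waiting. In practice `fetch_state` would wrap the status call from the examples above.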

Understanding batch states

The Batch API tracks state at two levels: the batch level and the individual request level.

Batch-level state shows aggregate progress across all requests in a given batch, accessible through the batch.state object returned by the client.batch.get() method:

| Counter | Description |
|---|---|
| num_cancelled | Requests that were cancelled |
| num_error | Requests that failed with an error |
| num_pending | Requests waiting to be processed |
| num_requests | Total number of requests added to the batch |
| num_success | Requests that completed successfully |

When num_pending reaches zero, all requests have been processed (either successfully, with errors, or cancelled).
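
As a small illustration of how these counters combine, the sketch below turns a state dict into a progress summary that also counts cancelled requests as finished. `summarize_state` is a hypothetical helper, and the input is assumed to be the counters as a plain dict (for example, the `state` object from the REST response).

```python
def summarize_state(state: dict) -> dict:
    """Summarize batch progress from the counters in the table above."""
    total = state.get("num_requests", 0)
    pending = state.get("num_pending", 0)
    # A request is finished once it succeeded, failed, or was cancelled.
    finished = (
        state.get("num_success", 0)
        + state.get("num_error", 0)
        + state.get("num_cancelled", 0)
    )
    return {
        "finished": finished,
        "total": total,
        "percent_finished": round(100 * finished / total, 1) if total else 0.0,
        "done": total > 0 and pending == 0,
    }


summary = summarize_state(
    {"num_requests": 100, "num_pending": 25, "num_success": 70, "num_error": 5}
)
```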

Individual request states describe where each request is in its lifecycle, accessible through the batch_request_metadata object returned by the client.batch.list_batch_requests() method:

| State | Description |
|---|---|
| cancelled | Request was cancelled (e.g., when the batch was cancelled before this request was processed) |
| failed | Request encountered an error during processing |
| pending | Request is queued and waiting to be processed |
| succeeded | Request completed successfully, result is available |
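
Per-request states are handy for deciding what to retry. The sketch below tallies requests by state and collects the IDs of failed ones. It assumes each metadata entry is a dict with `batch_request_id` and `state` keys, and that the state strings match the labels in the table above exactly; the actual wire values may differ, so anything unrecognized is counted under `other`. The helper names are hypothetical.

```python
from collections import Counter

KNOWN_STATES = ("pending", "succeeded", "failed", "cancelled")


def count_request_states(metadata: list[dict]) -> dict[str, int]:
    """Tally requests per state; unrecognized states land in 'other'."""
    counts = Counter(item["state"] for item in metadata)
    summary = {state: counts.pop(state, 0) for state in KNOWN_STATES}
    summary["other"] = sum(counts.values())
    return summary


def failed_request_ids(metadata: list[dict]) -> list[str]:
    """Return the IDs of requests that ended in the 'failed' state."""
    return [m["batch_request_id"] for m in metadata if m["state"] == "failed"]


sample_metadata = [
    {"batch_request_id": "fb_001", "state": "succeeded"},
    {"batch_request_id": "fb_002", "state": "failed"},
    {"batch_request_id": "fb_003", "state": "pending"},
    {"batch_request_id": "fb_004", "state": "succeeded"},
]
state_counts = count_request_states(sample_metadata)
retry_ids = failed_request_ids(sample_metadata)
```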

Batch lifecycle: A batch can also be cancelled or expire. If you cancel a batch, pending requests won't be processed, but already-completed results remain available. Batches have an expiration time after which results are no longer accessible—check the expires_at field when retrieving batch details.

Step 4: Retrieve results

You can retrieve results at any time, even before the entire batch completes. Results are available as soon as individual requests finish processing, so you can start consuming completed results while other requests are still in progress.

Each result is linked to its original request via the batch_request_id you assigned earlier. For chat completions, use result.response which has the familiar fields: .content, .usage, .finish_reason, and more. For image requests, use result.image_response which provides .url, .base64, .usage, and .model. For video requests, use result.video_response which provides .url, .duration, .usage, and .model. These are the same response types returned by the regular client.image.sample() and client.video.generate() methods.

The SDK provides convenient .succeeded and .failed properties to separate successful responses from errors.

Pagination: Results are returned in pages. Use the limit parameter to control page size and pagination_token to fetch subsequent pages. When pagination_token is None, you've reached the end.

from xai_sdk import Client

client = Client()

# Paginate through all results
all_succeeded = []
all_failed = []
pagination_token = None

while True:
    # Fetch a page of results (limit controls page size)
    page = client.batch.list_batch_results(
        batch_id=batch.batch_id,
        limit=100,
        pagination_token=pagination_token,
    )
    
    # Collect results from this page
    all_succeeded.extend(page.succeeded)
    all_failed.extend(page.failed)
    
    # Check if there are more pages
    if page.pagination_token is None:
        break
    pagination_token = page.pagination_token

# Process results - handle different response types
print(f"Successfully processed: {len(all_succeeded)} requests")
for result in all_succeeded:
    rid = result.batch_request_id
    resp = result.proto.response

    if resp.HasField("completion_response"):
        # Chat completion response
        print(f"[{rid}] {result.response.content}")
        print(f"  Tokens used: {result.response.usage.total_tokens}")
    elif resp.HasField("image_response"):
        # Image generation response
        print(f"[{rid}] Image URL: {result.image_response.url}")
    elif resp.HasField("video_response"):
        # Video generation response
        print(f"[{rid}] Video URL: {result.video_response.url}")

if all_failed:
    print(f"\nFailed: {len(all_failed)} requests")
    for result in all_failed:
        print(f"[{result.batch_request_id}] Error: {result.error_message}")
# Fetch first page
curl "https://api.x.ai/v1/batches/{batch_id}/results?limit=100" \
  -H "Authorization: Bearer $XAI_API_KEY"

# Use pagination_token from response to fetch next page
curl "https://api.x.ai/v1/batches/{batch_id}/results?limit=100&pagination_token={token}" \
  -H "Authorization: Bearer $XAI_API_KEY"
// Paginate through all results
const allSucceeded = [];
const allFailed = [];
let paginationToken = undefined;

while (true) {
  // Fetch a page of results (limit controls page size)
  const url = new URL(`https://api.x.ai/v1/batches/${batchId}/results`);
  url.searchParams.set("limit", "100");
  if (paginationToken) url.searchParams.set("pagination_token", paginationToken);

  const res = await fetch(url, {
    headers: { Authorization: `Bearer ${process.env.XAI_API_KEY}` },
  });
  const page = await res.json();

  // Collect results from this page
  for (const result of page.results) {
    const response = result.batch_result?.response;
    if (response?.chat_get_completion || response?.image_generation || response?.video_generation) {
      allSucceeded.push(result);
    } else {
      allFailed.push(result);
    }
  }

  // Check if there are more pages
  if (!page.pagination_token) break;
  paginationToken = page.pagination_token;
}

// Process all results
console.log(`Successfully processed: ${allSucceeded.length} requests`);
for (const result of allSucceeded) {
  const response = result.batch_result.response;
  const content = response.chat_get_completion?.choices[0].message.content
    ?? response.image_generation?.data[0].url
    ?? response.video_generation?.video.url;
  const tokens = response.chat_get_completion?.usage?.total_tokens;
  // Access the full response object
  console.log(`[${result.batch_request_id}] ${content}`);
  if (tokens != null) console.log(`  Tokens used: ${tokens}`);
}

if (allFailed.length > 0) {
  console.log(`\nFailed: ${allFailed.length} requests`);
  for (const result of allFailed) {
    console.log(`[${result.batch_request_id}] Error: ${result.error_message}`);
  }
}
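
The token-based pagination described in this step can be wrapped in a generator so callers iterate over results without managing tokens themselves. This is a minimal sketch under stated assumptions: `iter_results` and `fetch_page` are hypothetical names, and each page is assumed to be a dict with `results` and `pagination_token` keys, matching the REST response shape used in the JavaScript example.

```python
from typing import Callable, Iterator, Optional


def iter_results(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every result across pages until no pagination_token is returned."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("results", [])
        token = page.get("pagination_token")
        if not token:
            break


# Simulated pages stand in for real API responses.
fake_pages = {
    None: {
        "results": [{"batch_request_id": "a"}, {"batch_request_id": "b"}],
        "pagination_token": "t1",
    },
    "t1": {"results": [{"batch_request_id": "c"}], "pagination_token": None},
}
result_ids = [r["batch_request_id"] for r in iter_results(lambda token: fake_pages[token])]
```

Because the generator is lazy, results from early pages can be processed before later pages are fetched.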

Additional operations

Beyond the core workflow, the Batch API provides additional operations for managing your batches.

Cancel a batch

You can cancel a batch before all requests complete. Already-processed requests remain available in the results, but pending requests will not be processed. You cannot add more requests to a cancelled batch.

curl -X POST https://api.x.ai/v1/batches/{batch_id}:cancel \
  -H "Authorization: Bearer $XAI_API_KEY"
from xai_sdk import Client

client = Client()

# Cancel processing
cancelled_batch = client.batch.cancel(batch_id=batch.batch_id)
print(f"Cancelled batch: {cancelled_batch.batch_id}")
print(f"Completed before cancellation: {cancelled_batch.state.num_success} requests")
// Cancel processing
const response = await fetch(
  `https://api.x.ai/v1/batches/${batchId}:cancel`,
  { method: "POST", headers: { Authorization: `Bearer ${process.env.XAI_API_KEY}` } }
);
const cancelledBatch = await response.json();
console.log(`Cancelled batch: ${cancelledBatch.batch_id}`);
console.log(`Completed before cancellation: ${cancelledBatch.state.num_success} requests`);

List all batches

View all batches belonging to your team. Batches are retained until they expire (check the expires_at field). This endpoint supports the same limit and pagination_token parameters for paginating through large lists.

curl "https://api.x.ai/v1/batches?limit=20" \
  -H "Authorization: Bearer $XAI_API_KEY"
from xai_sdk import Client

client = Client()

# List recent batches
response = client.batch.list(limit=20)

for batch in response.batches:
    status = "complete" if batch.state.num_pending == 0 else "processing"
    print(f"{batch.name} ({batch.batch_id}): {status}")
// List recent batches
const response = await fetch(
  "https://api.x.ai/v1/batches?limit=20",
  { headers: { Authorization: `Bearer ${process.env.XAI_API_KEY}` } }
);
const data = await response.json();

for (const batch of data.batches) {
  const status = batch.state.num_pending === 0 ? "complete" : "processing";
  console.log(`${batch.name} (${batch.batch_id}): ${status}`);
}

Check individual request status

For detailed tracking, you can inspect the metadata for each request in a batch. This shows the status, timing, and other details for individual requests. This endpoint supports the same limit and pagination_token parameters for paginating through large batches.

curl "https://api.x.ai/v1/batches/{batch_id}/requests?limit=50" \
  -H "Authorization: Bearer $XAI_API_KEY"
from xai_sdk import Client

client = Client()

# Get metadata for individual requests
metadata = client.batch.list_batch_requests(batch_id=batch.batch_id)

for request in metadata.batch_request_metadata:
    print(f"Request {request.batch_request_id}: {request.state}")
// Get metadata for individual requests
const response = await fetch(
  `https://api.x.ai/v1/batches/${batchId}/requests?limit=50`,
  { headers: { Authorization: `Bearer ${process.env.XAI_API_KEY}` } }
);
const data = await response.json();

for (const req of data.batch_request_metadata) {
  console.log(`Request ${req.batch_request_id}: ${req.state}`);
}

Track costs

Each batch tracks the total processing cost. Access the cost breakdown after processing to understand your spending. For pricing details, see Batch API Pricing on the Pricing page.

# Fetch batch results; each result carries its own cost
curl -s "https://api.x.ai/v1/batches/{batch_id}/results?limit=100" \
  -H "Authorization: Bearer $XAI_API_KEY"

# Cost per result can be found on response.results[].batch_result.response.chat_get_completion.usage.cost_in_usd_ticks
# Cost is returned in ticks (1 tick = 1e-10 USD) for precision
from xai_sdk import Client

client = Client()

# Get batch with cost information
batch = client.batch.get(batch_id=batch.batch_id)

# Cost is returned in ticks (1e-10 USD) for precision
total_cost_usd = batch.cost_breakdown.total_cost_usd_ticks / 1e10
print("Total cost: $%.4f" % total_cost_usd)
// Fetch batch results; each result carries its own cost
const response = await fetch(
  `https://api.x.ai/v1/batches/${batchId}/results?limit=100`,
  { headers: { Authorization: `Bearer ${process.env.XAI_API_KEY}` } }
);
const data = await response.json();

// Cost is returned in ticks (1 tick = 1e-10 USD) for precision.
// This sums chat completion costs from the first page only; paginate for larger batches.
let totalTicks = 0;
for (const r of data.results) {
  totalTicks += r.batch_result?.response?.chat_get_completion?.usage?.cost_in_usd_ticks ?? 0;
}
console.log(`Total cost: $${(totalTicks / 1e10).toFixed(4)}`);
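
Since one tick is 1e-10 USD, a cost of 12,500,000 ticks works out to 12,500,000 × 1e-10 = $0.00125. The sketch below does the same conversion with `Decimal` to avoid floating-point rounding when summing many small amounts. The helper names are hypothetical, and the result shape is assumed to match the REST path noted in the curl example (`batch_result.response.chat_get_completion.usage.cost_in_usd_ticks`).

```python
from decimal import Decimal

TICKS_PER_USD = Decimal(10) ** 10  # 1 tick = 1e-10 USD


def ticks_to_usd(ticks: int) -> Decimal:
    """Convert integer ticks to an exact USD amount."""
    return Decimal(ticks) / TICKS_PER_USD


def result_cost_ticks(result: dict) -> int:
    """Read the per-result cost, returning 0 when any level is missing."""
    node = result
    for key in ("batch_result", "response", "chat_get_completion", "usage"):
        node = (node or {}).get(key)
    return (node or {}).get("cost_in_usd_ticks", 0)


def total_cost_usd(results: list[dict]) -> Decimal:
    """Sum tick costs as integers first, then convert once."""
    return ticks_to_usd(sum(result_cost_ticks(r) for r in results))


sample_results = [
    {"batch_result": {"response": {"chat_get_completion": {"usage": {"cost_in_usd_ticks": 12_500_000}}}}},
    {"batch_result": {"response": {"chat_get_completion": {"usage": {"cost_in_usd_ticks": 7_500_000}}}}},
    {"batch_result": None},  # e.g. a failed request with no usage data
]
total = total_cost_usd(sample_results)
```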

Complete example

This end-to-end example demonstrates a realistic batch workflow: analyzing customer feedback at scale. It creates a batch, submits feedback items for sentiment analysis, waits for processing, and outputs the results. For simplicity, this example doesn't paginate results—see Step 4 for pagination when processing larger batches.

import time
from xai_sdk import Client
from xai_sdk.chat import system, user

client = Client()

# Sample dataset: customer feedback to analyze
feedback_data = [
    {"id": "fb_001", "text": "Absolutely love this product! Best purchase ever."},
    {"id": "fb_002", "text": "Delivery was late and the packaging was damaged."},
    {"id": "fb_003", "text": "Works fine, nothing special to report."},
    {"id": "fb_004", "text": "Customer support was incredibly helpful!"},
    {"id": "fb_005", "text": "The app keeps crashing on my phone."},
]

# Step 1: Create a batch
print("Creating batch...")
batch = client.batch.create(batch_name="feedback_sentiment_analysis")
print(f"Batch created: {batch.batch_id}")

# Step 2: Build and add requests
print("\nAdding requests...")
batch_requests = []
for item in feedback_data:
    chat = client.chat.create(
        model="grok-4.3",
        batch_request_id=item["id"],
    )
    chat.append(system(
        "Analyze the sentiment of the customer feedback. "
        "Respond with exactly one word: positive, negative, or neutral."
    ))
    chat.append(user(item["text"]))
    batch_requests.append(chat)

client.batch.add(batch_id=batch.batch_id, batch_requests=batch_requests)
print(f"Added {len(batch_requests)} requests")

# Step 3: Wait for completion
print("\nProcessing...")
while True:
    batch = client.batch.get(batch_id=batch.batch_id)
    pending = batch.state.num_pending
    completed = batch.state.num_success + batch.state.num_error
    
    print(f"  {completed}/{batch.state.num_requests} complete")
    
    if pending == 0:
        break
    time.sleep(2)

# Step 4: Retrieve and display results
print("\n--- Results ---")
results = client.batch.list_batch_results(batch_id=batch.batch_id)

# Create a lookup for original feedback text
feedback_lookup = {item["id"]: item["text"] for item in feedback_data}

for result in results.succeeded:
    original_text = feedback_lookup.get(result.batch_request_id, "")
    sentiment = result.response.content.strip().lower()
    print(f"[{sentiment.upper()}] {original_text[:50]}...")

# Report any failures
if results.failed:
    print("\n--- Errors ---")
    for result in results.failed:
        print(f"[{result.batch_request_id}] {result.error_message}")

# Display cost
cost_usd = batch.cost_breakdown.total_cost_usd_ticks / 1e10
print("\nTotal cost: $%.4f" % cost_usd)
const BASE_URL = "https://api.x.ai/v1";
const headers = { "Content-Type": "application/json", Authorization: `Bearer ${process.env.XAI_API_KEY}` };

// Sample dataset: customer feedback to analyze
const feedbackData = [
  { id: "fb_001", text: "Absolutely love this product! Best purchase ever." },
  { id: "fb_002", text: "Delivery was late and the packaging was damaged." },
  { id: "fb_003", text: "Works fine, nothing special to report." },
  { id: "fb_004", text: "Customer support was incredibly helpful!" },
  { id: "fb_005", text: "The app keeps crashing on my phone." },
];

// Step 1: Create a batch
console.log("Creating batch...");
const batchRes = await fetch(`${BASE_URL}/batches`, {
  method: "POST",
  headers,
  body: JSON.stringify({ name: "feedback_sentiment_analysis" }),
});
const batch = await batchRes.json();
const batchId = batch.batch_id;
console.log(`Batch created: ${batchId}`);

// Step 2: Build and add requests
console.log("\nAdding requests...");
const response = await fetch(`${BASE_URL}/batches/${batchId}/requests`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    batch_requests: feedbackData.map((item) => ({
      batch_request_id: item.id,
      batch_request: {
        chat_get_completion: {
          model: "grok-4.3",
          messages: [
            {
              role: "system",
              content: "Analyze the sentiment of the customer feedback. Respond with exactly one word: positive, negative, or neutral.",
            },
            { role: "user", content: item.text },
          ],
        },
      },
    })),
  }),
});
if (!response.ok) throw new Error(`Failed to add requests: ${await response.text()}`);
console.log(`Added ${feedbackData.length} requests`);

// Step 3: Wait for completion
console.log("\nProcessing...");
const interval = setInterval(async () => {
  const statusRes = await fetch(`${BASE_URL}/batches/${batchId}`, { headers });
  const status = await statusRes.json();
  const { num_pending, num_success, num_error, num_requests } = status.state;
  console.log(`  ${num_success + num_error}/${num_requests} complete`);

  if (num_requests > 0 && num_pending === 0) {
    clearInterval(interval);

    // Step 4: Retrieve and display results
    console.log("\n--- Results ---");
    const resultsRes = await fetch(`${BASE_URL}/batches/${batchId}/results?limit=100`, { headers });
    const { results } = await resultsRes.json();

    // Create a lookup for original feedback text
    const feedbackLookup = Object.fromEntries(feedbackData.map((item) => [item.id, item.text]));

    const succeeded = results.filter((r) => r.batch_result?.response?.chat_get_completion);
    const failed = results.filter((r) => !r.batch_result?.response?.chat_get_completion);

    for (const result of succeeded) {
      const originalText = feedbackLookup[result.batch_request_id] ?? "";
      const sentiment = result.batch_result.response.chat_get_completion.choices[0].message.content.trim().toLowerCase();
      console.log(`[${sentiment.toUpperCase()}] ${originalText.slice(0, 50)}...`);
    }

    // Report any failures
    if (failed.length > 0) {
      console.log("\n--- Errors ---");
      for (const result of failed) {
        console.log(`[${result.batch_request_id}] ${result.error_message}`);
      }
    }

    // Display cost
    let totalTicks = 0;
    for (const r of results) {
      totalTicks += r.batch_result?.response?.chat_get_completion?.usage?.cost_in_usd_ticks ?? 0;
    }
    console.log(`\nTotal cost: $${(totalTicks / 1e10).toFixed(4)}`);
  }
}, 2000);
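
The example trusts the model to answer with exactly one word. In a real pipeline it may be worth validating each answer before storing it. The sketch below is a hypothetical post-processing helper (`normalize_sentiment` is not part of the SDK) that accepts only the three expected labels and returns `None` for anything else, so unexpected output can be flagged for review.

```python
from typing import Optional

VALID_LABELS = {"positive", "negative", "neutral"}


def normalize_sentiment(raw: str) -> Optional[str]:
    """Return a clean label, or None if the output is not one of the three."""
    # Trim whitespace, lowercase, and drop stray trailing punctuation or quotes.
    cleaned = raw.strip().lower().strip(".!\"'")
    return cleaned if cleaned in VALID_LABELS else None


labels = [normalize_sentiment(text) for text in [" Positive.\n", "NEGATIVE", "It seems positive"]]
```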

JSONL file upload

As an alternative to adding requests via the SDK, you can create batches by uploading a JSONL file. This is useful when generating requests from scripts, pipelines, or external tools.

Each line in the file is a JSON object with four fields: custom_id (unique identifier, maps to batch_request_id), method (always "POST"), url (API endpoint path), and body (the JSON request payload matching the REST API reference for that endpoint).

{"custom_id": "chat-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "grok-4.3", "messages": [{"role": "user", "content": "Classify this as positive, negative, or neutral: The product exceeded my expectations!"}]}}
{"custom_id": "search-1", "method": "POST", "url": "/v1/responses", "body": {"model": "grok-4.3", "tools": [{"type": "web_search"}, {"type": "x_search"}], "input": [{"role": "user", "content": "What are the latest SpaceX launches?"}]}}
{"custom_id": "mcp-1", "method": "POST", "url": "/v1/responses", "body": {"model": "grok-4.3", "tools": [{"type": "mcp", "server_label": "deepwiki", "server_url": "https://mcp.deepwiki.com/mcp"}], "input": [{"role": "user", "content": "What does the xai-sdk-python repo do?"}]}}
{"custom_id": "img-1", "method": "POST", "url": "/v1/images/generations", "body": {"model": "grok-imagine-image-quality", "prompt": "A futuristic city skyline at sunset"}}
{"custom_id":
…
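
A file in the four-field format described in this section can be generated from a script. The sketch below builds JSONL text for chat completion requests using only the standard library. `to_jsonl_line` and `build_jsonl` are hypothetical helper names, and the sketch stops at producing the file contents; how the file is uploaded is not covered here.

```python
import json


def to_jsonl_line(custom_id: str, url: str, body: dict) -> str:
    """Serialize one request as a single JSON line with the four fields."""
    return json.dumps(
        {"custom_id": custom_id, "method": "POST", "url": url, "body": body}
    )


def build_jsonl(prompts: dict[str, str], model: str = "grok-4.3") -> str:
    """Build JSONL text with one chat completion request per prompt."""
    lines = [
        to_jsonl_line(
            custom_id,
            "/v1/chat/completions",
            {"model": model, "messages": [{"role": "user", "content": text}]},
        )
        for custom_id, text in prompts.items()
    ]
    return "\n".join(lines) + "\n"


jsonl_text = build_jsonl({
    "chat-1": "Classify this as positive, negative, or neutral: Great product!",
    "chat-2": "Classify this as positive, negative, or neutral: Arrived broken.",
})
```

Using a dict keyed by `custom_id` guarantees the IDs in the file are unique. Writing `jsonl_text` to disk with `open(path, "w", encoding="utf-8")` produces the file.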