• Tech Dev NotesTech Dev Notes
Track apps
  • App Lookup
  • Compare Apps
Market movement
  • Store Charts
  • Global Rankings
  • Movers
  • Version Changelog
Visual research
  • In-app Screenshots
  • Store Screenshots
  • App Icons
Build intelligence
  • Tool Releases
  • Tech Stack
  • Developers
Tools
  • X Feature Flags
  • Grokipedia
  • Blog
  • Follow on X
Skip to content
xAI Docs

xAI · Docs

xAI Docs

June 15, 2026 at 10:17 AM UTC

Developers – API Performance

Changelog

Developers – API Performance

  • New page: Priority Processing — explains how to request higher scheduling priority for Chat Completions and Responses endpoints to reduce time-to-first-token and inter-token latency during high demand. Includes quick start examples (cURL, Python, JavaScript) and best practices.

Developers – Pricing

  • Added Priority Processing Pricing section: requests using service_tier: "priority" are billed at 2× standard rates (applies to all token types). Standard rates apply if the request falls back to the default tier. Prompt caching discounts are applied before the multiplier.
  • Clarified that Priority Processing is available only for Chat Completions and Responses; not supported for image generation, video generation, or Batch API.

Developers – Batch API

  • Added cross-reference to the new Priority Processing page for users seeking lower latency on real-time requests instead of batch processing.
Older release← Jun 15, 00:17 UTC
All xAI Docs releases →

© 2026 Tech Dev Notes

RSSAboutAPIPrivacyTermsSitemap@techdevnotes