
xai-docs/latest/content · Jun 27, 00:17 UTC

pages/developers/files/collections.md

MD·5.4 KB·173 lines

Files & Collections

Collections

Collections gives xAI API users tools for connecting internal knowledge bases and other enterprise content to the xAI API. Whether you're building a RAG application or need to search across large document sets, Collections provides the infrastructure to manage and query your content.

[!NOTE]

Looking for Files? If you want to attach files directly to chat messages for conversation context, see Files. Collections are different—they provide persistent document storage with semantic search across many documents.

Core Concepts

There are two entities that users can create within the Collections service:

  • File — A single entity of a user-uploaded file.
  • Collection — A group of files linked together, with an embedding index for efficient retrieval.
    • When you create a collection you have the option to automatically generate embeddings for any files uploaded to that collection. You can then perform semantic search across files in multiple collections.
    • A single file can belong to multiple collections.
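
The relationship between the two entities can be sketched with a small in-memory model. This is only an illustration of the many-to-many link described above; the class names, field names, and IDs are hypothetical and are not the Collections API.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    # Hypothetical fields, chosen for illustration only.
    file_id: str
    name: str


@dataclass
class Collection:
    collection_id: str
    name: str
    # A collection refers to files by ID, so one file can be shared.
    file_ids: set[str] = field(default_factory=set)

    def add(self, f: File) -> None:
        self.file_ids.add(f.file_id)


def collections_containing(f: File, collections: list[Collection]) -> list[str]:
    """Return the names of every collection that references the file."""
    return [c.name for c in collections if f.file_id in c.file_ids]


handbook = File("file_1", "handbook.pdf")
hr = Collection("col_1", "hr-docs")
onboarding = Collection("col_2", "onboarding")

# The same file is added to two collections without being duplicated.
hr.add(handbook)
onboarding.add(handbook)
```

Calling `collections_containing(handbook, [hr, onboarding])` lists both collection names, which mirrors the rule that a single file can belong to multiple collections.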

What You Can Do

With Collections, you can:

  • Create collections to organize your documents
  • Upload documents in various formats (HTML, PDF, text, etc.)
  • Search semantically across your documents using natural language queries
  • Configure chunking and embeddings to optimize retrieval
  • Manage documents by listing, updating, and deleting them
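
To make the chunk-then-search idea concrete, here is a toy sketch that splits text into overlapping chunks and ranks chunks by cosine similarity to a query. It assumes word counts as a stand-in for learned embeddings, so it matches on shared words rather than meaning; the function names, chunk size, and overlap are hypothetical and say nothing about how Collections chunks or embeds documents.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into word windows of `size` that share `overlap` words."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (NOT a learned embedding)."""
    return Counter(w.lower().strip(".,?!") for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


docs = [
    "Refunds are issued within 14 days.",
    "Our office is closed on public holidays.",
]
best = search("when are refunds issued", docs)
```

Smaller chunks make matches more precise but carry less context per result, which is the trade-off that chunking configuration is meant to tune.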

Getting Started

Choose how you want to work with Collections:

  • Using the Console → - Create collections and upload documents through the xAI Console interface
  • Using the API → - Programmatically manage collections with the SDK and REST API

Metadata Fields

Collections support metadata fields — structured attributes you can attach to documents for enhanced retrieval and data integrity:

  • Filtered retrieval — Narrow search results to documents matching specific criteria (e.g., author="Sandra Kim")
  • Contextual embeddings — Inject metadata into chunks to improve retrieval accuracy (e.g., prepending document title to each chunk)
  • Data integrity constraints — Enforce required fields or uniqueness across documents

When creating a collection, define metadata fields with options like required, unique, and inject_into_chunk to control how metadata is validated and used during search.
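
The three options can be illustrated with a local sketch of what each one implies. The option names `required`, `unique`, and `inject_into_chunk` come from the text above; everything else (the `FieldSpec` class, the helper functions, the example field names) is hypothetical and does not reflect the actual request format, which is covered in the metadata documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    required: bool = False           # document must supply a value
    unique: bool = False             # no two documents may share a value
    inject_into_chunk: bool = False  # value is prepended to each chunk


def validate(doc_meta: dict, specs: list[FieldSpec], seen: dict) -> list[str]:
    """Check one document's metadata; `seen` tracks values of unique fields."""
    problems = []
    for spec in specs:
        value = doc_meta.get(spec.name)
        if value is None:
            if spec.required:
                problems.append(f"missing required field: {spec.name}")
            continue
        if spec.unique and value in seen.setdefault(spec.name, set()):
            problems.append(f"duplicate value for unique field: {spec.name}")
    if not problems:
        # Only record unique values for documents that passed validation.
        for spec in specs:
            if spec.unique and spec.name in doc_meta:
                seen.setdefault(spec.name, set()).add(doc_meta[spec.name])
    return problems


def inject(chunk: str, doc_meta: dict, specs: list[FieldSpec]) -> str:
    """Prepend every inject_into_chunk field to the chunk text."""
    header = [
        f"{s.name}: {doc_meta[s.name]}"
        for s in specs
        if s.inject_into_chunk and s.name in doc_meta
    ]
    return "\n".join(header + [chunk])


specs = [
    FieldSpec("title", required=True, inject_into_chunk=True),
    FieldSpec("doc_id", unique=True),
    FieldSpec("author"),
]
seen: dict = {}
first = validate({"title": "Q3 Report", "doc_id": "A1", "author": "Sandra Kim"}, specs, seen)
second = validate({"doc_id": "A1"}, specs, seen)
enriched = inject("Revenue grew.", {"title": "Q3 Report"}, specs)
```

Here the first document passes, the second is rejected for a missing `title` and a repeated `doc_id`, and `enriched` carries the title as a prefix so the chunk keeps its document context when embedded.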

Learn more about metadata fields →

Usage Limits

To upload files and add them to a collection, you must have credits in your account.

  • Maximum file size: 100MB
  • Maximum number of files: 100,000 files uploaded globally
  • Maximum total size: 100GB

Please contact us to increase any of these limits.
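
A client-side pre-check against these limits might look like the sketch below. It assumes decimal units (1MB = 1,000,000 bytes), which the limits above do not specify, and the function and parameter names are hypothetical; the service remains the authority on what it accepts.

```python
MB = 1000 ** 2  # assumption: decimal megabytes
GB = 1000 ** 3  # assumption: decimal gigabytes

MAX_FILE_BYTES = 100 * MB
MAX_FILES = 100_000
MAX_TOTAL_BYTES = 100 * GB


def check_limits(sizes: list[int], existing_files: int = 0,
                 existing_bytes: int = 0) -> list[str]:
    """Return reasons a planned upload would exceed the documented limits.

    sizes: byte sizes of the files about to be uploaded.
    existing_files / existing_bytes: what the account already stores.
    """
    problems = []
    for index, size in enumerate(sizes):
        if size > MAX_FILE_BYTES:
            problems.append(f"file {index} is larger than 100MB")
    if existing_files + len(sizes) > MAX_FILES:
        problems.append("upload would exceed 100,000 files")
    if existing_bytes + sum(sizes) > MAX_TOTAL_BYTES:
        problems.append("upload would exceed 100GB total")
    return problems
```

Note that the limits interact: 100GB spread over 100,000 files averages 1MB per file under decimal units, so an account holding mostly large files will reach the total-size limit long before the file-count limit.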

Data Privacy

We do not use user data stored on Collections for model training purposes.

Supported MIME Types

While we support any UTF-8 encoded text file, we also have special file conversion and chunking techniques for certain MIME types.

The following is a non-exhaustive list of the MIME types that we support:

  • application/csv
  • application/dart
  • application/ecmascript
  • application/epub
  • application/epub+zip
  • application/json
  • application/ms-java
  • application/msword
  • application/pdf
  • application/typescript
  • application/vnd.adobe.pdf
  • application/vnd.curl
  • application/vnd.dart
  • application/vnd.jupyter
  • application/vnd.ms-excel
  • application/vnd.ms-outlook
  • application/vnd.oasis.opendocument.text
  • application/vnd.openxmlformats-officedocument.presentationml.presentation
  • application/vnd.openxmlformats-officedocument.presentationml.slide
  • application/vnd.openxmlformats-officedocument.presentationml.slideshow
  • application/vnd.openxmlformats-officedocument.presentationml.template
  • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
  • application/vnd.openxmlformats-officedocument.spreadsheetml.template
  • application/vnd.openxmlformats-officedocument.wordprocessingml.document
  • application/x-csh
  • application/x-epub+zip
  • application/x-hwp
  • application/x-hwp-v5
  • application/x-latex
  • application/x-pdf
  • application/x-php
  • application/x-powershell
  • application/x-sh
  • application/x-shellscript
  • application/x-tex
  • application/x-zsh
  • application/xhtml
  • application/xml
  • application/zip
  • text/cache-manifest
  • text/calendar
  • text/css
  • text/csv
  • text/html
  • text/javascript
  • text/jsx
  • text/markdown
  • text/n3
  • text/php
  • text/plain
  • text/rtf
  • text/tab-separated-values
  • text/troff
  • text/tsv
  • text/tsx
  • text/turtle
  • text/uri-list
  • text/vcard
  • text/vtt
  • text/x-asm
  • text/x-bibtex
  • text/x-c
  • text/x-c++hdr
  • text/x-c++src
  • text/x-chdr
  • text/x-coffeescript
  • text/x-csh
  • text/x-csharp
  • text/x-csrc
  • text/x-d
  • text/x-diff
  • text/x-emacs-lisp
  • text/x-erlang
  • text/x-go
  • text/x-haskell
  • text/x-java
  • text/x-java-properties
  • text/x-java-source
  • text/x-kotlin
  • text/x-lisp
  • text/x-lua
  • text/x-objcsrc
  • text/x-pascal
  • text/x-perl
  • text/x-perl-script
  • text/x-python
  • text/x-python-script
  • text/x-r-markdown
  • text/x-rst
  • text/x-ruby-script
  • text/x-rust
  • text/x-sass
  • text/x-scala
  • text/x-scheme
  • text/x-script.python
  • text/x-scss
  • text/x-sh
  • text/x-sql
  • text/x-swift
  • text/x-tcl
  • text/x-tex
  • text/x-vbasic
  • text/x-vcalendar
  • text/xml
  • text/xml-dtd
  • text/yaml