gemini-docs/latest/content · Jun 26, 14:03 UTC

pages/file-input-methods.txt

TXT·17.2 KB·238 lines

route: /gemini-api/docs/file-input-methods
title: File input methods
description: Get started building with Gemini's multimodal capabilities in the Gemini API

Note: This version of the page covers the Interactions API. You can use the toggle on this page to switch to the generateContent API version of this page.
This guide explains the different ways you can include media files such as
images, audio, video, and documents when making requests to the Gemini API.
The new methods are supported in all of the Gemini API endpoints, including
Batch, Interactions and Live API.
Choosing the right method depends on the size of your file, where your data is
stored, and how frequently you plan to use the file.
The simplest way to include a file as your input is to read a local file and
include it in a prompt. The following example shows how to read a local PDF
file. PDFs are limited to 50 MB for this method. See the
Input method comparison table for a complete list of file
input types and limits.
Python
from google import genai
import pathlib
import base64
client = genai.Client()
filepath = pathlib.Path('my_local_file.pdf')
prompt = "Summarize this document"
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "text", "text": prompt},
        {"type": "document", "data": base64.b64encode(filepath.read_bytes()).decode('utf-8'), "mime_type": "application/pdf"}
    ]
)
print(interaction.output_text)
JavaScript
import { GoogleGenAI } from "@google/genai";
import * as fs from 'node:fs';
const client = new GoogleGenAI({});
const prompt = "Summarize this document";
async function main() {
  const filePath = 'my_local_file.pdf';
  const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: [
      { type: "text", text: prompt },
      {
        type: "document",
        data: fs.readFileSync(filePath).toString("base64"),
        mime_type: "application/pdf"
      }
    ]
  });
  console.log(interaction.output_text);
}
main();
REST
# Encode the local file to base64
B64_CONTENT=$(base64 -w 0 my_local_file.pdf)
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-3.5-flash",
    "input": [
      {"type": "text", "text": "Summarize this document"},
      {
        "type": "document",
        "data": "'${B64_CONTENT}'",
        "mime_type": "application/pdf"
      }
    ]
  }'
Input method comparison
The following table compares each input method with file limits and best use
cases. Note that the file size limit may vary depending on the file type and
model or tokenizer used to process the file.
Method | Best for | Max file size | Persistence
Inline data | Quick testing, small files, real-time applications. | 100 MB per request or payload (50 MB for PDFs) | None (sent with every request)
File API upload | Large files, files used multiple times. | 2 GB per file, up to 20 GB per project | 48 hours
File API GCS URI registration | Large files already in Google Cloud Storage, files used multiple times. | 2 GB per file, no overall storage limits | None (fetched per request). One-time registration can give access for up to 30 days.
External URLs | Public data or data in cloud buckets (AWS, Azure, GCS) without re-uploading. | 100 MB per request/payload | None (fetched per request)
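The limits in the comparison above can be turned into a rough selection helper. The following is a minimal sketch, not an official rule: the function name, its flags, and the order of the checks are hypothetical, and it assumes 1 MB = 1024 * 1024 bytes, which this guide does not specify.

```python
MB = 1024 * 1024  # assumption: binary megabytes; the guide does not define the unit
GB = 1024 * MB

INLINE_LIMIT = 100 * MB        # inline data, per request or payload
INLINE_PDF_LIMIT = 50 * MB     # inline data, PDFs
EXTERNAL_URL_LIMIT = 100 * MB  # external URLs, per request or payload
FILE_API_LIMIT = 2 * GB        # File API upload or GCS registration, per file


def choose_input_method(size_bytes, *, is_pdf=False, in_gcs=False,
                        has_public_url=False, reused=False):
    """Suggest an input method name using a hypothetical heuristic."""
    if size_bytes > FILE_API_LIMIT:
        raise ValueError("file exceeds the 2 GB per-file limit")
    if in_gcs:
        # Data already in Google Cloud Storage can be registered without re-uploading.
        return "File API GCS URI registration"
    if has_public_url and size_bytes <= EXTERNAL_URL_LIMIT:
        return "External URLs"
    inline_limit = INLINE_PDF_LIMIT if is_pdf else INLINE_LIMIT
    if not reused and size_bytes <= inline_limit:
        return "Inline data"
    # Larger files, or files used in several requests, go through an upload.
    return "File API upload"


print(choose_input_method(10 * MB))               # Inline data
print(choose_input_method(60 * MB, is_pdf=True))  # File API upload
```

The ordering encodes one possible preference (avoid re-uploading data that is already hosted). Adjust it to your own latency and reuse requirements.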
Inline data
For smaller files (under 100 MB, or 50 MB for PDFs), you can pass the data
directly in the request payload. This is the simplest method for quick tests or
applications handling real-time, transient data. You can provide data as
base64-encoded strings or by reading local files directly.
For an example of reading from a local file, see the example at the beginning of
this page.
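Before sending inline data, it can help to check the size locally. The following is a minimal sketch: `inline_pdf_part` is a hypothetical helper, not part of the SDK. It assumes the 50 MB PDF limit applies to the raw file bytes and that 1 MB = 1024 * 1024 bytes, neither of which this guide states. The returned dictionary has the same shape as the document part used in the examples on this page.

```python
import base64
import pathlib

MB = 1024 * 1024            # assumption: binary megabytes
PDF_INLINE_LIMIT = 50 * MB  # inline limit for PDFs, from this guide


def inline_pdf_part(path):
    """Read a local PDF and return an inline document part (hypothetical helper)."""
    data = pathlib.Path(path).read_bytes()
    if len(data) > PDF_INLINE_LIMIT:
        # Assumption: the limit is checked against the raw, unencoded size.
        raise ValueError(f"{path} is {len(data)} bytes; use the File API instead")
    return {
        "type": "document",
        "data": base64.b64encode(data).decode("utf-8"),
        "mime_type": "application/pdf",
    }
```

The result can be placed in the `input` list next to a text part, as in the local file example.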
Fetch from a URL
You can also fetch a file from a URL, convert it to bytes, and include it in the
input.
Python
from google import genai
import base64
import httpx
client = genai.Client()
doc_url = "https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf"
# Download the PDF and keep its raw bytes
doc_data = httpx.get(doc_url).content
prompt = "Summarize this document"
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "document", "data": base64.b64encode(doc_data).decode('utf-8'), "mime_type": "application/pdf"},
        {"type": "text", "text": prompt}
    ]
)
print(interaction.output_text)
JavaScript
import { GoogleGenAI } from "@google/genai";
const client = new GoogleGenAI({});
const docUrl = 'https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf';
const prompt = "Summarize this document";
async function main() {
  const pdfResp = await fetch(docUrl)
    .then((response) => response.arrayBuffer());
  const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: [
      { type: "text", text: prompt },
      {
        type: "document",
        data: Buffer.from(pdfResp).toString("base64"),
        mime_type: "application/pdf"
      }
    ]
  });
  console.log(interaction.output_text);
}
main();
REST
DOC_URL="https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf"
PROMPT="Summarize this document"
DISPLAY_NAME="base64_pdf"
# Download the PDF
wget -O "${DISPLAY_NAME}.pdf" "${DOC_URL}"
# Check for FreeBSD base64 and set flags accordingly
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  B64FLAGS="--input"
else
  B64FLAGS="-w0"
fi
# Base64 encode the PDF
ENCODED_PDF=$(base64 $B64FLAGS "${DISPLAY_NAME}.pdf")
# Create JSON payload file
cat <<EOF > payload.json
{
  "model": "gemini-3.5-flash",
  "input": [
    {"type": "document", "data": "${ENCODED_PDF}", "mime_type": "application/pdf"},
    {"type": "text", "text": "${PROMPT}"}
  ]
}
EOF
# Generate content using interactions
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d @payload.json 2> /dev/null > response.json
cat response.json
echo
jq ".outputs[] | select(.type == \"text\") | .text" response.json
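The jq filter at the end of the REST example keeps only the text entries of the response. A rough Python equivalent is sketched below. The response shape is inferred from that filter alone, and the sample payload, including the `other` entry type, is invented for illustration.

```python
def text_outputs(response):
    """Mirror the jq filter: .outputs[] | select(.type == "text") | .text"""
    return [
        item["text"]
        for item in response.get("outputs", [])
        if item.get("type") == "text"
    ]


# Hypothetical response body; only the "outputs", "type" and "text" keys
# are taken from the jq filter in the REST example.
sample = {
    "outputs": [
        {"type": "other", "detail": "ignored"},
        {"type": "text", "text": "A short summary."},
    ]
}

print(text_outputs(sample))  # ['A short summary.']
```

To apply it to the saved file from the REST example, load `response.json` with `json.load` and pass the resulting dictionary to `text_outputs`. Unlike jq, this version returns an empty list instead of failing when `outputs` is missing.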
Gemini File API
The File API is designed for larger files (up to 2 GB) or files you intend to
use in multiple requests.
Standard file upload
Upload a local file to the Gemini API. Files uploaded this way are stored
temporarily (48 hours) and processed for efficient retrieval by the model.
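Because uploads are kept for only 48 hours, an application that reuses a file may want to know when to upload it again. The following is a minimal sketch that assumes your own code records the upload time. The helper name is hypothetical, and it does not rely on any expiry field the API may return.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=48)  # storage period stated in this guide


def is_expired(uploaded_at, now=None):
    """Return True once 48 hours have passed since the recorded upload time."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - uploaded_at >= RETENTION


uploaded_at = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_expired(uploaded_at, now=uploaded_at + timedelta(hours=47)))  # False
print(is_expired(uploaded_at, now=uploaded_at + timedelta(hours=48)))  # True
```

Using timezone-aware UTC timestamps avoids off-by-hours errors when the upload and the check run in different time zones.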
Python
from google import genai
client = genai.Client()
doc_file = client.files.upload(file="path/to/your/sample.pdf")
prompt = "Summarize this document"
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "text", "text": prompt},
        {"type": "document", "uri": doc_file.uri, "mime_type": doc_file.mime_type}
    ]
)
print(interaction.output_text)
JavaScript
import { GoogleGenAI } from "@google/genai";
const client = new GoogleGenAI({});
const prompt = "Summarize this document";
async function main() {
  const filePath = "path/to/your/sample.pdf";
  // The upload config uses the SDK's camelCase field name
  const myfile = await client.files.upload({
    file: filePath,
    config: { mimeType: "application/pdf" },
  });
  const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: [
      { type: "text", text: prompt },
      { type: "document", uri: myfile.uri, mime_type: myfile.mimeType }
    ]
  });
  console.log(interaction.output_text);
}
await main();
REST
FILE_PATH="path/to/sample.pdf"
MIME_TYPE=$(file -b --mime-type "${FILE_PATH}")
NUM_BYTES=$(wc -c < "${FILE_PATH}")
DISPLAY_NAME=DOCUMENT
tmp_header_file=upload-header.tmp
# Initial resumable request defining metadata.
curl "https://generativelanguage.googleapis.com/upload/v1beta/files" \
  -D "${tmp_header_file}" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null
upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"
# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-bina
…