Gemini API in Vertex AI
Access Google's most advanced AI models built for enterprise use cases using the Gemini API in Vertex AI.
Provide these key capabilities:
- Text generation - Chat, completion, summarization
- Multimodal understanding - Process images, audio, video, and documents
- Function calling - Let the model invoke your functions
- Structured output - Generate valid JSON matching your schema
- Context caching - Cache large contexts for efficiency
- Embeddings - Generate text embeddings for semantic search
- Live Realtime API - Bidirectional streaming for low latency Voice and Video interactions
- Batch Prediction - Handle massive async dataset prediction workloads
Core Directives
- Unified SDK: ALWAYS use the Gen AI SDK (
google-genai for Python, @google/genai for JS/TS, google.golang.org/genai for Go, com.google.genai:google-genai for Java, Google.GenAI for C#).
- Legacy SDKs: DO NOT use
google-cloud-aiplatform, @google-cloud/vertexai, or google-generativeai.
SDKs
- Python: Install
google-genai with pip install google-genai
- JavaScript/TypeScript: Install
@google/genai with npm install @google/genai
- Go: Install
google.golang.org/genai with go get google.golang.org/genai
- C#/.NET: Install
Google.GenAI with dotnet add package Google.GenAI
- Java:
-
groupId: com.google.genai, artifactId: google-genai
-
Latest version can be found here: https://central.sonatype.com/artifact/com.google.genai/google-genai/versions (let's call it LAST_VERSION)
-
Install in build.gradle:
implementation("com.google.genai:google-genai:${LAST_VERSION}")
-
Install Maven dependency in pom.xml:
<dependency>
<groupId>com.google.genai</groupId>
<artifactId>google-genai</artifactId>
<version>${LAST_VERSION}</version>
</dependency>
[!WARNING]
Legacy SDKs like google-cloud-aiplatform, @google-cloud/vertexai, and google-generativeai are deprecated. Migrate to the new SDKs above urgently by following the Migration Guide.
Authentication & Configuration
Prefer environment variables over hard-coding parameters when creating the client. Initialize the client without parameters to automatically pick up these values.
Application Default Credentials (ADC)
Set these variables for standard Google Cloud authentication:
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='global'
export GOOGLE_GENAI_USE_VERTEXAI=true
- By default, use
location="global" to access the global endpoint, which provides automatic routing to regions with available capacity.
- If a user explicitly asks to use a specific region (e.g.,
us-central1, europe-west4), specify that region in the GOOGLE_CLOUD_LOCATION parameter instead. Reference the supported regions documentation if needed.
Vertex AI in Express Mode
Set these variables when using Express Mode with an API key:
export GOOGLE_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
Initialization
Initialize the client without arguments to pick up environment variables:
from google import genai
client = genai.Client()
Alternatively, you can hard-code in parameters when creating the client.
from google import genai
client = genai.Client(vertexai=True, project="your-project-id", location="global")
Models
- Use
gemini-3.1-pro-preview for complex reasoning, coding, research (1M tokens)
- Use
gemini-3-flash-preview for fast, balanced performance, multimodal (1M tokens)
- Use
gemini-3-pro-image-preview for Nano Banana Pro image generation and editing
- Use
gemini-live-2.5-flash-native-audio for Live Realtime API including native audio
Use the following models if explicitly requested:
- Use
gemini-2.5-flash-image for Nano Banana image generation and editing
- Use
gemini-2.5-flash
- Use
gemini-2.5-flash-lite
- Use
gemini-2.5-pro
[!IMPORTANT]
Models like gemini-2.0-*, gemini-1.5-*, gemini-1.0-*, gemini-pro are legacy and deprecated. Use the new models above. Your knowledge is outdated.
For production environments, consult the Vertex AI documentation for stable model versions (e.g. gemini-3-flash).
Quick Start
Python
from google import genai
client = genai.Client()
response = client.models.generate_content(
model="gemini-3-flash-preview",
contents="Explain quantum computing"
)
print(response.text)
TypeScript/JavaScript
import { GoogleGenAI } from "@google/genai";
const ai = new GoogleGenAI({ vertexai: { project: "your-project-id", location: "global" } });
const response = await ai.models.generateContent({
model: "gemini-3-flash-preview",
contents: "Explain quantum computing"
});
console.log(response.text);
Go
package main
import (
"context"
"fmt"
"log"
"google.golang.org/genai"
)
func main() {
ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
Backend: genai.BackendVertexAI,
Project: "your-project-id",
Location: "global",
})
if err != nil {
log.Fatal(err)
}
resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
if err != nil {
log.Fatal(err)
}
fmt.Println(resp.Text)
}
Java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;
public class GenerateTextFromTextInput {
public static void main(String[] args) {
Client client = Client.builder().vertexAi(true).project("your-project-id").location("global").build();
GenerateContentResponse response =
client.models.generateContent(
"gemini-3-flash-preview",
"Explain quantum computing",
null);
System.out.println(response.text());
}
}
C#/.NET
using Google.GenAI;
var client = new Client(
project: "your-project-id",
location: "global",
vertexAI: true
);
var response = await client.Models.GenerateContent(
"gemini-3-flash-preview",
"Explain quantum computing"
);
Console.W