Exploring Built-in AI Capabilities in Chromium-Based Browsers

In this article we will explore Chromium-based browsers’ new built-in AI capabilities. These capabilities let you run AI models directly in the browser without needing any API keys or external services.

Why would you want this?

  • Your data never leaves the device
  • No network latency
  • No API costs
  • Works offline once the model is downloaded

Want to try these APIs yourself? See how to set up Chrome’s built-in AI at the bottom of this article.


Prompt API

The Prompt API lets you generate text using a local language model. It’s like having a small ChatGPT running in your browser.

How to use it

// Create a session
const session = await LanguageModel.create({
  monitor: (monitor) => {
    monitor.addEventListener("downloadprogress", (e) => {
      console.log(`Download: ${(e.loaded * 100).toFixed(1)}%`);
    });
  },
});

// Stream a response
const stream = session.promptStreaming("Write a haiku about coding");
const reader = stream.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(value);
}

// Or get the full response at once
const result = await session.prompt("Tell me a joke");
console.log(result);
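
Sessions can also be tuned when they are created. Here is a minimal sketch, assuming the params() and create() options described in the Prompt API explainer (temperature and topK set together, plus a system prompt via initialPrompts):

// A sketch assuming the params()/create() options from the Prompt API explainer
const { defaultTemperature, defaultTopK, maxTemperature } = await LanguageModel.params();
const tuned = await LanguageModel.create({
  // temperature and topK are expected to be provided together
  temperature: Math.min(defaultTemperature * 1.5, maxTemperature),
  topK: defaultTopK,
  initialPrompts: [
    { role: "system", content: "You are a terse assistant. Answer in one sentence." },
  ],
});
console.log(await tuned.prompt("Why run models locally?"));
tuned.destroy(); // release the session's resources when done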

Summarizer API

This one condenses long text into summaries. Pretty useful for articles or documents.

How to use it

const session = await Summarizer.create();
const text = `Your long text here...`;
const summary = await session.summarize(text);
console.log(summary);
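
create() also accepts options that shape the output. A sketch, assuming the type, format, and length options from the Writing Assistance APIs draft (articleText is a placeholder):

// Sketch: create() options from the Writing Assistance APIs draft
const keyPoints = await Summarizer.create({
  type: "key-points", // also "tl;dr", "teaser", "headline"
  format: "markdown", // or "plain-text"
  length: "short",    // or "medium", "long"
});
const articleText = `Your long text here...`; // placeholder
const bullets = await keyPoints.summarize(articleText, {
  context: "This is a technical blog post.", // optional per-call hint
});
console.log(bullets);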

Rewriter API

The Rewriter API transforms text to match a different tone, making it more formal or more casual.

How to use it

// Tone is set when the rewriter is created: "as-is" (default), "more-formal", or "more-casual"
const formal = await Rewriter.create({ tone: "more-formal" });
const casual = await Rewriter.create({ tone: "more-casual" });

const text = "I need you to review the document ASAP.";

// Make it more formal
console.log(await formal.rewrite(text));

// Or more casual
console.log(await casual.rewrite(text));
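
For long inputs you can also stream the rewrite as it is generated. A sketch, assuming the rewriteStreaming() method from the same draft spec:

// Sketch: streaming variant, assuming rewriteStreaming() from the draft spec
const rewriter = await Rewriter.create({ tone: "more-formal" });
const stream = rewriter.rewriteStreaming("I need you to review the document ASAP.");
for await (const chunk of stream) {
  console.log(chunk);
}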

Language Detector API

Detects the language of any text and gives you candidates ranked by confidence.

How to use it

const detector = await LanguageDetector.create();
// detect() returns candidates ranked by confidence; the first entry is the best guess
const results = await detector.detect("Hola, ¿cómo estás?");
console.log(results[0].detectedLanguage); // "es"
console.log(results[0].confidence); // e.g. 0.95
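
Since detect() returns every candidate, you can look beyond the top guess. A sketch, with an arbitrary confidence cutoff:

// Sketch: list every candidate above an arbitrary cutoff
const detector = await LanguageDetector.create();
const candidates = await detector.detect("Hola, ¿cómo estás?");
for (const { detectedLanguage, confidence } of candidates) {
  if (confidence < 0.2) continue; // 0.2 is an arbitrary threshold
  console.log(`${detectedLanguage}: ${(confidence * 100).toFixed(0)}%`);
}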

Translator API

Translates text between languages, all running locally.

How to use it

const session = await Translator.create({
  sourceLanguage: "en",
  targetLanguage: "es",
});
const translation = await session.translate("Hello, how are you?");
console.log(translation); // "Hola, ¿cómo estás?"
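
The two APIs compose nicely: detect the source language first, then translate. A sketch (inputText is a placeholder):

// Sketch: auto-detect the source language, then translate to English
const inputText = "Hola, ¿cómo estás?"; // placeholder
const detector = await LanguageDetector.create();
const [best] = await detector.detect(inputText);
const translator = await Translator.create({
  sourceLanguage: best.detectedLanguage,
  targetLanguage: "en",
});
console.log(await translator.translate(inputText));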

Using the Vercel AI SDK

If you want a more unified interface that works across different providers, you can use the Vercel AI SDK with the @built-in-ai community providers.

This gives you three options:

  1. @built-in-ai/core - Chrome/Edge’s native built-in AI
  2. @built-in-ai/transformers-js - Hugging Face models via Transformers.js
  3. @built-in-ai/web-llm - Open-source models via WebLLM

Installation

# For Chrome/Edge built-in AI
pnpm add ai @built-in-ai/core
# For Transformers.js (Hugging Face models)
pnpm add ai @built-in-ai/transformers-js
# For WebLLM (Llama, Qwen, Gemma, etc.)
pnpm add ai @built-in-ai/web-llm

@built-in-ai/core

This wraps Chrome’s Prompt API. It uses Gemini Nano on Chrome and Phi-4-mini on Edge.

import { streamText } from "ai";
import { builtInAI } from "@built-in-ai/core";

const result = streamText({
  model: builtInAI(),
  prompt: "Write a haiku about coding",
});
for await (const chunk of result.textStream) {
  console.log(chunk);
}
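
If you don’t need streaming, the AI SDK’s generateText works with the same model. A sketch, assuming the provider supports non-streaming calls:

// Sketch: non-streaming variant via the AI SDK's generateText
import { generateText } from "ai";
import { builtInAI } from "@built-in-ai/core";

const { text } = await generateText({
  model: builtInAI(),
  prompt: "Tell me a joke",
});
console.log(text);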

@built-in-ai/transformers-js

Runs Hugging Face models in the browser using WebAssembly. Works on all modern browsers.

Models are downloaded on first use and cached locally.

import { streamText } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

const result = streamText({
  model: transformersJS("HuggingFaceTB/SmolLM2-360M-Instruct"),
  prompt: "Explain machine learning in one sentence.",
});
for await (const chunk of result.textStream) {
  console.log(chunk);
}
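
Multi-turn chat works through the AI SDK’s standard messages option, so the same code shape carries over. A sketch:

// Sketch: multi-turn chat via the AI SDK's messages option
import { streamText } from "ai";
import { transformersJS } from "@built-in-ai/transformers-js";

const result = streamText({
  model: transformersJS("HuggingFaceTB/SmolLM2-360M-Instruct"),
  messages: [
    { role: "system", content: "Answer in one sentence." },
    { role: "user", content: "What is WebAssembly?" },
  ],
});
for await (const chunk of result.textStream) {
  console.log(chunk);
}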

@built-in-ai/web-llm

WebLLM uses WebGPU for hardware-accelerated inference and supports larger models like Llama 3 and Phi-3.5. Models are cached after the first download.

import { streamText } from "ai";
import { webLLM } from "@built-in-ai/web-llm";

const result = streamText({
  model: webLLM("Qwen3-0.6B-q4f16_1-MLC"),
  prompt: "What are the benefits of WebAssembly?",
});
for await (const chunk of result.textStream) {
  console.log(chunk);
}
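
Because WebLLM depends on WebGPU, it’s worth feature-detecting before you load a model. A minimal sketch:

// Sketch: feature-detect WebGPU before loading a model
if (!("gpu" in navigator)) {
  console.warn("WebGPU is not available; WebLLM will not run in this browser.");
}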

Which one should you use?

| Provider | Best For | Browser Support | Model Size |
| --- | --- | --- | --- |
| @built-in-ai/core | Fastest, Chrome/Edge users | Chrome 138+, Edge Dev | 2-4 GB (built-in) |
| @built-in-ai/transformers-js | Wide compatibility | All modern browsers | 270 MB - 1 GB |
| @built-in-ai/web-llm | Better quality, larger models | Chrome/Edge with WebGPU | 500 MB - 3 GB+ |

Setting Up Built-in AI

If you want to try these APIs yourself, here’s how to set up Chrome.

Requirements

You’ll need:

  • Chrome 138+ (desktop only)
  • At least 22 GB of free storage
  • Either a GPU with 4GB+ VRAM, or 16GB+ RAM with 4+ CPU cores
  • Windows 10/11, macOS 13+, or Linux

Enabling the APIs

  1. Update Chrome to version 138 or later

  2. Go to chrome://flags and enable these flags:

    • chrome://flags/#optimization-guide-on-device-model
    • chrome://flags/#prompt-api-for-gemini-nano-multimodal-input

  3. Restart Chrome

  4. Go to chrome://components and click “Check for update” on “Optimization Guide On Device Model”

  5. Wait for the model to download (it’s about 2-4 GB)

Verifying the Setup

Open your browser console and run:

const available = await LanguageModel.availability();
console.log("Prompt API:", available);

If it returns "available", you’re good to go. Other possible values are "downloadable", "downloading", or "unavailable".
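
The other APIs expose a similar static availability() check. A sketch, assuming Translator takes a language pair:

// Sketch: the other APIs expose a similar check (Translator needs a language pair)
console.log("Summarizer:", await Summarizer.availability());
console.log("Language Detector:", await LanguageDetector.availability());
console.log("Translator:", await Translator.availability({
  sourceLanguage: "en",
  targetLanguage: "es",
}));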

For TypeScript projects, you can install the type definitions:

pnpm add -D @types/dom-chromium-ai

Things to Keep in Mind

  • The initial model download is 2-4 GB
  • The APIs are still experimental and currently limited to Chrome and Edge
  • You need reasonably capable hardware for them to run smoothly

Conclusion

Built-in AI is still experimental, but it’s pretty cool to run AI models directly in the browser without any external dependencies. The privacy benefits alone make it worth exploring.

Give the examples above a try in your own browser and see how they perform on your machine.

I hope you found this article helpful. If you have any questions or feedback, feel free to reach out to me.