Nov. 2024 - Dec. 2024 · 4 min read

What I Learned Building a Browser Extension Around Chrome's Built-in AI

Chrome's Prompt, Summarizer, and Translator APIs make on-device AI a reality. InstaQ stitches them together so any webpage becomes a Q&A surface, with sessions you can come back to.

Tags: Chrome, Prompt API, On-Device AI, Extension


On-device AI in the browser changes the assumption that every AI feature needs a server. InstaQ is a browser extension built on Chrome's built-in Prompt, Summarizer, and Translator APIs that lets users ask questions, summarize, and translate any page, locally.

Figure: InstaQ Chrome extension panel open beside a webpage, showing the Prompt, Summarize, and Translate tabs. InstaQ runs entirely in the browser using Chrome's built-in Prompt, Summarizer, and Translator APIs.

Why I built it

When I talk to an AI on a hosted platform, closing the tab usually means losing the conversation context, and most services don't offer a free way to save sessions for re-use. I wanted to centralize questions, summaries, and translations in one place without ever leaving the browser, and without sending page text off the device.

Sessions, not single shots

The thing that turns this from a toy into a tool is session storage. Conversations resume where you left them, with the original page context preserved. You can come back to a research thread tomorrow on a different tab. Sessions can be cloned, renamed, and removed, and each one remembers its own context (stored in the background script).
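To make the clone, rename, and remove operations concrete, here is a minimal sketch of a session store. The Session fields and the SessionStore class are illustrative names assumed for this example, not InstaQ's actual schema, and the store here lives in memory only.

```typescript
// Hypothetical session shape; field names are illustrative only.
interface Session {
  id: string;
  title: string;
  pageContext: string; // page text captured when the session began
  messages: { role: "user" | "assistant"; text: string }[];
}

class SessionStore {
  private sessions = new Map<string, Session>();
  private nextId = 1;

  create(title: string, pageContext: string): Session {
    const session: Session = {
      id: `s${this.nextId++}`,
      title,
      pageContext,
      messages: [],
    };
    this.sessions.set(session.id, session);
    return session;
  }

  get(id: string): Session | undefined {
    return this.sessions.get(id);
  }

  rename(id: string, title: string): boolean {
    const session = this.sessions.get(id);
    if (!session) return false;
    session.title = title;
    return true;
  }

  // Cloning copies the context and history so the two threads can diverge.
  clone(id: string): Session | undefined {
    const original = this.sessions.get(id);
    if (!original) return undefined;
    const copy = this.create(`${original.title} (copy)`, original.pageContext);
    copy.messages = original.messages.map((m) => ({ ...m }));
    return copy;
  }

  remove(id: string): boolean {
    return this.sessions.delete(id);
  }
}
```

Because each session carries its own pageContext, a clone can keep asking about the original page even after the tab that produced it is gone.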

What it does

Three modes share the same panel: ask, summarize, translate. Each mode is built around what makes that task actually useful in the middle of reading a page, not just a wrapper around a model call.

Ask

Question responses save into sessions so previous conversations stay reachable.
Top-K and Temperature on the Prompt API are exposed so I can tune responses per session.
A button copies all the text on the current page into the extension for review.
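Exposing Top-K and Temperature means user input has to be kept inside whatever range the model accepts. The sketch below shows one way to do that. The SamplingLimits object mirrors the kind of default and maximum values the Prompt API reports, but the exact call for fetching them has changed across Chrome versions, so it is passed in as plain data here. The resolveSampling name and the lower bounds of 1 and 0 are assumptions for this example.

```typescript
// Limits as reported by the model; supplied by the caller in this sketch.
interface SamplingLimits {
  defaultTopK: number;
  maxTopK: number;
  defaultTemperature: number;
  maxTemperature: number;
}

interface SamplingOptions {
  topK: number;
  temperature: number;
}

// Fall back to the model defaults, then clamp into the allowed range.
function resolveSampling(
  limits: SamplingLimits,
  requested: Partial<SamplingOptions> = {},
): SamplingOptions {
  const topK = Math.round(requested.topK ?? limits.defaultTopK);
  const temperature = requested.temperature ?? limits.defaultTemperature;
  return {
    topK: Math.min(Math.max(topK, 1), limits.maxTopK),
    temperature: Math.min(Math.max(temperature, 0), limits.maxTemperature),
  };
}
```

The function always returns both values together, on the understanding that the Prompt API expects Top-K and temperature to be supplied as a pair when a session is created.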

Summarize

Highlight text on a webpage and summarize it directly from the extension. Type, format, length, and shared context are configurable. Free text input also works for things you paste in manually.
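The configurable fields map onto a small options object. This sketch uses the option values listed in the currently published Summarizer API documentation; earlier Chrome builds spelled some of them differently. PANEL_DEFAULTS and buildSummarizerOptions are hypothetical names, and the defaults are the panel's choice in this example, not a claim about the API's own defaults.

```typescript
type SummaryType = "key-points" | "tldr" | "teaser" | "headline";
type SummaryFormat = "markdown" | "plain-text";
type SummaryLength = "short" | "medium" | "long";

interface SummarizerOptions {
  type: SummaryType;
  format: SummaryFormat;
  length: SummaryLength;
  sharedContext?: string;
}

// Hypothetical panel defaults for this sketch.
const PANEL_DEFAULTS: SummarizerOptions = {
  type: "key-points",
  format: "markdown",
  length: "medium",
};

function buildSummarizerOptions(
  overrides: Partial<SummarizerOptions> = {},
): SummarizerOptions {
  const options: SummarizerOptions = {
    type: overrides.type ?? PANEL_DEFAULTS.type,
    format: overrides.format ?? PANEL_DEFAULTS.format,
    length: overrides.length ?? PANEL_DEFAULTS.length,
  };
  // Leave sharedContext out entirely when the field is blank.
  const context = overrides.sharedContext?.trim();
  if (context) options.sharedContext = context;
  return options;
}
```

The resulting object is what would be handed to the Summarizer when it is created, before the highlighted or pasted text is passed in to be summarized.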

Translate

Translation works on highlighted or free text. Supported languages: English, Spanish, Mandarin Chinese, Japanese, Portuguese, Russian, Turkish, Hindi, Vietnamese, Bengali. A swap button toggles source and target language and re-translates automatically.
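The swap button can be modelled as a pure state update. The TranslateState shape is hypothetical, and moving the previous translation into the input box on swap is an assumption made for this example, so that re-translating goes back the other way.

```typescript
// BCP 47 codes for the ten supported languages.
type Lang = "en" | "es" | "zh" | "ja" | "pt" | "ru" | "tr" | "hi" | "vi" | "bn";

interface TranslateState {
  source: Lang;
  target: Lang;
  input: string;
  output: string;
}

// Swap the language pair; the caller then re-runs the translation.
function swapLanguages(state: TranslateState): TranslateState {
  return {
    source: state.target,
    target: state.source,
    // Reuse the last translation as the new input when there is one.
    input: state.output || state.input,
    output: "",
  };
}
```

Keeping the swap free of API calls means the re-translation can be triggered by the same code path that handles a normal submit.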

The whole panel is persistent. When it loses focus, every input, response, and selected parameter is still there when I reopen it.

How I built it

React, Tailwind, TypeScript and JavaScript, with a manifest file to package it as a browser extension. The three Chrome APIs (Prompt, Summarizer, Translator) are the core. All API calls run in a background script so the extension keeps working even when its popup is closed, and so session state survives between opens.

What was hard

Making the extension fully persistent was the biggest challenge. Variables had to live in the background script so re-opening restored every input, every response, every session, especially across the Prompt API's session model. Chrome's storage API ended up doing most of that heavy lifting.
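A minimal sketch of the save-and-restore pattern follows. The PanelState fields and the panelState key are hypothetical. KeyValueArea is a narrow interface shaped after the promise-based get and set methods on chrome.storage.local, so that object should be usable in its place inside an extension.

```typescript
// Hypothetical panel state; the real extension tracks more than this.
interface PanelState {
  activeTab: "ask" | "summarize" | "translate";
  input: string;
  response: string;
}

const STATE_KEY = "panelState"; // hypothetical storage key
const EMPTY_STATE: PanelState = { activeTab: "ask", input: "", response: "" };

// Minimal async key-value surface, modelled on chrome.storage.local.
interface KeyValueArea {
  get(key: string): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// Fill in anything missing so a first run or an older saved shape still loads.
function restoreState(items: Record<string, unknown>): PanelState {
  const stored = items[STATE_KEY];
  if (typeof stored !== "object" || stored === null) {
    return { ...EMPTY_STATE };
  }
  return { ...EMPTY_STATE, ...(stored as Partial<PanelState>) };
}

async function savePanelState(area: KeyValueArea, state: PanelState): Promise<void> {
  await area.set({ [STATE_KEY]: state });
}

async function loadPanelState(area: KeyValueArea): Promise<PanelState> {
  return restoreState(await area.get(STATE_KEY));
}
```

Saving after every change and loading on open is what lets the panel come back exactly as it was, even if the background script was shut down in between.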

The Translator API doesn't work in service workers, which is exactly what a Manifest V3 background script is. The fix was to call it from an offscreen document (offscreen.js) and have the background script message that page whenever the frontend triggered a translation. Not obvious from the docs, but it works cleanly once wired up.
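The routing between the background script and the offscreen page can be sketched as follows. The browser calls are injected through OffscreenDeps so the logic runs outside an extension; in a real build they would wrap chrome.offscreen.createDocument and chrome.runtime.sendMessage. The message fields, the OffscreenDeps interface, and the function names are all hypothetical, and this is not a copy of InstaQ's code.

```typescript
// Message sent from the background service worker to the offscreen page.
interface TranslateRequest {
  target: "offscreen";
  type: "translate";
  text: string;
  sourceLanguage: string;
  targetLanguage: string;
}

// Guard for the offscreen side, so it ignores unrelated runtime messages.
function isTranslateRequest(msg: unknown): msg is TranslateRequest {
  if (typeof msg !== "object" || msg === null) return false;
  const m = msg as Record<string, unknown>;
  return (
    m.target === "offscreen" &&
    m.type === "translate" &&
    typeof m.text === "string" &&
    typeof m.sourceLanguage === "string" &&
    typeof m.targetLanguage === "string"
  );
}

// Injected browser calls (hypothetical wrapper interface).
interface OffscreenDeps {
  hasDocument(): Promise<boolean>;
  createDocument(): Promise<void>;
  sendMessage(msg: TranslateRequest): Promise<string>;
}

async function translateViaOffscreen(
  deps: OffscreenDeps,
  text: string,
  sourceLanguage: string,
  targetLanguage: string,
): Promise<string> {
  // Create the offscreen page only if one is not already open.
  if (!(await deps.hasDocument())) {
    await deps.createDocument();
  }
  return deps.sendMessage({
    target: "offscreen",
    type: "translate",
    text,
    sourceLanguage,
    targetLanguage,
  });
}
```

On the offscreen side, a message listener would run isTranslateRequest on each incoming message, call the Translator API for the ones that match, and send the translated text back as the response.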

What I learned

Offscreen scripts: first time encountering an API that doesn't work with service workers.
Chrome's storage API for persistent background-script state across opens and closes.
First hands-on with Chrome's Prompt, Summarizer, and Translator APIs, and with Chrome Canary as the dev target.

What's next

Video summarization and translation by URL are the obvious next step, but privacy restrictions on YouTube and similar sites made it risky to ship inside the hackathon window. After that: UX polish, including friendlier names for Top-K and Temperature so casual users aren't staring at raw sampling parameters, and a cleaner layout for the three modes.
