Gemma Gem brings something genuinely useful to browser extensions: a fully local AI agent that actually does things. Built by developer kessler, this Chrome extension runs Google's Gemma 4 model directly in your browser using WebGPU and WebAssembly. It can read page content, click buttons, fill forms, scroll, and execute JavaScript. No API keys, no cloud calls, no data leaving your machine. This is the same philosophy behind DocMason, which keeps your files local while making them AI-readable. The smaller E2B model needs about 500MB of disk space. The larger E4B takes roughly 1.5GB.

The architecture is clever. An offscreen document hosts the model and agent loop using Hugging Face's Transformers.js. A service worker handles message routing and screenshots. The content script manages the chat UI and DOM interaction. The agent logic itself has zero dependencies and could be extracted as a standalone library. It's built on WXT, a Vite-based extension framework that enforces Manifest V3 standards.

Here's the trade-off. Running an autonomous agent locally means trusting the model with broad permissions, including reading DOM content and executing JavaScript. Unlike cloud-based solutions that can monitor and filter commands centrally, Gemma Gem runs as a black box on your machine. Prompt injection is a real concern. Malicious web content could theoretically manipulate the model into performing unauthorized actions. The security perimeter shifts from server-side guardrails to Chrome's extension sandbox.

Chrome's experimental Prompt API uses Gemini Nano but requires a 4GB download. Gemma Gem packs similar capabilities into a smaller footprint.

Local agents probably won't replace cloud assistants for complex reasoning anytime soon. But this proves you don't have to choose between AI help and keeping your browsing data to yourself.