Documentation

How Tokenbase collects AI coding telemetry source-side, normalizes it per tool, and rolls it up by team — without proxying model traffic or uploading source by default.

The collector

A signed, device-scoped binary tails the local session logs your AI coding agents already write — read-only, incremental, checkpointed by offset. It enrolls as a device with its own scoped credential and never depends on a live browser session. It is never in the model request path.
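As a rough illustration of "read-only, incremental, checkpointed by offset", the sketch below reads only the complete lines appended to a log since the last run and stores the byte offset in a small checkpoint file. The function name, the JSON checkpoint format, and the file layout are assumptions made for this example, not the collector's actual implementation.

```python
import json
import os
from pathlib import Path


def read_new_lines(log_path: Path, checkpoint_path: Path) -> list[str]:
    """Return complete lines appended since the last checkpoint (hypothetical helper)."""
    offset = 0
    if checkpoint_path.exists():
        offset = json.loads(checkpoint_path.read_text())["offset"]

    # If the file shrank (rotation or truncation), start again from the top.
    if offset > log_path.stat().st_size:
        offset = 0

    lines = []
    with open(log_path, "rb") as f:  # opened read-only: the log is never modified
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; pick it up next run
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\n"))
            offset += len(raw)

    # Write the checkpoint via a temp file and rename, so a crash
    # cannot leave a half-written checkpoint behind.
    tmp = checkpoint_path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"offset": offset}))
    os.replace(tmp, checkpoint_path)
    return lines
```

Calling the function repeatedly yields each complete line once. A line that is still being written is held back until its trailing newline arrives.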

Metadata-only is the default. A local preview shows exactly what would be sent before anything leaves the machine.
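One way to picture a metadata-only preview is an allowlist projection: every record is reduced to a fixed set of metadata keys, and the resulting payload is rendered locally instead of being sent. The field names and the two functions below are hypothetical and stand in for whatever schema the collector actually uses.

```python
import json

# Hypothetical allowlist; the real field set is not specified in this document.
METADATA_FIELDS = {
    "tool", "model", "session_id", "input_tokens", "output_tokens", "timestamp",
}


def to_metadata(record: dict) -> dict:
    """Keep only allowlisted keys; anything else (prompts, code, paths) is dropped."""
    return {k: v for k, v in record.items() if k in METADATA_FIELDS}


def preview(records: list[dict]) -> str:
    """Render the payload that would be sent, without sending anything."""
    payload = [to_metadata(r) for r in records]
    return json.dumps(payload, indent=2, sort_keys=True)
```

An allowlist fails closed: a field that an agent starts logging tomorrow is excluded until someone adds it deliberately, which a blocklist would not guarantee.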

Adapters

One adapter per tool, each versioned and shipped with golden fixtures. Adapters emit coverage stats and raise warnings on unknown records; one adapter failing never crashes the collector. v1 covers Claude Code and Codex CLI (core), Copilot CLI (core, partial cost coverage), and Cursor (beta — its local storage is reverse-engineered, so coverage is labeled, not assumed).
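The isolation and coverage behavior described above might look roughly like the following sketch. It assumes each adapter is a parse function that returns an event for records it recognizes and None for records it does not. All names here (AdapterResult, run_adapter, run_all) are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

ParseFn = Callable[[dict], Optional[dict]]


@dataclass
class AdapterResult:
    events: list = field(default_factory=list)
    warnings: list = field(default_factory=list)
    seen: int = 0
    parsed: int = 0
    failed: bool = False

    @property
    def coverage(self) -> float:
        """Fraction of records the adapter understood (0.0 if it saw none)."""
        return self.parsed / self.seen if self.seen else 0.0


def run_adapter(name: str, parse: ParseFn, records: list) -> AdapterResult:
    result = AdapterResult()
    for record in records:
        result.seen += 1
        event = parse(record)
        if event is None:
            # Unknown record: warn and keep going rather than guessing.
            result.warnings.append(f"{name}: unknown record type {record.get('type')!r}")
        else:
            result.events.append(event)
            result.parsed += 1
    return result


def run_all(adapters: dict, inputs: dict) -> dict:
    """Run every adapter; an exception in one is recorded, not propagated."""
    results = {}
    for name, parse in adapters.items():
        try:
            results[name] = run_adapter(name, parse, inputs.get(name, []))
        except Exception as exc:
            failed = AdapterResult(failed=True)
            failed.warnings.append(f"{name}: adapter failed: {exc}")
            results[name] = failed
    return results
```

Catching the exception at the per-adapter boundary is what keeps a schema change in one tool's logs from interrupting collection for the others.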

Confidence tiers

Tier A is factual (straight from logs) — tokens, models, session counts, tool calls, estimated cost. Tier B is inferred (heuristics on local data) — abandonment, retry loops, likely workflow type — and is always labeled as inferred, never presented as fact.
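A minimal sketch of how a tier label could travel with every metric, so that an inferred value cannot be rendered without its label. The Metric type and the retry-loop heuristic (the same tool call repeated several times in a row) are invented for illustration and are not Tokenbase's actual heuristics.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    A = "factual"   # read straight from logs
    B = "inferred"  # produced by a heuristic on local data


@dataclass(frozen=True)
class Metric:
    name: str
    value: object
    tier: Tier

    def label(self) -> str:
        """Render the metric; Tier B values always carry an explicit marker."""
        suffix = " (inferred)" if self.tier is Tier.B else ""
        return f"{self.name}: {self.value}{suffix}"


def detect_retry_loop(tool_calls: list[str], threshold: int = 3) -> Metric:
    """Hypothetical heuristic: flag `threshold` or more identical consecutive calls."""
    run = longest = 0
    previous = None
    for call in tool_calls:
        run = run + 1 if call == previous else 1
        longest = max(longest, run)
        previous = call
    return Metric("retry_loop", longest >= threshold, Tier.B)
```

Because the tier is a field on the value rather than a property of the report, any view built on Metric inherits the labeling.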

Deployment modes

SaaS metadata-only (default), SaaS redacted-text (opt-in), VPC / on-prem, and local-only pilot. In zero-egress modes, raw transcripts never leave the device, and analysis runs against a local or VPC model.
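The four modes can be thought of as a policy over which payload kinds may be sent to the hosted service. The sketch below is an assumption-heavy illustration: the enum names, the payload kind strings, and the choice to model VPC / on-prem and local-only as sending nothing to the hosted service are all guesses made for the example, not documented behavior.

```python
from enum import Enum


class Mode(Enum):
    SAAS_METADATA_ONLY = "saas-metadata-only"
    SAAS_REDACTED_TEXT = "saas-redacted-text"
    VPC_ON_PREM = "vpc-on-prem"
    LOCAL_ONLY = "local-only"


DEFAULT_MODE = Mode.SAAS_METADATA_ONLY  # metadata-only is the stated default

# Assumed mapping from mode to payload kinds allowed to reach the hosted service.
_HOSTED_EGRESS = {
    Mode.SAAS_METADATA_ONLY: frozenset({"metadata"}),
    Mode.SAAS_REDACTED_TEXT: frozenset({"metadata", "redacted_text"}),
    Mode.VPC_ON_PREM: frozenset(),  # assumption: data stays inside the customer network
    Mode.LOCAL_ONLY: frozenset(),   # assumption: nothing is uploaded
}


def may_send(mode: Mode, payload_kind: str) -> bool:
    """Return True only if this payload kind is explicitly allowed for the mode."""
    return payload_kind in _HOSTED_EGRESS[mode]
```

Under this model, raw transcripts are not an allowed payload kind in any mode, and redacted text is sent only when that mode has been chosen explicitly.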