Most workspace AI tools have a dirty secret: they flatten your carefully formatted spreadsheets, slide decks, and PDFs into generic text blobs. DocMason takes a different approach. The new open-source tool preserves document structure (slide layouts, multi-sheet references, nested tables, color-coded formatting) and builds a local knowledge base that AI agents can reason over without your data ever leaving your machine.
The tool runs as a repo-native application inside OpenAI's Codex for macOS, which acts as its runtime environment. Drop files into a folder, open it in Codex, and the agent builds a searchable evidence layer in the background. Provenance was a priority from the start: every answer carries its source identity and traces back to the exact file and page. The goal is to keep the agent from hallucinating connections between unrelated documents.
For anyone who's tried asking ChatGPT about a 50-page proposal spread across PowerPoint, Excel, and email threads, the appeal is obvious. Existing document AI strips away visual and structural cues. Presenter notes and chart-text relationships disappear. Color-as-meaning signals (red text meaning "risk") get flattened. DocMason uses LibreOffice for parsing to maintain fidelity, and enforces what its developers call "strict data contracts" to keep evidence deterministic and auditable.
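To make the idea of a deterministic, auditable evidence record concrete, here is a minimal Python sketch. It is not DocMason's schema or code: the `EvidenceRecord` class, its field names, and the `cite` helper are all hypothetical, chosen only to show how a unit of evidence might keep its source, its location, and a formatting cue such as red text, and how hashing a canonical form gives the same ID for the same content every time.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """Hypothetical evidence unit; names are illustrative, not DocMason's schema."""

    source_file: str   # e.g. "proposal.pptx"
    locator: str       # e.g. "slide 12" or "Sheet2!B4"
    text: str
    cues: tuple = ()   # formatting cues as (key, value) pairs, e.g. (("color", "red"),)

    def evidence_id(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so that identical
        # content always serializes, and therefore hashes, identically.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def cite(record: EvidenceRecord) -> str:
    """Format a citation that points back to the file and location."""
    return f"{record.source_file} ({record.locator}) [{record.evidence_id()}]"


if __name__ == "__main__":
    risk = EvidenceRecord(
        source_file="proposal.pptx",
        locator="slide 12",
        text="Vendor delivery may slip",
        cues=(("color", "red"),),
    )
    print(cite(risk))
```

Because the cue is part of the hashed payload, the same sentence in red and in black yields two different IDs, which is one way a "red means risk" signal could survive instead of being flattened.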
DocMason is free, licensed under Apache 2.0, and available on GitHub. It requires no cloud backend, which will matter to anyone handling sensitive corporate materials who can't justify uploading proprietary decks to a third-party service. Ownscribe takes a similar local-first approach to sensitive data. If you already pay for an OpenAI plan that includes Codex, DocMason is essentially a zero-cost add-on.