Bayer's production agent system bets on 'harness engineering' over bigger models

Bayer and Thoughtworks have detailed PRINCE, a production agentic RAG system that lets scientists query decades of preclinical drug-safety reports in plain language and draft regulatory documents. Martin Fowler's site published the case study.

The interesting framing is the two disciplines they say carried the reliability. Context engineering shapes and routes information between specialised agents: a Researcher, a Reflection agent that validates whether the retrieved data is sufficient, and a Writer that synthesises the answer. Harness engineering builds the orchestration, recovery and observability around the models. The system pairs Text-to-SQL with retrieval and keeps a human in the loop through transparency and explainability rather than trusting raw model output.

It is a useful corrective to model-centric hype. The team's claim is that moving from keyword search to a trustworthy research assistant was an orchestration and evaluation problem, not a question of a bigger LLM. For regulated industries that is the realistic bar: control, recovery and auditability around the model, not the model alone.