A new Python library called Lythonic takes a different approach to building data pipelines. Instead of treating jobs as opaque units, as traditional task schedulers do, it tracks the data itself. You write plain Python functions and wire them together with the `>>` operator. Each node receives typed data from upstream and passes results downstream, so you can see what went in, what came out, and what failed.
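The wiring pattern is easy to picture with a minimal sketch. This is not Lythonic's actual API; the `Node` class, its `run_from` method, and the step functions are all hypothetical, just illustrating how a `>>` overload can chain plain functions into a data-flow graph:

```python
from __future__ import annotations
from typing import Any, Callable

class Node:
    """Hypothetical wrapper: holds a plain function and its downstream link."""
    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn
        self.next: Node | None = None

    def __rshift__(self, other: "Node") -> "Node":
        # a >> b links a's output to b's input; returning b allows chaining
        self.next = other
        return other

    def run_from(self, value: Any) -> Any:
        # Walk the chain, passing each node's result to the next,
        # so every edge's input and output is observable.
        node: Node | None = self
        while node is not None:
            value = node.fn(value)
            node = node.next
        return value

extract = Node(lambda _: [1, 2, 3])
transform = Node(lambda xs: [x * 10 for x in xs])
load = Node(sum)

extract >> transform >> load
result = extract.run_from(None)   # 60
```

Because each edge carries concrete values rather than an opaque "task done" signal, logging or provenance hooks can be attached at the point where `value` changes hands.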

Sync and async code mix freely in the same DAG; sync functions run in a thread executor automatically. DAGs nest inside other DAGs too: `dag.node(sub_dag)` runs a sub-DAG as a single step, while `dag.map(sub_dag)` runs it on each element of a collection concurrently. The `@dag_factory` decorator turns a plain function into a reusable DAG template.
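The sync/async mixing and the per-element mapping can both be sketched with the standard library. The helpers below (`run_step`, `run_pipeline`, `map_pipeline`) are illustrative names, not Lythonic's API; the point is the mechanism a scheduler can use, with sync callables pushed to a thread executor and async ones awaited directly:

```python
import asyncio
import time

def clean(x: int) -> int:          # sync step: will run in a thread
    time.sleep(0.01)
    return x + 1

async def score(x: int) -> int:    # async step: awaited on the event loop
    await asyncio.sleep(0.01)
    return x * 2

async def run_step(fn, value):
    if asyncio.iscoroutinefunction(fn):
        return await fn(value)
    # Sync callables go to the default thread executor so they
    # never block the event loop.
    return await asyncio.to_thread(fn, value)

async def run_pipeline(steps, value):
    # A linear sub-pipeline: each step feeds the next.
    for step in steps:
        value = await run_step(step, value)
    return value

async def map_pipeline(steps, items):
    # Analogous to running a sub-DAG on each element concurrently.
    return await asyncio.gather(*(run_pipeline(steps, i) for i in items))

results = asyncio.run(map_pipeline([clean, score], [1, 2, 3]))
print(results)   # [4, 6, 8]
```

Running the two-step sub-pipeline over three elements takes roughly one element's latency rather than three, since `asyncio.gather` overlaps the per-element runs.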

For production use, Lythonic includes SQLite-backed provenance tracking that records actual data flow through each edge, not just task completion. There's per-callable caching with probabilistic TTL refresh, and a `lyth` CLI for starting, stopping, and monitoring long-running pipelines with cron triggers.
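Probabilistic TTL refresh is worth unpacking. One common scheme is probabilistic early expiration (sometimes called "XFetch"): an entry may be recomputed shortly before its TTL, with the probability rising as expiry approaches, so many workers don't stampede the moment it expires. Lythonic's exact policy isn't documented here; the `should_refresh` function below is a hypothetical sketch of that general technique:

```python
import math
import random

def should_refresh(age: float, ttl: float, compute_cost: float,
                   beta: float = 1.0) -> bool:
    """Decide whether a cached entry should be recomputed now.

    age          -- seconds since the entry was computed
    ttl          -- nominal time-to-live in seconds
    compute_cost -- how long recomputation takes (expensive entries
                    refresh earlier to hide their latency)
    beta         -- aggressiveness; higher means earlier refreshes
    """
    if age >= ttl:
        return True   # hard expiry: always refresh
    # -log(U) for U ~ Uniform(0, 1) is an Exp(1) sample, so the refresh
    # decision is randomized but biased toward entries near expiry.
    return age + compute_cost * beta * -math.log(random.random()) >= ttl
```

A fresh, cheap entry essentially never refreshes early, an expired one always does, and entries in between refresh with a probability that spreads recomputation out over time instead of concentrating it at the TTL boundary.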

The project sits at version 0.0.14, so it's early days. It requires Python 3.11+ and ships under the MIT license. The developer, who goes by Walnut Geek, hosts documentation on GitHub Pages. The data-flow-first approach sets it apart from heavier orchestrators, but that 0.0.x version number is a real consideration if you're thinking about production autonomous AI agents.