Antithesis, a company that tests distributed systems by injecting faults and watching what breaks, hit a wall with Google BigQuery. Their testing platform creates branching trees of timelines, and they needed to query relationships within those trees. Parent pointers and recursive queries are the natural approach, but BigQuery is built for parallel scans, not point lookups. Walking up a tree one node at a time means one full table scan per step, or O(depth) scans in total. That's not a query. That's a billing nightmare.
CEO Will Wilson's solution was to invent what he calls a 'skiptree,' a generalization of skiplists to tree structures. Skiplists add express lanes to linked lists so that search takes expected O(log n) time. Skiptrees do the same for trees. The idea is to build multiple levels of the tree, where each level keeps roughly half the nodes of the level below. Each level gets its own SQL table, with columns that track each node's ancestors at the next level up. Ancestor queries then become a fixed number of JOINs across those tables instead of a recursive walk. Wilson says they needed about 40 JOINs, but that beats the alternative of scanning your entire dataset dozens of times.
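To make the level idea concrete, here is a minimal in-memory sketch in Python. It is not Antithesis's schema, which the description above does not spell out. The assumptions are mine: each node is promoted by hash-derived coin flips, each level's "table" is a dict, and every row stores the nodes skipped on the way to the nearest ancestor in the next level up. The names height, build_levels, ancestors, and MAX_LEVELS are all hypothetical. One dict lookup per level stands in for one JOIN.

```python
import hashlib

MAX_LEVELS = 8  # hypothetical cap for this sketch; the article reports about 40 JOINs


def height(node_id, max_levels=MAX_LEVELS):
    """Deterministic coin flips: each extra level is kept with probability ~1/2."""
    bits = int.from_bytes(hashlib.sha256(node_id.encode()).digest()[:8], "big")
    h = 1
    while h < max_levels and bits & 1:
        bits >>= 1
        h += 1
    return h


def build_levels(parent, max_levels=MAX_LEVELS):
    """parent maps node -> parent (None for the root).

    Returns one dict per level. A row maps a node to (between, up), where
    `up` is its nearest ancestor present in the next level (or None) and
    `between` lists every ancestor skipped on the way there.
    This row layout is an assumption, not the documented design.
    """
    heights = {n: height(n, max_levels) for n in parent}
    levels = []
    for k in range(max_levels):
        table = {}
        for node in parent:
            if heights[node] <= k:
                continue  # node was not promoted to level k
            between = []
            cur = parent[node]
            # Walk up until we reach a node that exists in level k + 1.
            while cur is not None and heights[cur] <= k + 1:
                between.append(cur)
                cur = parent[cur]
            table[node] = (between, cur)
        levels.append(table)
    return levels


def ancestors(levels, node):
    """All ancestors of node, nearest first, using one lookup per level."""
    result = []
    cur = node
    for table in levels:  # fixed number of steps, like a fixed chain of JOINs
        if cur is None:
            break
        between, up = table[cur]
        result.extend(between)
        if up is not None:
            result.append(up)
        cur = up
    return result


def naive_ancestors(parent, node):
    """Baseline: follow parent pointers one step at a time (O(depth) lookups)."""
    result = []
    cur = parent[node]
    while cur is not None:
        result.append(cur)
        cur = parent[cur]
    return result


# A 500-node chain: the naive walk from the tip needs 499 lookups,
# while the skiptree walk needs at most MAX_LEVELS.
chain = {"c0": None}
for i in range(1, 500):
    chain[f"c{i}"] = f"c{i - 1}"
chain_levels = build_levels(chain)
```

In this sketch the trade is extra storage for fewer steps: rows at higher levels carry longer lists of skipped nodes, but a query never touches more than MAX_LEVELS tables regardless of tree depth. In SQL, each dict would be a table and each loop iteration a JOIN on the `up` column.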
They ran this setup for six years. Six years of production queries against a data structure most developers have never heard of. Eventually BigQuery's limitations pushed them to build Pangolin, their own analytic database purpose-built for hierarchical data. Pangolin uses columnar storage optimized for ingesting event logs and pulling specific system states across execution branches. The skiptree hack bought them years of runway. Then they built something that speaks tree natively.