Research Interests

Most of my work has been research-by-building — questions I couldn't answer by reading, so I shipped a prototype and felt it out. Some were too early. The ones that landed informed the next.

The pattern that emerged: composition with provenance, sensed not scripted, trust as a primitive. Three threads I keep coming back to. They show up across a decade of work and converge in what I'm building now.


What I'm tracking

Composition with provenance

How do you let many people — and now agents — edit shared state non-destructively, with deterministic conflict resolution and a complete history you can audit, branch, and replay? Histo was the first version. Daslab's scene graph is the current version. The primitive — content-addressed state with deterministic merge — is what makes git's collaboration work, what makes USD's scene composition work, and what will make agent collaboration work.
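The primitive fits in a few lines. This is an illustrative Python sketch of content-addressed state with a deterministic key-level three-way merge, not Histo's or Daslab's actual code; the conflict tie-break rule (lower content hash wins) is an assumption made for the example:

```python
import hashlib
import json

def address(node: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so identical content
    # always hashes to the same id, regardless of key order.
    payload = json.dumps(node, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode()).hexdigest()

def merge(base: dict, ours: dict, theirs: dict) -> dict:
    # Key-level three-way merge: an unchanged side yields to the changed one;
    # a genuine conflict resolves by a fixed, order-independent rule, so every
    # replica converges on the same result without coordination.
    merged = {}
    for key in sorted(base.keys() | ours.keys() | theirs.keys()):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o
        elif o == b:
            value = t
        elif t == b:
            value = o
        else:
            value = min((o, t), key=lambda v: address({"v": v}))
        if value is not None:
            merged[key] = value
    return merged
```

Because the tie-break depends only on content, every replica that sees the same three versions converges on the same merged state, with no coordination step.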

In conversation with: Pixar / OpenUSD (composition arcs, layers as opinions); Alan Kay / VPRI's Worlds (content-addressed branchable state); the CRDT and local-first lineage (Automerge, Yjs, Martin Kleppmann's group).

Sensed, not scripted

Most software automation is hand-authored. The bet behind autocompile is that the observed behavior of users and agents is itself the spec — and the right systems compile that observation into executable programs that get cheaper and more reliable over time. Separate what's invariant (compile to deterministic code) from what varies (parameterize). Within the varying parts, separate predictable variation (small specialized model) from genuine novelty (full LLM). Intelligence is relocated, not eliminated.

In conversation with: Wil van der Aalst's process mining (textbook: Process Mining: Data Science in Action); Carl Adam Petri's nets and the workflow-net subclass; program synthesis & sketch-language work (MIT).

Trust as a primitive

The cleanest supervision signal in any human-AI system is the human's actual decision: did you commit this? edit it? reject it? Most ML systems have to manufacture labels; a well-designed trust layer produces them as a byproduct of normal use. That's what closes the loop between the runtime, the compiler, and the composition model.

In conversation with: Carl Hewitt's actor model (supervised concurrency); RLHF and preference-learning literature; HCI work on staged effects and undo.

Fields I watch closely

  • Process mining and workflow languages
  • CRDTs and local-first software
  • OpenUSD and spatial scene composition
  • Neuro-symbolic AI (constraint solving + small specialist models filling the holes)
  • Agent UX and trust layers in coding tools
  • On-device and edge compute (LLMs that run on phones, ESP32-class hardware)

What it led to

A timeline of the work the threads above produced.

~2003 — Hardcoded chatbots and pranks (age 15)

Got deep into Visual Basic and AppleScript as a teenager. Built hardcoded chatbots that pretended to be agents — including a fake "we're wiping your hard drive right now" routine that held a live conversation with whichever friend had walked away from their machine. Hand-rolled parsing, regex everywhere, decision trees by hand — naïve NLP by any current standard, but the obsession with agents that hold a conversation started here.

~2004 — Company OS simulation (age 16)

Built my own enterprise simulation game — a car-company OS where you ran factories, managed budgets, modeled supply chains. Heavily inspired by Wall Street Raider, Capitalism 2, and Railroad Tycoon, which I was deep into at the time. The instinct for tracked state + time stepping + policy choice carries straight through to how I think about agentic worlds now.

2007 — DKFZ wet lab internship

First proper research environment. Interned at the German Cancer Research Center (DKFZ) in Heidelberg — grew cell cultures, ran experiments, learned the equipment (PCR cyclers, gel electrophoresis, the rest of the molecular biology stack), worked alongside the PhD students.

On the side I wrote a small inventory tracker app for the lab — reagents, samples, freezer slots, the usual entropy of a working bench.

~2008 — Gene expression visualizer for BioQuant

A second Heidelberg lab project: an interactive expression-mining tool for BioQuant — gene expression data overlaid on a taxonomic ontology (bacteria, if I remember right), with drill-down navigation through the tree to see what was differentially expressed at each level. Pre-d3, pre-Observable era; the visualization was hand-rolled.

2008 — Bookatruck.co.za (Cape Town)

Online truck-booking platform built during an internship at Reds Road Express in Cape Town. Shippers list loads, available trucks accept. Same instinct as the lab tracker in Heidelberg — turn a paper-and-phone process into a working web app — applied to industrial logistics.

Original screenshots are in old archives.

2009 — Domain modeling, RDF/OWL, enterprise architecture

A year deep in the semantic web. Domain modeling, ontologies, OWL reasoning, enterprise architecture frameworks. Built a semantic modeling extension for OpenOffice Draw that turned freeform diagrams into typed RDF graphs.

Defigner started in this period as a JavaScript modeling playground; the current Defigner reuses the name for the same modify-in-isolation-then-merge instinct, applied now to execution contexts.

2010 — SmokeSignal (Facebook app)

A peer-to-peer marketplace built as a Facebook app, on top of their newly opened Platform APIs and social graph. Used the social graph itself as the trust layer for transactions. Facebook shipped Marketplace natively in 2016. Too early, too small.

Repo and screenshots are in old archives — adding details once I dig them up.

2010 — Early Node and Erlang tooling

aws-lib — an extensible Node.js client for the AWS API. Started in September 2010 on Node 0.2 — before Node 0.4 stabilized, before npm hit 1.0, and before the official aws-sdk for Node existed.

Concurrent Erlang / CouchDB work: hovercraft (direct Erlang CouchDB client) and LivelyCouch (CouchDB + Node.js fusion as an HTTP event-driven framework). CouchDB's revision-tree model is what later turned into Histo.

Plus small utilities — Node-Magick, spawn.js.

2011 — Pasteboard

Pasteboard — peer-to-peer text sync between Mac, iPhone, and iPad, over Bonjour on the local network. No internet, no server in the middle. Native apps on each platform. Co-built with Johannes Auer.

The same instinct — sync happens between devices, not through a cloud — was formalized two years later in the Histo thesis.

2011 — Erlang, parsers, and a DSL phase

A year deep in language design and Erlang:

  • Grammars — a parser generator written in Erlang. I was deep in OMeta, PEGs, and the wider "your own DSL in 200 lines" world.
  • osm-routing — OpenStreetMap-based geo-routing, also Erlang.
  • flights — interactive map of outgoing flights from US airports.

The instinct continued into 2012 with eventlang and token-streams (parallel parsing experiments). The compiler / grammar interest from this stretch is what later became autocompile.

2012–2013 — Histo and the building-blocks cluster

Histo / syncing-thesis (full thesis PDF). A protocol for peer-to-peer data synchronization built around a Merkle DAG, three-way merge, semantic conflict resolution, and history tracking for offline-first apps. Inspired by git's data model.

The same primitive — content-addressed state with deterministic merge — now shows up in USD scene composition, agent commit chains, and CRDT systems like Automerge and Yjs.

The thesis was the capstone of a year-long cluster: I built each piece of the data-sync stack from first principles as a small single-purpose library.

Three of these — ancestor.js, graph-difference.js, and canonical-json — quietly keep getting pulled into modern stacks that need deterministic content-addressing.

2013 — Lua and embedded experiments

A portable-runtime phase, starting around the time of the thesis. MoonStore (2012) was an early Lua-based sync library — the first try at the same instinct in a small, embeddable runtime. Then lua-experiments (C, Lua, mongoose web server, CMake), lua.cmake, luajit.cmake, mongoose.cmake — building Lua and Lua-driven web servers as embeddable artifacts.

The Lua-as-portable-runtime instinct started here; it resurfaces in Daslab's reactive SDK, edge job runner, and ESP32 work (shelf, growos).

2018–2021 — Zapier

Zappy — screen capture and annotation for macOS. Zapier for Mobile — workflow automation from your phone; I was part of the founding mobile effort.

Four years close to the daily UX of non-technical knowledge workers — what they reach for, what trips them up, where automation actually lives in their day. A lot of the patterns I'm working with now trace back to what I saw at Zapier.

2022 — ShortcutAI

ShortcutAI — started on the raw OpenAI completion API (GPT-3), before ChatGPT shipped. The first version was the simplest thing that worked: invoke a model from any text selection in Apple Notes.

The project then grew well beyond Apple Notes — a multi-channel AI assistant that worked across Telegram, Line, Facebook Messenger, and WhatsApp, with native macOS shortcuts, a Playwright-based browser-automation agent, a Spotlight-style command palette on the web, a marketplace, and a server runtime tying it all together.

The system was architected around what we'd now call Skills — typed agent commands with declared input schemas and a params → review → result flow per invocation. The 2023 Skill catalog included Scrape Website (URL → CSV), Summarize PDF, Transcribe Audio, Text-to-Speech, Text-to-Image, Image-to-Text, Image Maps (depth/segmentation), Resize Images, Obfuscate Video, Translate, Find Emoji / Find Illustration, and a "Create Custom Command" surface so users could ship their own Skills into the marketplace.
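The Skill shape can be reconstructed roughly as follows. This is a hypothetical sketch of the pattern, not the shipped code, and every name in it is illustrative:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Skill:
    # A typed agent command: declared input schema, staged review, then result.
    name: str
    schema: dict[str, type]
    run: Callable[[dict], str]

    def invoke(self, params: dict, approve: Callable[[dict], bool]):
        for key, typ in self.schema.items():
            if not isinstance(params.get(key), typ):
                raise TypeError(f"{self.name}: '{key}' must be {typ.__name__}")
        if not approve(params):   # review gate: nothing runs without approval
            return None
        return self.run(params)
```

The params → review → result flow is the same trust-layer shape as above: the approval gate is both a safety boundary and a label source.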

Architecture across multiple repos:

  • shortcutai.swift — macOS system integrations: accessibility, global key events, focused-text-view fetching, custom key-sequence detection, streaming responses.
  • shortcutai_agent — Playwright-based browser-automation agent over HTTP. Launch sessions, navigate, screenshot, scrape. An early take on what's now called computer use.
  • shortcutai-modules — community-contributed Skills.
  • Plus private repos for the server, API, docs, marketplace, and a Next.js web dashboard with a Spotlight-style command palette — the cross-platform surface for invoking Skills outside the macOS app.

The same pattern — typed agent commands + marketplace + dashboard — later became canonical as ChatGPT plugins (Mar 2023), OpenAI GPTs (Nov 2023), and Anthropic Skills (2025). ChatGPT shipped a macOS app two years after ShortcutAI's first version; Apple Intelligence put LLMs into the OS at the system level after that.

2024 — Discotalk

Discotalk — generative image and video iOS app, multi-model from day one (every major image/video model integrated), with a social layer for sharing prompts and remixing. Pre-Sora era.

What landed there, and what didn't, directly shaped how I think about agent UX now.

2025–26 — Daslab and autocompile

Daslab (private) — a workspace where humans and agents collaborate via scenes. Content-addressed scene graph, USD-inspired layered composition, a trust layer that turns approve/reject signals into supervision, and a job runner that ships everywhere (Bun server, iOS, Rust/WASM core that targets browser, Mac, ESP32, iPhone). The integration of every thread above.

autocompile — Apache 2.0. Compiles observed agent workflows into deterministic programs via Answer Set Programming, with a roadmap for library learning (Stitch), rule discovery (ILASP), and per-slot neural policies. Sits in the process mining lineage with two extensions outside the field's mainstream: trust-layer signals as the supervision channel, and small policies filling each stochastic slot in the discovered structure.
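The runtime half of that split can be sketched simply: a compiled program is a sequence of steps where invariant arguments are baked in as constants and each remaining stochastic slot defers to its own small policy. Hypothetical shapes for illustration, not autocompile's actual representation:

```python
def run_compiled(program, policies, context):
    # program: list of (step, value) pairs; value is a compiled-in constant
    # for invariant steps and None for stochastic slots.
    # Each None slot is filled at runtime by its small per-slot policy.
    trace = []
    for step, value in program:
        filled = value if value is not None else policies[step](context)
        trace.append((step, filled))
    return trace
```

The deterministic skeleton never calls a model; intelligence is spent only where the discovered structure says the behavior actually varies.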

Adjacent work in the same cluster:

  • Defigner — fork execution contexts, modify in isolation, merge back. Histo's instinct applied to runtime state.
  • shelf — physical Daslab widget rendering scene data on an e-ink display.
  • growos — ESP32 sensors + edge AI for autonomous plant care.
  • pii-proxy — privacy proxy that masks PII before sending to LLMs and unmasks responses going back.

In progress

Drafts I'm developing into longer pieces:

  • Lessons from a decade of data sync: applying Histo's principles to LLM agents
  • Towards provable provenance in AI systems: challenges and opportunities
  • The future of automation is sensed, not scripted

Get in touch

If any of this overlaps with what you're working on: [email protected]