Execution Receipts and the Problem of Citable AI
AI citations are failing because they lack verifiable provenance. Execution receipts offer a path toward AI outputs that can be meaningfully cited.
We focus on traceable model runs and offline deployment: clear provenance, documented configurations, and repeatable setups.
Our tooling is designed to run without outbound network calls or telemetry.
We aim for reproducibility within documented tolerances, with signed artifacts.
We collect the minimum data needed and keep evidence local where possible.
We do not promise the model is right. We aim to show what model ran, with what config, on what input. That's the part you can verify.
Artifacts should trace back to their origin: model weights, adapters, and the runtime are each identified, by content hash where possible.
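As a concrete sketch, a receipt could record content hashes of the weights, the configuration, and the input. The function names and fields below are illustrative, not a fixed schema:

```python
import hashlib
import json
import time

def sha256_file(path: str) -> str:
    """Content hash of an artifact (weights file, adapter, etc.)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def make_receipt(weights_path: str, config: dict, prompt: str) -> dict:
    """Record what ran: which weights, which config, which input."""
    return {
        "weights_sha256": sha256_file(weights_path),
        "config_sha256": hashlib.sha256(
            json.dumps(config, sort_keys=True).encode()
        ).hexdigest(),
        "input_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "timestamp": time.time(),
    }
```

A receipt like this says nothing about output quality; it only lets a third party check that a claimed run used the claimed artifacts.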
Configurations are structured declarations of what should run: machine-readable and diffable.
Configurations can be signed so tampering is detectable.
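A minimal sketch of both ideas, assuming the third-party `cryptography` package for Ed25519 signatures; the configuration keys are made up:

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# A declaration of what should run, serialized canonically
# (sorted keys, fixed separators) so byte-level diffs are meaningful.
config = {"model": "example-7b", "temperature": 0.0, "seed": 42}
payload = json.dumps(config, sort_keys=True, separators=(",", ":")).encode()

# Sign the canonical bytes; any later edit invalidates the signature.
private_key = Ed25519PrivateKey.generate()
signature = private_key.sign(payload)

# Verification raises InvalidSignature if the config was tampered with.
try:
    private_key.public_key().verify(signature, payload)
    print("config verified")
except InvalidSignature:
    print("config was modified")
```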
Log entries can reference the previous entry, so deletion or modification becomes detectable.
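One common construction with this property is a hash chain: each entry commits to the hash of the one before it. The entry format below is an assumption for illustration, not a fixed format:

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry

def append_entry(log: list, event: dict) -> None:
    """Append an entry that commits to the hash of the previous one."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    log.append({
        "prev": prev,
        "event": event,
        "entry_hash": hashlib.sha256(body.encode()).hexdigest(),
    })

def verify_chain(log: list) -> bool:
    """Recompute every hash; any deletion or edit breaks the chain."""
    prev = GENESIS
    for entry in log:
        body = json.dumps({"prev": prev, "event": entry["event"]},
                          sort_keys=True)
        if entry["prev"] != prev:
            return False
        if entry["entry_hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["entry_hash"]
    return True
```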
In transfer-heavy workloads, data movement dominates energy cost. Unified memory architectures can reduce this cost by eliminating copies between CPU and GPU memory. We measure the effect in joules per token.
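The metric itself is plain arithmetic: average power times wall-clock time, divided by tokens generated. The numbers below are hypothetical:

```python
# joules per token = average power (W) x wall-clock time (s) / tokens
avg_power_w = 18.5   # hypothetical average package power during the run
duration_s = 42.0    # hypothetical wall-clock time of the generation
tokens = 1200        # hypothetical tokens generated

energy_j = avg_power_w * duration_s  # watts x seconds = joules (777 J here)
print(f"{energy_j / tokens:.3f} J/token")
```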
We document a measurement methodology for joules-per-token benchmarking on Apple silicon.
macOS powermetrics sampling • 10-run averaging • thermal normalization • documented tolerances
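A sketch of the aggregation step, assuming per-run energy totals and token counts have already been extracted from powermetrics samples; the run data is hypothetical:

```python
import statistics

# Hypothetical per-run results: (energy in joules, tokens generated),
# e.g. derived from powermetrics power samples taken during each run.
runs = [(781.0, 1200), (775.2, 1200), (790.4, 1200), (768.9, 1200),
        (783.3, 1200), (779.8, 1200), (772.5, 1200), (786.1, 1200),
        (777.0, 1200), (780.6, 1200)]

per_run = [energy / tokens for energy, tokens in runs]
mean = statistics.mean(per_run)
spread = statistics.stdev(per_run)

# Report the mean with its spread as the documented tolerance.
print(f"{mean:.3f} J/token +/- {spread:.3f} (n={len(per_run)})")
```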
Why running the same CUDA code twice can produce different floating-point results, and what you can do about it.
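The root cause is easy to demonstrate in any language: floating-point addition is not associative, so a reduction that regroups terms, as parallel GPU kernels routinely do, can change the result:

```python
# Floating-point addition is not associative. A parallel reduction that
# regroups these three terms can legitimately return either answer.
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)  # 1.0
print(a + (b + c))  # 0.0  (b + c rounds back to -1e16)
```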
A technical governance study examining how nondeterminism in AI systems creates audit, compliance, and operational control failures across regulated industries.