A research log — Closed beta · 15 runs documented · In flight
Antacog
Exploring whether grounded reasoning in autonomous agents
produces better outcomes than confident-sounding action without it.
Antacog is an experimental governance system for AI agents. It gives
agents the ability to detect contradictions in their own reasoning
and substrate, surface them, and — in Mode 1 — autonomously initiate
governance turns.
I'm currently testing this by using Claude Code as the primary agent
working on Antacog itself — with full documentation of every run,
failure, and correction. (Long-term I want to be model-agnostic, but
right now cost and iteration speed make this the lowest-friction
path.)
Current IRATA Level 3 Boilermaker and former Team Leader at an NGO
youth re-engagement programme. I've worked in environments where
governance, risk, and accountability have real consequences. That
shapes this work.
i.
Grounded reasoning beats ungrounded action.
Traceable provenance matters. Every claim should be traceable back
to the dialogue, evidence, or decision that produced it. I believe
this produces more reliable outcomes in human thinking, and I'm
testing whether the same holds for autonomous agents. The work
here is whether that belief survives contact with reality.
ii.
The conversation is the artifact.
What's load-bearing isn't the final model. It's the argument that
produced it — what was challenged, what was conceded, and what
wasn't worth pulling on. The dialogue is the primary record. The
model is the residue.
iii.
Friction is epistemically valuable.
Tools that optimise for agreement create artifacts that don't
survive pressure. Tools that support productive challenge create
artifacts that do. This work is built around the second.
✻ ✻ ✻
MAY ’26 · IN FLIGHT
Phase 0 substrate refactor (SESSION_MODEL) shipped to production
this month after fifteen runs of dogfooding. Building Mode 1 —
the autonomous detection layer that lets Ant initiate governance
turns — against the AgentAction provenance surface. A bootstrap
workstream runs in parallel: the on-ramp that takes an existing
system into substrate without dialogue having to do all the work.
Findings — four
Snapshots — detail on request
№ 04
May 2026
Run 15
A multi-system isolation probe ran an agent in plan mode and
produced zero ask_ant calls. Monitoring only dialogue traffic
is structurally biased toward dialogue-active agents — a
foreseeable blind spot the probe doc didn't name. Per-tool-call
provenance was promoted to a Mode 1 prerequisite the same day.
Request more information →
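For a concrete picture of what "per-tool-call provenance" means here: a record emitted on every tool invocation, whether or not any dialogue traffic accompanies it, so plan-mode runs are no longer invisible. A minimal Python sketch; the record fields, names, and wrapper shown are illustrative assumptions, not Antacog's actual schema.

```python
import hashlib
import json
import time
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ToolCallRecord:
    """One provenance record per tool invocation (illustrative schema)."""
    tool: str
    args_digest: str            # stable hash of the call arguments
    ts: float                   # wall-clock time of the call
    dialogue_turn: Optional[int]  # None when the agent acts with no dialogue

def record_tool_call(log, tool, args, dialogue_turn=None):
    """Append a record for every call, including silent plan-mode calls."""
    digest = hashlib.sha256(
        json.dumps(args, sort_keys=True).encode()
    ).hexdigest()[:12]
    rec = ToolCallRecord(tool=tool, args_digest=digest,
                         ts=time.time(), dialogue_turn=dialogue_turn)
    log.append(rec)
    return rec

log = []
record_tool_call(log, "read_file", {"path": "plan.md"})  # plan mode: no dialogue
record_tool_call(log, "ask_ant", {"q": "contradiction?"}, dialogue_turn=3)

# Dialogue-only monitoring would see one event; per-call provenance sees both.
silent = [r for r in log if r.dialogue_turn is None]
```

The point of the sketch is the blind spot: filtering on dialogue traffic drops the first record entirely, while the per-call log retains it.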
№ 03
May 2026
Runs 9, 12, 13, 14
Across four runs at structurally different surfaces — librarian
summaries, spec writebacks, inferred substrate — the loop
substituted summary-of-source for independent verification. The
pattern transferred across domains and produced a named
diagnostic — "current accident, not current contract" —
now load-bearing in the methodology.
Request more information →
№ 02
May 2026
Run 14, labelling probe
The behavioural mode embedded in the loop wasn't designed up
front. It was named retroactively from a false-positive
labelling probe across the run corpus. The methodology
generated the spec; the spec didn't generate the methodology —
a sequencing claim with consequences for how later modes get
added.
Request more information →
№ 01
May 2026
Run 12, external transfer
Running autonomous-trigger against infieldOS — an in-house
project of mine, separate from Antacog — the loop recognised
that an append-only constraint tracked as a convention should
be enforced at the registry level. Convention upgraded to
structural rule — the transferability claim earned its first
concrete artifact outside the loop's own corpus.
Request more information →
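The shape of that upgrade, in miniature: instead of a documented convention ("never overwrite an entry"), the registry itself refuses mutation. A Python sketch under assumed names; infieldOS's actual registry is not shown here.

```python
class AppendOnlyRegistry:
    """A registry that enforces append-only structurally, not by convention."""

    def __init__(self):
        self._entries = {}

    def append(self, key, value):
        # Enforcement lives here, not in a style guide: re-registering an
        # existing key is an error, never a silent overwrite.
        if key in self._entries:
            raise ValueError(f"append-only violation: {key!r} already registered")
        self._entries[key] = value

    def get(self, key):
        return self._entries[key]

reg = AppendOnlyRegistry()
reg.append("run-12", {"status": "complete"})
try:
    # A convention would permit this; the structural rule does not.
    reg.append("run-12", {"status": "rewritten"})
    violated = False
except ValueError:
    violated = True
```

The design choice the finding describes is exactly this move: the constraint stops depending on every caller remembering it.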
Open questions — four
unresolved
i.
When the model says it doesn't know, can a partner reading the
dialogue distinguish "the model doesn't know and the
substrate doesn't anchor it" from "the model doesn't know
but the substrate does"?
- Stakes
- Determines whether Ant's questions expose real design gaps or
merely model-coverage gaps. The difference matters most when
Antacog operates against a system the operator doesn't know
cold.
- Best guess
- Visible in the artifacts — the librarian, the substrate write,
the file-discovery action — but invisible in the dialogue
surface alone. Untested as an explicit claim.
- Evidence
- A probe with deliberate substrate-coverage gaps, disclosed
post-hoc to a third reader of the dialogue trace.
ii.
What's the failure mode under deliberately deceptive evidence?
- Stakes
- The grounding claim weakens if the loop can be made to
confidently ground a false claim through curated input.
- Best guess
- Adversarial inputs degrade quality faster than they degrade
confidence — the more dangerous failure mode.
- Evidence
- Pre-registered adversarial run; not yet attempted.
iii.
Does the discipline transfer to domains where the operator's
expertise is shallow?
- Stakes
- If grounding requires deep prior knowledge, the loop is an
amplifier for experts, not a tool for thinking generally.
- Best guess
- Untested. Suspected: structural questions transfer further than
domain-specific ones, mirroring the content-vs-template
distinction from Run 13.
- Evidence
- Two probes against domains the operator does not specialise in.
iv.
Does the antagonistic edge erode under sustained architectural
load — and if so, on which side first?
- Stakes
- The edge is the product. If voice-register drifts toward
accommodation under load, the loop's value claim weakens
before the operator notices.
- Best guess
- Calibrated by input quality, not domain or duration — four data
points consistent with this so far.
- Evidence
- Continued register-watch across multi-evening builds; an
explicit voice-fidelity probe under degraded input.
The loop is documented separately — run corpus, substrate model,
and the dogfooding pattern that produces the findings above.
Methodology notes are available on request alongside closed-beta
access.
Antacog is in closed beta. Operators are invited; the run corpus
is private. The dialogue product is the surface where the work
above is generated and tested.
Request access →