Spoor · autonomous DFIR

Autonomous DFIR you can audit

An AI agent investigated a real compromised domain controller, DC01 (10.42.85.10), and reached a verdict. Don't trust it: recompute its evidence chain yourself, right here in your browser.

Verify it yourself

DC01 (10.42.85.10) — confirmed compromised

DC01 (10.42.85.10), a Windows Server 2012 R2 domain controller, is CONFIRMED COMPROMISED. Memory forensics of citadeldc01.mem (captured 2020-09-19) reveals a multi-stage intrusion: a rogue process named coreupdater.ex (PID 3644) established a confirmed outbound C2 connection to 203.78.103.109:443, then self-terminated after a 15-second execution window.


Accuracy, measured honestly

Scored against the verified answer key for Case 001. We report the unflattering number and exactly why.

Precision: 0.25
Recall: 0.50
F1: 0.33
Hallucination: 0.000
Evidence integrity: INTACT — SHA-256 byte-identical before and after the run.
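The integrity claim above amounts to hashing the evidence file before the run and again afterwards, then comparing the two digests. A minimal sketch of that check follows; the function names `sha256_file` and `evidence_intact` are hypothetical and are not taken from Spoor's code.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large memory images never load whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def evidence_intact(pre_run_hash, path):
    """True if the file's current digest matches the digest recorded pre-run."""
    return sha256_file(path) == pre_run_hash


# Hypothetical usage: record the digest before the run, re-check after it.
# pre = sha256_file("citadeldc01.mem")
# ... run the investigation ...
# assert evidence_intact(pre, "citadeldc01.mem")
```

Streaming in chunks keeps memory use flat regardless of image size, which matters for multi-gigabyte memory captures.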

Zero hallucination, real recall, and a precision cost that is over-reporting — not fabrication. The same type-driven rule that finally scores the real malware also surfaces the agent's injection-victim processes (legitimate Windows binaries flagged for RWX regions) as file false positives — real anomalies, just not in the 4-item memory-visible answer key.

Pre-correction bridge: P 0.33 · R 0.25 · F1 0.29. The agent's output never changed — only the deterministic scorer was fixed, in the open.
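The reported figures can be re-derived from true-positive, false-positive, and false-negative counts. The sketch below uses the standard definitions of precision, recall, and F1. The counts are an assumption: they are the smallest integers consistent with the published ratios and the 4-item answer key, not numbers taken from the source or from `rescore_run.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    precision: float
    recall: float
    f1: float


def score(tp, fp, fn):
    """Standard precision / recall / F1 from raw counts, guarding zero division."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Score(precision, recall, f1)


# ASSUMED counts, inferred from the published ratios against a 4-item key.
canonical = score(tp=2, fp=6, fn=2)       # 2 of 4 key items found, 8 items reported
pre_correction = score(tp=1, fp=2, fn=3)  # 1 of 4 key items found, 3 items reported

for label, s in (("canonical", canonical), ("pre-correction", pre_correction)):
    print(f"{label}: P {s.precision:.2f} · R {s.recall:.2f} · F1 {s.f1:.2f}")
```

Under these assumed counts the rounded values match the figures quoted above: P 0.25, R 0.50, F1 0.33 for the canonical run and P 0.33, R 0.25, F1 0.29 before the scorer correction.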

The trust stack

Five pillars. Each one ships with the command that proves it.

Spoor Architecture

Pipeline from read-only evidence through guardrailed forensic tools, into a hash-chained audit log consumed by lead and specialist LLM agents, which produce citation-enforced findings validated by a deterministic scorer.

EVIDENCE (read-only)
citadeldc01.mem · SHA-256 sealed · pre == post (immutable)

GUARDRAIL SPINE
B1 PATH JAIL: evidence resolved + contained · symlinks rejected before exec
B2 NO-SHELL: 7 forensic bins, allow-listed only · shell cmds impossible by ctor
B3 WORKSPACE JAIL: tool artifacts write ONLY to workspace · disjoint from evidence

B5: HASH-CHAINED AUDIT LOG
genesis (prev=0x000…) → tool call n (prev=hash(n-1)) → seq n: report sealed (tip hash in report.json)
every record: ts · tool · args · stdout_sha256 · prev_hash
tamper → hash break → chain fails · recompute with verify-audit CLI

MCP SERVER: spoor-sift (FastMCP) · 12 read-only typed tools
memory: vol_pslist · vol_pstree · vol_netscan · vol_malfind · vol_cmdline
timeline: log2timeline_run · psort_query · rows cached server-side
disk / registry: tsk_fls · tsk_icat · regripper_run · SHA-256 sealed
indicators: hash_file · yara_scan · failures return as data · self-corrects

HUMAN ANALYST
reads enforced report · case-level decisions only

B4: APPROVAL GATE
live actions: interrupt() before any side-effect · human approves

ORCHESTRATION: LangGraph supervisor (checkpointed)
Lead Investigator: routes · re-routes on failure · claude-sonnet-4-x · Sonnet only (cost), no Opus
Specialist agents: triage · timeline · ioc_correlation

CITATION CONTRACT (enforced in code: submit_report)
"confirmed" must cite a valid tool_call_id · uncited claims downgraded
broken chain → nothing confirmable · report hash sealed on chain

reporter
verdict · findings · IOCs · timeline · scored accuracy (P / R / F1 / halluc)
deterministic scorer runs pre + post every correction · sealed into audit chain

HONEST ACCURACY SCORER
P / R / F1 / halluc, deterministic
canonical run: P 0.25 · R 0.50 · F1 0.33 · halluc 0.000 · rescore_run.py

SELF-CORRECTION LOOP
tool failure → agent recovers · recovery is replayable
spoor show-selfcorrect <run>

VERIFIABLE AUDIT
CLI: verify-audit <run>/audit.jsonl
SHA-256 chain recomputed client-side · every finding traces to exact tool call

Security boundaries: B1 path jail · B2 no-shell allow-list · B3 workspace/evidence disjoint · B4 human approval interrupt · B5 tamper-evident audit
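The hash-chained audit log (B5) can be illustrated with a short sketch. It uses the record fields named in the diagram (ts, tool, args, stdout_sha256, prev_hash). Everything else is an assumption made for illustration: the canonical-JSON serialization, the all-zero genesis value, and the helper names `record_hash`, `append_record`, and `verify_chain`. The real `verify-audit` CLI may serialize and hash records differently.

```python
import hashlib
import json

# ASSUMPTION: the genesis link is 64 zero hex digits; the source only shows "0x000…".
GENESIS_PREV = "0" * 64


def record_hash(record):
    """Hash a record over a canonical JSON encoding (assumed, not Spoor's format)."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain, ts, tool, args, stdout):
    """Append a record whose prev_hash commits to the previous record."""
    prev_hash = record_hash(chain[-1]) if chain else GENESIS_PREV
    record = {
        "ts": ts,
        "tool": tool,
        "args": args,
        "stdout_sha256": hashlib.sha256(stdout).hexdigest(),
        "prev_hash": prev_hash,
    }
    chain.append(record)
    return record


def verify_chain(chain, tip_hash=None):
    """Return the index of the first broken link, or None if the chain is intact.

    If tip_hash is given (e.g. the value sealed in a report), the final record
    is checked against it too; a mismatch there is reported as len(chain).
    """
    expected = GENESIS_PREV
    for index, record in enumerate(chain):
        if record["prev_hash"] != expected:
            return index
        expected = record_hash(record)
    if tip_hash is not None and expected != tip_hash:
        return len(chain)
    return None
```

Because each record commits to the hash of its predecessor, editing any earlier record breaks the link at the following record. The sealed tip hash covers the final record, which no later link protects.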