AI Agents, Attribution, and SOC 2 CC8.1
AI coding agents are running in production at organizations that haven't thought through what this means for their change management evidence package. CC8.1 hasn't changed. The way changes get made has.
SOC 2 CC8.1 requires that every production change come with a consistent evidence package: who authorized it, what the plan was, who implemented it, whether it was tested, and what the backout path is. That package has worked for human engineers and automated pipelines for years.
AI coding agents don't fit either model. They receive a goal and determine their own method at runtime — which files to touch, which commands to run, how many changes to make. They typically operate under a human engineer's credentials. The result is a log that looks internally clean while failing to answer the questions the audit depends on.
This isn't a logging problem. It's an evidence problem. And it's the kind auditors are starting to ask about.
The breaks aren't subtle. They go to the core structure of what change management evidence is supposed to prove.
Most organizations let AI agents operate under the credentials of whoever invoked them. Every action the agent takes gets recorded under that person's name. The record looks complete. What it cannot tell you — or an auditor — is which decisions a human actually made, and which ones the machine made while no one was watching. That distinction is the whole point of a change review trail.
When developers use AI tools, they describe an outcome, not a method. "Fix the login bug." The AI decides in real time which files to change, which systems to touch, and how to approach the problem. Nobody writes that plan down first because nobody knows it yet. So when approval happens, there is no written scope, no defined method, and no backout steps. The auditor asks what was approved. What exists is a description of what someone wanted done, and a log of what the machine decided.
A single AI agent session can produce hundreds of individual file changes. The person reviewing that output before it ships faces more material than any human can meaningfully read in the time a typical review allows. The approval record looks the same whether the reviewer spent an hour or clicked through in seconds. An approval that can't be exercised as a real review isn't evidence of review. It's evidence that an approval step exists on paper.
The most common responses to the agent governance question are all partially correct — and each one is routinely mistaken for a complete answer.
Service account treatment ends the attribution problem — it gives auditors a truthful actor in the log. But it says nothing about what method was planned, what scope was authorized for this session, or whether a backout plan existed. The deeper breaks remain fully open. And a standing service account carries persistent, broad credentials — the opposite of bounded, per-change authorization.
Require sign-off before every agent action and you've restored human review in theory while destroying it in practice. At any meaningful scale, per-action approvals at agent speed become rubber stamps: recorded as if evaluation occurred, impossible to exercise with actual judgment. Manufactured evidence of review is worse than honestly documenting that an agent operated under appropriate governance.
Written policy is necessary. "Agents must not deploy without review" is the right policy to have. But the auditor doesn't test whether the policy exists — she tests whether the control operated. If agent activity isn't instrumented before the action occurs, the compliance evidence is generated after the fact by the same systems that did the work. A well-formatted self-attestation is still self-attestation.
The framework's premise is straightforward: hold AI agents to the same evidentiary bar human engineers already meet — at plan granularity, not keystroke granularity. A named human approves a concrete plan before work starts. An identified actor executes it. Deviations are detected and handled. Records exist that no one can quietly rewrite.
Before any session starts, the agent generates a plan record — objective, method, scope, test approach, backout path — and a human with appropriate authority approves it. Approval is bound to this artifact, not to a goal.
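One way to bind approval to the artifact rather than the goal is to have the approval record carry a hash of the exact plan. The sketch below is illustrative only: the `PlanRecord` and `Approval` types, their field names, and the sample values are assumptions, not a schema taken from the framework.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class PlanRecord:
    # Hypothetical schema: one field per element of the plan record.
    objective: str
    method: str
    scope: Tuple[str, ...]  # resources the agent is allowed to touch
    test_approach: str
    backout_path: str

    def digest(self) -> str:
        # Canonical JSON so the same plan always hashes to the same value.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class Approval:
    approver: str
    plan_digest: str  # ties the approval to one exact plan artifact

    def covers(self, plan: PlanRecord) -> bool:
        return self.plan_digest == plan.digest()


plan = PlanRecord(
    objective="Fix the login bug",
    method="Patch session token validation in auth/session.py",
    scope=("auth/session.py", "tests/test_session.py"),
    test_approach="Run the auth unit test suite",
    backout_path="Revert the change commit",
)
approval = Approval(approver="jane.doe", plan_digest=plan.digest())

# Widening the scope after approval yields a different digest,
# so the original approval no longer covers the plan.
widened = PlanRecord(**{**asdict(plan), "scope": plan.scope + ("billing/api.py",)})
print(approval.covers(plan), approval.covers(widened))
```

Because the approval stores a digest instead of a goal description, any later edit to method, scope, or backout path requires a fresh approval.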
Each session runs under its own ephemeral credential, issued for that session, bound to the approved plan, expired when the session ends. The record can now distinguish what the human did from what the agent did.
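A per-session credential of this kind might look like the following minimal sketch. The `SessionCredential` type, the `issue_credential` helper, and the 30-minute lifetime are hypothetical; a real deployment would delegate issuance to its identity provider or secrets manager.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SessionCredential:
    token: str
    agent_id: str      # the agent's own identity, not the invoking engineer's
    plan_digest: str   # the approved plan this credential is bound to
    expires_at: datetime

    def is_valid(self, plan_digest: str, now: datetime) -> bool:
        # Valid only for the bound plan and only until expiry.
        return plan_digest == self.plan_digest and now < self.expires_at


def issue_credential(agent_id: str, plan_digest: str,
                     ttl_minutes: int, now: datetime) -> SessionCredential:
    return SessionCredential(
        token=secrets.token_urlsafe(32),
        agent_id=agent_id,
        plan_digest=plan_digest,
        expires_at=now + timedelta(minutes=ttl_minutes),
    )


issued_at = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
cred = issue_credential("agent-session-001", "abc123", ttl_minutes=30, now=issued_at)
```

Checking validity against both the plan digest and the clock is what keeps the credential from becoming a standing service account.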
Every action is logged under the session credential. Resources the agent actually touched are compared against the declared scope. Action outside the envelope triggers revocation and an exception record — proof the boundary was real.
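The scope comparison can be sketched as a check run on every recorded action. Everything here is an assumption made for illustration: the `Session` type, the glob-style scope patterns, and the shape of the exception record.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Tuple


@dataclass
class Session:
    session_id: str
    declared_scope: Tuple[str, ...]  # glob patterns from the approved plan
    revoked: bool = False
    actions: List[str] = field(default_factory=list)
    exceptions: List[dict] = field(default_factory=list)

    def record_action(self, resource: str) -> bool:
        """Log the action; return False if it was refused or out of scope."""
        if self.revoked:
            return False  # credential already revoked, nothing more may run
        self.actions.append(resource)
        if any(fnmatchcase(resource, pattern) for pattern in self.declared_scope):
            return True
        # Out-of-scope action: revoke the session and write an exception record.
        self.revoked = True
        self.exceptions.append({
            "session_id": self.session_id,
            "resource": resource,
            "reason": "outside declared scope",
        })
        return False


session = Session("session-001", declared_scope=("auth/*", "tests/test_session.py"))
in_scope = session.record_action("auth/session.py")
out_of_scope = session.record_action("billing/api.py")
after_revocation = session.record_action("auth/session.py")
```

The exception record is the useful artifact: it shows an auditor that the boundary was enforced, not merely declared.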
Verification happens at two points: at credential issuance (plan complete, approver authenticated, no change freeze) and at promotion (vulnerability scan, test evidence, scope conformance). Both stages write to the audit ledger.
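As a rough illustration, both gates can share one function that evaluates named checks and writes the outcome to the ledger whether the gate passes or fails. The `gate` function, the check names, and the in-memory `ledger` list are stand-ins, not part of any real tool.

```python
from typing import Dict, List

ledger: List[dict] = []  # stand-in for the audit ledger


def gate(stage: str, checks: Dict[str, bool]) -> bool:
    """Evaluate a verification stage and record the result in the ledger."""
    failed = sorted(name for name, ok in checks.items() if not ok)
    passed = not failed
    ledger.append({"stage": stage, "passed": passed, "failed_checks": failed})
    return passed


# Stage 1: before the session credential is issued.
issuance_ok = gate("credential_issuance", {
    "plan_complete": True,
    "approver_authenticated": True,
    "no_change_freeze": True,
})

# Stage 2: before the change is promoted.
promotion_ok = gate("promotion", {
    "vulnerability_scan": True,
    "test_evidence": True,
    "scope_conformance": False,  # e.g. the agent touched an undeclared resource
})
```

Recording failed gates matters as much as recording passed ones, since a blocked promotion is evidence that the control operated.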
A governance ledger operated by the same team running the agents is self-attestation with cryptographic decoration. Independence requires key custody separation and external anchoring to a system the organization cannot rewrite.
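External anchoring is commonly built on a hash chain: each ledger entry is hashed together with the previous head, and the latest head is published somewhere the organization cannot rewrite. The sketch below shows only the tamper-detection half, under assumed record shapes and an assumed `GENESIS` value; key custody separation and the external system itself are out of scope here.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash: str, record: dict) -> str:
    # Each entry's hash depends on the record and on everything before it.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def chain_head(records: List[dict]) -> str:
    head = GENESIS
    for record in records:
        head = entry_hash(head, record)
    return head


records = [
    {"event": "plan_approved", "approver": "jane.doe"},
    {"event": "credential_issued", "agent_id": "agent-session-001"},
    {"event": "promotion_gate", "passed": True},
]

# Imagine this value was published to an external system at the time.
anchored_head = chain_head(records)

# A quiet rewrite of an early entry changes every later hash.
tampered = [dict(r) for r in records]
tampered[0]["approver"] = "someone.else"
print(chain_head(records) == anchored_head, chain_head(tampered) == anchored_head)
```

The check only means something if the anchored head lives outside the control of the team that runs the agents; otherwise the chain can simply be recomputed after an edit.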
In the absence of formal guidance, these are the questions beginning to appear in CC8.1 fieldwork. Organizations that can answer them from records are ahead of the requirement. Organizations that can't are accumulating compliance debt with each agent session running today.
The Evidence Parity Framework is a practitioner standard for governing agentic AI in SDLC environments under SOC 2 CC8.1. Its premise: hold AI agents to the same evidentiary bar human engineers already meet — at plan granularity, not keystroke granularity. Five components close the gap: plan-bound authorization, agent identity separation, execution records and deviation handling, two-stage verification, and independent anchoring. The full paper details each component and a five-phase implementation roadmap ordered by evidentiary yield.
Yes, if you've adopted AI coding tools since your last audit. A SOC 2 Type 2 audit assesses the controls that operated during a defined audit period; it says nothing about what changed afterward. If your SDLC changed after the last report was issued (AI tools adopted, new agents running in CI/CD, team expanded), the controls the auditor verified may no longer match reality. Your next audit's CC8.1 fieldwork will sample changes from the period after adoption. If the evidence those changes produce can't answer the five questions auditors ask, the gap is yours to explain before the report issues.
Service account treatment closes one of the three breaks — the identity conflation problem. The audit log now has a truthful actor. It doesn't close the authorization scope problem (the plan still isn't written before execution) or the review parity problem (the approval record still can't distinguish meaningful review from rubber-stamping at machine speed). A standing service account also carries persistent, broad credentials — the opposite of bounded, session-specific authorization. It is the right first step. It is not the complete answer.
The gap is structural, not framework-specific. ISO/IEC 27001:2022 Annex A control 8.32 (change management) requires the same evidence package: authorized, planned, tested, documented. NIST SP 800-53 CM-3 and CM-4 carry the same logic. The EU Cyber Resilience Act requires manufacturers to document and control changes to products with digital elements, which implicates the same SDLC governance question. ISO/IEC 42001 (the AI management system standard) addresses AI system change governance directly. The Evidence Parity Framework was written against SOC 2 CC8.1 because that's where auditors are beginning to ask the question, but its five components are designed to produce the evidence all of these frameworks call for.
No. The five-phase implementation roadmap in the full paper is designed to layer evidence controls on top of existing pipelines rather than replace them. Most organizations start with agent inventory and identity separation — neither of which requires pipeline replacement — and build toward independent anchoring over several cycles. The roadmap is ordered by evidentiary yield: each phase closes a meaningful piece of the gap even if the subsequent phases haven't been implemented yet. You get audit-defensible improvement at each step, not just when the whole program is complete.
The complete paper covers the evidence package CC8.1 actually requires, exactly why AI agents break it in three specific ways, why the most common responses close only part of the gap, the full Evidence Parity Framework with all five components, a five-phase implementation roadmap, and the questions auditors are already starting to ask.
Book a 30-minute conversation. We'll talk about where your agent governance stands today, what an auditor would find, and what it would take to close the gap before they do. No pitch. No proposal. Just an honest conversation.
Or reach out directly: sweltman@aletheiasecurity.com