Consider a scenario based on actual events. A sheriff’s deputy encounters a vehicle driving erratically on an interstate highway. Someone in the car opens fire, hitting the deputy’s car multiple times. Dashcam audio records the gunfire, but the suspect is not visible on video. No one is injured. The driver flees, escaping arrest. But police later identify the driver and arrest him.
For the public, this story, the real version of which happened in June 2021 in Albuquerque, is a dramatic newscast. For a prosecutor, who would discover that some of the reporting was incorrect, it’s the start of a disciplined process of determining which law governs, what the evidence proves, and how the state will exercise authority. Every decision is consequential, which is why every step is deliberate: reading the suspect’s prior arrest reports, reviewing video footage, reconciling witness accounts, researching case law, and consulting with investigators and supervisors. The documents that result from this process exercise state authority under established legal procedures and the rule of law.
Generative AI can now instantly produce documents that closely resemble these legal artifacts. What looks like dozens of hours of work can be done in seconds. The American Bar Association reports widespread use of AI: almost a third of all lawyers surveyed for a 2024 report, and almost half of those in the largest firms, said their offices were using AI, even as three quarters of lawyers worried about the technology’s accuracy, by far the greatest concern. Prosecutors, law enforcement officers, and court officials increasingly use consumer chatbots to generate charging documents, case summaries, and legal research. We know this not only from surveys but also because court filings increasingly turn out to contain AI slop.
The problem is simple enough to demonstrate. If you give the news report above to Claude and to ChatGPT and ask them to draft a charging memorandum, you will find that each of the documents is fluent and professionally formatted, citing statutes from the state in which the events occurred, New Mexico. But the charging decisions the chatbots reach may well be different. One may recommend attempted murder, the other aggravated assault. Each will likely also fabricate facts, for example referencing body camera footage when only a dashboard camera was present, and describing the suspect as visible when he was not.
But what if someday better models could produce more accurate and reliable results? Treating mistakes as the main problem distracts us from a deeper danger, particularly in prosecution: AI cannot replace human judgment. AI does not apply statutes, weigh precedent, or enforce official standards. It produces no audit trail showing which authorities governed its conclusions. When prosecutors replace their work with AI output, institutions lose the capacity to justify and defend their decisions.
To see why AI systems cannot provide an authoritative charging decision, it helps to examine what the decision actually requires when made by a human. The account here is based on the experience of one of the authors of this article, a former trial attorney in the same District Attorney’s Office that prosecuted the real case above.
At first glance, the charging decision in the case appears straightforward. Law enforcement filed a criminal complaint alleging aggravated assault on a peace officer. At felony first appearance, a judge found probable cause. The case then proceeded to a grand jury, which returned an indictment. Along the way, as evidence developed, the prosecutor added or dropped additional charges.
It is simple to model this activity and its dependencies as a workflow. But capturing the legal reasoning that goes into each decision makes the workflow exponentially more complex.
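To make the contrast concrete, here is a minimal sketch of the procedural skeleton alone, written in Python. The step names are illustrative labels chosen for this example, not official court terminology, and the structure is an assumption about how such a workflow might be written down.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
# Step names are hypothetical labels for the stages described in the text.
WORKFLOW = {
    "criminal_complaint": set(),
    "felony_first_appearance": {"criminal_complaint"},
    "grand_jury": {"felony_first_appearance"},
    "indictment": {"grand_jury"},
}


def procedural_order(workflow):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(workflow).static_order())


if __name__ == "__main__":
    print(" -> ".join(procedural_order(WORKFLOW)))
```

The sketch records when each step happens and what it depends on. Nothing in it records why a particular charge was chosen, which is the part the workflow cannot capture on its own.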
The critical question in this case was not how charges were filed, but which charge was chosen. Attempted murder was legally available but not charged. That choice reflects professional judgment grounded in legal meaning and constrained by law.
Under New Mexico law, attempted first-degree murder requires proof of willful and deliberate intent to kill and a substantial step toward, but failure to complete, that objective. Courts have consistently held that intent to kill cannot be presumed from the mere firing of a weapon. The state must prove intent. The prosecutor’s judgment depends on the meanings of dozens of interlocking legal terms like “intent” and “substantial step.” Once the prosecutor understands what the statute requires, she reviews and maps the evidence against each element. Does this conduct show specific intent? Did the defendant take a substantial step? What can we prove to a jury?
The deputy’s dash camera faced forward but the suspect was not visible on video. Shots were audible but none struck the vehicle, contrary to early news reports saying the car was hit several times. The defense can advance multiple theories: that the shots were warning shots fired in frustration, that the suspect did not intend to harm anyone, that a person intent on killing from close range would not have missed, or that the sounds were fireworks or a vehicle backfiring on the highway. Taken together, these narratives create reasonable doubt as to intent to kill.
The prosecutor’s professional judgment requires anticipating the narrative of the defense and selecting the charge that can best survive it. A charge of attempted murder could be no-billed by a grand jury or end in acquittal at trial, leaving criminal conduct unpunished. Charging aggravated assault on a peace officer instead captures the provable conduct without requiring proof of homicidal intent, while carrying a comparable sentence.
The decision to charge the suspect with aggravated assault was not timidity or bargaining. It was institutional reasoning about provability, jury interpretation, and proportionality. Selecting the right charge synthesized years of experience with statutes, juries, and sentencing. Notably, none of that judgment appears in the criminal complaint or indictment itself. But all of it governed the decision.
The prosecutor who made that decision affixed her name to it. She assumed responsibility for defending the charge against defense motions, explaining it on appeal if challenged, and proving it to a jury at trial. If the decision proved wrong, the reasoning that produced it could be examined, criticized, and corrected.
That accountability is what distinguishes institutional judgment from the mere production of plausible documents. A prosecutor cannot claim a decision emerged from an opaque process. She must explain which meanings governed, which judgments were exercised, and why those judgments fell within prosecutorial discretion.
This is the standard that AI is unable to meet. Large language models generate outputs with no audit trail, no named authority, no mechanism for correction. They produce plausible text, not defensible judgment. Greater accuracy does not close this gap. Rather, the better the models become at reproducing prosecutorial form, the easier it is to mistake their linguistic competence for governed judgment.
There is no formal approval process for lawyers using AI chatbots — no implementation teams, no institutional review. Anyone in the workflow can use AI without asking permission. Users may have the best intentions and believe they are augmenting judgment and improving productivity. A skilled attorney may even refine chatbot outputs through iteration, verifying facts and correcting citations.
But the apparent efficiency gains conceal a deeper institutional cost. The work AI compresses is how junior prosecutors learn legal structure and develop judgment through supervised correction. By short-circuiting that process, AI consumes the mechanism by which institutions grow in competence over time. The expertise applied in refining AI output accrues to no one but the chat session. The costs accumulate quietly through lost accountability and eroded professional formation. Preventing that outcome requires placing AI under institutional authority.
The role AI must actually play here is not to exercise judgment, but to reveal the rules and practices that govern judgment. The charging standards prosecutors use are enforced through supervision and review. Many of the rules that govern this process are informal. AI could be used to compare both historical and new cases against approved policy, flagging departures from established practice and from the explicit definitions of charging standards, thresholds, and review. Those flags will not make decisions. They will force human prosecutors to explain whether a deviation reflects legitimate judgment or a breakdown in standards, such as semantic drift or inconsistent application. AI can surface patterns, but only prosecutors can decide which ones are acceptable and which require correction.
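A minimal sketch of what such flagging might look like, in Python. Everything named here is hypothetical: the `CaseRecord` fields, the conduct categories, and the contents of `APPROVED_POLICY` are invented for illustration and do not describe any office's actual charging policy. The one design choice that matters is that the function only returns reasons for a human to review and never selects or changes a charge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseRecord:
    case_id: str
    conduct: str      # hypothetical conduct category
    charge: str       # charge actually filed
    reviewed_by: str  # supervising prosecutor; empty if no review is recorded


# Hypothetical approved policy: conduct category -> charges needing no explanation.
APPROVED_POLICY = {
    "shots_fired_at_officer_no_injury": {"aggravated_assault_on_peace_officer"},
}


def flag_departures(cases, policy):
    """Return (case_id, reason) pairs for human review; decides nothing itself."""
    flags = []
    for case in cases:
        expected = policy.get(case.conduct)
        if expected is None:
            flags.append((case.case_id, "no approved policy covers this conduct"))
        elif case.charge not in expected:
            flags.append((case.case_id, "charge departs from approved policy"))
        if not case.reviewed_by:
            flags.append((case.case_id, "no supervisory review recorded"))
    return flags
```

A flagged case is not a wrong case. A prosecutor who charged outside the policy may have had good reason, and the flag simply requires that reason to be stated and reviewed by a named person.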
The sequence matters. Institutions will have to formalize and approve their governing logic first, tying it to human authority and review mechanisms. Only then should computational systems be permitted to operate within those boundaries. When fluent systems are deployed before institutional logic is fully encoded, AI will substitute approximation for authority, and appearance for governance.
For AI to support rather than undermine prosecutorial authority, AI-generated outputs must be constrained within the prosecutorial workflow in four ways:
Without these constraints on AI, institutions will make decisions that no one can credibly stand behind, not because the decisions are wrong but because the reasoning that produced them cannot be inspected or explained.
Institutions can make mistakes and still be legitimate because they can explain and correct their reasoning. Ungoverned AI cannot do either, even when it produces a plausible answer.
The suspect in the aggravated-assault case was convicted through a reliable institutional process whose underlying human coordination we all take for granted. When the Magna Carta declared in 1215 that no free man could be punished “except by the lawful judgment of his equals or by the law of the land,” it marked the beginning of a long effort to bind state authority to publicly knowable standards. Our inheritance of that ideal is fragile. When institutions substitute machine-generated text for governed judgment, they abandon the foundation of their legitimacy.
The technology is powerful. Commercial chatbots score top marks on the LSAT and bar exam. The paradox is that institutions are deploying systems that simulate mastery but that also disavow responsibility in their terms of use. AI can pass the bar exam, but it can’t prosecute a criminal case.
The choice is not whether prosecutors will use AI. They already do. The choice is whether legal institutions will govern these tools deliberately, or discover too late that governance has been ceded to them.