
Stop Dooming: Codegen will have its Gang of Four moment

Extending the Compiler Analogy

In February of this year, Vivek Haldar wrote a piece called "When Compilers Were the 'AI' That Scared Programmers" that draws a compelling parallel between the resistance to compilers in the 1950s and the resistance to codegen today. Many of us found his article simultaneously provocative and encouraging. The objections to compilers were almost word-for-word identical to the objections to AI codegen. Assembly programmers didn't trust compilers. They feared losing control. They worried that making programming easier would dilute the craft. We are hearing the exact same arguments now, seventy years later, about AI-generated code.

Haldar's piece stops at the comparison. This essay is about what comes next: if we accept that codegen is following the compiler arc, where are we on that arc? And what does history tell us about how long the rest of it takes?

The Priesthood Problem

In 1952, Grace Hopper built one of the first compilers. Her colleagues' reaction was not enthusiasm.

"Nobody believed that I had a running compiler and nobody would touch it. They told me computers could only do arithmetic."

This wasn't ignorance. These were brilliant people making a rational calculation: they had spent years mastering the intricacies of assembly language and machine code. They had intimate control over memory and CPU instructions. Surrendering that control to an automated tool felt reckless. As Haldar describes it, there was a "priesthood culture" (John Backus's term) in which experts guarded the domain closely. The objections were about efficiency, about debuggability, and, perhaps scariest of all, about identity: "If anyone could write programs in a high-level language, what did it mean to be a programmer?"

These objections sound familiar. Replace "compiler" with "Claude Code" and "assembly" with "Python" and you could publish them on Hacker News today without anyone noticing the anachronism.

The priesthood sees this and panics. But the priesthood panicked about Fortran too. In the end, compilers didn't destroy programming; they created an explosion of capability.

The question was never whether to adopt the new tools. The question was how long it would take to learn to use them well.

The Timeline You Need to Be Familiar With

You should recalibrate your expectations about codegen. Let's look at compilers:

1957
Year Zero

Fortran ships. Skepticism is intense, but adoption is immediate. By 1958, a survey finds that over half of all code running on IBM machines is compiler-generated. The technical objections, that compiled code was too slow and too wasteful, were resolved within a year or two. Adoption was not the problem.

1959 ← You are here
1968
+11 years — The Dijkstra Moment

Eleven years after Fortran, Edsger Dijkstra publishes a letter to the editor of Communications of the ACM titled "Go To Statement Considered Harmful." One of the field's most prominent computer scientists argues that a fundamental, universally used programming construct, GOTO, is actively causing harm and should be eliminated. Eleven years in, and the industry is still arguing about basics. Not edge cases or optimizations, but the basic question of how to structure the flow of a program.

That same year, a NATO conference brings together practitioners from across the software industry. They are shocked to discover that the same problems plague every kind of software project: chronic delays, blown budgets, unreliable output. They call this the "software crisis."

1972
+15 years — Best Practices Begin

Dijkstra wins the Turing Award for his contributions to structured programming. Fifteen years after Fortran, best practices for how to write programs in high-level languages are just beginning to solidify. Beginning.

1984
+27 years — "Can You Trust the Tool?"

Ken Thompson, co-creator of UNIX, publishes his Turing Award lecture, "Reflections on Trusting Trust." In it, he demonstrates that a compiler can be modified to insert a backdoor into any program it compiles, a modification invisible to anyone reading the source code. His conclusion is stark:

"You can't trust code that you did not totally create yourself."

Sound familiar?

Twenty-seven years after Fortran, the deepest question about compilers, "Can you trust the tool itself?", is still being debated with rigor.

1994
+37 years — The Gang of Four

The Gang of Four publishes Design Patterns: Elements of Reusable Object-Oriented Software. Thirty-seven years after the release of Fortran. The book doesn't invent new patterns. It recognizes that experienced developers around the world have been independently converging on similar solutions to similar problems, and it gives those solutions names. Before the book, every team reinvented the wheel. After it, the industry had a common language.

"This follows the Observer pattern."
"That's a Factory."

Suddenly, teams could discuss architecture in a shared shorthand.

The book sold over 500,000 copies in English and was translated into 13 languages. In 2005, ACM SIGPLAN awarded the authors the Programming Languages Achievement Award for the impact of their work on "programming practice and programming language design."

From "the tool exists" to "here are the mature patterns for using it well," we saw 37 years of accumulated thinking, debating, failures, and refinement.

Now consider: we are roughly two years into serious codegen adoption. If the compiler timeline is any guide, we haven't even had our Dijkstra moment yet: the clear, opinionated articulation of what the fundamental problem actually is. Let alone our Gang of Four.

Adoption was never the bottleneck, even if the Luddites are the loudest. Understanding was the bottleneck. And understanding takes decades, not months.

The Codegen Version Will Look Different

"But compilers are deterministic!!!"

The compiler arc tells us that maturity takes time. But it doesn't tell us what codegen maturity will look like, because the tools are fundamentally different.

Compilers are deterministic. Given valid input conforming to the language grammar, the output is well defined and, in principle, provably correct. This mathematical certainty is what made the entire academic apparatus possible. Type theory, formal grammars, Knuth's work on parsing, the Dragon Book, formal verification, all of it rests on the foundation that compilers behave predictably and can be reasoned about with mathematical rigor. The Gang of Four patterns themselves only work because the languages they describe have well-specified, deterministic behavior. You can write the Observer pattern in Java because Java will behave the same way every time you implement it.
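
To make "well-specified, deterministic behavior" concrete, here is a minimal Observer sketch, written in Python rather than the Java mentioned above. The point is only that the language guarantees the notification order, so the pattern behaves identically on every run:

```python
class Subject:
    """Minimal Observer: callbacks are notified of every event,
    in registration order, every single time."""

    def __init__(self) -> None:
        self._observers = []

    def subscribe(self, callback) -> None:
        self._observers.append(callback)

    def notify(self, event: str) -> None:
        for callback in self._observers:  # same order, same behavior, every run
            callback(event)

log = []
subject = Subject()
subject.subscribe(log.append)
subject.notify("saved")
assert log == ["saved"]  # deterministic: this assertion can never flake
```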

Codegen is stochastic. Same prompt, different output. Even at temperature zero, you're getting the model's most probable next token at each step, not a provably correct one. You can't formally verify it. You can't write a proof that a given prompt will always produce correct code. The mathematical foundation that underpinned every layer of compiler trust simply does not exist for large language models.
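
A minimal sketch of the decoding step makes the point concrete. The function name and shapes here are illustrative, not any particular library's API:

```python
import numpy as np

def pick_next_token(logits: np.ndarray, temperature: float,
                    rng: np.random.Generator) -> int:
    """Choose the next token id from a vector of logits.

    At temperature zero this collapses to argmax: always the single
    likeliest token. Above zero, tokens are sampled in proportion to
    their temperature-scaled probabilities, so the same prompt can
    yield different output. Nothing in either branch checks that the
    token is correct; only that it is probable.
    """
    if temperature == 0.0:
        return int(np.argmax(logits))  # greedy: most likely, not most correct
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))  # stochastic: varies run to run
```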

Does this distinction still matter as models get better? As codegen improves, the practical gap between "stochastic" and "deterministic" narrows. The error rate approaches zero. For many use cases, it may functionally close. And there are entire fields that have built rigorous, trustworthy systems on probabilistic foundations without ever achieving determinism.

But "almost deterministic" and "deterministic" remain categorically different things. You can't write a formal proof on top of "usually right." And the failure modes are different in a way that matters deeply.

A compiler either accepts your program or it doesn't, and when it fails, it fails loudly. You get an error message. You fix it. An LLM produces code that looks correct, compiles cleanly, even passes your tests, but might have a subtle logical error or a security vulnerability buried three layers deep. The failure mode isn't "broken." The failure mode is "plausible but wrong." This is the risk curve that should give you pause: as models get better, the plausible-but-wrong outputs become rarer, but they also become harder to catch, because your vigilance drops. The better the tool gets, the more dangerous the remaining errors become.
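
To see the failure mode in miniature, here is a contrived, hypothetical example (not real model output). It runs, its happy-path test passes, and the bug would sail through a skim review:

```python
def days_between(start: str, end: str) -> int:
    """Days from start to end, dates given as 'YYYY-MM-DD'.

    Looks reasonable and passes the test below, but it silently
    assumes 30-day months and 360-day years. A compiler will never
    flag this, and neither will a happy-path test.
    """
    def to_days(date: str) -> int:
        year, month, day = (int(part) for part in date.split("-"))
        return year * 360 + month * 30 + day  # the buried bug

    return to_days(end) - to_days(start)

# The shallow test that builds false confidence:
assert days_between("2024-03-01", "2024-03-11") == 10

# The reality check that nobody wrote:
# days_between("2024-01-31", "2024-02-01") returns 0, not 1.
```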

So what does maturity look like for a tool that is fundamentally probabilistic?

It probably doesn't look like compiler theory. It might look more like medicine, or aviation.

Medicine doesn't demand mathematical proofs from its treatments. It builds evidence-based practice guidelines through years of clinical trials, case studies, peer review, and structured error reporting. Physicians work with a probabilistic system, the human body, and they have developed rigorous frameworks for making good decisions under uncertainty.

Aviation doesn't eliminate human error. It builds systems around it: checklists, redundancy, crew resource management, black box analysis. Pilots operate complex stochastic systems, including their own fallible brains, and aviation has become one of the safest industries in the world. The industry built trust infrastructure designed for uncertainty.

Codegen's maturity frameworks might need to borrow from these traditions rather than from compiler theory. The Gang of Four equivalent for codegen, when it eventually arrives, might read less like a computer science textbook and more like a clinical practice handbook. The patterns won't be "Observer" and "Factory." They'll be something we don't have words for yet. And they'll be grounded in empirical practice rather than formal proof.

We Haven't Named Anything Yet

This is where we are. The pre-pattern era. It's awful. It's why we're all dooming.

The Gang of Four's real contribution wasn't inventing 23 patterns. It was noticing that experienced developers everywhere were already converging on the same solutions, and giving those solutions names. Before the book, knowledge was fragmented. Every team had its own way of solving the same problems, its own internal vocabulary, its own tribal knowledge. The book didn't create consensus; rather, it revealed a consensus that already existed but hadn't been articulated.

The same thing is happening right now with codegen, but in the earliest possible stage. Every team, every developer, is building their own set of heuristics for working with AI tools. Some of these heuristics are probably universal, and once named, everyone will say "yes, obviously." Some are probably wrong and will be discarded. Some are probably brilliant but locked inside one person's workflow, invisible to the rest of the industry. Nobody has collected them, tested them across contexts, named them, or published them in a way that creates shared vocabulary.

But I don't think we're ready for that yet. The structured programming debate is the closer parallel for where codegen is right now. Before the GoF gave us patterns in 1994, the field endured twenty years of arguing about fundamentals: Dijkstra's "Go To" letter, the structured programming wars, the NATO software crisis conference. These were arguments about paradigms, not patterns. The question wasn't "what's the best way to implement this design?" The question was "what is the right way to think about programming at all?"

That's where codegen is today. The basic questions are still wide open: when to use codegen, when not to, what constitutes responsible use, what "understanding your code" means when you didn't write it, what code review looks like when the author is a language model. These are paradigm-level questions.

You can't write design patterns for a practice that the field hasn't agreed how to think about yet.

First we need the Dijkstra moment. We need a clear, opinionated articulation of what the fundamental problem actually is. A letter to the editor that reframes how everyone thinks about the practice. We haven't gotten that yet. But we will.

Patience Is the Point

Assembly to high-level languages. Unstructured to structured programming. Procedural to object-oriented. Each time, the new tool was adopted quickly and the wisdom about how to use it well lagged behind by years or decades.

We WILL figure out codegen, but we are in 1959, not 1994. We may very well end up in another software crisis. The term was coined in 1968, eleven years into the compiler revolution, when the industry finally admitted that it had adopted powerful new tools without understanding how to use them responsibly.

Dijkstra was depressed in 1968. The NATO conference was grim. Practitioners were struggling with tools they had enthusiastically adopted but didn't fully understand. That discomfort was not a sign of failure. It was the beginning of the intellectual work that produced structured programming, formal methods, and eventually the design patterns that shaped modern software engineering.

Stop dooming.

The discomfort we feel right now about codegen, the anxiety about quality, the arguments about best practices, the nagging sense that we're building on a foundation we don't fully understand, might be exactly the same kind of productive crisis.

Adoption was never the problem. Understanding was the problem. It's the problem we face today.

The understanding will come. It just takes longer than we want it to.