Skip to content

Latest commit

 

History

History
264 lines (165 loc) · 31.4 KB

File metadata and controls

264 lines (165 loc) · 31.4 KB

A Code Complete Constitution for AI Coding Agents

A distilled, principle-first synthesis of Steve McConnell's Code Complete, 2nd Edition (Microsoft Press, 2004), tailored for AI coding agents. This is a guiding document, not a checklist. The goal is for the agent to internalize why high-quality code looks the way it does, so that the code it generates, reads, and modifies is something a human will still want to maintain a year from now.


1. The Core Philosophy: Software's Primary Technical Imperative Is Managing Complexity

Almost every other principle in this document is a corollary of one claim McConnell makes in Chapter 5: managing complexity is the most important technical topic in software development. He is explicit: "Software's Primary Technical Imperative has to be managing complexity."

The reasoning is grounded in a human limitation that does not go away just because the programmer is an AI. Edsger Dijkstra put it precisely in his 1972 Turing Award lecture "The Humble Programmer": "The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague." When a project reaches the point where no one understands the impact of a change in one area on other areas, progress halts. McConnell identifies three sources of unnecessarily costly designs: a complex solution to a simple problem, a simple but incorrect solution to a complex problem, and an inappropriate complex solution to a complex problem. The first and third are the failure modes most relevant to AI agents, which tend to over-engineer.

There are two fronts in the war on complexity, drawing on Fred Brooks's distinction in "No Silver Bullet: Essence and Accident in Software Engineering" (IEEE Computer, April 1986) between essential complexity (inherent to the problem and not removable) and accidental complexity (introduced by implementation tools and methods, and largely removable):

  1. Minimize essential complexity so that anyone reading or modifying one piece of the system has to hold as little as possible in mind at one time.
  2. Keep accidental complexity from proliferating. Accidental complexity is everything that is not inherent to the problem: clever abstractions added "just in case," speculative configuration, framework gymnastics, and unnecessary indirection.

C. A. R. Hoare's formulation, which McConnell quotes approvingly, is the standard to aim at: "There are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other is to make it so complicated that there are no obvious deficiencies." Aim for the first. If a generated solution has no obvious deficiencies but you cannot prove it has no deficiencies, it is in the second category and needs to be simpler.

A practical consequence for an AI agent: prefer the boring, obvious solution. Cleverness, exotic language features, deep type pyrotechnics, premature generalization, and "future-proof" abstractions all increase the cognitive load of every future reader. McConnell: "Build in as much flexibility as needed to meet the software's requirements but do not add flexibility, and related complexity, beyond what is required."


2. What "High-Quality Design" Actually Means

McConnell lists internal characteristics that good designs share. They sometimes conflict; design is a tradeoff. The list below is the working definition of "good" that the rest of this document operates against.

  • Minimal complexity. The primary goal. Avoid clever designs. If a design does not let you safely ignore most of the program while working on one piece, it is not doing its job.
  • Ease of maintenance. Imagine the maintenance programmer (which may be the next instance of you) as the audience. Design the system to be self-explanatory.
  • Loose coupling. Hold connections among different parts of the program to a minimum. Use abstraction, encapsulation, and information hiding to keep classes with as few interconnections as possible.
  • Extensibility. You can change one piece without violence to the rest. The most likely changes cause the least trauma.
  • Reusability. Pieces can be lifted out and used in other systems.
  • High fan-in. Many parts of the system use a small number of well-designed utility classes at lower levels.
  • Low to medium fan-out. A given class uses a low to medium number of other classes (more than about seven is a warning sign).
  • Portability. The system can be moved to another environment.
  • Leanness. No extra parts. Voltaire: a book is finished not when nothing more can be added but when nothing more can be taken away. Extra code has to be developed, reviewed, tested, and considered when the rest is modified.
  • Stratification. You can view the system at any single level and get a consistent picture without having to dip into other levels.
  • Standard techniques. The more a system relies on exotic pieces, the harder it is to understand. Familiar patterns lower the barrier.

Two of these deserve special emphasis for AI agents. Leanness is a hedge against the tendency to over-generate: every helper, every parameter, every optional code path, every speculative interface is debt. Stratification is a hedge against mixing levels of abstraction in a single function or class, which is one of the most common shapes of unmaintainable AI-generated code.


3. Design Is a Heuristic, Iterative, Wicked Process

McConnell devotes the first section of Chapter 5 to deflating the myth of design as deterministic engineering. Several properties matter for how an AI agent should approach a non-trivial change:

  • Design is a wicked problem. A wicked problem can be clearly defined only by partly solving it. The implication: do not expect to nail the design from a one-shot read of the requirements. Expect to revise as understanding grows.
  • Design is sloppy even when the result is tidy. The path to a clean design is full of dead ends, throwaway sketches, and decisions that are reversed. Treat your first design as a draft, not a deliverable.
  • Design is about tradeoffs and priorities. The right design depends on which of the quality attributes above matter most for this program. There is no universally correct answer.
  • Design is non-deterministic. Different competent designers will produce different reasonable designs.
  • Design is a heuristic process. "The techniques are not rules; they are analytical tools" (McConnell, Chapter 5). Dogmatic adherence to any single methodology, including any of the heuristics in this document, hurts both creativity and the resulting program.
  • Design is emergent. Designs evolve and improve through iteration, refactoring, and use.

The practical conclusion McConnell draws is blunt: "Don't settle for the first design that occurs to you. Iterate, iterate, and iterate again." For an AI agent this means: if the task is non-trivial, sketch the approach, critique it, then revise before committing fully to code. The cheapest place to fix a design is in the design.


4. The Design Heuristics That Do Most of the Work

McConnell catalogs eight major heuristics. They are tools, not commandments, and they are all in service of managing complexity.

4.1 Find Real-World Objects

Identify the things in the problem domain (customers, invoices, sensors, sessions) and let them shape the structure of the code. This grounds the abstraction in something stable: the world changes more slowly than implementation choices.

4.2 Form Consistent Abstractions

Every routine, class, and module should let the caller think at one level of detail and ignore others. McConnell's metaphor: a builder installs doors, not panes of glass and pieces of wood. A routine that mixes a high-level intent ("post the order") with low-level details ("increment the byte at offset 12") is making the caller hold both levels in mind. P. J. Plauger: "It ain't abstract if you have to look at the underlying implementation to understand what's going on."

4.3 Encapsulate Implementation Details

The complement to abstraction. Once a thing is abstracted, the details must be inaccessible from outside, or the abstraction will erode the moment someone takes a shortcut.

4.4 Inherit Only When Inheritance Simplifies the Design

Inheritance is the most heavily qualified heuristic in the book. McConnell calls it "a dangerous technique" and quotes Joshua Bloch (Item 15, Effective Java Programming Language Guide, Addison-Wesley, 2001): "Design and document for inheritance or else prohibit it." His decision rule:

  • If multiple classes share data but not behavior, create a common object they contain.
  • If multiple classes share behavior but not data, derive them from a common base class that defines the common routines.
  • If multiple classes share both data and behavior, inherit from a common base class.
  • Inherit when you want the base class to control your interface; contain when you want to control your interface.

The single most important rule McConnell highlights, citing Liskov: do not inherit from a base class unless the derived class is truly a more specific version of the base. If the "is-a" relationship is shaky, prefer composition. Inheritance increases coupling between classes more than almost any other relationship.

4.5 Hide Secrets (Information Hiding)

This is the heuristic McConnell singles out as having "unique heuristic power." It originates in David Parnas, "On the Criteria to Be Used in Decomposing Systems into Modules," Communications of the ACM, vol. 15, no. 12, December 1972. The discipline is: ask of every class, every routine, every interface, what does this hide? If the answer is "nothing important," the abstraction is probably wrong.

McConnell distinguishes two categories of secrets:

  • Hiding complexity so you do not have to deal with it unless you are working on it directly.
  • Hiding sources of change so when change happens, the effect is localized.

Information hiding is the discipline that makes loose coupling possible in practice. It is also a check against the AI failure mode of generating classes that "expose all of a class's private data" because doing so saves a few lines of access code.

4.6 Identify Areas Likely to Change

A standard McConnell list of likely-to-change areas, expanded for modern systems: business rules and policy, hardware and platform dependencies, input and output formats, non-standard language features and library calls, difficult algorithms, complex or hard-to-design data structures, status variables, and any place where requirements are vague or contested. For each, separate the volatile design decision into its own class or routine and place a stable interface in front of it. The standard pattern: identify items that seem likely to change, separate them, and isolate them so a change in one does not cause a cascade.

4.7 Keep Coupling Loose

Coupling is judged on three axes: size (the number of connections), visibility (prominent connections through parameter lists are better than hidden ones through globals or shared state), and flexibility (how easily the connection can be changed). McConnell ranks coupling kinds from loose to tight, and the bottom of the list deserves explicit attention because it is invisible and pernicious:

  • Simple-data-parameter coupling (passing primitives via parameters): normal and fine.
  • Simple-object coupling (one class instantiates another): fine.
  • Object-parameter coupling (A passes B an instance of C, so both must know C): tighter, use sparingly.
  • Semantic coupling: the worst kind. One module relies not on another module's interface but on its inner workings: a control flag that triggers a specific internal branch, a global variable that must be in a specific state, an unstated requirement that one routine be called before another, an assumption that a sort is stable. Semantic coupling does not break the compile when the depended-upon code is changed; it breaks at runtime, silently, sometimes much later. AI agents should be especially wary of semantic coupling because it is easy to introduce by inferring "what the code seems to do" rather than what it formally promises.

4.8 Look for Common Design Patterns

Patterns reduce complexity by giving a name to a recurring structure, lower the chance of error by codifying solutions that have been worked through, and accelerate communication. Patterns are a vocabulary, not a goal. Reach for one when the problem fits, not because the pattern is in fashion.

McConnell also lists secondary heuristics, all worth keeping in mind: aim for strong cohesion, build hierarchies, formalize class contracts, assign responsibilities clearly, design for testability, anticipate failure modes, choose binding time consciously (compile-time vs. run-time vs. configuration), make central points of control, consider brute force when it works, draw a diagram, and keep the design modular.


5. Levels of Design and Where to Spend Effort

McConnell describes five levels of design (Chapter 5):

  1. The system as a whole. Architectural choices that affect everything.
  2. Subsystems and packages. The big organizational units (data access, business rules, user interface, platform interface). The crucial design activity here is restricting communication between subsystems. With no rules, every subsystem ends up talking to every other subsystem, and the benefit of separation evaporates. A useful default rule: the subsystem-level diagram should be acyclic. If A uses B, B uses C, and C uses A, you have lost the ability to understand any of them in isolation.
  3. Classes within subsystems. Defining the things and their interfaces.
  4. Routines within classes. Decomposing class behavior.
  5. Internal routine design. The logic inside each routine.

Two practical implications for an agent. First, when modifying an existing codebase, understand what level you are operating at before making changes. A change that looks local at the routine level may violate a subsystem contract. Second, when generating new code at the system or subsystem level, decide explicitly what is allowed to talk to what. Restricting communication early is much cheaper than untangling it later.


6. High-Quality Routines and Classes: The Things That Actually Matter

The fine-grained details of routine design (parameter ordering, naming conventions, line counts) are areas where modern AI agents already perform reasonably well, so this section concentrates on the conceptual points that are most often violated.

Cohesion is a stronger constraint than length. McConnell's strongest endorsement is for functional cohesion: a routine performs one and only one operation. Sequential cohesion (steps that must happen in order, all using a shared data flow) is acceptable. Communicational cohesion (operations that share data but are otherwise unrelated) is weaker. Temporal cohesion (operations grouped because they happen "at the same time," like initialization) is weaker still and should be treated as an organizer of other functions, not a unit of meaning. Logical cohesion (a flag-driven routine that does one of several unrelated things) and coincidental cohesion (no relationship between parts) are smells.

Length should be a consequence, not a goal. McConnell's guidance, supported by decades of evidence: routines of 100 to 200 noncomment, nonblank lines are not inherently more error-prone than shorter ones. Let cohesion, depth of nesting, number of variables, number of decision points, and the number of comments needed to explain the routine drive the length. Routines that try to do too much are the problem, not routines that are physically long because the algorithm is long.

Parameters reveal coupling problems. If a routine takes too many parameters, or the same group of parameters keeps moving through several routines together, the design is asking you to bundle them into an object. If a routine needs more features of another class than its own (feature envy), it is in the wrong class.

Classes to avoid.

  • God classes that know about and orchestrate everything else, especially classes whose work consists of Get/Set calls into other classes telling them what to do; the work probably belongs in those other classes.
  • Data-only classes with no behavior, which often want to be a property of another class rather than a class.
  • Classes named after verbs (StringBuilder, DatabaseInitialization) that are really procedures masquerading as objects; usually they should be routines on a noun-named class.
  • Classes with too many data members (more than about seven is the warning threshold) or inheritance trees deeper than a few levels.

Abstract Data Types are the foundation. A class is, properly, an Abstract Data Type: a collection of data and the operations on that data, expressed at the level of the problem (employees, fonts, orders) rather than at the level of the implementation (lists, hashes, strings). Programming with ADTs is what McConnell calls "programming into your language, not in it" (Chapter 34): you decide what you want to express, then build the abstraction, rather than letting the language's primitives dictate your structure.


7. Defensive Programming, Without the Paranoia

Chapter 8's defensive programming chapter provides a calibrated philosophy of error handling. The keystone distinction:

  • Use assertions for conditions that should never occur. A failed assertion is a bug in the program, not a runtime condition. Assertions document the programmer's beliefs about preconditions, postconditions, and invariants. McConnell calls them "executable documentation."
  • Use error-handling code for conditions you expect to occur, including conditions that are impossible in your code's logic but possible in inputs from the outside world.

A second crucial distinction is between correctness (never returning an inaccurate result; better to return nothing) and robustness (always trying to keep operating, even if results are sometimes inaccurate). Safety-critical systems lean toward correctness; consumer applications usually lean toward robustness. The choice is a design decision and must be made consciously, not left to whichever line of code the agent generates first.

The pattern McConnell recommends for input validation is the barricade: define a clear boundary between code that handles "dirty" untrusted data (external inputs, user input, third-party API responses, parsed files) and code that handles "clean" trusted data. At the barricade, validate aggressively. Inside the barricade, you can rely on assertions and assume invariants hold. This avoids the alternative of defensive checks scattered throughout every function, which both bloats the code and gives a false sense of security.

Two principles round out the philosophy. First, "Be defensive about defensive programming": too much defensive code obscures the main logic and itself becomes a source of bugs. Second, exceptions, like inheritance, are powerful and dangerous. Used judiciously they reduce complexity; used imprudently they make control flow nearly impossible to follow. Throw them only for genuinely exceptional conditions, document what each routine throws, and do not use them as a glorified goto.


8. Construction Practices That Most Influence Maintainability

Several of McConnell's recommendations cut against common AI code-generation patterns.

Iterate in small steps and verify often. McConnell's preferred construction style is incremental: design a small piece, write it, test it, integrate it, then expand. The Pseudocode Programming Process (Chapter 9) is one expression of this: before writing code for a non-trivial routine, sketch its logic in plain English at a level above the code, refine the sketch until each statement is small enough to translate into a few lines of code, then write the code beneath each pseudocode statement. The pseudocode becomes the comments. The deeper point is not the specific technique but the discipline of thinking in the problem domain before committing to syntax. An AI agent that drafts pseudocode, criticizes it, then writes code will produce more cohesive routines than one that streams code from the first token.

Table-driven methods (Chapter 18). When code needs to choose among many cases, a lookup table is often clearer, shorter, and more maintainable than a long if/else or switch. Tables are particularly powerful for business rules, configuration, and state machines: changes mean editing data, not editing logic. Logic statements are easier for simple cases, but as the number of cases grows, table-driven code scales better. Consider tables whenever you are tempted to write more than a handful of parallel if arms.

Keep variable scope tight and "live time" short. Localize references to variables. Keep statements that work with the same variable close together. Avoid global data; it dramatically increases the percentage of the program you have to think about at once. Each variable should have one purpose and one only.

Boolean expressions and control structures. Form boolean expressions positively. Use parentheses to make precedence explicit. Write numeric expressions in number-line order (MIN <= x && x <= MAX). Decompose complex conditions into named boolean variables or small functions whose name explains the test. The goal is for a reader to understand the condition without simulating it in their head.

Self-documenting code first, comments second. McConnell strongly endorses self-documenting code: clear naming, good decomposition, expressive types. Comments explain why, not what. A comment that explains what a difficult line does is a sign the line should be rewritten. "Don't document bad code; rewrite it."


9. When Complexity Is Warranted

Not every increase in apparent complexity is bad. Two situations justify it:

  • Inherent complexity in the problem. If the domain is genuinely complex (concurrent systems, real-time constraints, regulatory rules with many cases, numerical algorithms), the code will be complex. Pretending otherwise produces toy solutions that fail in production. In these cases, push the complexity into a small number of well-isolated, well-documented modules and keep the surrounding code simple.
  • Unusual control structures with clear local benefit. Chapter 17 covers cases where things like multiple returns, recursion, or even a goto are the simplest expression. Multiple returns are fine when they make the routine clearer (especially as guard clauses at the top). Recursion is right for problems with recursive structure (trees, parsers) and wrong for problems that fit a loop. The principle is the same: choose the structure that minimizes complexity for the reader, not the one that follows a rule.

The contrary mistake is to add complexity that is not warranted. Speculative generality (an extension point for a feature no one has asked for), overuse of design patterns when a function would do, deep inheritance hierarchies, exception-driven control flow used as alternative return values, and elaborate type-level encodings that obscure the runtime behavior are all common ways AI-generated code increases complexity without paying for it.


10. Refactoring as a Continuous Discipline

McConnell defines refactoring (Chapter 24) as a change to the internal structure of software that makes it easier to understand and cheaper to modify, without changing observable behavior. The cardinal rule of software evolution is that programs should get better over time, not worse. Entropy is the default; refactoring is the work that opposes it.

The reasons-to-refactor list is the field guide for AI agents working on existing code. Treat any of these as a signal that something needs attention, not necessarily immediate action:

  • Code is duplicated.
  • A routine is too long, or a loop is too long or too deeply nested.
  • A class has poor cohesion or an interface that does not present a consistent level of abstraction.
  • A routine has too many parameters; or a group of parameters consistently moves together (they want to be a class).
  • Changes within a class are compartmentalized to one part (it wants to be its own class), or changes require parallel modifications across multiple classes (the boundaries are wrong).
  • Inheritance hierarchies have to be modified in parallel.
  • Related data items used together are not organized into a class.
  • A routine uses more features of another class than of its own (feature envy).
  • A primitive data type is overloaded with meaning that wants to be a class (a string that is really an email address).
  • A "lazy" class does very little; merge it. A middleman just delegates; eliminate it.
  • Tramp data passes through a chain of routines just so a deep callee can use it.
  • One class is overly intimate with another's internals.
  • A routine has a name that is hard to write because the abstraction is muddled.
  • Data members are public.
  • A subclass uses only a small fraction of its parent's routines (the "is-a" relationship is wrong).
  • Comments are required to explain difficult code.
  • Global variables are used.
  • A routine requires setup or takedown logic at every call site.
  • The program contains code "that might be needed someday" (speculative generality, YAGNI).

Equally important is the safe practice of refactoring. Do refactorings one at a time, recompiling and retesting between each. Treat small changes with the same seriousness as large ones; programmers reliably underestimate the risk of changes that look trivial. Keep refactoring and feature work separate when you can: mixing them makes it impossible to tell which change broke something. McConnell's 80/20 rule applies: spend time on the 20 percent of refactorings that yield 80 percent of the benefit.

When code is so structurally wrong that refactoring keeps producing new refactorings, the answer may be to throw the section out and rewrite it. McConnell is explicit that this is sometimes the right answer.


11. Performance and Code Tuning: Measure, Then Decide

Chapter 25 makes a sharp distinction between program performance, which depends on architecture, algorithms, and data structures, and code tuning, which is the local rewriting of small sections for speed. The order matters: nearly all real performance problems are solved at the design level, not by tweaking inner loops.

McConnell endorses a strict version of Donald Knuth's doctrine on premature optimization, given verbatim in "Structured Programming with Go To Statements," ACM Computing Surveys, December 1974: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." The Pareto principle applies: a small fraction of the code is responsible for most of the runtime, and the only reliable way to find that fraction is to profile. Optimizing without profiling is optimizing the wrong code, and it has a real cost: tuned code is almost always less clear, harder to maintain, and more likely to harbor subtle bugs.

The discipline:

  1. Write clean, well-structured code that follows the design principles above.
  2. If performance matters, set explicit performance goals.
  3. Profile against the goals.
  4. Tune only the bottlenecks the profiler identifies, and only as far as needed to meet the goals.
  5. Measure each tuning change; many "optimizations" make things slower or have no effect on modern compilers and hardware.
  6. Keep the untuned version available so you can revert when the optimization stops paying off, which it eventually will.

For an AI agent, the practical rule is: do not generate "optimized" code by default. Generate code that is correct and clear. Optimize when there is a measured reason to, and document why.


12. Programs Are Written for People to Read

Chapter 34 is McConnell's synthesis chapter, and its themes are the moral of the entire book.

Conquer complexity. Repeated because everything else flows from it. When projects fail for technical reasons, the cause is almost always that the code grew complex enough that no one knew what it did anymore.

Pick your process. A good outcome requires attention to quality from requirements through deployment, not heroic effort at the end. Small, high-discipline steps beat sporadic big rewrites.

Write programs for people first, computers second. McConnell's most quoted line: "It's OK to figure out murder mysteries, but you shouldn't need to figure out code. You should be able to read it." Code is read many more times than it is written, and the next reader, including a future instance of the author or a future AI agent, has less context than the original author. Optimizing for human readability is the highest-leverage investment in software.

Program into your language, not in it. Decide what you want to express, then express it using the tools the language gives you. If the language lacks a useful construct, simulate it with conventions, helpers, or libraries. Do not let the language's primitives dictate your design.

Use conventions to focus attention. Conventions are not interesting in themselves, but they free attention. When naming, formatting, and error-handling are predictable, the human mind has more room for the design.

Program in terms of the problem domain. Top-level code should read in terms of customers, invoices, and pages, not arrays, pointers, and bytes. Abstraction layers exist to push code higher up the conceptual stack.

Watch for falling rocks. Pay attention to warning signs. A routine you cannot name. A class that keeps growing. Code you are afraid to change. Comments that apologize. A bug that keeps coming back in the same module. A "temporary" hack that has been there for a year. None of these is proof of error, but each is a sign to slow down. McConnell calls these "code smells" before that term was popularized; the response is the same. Investigate, do not paper over.

Iterate, repeatedly, again and again. Iteration is fundamental at every level: requirements, design, code, debugging, testing, refactoring, integration. The best programmers do not get it right the first time; they get it right by getting it wrong faster than everyone else.

Avoid dogma. McConnell's last theme is "Thou shalt rend software and religion asunder." Do not treat any methodology, language, framework, or design pattern (or any line in this document) as gospel. Use the best technique for each situation, drawing freely from multiple sources. Real software practice is too varied for any single ideology to be right everywhere.


Closing Note: The Spirit of Construction

The point of Code Complete is not the rules; it is the intellectual honesty the rules support. McConnell devotes a chapter to the personal characteristics of effective programmers, and the ones he singles out are humility about your own correctness, curiosity about how things actually work, willingness to change your beliefs based on evidence, creativity disciplined by craft, and what he calls "enlightened laziness," the instinct to find the simplest path that works.

For an AI coding agent, the most important translation of these traits is this: do not pretend to certainty you do not have. If a generated solution is plausible but unverified, say so. If a refactor might have effects you cannot predict, narrow the change. If a test would tell you whether the code works, write the test. If the design is complex and you cannot say why, the design is probably wrong. The code is for the next human or agent who has to read it; act accordingly.