What the flowchart hides

Every programmer in 1972 knew how to break a program into modules: you drew a flowchart. Read the input, process it, write the output — one box per stage, arrows between them, each box becomes a module. David Parnas, a thirty-one-year-old computer scientist at Carnegie-Mellon University in Pittsburgh, read those flowcharts and concluded that almost every programmer was doing it wrong.

His argument appeared in the December 1972 issue of Communications of the ACM — five pages, one running example, and a principle that has since quietly embedded itself into every API, class, and service boundary in software. The principle was information hiding: module boundaries should be drawn around design decisions, not processing steps.

The demonstration used a program most of his readers could have written themselves: a KWIC index generator. KWIC stands for Key Word In Context. The system takes a set of input lines, forms every circular shift of each line by repeatedly moving its first word to the end, and outputs all the shifts in alphabetical order. Given “Turing machines and computability,” it outputs “and computability Turing machines,” “computability Turing machines and,” and every other rotation. A homework exercise. Parnas decomposed it two different ways and put them side by side.
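The whole KWIC task fits in a few lines, which is part of why it works as a running example. The sketch below is a minimal illustration, not code from the paper: the function name `kwic` and the choice of case-insensitive sorting are assumptions made here.

```python
def kwic(lines):
    """Return every circular shift of every line, sorted alphabetically.

    Sorting ignores case; that is an assumption of this sketch.
    """
    shifts = []
    for line in lines:
        words = line.split()
        # Shift k moves the first k words to the end of the line.
        for k in range(len(words)):
            shifts.append(" ".join(words[k:] + words[:k]))
    return sorted(shifts, key=str.lower)


index = kwic(["Turing machines and computability"])
for entry in index:
    print(entry)
# and computability Turing machines
# computability Turing machines and
# machines and computability Turing
# Turing machines and computability
```

Written this way the program has no modules at all. The question Parnas asked only becomes interesting once the work is split across several of them.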

The conventional decomposition followed the flowchart, with five modules for five processing steps: Input, Circular Shifter, Alphabetizer, Output, Master Control. Tidy. Each module matched a stage in the pipeline. Parnas’s decomposition had a similar surface: most of the same module names, plus a sixth module, Line Storage. The difference was in what each module hid. His Line Storage module hid the data structure: whether lines were stored as a flat array, a linked list, or something else. His Circular Shifter hid how shifts were represented. Change the internals of any one module and nothing else in the system needs to know.
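One way to picture what Line Storage hides is to write it twice. The sketch below is an illustration under stated assumptions: the class names and the methods `add_line`, `line_count`, `word_count`, and `word` are hypothetical and word-level for brevity, not the interface defined in the paper. The two classes keep lines in different layouts, and the caller `render_line` cannot tell them apart.

```python
class ListLineStorage:
    """Keeps each line as its own list of words."""

    def __init__(self):
        self._lines = []

    def add_line(self, text):
        self._lines.append(text.split())

    def line_count(self):
        return len(self._lines)

    def word_count(self, line):
        return len(self._lines[line])

    def word(self, line, index):
        return self._lines[line][index]


class FlatLineStorage:
    """Keeps all words in one flat list, plus a start offset per line."""

    def __init__(self):
        self._words = []
        self._starts = []

    def add_line(self, text):
        self._starts.append(len(self._words))
        self._words.extend(text.split())

    def line_count(self):
        return len(self._starts)

    def word_count(self, line):
        # A line ends where the next one starts, or at the end of the list.
        if line + 1 < len(self._starts):
            end = self._starts[line + 1]
        else:
            end = len(self._words)
        return end - self._starts[line]

    def word(self, line, index):
        return self._words[self._starts[line] + index]


def render_line(storage, line):
    """A caller that uses only the interface, never the layout."""
    count = storage.word_count(line)
    return " ".join(storage.word(line, i) for i in range(count))


for storage in (ListLineStorage(), FlatLineStorage()):
    storage.add_line("Turing machines and computability")
    storage.add_line("information hiding")
    print(render_line(storage, 1))  # same text from either layout
```

Swapping one class for the other changes nothing in `render_line`. That is the property the second decomposition was designed to have.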

This was the insight that took time to land: the two decompositions look nearly identical until you ask what happens when a design decision changes. In the conventional version, that change propagates — the alphabetizer might depend on how lines are stored, the output module on how shifts are indexed. In Parnas’s version, the change stays inside its module. “It is almost always incorrect,” Parnas wrote, “to begin the decomposition of a system into modules on the basis of a flowchart.” The flowchart captures the order of execution. It never captures the likely sources of change — and those are exactly what module boundaries need to contain.
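The same test can be applied to the Circular Shifter. In the sketch below, which is a hypothetical illustration rather than the paper's design, the class names and the methods `shift_count` and `shift_text` are assumptions. One shifter records each shift as a (line, offset) pair and the other stores full copies of every rotated line. The `alphabetize` function sees only the two methods, so the change in representation never reaches it.

```python
class OffsetShifter:
    """Records each shift as a (line, offset) pair; stores no rotated copies."""

    def __init__(self, lines):
        self._lines = [line.split() for line in lines]
        self._index = [
            (n, k)
            for n, words in enumerate(self._lines)
            for k in range(len(words))
        ]

    def shift_count(self):
        return len(self._index)

    def shift_text(self, i):
        n, k = self._index[i]
        words = self._lines[n]
        return " ".join(words[k:] + words[:k])


class CopyShifter:
    """Stores every shift as a fully built string."""

    def __init__(self, lines):
        self._shifts = []
        for line in lines:
            words = line.split()
            for k in range(len(words)):
                self._shifts.append(" ".join(words[k:] + words[:k]))

    def shift_count(self):
        return len(self._shifts)

    def shift_text(self, i):
        return self._shifts[i]


def alphabetize(shifter):
    """Sorts shifts using only shift_count() and shift_text()."""
    texts = [shifter.shift_text(i) for i in range(shifter.shift_count())]
    return sorted(texts, key=str.lower)


lines = ["Turing machines and computability", "information hiding"]
same = alphabetize(OffsetShifter(lines)) == alphabetize(CopyShifter(lines))
print(same)  # True
```

In a flowchart-style design the alphabetizer would read the shifter's index table directly, and replacing pairs with copies would mean rewriting both modules.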

The paper was five pages and used a homework problem. Parnas could have reached for an operating system or a database. He chose the smallest example that made the principle visible, which was itself an act of information hiding: strip away every distraction, reveal only the argument.

The ideas moved slowly at first, then all at once. Languages of the 1970s and 1980s, among them CLU, Modula-2, and C++, adopted the vocabulary of encapsulation, which is information hiding with a language construct drawn around it: a cluster, a module, a class. Keywords that separate public from private descend from the same idea. Microservices, roughly four decades later, are information hiding at the network boundary: the service exposes an interface and conceals everything else.

Every interface you have written — in a class, a module, a REST endpoint — is a commitment about what the caller does not need to know. Parnas named that commitment in five pages in 1972. The commitment has grown larger. The principle has not changed.

Sources

On the Criteria To Be Used in Decomposing Systems into Modules — Communications of the ACM — the 1972 paper; both KWIC decompositions; Parnas’s direct quote about flowcharts.
David Parnas — Wikipedia — biography, Carnegie-Mellon context, career.
On the criteria to be used in decomposing systems into modules — the morning paper — close reading of the two decompositions and their implications for modern system design.
