Model-code gap

From Just Enough Software Architecture book section 10: The Code Model

Source code is both the end deliverable and the medium in which solutions are expressed. Architecture models are not the end deliverable, so they are useful only when they can be related to the code. Consequently, it is important to understand the relationship between architecture models and code.

At first glance, that relationship may seem straightforward. For example, a model that talks about modules or components is easy to relate to the corresponding code elements. But models can also include ideas that are hard to relate, such as, “A lock must be held before every access to this data.” You can relate that architectural idea to code, but it is not a straightforward structural correspondence. There is a gap between architecture models and code.

Model-Code Gap

To begin understanding the differences between architecture models and code, it is helpful to start with an inventory of what each of them contains. Figure 10.1 summarizes the kinds of elements you commonly find in architecture models and in source code. As you scrutinize the lists of elements, you can notice differences in their vocabulary, abstraction, design commitments, and presence of general/enumerated (i.e., intensional/extensional) statements. Let’s take a look at each of those differences.

Location Examples of elements
Architecture model Modules, components, connectors, ports, component assemblies, styles, invariants, responsibility allocations, design decisions, rationales, protocols, quality attributes, and models (e.g., security policies, concurrency models)
Source code Packages, classes, methods, variables, functions, procedures, statements

Figure 10.1: A summary of the kinds of elements commonly found in architecture models and in source code. There are differences in vocabulary, abstraction, design commitments, and presence of intensional/extensional elements.

Vocabulary. A simple comparison of the lists of elements in architecture models and source code reveals that they use different vocabulary to talk about the same things. For example, architecture models contain modules while source code contains packages, which is a nomenclature difference, but in essence the same thing.

Other vocabulary differences exist because architecture models and code express distinct ideas. Consider a thought experiment where you express your architecture model in UML and you also automatically generate a UML model of your source code. When you compare these two UML models, you will find differences. For example, your source code model does not express the component types or instances found in your architecture model. While method call connectors and event bus connectors both show up in your architecture model, only method calls are seen in your source code model. The architecture model and source code have different vocabulary because each expresses ideas the other does not.

Abstraction. Architecture models tend to be more abstract than source code in two ways. First, a single element in an architecture model often aggregates multiple elements in source code. For example, a component type in an architecture model may map to a dozen classes in the source code. Similarly, an architecture model may show a client or a server, each corresponding to many classes or procedures.

Second, when they describe the same element, architecture models generally describe that element with fewer details than source code does. An architecture model may stop its descriptions once it reaches the level of modules and components, but source code continues the detail through classes, methods, and instance variables. If you imagine a gradient on which you place different kinds of elements, architecture models contain the more abstract elements, source code contains the less abstract elements, and the two overlap by both including some of the elements in the middle of the gradient.

Design commitments. Another difference is that architecture models and source code do not both contain the same design commitments. An architecture model may commit to the use of some technologies (e.g., AJAX and REST) but source code goes much farther and commits to how those technologies are implemented. Architecture and design models can make a partial commitment, but source code must make a full commitment, or at least enough of a commitment to be executable. For example, it may be sufficient in an architecture model to state a quality attribute scenario that account lookups happen with 0.25s, but source code will describe the data structures and algorithms necessary to make that happen.

Intensional-extensional. Perhaps the biggest difference between architecture models and source code is that architecture models contain a mixture of intensional and extensional elements, while code has only extensional elements. Intensional elements are those that are universally quantified, such as “All filters communicate via pipes,” while extensional elements are enumerated, such as “The system is composed of a client, an order processor, and an order storage components.” Figure 10.2 lists which architecture elements are intensional and extensional.

Intensional / Extensional Architecture model element Mapping into source code
Extensional (defined by enumerated instances) Modules, components, connectors, ports, component assemblies These correspond neatly to elements in the source code, though often at a higher level of abstraction (e.g., one component corresponds to multiple classes)
Intensional (quantified across all instances) Styles, invariants, responsibility allocations, design decisions, rationales, protocols, quality attributes, and models Source code will conform to these, but they are not directly expressed in the code. Architecture model has general rules, code has examples.

Figure 10.2: Tabulation of various architectural elements and how they map into code. Extensional elements in the architecture map fairly cleanly to code elements, but intensional elements do not.

The distinction between intensional and extensional elements in architecture and code, identified by Amnon Eden and Rick Kazman (Eden and Kazman, 2003), is important because it explains which parts of the architecture model will be harder to map into the source code. Since source code is extensional, extensional elements of the architecture model, like components and component assemblies, are easy to map into the source code. Recall the component type from the Yinzer system design model that was called Contacts. That component would correspond to several classes in the source code. You can even imagine minor changes to the programming language to let you express components directly. For example, ArchJava adds architectural elements like components and ports to Java (Aldrich, Chambers and Notkin, 2002).

Conversely, it is hard to relate intensional elements, like design decisions, styles, and invariants to the (extensional) source code. Intensional elements establish general rules that apply to all elements, but standard programming languages cannot directly express these rules. Though source code cannot express the rules, it should respect the rules. So, for example, if your architecture model has a design decision (an intensional element) that says to avoid using vendor-specific API’s, you cannot express that rule in your C++ source code, but none of your code should use those API’s. When you look at source code, you cannot see the design intent of the intensional elements, but the code should respect that design intent.

Model-code gap. Your architecture models and your source code will not show the same things. The difference between them is the model-code gap. Your architecture models include some abstract concepts, like components, that your programming language does not, but could. Beyond that, architecture models include intensional elements, like design decisions and constraints, that cannot be expressed in procedural source code at all.

Consequently, the relationship between the architecture model and source code is complicated. It is mostly a refinement relationship, where the extensional elements in the architecture model are refined into extensional elements in source code. This is shown in Figure 10.3. However, intensional elements are not refined into corresponding elements in source code.

Model-code refinement

Figure 10.3: Extensional elements in the design model have a refinement relationship to the source code. Intensional elements do not, since they are rarely expressible in the source code, and contribute to the model-code gap.

Upon learning about the model-code gap, your first instinct may be to avoid it. But reflecting on the origins of the gap gives little hope of a general solution in the short term: architecture models help you reason about complexity and scale because they are abstract and intensional; source code executes on machines because it is concrete and extensional.

Attempts to avoid the gap. People react differently when they hear about the model-code gap. Some see the difficulty as an opportunity to retreat to known ground by avoiding architectural abstractions entirely. However, this would be just one step away from the big ball of mud architecture (Section 14.7). Your ability to handle complexity and scale would be greatly diminished because architecture abstractions exist to give you a handle complexity and scale. While there are difficulties in using the abstractions, managing a sea of classes is probably more difficult.

If you cannot avoid the abstraction gap then you must manage it. There are two primary ways to manage the gap: mechanically and by hand. Mechanically, it may be possible to write in a higher-level, N-th generation language and generate the source code. This is the technique of application builder/generators and Model Driven Engineering (MDE) (Selic, 2003b). By generating code, the gap between the architecture models and the higher-level language is reduced or even eliminated compared to writing normal source code. In a few domains this approach is practical, but the grand vision of MDE is not yet ready for mainstream use.