What is (still) hard about software architecture

A while back I posted a list of things that are hard about software architecture, and this posting is an elaboration on that list that will make its way into the software architecture book. Comments on this list are welcome. In fact, if you have a pet peeve about software architecture then it can probably be refactored into an entry in this list.

Process: What to do when

Cost-benefit analysis. We have seen that by identifying risks we can selectively perform architecture modeling to reduce risks in order of their priority. While this is better than simply guessing how much architecture work is enough, it leaves much to be desired. It is difficult to predict and prioritize risks, so choosing the right amount of architecture is also difficult. Engineers will have different opinions of risks and priorities, so you may find yourself trusting one engineer’s estimates over another’s.

The community of agile software developers has an ongoing discussion of how much advance planning needs to be done, with some arguing that most projects are better off with no architecture planning. The best developers have excellent design instincts and simply following their hunches can lead to success.

Behavior. Architecture models that only describe quality attributes tend to reach a natural level of detail in modeling, but models that include functionality and behavior can easily be elaborated until they describe data structures and operations. Architecture modeling can transition into design, then detailed design, then a paper-based version of coding. This ability to go deep is a benefit because we can dig into details when needed, but a challenge because we must decide when to dig in and when to resist. Time spent modeling has an opportunity cost: time spent building the system.

Evaluating alternative architectures. Seen from a distance, evaluating alternative designs is as simple as building a few models and evaluating how well each enables the desired qualities. In practice, evaluating alternatives is difficult because the devil often lives in the details, and those details may not have been elaborated yet. Since we have not yet committed to a design we are hesitant to spend much time adding details, but we may not discover problems until we look at details like the specifics of the external APIs we must use, or build a prototype to learn actual performance figures.

Expressability / analyzability

Non-static component configurations. Most systems settle down into a stable set of runtime component instances, even though the configuration changes during initialization. When we draw diagrams showing the runtime configuration of component instances, we usually simplify the problem by drawing only the steady state. Such a diagram looks reasonable at first glance, but it is static, not dynamic, and it omits the intermediate configurations that exist during startup and shutdown. We make the simplification because reasoning about dynamic configurations is hard and we have few tools to make it easier.

It is reasonable to consider only the steady-state configuration when startup and shutdown are straightforward, but we can easily imagine counter-examples. We even know that some systems change at runtime, though we generally avoid this because of the possibility of failure. Peer-to-peer systems evolve at runtime into different configurations of components, as do frameworks that can dynamically load new components. It is difficult to convince yourself that runtime reconfiguration like this is free from problems, so as developers we tend to avoid it, but some problem domains demand it.

Shoehorning abstractions. The transition to structured programming saw some developers complaining that they could not express their existing programs in the new, more constrained, programming languages. They argued that their old programs were more efficient and perfectly understandable, so the new constraints were undesirable.

A similar transition happens when moving from a sea of unconstrained object-oriented designs into a world constrained by a set of architectural abstractions like components and connectors. At first you may look to your existing designs and find that parts of them do not match up well with the new abstractions. Perhaps you could have designed them to fit, but looking at them now you see that they do not fit, and you are tempted to reject the new abstractions. After a while you will become accustomed to these abstractions, and you will feel comfortable building within the limited set, but the transition can be uncomfortable.

Frameworks. Frameworks present a particular example of shoehorning abstractions because the interaction between client code and a framework does not neatly align with the standard architectural abstractions. Frameworks provide complex, wide interfaces to the clients that use them, often exposing the implementation details of the framework. Ports, in contrast, provide narrow interfaces and encourage encapsulation. Some frameworks can be represented as components because they exist at runtime, while other frameworks, especially older ones, are collections of code that cannot be instantiated until augmented with client code.

Representing concurrency. Concurrency has always been one of the most challenging problems in developing systems. Novice developers may relish the challenge and seek out opportunities for concurrency, but jaded developers view it warily as a source of difficult bugs and are happy to get it right then leave it alone. Concurrency is introduced into systems either through forces in the problem domain or by a desire to improve a quality attribute, such as performance or usability.

With a clean-slate design you might be able to perfectly align the threads or processes in your system with the component boundaries. If so, you can annotate the components and connectors to indicate the concurrency. Anytime a concern cross-cuts your decomposition there will be trouble expressing it, and concurrency is particularly difficult.

View consistency. The ubiquitous advice on software architecture is to build multiple views of your system. We do this because each view can focus attention on one aspect, some views cannot be easily reconciled (recall the definition of a viewtype), and creating a single view would create a muddle of details that defeats the purpose of having a model. Reasoning from a particular view means having to separately reconcile the views for consistency. You might find one arrangement of windows on the outside of the house aesthetically pleasing, but a different arrangement of windows leads to good lighting in a room, yet these views must be consistent.

The challenge is to detect consistency problems, preferably sooner rather than later. Some view consistency problems are simply cruft: you have updated one view but have not yet updated the others. Other consistency problems stem from design errors and may lead to un-buildable designs.
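As a rough illustration, here is a minimal sketch in Python of one mechanical check between two views. It assumes, purely for illustration, that a runtime view can be reduced to a set of component names and a deployment view to a mapping from node names to the components hosted there. The function name `check_allocation` and all of the element names are hypothetical.

```python
def check_allocation(runtime_components, deployment):
    """Cross-check a runtime view against a deployment view.

    runtime_components: set of component names (hypothetical runtime view).
    deployment: dict mapping node name -> set of component names hosted there.
    Returns a list of human-readable inconsistencies; empty means consistent.
    """
    deployed = set().union(*deployment.values())
    problems = []
    for name in sorted(runtime_components - deployed):
        problems.append(f"{name} is not deployed on any node")
    for name in sorted(deployed - runtime_components):
        problems.append(f"{name} is deployed but missing from the runtime view")
    return problems


# Illustrative views: one component was added to each view without
# updating the other.
runtime_view = {"WebUI", "OrderService", "Database"}
deployment_view = {
    "web-node": {"WebUI"},
    "db-node": {"Database", "AuditLog"},
}

for problem in check_allocation(runtime_view, deployment_view):
    print(problem)
```

A check like this catches only the cruft kind of inconsistency, where a name is present in one view and absent from another. It says nothing about design errors such as the window arrangement example, which require understanding what the views mean.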

Precision. It is difficult to know when your model is precise enough, or detailed enough. The general advice is to make the model precise enough to answer the questions you will ask of it, or sufficient to reduce the risks you perceive. However, you may not be able to perceive the risk until after you have built the detailed model. Analytic models generally require more precision and are more expensive than analogic models.

Design

Promoting details. Choosing which details to promote to the architectural level is difficult. The challenge is to select the relevant details for the model while keeping the model minimally sufficient. Different developers are likely to choose different details, which means that some models will be better than others, yet we do not have guidance on how best to choose.

You may build a model of a component one day, then later decide to use it in a concurrent setting, but your model does not show if the code is thread-safe. Generally, a model built for one purpose will work for another purpose only if you are lucky. You were not wrong to omit that detail from your model if it was able to answer the questions originally asked of it.
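The thread-safety example can be made concrete with a small Python sketch. It assumes a model is nothing more than a named element with a dictionary of custom properties. The class `ComponentModel`, its `answer` method, and the property names are hypothetical. The point is only that a model should distinguish "no" from "the model does not say".

```python
from dataclasses import dataclass, field


@dataclass
class ComponentModel:
    """A hypothetical, minimal component model: a name plus custom properties."""
    name: str
    properties: dict = field(default_factory=dict)

    def answer(self, question):
        """Return the recorded value, or None when the model is silent."""
        return self.properties.get(question)


# The model was built to answer a question about latency, and it does.
cache = ComponentModel("Cache", {"max_latency_ms": 5})

print(cache.answer("max_latency_ms"))  # the original question is answered
print(cache.answer("thread_safe"))     # the new question is not: None, not False
```

Treating a missing property as unknown rather than false keeps the original model honest. It answered what was asked of it, and the new question sends you back to the code or to the author.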

Modeling connectors and ports. Connectors provide a developer with great flexibility in expressing how components communicate. Imagine two components, A and B, that communicate via a third component C, or even via a shared resource like a file. You could model this with connections to component C or to the file, or you could hide the existence of component C or the file within a connector. An enterprise service bus connector is likely an expensive purchase, so there is a temptation to expose it as a component rather than a connector. Both options are accurate and allowed.

The most common kind of connector is a simple two-way connector, but N-way connectors are possible and make sense, for example an N-way connector that reports a consensus. It would not be wrong to model this N-way connector as several 2-way connectors that all connect to a single component that calculates the consensus, and perhaps that more accurately reflects the implementation. Neither is more right than the other.
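To show that the two models can describe the same behavior, here is a small Python sketch. It assumes "consensus" means a strict majority vote, which is only one of many possible meanings. The names `majority`, `ConsensusConnector`, and `ConsensusComponent` are hypothetical.

```python
from collections import Counter


def majority(votes):
    """Return the value held by a strict majority of the votes, or None."""
    if not votes:
        return None
    value, count = Counter(votes).most_common(1)[0]
    return value if count > len(votes) / 2 else None


class ConsensusConnector:
    """Option 1: a single N-way connector; the consensus logic lives inside it."""

    def __init__(self, participants):
        self.participants = participants  # callables, each returning one vote

    def agree(self):
        return majority([ask() for ask in self.participants])


class ConsensusComponent:
    """Option 2: an explicit component reached through N two-way connectors."""

    def __init__(self):
        self.votes = []

    def submit(self, vote):
        # Each call stands in for one 2-way connector delivering a vote.
        self.votes.append(vote)

    def result(self):
        return majority(self.votes)


participants = [lambda: "commit", lambda: "commit", lambda: "abort"]

connector = ConsensusConnector(participants)

component = ConsensusComponent()
for ask in participants:
    component.submit(ask())

print(connector.agree(), component.result())
```

Both options compute the same answer from the same votes. What differs is what the diagram reveals: the first hides the vote counting inside a connector, while the second exposes it as an element that can be annotated, deployed, and reasoned about on its own.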

Another choice you will face is whether you should route two different connectors to the same port, or to two different ports. Again, neither is more right than the other, though the details of the situation may bias you one way or the other, particularly if the port has a shared state for the two connectors or not.

Refinement. Models will become unsynchronized with other models and with code, and the problem is particularly hard when there is a refinement relationship between models. It is easy to forget to revise the high-level model of the system when you revise the low-level model. Forgetfulness aside, you may deliberately allow your various models to become out-of-date because it is expensive to keep them updated.

It is possible to be sufficiently precise in the refinement map so that you can detect refinement problems, but prohibitively expensive. In practice few developers even write down a sketch of the correspondences between a high- and low-level model, though they may eyeball each to convince themselves that the refinement is ok.
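Even a sketch of the correspondences can be checked mechanically. The Python below is a minimal illustration, assuming that each model is reduced to a set of directed connections and that the refinement map is a dictionary from low-level element names to high-level ones. The function `check_refinement` and every element name are hypothetical, and a real refinement map would need to cover far more than connections.

```python
def check_refinement(high_connections, low_connections, refinement_map):
    """Check that every low-level connection is allowed by the high-level model.

    high_connections, low_connections: sets of (source, target) name pairs.
    refinement_map: dict mapping each low-level element to its high-level element.
    """
    problems = []
    for src, dst in sorted(low_connections):
        unmapped = [end for end in (src, dst) if end not in refinement_map]
        if unmapped:
            problems.extend(f"{end} has no high-level counterpart" for end in unmapped)
            continue
        hi_src, hi_dst = refinement_map[src], refinement_map[dst]
        # Connections that stay inside one high-level element are internal detail.
        if hi_src != hi_dst and (hi_src, hi_dst) not in high_connections:
            problems.append(
                f"{src} -> {dst} implies {hi_src} -> {hi_dst}, "
                "which the high-level model lacks"
            )
    return problems


high = {("Client", "Server")}
mapping = {"Browser": "Client", "Cache": "Client", "Api": "Server", "Store": "Server"}
low = {
    ("Browser", "Cache"),
    ("Cache", "Api"),
    ("Api", "Store"),
    ("Store", "Browser"),  # a callback added later without revising the high-level model
}

for problem in check_refinement(high, low, mapping):
    print(problem)
```

Writing down `mapping` is the cheap part. The expense comes from keeping it, and both models, current as the design changes.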

Modeling for prediction. Using architecture models to discover problems in advance is harder, and requires more effort, than modeling simply to document a design because small details can distort predictions. A friend related an experience where his performance predictions were substantially incorrect because the actual distribution of requests into his system was burstier than his model allowed. Improved architecture modeling technology holds the promise of better predictions about performance, but producing a sufficiently detailed model for accurate predictions can be expensive.
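A toy calculation shows how much burstiness alone can matter. The Python sketch below is not my friend's model. It assumes a single first-come-first-served server with a fixed service time, and it compares two made-up workloads that have the same number of requests and the same average arrival rate. The function `mean_wait` and the workload numbers are assumptions chosen for illustration.

```python
def mean_wait(arrival_times, service_time):
    """Mean queueing delay for one FIFO server with a fixed service time."""
    server_free_at = 0.0
    total_wait = 0.0
    for arrived in sorted(arrival_times):
        start = max(arrived, server_free_at)  # wait if the server is still busy
        total_wait += start - arrived
        server_free_at = start + service_time
    return total_wait / len(arrival_times)


SERVICE_TIME = 0.8  # time units per request (assumed)

# Both workloads deliver 100 requests at an average of one per time unit.
smooth = [float(i) for i in range(100)]               # one request every time unit
bursty = [float(10 * (i // 10)) for i in range(100)]  # ten at once, every ten units

print("smooth:", mean_wait(smooth, SERVICE_TIME))
print("bursty:", mean_wait(bursty, SERVICE_TIME))
```

With evenly spaced arrivals no request ever waits, while the bursty workload averages a wait of roughly 3.6 time units, even though a model that only records the average rate would treat the two workloads as identical.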

Non-encapsulated issues and patterns. Components, modules, and nodes allow us to encapsulate our thinking in different viewtypes, but some ideas will crosscut these elements. Low-level ideas like coding standards will apply to many modules, as will large-scale patterns like using MapReduce whenever possible to ease handling of large datasets. Many times we can use the open-ended ability to annotate elements with custom properties, but that solution works poorly here because there is no obvious element to annotate with the property.

Issues spanning engineering and management. It is unlikely that your organization’s management will pay much attention to lower-level design decisions like the indentation style in your code, but they are likely to be interested in the functionality and qualities of your system. Sometimes, when deciding the architecture for your system, you will face a problem that can be solved either by engineering or by management. For example, a distributed system might be cheaper to build if you can assume that each site will support the software that runs there, or you could design it for central administration at a greater cost. The decision regarding system administrators is likely to be made by management, not engineers, and other similar situations occur at the architectural level of design.

Bridging design to implementation

Non-object languages. As we have discussed, every system will have at least one component instance at runtime, which is the entire system itself. When programming in object-oriented languages, it is comfortable to think about this component having internal runtime structures that are objects, and not a big stretch to think about grouping those objects into subcomponents. In non-object-oriented languages such as functional, rule-based, or even procedural languages, it is harder to envision what the runtime substructure is. It is possible to allocate responsibility to subcomponents and then build the subcomponent with whatever style of language is most appropriate, including non-object languages.

Bridging objects to components. Even when using an object-oriented language there are problems moving between the architecture abstractions and the object abstractions because each has a different vocabulary and communication idioms. Objects (or functions, or procedures, …) are concretely represented in programming languages, and substantial design guidance exists for them. Architecture abstractions are not yet concretely available in mainstream programming languages, which raises the question of when to switch from one abstraction to another.

A standard object-oriented pattern is to use an adapter to convert from one interface to another. However, we can also represent an adapter as a component rather than an object. We then have the choice of building this adapter into an existing component and revealing its existence as a new port, or making the adapter into its own component. It is hard to give general advice on how big or small components should be, but it is uncommon for a single object to be a component.
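The Python sketch below shows both choices using the same adapter code. Everything in it is hypothetical: a `LegacyBilling` component that returns tuples, a client that wants dictionaries, and the names `BillingAdapter`, `BillingComponent`, and `wire_system`. Python has no built-in notion of a port or component, so these are conventions layered on ordinary objects.

```python
class LegacyBilling:
    """Existing component (hypothetical) whose interface the client cannot use."""

    def fetch_record(self, customer_id):
        return (customer_id, "Ada Lovelace", 1250)  # (id, name, balance in cents)


class BillingAdapter:
    """Converts the legacy tuple into the dictionary the client expects."""

    def __init__(self, legacy):
        self._legacy = legacy

    def get_customer(self, customer_id):
        cid, name, cents = self._legacy.fetch_record(customer_id)
        return {"id": cid, "name": name, "balance": cents / 100}


class BillingComponent:
    """Choice 1: the adapter is built into the component and appears as a new port."""

    def __init__(self):
        self._legacy = LegacyBilling()
        self.customer_port = BillingAdapter(self._legacy)  # adapter hidden behind a port


def wire_system():
    """Choice 2: the adapter is its own component, wired next to the others."""
    legacy = LegacyBilling()
    adapter = BillingAdapter(legacy)
    return legacy, adapter


print(BillingComponent().customer_port.get_customer(7))
_, standalone_adapter = wire_system()
print(standalone_adapter.get_customer(7))
```

The adapter class is identical in both cases. What changes is where it is instantiated and whether it appears in the architecture model, which is why the code alone cannot tell you which abstraction the designer had in mind.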

Multiple languages. Within a single language, you can develop a coding style that makes the components and connectors evident. In practice, scripting languages are often used expediently and without the same attention to coding discipline as the rest of the code. Keeping up the discipline of an architecturally evident coding style can be difficult with multiple languages, especially when they are substantially different and the conventions you are following in one do not translate well into the other.

Non-greenfield implementations. If you start with arbitrary machine code it is difficult to imagine fitting structured programming language constructs to it because the machine code was not written with those constraints in mind. Similarly, it will be difficult to align the architectural concepts with an existing system that was not built with those constraints in mind. In this case, an expensive strategy is to refactor the existing code into modules and components, but more practical is simply to think of the existing system as a collection of really large modules or components. As you move forward you can build reasonably sized components or subcomponents. If the existing code has truly poor encapsulation, however, there are few inexpensive options to evolve it into a better state.