Difference between revisions of "Programming/Principles"
Line 194: | Line 194: | ||
</blockquote> | </blockquote> | ||
==The | ==The Principle of Least Astonishment== | ||
The '''Principle of Least Astonishment''' '''[J87, R03]''': | The '''Principle of Least Astonishment (POLA)''' '''[J87, R03]''': | ||
<blockquote> | <blockquote> | ||
A component of a system should behave in a way that most users will expect it to behave. | A component of a system should behave in a way that most users will expect it to behave. |
Revision as of 10:56, 17 July 2021
Principles
First make it work. Then make it right. Then make it fast.
From [M06]:
"First make it work. Then make it right. Then make it fast." This quotation, often with slight variations, is widely known as "the golden rule of programming." As far as I've been able to ascertain, the quotation is by Kent Beck, who credits his father with it. Being widely known makes the principle no less important, particularly because it's more honored in the breach than in the observance. A negative form, slightly exaggerated for emphasis, is in a quotation by Don Knuth (who credits Hoare with it): "Premature optimization is the root of all evil in programming."
Optimization is premature if your code is not working yet, or if you're not sure about what, exactly, your code should be doing (since then you cannot be sure if it's working). First make it work. Optimization is also premature if your code is working but you are not satisfied with the overall architecture and design. Remedy structural flaws before worrying about optimization: first make it work, then make it right. These first two steps are not optional; working, well-architected code is always a must.
In contrast, you don't always need to make it fast. Benchmarks may show that your code's performance is already acceptable after the first two steps. When performance is not acceptable, profiling often shows that all performance issues are in a small part of the code, perhaps 10 to 20 percent of the code where your program spends 80 or 90 percent of the time. Such performance-crucial regions of your code are known as its bottlenecks, or hot spots. It's a waste of effort to optimize large portions of code that account for, say, 10 percent of your program's running time. Even if you made that part run 10 times as fast (a rare feat), your program's overall runtime would only decrease by 9 percent, a speedup no user would even notice. If optimization is needed, focus your efforts where they'll matter—on bottlenecks.
Make Interfaces Easy to Use Correctly and Hard to Use Incorrectly
Scott Meyers in [H10]:
One of the most common tasks in software development is interface specification. Interfaces occur at the highest level of abstraction (user interfaces), at the lowest (function interfaces), and at levels in between (class interfaces, library interfaces, etc.). Regardless of whether you work with end users to specify how they'll interact with a system, collaborate with developers to specify an API, or declare functions private to a class, interface design is an important part of your job. If you do it well, your interfaces will be a pleasure to use and will boost others' productivity. If you do it poorly, your interfaces will be a source of frustration and errors.
Good interfaces are:
- Easy to use correctly
People using a well-designed interface almost always use the interface correctly, because that's the path of least resistance. In a GUI, they almost always click on the right icon, button, or menu entry, because it's the obvious and easy thing to do. In an API, they almost always pass the correct parameters with the correct values because that's what's most natural. With interfaces that are easy to use correctly, things just work.
- Hard to use incorrectly
Good interfaces anticipate mistakes people might make, and make them difficult—ideally, impossible—to commit. A GUI might disable or remove commands that make no sense in the current context, for example, or an API might eliminate argument-ordering problems by allowing parameters to be passed in any order.
A good way to design interfaces that are easy to use correctly is to exercise them before they exist. Mock up a GUI—possibly on a whiteboard or using index cards on a table—and play with it before any underlying code has been created. Write calls to an API before the functions have been declared. Walk through common use cases and specify how you want the interface to behave. What do you want to be able to click on? What do you want to be able to pass? Easy-to-use interfaces seem natural, because they let you do what you want to do. You're more likely to come up with such interfaces if you develop them from a user's point of view. (This perspective is one of the strengths of test-first programming.)
Making interfaces hard to use incorrectly requires two things. First, you must anticipate errors users might make and find ways to prevent them. Second, you must observe how an interface is misused during early release and modify the interface—yes, modify the interface!—to prevent such errors. The best way to prevent incorrect use is to make such use impossible. If users keep wanting to undo an irrevocable action, try to make the action revocable. If they keep passing the wrong value to an API, do your best to modify the API to take the values that users want to pass.
Above all, remember that interfaces exist for the convenience of their users, not their implementers.
Keep it simple, stupid (KISS)
Filip Hanik writing in https://people.apache.org/~fhanik/kiss.html:
What does KISS stand for?
The KISS is an abbreviation of Keep It Stupid Simple or Keep It Simple, Stupid.
What does that mean?
This principle has been a key, and a huge success in my years of software engineering. A common problem among software engineers and developers today is that they tend to over complicate problems.
Typically when a developer is faced with a problem, they break it down into smaller pieces that they think they understand and then try to implement the solution in code. I would say 8 or 9 out of 10 developers make the mistake that they don't break down the problem into small enough or understandable enough pieces. This results in very complex implementations of even the most simple problems, another side effect is spaghetti code, something we thought only BASIC would do with its goto statements, but in Java this results in classes with 500–1000 lines of code, methods that each have several hundred of lines.
This code clutter is a result of the developer realizing exception cases to his original solution while he is typing in code. These exception cases would have solved if the developer had broken down the problem further.
How will I benefit from KISS
- You will be able to solve more problems, faster.
- You will be able to produce code to solve complex problems in fewer lines of code.
- You will be able to produce higher quality code.
- You will be able to build larger systems, easier to maintain.
- Your code base will be more flexible, easier to extend, modify or refactor when new requirements arrive.
- You will be able to achieve more than you ever imagined.
- You will be able to work in large development groups and on large projects since all the code is stupid simple.
How can I apply the KISS principle to my work
There are several steps to take, very simple, but could be challenging for some. As easy as it sounds, keeping it simple, is a matter of patience, mostly with yourself.
- Be Humble, don't think of yourself as a super genius, this is your first mistake. By being humble, you will eventually achieve super genius status =), and even if you don't, who cares! your code is stupid simple, so you don't have to be a genius to work with it.
- Break down your tasks into sub tasks that you think should take no longer than 4–12 hours to code.
- Break down your problems into many small problems. Each problem should be able to be solved within one or a very few classes.
- Keep your methods small, each method should never be more than 30–40 lines. Each method should only solve one little problem, not many use cases. If you have a lot of conditions in your method, break these out into smaller methods. Not only will this be easier to read and maintain, but you will find bugs a lot faster. You will learn to love Right Click+Refactor in your editor.
- Keep your classes small, same methodology applies here as we described for methods.
- Solve the problem, then code it. Not the other way around. Many developers solve their problem while they are coding, and there is nothing wrong doing that. As a matter of fact, you can do that and still adhere to the above statement. If you have the ability to mentally break down things into very small pieces, then by all means do that while you are coding. But don't be afraid to refactor your code over and over and over again. It's the end result that counts, and number of lines is not a measurement, unless you measure that fewer is better of course.
- Don't be afraid to throw away code. Refactoring and recoding are two very important areas. As you come across requirements that didn't exist, or you weren't aware of when you wrote the code to begin with you might be able to solve the old and the new problems with an even better solution. If you had followed the advice above, the amount of code to rewrite would have been minimal, and if you hadn't followed the advice above, then the code should probably be rewritten anyway.
- And for all other scenarios, try to keep it as simple as possible, this is the hardest behavior pattern to adhere to, but once you have it, you'll look back and say, I can't imagine how I was doing work before.
Some of the world's greatest algorithms are always the ones with the fewest lines of code. And when we go through the lines of code, we can easily understand them. The innovator of that algorithm, broke down the problem until it was so easy to understand that he/she could implement it.
Many great problem solvers were not great coders, but yet they produced great code!
The acronym was reportedly coined by Kelly Johnson, lead engineer at the Lockheed Skunk Works (creators of the Lockheed U-2 and SR-71 Blackbird spy planes, among many others).
The Unix philosophy
From [R03]:
The "Unix philosophy" originated with Ken Thompson's early meditations on how to design a small but capable operating system with a clean service interface. It grew as the Unix culture learned things about how to get maximum leverage out of Thompson's design. It absorbed lessons from many sources along the way.
The Unix philosophy is not a formal design method. It wasn't handed down from the high fastness of theoretical computer science as a way to produce theoretically perfect software. Nor is it that perennial executive's mirage, some way to magically extract innovative but reliable software on too short a deadline from unmotivated, badly managed, and underpaid programmers.
The Unix philosophy (like successful folk traditions in other engineering disciplines) is bottom-up, not top-down. It is pragmatic and grounded in experience. It is not to be found in official methods and standards, but rather in the implicit half-reflexive knowledge, the expertise that the Unix culture transmits. It encourages a sense of proportion and skepticism—and shows both by having a sense of (often subversive) humor.
Doug McIlroy, the inventor of Unix pipes and one of the founders of the Unix tradition, had this to say at the time [MPT78]:
- Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new features.
- Expect the output of every program to become the input to another, as yet unknown, program. Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't insist on interactive input.
- Design and build software, even operating systems, to be tried early, ideally within weeks. Don't hesitate to throw away the clumsy parts and rebuild them.
- Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you've finished using them.
He later summarized it this way (quoted in A Quarter Century of Unix [S94]):
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
Rob Pike, who became one of the great masters of C, offers a slightly different angle in Notes on C Programming [P89]:
- Rule 1. You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so fon't try to second guess and put in a speed hack until you've proven that's where the bottleneck is.
- Rule 2. Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.
- Rule 3. Fancy algorithms are slow when is small, and is usually small. Fancy algorithms have big constants. Until you know that is frequently going to be big, don't get fancy. (Even if does get big, use Rule 2 first.)
- Rule 4. Fancy algorithms are buggier than simple ones, and they're much harder to implement. Use simple algorithms as well as simple data structures.
- Rule 5. Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.
- Rule 6. There is no Rule 6.
Ken Thompson, the man who designed and implemented the first Unix, reinforced Pike's rule 4 with a gnomic maxim worthy of a Zen patriarch:
When in doubt, use brute force.
More of the Unix philosophy was implied not by what these elders said but by what they did and the example Unix itself set. Looking at the whole, we can abstract the following ideas:
- Rule of Modularity: Write simple parts connected by clean interfaces.
- Rule of Clarity: Clarity is better than cleverness.
- Rule of Composition: Design programs to be connected to other programs.
- Rule of Separation: Separate policy from mechanism; separate interfaces from engines.
- Rule of Simplicity: Design for simplicity; add complexity only where you must.
- Rule of Parsimony: Write a big program only when it is clear by demonstration that nothing else will do.
- Rule of Transparency: Design for visibility to make inspection and debugging easier.
- Rule of Robustness: Robustness is the child of transparency and simplicity.
- Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.
- Rule of Least Surprise: In interface design, always do the least surprising thing.
- Rule of Silence: When a program has nothing surprising to say, it should say nothing.
- Rule of Repair: When you must fail, fail noisily and as soon as possible.
- Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.
- Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.
- Rule of Optimization: Prototype before polishing. Get it working before you optimize it.
- Rule of Diversity: Distrust all claims for "one true way".
- Rule of Extensibility: Design for the future, because it will be here sooner than you think.
If you're new to Unix, these principles are worth some meditation. Software-engineering texts recommend most of them; but most other operating systems lack the right tools and traditions to turn them into practice, so most programmers can't apply them with any consistency. They come to accept blunt tools, bad designs, overwork, and bloated code as normal—and then wonder what Unix fans are so annoyed about.
Easier to change (ETC)
From [TH20]:
Good Design Is Easier to Change Than Bad Design.
A thing is well designed if it adapts to the people who use it. For code, that means it must adapt by changing. So we believe in the ETC principle: Easier to Change. ETC. That's it.
As far as we can tell, every design principle out there is a special case of ETC.
Why is decoupling good? Because by isolating concerns we make each easier to change. ETC.
Why is the single responsibility principle useful? Because a change in requirements is mirrored by a change in just one module. ETC.
Why is naming important? Because good names make code easier to read, and you have to read it to change it. ETC!
ETC Is a Value, Not a Rule.
Values are things that help you make decisions: should I do this, or that? When it comes to thinking about software, ETC is a guide, helping you choose between paths. Just like all your other values, it should be floating just behind your conscious thought, subtly nudging you in the right direction.
Don't Repeat Yourself (DRY)
From [TH20]:
We feel that the only way to develop software reliably, and to make our developments easier to understand and maintain, is to follow what we call the DRY principle:
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
Why do we call it DRY?
DRY—Don't Repeat Yourself.
[...]
DRY Is More Than Code
[...]
DRY is about the duplication of knowledge, of intent. It's about expressing the same thing in two different places, possibly in two totally different ways.
Here's the acid test: when some single facet of the code has to change, do you find yourself making that change in multiple places, and in multiple different formats? Do you have to change code and documentation, or a database schema and a structure that holds it, or...? If so, your code isn't DRY.
The Principle of Least Astonishment
The Principle of Least Astonishment (POLA) [J87, R03]:
A component of a system should behave in a way that most users will expect it to behave.
As explained in [R03]:
The easiest programs to use are those that demand the least new learning from the user—or, to put it another way, the easiest programs to use are those that most effectively connect to the user's pre-existing knowledge.
Therefore, avoid gratuitous novelty and excessive cleverness in interface design. If you're writing a calculator program, "+" should always mean addition! When designing an interface, model it on the interfaces of functionally similar or analogous programs with which your users are likely to be familiar.
Pay attention to your expected audience. They may be end users, they may be other programmers, or they may be system administrators. What is least surprising can differ among these groups.
Pay attention to tradition. The Unix world has rather well-documented conventions among things like the format of configuration and run-control files, command-line switches, and the like. These traditions exist for a good reason: to tame the learning curve. Learn and use them.
"The flip side of the Rule of Least Surprise is to avoid making things superficially similar but really a little bit different. This is extremely treacherous because the seeming familiarity raises false expectations. It's often better to make things distinctly different than to make them almost the same."—Henry Spencer.
Bertrand Meyer's Five Criteria for modular design
From [M97]:
Modular decomposability
A software construction method satisfies Modular Decomposability if it helps in the task of decomposing a software problem into a smaller number of less complex subproblems, connected by a simple structure, and independent enough to allow further work to proceed separately on each of them.
Modular composability
A method satisfies Modular Composability if it favors the production of software elements which may then be freely combined with each other to produce new systems, possibly in an environment quite different from the one in which they were initially developed.
Modular understandability
A method favors Modular Understandability if it helps produce software in which a human reader can understand each module without having to know the others, or, at worst, by having to examine only a few of the others.
Modular continuity
A method satisfies Modular Continuity if, in the software architectures that it yields, a small change in a problem specification will trigger a change of just one module, or a small number of modules.
Modular protection
A method satisfies Modular Protection if it yields architectures in which the effect of an abnormal condition occurring at run time in a module will remain confined to that module, or at worst will only propagate to a few neighbouring modules.
Bertrand Meyer's Five Rules which we must observe to ensure modularity
From [M97]:
Direct Mapping
The modular structure devised in the process of building a software system should remain compatible with any modular structure devised in the process of modeling the problem domain.
Few Interfaces
Every module should communicate with as few others as possible.
Small Interfaces (weak coupling)
If two modules communicate, they should exchange as little information as possible.
Explicit Interfaces
Whenever two modules and communicate, this must be obvious from the text of or or both.
Information Hiding
The designer of every module must select a subset of the module's properties as the official information about the module, to be made available to authors of client modules.
Bertrand Meyer's Five Principles of software construction
From [M97]:
Linguistic Modular Units principle
Modules must correspond to syntactic units in the language used.
Self-Documentation principle
The designer of a module should strive to make all information about the module part of the module itself.
Uniform Access principle
All services offered by a module should be available through a uniform notation, which does not betray whether they are implemented through storage or through computation.
Open-Closed principle
Modules should be both open and closed.
The Single Choice principle
Whenever a software system must support a set of alternatives, one and only one module in the system should know their exhaustive list.
Structured Design by Edward Yourdon and Larry L. Constantine
Coupling
From [Y79]:
The key question is: How much of one module must be known in order to understand another module? The more that we must know of module B in order to understand module A, the more closely connected A is to B. The fact that we must know something about another module is a priori evidence of some degree of interconnection even if the form of the interconnection is not known.
[...]
The measure that we are seeking is known as coupling; it is a measure of the strength of interconnection. [...] Obviously, what we are striving for is loosely coupled systems—that is, systems in which one can study (or debug, or maintain) any one module without having to know very much about any other modules in the system.
Coupling as an abstract concept—the degree of interdependence between modules—may be operationalized as the probability that in coding, debugging, or modifying one module, a programmer will have to take into account something about another module.
[...]
Four major aspects of computer systems can increase or decrease intermodular coupling. In order of estimated magnitude of their effect on coupling, these are
- Type of connection between modules. [...]
- Complexity of the interface. [...]
- Type of information flow along the connection. [...]
- Binding time of the connection. [...]
[...]
The concept of coupling invites the development of a reciprical concept: decoupling. Decoupling is any systematic method or technique by which modules can be made more independent.
Cohesion
From [Y79]:
We already have seen that the choice of modules in a system is not arbitrary. The manner in which we physically divide a system into pieces (particularly in relation to the problem structure) can affect significantly the structural complexity of the resulting system, as well as the total number of intermodular references. Adapting the system's design to the problem structure (or "application structure") is an extremely important design philosophy; we generally find that problematically related processing elements translate into highly interconnected code. Even if this were not true, structures that tend to group together highly interrelated elements (from the viewpoint of the problem, once again) tend to be more effectively modular.
[...]
[T]he most effectively modular system is the one for which the sum of functional relatedness between pairs of elements not in the same module is minimized; among other things, this tends to minimize the required number of intermodular connections and the amount of intermodular coupling.
"Intramodular functional relatedness" is a clumsy term. What we are considering is the cohesion of each module in isolation—how tightly bound or related its internal elements are to one another.
[...]
Clearly, cohesion and coupling are interrelated. The greater the cohesion of individual modules in the system, the lower the coupling between modules will be.
[...]
Both coupling and cohesion are powerful tools in the design of modular structures, but of the two, cohesion emerges from extensive practice as more important.
Encapsulation
From [BMECHY07]:
"For abstraction to work, implementations must be encapsulated." [L87]. In practice this means that each class must have two parts: an interface and an implementation. The interface of a class captures only its outside view, encompassing our abstraction of the behavior common to all instances of the class. The implementation of a class comprises the representation of the abstraction as well as the mechanisms that achieve the desired behavior. The interface of a class is the one place where we assert all of the assumptions that a client may make about any instances of the class; the implementation encapsulates details about which no client may make assumptions.
To summarize, we define encapsulation as follows:
Encapsulation is the process of compartmentalizing the elements of an abstraction that constitute its structure and behavior; encapsulation serves to separate the contractual interface of an abstraction and its implementation.
Britton and Parnas [BP81] call these encapsulated elements the "secrets" of an abstraction.
Bibliography
- [BMECHY07] Grady Booch, Robert Maksimchuk, Michael Engle, Jim Conallen, Kelli Houston, Bobbi Young. Object-Oriented Analysis and Design with Applications. Pearson Education, 2007.
- [BP81] K. Britton and D. Parnas. A-7E Software Module Guide. Naval Research Laboratory, 1981.
- [H10] Kevin Henney. 97 Things Every Programmer Should Know. O'Reilly, 2010.
- [J87] Geoffrey James. The Tao of Programming. 1987.
- [L87] Barbara Liskov. Data Abstraction and Hierarchy. OOPSLA, 1987.
- [M06] Alex Martelli. Python in a Nutshell, second edition. O'Reilly, 2006.
- [M97] Bertrand Meyer. Object-Oriented Software Construction, second edition. Prentice Hall, 1997.
- [MPT78] M. D. McIlroy, E. N. Pinson, and B. A. Tague. UNIX Time-Sharing System: Foreword. The Bell System Technical Journal, Vol. 57, No. 6, July–August 1978.
- [P89] Rob Pike. Notes on Programming in C. February 21, 1989. https://www.lysator.liu.se/c/pikestyle.html
- [R03] Eric Steven Raymond. The Art of UNIX Programming. Pearson Education, 2003.
- [S94] Peter H. Salus. A Quarter-Century of Unix. Addison–Wesley, 1994.
- [TH20] David Thomas, Andrew Hunt. The Pragmatic Programmer, 20th Anniversary Edition. Addison–Wesley, 2020.
- [Y79] Edward Yourdon, Larry L. Constantine. Structured Design: Fundamentals of a Discipline of Computer Program and System Design. Prentice–Hall, 1979.