Another Fine Fix - The Sequel
byon 23-Oct-2009 at 09:00 AM (2315 Views)
In my recent post And here is another fine fix Iíve (almost) gotten us into I attempted to present some of the issues we face any time we make changes in our product. I purposefully chose a sample that appeared to be simple and easy to change but then turned out to be neither. As software designers and developers we've all encountered this. This was meant to illustrate the kind of processes we all go through and the issues we all face as we work to improve our product. While the story was about a specific instance in a specific product, the issues are meant to be generalized.
A comment by Jakob Kruse indicated that he did not fully agree with some of my generalizations. He made some good points and I wanted to respond to them. My original plan was to reply in another article but this seemed a little one sided. Instead I asked Jakob if heíd be interested in "stepping outside and settling this man to man Internet style" by engaging in a email conversation. He kindly accepted my challenge and you are now reading the results of our collaboration.
As you read this keep in mind that we continue to talk about a single specific example (pseudo-booleans and oddities of our expression evaluator) because it gives us a concrete way to discuss abstract issues. There is nothing unique about this one instance and the issues encountered are certainly not unique to Visual DataFlex. It is these higher level issues that interest us and will hopefully interest you.
Now letís set the stage: In honor of Jakobís contribution this play will take place in a Danish castle (I wonder if this has been done before). John Tuohy has just finished his soliloquy and exits stage left. Jakob Kruse enters stage right. He looks grim. He steps forward, faces the audienceÖ.
Jakob: Fine article John, but reading it I find two things wrong (or broken), and you didn't mention either of them.
1) "If it's not broken, don't fix it" is an awful policy. I was schooled this way too, and the only thing that policy promotes is fear of exercising control over your own code. This fear leads to bad code, which leads to more fear, etc. One of the more popular ways of ridding yourself of this fear (and before you think it: I know that wouldn't help in Johnís case) is to make sure you have complete test suites for all your code. If the behavior of every single line of code is checked by a test there is no fear of making your code better.
2) Like I said, having a complete test wouldn't have saved John, because the error was external. But the reason for that is quite simple: The expression evaluator is obviously broken. "That's just the way it works" is another one of those statements that should earn you an immediate slap on the wrist. Please fix it, so the rest of us don't have to chase errors produced by the order in which we include classes in our programs.
John: These are fair points Jakob. In particular I completely agree with the notion that you should take control of your code and not be ruled by fear of change. There are some areas where I don't completely agree. I must however give you credit for finding the two phrases in the article, "If it's not broken, don't fix it" and "that's just the way it works", that were overly simplistic and easily misunderstood. Let me try againÖ.
I don't agree that "If it's not broken, don't fix it" is awful policy but perhaps I oversimplified this. First of all, every software change has a cost and a benefit. Most software products have a seemingly infinite number of changes that can be made to them and it is the job of a good design team to choose the changes that provide the best return on investment based on finite resources. The cost of making a change includes designing, testing, releasing and occasionally having to deal with post release issues. When you prioritize the things you want to change you should take into consideration whether the feature is currently broken. If it is not, it will probably get a lower prioritization. That often, but not always, results in not fixing things that are not broken. There are more important things to do.
You mentioned that two things were wrong or broken. Here I strongly disagree and I consider this to be important. Both the use of integer as pseudo-booleans and the behavior of the expression evaluator are working as designed. You and I may not like how they were designed but they are working the way they were designed. I consider this distinction crucial and I always object when someone refers to a feature they don't like as a bug. The feature may be poorly designed, but if it is working as designed, it's not a bug. Words matter. Bugs are treated as a higher priority item and changes in behavior that are the result of a bug fix are more easily justified than changes in behavior that were made because we didn't like the old behavior. If someone is using a feature "as designed" and it stops working we get complaints!
I admit that referring to the expression evaluator's behavior with "that's just the way it works" was a poor choice of words. This implies that we don't care that it's imperfect and that it will never change. What I meant to say was that the expression evaluator works this way by design, we consider this to be an imperfect design, it is on our list of things we would like to improve in the product and that changing this will be expensive. (I guess I can maybe see how you would not infer all that from "that's the way it works".)
Finally I must say that I am quite proud of the fact that when these issues arose we made the decision roll back the change. For what's it worth, it's really hard to roll back your work and I consider it a real sign of maturity. This has nothing to do with fear and I actually consider it a good example of taking control of your code. Let's consider another scenario, which is probably one I would have taken 10 years ago. First I'd label the integer/boolean property thing as a bug and change them all to booleans. When I discovered the issue with the expression evaluator, I would have labeled it a bug and proceeded to fix that. Several months later, I would have discovered that this somehow broke the constraint engine and I would have proceeded to fix that. Next I would have discovered this caused strange interactions with the datetime data-type and proceeded to fix that. We'd discover more places where developers are using the underlying integer-ness of booleans and we'd change the expression evaluator to handle some kind of crazy casting of booleans. This would go on and on. We'd finally release something that was months late, full of compatibility issues that had nothing to do with boolean properties and missing some of the more important changes that fell by the wayside. We've been down this path and this is not an exaggeration!
Jakob: Concerning the expression evaluator, I too must amend my words. I said the expression evaluator was broken which, as you pointed out, is not true. I agree with you on that. What I meant was that the expression evaluator design was broken. I do realize that this is very much a legacy issue. The expression evaluator was designed for procedural code and this design now shows flaws when being applied in object-oriented code. In effect, introducing object-oriented programming broke the design of the expression evaluator. Had it been perfect, it would not have broken, so I suspect we pretty much agree there. Obviously the design cannot be changed lightly, because that will break a lot of programs (only the imperfect ones though, by parallel ;-).
As to my first point, my dislike for "if itís not broken, donít fix it" only applies to code, not design! Changes to design should not be taken lightly, and as you say, this is where the design team comes in. I read your initial article as an attempt to change code without affecting design, and it probably would have worked that way if not for the broken state of the expression evaluator Ė or on a larger scale, the broken (as in "not in line with the general understanding of basic OOP concepts") state of the object-oriented paradigm in the product.
Changes to code should be embraced, which is why I think "if itís not broken, donít fix it" is awful (again, with respect to code, not design). Almost every programmer spends a lot more time reading source code than writing it. The quality of source code Ė not just what it does but how it does it Ė affects readability. Aiming for a "minimal change" solution to any given problem will always decrease the quality of source code. To move the other way, to increase the quality of source code, involves rewriting, refactoring, moving things around. Adopt the boy scout rule: always leave the campground cleaner than you found it.
One of the tools you can use to make sure that your changes are restricted to code, and do not affect design, is test automation. If your design is locked in place by a series of automated tests (unit, functional, behavioral), then you can change every single line of your source code without fear of changing your design. Your tests will tell you conclusively (and easily) whether anything has changed or not.
Ultimately, because most programmers spend more time reading source code than writing it, improving readability of your source code will reduce the cost of making changes. Adopting a policy of not changing broken code will almost certainly cost you more in the long run. Or to paraphrase, choosing, from your seemingly infinite list of possible changes, to apply some change that does not involve anything broken, could (should!) very well reduce the cost of some of the changes that do involve something broken. In concrete terms, choosing to change Property Integer pbHotTrack to Property Boolean pbHotTrack before it turned into a bug in the property panel could have prevented someone from copying (and misusing) your misleading source code into the ListView class in the first place.
This does not mean that I disagree with you. If your main priority were improving source code readability you would never get around to fixing any bugs. And yes, of course you should take into consideration whether some feature is broken or not before deciding to change it. But when you do decide to change it, leaving the parts that you touch more readable, more easily understood, should be very high on your list of priorities. "If itís not broken, donít fix it" shouldnít be on your list at all.
And your programming language and development environment should support that!
John: Concerning your code cleanup comment: "Adopt the boy scout rule: always leave the campground cleaner than you found it."
Here's an old tale. There was once a group of campers staying at a campground. One day they noticed that the fence next to the stream was broken. Being good campers they spent the afternoon repairing this fence making it even stronger than before. When complete they were able to tie a rope to the restored fence rail and tie the other end to a sack that contained their drinks (beer), which they then dropped into the water. Any time they needed a cold drink they could lean over the fence, pull on the rope and grab a beer. This was a great change.
That night an old bear came tromping through the campground on his way to the stream for his regular dinner of fresh salmon. He was shocked to be stopped by a fence. "That's strange," he growled, "I thought I tore that fence down years ago". He tried once again to tear down the fence but this time it was too strong. His angry thrashing about awoke the campers. They went over to the bear and calmly tried to explain to him that this was actually an improvement. The fence was broken, he was not supposed to be going through the campground and there was another designated bear path just up the road. The bear listened, paused for a few seconds, shook his head and ate the campers.
Moral: Be careful mending fences. With a small change, beers become bears and bears don't always understand.
The above story is just my way of saying, "be careful". Maybe a little fear is warranted.
It might be worth recognizing that not all development conditions are the same. Consider the code we write for the Studio versus the code in our packages and runtime. We can easily alter, extend and refactor code that comprises the Studio because we have full control over how it is used. We have the entire source, we can see exactly how the code is used, we can easily change this usage and we are the last ones to compile it. This situation is similar to most of our developer's environments. In this kind of environment your point about taking control of your code is spot on. Our package and runtime is a little different. We don't do the final compile of the application and this is significant. It means we have little control over how you use this in your code. Since we don't know, we have to assume you use it in just about every way imaginable (we are rarely disappointed). We even have to be careful when we change private interfaces or make other changes that would impact only the most unusual or imperfect usage. This requires a more conservative approach to change. We have no choice - the bear does not always understand.
Jakob: Thatís an interesting tale. I wonder how many boy scouts got eaten that way? I think it all boils down to a question of design versus implementation, or public interface versus internal code. You should be very careful changing the public interface of your "product". With regards to the internal code, you should set yourself up so that you need not be careful making changes.
If those campers had spent the afternoon picking up the garbage left by the previous campers instead of mending the fence, or perhaps even scouting the surroundings to find the bear path and then moving the entire campground so that it became just a campground and not a campground-slash-bear-habitat (the single responsibility principle, SRP, is another thing to keep in mind when cleaning your code), they would have survived to enjoy many more years of camping. In this story the broken fence has become a part of the public interface of the campground, which should probably have been avoided in the first place.
To return to where this all started, a design error was obviously made. A better design might have been to isolate hot tracking (SRP again). That way it wouldnít have been in the TreeView class, and it wouldnít have been misused in the ListView class. Naming is also a problem. Indicating that something is a boolean when itís really an integer is not good. At least realizing mistakes like these can help us design better interfaces in the future, even if it doesnít help us fix the mistakes of the past.
John: This seems like a good ending point. I feel that this has been a productive discussion and your contributions are greatly appreciated. This was a lot of fun. Thanks again, Jakob!
Jakob: Thank you John, for giving me this opportunity.
(Turns to face the audience)
Remember, good code speaks for itself. Only bad code needs an explanation. The rest is silence.
(Exits stage right)