View RSS Feed

Development Team Blog

And here is another fine fix Iíve (almost) gotten us into

Rate this Entry
I want to tell you a story about upgrading software. I think you might find the situation familiar.

With the introduction of the 12.0 Studio we introduced a new Property Panel. We made the Studio smart enough so that it could look at a propertyís data type and meta-data tags and figure out how best to present the data. In the case of boolean properties that data would be displayed using a drop down box containing a value for "True" and "False". This worked great but there were some properties that were not displaying this drop down. An example of this was the pbHotTracking property in the TreeView class. One look at this propertyís definition made the problem obvious.
Code:
Property Integer pbHotTracking False
Although this property is defined as an integer it is treated and even prefixed as a boolean. This class and its property was defined before Visual DataFlex supported boolean as a native data type and we had never gotten around to changing the underlying type from integer to boolean. Now that the Studio actually did something with the data type it seemed like a good time to make this change. This sounded simple enough and it seemed safe enough.

Before continuing with the story, I would like to point out that the decision to not update this code previously was deliberate. Making these kinds of changes is always a trade-off. Our package code should set a good example and therefore we should be using the latest techniques. We especially want to avoid using non-recommended or obsolete code. On the other hand, these kind of changes take time to design, implement and test. We always have to ask if that time can be better used for other purposes. And finally there is the old "if itís not broken, donít fix it" rule. As we shall see, sometimes the most innocent and straightforward changes will cause side effects that bubble up in the most unexpected places.

Now back to the story. We decided that it was time to convert all of our integer pseudo-boolean properties into real booleans. I now challenge you figure out what could possibly go wrong Ė and something will go wrong.

Before proceeding I tried to anticipate any problems that might arise. The most likely problem would be that the boolean property actually has more than two values. Donít laugh Ė our Dynamic_Update_State property supports three values: True, False and "2" (donít ask). Another possibility would be that True and False values may not be the expected 1 or 0 and that somehow this would matter. For example, some windows functions returns negative one as a non-false value. Another possibility might be that the property value was being used in some strange mathematical or logical way that works with integers but not booleans. This was all I could think of and I was ready to deal with them. I figured I would have to:

1. Find all the pseudo-boolean properties

2. Change the type from integer to boolean

3. Check every place where it is used and very carefully make sure there is no unusual usage.

I also set limits for this work. Unless required I would not change any of the code that actually used the properties. For example, I might have a procedure like:
Code:
Procedure Page_Object integer iState
      Integer bValue
      :
      Get pbHotTrack to bValue
      :
End_Procedure

I would not change bHotTrack to a boolean. This kind of change would be too big and it would introduce too many ways for something to go wrong.


To summarize, before making any changes I tried to anticipate all possible issues. I planned how I would go about making and verifying the changes. I set some limits so I knew what I would not be doing (a very important step if you are trying to properly estimate the cost of development). This was a doable task and I did it. The work went quite well. It took about three days to make and test the changes. Everything looked good. For the most part I just found the property type and changed the data type. For example, my TreeView now read
Code:
Property Boolean pbHotTracking False
Everything worked and the Property Panel looked great. I merged the changes so that our internal builds would all contain these changes. This meant that I had the entire team exercising these changes as part of their normal development.

We didnít see any problems for a few weeks. When tracking down another unrelated issue we started seeing runtime errors. What was surprising is that the error occurred in an external package. A third party ListView class started displaying message boxes with "Internal Expression Error" every time the window control was created. The debugger pointed to the offending code:
Code:
 Move (iStyle + (pbHotTrack(self) * LVS_EX_TRACKSELECT))   To iStyle

This might look a bit confusing. The class is setting windows styles before the control is created. If pbHotTrack is true you want to add (iOR) LVS_EX_TRACKSELECT to the style. It does this by multiplying the style by 1 or 0. As long as pbHotTrack is always 0 or 1, this should work.


The problem is you cannot multiply boolean values. If you try to do this, you get an "Internal Expression Error". It appeared that by changing the integer to a boolean we introduced a new error when this was run through the expression evaluator. Upon closer inspection this made no sense. This was not our class and we did not change any of the code in this class. The ListView class still defined pbHotTrack as an integer and therefore the old code should work. I took the ListView class and placed it in a smaller program and the error disappeared. I added it back to a larger program and the error came back. I finally figured out that adding the TreeView class package before the ListView class package would create the error. This was caused by the way our expression evaluator works. When an expression evaluator evaluates a function like phHotTrack(self) its data type is determined by the first compiled instance of that function or property. If the first instance is boolean, the type will be a boolean within the expression even if the actual class defines it as an integer. If the TreeView class is compiled first, pbHotTrack(self) will return a boolean and the ListView will fail. If the ListView is compiled first pbHotTrack(self) will return an integer and it will work. This is just the way the expression evaluator works.

This code construction is unusual. There are other, perhaps clearer, ways this could have been coded such as
Code:
If (pbHotTrack(self)) Begin
      Move (iStyle iOr LVS_EX_TRACKSELECT))   To iStyle
  End
  
  // or
  
  Move (iStyle + if(pbHotTrack(self),  LVS_EX_TRACKSELECT,0))   To iStyle
  
  // or (actually not so clear)
  
  integer bHot
  Get pbHotTrack to bHot
  Move (iStyle + (bHot * LVS_EX_TRACKSELECT))   To iStyle
Ironically, any of these alternatives would have worked.

Did you see that one coming? I sure didnít. When coded in this one special way it was possible to break existing code. Changing one of our classes broke a completely unrelated class. This one coding style along with one particular ordering of code fails.

At this point, we made the decision that we could not risk breaking existing applications and we rolled this change back. Although the chance of this happening again was rather small, there was no way we could control it as this was not happening in our code. We also knew that the errors might not appear until you deployed your application. Our developers are never pleased when we break their applications. Our developers are particularly not pleased when this happens after theyíve deployed their application. We take this kind of situation very seriously and we always try our best to minimize it.

This became a classic case of "itís not broken, donít fix it". And that is why we will probably never change our pseudo-boolean properties. (Oh, and we solved the Property Panel problem using a different technique - a new meta-data tag.)

Thereís really not much of a lesson to this story. We all know that changing code is risky. We also know that it is important to continue to maintain, improve and modernize your software. The skill involved is striking a proper balance. Sometimes it is best to completely rewrite sections of code. Sometimes it is best to make small surgical changes in your code. Sometimes it is best to do nothing at all. A good programmer understands this and applies the proper strategy at the appropriate time. A good programmer will expect the unexpected and act accordingly. In some cases, this means not acting at all.

Earlier I stated, "Our package code should set a good example". Sometimes not changing old working code is the best choice and therefore we may be setting a good example after all. To paraphrase Shakespeare, "There are more things (bugs) in heaven and earth, Horatio, than are dreamt of in your philosophy (code)".

Comments

  1. Jake Moffatt's Avatar
    When an expression evaluator evaluates a function like phHotTrack(self) its data type is determined by the first compiled instance of that function or property. If the first instance is boolean, the type will be a boolean within the expression even if the actual class defines it as an integer. If the TreeView class is compiled first, pbHotTrack(self) will return a boolean and the ListView will fail. If the ListView is compiled first pbHotTrack(self) will return an integer and it will work. This is just the way the expression evaluator works.
    Can you provide a code example of that? I am having trouble wrapping my head around this concept.

    Aside from that, I would say that this is a situation that makes a case for having a quick and easy way of making getter/setter functions for all properties in any class built into the language, ala C#'s get/set methods. This would mean that instead of accessing a property value directly, you would always be going through a function to retrieve a value and that function could have clearly defined return values (i.e This function, although it returns an integer, will ALWAYS return the boolean TRUE or FALSE) thus making it the responsibility of the consumer of the class to code to the specifications of the class instead of assumptions they make about its workings.

    Then, if the implementation changes later on, as in this case where the data type of the property was changed, the get/set methods still work properly because they can still do what the specifications say they do - return TRUE or FALSE. This is one of the basic tenets of OOP - encapsulation - abstracting the inner workings of a data structure into an interface that talks to the underlying data structure and has clearly defined specifications on how to work with that interface.
  2. Jakob Kruse's Avatar
    Fine article John, but reading it I find two things wrong (or broken), and you didn't mention either of them.

    1) "If it's not broken, don't fix it" is an awful policy. I was schooled this way too, and the only thing that policy promotes is fear of exercising control over your own code. This fear leads to bad code, which leads to more fear etc. One of the more popular ways of ridding yourself of this fear (and before you think it: I know that wouldn't help in Johns case) is to make sure you have complete test suites for all your code. If the behaviour of every single line of code is checked by a test there is no fear of making your code better.

    2) Like I said, having a complete test wouldn't have saved John, because the error was external. But the reason for that is quite simple: The expression evalutator is obviously broken. "That's just the way it works" is another one of those statements that should earn you an immediate slap on the wrist. Please fix it, so the rest of us don't have to chase errors produced by the order in which we include classes in our programs.

    I read a book recently, called "Clean Code". I would recommend it to everyone and anyone. I wouldn't say there's anything new in it, but it's very well written, and it has left me with a longing for the day when VDF turns into a real programming language and the Studio turns into a real development environment. Those might seem like harsh words, but where is the unit testing framework? Where is the refactoring menu? The code coverage tester? Operator precedence and true OOP might be difficult to introduce now, but those things are just fundemental tools that any developer should use.
  3. Sture's Avatar
    Luckily we're paid in real money :-)