Arrays & Structs in-depth Part II
by
, 8-Sep-2009 at 08:00 AM (6922 Views)
In Part I we were just getting warmed up and started looking at the basics of arrays and struct types. Now things are about to get more complicated as we dig in deeper.
Sorting Arrays
If you have Integer[] myArray for example, then sorting is very simple and straightforward. You simply do Move (SortArray(myArray)) to myArray, and you're done. The runtime takes care of all the magic to sort the array elements. But if you have an array of struct type, then it gets a little more complicated as the runtime would need some help to determine the desired sort order of two elements. You provide that with a custom comparison function. Consider the following:
If you want to sort customers on first name, you'd write something like this:Code:Struct tCustomerInfo String sFirstName String sLastName End_Struct ... tCustomerInfo[] customers ...
If you want to sort on last name you'd obviously compare sLastName instead. If you want to sort on last name followed by first name, you'd write something like this:Code:Function CompareCustomerFirstName tCustomerInfo lhs tCustomerInfo rhs If (lhs.sFirstName < rhs.sFirstName) Function_Return (LT) If (lhs.sFirstName > rhs.sFirstName) Function_Return (GT) Function_Return (EQ) End_Function Procedure Foo tCustomerInfo[] customers ... Move (SortArray(customers,self,get_CompareCustomerFirstName)) to customers ... End_Procedure
SearchArray FunctionCode:Function CompareCustomerFirstName tCustomerInfo lhs tCustomerInfo rhs If (lhs.sLastName < rhs.sLastName) Function_Return (LT) If (lhs.sLastName > rhs.sLastName) Function_Return (GT) If (lhs.sFirstName < rhs.sFirstName) Function_Return (LT) If (lhs.sFirstName > rhs.sFirstName) Function_Return (GT) Function_Return (EQ) End_Function
Searching for an element in an array is just as easy as sorting if you have an Integer[] for example. You simply do Move (SearchArray(1234,myArray)) to iFoundIndex. Likewise, if you have an array of struct type, you must provide a comparison function to help the runtime compare two struct elements. You can use the same comparison function as in the example with sorting an array above. You can also use a slightly simpler function where you're returning only EQ or NE, like this:
Often it's easier to just reuse an existing comparison function that returns one of the three possible values LT, GT, EQ. But sometimes you're looking for something specific that may not be the same as the full sort order, such as if you're looking for the first match of sFirstName while the sort order is last name followed by first name. Or if you're looking for a match in either first name or last name. It's easy to see how you can implement such a comparison function, and that you can have many different comparison functions(even with the same struct type) and pick and choose which one you want to use at the moment.Code:Function CompareCustomerFirstName tCustomerInfo lhs tCustomerInfo rhs If (lhs.sFirstName = rhs.sFirstName) Function_Return (EQ) Function_Return (NE) End_Function
BinarySearchArray Function
You can also search for an element using BinarySeachArray(), which can be more effective than SearchArray() if the array has a lot of elements. However, BinarySearchArray() is not as flexible. First, in order to use BinarySearchArray(), the array must already be sorted. Second, you must also use the same comparison function that was used to sort the array, and it must return one of the three possible values LT, GT or EQ.
In most cases it's preferred to just use SearchArray() as it's much more flexible and allows searching without first sorting. The advantage of BinarySearchArray() with regards to performance doesn't really come into play unless you have a very large array (think thousands or tens of thousands of elements at least), or you perform many searches in performance critical code without adding/changing elements.
When to use Struct vs. Array
Deciding whether to use a struct or array is obviously a design decision and it depends on the circumstances. Technically in many cases you can use either or, but very often it can lead to very different complexities. As a DataFlex developer it's often very easy to fall into the trap of picking the more familiar approach of using an array, even when a struct would perhaps be a better choice. Afterall, arrays have been around much longer in VDF. In that case it might be worthwhile to step back and reflect on whether a struct might actually be a better choice.
In general the recommendation would be to use a struct first and foremost, and only when necessary use an array. Of course, sometimes you don't really have a choice, such as when interfacing with an external control or dll, and you just have to go along with the design at hand. But when you do have a choice:
- Consider whether what you're modeling is many of the same values(array is preferred), or whether it's many different values(struct is preferred). For example, a set of lottery numbers are many-of-the-same, whereas first name, last name and balance are many-different-values. If referring to the different values by index comes naturally, then array is a good choice. But if referring to the different values by name comes more natural, then you should really consider a struct.
- Consider whether the multiple values are naturally of the same type(array is ok) or different type(struct preferred).
- And finally consider whether the number of values are reasonably fixed for the specific problem(struct is ok), or if the number of values will vary all the time(array is required).
If you find that you need a multi-dimensional array, you should reconsider whether a struct is a better choice. Multi-dimensional arrays are more difficult to work with conceptually. It takes a lot more thinking to model Move foo[index1][index2] to xyz than using struct member names with dot notation.
If you need sorting capabilities, then multi-dimensional arrays are definitely not a good choice, and you should consider a single-dimensional array of struct type instead.
In Part III we'll look at multi-dimensional arrays and why it's often better to replace it with a single dimensional array of struct type instead.