Development Team Blog

Arrays & Structs in-depth Part I

Native arrays have been around in Visual DataFlex for years now, and native array types are clearly preferred over the old Array class. But just what are the differences? For one, with native array types you can create local array variables without resorting to creating an object. That also means you can use array types as parameter types and return types, and the data is carried across method calls far more easily than with array objects. You also never have to worry about destroying an object; array variables, like any other variable, use automatic memory management.

Struct types also deprecate the legacy Type/End_Type structures. Anyone who's ever used Type/End_Type structures knows what a pain they were to deal with. Native struct types are far better.

There are other advantages that you may not think of right away, such as interoperability and integrated support for COM arrays and structs, complex structured types with XML web services and SOAP, and compatibility with C-style arrays and struct types for easier DLL calls and use of the Win32 API.

In this multi-part series we'll explore structs and arrays in more depth. We'll start with the basics you need to know.

Arrays
You can use static or dynamic arrays. The only difference is whether the array has a fixed number of elements or can grow dynamically. Dynamic arrays are definitely preferred and should be your first choice. There's really no performance difference, so there's no reason to sacrifice the flexibility of dynamic arrays.

Static arrays are useful when declaring an array member in a struct that must be compatible with C-style arrays and structs. Unless you're defining such a struct, for use with a DLL for example, you should always use a dynamic array.
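
A static array is declared by putting the element count inside the brackets. The following is only a sketch: tPointList and its members are hypothetical names for illustration, not taken from any real DLL interface.
Code:
// Static array: always exactly 5 elements
Integer[5] fixedNumbers

// Hypothetical struct with a fixed-size array member
Struct tPointList
    Integer iCount
    Integer[8] iCoordinates
End_Struct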

Dynamic arrays are created like this:
Code:
String[] customerNames
Integer[] myIntArray

Move "Joe" to customerNames[0]
Move 1234 to myIntArray[0]
You can create arrays out of any supported data type. The question of naming conventions often comes up. There's no hard requirement, but the style we usually (but not always) follow is a descriptive name in plural form (optionally including the word "array" if that makes more sense), with no specific prefix to designate an array.

You can use multi-dimensional arrays like this:
Code:
Integer[][] lotteryNumbers
You can even have more than two dimensions, but in many cases multi-dimensional arrays should be avoided in favor of struct types. More about that in part II.
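
Elements of a multi-dimensional array are addressed with one index per dimension. A minimal sketch, where the procedure name and the values are made up for illustration:
Code:
Procedure FillLotteryNumbers
    Integer[][] lotteryNumbers

    // First index selects the draw, second index the number within that draw
    Move 7  to lotteryNumbers[0][0]
    Move 21 to lotteryNumbers[0][1]
    Move 3  to lotteryNumbers[1][0]
End_Procedure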

Dynamic arrays grow automatically as needed when an element is assigned in a Move statement. Internally, the array's capacity is expanded in progressively larger chunks rather than one element at a time, so adding elements to a dynamic array is typically very fast. Capacity is not to be confused with the size of the array: the size is never larger than what you have assigned or specified, while the internal capacity may be larger than the current size.
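
To illustrate the difference between size and capacity, here is a small sketch that uses the SizeOfArray() function to read the current size. The procedure name and the data are hypothetical.
Code:
Procedure ShowArraySize
    String[] customerNames
    Integer iCount

    // Each assignment to a new index grows the array by one element
    Move "Joe"  to customerNames[0]
    Move "Jane" to customerNames[1]
    Move "Jim"  to customerNames[2]

    // SizeOfArray reports the elements in use (3 here), not the internal capacity
    Move (SizeOfArray(customerNames)) to iCount
    Showln "Elements: " iCount
End_Procedure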

You can also grow and shrink the array directly, using the ResizeArray() function. This is often used to "clear" the array and remove all elements in one go, resizing the array to zero elements.
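
A minimal sketch of both uses, assuming the usual pattern of moving the function's result back into the same variable (the procedure name is hypothetical):
Code:
Procedure ResizeExample
    Integer[] myIntArray

    // Grow the array to 10 elements
    Move (ResizeArray(myIntArray, 10)) to myIntArray

    // "Clear" the array by resizing it to zero elements
    Move (ResizeArray(myIntArray, 0)) to myIntArray
End_Procedure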

In Visual DataFlex 2009 we also introduced two new functions: InsertInArray(), which inserts an element at a specified index and expands the array, and RemoveFromArray(), which removes the element at a specified index and contracts the array.
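
As a rough sketch, assuming the argument order is (array, index, value) for InsertInArray() and (array, index) for RemoveFromArray(); the procedure name and data are made up:
Code:
Procedure InsertRemoveExample
    String[] customerNames

    Move "Joe" to customerNames[0]
    Move "Jim" to customerNames[1]

    // Insert "Jane" at index 1; "Jim" moves up to index 2
    Move (InsertInArray(customerNames, 1, "Jane")) to customerNames

    // Remove the element at index 0; the remaining elements move down
    Move (RemoveFromArray(customerNames, 0)) to customerNames
End_Procedure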

Struct Types
Struct types are used to keep related values together, like a database record, so you can conveniently pass the whole record around as a parameter or return value, for example.

Code:
Struct tCustomerInfo
    String sFirstName
    String sLastName
    Number nBalance
End_Struct

Procedure Foo
    tCustomerInfo customer
    Move "Joe" to customer.sFirstName
    Move "Johnsson" to customer.sLastName
    ...
End_Procedure
As for naming conventions, again there's no hard rule, but our recommendation is to prefix the struct type name with a lowercase t and to use no specific struct prefix for variable names. Instead, use a descriptive variable name that makes it clear which struct type is used. This is often misunderstood, but the basic idea is that the fact that a variable is a struct is of little interest compared to which struct type it is. Conceptually, it's more important to convey that a variable holds customer information than that it is some sort of struct. Prefixing the struct type name with a lowercase t helps reduce the possibility of name clashes with other globally defined constructs. Remember, this is only a recommendation that we try to follow ourselves (but not always), and it should not be taken as a requirement.

Struct & Array Together in Harmony
You can of course use struct types and arrays together, and that's when it becomes really powerful:
Code:
Struct tCustomerInfo
    String sFirstName
    String sLastName
    Number nBalance
End_Struct

Struct tOrderItem
    String sName
    Number nPrice
End_Struct

Struct tOrderInfo
    tCustomerInfo customer
    tOrderItem[] orderItems
End_Struct

Function CreateOrder ... Returns tOrderInfo
....
As you can see, the possibilities are endless: you can nest structs and arrays and still keep everything organized. With the familiar dot syntax, accessing and referencing deep structures becomes a breeze.
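
Here is a short sketch of the dot syntax in action, using the struct definitions above. The procedure name, variable name and values are hypothetical.
Code:
Procedure BuildOrder
    tOrderInfo newOrder

    Move "Joe"      to newOrder.customer.sFirstName
    Move "Johnsson" to newOrder.customer.sLastName

    // Assigning to index 0 grows the nested array automatically
    Move "Widget" to newOrder.orderItems[0].sName
    Move 9.95     to newOrder.orderItems[0].nPrice
End_Procedure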

A little-known trick is that you can also easily create tree structures by making a struct recursive through an array member. The value tree structure used internally for web services and XML serialization looks like this:

Code:
Struct tValueTree
    String sValue
    tValueTree[] children
End_Struct
This works as long as the recursive struct member is a dynamic array. You cannot create a direct recursive struct member, as that would create a struct of infinite size. But with a dynamic array, the size of the member itself is fixed, and the array's elements are stored indirectly so their number can change dynamically.
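
Filling such a tree needs nothing beyond the usual dot and index syntax. A minimal sketch with made-up values and a hypothetical procedure name:
Code:
Procedure BuildTree
    tValueTree tree

    Move "root"       to tree.sValue
    // Each level is just another dynamic array of the same struct type
    Move "child A"    to tree.children[0].sValue
    Move "child B"    to tree.children[1].sValue
    Move "grandchild" to tree.children[0].children[0].sValue
End_Procedure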

Struct & Array as Parameter and Return Type
Just as you can create local variables of struct and array types, you can also declare parameters and return types of struct and array types. This offers enormous flexibility. Struct and array parameters and return values work just like any other parameter and return value: you never have to worry about memory management, and parameters are always passed by value.
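
As a sketch, a function can take an array of structs and return a single struct. It uses the tCustomerInfo type from earlier; FindFirstCustomer, ShowFirstCustomer and the variable names are hypothetical.
Code:
Function FindFirstCustomer tCustomerInfo[] customers Returns tCustomerInfo
    tCustomerInfo customer

    // Return the first element, or an empty struct if the array has no elements
    If (SizeOfArray(customers) > 0) Begin
        Move customers[0] to customer
    End
    Function_Return customer
End_Function

Procedure ShowFirstCustomer
    tCustomerInfo[] customers
    tCustomerInfo firstOne

    Move "Joe" to customers[0].sFirstName
    Get FindFirstCustomer customers to firstOne
    Showln firstOne.sFirstName
End_Procedure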

Now you might think that passing huge structures and arrays around by value would carry a performance penalty, but the values are not actually copied, even though copy semantics are retained. This is accomplished with a built-in copy-on-write optimization: all struct and array values are passed around by reference internally until an attempt is made to modify the value, whereupon a copy is created first. All this occurs behind the scenes, and you don't have to worry about it. It even applies to return values, so there's no performance penalty for returning a huge struct or array. In a rare feat, you get to have your cake and eat it too: the performance as well as the copy semantics.
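
The copy semantics can be seen in a small sketch like the one below, where the callee modifies its parameter and the caller's array stays as it was. ClearNames and TestCopySemantics are hypothetical names.
Code:
Procedure ClearNames String[] names
    // This modifies only the procedure's own copy of the array
    Move (ResizeArray(names, 0)) to names
End_Procedure

Procedure TestCopySemantics
    String[] customerNames

    Move "Joe" to customerNames[0]
    Send ClearNames customerNames

    // The caller's array is untouched and still holds one element
    Showln (SizeOfArray(customerNames))
End_Procedure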

In part II we'll look at sorting and searching, and how to choose between using a struct or array.

Updated 3-Sep-2009 at 06:48 PM by Sonny Falk

Comments

  1. chuckatkinson
    Thanks Sonny. More enlightening than the help.
  2. Chris Spencer
    Just curious, I use the valueTree in a few places to manage XML transforms etc but why is it always prefixed with the word "infamous"?
  3. Sonny Falk
    Quote Originally Posted by Chris Spencer
    Just curious, I use the valueTree in a few places to manage XML transforms etc but why is it always prefixed with the word "infamous"?
    Good question. My bad, it's an old habit from way back when we came up with the trick. It has since become a fully supported technique, and there's nothing bad about it at all; quite the opposite, it's a very useful and elegant technique. The tValueTree was also, at one time, one of the most famous struct types that nobody knew about, and I got many questions about it back when it was considered more of a trick. Nowadays I think it's even documented. In fact, I'll edit the article to remove that word. Thanks for pointing it out.
  4. Clive Richmond
    Thanks Sonny for yet another informative blog.

    FWIW we've adopted the prefix rg when naming arrays and the prefix l for naming local structs.

    Code:
    Struct tColorPalette
        String  sName
        Integer iContainerColor
        . . . . 
    End_Struct // tColorPalette
    
    
    Property tColorPalette   ptColorPalette 
    Property tColorPalette[] ptrgColorPalette
    Property String[]        psrgNames
    
    
    // Save_Color_Palette:
    //
    Procedure Save_Color_Palette
        tColorPalette ltColorPalette 
        tColorPalette[] ltrgColorPalette 
        String[] srgNames
    
        Get ptColorPalette To ltColorPalette
        Get ptrgColorPalette To ltrgColorPalette
        Get psrgNames To srgNames
    
        Move C_TA_DEFAULT To ltColorPalette.sName
        Move C_TA_DEFAULT To ltrgColorPalette[0].sName
        Move C_TA_DEFAULT To srgNames[0]
  5. Anders Ohrt
    Coding style and naming conventions are always controversial, but FWIW we've gone with 'a' for array. We almost never use the Address type, when we do it's 'p' as in Pointer so there is no collision. So a string array would be saData, an integer array array would be iaaNumberMap, and a property boolean array array array would be pbaaaMagic.

    Also, it was an inconsistent decision for DAW to prefix structs. Structs are types, just like Integer and String, and they are not prefixed. We prefix our _variables_ with 't', and we suffix our structs with 'Struct'.
    Code:
    Struct SomeDataStruct
        ...
    End_Struct
    
    SomeDataStruct tSomeData
    SomeDataStruct[][] taaSomeMoreData
  6. Jakob Kruse
    You guys should really read the book "Clean Code" (http://www.amazon.com/Clean-Code-Han.../dp/0132350882), in particular what it has to say about naming. It is one of the best books I've ever read. Highly recommended.

    In summary, using naming prefixes or suffixes such as those from Hungarian notation lowers readability and thus quality of code, and should therefore be avoided. It also promotes poor naming.

    The Clean Code book does illustrate quite a few deficiencies in a language like DataFlex (it left me wishing for "VDF Pro"), but although naming collision is poorly handled in DataFlex and a real problem, using Hungarian notation and the like is a poor solution.

    Didn't mean to step on anyone's toes, but the book was a real eye-opener for me.