View RSS Feed

Development Team Blog

Recursive Use of Structs and Arrays

Rate this Entry
In a recent post at the forums the question was given how to make use of a recursive struct. Years ago I wrote a training manual about Structs and Arrays and in one of the exercises the trainees had to write a routine that could read the files in a folder from a disk. A folder can have 0-N files and 0-N sub-folders. In this blog I will explain you how you can build the code to read the information using arrays and structs.

Nested Structs
Because a folder can have 0-N files and sub-folders we need a struct where a member refers to itself as a folder can have sub-folders. We need a struct where one of the members is an array of the same struct:
Code:
Struct tFolder
    String sFolderName
    String[] sFiles
    tFolder[] Folders
End_Struct
Without the array marker on the third member the code would create a infinitive recursion which is of course not allowed. A struct is a data definition as integer, real, number, decimal, pointer etc. so to use the struct you need to create a variable.
Code:
Procedure Test
    tFolder MyFolder
End_Procedure
The variable is not an array itself as we can start with a single folder (e.g. c:\program files). Accessing a nested member can be done in static mode or by reference. If you would need to write in a static mode the code would be:
Code:
Move 'Visual DataFlex' to MyFolder.Folders[0].Folders[0].sFolderName
While it works it is quite clumsy and if the depth is dynamic writing code this way cannot be done.

The Solution
The solution lies in making use of recursion and referencing members by reference. Recursion in writing a procedure or function that calls itself and by reference in passing a member of the struct variable via its by reference marker to the recursive routine.

If you call a procedure or a function is a recursive way you need to have a way to terminate the recursion else you will run out of stack space and get a very nice GPF (General Protection Fault). For the exercise to read files and folders from a disk the operating system determines the number of recursions as at a certain moment you won't have sub-folders anymore. Hopefully you reach that point before you run out of stack space or - in general - run out of memory. Memory as you should know is not unlimited, not in real, not on a 32 bit application.

Read Disk Information
Lets take a look at how we can read disk information into the variable. We can make use of the build in sequential I/O statements of Visual DataFlex (Direct_Input, ReadLn, SeqEof etc). This means that the following code can be used to read the files of a folder:
Code:
Procedure ReadDirectory String sDir
    String sDirEntry 
    Integer iChannel iFolderItem
    tFolder MyFolder
    
    Get Seq_New_Channel to iChannel
    If (iChannel <> DF_SEQ_CHANNEL_NOT_AVAILABLE) Begin
        // Open an input channel with the DIR: device plus the directory path
        Direct_Input channel iChannel ("DIR:" + sDir + "*.*")
        While (not (SeqEof))
             // Read an entry from the directory
             Readln channel iChannel sDirEntry
             // Only continue if you have not found end of the directory
             If (not (SeqEof)) Begin
                 // If the entry is not a folder
                 If (Left (sDirEntry, 1) <> '[') Begin
                     // Store the name of file
                     Move sDirEntry to MyFolder.sFiles[SizeOfArray(MyFolder.sFiles)]
                 End
                 Else Begin
                     // Only store folders that are not [.] and [..]
                     If (Left (sDirEntry, 2) <> '[.') Begin
                         // Retrieve the current number of folders
                         Move (SizeOfArray (MyFolder.Folders)) to iFolderItem
                         // Add the directory (plus full path) to the array of folders. Without the square brackets!
                         Move (sDir + Mid (sDirEntry, Length (sDirEntry) - 2, 2)) to MyFolder.Folders[iFolderItem].sFolderName
                     End
                 End
             End
        Loop
        // Close the I/O channel
        Close_Input channel iChannel
        // Release the channel
        Send Seq_Release_Channel iChannel
    End
End_Procedure

Send ReadDirectory "c:\program files\"
The above code reads the files in a folder in the string array member names sFiles and the names of the sub-folders in an array member named Folders. Now how to continue reading? How to get the contents of the nested folders? Add the following code between the last end statement and the End_Procedure
Code:
// Determine the number of foldersMove (SizeOfArray (MyFolder.Folders)) to iFolderItems
Decrement iFolderItems
// Loop through the folders and enumerate each of them recursively
For iFolderItem from 0 to iFolderItems
    Send ReadDirectory (MyFolder.Folders[iFolderItem].sFolderName + '\')
Loop
The above code will loop through each of the sub-folders and call itself for this sub-folder. In the iteration it might find sub-folders again so the ReadDirectory can be easily called often. As a tip, don't start "c:\" or even "c:\program files\" but something more simple such as "c:\Visual DataFlex Examples\" or "c:\tmp\" (f you have a temp folder on the root of your disk).

Now, how do you get everything in one array with information as above code will build up arrays but as they are not kept somewhere you will loose the information when the recursion ends. The "trick" as mentioned before is by passing a variable via by reference. For this change two things. The first change is the line where the ReadDirectory is declared. Change it into:
Code:
Procedure ReadDirectory String sDir tFolder BYREF MyFolder
The second change is where the ReadDirectory is invoked. Change that/those lines into:
Code:
Send ReadDirectory (MyFolder.Folders[iFolderItem].sFolderName + '\') (&MyFolder.Folders[iFolderItem])
This means that the initial call needs to use a variable. Change:
Code:
Send ReadDirectory "c:\program files\"
into
Code:
tFolder MyFolder
Send ReadDirectory "c:\program files\" (&MyFolder)
Each time the procedure calls itself a reference to a deeper member is passed. So the first time MyFolder is passed, the second time MyFolder.Folders[N], the third time MyFolder.Folders[N].Folders[N]. On return of the original instruction the array of tFolder contains all files and sub-folders of the passed root folder.

Count Files and Folders
To count the total number of files and folders in the MyFolder variable you can make use of a procedure that takes three arguments by reference. The first is the MyFolder array, the second a reference to an integer for counting files and the third a reference to an integer to count the folders. The method and the call can be written as follow:
Code:
// This routine enumerates the recursive struct to count the files and folders
Procedure CountFilesAndFolders tFolder ByRef MyFolders Integer ByRef iFiles Integer ByRef iFolders
    Integer iFolderItems iFolder

    // We don't need to count the items; there is a nice build in function for this
    Move (iFiles + SizeOfArray (MyFolders.Files)) to iFiles
    // Get the number of folders first and store this in a variable because we need to enumerate from there
    Move (SizeOfArray (MyFolders.Folders)) to iFolderItems
    // Add the folders (not decremented) to the number of folders
    Move (iFolders + iFolderItems) to iFolders
    // Decrement the count because we loop from 0 to N
    Decrement iFolderItems
    // Loop the contents of the array recursively.
    For iFolder from 0 to iFolderItems
        Send CountFilesAndFolders (&MyFolders.Folders[iFolder]) (&iFiles) (&iFolders)
    Loop
End_Procedure
As you can see this method calls itself recursively as well. Because the array is passed by reference it does not take up more memory.

Conclusion
Once you understand that you can use recursion and by reference it is quite easy to make complex routines like this. The code won't execute as fast if you would write it in a 3rd generation language as 'C' but it is much easier as you don't have to do all the memory operations and let the DataFlex 4GL take care of that.

Happy coding, hope this will contribute to keep you busy with Visual DataFlex!

Comments