Recursive Use of Structs and Arrays
by
, 8-Jul-2012 at 10:15 AM (14812 Views)
In a recent post at the forums the question was given how to make use of a recursive struct. Years ago I wrote a training manual about Structs and Arrays and in one of the exercises the trainees had to write a routine that could read the files in a folder from a disk. A folder can have 0-N files and 0-N sub-folders. In this blog I will explain you how you can build the code to read the information using arrays and structs.
Nested Structs
Because a folder can have 0-N files and sub-folders we need a struct where a member refers to itself as a folder can have sub-folders. We need a struct where one of the members is an array of the same struct:
Without the array marker on the third member the code would create a infinitive recursion which is of course not allowed. A struct is a data definition as integer, real, number, decimal, pointer etc. so to use the struct you need to create a variable.Code:Struct tFolder String sFolderName String[] sFiles tFolder[] Folders End_Struct
The variable is not an array itself as we can start with a single folder (e.g. c:\program files). Accessing a nested member can be done in static mode or by reference. If you would need to write in a static mode the code would be:Code:Procedure Test tFolder MyFolder End_Procedure
While it works it is quite clumsy and if the depth is dynamic writing code this way cannot be done.Code:Move 'Visual DataFlex' to MyFolder.Folders[0].Folders[0].sFolderName
The Solution
The solution lies in making use of recursion and referencing members by reference. Recursion in writing a procedure or function that calls itself and by reference in passing a member of the struct variable via its by reference marker to the recursive routine.
If you call a procedure or a function is a recursive way you need to have a way to terminate the recursion else you will run out of stack space and get a very nice GPF (General Protection Fault). For the exercise to read files and folders from a disk the operating system determines the number of recursions as at a certain moment you won't have sub-folders anymore. Hopefully you reach that point before you run out of stack space or - in general - run out of memory. Memory as you should know is not unlimited, not in real, not on a 32 bit application.
Read Disk Information
Lets take a look at how we can read disk information into the variable. We can make use of the build in sequential I/O statements of Visual DataFlex (Direct_Input, ReadLn, SeqEof etc). This means that the following code can be used to read the files of a folder:
The above code reads the files in a folder in the string array member names sFiles and the names of the sub-folders in an array member named Folders. Now how to continue reading? How to get the contents of the nested folders? Add the following code between the last end statement and the End_ProcedureCode:Procedure ReadDirectory String sDir String sDirEntry Integer iChannel iFolderItem tFolder MyFolder Get Seq_New_Channel to iChannel If (iChannel <> DF_SEQ_CHANNEL_NOT_AVAILABLE) Begin // Open an input channel with the DIR: device plus the directory path Direct_Input channel iChannel ("DIR:" + sDir + "*.*") While (not (SeqEof)) // Read an entry from the directory Readln channel iChannel sDirEntry // Only continue if you have not found end of the directory If (not (SeqEof)) Begin // If the entry is not a folder If (Left (sDirEntry, 1) <> '[') Begin // Store the name of file Move sDirEntry to MyFolder.sFiles[SizeOfArray(MyFolder.sFiles)] End Else Begin // Only store folders that are not [.] and [..] If (Left (sDirEntry, 2) <> '[.') Begin // Retrieve the current number of folders Move (SizeOfArray (MyFolder.Folders)) to iFolderItem // Add the directory (plus full path) to the array of folders. Without the square brackets! Move (sDir + Mid (sDirEntry, Length (sDirEntry) - 2, 2)) to MyFolder.Folders[iFolderItem].sFolderName End End End Loop // Close the I/O channel Close_Input channel iChannel // Release the channel Send Seq_Release_Channel iChannel End End_Procedure Send ReadDirectory "c:\program files\"
The above code will loop through each of the sub-folders and call itself for this sub-folder. In the iteration it might find sub-folders again so the ReadDirectory can be easily called often. As a tip, don't start "c:\" or even "c:\program files\" but something more simple such as "c:\Visual DataFlex Examples\" or "c:\tmp\" (f you have a temp folder on the root of your disk).Code:// Determine the number of foldersMove (SizeOfArray (MyFolder.Folders)) to iFolderItems Decrement iFolderItems // Loop through the folders and enumerate each of them recursively For iFolderItem from 0 to iFolderItems Send ReadDirectory (MyFolder.Folders[iFolderItem].sFolderName + '\') Loop
Now, how do you get everything in one array with information as above code will build up arrays but as they are not kept somewhere you will loose the information when the recursion ends. The "trick" as mentioned before is by passing a variable via by reference. For this change two things. The first change is the line where the ReadDirectory is declared. Change it into:
The second change is where the ReadDirectory is invoked. Change that/those lines into:Code:Procedure ReadDirectory String sDir tFolder BYREF MyFolder
This means that the initial call needs to use a variable. Change:Code:Send ReadDirectory (MyFolder.Folders[iFolderItem].sFolderName + '\') (&MyFolder.Folders[iFolderItem])
intoCode:Send ReadDirectory "c:\program files\"
Each time the procedure calls itself a reference to a deeper member is passed. So the first time MyFolder is passed, the second time MyFolder.Folders[N], the third time MyFolder.Folders[N].Folders[N]. On return of the original instruction the array of tFolder contains all files and sub-folders of the passed root folder.Code:tFolder MyFolder Send ReadDirectory "c:\program files\" (&MyFolder)
Count Files and Folders
To count the total number of files and folders in the MyFolder variable you can make use of a procedure that takes three arguments by reference. The first is the MyFolder array, the second a reference to an integer for counting files and the third a reference to an integer to count the folders. The method and the call can be written as follow:
As you can see this method calls itself recursively as well. Because the array is passed by reference it does not take up more memory.Code:// This routine enumerates the recursive struct to count the files and folders Procedure CountFilesAndFolders tFolder ByRef MyFolders Integer ByRef iFiles Integer ByRef iFolders Integer iFolderItems iFolder // We don't need to count the items; there is a nice build in function for this Move (iFiles + SizeOfArray (MyFolders.Files)) to iFiles // Get the number of folders first and store this in a variable because we need to enumerate from there Move (SizeOfArray (MyFolders.Folders)) to iFolderItems // Add the folders (not decremented) to the number of folders Move (iFolders + iFolderItems) to iFolders // Decrement the count because we loop from 0 to N Decrement iFolderItems // Loop the contents of the array recursively. For iFolder from 0 to iFolderItems Send CountFilesAndFolders (&MyFolders.Folders[iFolder]) (&iFiles) (&iFolders) Loop End_Procedure
Conclusion
Once you understand that you can use recursion and by reference it is quite easy to make complex routines like this. The code won't execute as fast if you would write it in a 3rd generation language as 'C' but it is much easier as you don't have to do all the memory operations and let the DataFlex 4GL take care of that.
Happy coding, hope this will contribute to keep you busy with Visual DataFlex!