Development Team Blog

Sequential Reading Data

We all know DataFlex is a great product for building database applications. Most of the data will be entered by end-users and then printed or otherwise processed. But what if you need to import data that comes from other manufacturers? What do we have to do?

Sequential I/O commands
When we want to read data in a DataFlex application we can choose between the commands Read, ReadLn, Read_Block and Read_Hex. Of course, before you can read, the input stream (usually a file) needs to be opened with the Direct_Input command.

I/O Channels
DataFlex supports up to 10 concurrent input or output devices. This can be 1 input and 9 output, 5 input and 5 output, you name it, as long as the total does not exceed 10. You are free to choose your own channel number. When it is omitted, channel 0 is used for input and channel 1 for output; this was done for backward compatibility.
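
As a small illustration of those defaults, the sketch below omits the channel, so input goes through channel 0 and output through channel 1. The file names and the variable sLine are illustrative only.

Code:
Direct_Input "Customer.Txt"     // No channel given: channel 0 is used
Direct_Output "Copy.Txt"        // No channel given: channel 1 is used
Readln sLine                    // Reads from channel 0
Writeln sLine                   // Writes to channel 1
Close_Input
Close_Output
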
Specifying a channel number yourself is possible, but in a good object-oriented program it is not wise to do so and you should make use of channel management. This can be done via the functions in the package seq_chnl.pkg. After including the package you request the next free channel number, use it and release the channel when you are done.

Code:
Use Seq_Chnl.Pkg
:
Move (Seq_New_Channel ()) to iChannel
If (iChannel >= 0) Begin
    Direct_Input Channel iChannel sFileName
    :
    Close_Input Channel iChannel
    Send Seq_Release_Channel iChannel
End

When you forget to release the channel, your application keeps consuming channels until all 10 are in use, and after that opening a file no longer works. In that case the value returned instead of a channel number is an error status ID (which is why the code above tests for a value of 0 or higher), and what happens next depends on your code.
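
A minimal sketch of handling that error status explicitly is shown below. It assumes that seq_chnl.pkg defines the constant DF_SEQ_CHANNEL_NOT_AVAILABLE for the "no free channel" case and that Stop_Box is available in your application; check the package source of your DataFlex version. The message text is only illustrative.

Code:
Use Seq_Chnl.Pkg
:
Move (Seq_New_Channel ()) to iChannel
If (iChannel = DF_SEQ_CHANNEL_NOT_AVAILABLE) Begin
    // Assumption: this constant signals that all channels are in use
    Send Stop_Box "No free I/O channel available." "Import"
End
Else Begin
    Direct_Input Channel iChannel sFileName
    :
    Close_Input Channel iChannel
    // Always give the channel back, otherwise it stays occupied
    Send Seq_Release_Channel iChannel
End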

Read
The Read command is a special one: it is meant for reading from CSV files. CSV originally stood for Comma Separated Values, and thus the command reads until a comma is found in the input stream. When it finds one, the comma is not placed in the destination variable but skipped. If the data is surrounded by double quotes, these characters are stripped as well.

Code:
Direct_Input Channel iChannel sFileName
While (Not (Seqeof))
    Read Channel iChannel sCustomerId
    Read Channel iChannel sCustomerName
    Readln Channel iChannel
Loop
Close_Input Channel iChannel

ReadLn
The ReadLn command, also shown in the code above, reads from the current position in the input stream up to the end-of-line marker in the file. The EOL marker is usually a CR/LF character pair, but a different marker can be specified with the Direct_Input command.

Code:
Direct_Input Channel iChannel "Customer.Txt EOL:14"
Using this is rare these days, with the exception of binary data, where you usually do not want EOL and/or EOF detection.

Read_Block
With the Read_Block command you can read X bytes from the input stream. X is not limited, but it should be greater than 0 and usually you also want it less than the argument size (by default 64kB).

Code:
Direct_Input Channel iChannel sFileName
Get_Channel_Size iChannel to iSize
Get_Argument_Size to iArgSize
While (iSize > 0)
    Move (iSize Min iArgSize) to iDataSize
    Read_Block Channel iChannel sData iDataSize
    // Process the data in sData
    Move (iSize - iDataSize) to iSize
Loop
Close_Input Channel iChannel

Read_Hex
This is a very special command. It reads two bytes from the input stream and checks, by string comparison, whether the data is greater than or equal to 00 and less than or equal to FF. If so, the two bytes are converted to one ASCII character. If they are not in the range 00-FF (for example when it reads something like the "Wo" in "Worldwide"), the reading stops and the channel position is moved back to where it was before the "wrong" character pair was read.
The command is the counterpart of Write_Hex and is designed for escaping special characters, such as the CR characters that end up in a file when a memo column is exported.
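
Since Read_Hex is the counterpart of Write_Hex, the writing side of such a file could look like the sketch below. The variable names are illustrative and the field layout (ID, name, hex-encoded comments) is an assumption rather than a prescribed format.

Code:
Direct_Output Channel iChannel sFileName
// Repeat the next four lines for each customer to export
Write Channel iChannel iCustomerId ","
// Assumption: the name itself does not contain a comma
Write Channel iChannel sCustomerName ","
// The memo text is written as hex pairs, so embedded CR characters cannot break the line
Write_Hex Channel iChannel sCustomerComments
Writeln Channel iChannel
Close_Output Channel iChannel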

Code:
Direct_Input Channel iChannel sFileName
While (Not (Seqeof))
    Read Channel iChannel iCustomerId
    Read Channel iChannel sCustomerName
    Read_Hex Channel iChannel sCustomerComments
    ReadLn Channel iChannel
Loop
Close_Input Channel iChannel

Different Separator
As mentioned before, the Read command looks for a comma to separate the fields in the data. What if your data does not use a comma? That is not as strange as it sounds: Microsoft Excel uses a semicolon (in fact the Windows List Separator) for field separation, and there are also tools that use a TAB character.

If your data contains a TAB character as separator you should read and make use of this KB item.

If your data contains the Windows List separator you should read and make use of this KB item.

If you want to get the Windows List separator you can find the code in this KB item.
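
Independent of those KB items, one way to handle a different separator is to read the whole line with ReadLn and split it yourself. The sketch below assumes a semicolon separator, no quoted fields, and illustrative variable names (sLine, sField, iPos); it is not necessarily the approach the KB items describe.

Code:
String sLine sField
Integer iPos
:
Readln Channel iChannel sLine
While (sLine <> "")
    // Find the next separator in what is left of the line
    Move (Pos (";", sLine)) to iPos
    If (iPos > 0) Begin
        Move (Left (sLine, iPos - 1)) to sField
        Move (Right (sLine, Length (sLine) - iPos)) to sLine
    End
    Else Begin
        // No separator left: the remainder is the last field
        Move sLine to sField
        Move "" to sLine
    End
    // Process sField here
Loop

Note that this simple loop drops an empty field at the very end of a line (a line that ends with the separator), so add a check for that if your data can contain it.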

OEM or ANSI?
When data that needs to be imported is delivered to you, you need to know which character set it uses. Internally, DataFlex strings are considered to be OEM strings, so the standard import commands simply read the data as if it were OEM data. This means that when you know it is ANSI data, you need to use ToOem() on the data. If you do not do that and you save the strings to the DataFlex embedded database, you will end up with a mix of OEM and ANSI data, which will cause conversion problems later.

Code:
Readln Channel iChannel sData
Move (ToOem (sData)) to sData

It is not possible to tell from the data itself whether it is OEM or ANSI.

One more thing: OEM is a codepage, and so is ANSI. The data should be written and read with the same codepage set, otherwise one or more characters will come out wrong.