Development Team Blog

How to make an XML file look 'nice'?

This blog is about XML readability. Since 1999 DataFlex has been able to read and write XML via classes defined in FleXML.pkg (cXMLDomDocument etc.). One of the most frequent remarks in support questions is how to make XML readable for humans, because Microsoft XML writes the XML data as one long line of text. While that is valid XML, most developers want to be able to read the content in an editor. You can easily format the XML file with tools like Notepad++, but how do you do this from your own code?

Knowledge Base
As mentioned, this question comes up often, and in 2002 I wrote a knowledge base article about it named "Add formatting to an XML file". While that approach works, it still feels unpleasant that one needs to go this way.

Microsoft has a class!
At the time I wrote the knowledge base item there was no other way, but with newer versions of MSXML you can let a module do it for you. I found the information in a Stack Overflow topic named "Forcing MSXML to format XML output with indents and newlines". With a bit of testing I was able to make this work in DataFlex. The following steps need to be taken.

Import the MSXML6.0 COM library
The article uses an MXXMLWriter and a SAXXMLReader object, which are defined in the Microsoft XML 6.0 object library. This means you need to create COM classes by importing the MSXML 6.0 automation library. Go to "create new", "class", "import COM automation" and find "Microsoft XML, v6.0 (version 6.0)" in the list. If you select this entry you will see that the Studio will create a package called MSXML6.pkg in your workspace.

If you compile this package into your application you will get two compiler errors, which can be "fixed" by adding the following line BEFORE the Use statement for the package.
Code:
Define OLE_VT_UI8 for 21
Converting the VB code to DataFlex code
It is not difficult to convert the code for reading and parsing the XML to DataFlex. You need to create an object of the cComSAXXMLReader60 class and one of the cComMXXMLWriter60 class. In the following code they are created via the Create function. Because these classes are automation classes, creating the DataFlex object does not create the COM object (the peAutoCreate property controls this and is set to acNoAutoCreate by default). This means that you need to send a CreateComObject message to the DataFlex objects. After that you can get the COM object reference of the writer (pvComObject) to connect reader and writer together.
Code:
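Procedure WriteXMLFormatted Global String sFile
    Handle hoReader hoWriter
    Variant vWriter vData

    // Create the DataFlex wrapper and the underlying COM object for the SAX reader
    Get Create (RefClass (cComSAXXMLReader60)) to hoReader
    Send CreateComObject of hoReader

    // Same for the writer; keep its COM reference to hand to the reader
    Get Create (RefClass (cComMXXMLWriter60)) to hoWriter
    Send CreateComObject of hoWriter
    Get pvComObject of hoWriter to vWriter

    // Writer settings: UTF-8, indented, no BOM, no XML declaration
    Set ComStandalone of hoWriter to True
    Set ComByteOrderMark of hoWriter to False
    Set ComEncoding of hoWriter to "utf-8"
    Set ComIndent of hoWriter to True
    Set ComOmitXMLDeclaration of hoWriter to True

    // Route the reader's SAX events to the writer
    Set ComContentHandler of hoReader to vWriter
    Set ComDtdHandler of hoReader to vWriter
    Set ComErrorHandler of hoReader to vWriter

    // Also pass lexical and declaration events to the writer
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/lexical-handler" vWriter
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/declaration-handler" vWriter

    // Output of the writer (this line is changed later in this article)
    Get ComOutput of hoWriter to vData

    // Parse the file; the writer receives the events and produces the formatted XML
    Send ComParseURL of hoReader sFile

    // Release the COM objects, then destroy the DataFlex objects
    Send ReleaseComObject of hoWriter
    Send ReleaseComObject of hoReader

    Send Destroy of hoReader
    Send Destroy of hoWriter
End_Procedure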
Don't just copy and paste the above code, as it is not finished yet.

Take an XML file that is not "pretty", such as a DBE-Filter file, which looks like the attached thumbnail "xml raw.png".

Add a line of code to use this file such as:
Code:
Send WriteXMLFormatted "C:\Order Entry\Data\vendor.DBE-Filter"
The WriteXMLFormatted procedure does not write the contents back to disk; it gets the information into a variant string. The writer has no write-to-file function. The ComOutput can be retrieved into a string, as the current code does, or connected to a Stream or an XMLDomDocument object. You don't want to use the XMLDomDocument, as it will destroy the nice formatting again, just as it does when you open a nicely formatted file and save it.

Is a String OK?
So is a string OK? No, it is not. There are two problems with the string approach.
  1. The string is limited to the DataFlex argument size (default 65k), which means that anything beyond that will be truncated and the result will not be a well-formed XML file.
  2. As soon as the string is written to disk with the DataFlex Write command it will be converted from Unicode to OEM.
If the XML data is less than 65k and there are no OEM conversion issues in your environment, you could use the standard Direct_Output, Write and Close_Output statements.
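For that small-file case, a minimal sketch of the sequential I/O route could look like the following. The procedure name WriteStringToFile and its parameters are hypothetical and only serve the illustration. It assumes sData already holds the formatted XML, read from ComOutput after the parse has run, and that both limitations listed above are acceptable.

```dataflex
// Sketch only: write an already formatted XML string to disk with sequential I/O.
// WriteStringToFile is a hypothetical helper name, not part of any package.
// Limitations: sData is bound by the argument size and is converted to OEM on Write.
Procedure WriteStringToFile Global String sFile String sData
    Direct_Output sFile    // open (and overwrite) the output file
    Write sData            // write the string as-is, no line feed added
    Close_Output           // flush and close the file
End_Procedure
```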

Stream object
Because of the string limitations you should look at the Stream object that is also used in the sample code. How do I create such an object? It is not available as standard in DataFlex, because sequential I/O commands have been built into the language for over 35 years.

The sample code uses an object of the ADODB.Stream class. If you go to "import COM automation" again (as for the MSXML6 class) and browse the list, you won't find it. The solution for this is the "browse" button at the bottom of the dialog. After clicking the button a Windows Common File Dialog opens and you can browse for the COM library file. You need to browse for the MSADO60.TLB (type library) file. On my Windows 8 (64-bit) machine this is located in: C:\Program Files (x86)\Common Files\System\ado\msado60.tlb.

Once the package is created you can add code to create the stream object by adding:
Code:
Get Create (RefClass (cComStream)) to hoStream
Send CreateComObject of hoStream
Get pvComObject of hoStream to vStream
The object reference now in vStream needs to be assigned to the ComOutput property of the writer object by replacing:
Code:
Get ComOutput of hoWriter to vData
with:
Code:
Set ComOutput of hoWriter to vStream
Note that there are two changes in the code line!

Are we done now? No, we are not. If you use the code as constructed so far you will get a "COM object method invocation error. Can't save." error. You get this error because the stream is not opened yet. The following code needs to be added.
Code:
Send ComOpen of hoStream Nothing OLEadModeUnknown OLEadOpenStreamUnspecified '' ''
If you look up the Open method in the MSDN documentation you will see that the UID and PWD parameters are optional, but if you pass the usual Nothing parameter you will get a COM error from the Stream object because it does not like that. Therefore I pass two empty strings. In fact, passing strings with a value (a username and a password) gives the same error.

Write to Disk
With a ComSaveToFile message to the Stream object the converted information will be written to disk.
Code:
Send ComSaveToFile of hoStream sFile OLEadSaveCreateOverWrite
Send ComClose of hoStream
Add the above code after the ComParseURL instruction.

Chinese?
Now run the code, but don't do this with a file you don't want to destroy! The result is a file that looks like the attached thumbnail "converted.png".

Fun: copy the contents and let Google translate it for you (see "google translated.png")! I got something with "careful with the last..."

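We forgot one setting, which apparently caused this: the Type property. The default is adTypeText, which depends on the CharSet property. The writer delivers UTF-8 bytes, and a text stream whose CharSet is left at its default (Unicode) most likely treats those bytes as two-byte characters, which explains the "Chinese". If you change the type to binary (as the original article also shows) you won't see this "problem". So add: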
Code:
Set ComType of hoStream to OLEadTypeBinary
The result will then look like the attached thumbnail "converted pretty.png".

You should also remove the Stream instance by sending a ReleaseComObject and a Destroy message to the hoStream object. Then all COM objects are released and the DataFlex side is destroyed too.
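Putting the steps of this blog together, the finished procedure could look roughly like the sketch below. It only uses the statements shown above. Two things are my assumptions: the name of the package the Studio generates from msado60.tlb (so its Use statement is only indicated in a comment), and the placement of the ComType line just before ComOpen, which the text above does not prescribe. As before, it overwrites sFile, so test it on a copy first.

```dataflex
Define OLE_VT_UI8 for 21    // must come BEFORE the Use of MSXML6.pkg
Use MSXML6.pkg
// Also Use the package generated from msado60.tlb here (name depends on your import)

Procedure WriteXMLFormatted Global String sFile
    Handle hoReader hoWriter hoStream
    Variant vWriter vStream

    // SAX reader
    Get Create (RefClass (cComSAXXMLReader60)) to hoReader
    Send CreateComObject of hoReader

    // XML writer
    Get Create (RefClass (cComMXXMLWriter60)) to hoWriter
    Send CreateComObject of hoWriter
    Get pvComObject of hoWriter to vWriter

    // ADODB stream that receives the formatted output
    Get Create (RefClass (cComStream)) to hoStream
    Send CreateComObject of hoStream
    Get pvComObject of hoStream to vStream
    Set ComType of hoStream to OLEadTypeBinary    // placement is an assumption
    Send ComOpen of hoStream Nothing OLEadModeUnknown OLEadOpenStreamUnspecified '' ''

    // Writer settings
    Set ComStandalone of hoWriter to True
    Set ComByteOrderMark of hoWriter to False
    Set ComEncoding of hoWriter to "utf-8"
    Set ComIndent of hoWriter to True
    Set ComOmitXMLDeclaration of hoWriter to True
    Set ComOutput of hoWriter to vStream

    // Connect the reader to the writer
    Set ComContentHandler of hoReader to vWriter
    Set ComDtdHandler of hoReader to vWriter
    Set ComErrorHandler of hoReader to vWriter
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/lexical-handler" vWriter
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/declaration-handler" vWriter

    // Parse the file, then save the formatted result over the original
    Send ComParseURL of hoReader sFile
    Send ComSaveToFile of hoStream sFile OLEadSaveCreateOverWrite
    Send ComClose of hoStream

    // Release the COM objects and destroy the DataFlex objects
    Send ReleaseComObject of hoStream
    Send ReleaseComObject of hoWriter
    Send ReleaseComObject of hoReader
    Send Destroy of hoStream
    Send Destroy of hoReader
    Send Destroy of hoWriter
End_Procedure
```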

I hope you learned from this blog how to do things like this yourself, and I look forward to meeting you all at a next conference.
Attached thumbnails: xml raw.png, converted.png, google translated.png, converted pretty.png
