How to make an XML file look 'nice'?
14-Dec-2013 at 05:08 AM
This blog is about XML readability. Since 1999 DataFlex has been able to read and write XML via the classes defined in FleXML.pkg (cXMLDomDocument etc.). One of the most frequent support questions is how to make XML readable for humans, as Microsoft XML writes the XML data as one long line of text. While that is technically fine, most developers want to be able to read the content in an editor. You can easily format the XML file with tools like Notepad++, but how do you do this from your own code?
Knowledge Base
As mentioned, this question comes in often, and in 2002 I wrote a knowledge base article about it named "Add formatting to an XML file". While that approach works, it still feels unpleasant to have to do it this way.
Microsoft has a class!
At the time I wrote the knowledge base item there was no other way, but with newer versions of MSXML you can let a module do it for you. I found the information in a Stack Overflow topic named "Forcing MSXML to format XML output with indents and newlines". With a bit of testing I was able to make this work in DataFlex. The following steps need to be taken.
Import the MSXML6.0 COM library
The article uses an MXXMLWriter and a SAXXMLReader object, which are defined in the Microsoft XML 6.0 object library. This means you need to create a COM class by importing the MSXML6.0 automation library. Go to "create new", "class", "import COM automation" and find "Microsoft XML, v6.0 (version 6.0)" in the list. If you point to this entry you will see that the Studio will create a package called MSXML6.pkg in your workspace.
If you compile this package in your application you will get two compiler errors which can be "fixed" by adding the following line BEFORE the Use statement for the package.
Code:
Define OLE_VT_UI8 for 21

Converting the VB code to DataFlex code
It is not so difficult to convert the code for reading and parsing the XML to DataFlex code. You need to create an object of the cComSAXXMLReader60 class and one of the cComMXXMLWriter60 class. In the following code they are created via the Create function. Because these classes are automation classes, creating the DataFlex object does not create the COM object (the peAutoCreate property controls this and it is set to acNoAutoCreate by default). This means that you need to send a CreateComObject message to the DataFlex objects. After that you can get the IDispatch pointer of the writer object (its pvComObject property) to connect the reader and the writer together.
Code:
Procedure WriteXMLFormatted Global String sFile
    Handle hoReader hoWriter
    Variant vWriter vData

    Get Create (RefClass (cComSAXXMLReader60)) to hoReader
    Send CreateComObject of hoReader
    Get Create (RefClass (cComMXXMLWriter60)) to hoWriter
    Send CreateComObject of hoWriter
    Get pvComObject of hoWriter to vWriter

    Set ComStandalone of hoWriter to True
    Set ComByteOrderMark of hoWriter to False
    Set ComEncoding of hoWriter to "utf-8"
    Set ComIndent of hoWriter to True
    Set ComOmitXMLDeclaration of hoWriter to True

    Set ComContentHandler of hoReader to vWriter
    Set ComDtdHandler of hoReader to vWriter
    Set ComErrorHandler of hoReader to vWriter
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/lexical-handler" vWriter
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/declaration-handler" vWriter

    Get ComOutput of hoWriter to vData
    Send ComParseURL of hoReader sFile

    Send ReleaseComObject of hoWriter
    Send ReleaseComObject of hoReader
    Send Destroy of hoReader
    Send Destroy of hoWriter
End_Procedure

Don't just copy and paste the above code, as it is not finished yet.
Take an XML file that is not "pretty", such as a DBE-Filter file, which looks like:
Add a line of code to use this file such as:
Code:
Send WriteXMLFormatted "C:\Order Entry\Data\vendor.DBE-Filter"

The WriteXMLFormatted procedure does not write the contents back to disk; it gets the information into a variant string. There is no write-to-file function. The ComOutput can be retrieved into a string, as the current code does, or connected to a Stream or an XMLDomDocument object. You don't want to use the XMLDomDocument, as it will destroy the nice formatting again, just as it does when you open a nicely formatted file and save it.
Is a String OK?
So a string, is this OK? No, it is not. There are two problems with the string approach.
- The string is limited to the DataFlex argument size (default 65k), which means that anything beyond that is truncated and the result is no longer a well-formed XML file.
- As soon as the string is written to disk with the DataFlex Write command it is converted from UNICODE to OEM.
If the XML data is less than 65k and there are no OEM conversion issues in your environment you could use the standard Direct_Output, Write and Close_Output statements.
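For the small-file case, the Direct_Output, Write and Close_Output route could look like the minimal sketch below. WriteXMLString is a hypothetical helper name that is not part of the original procedure, and sXml is assumed to already hold the formatted XML retrieved from ComOutput. Both limitations listed above still apply to it.

```dataflex
// Hypothetical helper: only suitable for XML smaller than the argument size
// and for environments where the UNICODE to OEM conversion does no harm.
Procedure WriteXMLString Global String sFile String sXml
    Direct_Output sFile   // open the file for output, replacing existing content
    Write sXml            // the data is converted to OEM when written
    Close_Output
End_Procedure
```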
Stream object
Because of the string limitations you should be looking at the Stream object that is also used in the sample code. How do you create such an object? It is not available as standard in DataFlex, as we have had the sequential I/O commands built into the language for over 35 years.
The sample code uses an object of the ADODB.Stream class. If you go to "import COM automation" again (as for the MSXML6 class) and browse the list you won't find it. The solution is the "browse" button at the bottom of the dialog. After clicking the button a Windows Common File Dialog opens and you can browse for the COM library file. You need to browse for the MSADO60.TLB (type library) file. On my Windows 8 (64-bit) machine this is located in: C:\Program Files (x86)\Common Files\System\ado\msado60.tlb.
Once the package is created you can add code to create the stream object by adding:
Code:
Get Create (RefClass (cComStream)) to hoStream
Send CreateComObject of hoStream
Get pvComObject of hoStream to vStream

The IDispatch pointer now in vStream needs to be used with ComOutput of the writer object by replacing:

Code:
Get ComOutput of hoWriter to vData

with:

Code:
Set ComOutput of hoWriter to vStream

Note that there are two changes in the code line: Get becomes Set and vData becomes vStream!
Are we done now? No, we are not. If you run the code as constructed so far you will get a "COM object method invocation error. Can't save." error. You get this error because the stream is not opened yet. The following code needs to be added.
Code:
Send ComOpen of hoStream Nothing OLEadModeUnknown OLEadOpenStreamUnspecified '' ''

If you look up the OPEN method in the MSDN documentation you will see that the UID and PWD parameters are optional, but if you pass the usual NOTHING parameter you will get a COM error from the Stream object because it does not like that. Therefore I pass two empty strings. In fact, if you pass strings with a value (a username and a password) you will get the same error.
Write to Disk
With a ComSaveToFile message to the Stream object the converted information will be written to disk.
Code:
Send ComSaveToFile of hoStream sFile OLEadSaveCreateOverWrite
Send ComClose of hoStream

Add the above code after the ComParseURL instruction.
Chinese?
Now run the code but don't do this with a file you don't want to destroy! The result is a file that looks like:
Fun: Copy the contents and let Google translate it for you! I got something with "careful with the last..."
We forgot one setting, which apparently caused this: the Type property. The default is adTypeText, which depends on the CharSet attribute. If you change the type to Binary (as the original article also shows) you won't see this "problem". So add:
Code:
Set ComType of hoStream to OLEadTypeBinary

The result will then be like:
You should also remove the Stream instance by sending a ReleaseComObject and a Destroy message to the hoStream object. Then all COM objects are released and the DataFlex side is destroyed too.
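Putting the steps of this blog together, the finished procedure might look like the sketch below. It only combines the statements shown earlier and assumes that both imported packages (MSXML6 and the ADO type library) are used by your program. Two things are assumptions rather than something the text prescribes: the ComType line is placed before ComOpen, and the stream is created before the writer properties are set. As in the text, the result is saved over the original file, so try it on a copy first.

```dataflex
Procedure WriteXMLFormatted Global String sFile
    Handle hoReader hoWriter hoStream
    Variant vWriter vStream

    // Create the DataFlex objects and their COM objects
    Get Create (RefClass (cComSAXXMLReader60)) to hoReader
    Send CreateComObject of hoReader
    Get Create (RefClass (cComMXXMLWriter60)) to hoWriter
    Send CreateComObject of hoWriter
    Get pvComObject of hoWriter to vWriter

    // Stream that receives the formatted output
    // (assumption: Type is set before the stream is opened)
    Get Create (RefClass (cComStream)) to hoStream
    Send CreateComObject of hoStream
    Get pvComObject of hoStream to vStream
    Set ComType of hoStream to OLEadTypeBinary
    Send ComOpen of hoStream Nothing OLEadModeUnknown OLEadOpenStreamUnspecified '' ''

    // Writer settings
    Set ComStandalone of hoWriter to True
    Set ComByteOrderMark of hoWriter to False
    Set ComEncoding of hoWriter to "utf-8"
    Set ComIndent of hoWriter to True
    Set ComOmitXMLDeclaration of hoWriter to True

    // Connect the reader to the writer
    Set ComContentHandler of hoReader to vWriter
    Set ComDtdHandler of hoReader to vWriter
    Set ComErrorHandler of hoReader to vWriter
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/lexical-handler" vWriter
    Send ComPutProperty of hoReader "http://xml.org/sax/properties/declaration-handler" vWriter

    // Send the writer output to the stream, parse, and save
    Set ComOutput of hoWriter to vStream
    Send ComParseURL of hoReader sFile
    Send ComSaveToFile of hoStream sFile OLEadSaveCreateOverWrite  // overwrites sFile
    Send ComClose of hoStream

    // Release the COM objects and destroy the DataFlex objects
    Send ReleaseComObject of hoStream
    Send ReleaseComObject of hoWriter
    Send ReleaseComObject of hoReader
    Send Destroy of hoStream
    Send Destroy of hoReader
    Send Destroy of hoWriter
End_Procedure
```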
I hope this blog has shown you how to do things like this yourself, and I look forward to meeting you all at the next conference.