View Full Version : XML into ASP page



Nick Wright
5-Jun-2005, 10:34 PM
Can anyone help with this problem?

I have an XML document (encoding="utf-8") and I am trying to display part
of its contents on an ASP page that also declares content="text/html;charset=utf-8".

Here is an example.
I parse the XML document in WebApp using the following code:



Procedure Unicode_label Integer iRef String sLang
    String sRet sRef
    String sFile
    Address pXML
    Handle hoXML hoRoot hoList hoTrans hoChi hoDesc
    Integer iItems i bOK iID

    If (sLang="CHI") Move "c:\legendshop\chinese.xml" to sFile
    // Load the translation file synchronously
    Get Create U_cXMLDOMDocument to hoXML
    Set psDocumentName of hoXML to sFile
    Set pbAsync of hoXML to False
    Set pbValidateOnParse of hoXML to True

    Get LoadXMLDocument of hoXML to bOK
    If not bOK Begin
        Send BasicParseErrorReport of hoXML
        // Release the document object before bailing out
        Send Destroy of hoXML
        Procedure_Return
    End
    // Locate the "desc" node under the language/reference node, e.g. CHI1
    Get DocumentElement of hoXML to hoRoot
    Move (sLang+String(iRef)) to sRef
    Get FindNode of hoRoot sRef to hoChi
    Get FindNode of hoChi "desc" to hoDesc
    // Write the node's XML to the page, then free the buffer
    Get paXML of hoDesc to pXML
    Send WriteData pXML
    Move (Free(pXML)) to bOK
    Send Destroy of hoDesc
    Send Destroy of hoChi
    Send Destroy of hoRoot
    Send Destroy of hoXML
End_Procedure

I call the procedure twice from a web page, like this:

<tr><td><% oWebtrans.call "msg_Unicode_label",1,"CHI" %></td></tr>
<tr><td><% oWebtrans.call "msg_Unicode_label",2,"CHI" %></td></tr>


The xml document looks like this:




The result on the webpage is this:

????
English

I don't understand why the extended characters are not showing. I
realise that VDF uses OEM, but does it convert the Unicode into OEM
during the parsing process? Is there any way to do this without the
characters being converted?

Any help will be very, very gratefully received.

TIA

Nick

Knut Sparhell
6-Jun-2005, 03:13 AM
Nick Wright wrote:

> I have an xml document - encoding="utf-8", I am trying to display part
> of it's contents to an asp page also content="text/html;charset=utf-8".

Also set the ASP property

Response.Charset = "utf-8"

This sets the charset in the HTTP Content-Type response header to the one
actually used in the content, which in turn instructs the browser to
interpret the content using this encoding. You can't expect either the web
server or the web browser to convert the content based on a declaration in
the HTML header.
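
A minimal sketch of that point, written in Python rather than ASP purely for illustration (the helper name `build_response` is hypothetical): the charset goes into the Content-Type header, and the body bytes must really be encoded in that same charset, because nothing converts them afterwards.

```python
def build_response(html: str, charset: str = "utf-8") -> tuple[dict, bytes]:
    """Return (headers, body) with the body encoded in the declared charset."""
    # The declaration and the actual byte encoding are produced together,
    # so they cannot drift apart.
    body = html.encode(charset)
    headers = {
        "Content-Type": f"text/html; charset={charset}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_response("<td>中文</td>")
print(headers["Content-Type"])  # text/html; charset=utf-8
print(len(body))                # 15 (two CJK characters take 3 bytes each)
```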

--
Knut Sparhell, Norway

Nick Wright
6-Jun-2005, 07:54 PM
Thanks Knut,

but it didn't work; I still get ????. I am convinced from all the tests I
have done that WebApp is mucking with the data. I have tested other
languages outside the iso-8859-1 range, and all extended characters get
converted to something different. I stored ľ in Slovak, using both
iso-8859-2 (ANSI) and Unicode; the XML file opened correctly in Windows
Explorer and Mozilla, but when I passed the string through WebApp I got a
square symbol.

Data Access have been suspiciously quiet on all my posts about foreign
languages. If I knew what was going on it would help me, and maybe stop me
wasting hours of my time.

Many thanks again.

Nick

Anders Öhrt
7-Jun-2005, 02:17 AM
> but it didn't work, I still get ???? I am convinced from all the tests I
> have done that webapp is mucking with the data.

Try capturing the data from webapp to text, and see if that's true. Use Wget
for instance, not IE. Then you would get exactly what webapp is outputting,
and you would _know_ if that was the problem.
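
As a rough sketch of what to look for in the captured bytes (Python, for illustration only; `diagnose` is a hypothetical helper, not part of Wget or WebApp): if the capture contains literal question marks (byte 0x3F) where the Chinese text should be, the characters were lost on the server before the browser ever saw them.

```python
def diagnose(raw: bytes) -> str:
    """Classify captured response bytes (hypothetical helper for illustration)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8: the bytes are in some other encoding"
    if any(ord(ch) > 127 for ch in text):
        return "valid UTF-8 with non-ASCII characters: the data survived"
    if "?" in text:
        return "literal '?' bytes (0x3F): characters were lost before output"
    return "plain ASCII"


print(diagnose("<td>中文</td>".encode("utf-8")))
print(diagnose(b"<td>????</td>"))
print(diagnose("中文".encode("gbk")))
```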

// Anders

Nick Wright
7-Jun-2005, 09:09 PM
Thanks for the tip Anders,

though the output was still ????

Just going to shoot myself now :-(

Nick

Knut Sparhell
7-Jun-2005, 10:03 PM
Nick Wright wrote:

> though the output was still ????

If you write back the xml from the VDF object to the file, is it still
correct?

--
Knut Sparhell, Norway

Nick Wright
7-Jun-2005, 11:14 PM
Hi Knut,

I have just tried it, using two methods.

Writing the original handle back to a new file,
I got what was in the original file.

Using the procedure below, I get:

- <language>
- <CHI1>
<desc><desc>????</desc></desc>
</CHI1>
- <CHI2>
<desc><desc>????</desc></desc>
</CHI2>
</language>

Procedure WriteAddressData Address pXML String sData
    Handle hoXML hoRoot hoCHI
    Integer bErr
    Get Create U_cXMLDOMDocument to hoXML
    Set psDocumentName of hoXML to "c:\legendshop\testback2.xml"
    Set pbAsync of hoXML to False
    Set pbValidateOnParse of hoXML to True
    Get CreateDocumentElement of hoXML "language" to hoRoot
    Get AddElement of hoRoot "CHI1" "" to hoCHI
    Send AddElement of hoCHI "desc" sData
    Send Destroy of hoCHI
    Get AddElement of hoRoot "CHI2" "" to hoCHI
    Send AddElement of hoCHI "desc" pXML
    Send Destroy of hoCHI
    Send Destroy of hoRoot
    Get SaveXMLDocument of hoXML to bErr

    Send Destroy of hoXML
End_Procedure

I know the code is dirty but it's just a test.
pXML is the address obtained from paXML and passed to WriteData in my
earlier post; sData is obtained from psXML.

Thanks,

Nick

Nick Wright
7-Jun-2005, 11:24 PM
I just noticed that the new file with ???? was saved in ANSI format without
any encoding declaration (I guess that is default DF behaviour), while the
copy kept the encoding=utf-8 information.

Which raises a question that has just occurred to me: how do you get the
encoding information into an XML document?

Nick

Knut Sparhell
8-Jun-2005, 12:10 AM
Nick Wright wrote:

> Which has just occurred to me, how do you get the encoding information into
> an xml document?

From VDF? It's a pseudo-attribute of the XML declaration at the top of the
document. Setting it doesn't convert anything; it is just a declaration to
be used by the processor. I haven't read the VDF details on how to set it.

--
Knut Sparhell, Norway

wila
8-Jun-2005, 04:34 AM
Nick,


Nick Wright wrote:
> I just noticed the new file with ???? saved without encoding data in ANSI
> format (i guess default DF behaviour) the copy kept the encoding=utf-8
> information.

[WvA] Actually, default VDF behaviour is to use OEM encoding, whereas
Windows controls expect ANSI data to be displayed. This is why in some
languages you need to switch off the oem_translate_state property in
order to display the correct characters.

In VDF and WebApp you have the ToAnsi and ToOem functions to convert the
data from one format to the other and back.
In VDF11 (possibly earlier) there's a function called OemToUtf8Buffer in
the chartranslate.pkg package.
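
To make the OEM/ANSI difference concrete, here is a small Python illustration (not VDF code). It assumes code page 437 as the OEM code page and Windows-1252 as the ANSI code page; the actual pair depends on how the machine is configured.

```python
# The same character has a different byte value in each code page.
ansi_bytes = "é".encode("cp1252")  # assumed ANSI code page
oem_bytes = "é".encode("cp437")    # assumed OEM code page

print(ansi_bytes.hex())  # e9
print(oem_bytes.hex())   # 82

# Reading ANSI bytes as if they were OEM shows the wrong character,
# which is the kind of mix-up that ToAnsi/ToOem conversions exist to avoid.
misread = ansi_bytes.decode("cp437")
print(misread)  # Θ
```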

hth,
Wil

Nick Wright
8-Jun-2005, 07:02 AM
Thanks Wil,

I am familiar with chartranslate.pkg but haven't used
OemToUtf8Buffer on the XML yet; I shall try tomorrow. I am still not
convinced that WebApp should convert UTF-8 prior to writing to an ASP page,
or, as I suspect, in order to store the content in memory at the address
accessed through paXML. I also think that WebApp is unable to interpret the
UTF-8 characters and converts them to ?????? before I am able to
manipulate them. In other words, the original meaning is lost, so converting
back will be pointless.

I realise what I am trying to do can be done in ASP without WebApp, but I am
doing this to get round the inability to store UTF-8 data, and I need to
pick and choose which part of the file to display where on the ASP page. It
seems such a basic requirement to me: someone sends you XML in UTF-8, and you
should be able to display all or part of the content in your web page as
UTF-8, without WebApp trying to interpret it.

Here is a little background on what I am trying to achieve.
We have a very complex WebApp that is completely dynamic in its make-up:
every user sees different menus and pages, and the same page can look
different from one user to the next. To do this we keep much of the site
content in our database; the English you see on the site is already in
tables and displays very quickly. We have a set of translation files which
swap the content from one language to another at the press of a menu item.
The snag is that we can't store anything other than English reliably. We
have tried various ANSI tables. While we can store some ANSI characters
in the database as double-character equivalents, when we recall the data it
does not display on the web page correctly. Chinese, for example, stores and
displays correctly on Chinese Windows, but view it on an English Windows
server and it all goes wrong, even when all encoding indicators are set
correctly. To overcome the inability to store the data in a table, I
thought I would store it in a series of XML files, which do hold Unicode.

We are converting to DB2 very shortly and hope to configure our database for
Unicode. My biggest fear is that VDF will not interpret the Unicode
correctly, or will _interfere_ with it such that the data will not
display correctly on the web or in our apps.
Having a multilingual WebApp is absolutely essential to us. I know we could
drop VDF and migrate over to .NET, but it really gets to me that that could
end up being the only way. I would happily work with Data Access, or anybody
for that matter, to establish a set of classes that will enable us to stay
with VDF and use Unicode. We must display data, and write data from web
entry pages, in something other than OEM. Our current requirement is to
display in 7 different languages on the web; only one of those is OEM
compatible (English), and 5 are double-byte character languages.

Any suggestions on how to move forward on this would be gratefully received.

Nick

wila
8-Jun-2005, 08:27 AM
Nick,

Nick Wright wrote:
> Thanks Wil,
>
> I am familiar with the chartranslate.pkg but haven't used the
> OemToUtf8Buffer on the XML yet but I shall try tomorrow. I am still not
> convinced that webapp should convert utf-8 prior to writing to an asp page,
> or as I suspect in order to store the content in memory at the address
> accessed through paXML, I also think that webapp is unable to enterpret the
> UTF-8 characters and converts them to ?????? prior to me being able to
> manipulate it in otherwords the original meaning is lost so converting back
> will be pointless.

[WvA] Well... WebApp doesn't convert the characters to anything, afaik.
The ???? characters you see are just how the browser renders the text
when it doesn't know what to do with it. To me it sounds like the
browser doesn't recognize your character encoding.
UTF-8 characters, as you call them, are just characters. It's data, but
less readable than we are used to. As Anders suggested, try to
wget the raw data and see if it matches your data with a binary compare
tool (or use FC /B if you like).

>
> I realise what I am trying to do can be done in asp without webapp but I am
> doing this to get round the inability to store UTF-8 data, and I need to
> pick and choose which part of the file to display where on the asp page. It
> seems such a basic requirement to me, someone sends you XML in utf-8, you
> should be able to display all or part of the content in your webpage as
> utf-8, without webapp trying to interpret it.

[WvA] The problem is not that you would be unable to store UTF-8 data.
UTF-8 data consists of just bytes; save it as binary data in your
database and it should be no problem. The Unicode and VDF problem lies
more in the Windows product, where several visual runtime controls are
bound to the ANSI variants of operating system API calls and not
the Unicode ones. If this also affected string operations then you might
be correct that WebApp messes up your data, but I doubt that this
problem exists.

I might be proven wrong, but I don't think the problem exists to the
same extent in WebApp.

Maybe you should try to get this working in plain ASP first and then
move up one level and include WebApp in the mix. Rule of thumb: start
out with something you know works and slowly move the case closer
to your real-life problem.

--
Wil

Vincent Oorsprong
8-Jun-2005, 08:45 AM
When creating an XML file from VDF (using FleXML) the output will always be
in ANSI. There is no choice. So the encoding is Windows-1252 I believe.

--
Kind Regards,
Vincent Oorsprong
Data Access Europe BV
Lansinkesweg 4
7553 AE Hengelo
The Netherlands
Telephone: +31 (0)74 - 255 56 09
Fax: +31 (0)74 - 250 34 66
http://www.dataaccess.nl

wila
8-Jun-2005, 09:19 AM
Hi Vincent,

Vincent Oorsprong wrote:
> When creating an XML file from VDF (using FleXML) the output will always be
> in ANSI.

[WvA] Really? I was under the impression that Sonny had done something
in this respect last year, but alas that is not the case then.

> There is no choice. So the encoding is Windows-1252 I believe.

[WvA] There's always a choice; he could, for example, base64-encode the
Unicode text before adding it to the XML file.
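
A small sketch of the base64 idea, in Python for illustration (the helper names are hypothetical, and code page 437 stands in for whatever OEM code page the server uses): the payload is pure ASCII, so an OEM or ANSI conversion along the way cannot damage it, and the receiver decodes it back to the original text.

```python
import base64


def to_ascii_payload(text: str) -> str:
    """Encode text as UTF-8 bytes, then as an ASCII-only base64 string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_ascii_payload(payload: str) -> str:
    """Reverse of to_ascii_payload."""
    return base64.b64decode(payload).decode("utf-8")


payload = to_ascii_payload("中文")
print(payload)  # 5Lit5paH

# Simulate a pass through an OEM code page: ASCII comes out unchanged.
after_oem = payload.encode("cp437").decode("cp437")
print(from_ascii_payload(after_oem))  # 中文
```

The cost of this approach is that the element content is no longer readable as text in the XML file, so every consumer has to know to decode it.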

--
Wil

wila
8-Jun-2005, 09:20 AM
Nick,

Also do check out the following articles:

http://www.dataaccess.com/kbasepublic/KBPrint.asp?ArticleID=2024

http://www.dataaccess.com/kbasepublic/KBPrint.asp?ArticleID=2023

http://www.dataaccess.com/kbasepublic/KBPrint.asp?ArticleID=1343

hth,
Wil

Knut Sparhell
8-Jun-2005, 10:12 AM
Vincent Oorsprong wrote:
> When creating an XML file from VDF (using FleXML) the output will always be
> in ANSI. There is no choice. So the encoding is Windows-1252 I believe.

This is a bad thing that needs to be fixed. We should be able to output
binary data, especially to the web. Let the user handle the encoding and
declare it properly for the receiver to handle and display correctly.

--
Knut Sparhell, Norway

Nick Wright
9-Jun-2005, 11:49 PM
Thanks everyone for your help. I have not had any success with this yet; I
will get back to it when we are testing DB2. I'll let you know how I get on
then.

Nick

Sonny Falk
10-Jun-2005, 10:52 AM
> When creating an XML file from VDF (using FleXML) the output will always be
> in ANSI. There is no choice. So the encoding is Windows-1252 I believe.

I'm sorry, but that's not true. Saving XML files using the VDF XML classes
will properly use the correct encoding, which by default is UTF-8.

Also let me clarify that there's no need to write out encoding=utf-8. If the
encoding is UTF-8, then there's no need for an XML declaration at all, as
UTF-8 is the default encoding. You only need to specify the encoding if it's
not UTF-8. There's no harm in specifying encoding=utf-8, but it makes no
difference either, as that is merely saying, "yes it's using the default
encoding which you would have figured anyway".
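
This can be checked with any conforming XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` purely as an illustration (it is not the VDF XML classes): UTF-8 bytes parse the same with or without the declaration, while a non-UTF-8 encoding has to be declared.

```python
import xml.etree.ElementTree as ET

# UTF-8 bytes with no XML declaration at all.
no_decl = "<desc>中文</desc>".encode("utf-8")
# The same content with an explicit (redundant) declaration.
with_decl = '<?xml version="1.0" encoding="utf-8"?><desc>中文</desc>'.encode("utf-8")
# A non-UTF-8 document must say which encoding it uses.
latin1 = '<?xml version="1.0" encoding="iso-8859-1"?><desc>é</desc>'.encode("iso-8859-1")

print(ET.fromstring(no_decl).text)    # 中文
print(ET.fromstring(with_decl).text)  # 中文
print(ET.fromstring(latin1).text)     # é
```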

The KB article that Wil referenced, which goes into detail explaining OEM,
ANSI and Unicode, was originally a response to a NG question where the OP
wasn't using the XML classes to write an XML file. IIRC the OP was simply
using direct_output and writeln to write the file, thus outputting OEM
literal strings while stating encoding=utf-8, causing a mismatch between the
actual output format and the declared output format.
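
A sketch of that mismatch in Python (illustration only; code page 437 is an assumed OEM code page): the document claims UTF-8, but the bytes were written in an OEM code page, so a conforming parser rejects them.

```python
import xml.etree.ElementTree as ET

document = '<?xml version="1.0" encoding="utf-8"?><desc>é</desc>'

# Written the wrong way: OEM bytes under a UTF-8 declaration.
mismatched = document.encode("cp437")
try:
    ET.fromstring(mismatched)
    outcome = "parsed"
except ET.ParseError:
    outcome = "rejected"
print(outcome)  # rejected

# Written consistently: the bytes match the declaration.
consistent = document.encode("utf-8")
print(ET.fromstring(consistent).text)  # é
```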

Now, on to the problem at hand. Nick is absolutely correct that the data is
getting screwed up in the VDF code.

The XML classes handle Unicode and UTF-8 correctly, both reading and writing
UTF-8 correctly. However, as is also mentioned in the KB article that Wil
referenced, VDF is traditionally OEM. This means that any data passed
to/from the XML classes must be converted to/from Unicode/OEM. For example,
anytime you call Get paXML, the data is converted from Unicode to OEM. And
anytime you call Get AddElement for example, the element name as well as the
data supplied is converted from OEM to Unicode.
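
The effect of those two conversions can be simulated outside VDF. The following Python sketch is only an analogy: the function names are hypothetical stand-ins, and code page 437 is an assumed OEM code page for an English-language Windows machine.

```python
OEM_CODEPAGE = "cp437"  # assumption: OEM code page of the server


def read_from_dom(unicode_text: str) -> bytes:
    """Stand-in for reading a value out of the DOM: Unicode -> OEM bytes."""
    return unicode_text.encode(OEM_CODEPAGE, errors="replace")


def write_to_dom(oem_bytes: bytes) -> str:
    """Stand-in for passing a value into the DOM: OEM bytes -> Unicode."""
    return oem_bytes.decode(OEM_CODEPAGE)


round_tripped = write_to_dom(read_from_dom("中文"))
print(round_tripped)  # ??
```

Once the characters have become question marks on the way out, converting back cannot recover them, which matches the ???? that ended up in the written-back test file earlier in the thread.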

Now it should hopefully be a little more clear where the problem lies. paXml
converts to OEM, and if the specified characters are not available in the
current OEM character set, then the conversion fails, normally resulting in
"?" in the output.

This also explains why it may work on a machine already configured for
Chinese. If the current OEM character set is one that includes these Chinese
characters, then it will also work. The OEM/Unicode conversion is always
using the currently configured OEM character set.

As Nick also already said, the best option in this case is probably to move
the code dealing with the output of this XML document to ASP code.

Hope this clears up any confusion.

-Sonny Falk (DAC)

Nick Wright
10-Jun-2005, 07:32 PM
Thanks Sonny,

I feel a lot better now; at least I know what's going on. It still doesn't
help my problem though. I shall have to play with some COM objects, I think.

Regards,

Nick