Character encoding issue with XML file

Scenario:

An XML file is read from a remote web service using CFHTTP. The XML string is converted to an XML object using the XmlParse function. XPath and XmlSearch are used to extract data from the object. The data is then inserted into a database.

 

Problem:

All punctuation marks are replaced by unreadable characters in the database and in the output.

 

Analysis:

The XML sent by the remote server uses the Windows-1252 encoding. When CFHTTP performs a GET request, the encoding is set to UTF-8 by default. Since the remote server is sending the file encoded in Windows-1252 and ColdFusion is attempting to read it as a UTF-8 file, the characters are displayed incorrectly.
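The mismatch can be reproduced without the remote service. The following is a minimal sketch, assuming a made-up sample string; it uses the built-in charsetDecode and charsetEncode functions to produce Windows-1252 bytes and then read them back under both encodings.

```cfml
<!--- Hypothetical sample text containing a right single quotation mark (U+2019) --->
<cfset sampleText = "It" & chr(8217) & "s here">

<!--- Turn the string into the bytes a Windows-1252 server would send --->
<cfset responseBytes = charsetDecode(sampleText, "windows-1252")>

<!--- Read the same bytes with the wrong charset, then with the right one --->
<cfset readAsUtf8 = charsetEncode(responseBytes, "utf-8")>
<cfset readAsCp1252 = charsetEncode(responseBytes, "windows-1252")>

<!--- Compare each reading against the original string --->
<cfset utf8Matches = (readAsUtf8 eq sampleText)>
<cfset cp1252Matches = (readAsCp1252 eq sampleText)>

<cfoutput>
Read as UTF-8 matches original: #yesNoFormat(utf8Matches)#<br>
Read as Windows-1252 matches original: #yesNoFormat(cp1252Matches)#
</cfoutput>
```

Only the Windows-1252 reading should round-trip cleanly, because the byte that stands for the quotation mark in Windows-1252 is not a valid UTF-8 sequence on its own.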

 

Resolution:

To solve the problem, the charset attribute of the CFHTTP tag was set to the encoding of the XML response.

 

<cfhttp url="#variables.feedURL#"
   charset="windows-1252">
</cfhttp>
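For context, here is a rough sketch of how the corrected request might fit into the full flow described in the scenario. The result variable name feedResponse and the XPath expression //item/title are assumptions made for illustration; the real feed structure is not shown in this post.

```cfml
<!--- variables.feedURL comes from the post; feedResponse is an assumed result name --->
<cfhttp url="#variables.feedURL#"
        method="get"
        charset="windows-1252"
        result="feedResponse">
</cfhttp>

<!--- Parse the correctly decoded string into an XML document object --->
<cfset feedXml = xmlParse(feedResponse.fileContent)>

<!--- Hypothetical XPath; adjust it to the actual structure of the feed --->
<cfset titleNodes = xmlSearch(feedXml, "//item/title")>

<cfoutput>
<cfloop array="#titleNodes#" index="titleNode">
    #htmlEditFormat(titleNode.xmlText)#<br>
</cfloop>
</cfoutput>
```

Because the charset is applied when the response body is converted to a string, everything downstream (XmlParse, XmlSearch, and the database insert) works with correctly decoded text.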

 

 

If you control the web service, you can avoid the problem at the source by declaring the encoding of the response with the charset parameter of the Content-Type HTTP header. Most languages allow you to set the content type, which includes the charset:

Content-Type: text/html; charset=utf-8

See the documentation for your language for details on setting the response header. I think, though, that when creating web services that return XML responses it is a good idea to encode the response as UTF-8 to avoid incompatibilities across systems.
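In ColdFusion, one way to do this is with the cfcontent tag, whose type attribute accepts a charset after the MIME type. The sketch below assumes a trivial XML payload; the variable feedTitle and the element names are placeholders, not part of any real service.

```cfml
<!--- Save this template as UTF-8 and tell ColdFusion to read it as such --->
<cfprocessingdirective pageencoding="utf-8">

<!--- Hypothetical data to return --->
<cfset variables.feedTitle = "Today" & chr(8217) & "s news">

<!--- Set the Content-Type header with an explicit charset, discard earlier output,
      and start the XML declaration immediately so no whitespace precedes it --->
<cfcontent type="text/xml; charset=utf-8" reset="true"><cfoutput><?xml version="1.0" encoding="UTF-8"?>
<feed><title>#xmlFormat(variables.feedTitle)#</title></feed></cfoutput>
```

Keeping the header charset and the encoding named in the XML declaration identical avoids giving clients two conflicting hints.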
