HydraExpress Web Service Development Guide

19.4 XML and Character Encodings Concepts

This section briefly introduces some concepts useful in working with HydraExpress and XML documents in various character encodings.

19.4.1 What is a Character Encoding?

A character encoding -- or more formally a "coded character set" -- is a character set together with the numerical value that represents each of its characters.
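To make the definition concrete, the following short sketch shows one character, its number in the Unicode character set, and the different byte sequences that three encodings assign to it. Python is used here purely for illustration; nothing in the sketch is part of the HydraExpress API.

```python
# One abstract character, several numerical representations.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

code_point = ord(ch)                   # number assigned by the Unicode character set
utf8_bytes = ch.encode("utf-8")        # two bytes: C3 A9
latin1_bytes = ch.encode("latin-1")    # one byte: E9
utf16_bytes = ch.encode("utf-16-be")   # one 16-bit unit: 00 E9

print(hex(code_point), utf8_bytes.hex(), latin1_bytes.hex(), utf16_bytes.hex())
```

The character is the same in every case; only the bytes that represent it change with the encoding.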

If your XML document uses a character encoding other than UTF-8, you can use the HydraExpress internationalization capabilities to convert the document to and from that encoding, so that you can manipulate its text in the encoding of your choice.
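Conceptually, such a conversion is a decode from the source encoding to Unicode followed by an encode to the target encoding. The following Python sketch illustrates the idea only; it does not use or represent the HydraExpress API, and the helper name transcode is hypothetical.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode text: decode from the source encoding, encode to the target."""
    return data.decode(source).encode(target)

# Three Japanese characters, initially stored as Shift_JIS bytes.
text = "\u65e5\u672c\u8a9e"
original = text.encode("shift_jis")

as_utf8 = transcode(original, "shift_jis", "utf-8")    # convert to UTF-8 for processing
round_trip = transcode(as_utf8, "utf-8", "shift_jis")  # convert back when finished
```

A round trip such as this is lossless only when every character in the text exists in both encodings.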

The related code examples on internationalization refer to the Unicode encoding forms UTF-8 and UTF-16, because HydraExpress uses them internally to manipulate text and to convert XML documents between UTF-8 and other encodings.
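As a brief illustration of how these two encoding forms differ, the following Python sketch counts the code units each form needs for the same four characters. It is an independent example and makes no claim about how HydraExpress stores text internally.

```python
# Four characters that need 1, 2, 3, and 4 bytes respectively in UTF-8.
text = "A\u00e9\u65e5\U0001F600"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")   # little-endian, no byte order mark

utf8_units = len(utf8)             # 8-bit code units
utf16_units = len(utf16) // 2      # 16-bit code units

print(len(text), utf8_units, utf16_units)
```

UTF-8 uses between one and four 8-bit units per character, while UTF-16 uses one 16-bit unit for most characters and two (a surrogate pair) for characters outside the Basic Multilingual Plane.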

19.4.2 Character Encoding in an XML Prolog

An XML document always starts with a prolog. The prolog describes the contents of the document, including its character encoding. The following prolog contains the mandatory version number and the optional encoding declaration:

<?xml version="1.0" encoding="Shift_JIS"?>

The entire contents of the XML document following the encoding declaration (the "EncodingDecl" section of the XML prolog) must be in the specified character encoding. This includes everything in the message: URIs, end-of-line characters, whitespace, and so on.

For example, in the XML fragment above, all characters following the "?>" must be in the Shift-JIS encoding. For more information on XML declarations, see the XML 1.0 specification at http://www.w3.org/TR/REC-xml#sec-prolog-dtd.
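The rule can be sketched in a few lines of Python: read the encoding name from the prolog, then decode the whole document with it. This is a simplified illustration, not the HydraExpress implementation. The function names are hypothetical, and the sketch assumes an encoding in which the declaration itself is ASCII-compatible, so it does not handle UTF-16 or byte order marks.

```python
import re

# Captures the encoding name from an XML declaration at the start of the document.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def declared_encoding(document: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the prolog, or the default if none is declared."""
    match = _ENCODING_DECL.match(document)
    return match.group(1).decode("ascii") if match else default

def decode_document(document: bytes) -> str:
    """Decode the whole document using the encoding its prolog declares."""
    return document.decode(declared_encoding(document))

# Build a small document whose bytes really are in the declared encoding.
body = "<greeting>\u3053\u3093\u306b\u3061\u306f</greeting>"
document = ('<?xml version="1.0" encoding="Shift_JIS"?>\n' + body).encode("shift_jis")

decoded = decode_document(document)
```

If the bytes after the declaration were in a different encoding from the one declared, the decode step would either fail or silently produce the wrong characters, which is why the declaration and the content must agree.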




Copyright © Rogue Wave Software, Inc. All Rights Reserved.

The Rogue Wave name and logo are registered trademarks of Rogue Wave Software, and HydraExpress is a trademark of Rogue Wave Software. All other trademarks are the property of their respective owners.