Internationalization Module User's Guide

5.3 Normalization Forms

A normalization form produces a unique representation for any given string: all strings that are equivalent under the form's notion of equivalence normalize to the same sequence of code points. The two types of character equivalence described in Section 5.2 give rise to four normalization forms, as defined by Unicode Standard Annex #15, Unicode Normalization Forms:

http://www.unicode.org/unicode/reports/tr15/

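The four normalization forms are NFD (canonical decomposition), NFC (canonical decomposition followed by canonical composition), NFKD (compatibility decomposition), and NFKC (compatibility decomposition followed by canonical composition).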

Two of the normalization forms, NFD and NFKD, replace composite characters with their canonical decompositions. The other two forms, NFC and NFKC, perform the opposite operation: they replace sequences of characters with canonical composites, where possible.
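The two directions can be illustrated with a minimal sketch. It uses Python's standard unicodedata module rather than the Internationalization Module's own classes, so it shows the behavior of the forms, not this library's API. The helper code_points and the variable names are illustrative assumptions.

```python
import unicodedata

# Two canonically equivalent spellings of the same character.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT, two code points

# NFD replaces the composite character with its canonical decomposition.
nfd = unicodedata.normalize("NFD", composed)

# NFC goes the other way: it replaces the sequence with the canonical composite.
nfc = unicodedata.normalize("NFC", decomposed)


def code_points(s):
    """Return the code points of s in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in s]


print(code_points(nfd))  # ['U+0065', 'U+0301']
print(code_points(nfc))  # ['U+00E9']
```

Because both spellings normalize to the same sequence under either form, comparing strings after normalization treats them as equal even though their original code points differ.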

Two of the normalization forms, NFD and NFC, do not affect compatibility characters. These normalization forms are non-lossy; that is, a string may be converted to NFD or NFC with no loss of information. The other two forms, NFKD and NFKC, replace compatibility characters with their nominal equivalents. As compatibility characters may differ in appearance from their nominal equivalents, information may be lost in converting a string to NFKD or NFKC. In other words, converting to NFKD or NFKC is a lossy operation.
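The difference between the non-lossy and lossy forms can be sketched in the same way. Again, this assumes Python's standard unicodedata module as a stand-in and does not show the Internationalization Module's own interface. The sample strings are illustrative choices.

```python
import unicodedata

ligature = "\ufb01"       # LATIN SMALL LIGATURE FI, a compatibility character
superscript = "x\u00b2"   # "x" followed by SUPERSCRIPT TWO

# NFC and NFD do not affect compatibility characters.
nfc_ligature = unicodedata.normalize("NFC", ligature)
nfd_ligature = unicodedata.normalize("NFD", ligature)

# NFKC and NFKD replace them with their nominal equivalents.
nfkc_ligature = unicodedata.normalize("NFKC", ligature)
nfkd_ligature = unicodedata.normalize("NFKD", ligature)
nfkc_superscript = unicodedata.normalize("NFKC", superscript)

# Normalizing the result again with NFC does not restore the original
# character: the distinction was discarded by the compatibility mapping.
restored = unicodedata.normalize("NFC", nfkc_ligature)

print(nfc_ligature == ligature)   # True
print(nfkc_ligature)              # fi
print(nfkc_superscript)           # x2
print(restored == ligature)       # False
```

In the superscript case the loss is semantic as well as visual: after conversion to NFKC, "x squared" can no longer be told apart from the two-character string "x2".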




Copyright © Rogue Wave Software, Inc. All Rights Reserved.

The Rogue Wave name and logo, and SourcePro, are registered trademarks of Rogue Wave Software. All other trademarks are the property of their respective owners.