Valid Code Points
The range of Unicode code points is 0x0 to 0x10FFFF. However, some values within this range are reserved and are not valid characters. RWUCharTraits provides the static method RWUCharTraits::isCharacter(), which returns true if a given RWUChar32 value is a valid Unicode character code point.
RWUCharTraits::isDefined() returns true if a given RWUChar32 value is defined as the code point for a named character in the Unicode Character Database. A defined character is assigned various properties under the Unicode Standard. These properties can be accessed using other methods provided by RWUCharTraits, as described in the following sections.
RWUCharTraits::isCharacter() tests whether a code point is valid, and hence may be a defined character. RWUCharTraits::isDefined() tests whether a code point has already been defined.
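To make the idea of a valid code point concrete, the following is a minimal standalone sketch of a range-based validity test. It is not the RWUCharTraits implementation. The function name isValidCodePoint is hypothetical, and the sketch assumes that the reserved values are the surrogate code points and the Unicode noncharacters; the exact set rejected by RWUCharTraits::isCharacter() may differ. A test like RWUCharTraits::isDefined() cannot be expressed with range checks alone, because it depends on the contents of the Unicode Character Database.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only; not part of RWUCharTraits.
// Assumption: "valid" means inside the Unicode range and neither a
// surrogate code point nor a Unicode noncharacter.
constexpr bool isValidCodePoint(std::uint32_t cp) {
    if (cp > 0x10FFFF) return false;                 // beyond the Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate code points
    if (cp >= 0xFDD0 && cp <= 0xFDEF) return false;  // noncharacter block
    if ((cp & 0xFFFE) == 0xFFFE) return false;       // U+nFFFE and U+nFFFF in every plane
    return true;
}
```

Under these assumptions, isValidCodePoint(0x41) is true, while isValidCodePoint(0xD800) and isValidCodePoint(0xFFFF) are false.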
Surrogate Pairs
In UTF-16, most Unicode characters can be represented with a single 16-bit code unit. Only characters in the range 0x10000 to 0x10FFFF must be represented with a surrogate pair of two UTF-16 code units. RWUCharTraits provides the static method RWUCharTraits::requiresSurrogatePair(), which returns true if a given RWUChar32 code point requires a surrogate representation.
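The arithmetic behind a surrogate pair can be sketched in a few lines of standalone C++. The names needsSurrogatePair and toSurrogatePair are hypothetical and are not part of RWUCharTraits; the sketch uses the standard fixed-width integer types in place of RWUChar32 and RWUChar16, which is an assumption about how those types are represented.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper mirroring the range stated above; not the library API.
constexpr bool needsSurrogatePair(std::uint32_t cp) {
    return cp >= 0x10000 && cp <= 0x10FFFF;
}

// Splits a supplementary code point into its (high, low) UTF-16 code units.
// Precondition: needsSurrogatePair(cp) is true.
inline std::pair<std::uint16_t, std::uint16_t> toSurrogatePair(std::uint32_t cp) {
    const std::uint32_t offset = cp - 0x10000;  // a 20-bit value
    // The upper 10 bits select the high surrogate, the lower 10 bits the low one.
    const auto high = static_cast<std::uint16_t>(0xD800 + (offset >> 10));
    const auto low = static_cast<std::uint16_t>(0xDC00 + (offset & 0x3FF));
    return {high, low};
}
```

For example, toSurrogatePair(0x1F600) yields the code units 0xD83D and 0xDE00.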
Similarly, RWUCharTraits::isHighSurrogate() returns true if a given RWUChar16 code unit is the first, or high, code unit of a surrogate pair. A high surrogate has a value in the range U+D800 to U+DBFF. The function RWUCharTraits::isLowSurrogate() returns true if a given RWUChar16 code unit is the second, or low, code unit of a surrogate pair. A low surrogate has a value in the range U+DC00 to U+DFFF. The method RWUCharTraits::isSurrogate() returns true if a given RWUChar16 is a surrogate in the range U+D800 to U+DFFF. Surrogates are not characters themselves; they are reserved for use as the low or high code unit in a surrogate pair.
Finally, RWUCharTraits::isSingle() returns true if a given RWUChar16 code unit corresponds to a single code point, or false if the value is part of a surrogate pair.
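The code unit tests described above amount to simple range checks, and one typical use is counting code points in a UTF-16 string. The following standalone sketch illustrates this. All function names are hypothetical and are not the RWUCharTraits methods, and char16_t stands in for RWUChar16 as an assumption. The counting function treats an unpaired surrogate as one code point, which is a design choice of this sketch, not documented library behavior.

```cpp
#include <cstddef>
#include <string>

// Hypothetical range checks matching the ranges given in the text.
constexpr bool isHighUnit(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
constexpr bool isLowUnit(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }
constexpr bool isSurrogateUnit(char16_t u) { return isHighUnit(u) || isLowUnit(u); }
constexpr bool isSingleUnit(char16_t u) { return !isSurrogateUnit(u); }

// Counts code points in a UTF-16 string. A high surrogate followed by a
// low surrogate counts once; any other code unit counts once on its own.
inline std::size_t countCodePoints(const std::u16string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (isHighUnit(s[i]) && i + 1 < s.size() && isLowUnit(s[i + 1])) {
            ++i;  // skip the low half of a well-formed pair
        }
        ++count;
    }
    return count;
}
```

For example, the string u"a\U0001F600b" holds four code units but countCodePoints reports three code points.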