Valid Code Points
The range of Unicode code points is
0x0 to
0x10FFFF. However, some values within this range are reserved and are not valid characters.
RWUCharTraits provides the static method
RWUCharTraits::isCharacter(), which returns
true if a given
RWUChar32 value is a valid Unicode character code point.
RWUCharTraits::isDefined() returns
true if a given
RWUChar32 value is defined as the code point for a named character in the Unicode Character Database. A defined character is assigned various properties under the Unicode Standard. These properties can be accessed using other methods provided by
RWUCharTraits, as described in the following sections.
RWUCharTraits::isCharacter() tests whether a code point is valid, and hence may be a defined character. RWUCharTraits::isDefined() tests whether a code point has already been defined.
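The text does not list which values in the range are reserved. As a rough illustration only, the sketch below assumes the reserved values are the surrogate code points and the noncharacters designated by the Unicode Standard. The function isValidCodePoint() is a hypothetical stand-in written for this example, not the implementation of RWUCharTraits::isCharacter(), and its results may differ from the library's.

```cpp
#include <cstdint>

// Hypothetical helper for illustration; not part of RWUCharTraits.
// Assumption: "reserved" means surrogates and Unicode noncharacters.
constexpr bool isValidCodePoint(std::uint32_t c) {
    return c <= 0x10FFFF                      // inside the Unicode code space
        && !(c >= 0xD800 && c <= 0xDFFF)      // not a surrogate code point
        && !(c >= 0xFDD0 && c <= 0xFDEF)      // not in the noncharacter block
        && (c & 0xFFFE) != 0xFFFE;            // not U+nFFFE or U+nFFFF in any plane
}

// Compile-time spot checks of the ranges above.
static_assert(isValidCodePoint(0x0041), "LATIN CAPITAL LETTER A is valid");
static_assert(!isValidCodePoint(0xD800), "surrogates are not characters");
static_assert(!isValidCodePoint(0xFFFF), "U+FFFF is a noncharacter");
static_assert(!isValidCodePoint(0x110000), "beyond 0x10FFFF");
```

In application code the library call RWUCharTraits::isCharacter() would take the place of this helper. A check of this kind cannot stand in for RWUCharTraits::isDefined(), which depends on the Unicode Character Database rather than on fixed ranges.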
Surrogate Pairs
In UTF-16, most Unicode characters can be represented with a single 16-bit code unit. Only characters in the range 0x10000 to 0x10FFFF must be represented with a
surrogate pair of two UTF-16 code units.
RWUCharTraits provides the static method
RWUCharTraits::requiresSurrogatePair(), which returns
true if a given
RWUChar32 code point requires a surrogate representation.
Similarly,
RWUCharTraits::isHighSurrogate() returns
true if a given
RWUChar16 code unit is the first, or high, code unit of a surrogate pair. A high surrogate has a value in the range
U+D800 to
U+DBFF. The method
RWUCharTraits::isLowSurrogate() returns
true if a given
RWUChar16 code unit is the second, or low, code unit of a surrogate pair. A low surrogate has a value in the range
U+DC00 to
U+DFFF. The method
RWUCharTraits::isSurrogate() returns
true if a given
RWUChar16 is a surrogate in the range
U+D800 to
U+DFFF. Surrogates are not characters themselves; they are reserved for use as the low or high code unit in a surrogate pair.
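The ranges above follow directly from the UTF-16 encoding arithmetic. The following sketch shows that arithmetic together with range checks equivalent to the ones described. All function names here are hypothetical helpers written for this example. They are not the RWUCharTraits methods, and they use plain fixed-width integers in place of RWUChar16 and RWUChar32.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of RWUCharTraits.
constexpr bool needsSurrogatePair(std::uint32_t c) {
    return c >= 0x10000 && c <= 0x10FFFF;
}
constexpr bool isHighUnit(std::uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
constexpr bool isLowUnit(std::uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }
constexpr bool isSurrogateUnit(std::uint16_t u) { return isHighUnit(u) || isLowUnit(u); }

// Standard UTF-16 encoding: subtract 0x10000, then split the remaining
// 20 bits into a high 10-bit half and a low 10-bit half.
// Precondition: needsSurrogatePair(c) is true.
constexpr std::uint16_t highSurrogateOf(std::uint32_t c) {
    return static_cast<std::uint16_t>(0xD800 + ((c - 0x10000) >> 10));
}
constexpr std::uint16_t lowSurrogateOf(std::uint32_t c) {
    return static_cast<std::uint16_t>(0xDC00 + ((c - 0x10000) & 0x3FF));
}

// Inverse operation: rebuild the code point from a well-formed pair.
// Precondition: isHighUnit(hi) and isLowUnit(lo) are both true.
constexpr std::uint32_t combineSurrogates(std::uint16_t hi, std::uint16_t lo) {
    return 0x10000
         + ((static_cast<std::uint32_t>(hi) - 0xD800) << 10)
         + (static_cast<std::uint32_t>(lo) - 0xDC00);
}

// U+1F600 is encoded as the pair D83D DE00.
static_assert(highSurrogateOf(0x1F600) == 0xD83D, "high half");
static_assert(lowSurrogateOf(0x1F600) == 0xDE00, "low half");
static_assert(combineSurrogates(0xD83D, 0xDE00) == 0x1F600, "round trip");
```

The two endpoints of the supplementary range map to the endpoints of the surrogate ranges: 0x10000 becomes D800 DC00, and 0x10FFFF becomes DBFF DFFF.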
Finally,
RWUCharTraits::isSingle() returns
true if a given
RWUChar16 code unit corresponds to a single code point, or
false if the value is part of a surrogate pair.
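A typical use of this kind of test is stepping through a UTF-16 buffer one code point at a time. The sketch below counts code points in an array of code units. Both isSingleUnit() and countCodePoints() are hypothetical helpers for this example, not library functions, and the choice to count an unpaired surrogate as one code point is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when the code unit lies outside U+D800..U+DFFF
// and therefore represents a whole code point on its own.
constexpr bool isSingleUnit(std::uint16_t u) {
    return u < 0xD800 || u > 0xDFFF;
}

// Counts code points in a buffer of n UTF-16 code units.
// A high surrogate followed by a low surrogate counts once;
// an unpaired surrogate is counted as one code point (assumption).
inline std::size_t countCodePoints(const std::uint16_t* s, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (!isSingleUnit(s[i])) {
            const bool high = s[i] <= 0xDBFF;  // already known to be >= 0xD800
            const bool nextIsLow =
                i + 1 < n && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
            if (high && nextIsLow) {
                ++i;  // skip the low half of a well-formed pair
            }
        }
        ++count;
    }
    return count;
}
```

For example, the four code units 0x0041, 0xD83D, 0xDE00, 0x0042 contain three code points, because the middle two units form one surrogate pair.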