Every Unicode character is also assigned to a general character category in the Unicode Character Database.
RWUCharTraits provides a GeneralCategory enum with values that identify the various categories, such as UppercaseLetter, LowercaseLetter, DecimalDigitNumber, LineSeparator, and ConnectorPunctuation. (See the documentation for RWUCharTraits in the SourcePro C++ API Reference Guide for a complete list of enumerated values.) The values in this enumeration correspond to the general category property codes that appear in the Unicode Character Database (for example, Lu for uppercase letters and Nd for decimal digits), as described in the Unicode Character Database documentation.
The static method RWUCharTraits::getGeneralCategory() returns the value in the GeneralCategory enumeration that identifies the general character category associated with a given code point. Various convenience methods are also provided, which return true if a given RWUChar32 represents a code point in a particular character category: RWUCharTraits::isControl(), RWUCharTraits::isError(), RWUCharTraits::isLetter(), RWUCharTraits::isPunctuation(), RWUCharTraits::isSpace(), and RWUCharTraits::isWhitespace(). The static method getWhitespace() returns a null-terminated array of whitespace code points, as a convenience for use as delimiters (see “Tokenizing”).
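
The following is a minimal, self-contained sketch of the two ideas above: mapping a code point to a general category, and using a null-terminated array of whitespace code points as a delimiter set. It does not use SourcePro. Everything in the sketch namespace (the trimmed-down GeneralCategory enum, getGeneralCategory(), categoryCode(), getWhitespace(), and tokenize()) is a hypothetical stand-in written for illustration. The classifier covers only a handful of code points, and the whitespace array is a small subset. The real RWUCharTraits signatures and return types should be taken from the SourcePro C++ API Reference Guide.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {  // hypothetical stand-ins, NOT the RWUCharTraits API

// A few general categories, named after the enumerators mentioned in the text.
enum class GeneralCategory {
    UppercaseLetter,
    LowercaseLetter,
    DecimalDigitNumber,
    LineSeparator,
    ConnectorPunctuation,
    Unclassified  // anything this toy classifier does not handle
};

// Toy classifier: ASCII letters and digits, '_' and U+2028 only.
inline GeneralCategory getGeneralCategory(char32_t cp) {
    if (cp >= U'A' && cp <= U'Z') return GeneralCategory::UppercaseLetter;
    if (cp >= U'a' && cp <= U'z') return GeneralCategory::LowercaseLetter;
    if (cp >= U'0' && cp <= U'9') return GeneralCategory::DecimalDigitNumber;
    if (cp == 0x2028) return GeneralCategory::LineSeparator;
    if (cp == U'_') return GeneralCategory::ConnectorPunctuation;
    return GeneralCategory::Unclassified;
}

// Two-letter general category codes as used in the Unicode Character Database.
inline const char* categoryCode(GeneralCategory c) {
    switch (c) {
        case GeneralCategory::UppercaseLetter:      return "Lu";
        case GeneralCategory::LowercaseLetter:      return "Ll";
        case GeneralCategory::DecimalDigitNumber:   return "Nd";
        case GeneralCategory::LineSeparator:        return "Zl";
        case GeneralCategory::ConnectorPunctuation: return "Pc";
        case GeneralCategory::Unclassified:         break;
    }
    return "??";  // placeholder for this sketch, not a UCD code
}

// Null-terminated array of whitespace code points (illustrative subset).
inline const char32_t* getWhitespace() {
    static const char32_t ws[] = {0x0009, 0x000A, 0x000D, 0x0020, 0x2028, 0};
    return ws;
}

// True if cp appears in the null-terminated delimiter array.
inline bool isDelimiter(char32_t cp, const char32_t* delims) {
    for (; *delims != 0; ++delims) {
        if (*delims == cp) return true;
    }
    return false;
}

// Split text at delimiter code points, dropping empty tokens.
inline std::vector<std::u32string> tokenize(const std::u32string& text,
                                            const char32_t* delims) {
    std::vector<std::u32string> tokens;
    std::u32string current;
    for (char32_t cp : text) {
        if (isDelimiter(cp, delims)) {
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(cp);
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

}  // namespace sketch
```

With these stand-ins, sketch::categoryCode(sketch::getGeneralCategory(U'_')) yields "Pc", and sketch::tokenize(U"ab cd\tef", sketch::getWhitespace()) yields the three tokens ab, cd, and ef. The null terminator lets the delimiter set be passed as a bare pointer with no separate length, which is presumably why the library exposes its whitespace list in that form.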