Every Unicode character is also assigned to a general character category in the Unicode Character Database. RWUCharTraits provides a GeneralCategory enum with values that identify the various categories, such as UppercaseLetter, LowercaseLetter, DecimalDigitNumber, LineSeparator, and ConnectorPunctuation. (See the documentation for RWUCharTraits in the SourcePro API Reference Guide for a complete list of enumerated values.) The values in this enumeration correspond to the general category property codes that appear in the Unicode Character Database, as described in:
The static method RWUCharTraits::getGeneralCategory() returns the value in the GeneralCategory enumeration that identifies the general character category associated with a given code point. Various convenience methods are also provided, which return true if a given RWUChar32 represents a code point in a particular character category: RWUCharTraits::isControl(), RWUCharTraits::isError(), RWUCharTraits::isLetter(), RWUCharTraits::isPunctuation(), RWUCharTraits::isSpace(), and RWUCharTraits::isWhitespace(). The static method getWhitespace() returns a null-terminated array of whitespace code points, as a convenience for use as delimiters (see “Tokenizing”).
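
The relationship between the category lookup and the convenience predicates can be sketched without the library. The following is a minimal, self-contained stand-in, not the SourcePro implementation: DemoCharTraits and DemoChar32 are hypothetical names, the enumerators SpaceSeparator, Control, and Unclassified are assumptions that do not appear in the text above, and only a handful of code points are classified.

```cpp
#include <cstdint>

// Illustrative stand-in only: this is NOT RWUCharTraits. It classifies a
// handful of code points so the category/predicate relationship is visible.
typedef std::uint32_t DemoChar32;  // plays the role of RWUChar32

struct DemoCharTraits {
    // The first five names are listed in the text; the rest are assumed.
    enum GeneralCategory {
        UppercaseLetter,
        LowercaseLetter,
        DecimalDigitNumber,
        LineSeparator,
        ConnectorPunctuation,
        SpaceSeparator,   // assumed name (Unicode category Zs)
        Control,          // assumed name (Unicode category Cc)
        Unclassified      // everything this sketch does not cover
    };

    // Map a code point to its general category (tiny subset of Unicode).
    static GeneralCategory getGeneralCategory(DemoChar32 c) {
        if (c >= 0x41 && c <= 0x5A) return UppercaseLetter;     // A-Z
        if (c >= 0x61 && c <= 0x7A) return LowercaseLetter;     // a-z
        if (c >= 0x30 && c <= 0x39) return DecimalDigitNumber;  // 0-9
        if (c == 0x5F) return ConnectorPunctuation;             // '_'
        if (c == 0x20) return SpaceSeparator;                   // space
        if (c == 0x2028) return LineSeparator;                  // U+2028
        if (c <= 0x1F || (c >= 0x7F && c <= 0x9F)) return Control;
        return Unclassified;
    }

    // Convenience predicate layered on the category lookup.
    static bool isLetter(DemoChar32 c) {
        GeneralCategory g = getGeneralCategory(c);
        return g == UppercaseLetter || g == LowercaseLetter;
    }
};
```

In this sketch, DemoCharTraits::getGeneralCategory(0x5F) yields ConnectorPunctuation and DemoCharTraits::isLetter(0x7A) yields true. With the library itself, the corresponding calls would be RWUCharTraits::getGeneralCategory() and RWUCharTraits::isLetter(), which cover the full Unicode Character Database rather than this small subset.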