Character Blocks
A character block is a grouping of related characters within the Unicode encoding space.
RWUCharTraits provides a Block enum with values that identify the various blocks, such as BasicLatinBlock, GreekAndCopticBlock, BengaliBlock, ThaiBlock, EthiopicBlock, CherokeeBlock, and so on. The values in this enumeration correspond to the block names that appear in the Unicode Character Database, as described in Chapter 14, “Code Charts,” of the Unicode Standard.
The static method RWUCharTraits::getBlock() returns the value in the Block enumeration that identifies the character block containing the Unicode character with a given code point.
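Because a block is a contiguous range of code points, a block lookup amounts to a range search. The following is a minimal, self-contained sketch of that idea and does not reproduce the library's implementation. The names SampleBlock, SampleBlockRange, and sampleBlockOf() are hypothetical stand-ins, and the table lists only a handful of the block ranges published in the Unicode Character Database.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a block enumeration (not the library's Block enum).
enum class SampleBlock {
    BasicLatin, Latin1Supplement, GreekAndCoptic, Cyrillic,
    Bengali, Thai, Ethiopic, Cherokee, Unknown
};

// One entry per block: an inclusive range of code points.
struct SampleBlockRange {
    std::uint32_t first;
    std::uint32_t last;
    SampleBlock   block;
};

// A small subset of the block ranges from the Unicode Character Database.
constexpr SampleBlockRange kSampleRanges[] = {
    {0x0000, 0x007F, SampleBlock::BasicLatin},
    {0x0080, 0x00FF, SampleBlock::Latin1Supplement},
    {0x0370, 0x03FF, SampleBlock::GreekAndCoptic},
    {0x0400, 0x04FF, SampleBlock::Cyrillic},
    {0x0980, 0x09FF, SampleBlock::Bengali},
    {0x0E00, 0x0E7F, SampleBlock::Thai},
    {0x1200, 0x137F, SampleBlock::Ethiopic},
    {0x13A0, 0x13FF, SampleBlock::Cherokee},
};

// Returns the block whose range contains codePoint, or Unknown if the
// code point falls outside every range in this abbreviated table.
SampleBlock sampleBlockOf(std::uint32_t codePoint) {
    assert(codePoint <= 0x10FFFF);  // upper limit of the Unicode code space
    for (const SampleBlockRange& r : kSampleRanges) {
        if (codePoint >= r.first && codePoint <= r.last) {
            return r.block;
        }
    }
    return SampleBlock::Unknown;
}
```

With this table, sampleBlockOf(0x03B1) (GREEK SMALL LETTER ALPHA) yields SampleBlock::GreekAndCoptic. A code point such as 0x0100 yields SampleBlock::Unknown only because the sketch omits most blocks.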
Character Scripts
Every Unicode character is assigned a script name in the Unicode Character Database. The script name associated with a code point is often a better basis for distinguishing characters than the block name. Blocks are simply code point ranges; characters from the same script may be in several different blocks, while characters from different scripts may be in the same block.
RWUCharTraits provides a Script enum with values that identify the various scripts, such as Latin, Cyrillic, Hebrew, Tibetan, Runic, and so on. The values in this enumeration correspond to the script property names defined in the Unicode Character Database, as described in Unicode Technical Report #24, “Script Names”:
http://www.unicode.org/unicode/reports/tr24
The static method RWUCharTraits::getScript() returns the value in the Script enumeration that identifies the script associated with a given code point.
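The difference between blocks and scripts can be illustrated with a few well-known code points. The sketch below is self-contained and does not call the library. DemoBlock, DemoScript, demoBlockOf(), and demoScriptOf() are hypothetical names, and the script table covers only four hand-picked characters whose script assignments (Latin or Common) are taken from the Unicode Character Database.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins, not the library's Block and Script enums.
enum class DemoBlock  { BasicLatin, Latin1Supplement, Other };
enum class DemoScript { Common, Latin, Other };

// Blocks are plain code point ranges.
DemoBlock demoBlockOf(std::uint32_t codePoint) {
    assert(codePoint <= 0x10FFFF);  // upper limit of the Unicode code space
    if (codePoint <= 0x007F) return DemoBlock::BasicLatin;
    if (codePoint <= 0x00FF) return DemoBlock::Latin1Supplement;
    return DemoBlock::Other;
}

// Scripts are assigned per character; only four samples are listed here.
DemoScript demoScriptOf(std::uint32_t codePoint) {
    switch (codePoint) {
        case 0x0041: return DemoScript::Latin;   // LATIN CAPITAL LETTER A
        case 0x00E9: return DemoScript::Latin;   // LATIN SMALL LETTER E WITH ACUTE
        case 0x0031: return DemoScript::Common;  // DIGIT ONE
        case 0x00D7: return DemoScript::Common;  // MULTIPLICATION SIGN
        default:     return DemoScript::Other;   // not covered by this sketch
    }
}
```

Here U+0041 and U+00E9 share the Latin script although they sit in different blocks, while U+0041 and U+0031 share the Basic Latin block although their scripts differ. This is why a script test is often the better way to ask whether a character belongs to a particular writing system.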