SourcePro C++ 12.0
SourcePro® C++ API Reference Guide
Provides common functionality used to encode and decode UTF-8 sequences.
#include <rw/stream/RWUTF8Helper.h>
Public Types
enum EncodingCategory { oneByte, twoBytes, threeBytes, fourBytes, highSurrogate, missingLowSurrogate, lowSurrogateWithoutHighSurrogate, invalidUTF8Encoding }
Static Public Member Functions
static EncodingCategory encodeOneUChar(RWUChar uc, RWByte *res, RWUChar highSurrogateValue=0)
static EncodingCategory decodeFirstByte(RWByte b)
static EncodingCategory decodeTwoBytesEncoding(RWByte firstByte, RWByte secondByte, RWUChar &res)
static EncodingCategory decodeThreeBytesEncoding(RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWUChar &res)
static EncodingCategory decodeFourBytesEncoding(RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWByte fourthByte, RWUChar &highSurrogateValue, RWUChar &lowSurrogateValue)
The class RWUTF8Helper provides common functionality used to encode and decode UTF-8 sequences.
static EncodingCategory RWUTF8Helper::decodeFirstByte(RWByte b) [static]
Takes the first byte of a UTF-8 byte sequence encoding a single UTF-16 character and returns the encoding category to which it belongs. Throws no exceptions.
b | The first byte of a UTF-8 byte sequence encoding a single UTF-16 character.
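The classification described above follows from the bit pattern of a UTF-8 lead byte. The following is a minimal, self-contained sketch of that idea, not the library's implementation: `LeadKind` and `classifyLeadByte` are hypothetical names introduced for illustration, and the sketch assumes only the standard UTF-8 lead-byte layout. It does not reject the overlong lead bytes 0xC0 and 0xC1, which a stricter classifier might.

```cpp
#include <cstdint>

// Hypothetical stand-in for the length-related members of EncodingCategory.
enum class LeadKind { OneByte, TwoBytes, ThreeBytes, FourBytes, Invalid };

// Classify a UTF-8 lead byte by its high-order bit pattern.
inline LeadKind classifyLeadByte(std::uint8_t b) {
    if ((b & 0x80) == 0x00) return LeadKind::OneByte;    // 0xxxxxxx
    if ((b & 0xE0) == 0xC0) return LeadKind::TwoBytes;   // 110xxxxx
    if ((b & 0xF0) == 0xE0) return LeadKind::ThreeBytes; // 1110xxxx
    if ((b & 0xF8) == 0xF0) return LeadKind::FourBytes;  // 11110xxx
    // Continuation bytes (10xxxxxx) and 0xF8..0xFF cannot start a sequence.
    return LeadKind::Invalid;
}
```

A caller would typically use the returned category to decide how many further bytes to read before invoking the matching decode function.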
static EncodingCategory RWUTF8Helper::decodeFourBytesEncoding(RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWByte fourthByte, RWUChar &highSurrogateValue, RWUChar &lowSurrogateValue) [static]
Decodes a four-byte UTF-8 sequence. The function returns invalidUTF8Encoding if the four-byte sequence doesn't represent a valid UTF-8 encoding sequence. Throws no exceptions.
firstByte | The first byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
secondByte | The second byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
thirdByte | The third byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
fourthByte | The fourth byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
highSurrogateValue | The UTF-16 high surrogate resulting from the decoding of the four-byte UTF-8 sequence.
lowSurrogateValue | The UTF-16 low surrogate resulting from the decoding of the four-byte UTF-8 sequence.
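To illustrate how a four-byte UTF-8 sequence maps to a UTF-16 surrogate pair, here is a hedged sketch using only standard C++. The function name `decodeFourByteSketch` and its `bool` return value are assumptions made for this example; the real member function returns an EncodingCategory and may validate its input differently.

```cpp
#include <cstdint>

// Hypothetical sketch: returns true and fills `high`/`low` when the four
// bytes form a well-formed sequence for a code point in U+10000..U+10FFFF.
inline bool decodeFourByteSketch(std::uint8_t b1, std::uint8_t b2,
                                 std::uint8_t b3, std::uint8_t b4,
                                 std::uint16_t& high, std::uint16_t& low) {
    if ((b1 & 0xF8) != 0xF0) return false;  // lead byte must be 11110xxx
    if ((b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80 || (b4 & 0xC0) != 0x80)
        return false;                       // continuations must be 10xxxxxx

    // Assemble the 21-bit code point from the payload bits.
    const std::uint32_t cp = (static_cast<std::uint32_t>(b1 & 0x07) << 18) |
                             (static_cast<std::uint32_t>(b2 & 0x3F) << 12) |
                             (static_cast<std::uint32_t>(b3 & 0x3F) << 6) |
                             static_cast<std::uint32_t>(b4 & 0x3F);

    // Reject overlong forms and values beyond the Unicode range.
    if (cp < 0x10000u || cp > 0x10FFFFu) return false;

    // Split the 20-bit offset into the two surrogate halves.
    const std::uint32_t v = cp - 0x10000u;
    high = static_cast<std::uint16_t>(0xD800u + (v >> 10));
    low = static_cast<std::uint16_t>(0xDC00u + (v & 0x3FFu));
    return true;
}
```

For example, the bytes F0 9F 98 80 (U+1F600) yield the surrogate pair 0xD83D, 0xDE00.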
static EncodingCategory RWUTF8Helper::decodeThreeBytesEncoding(RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWUChar &res) [static]
Decodes a three-byte UTF-8 sequence. The function returns invalidUTF8Encoding if the three-byte sequence doesn't represent a valid UTF-8 encoding sequence. Throws no exceptions.
firstByte | The first byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
secondByte | The second byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
thirdByte | The third byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
res | The UTF-16 character resulting from the decoding of the three-byte UTF-8 sequence.
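The bit manipulation behind a three-byte decode can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `decodeThreeByteSketch` is a hypothetical name, it reports success as a `bool` rather than an EncodingCategory, and its choice to reject overlong forms and encoded surrogates follows the UTF-8 definition rather than any documented behavior of RWUTF8Helper.

```cpp
#include <cstdint>

// Hypothetical sketch: returns true and stores the UTF-16 code unit in
// `out` when the three bytes form a well-formed UTF-8 sequence.
inline bool decodeThreeByteSketch(std::uint8_t b1, std::uint8_t b2,
                                  std::uint8_t b3, std::uint16_t& out) {
    if ((b1 & 0xF0) != 0xE0) return false;  // lead byte must be 1110xxxx
    if ((b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80)
        return false;                       // continuations must be 10xxxxxx

    // 4 + 6 + 6 payload bits give a 16-bit value.
    const std::uint32_t value = (static_cast<std::uint32_t>(b1 & 0x0F) << 12) |
                                (static_cast<std::uint32_t>(b2 & 0x3F) << 6) |
                                static_cast<std::uint32_t>(b3 & 0x3F);

    if (value < 0x800u) return false;                     // overlong form
    if (value >= 0xD800u && value <= 0xDFFFu) return false; // surrogate range

    out = static_cast<std::uint16_t>(value);
    return true;
}
```

For example, the bytes E2 82 AC decode to 0x20AC (the euro sign).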
static EncodingCategory RWUTF8Helper::decodeTwoBytesEncoding(RWByte firstByte, RWByte secondByte, RWUChar &res) [static]
Decodes a two-byte UTF-8 sequence. The function returns invalidUTF8Encoding if the two-byte sequence doesn't represent a valid UTF-8 encoding sequence. Throws no exceptions.
firstByte | The first byte of a UTF-8 two-byte sequence encoding a single UTF-16 character.
secondByte | The second byte of a UTF-8 two-byte sequence encoding a single UTF-16 character.
res | The UTF-16 character resulting from the decoding of the two-byte UTF-8 sequence.
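A two-byte decode combines five payload bits from the lead byte with six from the continuation byte. The sketch below shows this with plain standard C++. `decodeTwoByteSketch` is a hypothetical name and its `bool` result is a simplification; whether the library itself rejects overlong forms is not stated in this reference, so that check is an assumption.

```cpp
#include <cstdint>

// Hypothetical sketch: returns true and stores the UTF-16 code unit in
// `out` when the pair of bytes is a well-formed two-byte UTF-8 sequence.
inline bool decodeTwoByteSketch(std::uint8_t b1, std::uint8_t b2,
                                std::uint16_t& out) {
    if ((b1 & 0xE0) != 0xC0) return false;  // lead byte must be 110xxxxx
    if ((b2 & 0xC0) != 0x80) return false;  // continuation must be 10xxxxxx

    // 5 + 6 payload bits give a value in 0x0000..0x07FF.
    const std::uint32_t value = (static_cast<std::uint32_t>(b1 & 0x1F) << 6) |
                                static_cast<std::uint32_t>(b2 & 0x3F);

    if (value < 0x80u) return false;        // overlong form of an ASCII value

    out = static_cast<std::uint16_t>(value);
    return true;
}
```

For example, the bytes C3 A9 decode to 0x00E9 (e with acute accent).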
static EncodingCategory RWUTF8Helper::encodeOneUChar(RWUChar uc, RWByte *res, RWUChar highSurrogateValue = 0) [static]
Encodes the UTF-16 character uc as UTF-8. The function returns the UTF-8 encoding category that was used to convert the UTF-16 character, or an error category if the UTF-16 character could not be transformed. Throws no exceptions.
uc | The UTF-16 character to be transformed.
res | A pointer to a byte array containing at least four bytes. The byte array is used to store the transformation result.
highSurrogateValue | This parameter is used only when a high surrogate was previously encountered.
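The interplay between uc and a previously seen high surrogate can be sketched as below. This is an illustrative model only: `EncodeResult` and `encodeUnitSketch` are hypothetical names, the enumerators merely mirror the category names listed in this reference, and the exact conditions under which the real function returns each category are assumptions. The sketch also assumes that a nonzero `pendingHigh` is a valid high surrogate.

```cpp
#include <cstdint>

// Hypothetical result type mirroring the documented category names.
enum class EncodeResult {
    OneByte, TwoBytes, ThreeBytes, FourBytes,
    HighSurrogate, MissingLowSurrogate, LowSurrogateWithoutHighSurrogate
};

// Encode one UTF-16 code unit into `out` (at least four bytes).
// `pendingHigh` is nonzero only when the previous unit was a high surrogate.
inline EncodeResult encodeUnitSketch(std::uint16_t uc, std::uint8_t* out,
                                     std::uint16_t pendingHigh = 0) {
    const bool isHigh = uc >= 0xD800 && uc <= 0xDBFF;
    const bool isLow = uc >= 0xDC00 && uc <= 0xDFFF;

    if (pendingHigh != 0) {
        if (!isLow) return EncodeResult::MissingLowSurrogate;
        // Recombine the surrogate pair into a supplementary code point.
        const std::uint32_t cp = 0x10000u +
            ((static_cast<std::uint32_t>(pendingHigh) - 0xD800u) << 10) +
            (static_cast<std::uint32_t>(uc) - 0xDC00u);
        out[0] = static_cast<std::uint8_t>(0xF0u | (cp >> 18));
        out[1] = static_cast<std::uint8_t>(0x80u | ((cp >> 12) & 0x3Fu));
        out[2] = static_cast<std::uint8_t>(0x80u | ((cp >> 6) & 0x3Fu));
        out[3] = static_cast<std::uint8_t>(0x80u | (cp & 0x3Fu));
        return EncodeResult::FourBytes;
    }
    if (isHigh) return EncodeResult::HighSurrogate;  // nothing written yet
    if (isLow) return EncodeResult::LowSurrogateWithoutHighSurrogate;

    if (uc < 0x80) {
        out[0] = static_cast<std::uint8_t>(uc);
        return EncodeResult::OneByte;
    }
    if (uc < 0x800) {
        out[0] = static_cast<std::uint8_t>(0xC0u | (uc >> 6));
        out[1] = static_cast<std::uint8_t>(0x80u | (uc & 0x3Fu));
        return EncodeResult::TwoBytes;
    }
    out[0] = static_cast<std::uint8_t>(0xE0u | (uc >> 12));
    out[1] = static_cast<std::uint8_t>(0x80u | ((uc >> 6) & 0x3Fu));
    out[2] = static_cast<std::uint8_t>(0x80u | (uc & 0x3Fu));
    return EncodeResult::ThreeBytes;
}
```

In this model a caller that receives the high-surrogate result keeps the unit and passes it as the third argument on the next call, which is one plausible reading of the highSurrogateValue parameter described above.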
© Copyright Rogue Wave Software, Inc. All Rights Reserved.
Rogue Wave and SourcePro are registered trademarks of Rogue Wave Software, Inc. in the United States and other countries. All other trademarks are the property of their respective owners.