SourcePro C++ 12.0

SourcePro® C++ API Reference Guide




RWUTF8Helper Class Reference
[Streams]

Provides common functionality used to encode and decode UTF-8 sequences.

#include <rw/stream/RWUTF8Helper.h>


Public Types

enum  EncodingCategory {
  oneByte, twoBytes, threeBytes, fourBytes,
  highSurrogate, missingLowSurrogate, lowSurrogateWithoutHighSurrogate, invalidUTF8Encoding
}

Static Public Member Functions

static EncodingCategory encodeOneUChar (RWUChar uc, RWByte *res, RWUChar highSurrogateValue=0)
static EncodingCategory decodeFirstByte (RWByte b)
static EncodingCategory decodeTwoBytesEncoding (RWByte firstByte, RWByte secondByte, RWUChar &res)
static EncodingCategory decodeThreeBytesEncoding (RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWUChar &res)
static EncodingCategory decodeFourBytesEncoding (RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWByte fourthByte, RWUChar &highSurrogateValue, RWUChar &lowSurrogateValue)

Detailed Description

The class RWUTF8Helper provides common functionality used to encode and decode UTF-8 sequences.


Member Enumeration Documentation

enum RWUTF8Helper::EncodingCategory

Enumerator:
oneByte 

One byte encoding form of UTF-8.

twoBytes 

Two bytes encoding form of UTF-8.

threeBytes 

Three bytes encoding form of UTF-8.

fourBytes 

Four bytes encoding form of UTF-8.

highSurrogate 

The character to be encoded is a high surrogate.

missingLowSurrogate 

No low surrogate after a high surrogate.

lowSurrogateWithoutHighSurrogate 

A low surrogate was not preceded by a high surrogate.

invalidUTF8Encoding 

The encoding is not recognized as UTF-8.


Member Function Documentation

static EncodingCategory RWUTF8Helper::decodeFirstByte ( RWByte  b  )  [static]

Takes the first byte of a UTF-8 byte sequence encoding a single UTF-16 character, and returns the encoding category to which it belongs. Throws no exceptions.

Parameters:
b The first byte of a UTF-8 byte sequence encoding a single UTF-16 character.
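
As an illustration of what classifying a first byte involves, here is a minimal, library-independent sketch. It is not the RWUTF8Helper implementation: `LeadKind` and `classifyLeadByte` are hypothetical names, `std::uint8_t` stands in for RWByte (assumed to be an 8-bit unsigned type), and the byte ranges are the strict ones from the Unicode Standard, which may differ from what decodeFirstByte() accepts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the length-related values of EncodingCategory.
enum class LeadKind { oneByte, twoBytes, threeBytes, fourBytes, invalid };

// Classifies a UTF-8 lead byte by the strict ranges of the Unicode Standard.
inline LeadKind classifyLeadByte(std::uint8_t b) {
    if (b <= 0x7F) return LeadKind::oneByte;                  // 0xxxxxxx
    if (b >= 0xC2 && b <= 0xDF) return LeadKind::twoBytes;    // 110xxxxx (C0, C1 are overlong)
    if (b >= 0xE0 && b <= 0xEF) return LeadKind::threeBytes;  // 1110xxxx
    if (b >= 0xF0 && b <= 0xF4) return LeadKind::fourBytes;   // 11110xxx, up to U+10FFFF
    return LeadKind::invalid;  // continuation bytes 0x80..0xBF and all other values
}

// Usage: the lead byte tells the caller how many bytes to read next.
inline void exampleClassifyLeadByte() {
    assert(classifyLeadByte(0x41) == LeadKind::oneByte);     // 'A'
    assert(classifyLeadByte(0xE2) == LeadKind::threeBytes);  // first byte of U+20AC
}
```

A caller would typically use the returned category to decide which of the multi-byte decoding functions to invoke on the following bytes.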

static EncodingCategory RWUTF8Helper::decodeFourBytesEncoding ( RWByte  firstByte,
RWByte  secondByte,
RWByte  thirdByte,
RWByte  fourthByte,
RWUChar &  highSurrogateValue,
RWUChar &  lowSurrogateValue
) [static]

Decodes a four-byte UTF-8 sequence into a UTF-16 surrogate pair. Returns invalidUTF8Encoding if the four bytes do not form a valid UTF-8 sequence. Throws no exceptions.

Parameters:
firstByte The first byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
secondByte The second byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
thirdByte The third byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
fourthByte The fourth byte of a UTF-8 four-byte sequence encoding a single UTF-16 character.
highSurrogateValue The UTF-16 high surrogate resulting from the decoding of the four-byte UTF-8 sequence.
lowSurrogateValue The UTF-16 low surrogate resulting from the decoding of the four-byte UTF-8 sequence.
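
The relationship between a four-byte sequence and its surrogate pair can be sketched as follows. This is an illustrative, library-independent example, not the RWUTF8Helper implementation: `decodeFourBytes` and `isContinuation` are hypothetical names, `std::uint8_t` and `std::uint16_t` stand in for RWByte and RWUChar (assumed to be 8-bit and 16-bit unsigned types), and a `false` return plays the role of invalidUTF8Encoding.

```cpp
#include <cassert>
#include <cstdint>

// True if b has the continuation-byte pattern 10xxxxxx.
inline bool isContinuation(std::uint8_t b) { return (b & 0xC0) == 0x80; }

// Hypothetical sketch: decodes b1..b4 into a UTF-16 surrogate pair.
// Returns false where the documented function would report invalidUTF8Encoding.
inline bool decodeFourBytes(std::uint8_t b1, std::uint8_t b2,
                            std::uint8_t b3, std::uint8_t b4,
                            std::uint16_t& high, std::uint16_t& low) {
    if ((b1 & 0xF8) != 0xF0 || !isContinuation(b2) ||
        !isContinuation(b3) || !isContinuation(b4)) {
        return false;
    }
    // Assemble the code point from 3 + 6 + 6 + 6 payload bits.
    const std::uint32_t cp = (std::uint32_t(b1 & 0x07) << 18) |
                             (std::uint32_t(b2 & 0x3F) << 12) |
                             (std::uint32_t(b3 & 0x3F) << 6) |
                             std::uint32_t(b4 & 0x3F);
    if (cp < 0x10000 || cp > 0x10FFFF) return false;  // overlong or out of range
    const std::uint32_t v = cp - 0x10000;             // 20 bits split across the pair
    high = static_cast<std::uint16_t>(0xD800 + (v >> 10));
    low = static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF));
    return true;
}

// Usage: U+1F600 is encoded as F0 9F 98 80 and maps to the pair D83D DE00.
inline void exampleDecodeFourBytes() {
    std::uint16_t high = 0, low = 0;
    assert(decodeFourBytes(0xF0, 0x9F, 0x98, 0x80, high, low));
    assert(high == 0xD83D && low == 0xDE00);
}
```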

static EncodingCategory RWUTF8Helper::decodeThreeBytesEncoding ( RWByte  firstByte,
RWByte  secondByte,
RWByte  thirdByte,
RWUChar &  res
) [static]

Decodes a three-byte UTF-8 sequence. Returns invalidUTF8Encoding if the three bytes do not form a valid UTF-8 sequence. Throws no exceptions.

Parameters:
firstByte The first byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
secondByte The second byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
thirdByte The third byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
res The UTF-16 character resulting from the decoding of the three-byte UTF-8 sequence.
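
A three-byte decode can be sketched along the following lines. This is an illustrative example under stated assumptions rather than the library's code: `decodeThreeBytes` is a hypothetical name, `std::uint8_t` and `std::uint16_t` stand in for RWByte and RWUChar, a `false` return plays the role of invalidUTF8Encoding, and the overlong and surrogate checks follow the Unicode Standard, which may be stricter than the documented function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: decodes a three-byte UTF-8 sequence into one UTF-16 unit.
// Returns false where the documented function would report invalidUTF8Encoding.
inline bool decodeThreeBytes(std::uint8_t b1, std::uint8_t b2, std::uint8_t b3,
                             std::uint16_t& res) {
    // Expect 1110xxxx followed by two continuation bytes 10xxxxxx.
    if ((b1 & 0xF0) != 0xE0 || (b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80) {
        return false;
    }
    // Assemble the code point from 4 + 6 + 6 payload bits.
    const std::uint32_t cp = (std::uint32_t(b1 & 0x0F) << 12) |
                             (std::uint32_t(b2 & 0x3F) << 6) |
                             std::uint32_t(b3 & 0x3F);
    if (cp < 0x0800) return false;                   // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogates are not valid in UTF-8
    res = static_cast<std::uint16_t>(cp);
    return true;
}

// Usage: the euro sign U+20AC is encoded as E2 82 AC.
inline void exampleDecodeThreeBytes() {
    std::uint16_t unit = 0;
    assert(decodeThreeBytes(0xE2, 0x82, 0xAC, unit));
    assert(unit == 0x20AC);
}
```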

static EncodingCategory RWUTF8Helper::decodeTwoBytesEncoding ( RWByte  firstByte,
RWByte  secondByte,
RWUChar &  res
) [static]

Decodes a two-byte UTF-8 sequence. Returns invalidUTF8Encoding if the two bytes do not form a valid UTF-8 sequence. Throws no exceptions.

Parameters:
firstByte The first byte of a UTF-8 two-byte sequence encoding a single UTF-16 character.
secondByte The second byte of a UTF-8 two-byte sequence encoding a single UTF-16 character.
res The UTF-16 character resulting from the decoding of the two-byte UTF-8 sequence.
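
The bit manipulation behind a two-byte decode is small enough to show in full. The following is a sketch under stated assumptions, not the library's implementation: `decodeTwoBytes` is a hypothetical name, `std::uint8_t` and `std::uint16_t` stand in for RWByte and RWUChar, and a `false` return plays the role of invalidUTF8Encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: decodes a two-byte UTF-8 sequence into one UTF-16 unit.
// Returns false where the documented function would report invalidUTF8Encoding.
inline bool decodeTwoBytes(std::uint8_t b1, std::uint8_t b2, std::uint16_t& res) {
    // Expect 110xxxxx followed by one continuation byte 10xxxxxx.
    if ((b1 & 0xE0) != 0xC0 || (b2 & 0xC0) != 0x80) return false;
    // Assemble the code point from 5 + 6 payload bits.
    const std::uint32_t cp = (std::uint32_t(b1 & 0x1F) << 6) | std::uint32_t(b2 & 0x3F);
    if (cp < 0x80) return false;  // overlong: the value fits in one byte
    res = static_cast<std::uint16_t>(cp);
    return true;
}

// Usage: U+00E9 (e with acute accent) is encoded as C3 A9.
inline void exampleDecodeTwoBytes() {
    std::uint16_t unit = 0;
    assert(decodeTwoBytes(0xC3, 0xA9, unit));
    assert(unit == 0x00E9);
}
```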

static EncodingCategory RWUTF8Helper::encodeOneUChar ( RWUChar  uc,
RWByte *  res,
RWUChar  highSurrogateValue = 0
) [static]

Encodes the UTF-16 character uc according to UTF-8. The function returns the UTF-8 encoding category that was used to convert the UTF-16 character, or an error if the UTF-16 character could not be transformed. Throws no exceptions.

Parameters:
uc The UTF-16 character to be transformed.
res A pointer to a byte array containing at least four bytes. The byte array is used to store the transformation result.
highSurrogateValue This parameter is only used when a high surrogate was previously encountered.
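
One plausible reading of the encoding protocol, inferred from the enumerators and parameters documented on this page, is sketched below. It is not the RWUTF8Helper implementation: `Category` and `encodeUnit` are hypothetical names, `std::uint16_t` and `std::uint8_t` stand in for RWUChar and RWByte, and the assumption that a caller passes a pending high surrogate back in on the next call is an interpretation, not something this page states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the EncodingCategory values relevant to encoding.
enum class Category {
    oneByte, twoBytes, threeBytes, fourBytes,
    highSurrogate, missingLowSurrogate, lowSurrogateWithoutHighSurrogate
};

// Hypothetical sketch: encodes one UTF-16 unit into out (at least four bytes).
// pendingHigh is assumed to be 0 or a high surrogate seen on the previous call.
inline Category encodeUnit(std::uint16_t uc, std::uint8_t* out,
                           std::uint16_t pendingHigh = 0) {
    const bool isHigh = uc >= 0xD800 && uc <= 0xDBFF;
    const bool isLow = uc >= 0xDC00 && uc <= 0xDFFF;
    if (pendingHigh != 0) {
        if (!isLow) return Category::missingLowSurrogate;
        // Combine the pair into a code point in U+10000..U+10FFFF.
        const std::uint32_t cp = 0x10000 +
            (std::uint32_t(pendingHigh & 0x3FF) << 10) + std::uint32_t(uc & 0x3FF);
        out[0] = static_cast<std::uint8_t>(0xF0 | (cp >> 18));
        out[1] = static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F));
        out[2] = static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F));
        out[3] = static_cast<std::uint8_t>(0x80 | (cp & 0x3F));
        return Category::fourBytes;
    }
    if (isHigh) return Category::highSurrogate;  // nothing written; wait for the low half
    if (isLow) return Category::lowSurrogateWithoutHighSurrogate;
    if (uc < 0x80) {
        out[0] = static_cast<std::uint8_t>(uc);
        return Category::oneByte;
    }
    if (uc < 0x800) {
        out[0] = static_cast<std::uint8_t>(0xC0 | (uc >> 6));
        out[1] = static_cast<std::uint8_t>(0x80 | (uc & 0x3F));
        return Category::twoBytes;
    }
    out[0] = static_cast<std::uint8_t>(0xE0 | (uc >> 12));
    out[1] = static_cast<std::uint8_t>(0x80 | ((uc >> 6) & 0x3F));
    out[2] = static_cast<std::uint8_t>(0x80 | (uc & 0x3F));
    return Category::threeBytes;
}

// Usage: encoding the surrogate pair D83D DE00 (U+1F600) takes two calls.
inline void exampleEncodeSurrogatePair() {
    std::uint8_t buf[4] = {0, 0, 0, 0};
    assert(encodeUnit(0xD83D, buf) == Category::highSurrogate);
    assert(encodeUnit(0xDE00, buf, 0xD83D) == Category::fourBytes);
    assert(buf[0] == 0xF0 && buf[1] == 0x9F && buf[2] == 0x98 && buf[3] == 0x80);
}
```

In this sketch the returned category doubles as the number of bytes written, so a caller can append exactly that many bytes of the buffer to its output.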

© Copyright Rogue Wave Software, Inc. All Rights Reserved.
Rogue Wave and SourcePro are registered trademarks of Rogue Wave Software, Inc. in the United States and other countries. All other trademarks are the property of their respective owners.