3.5 Iterating Over Strings
The Internationalization Module provides standard iterators that represent a location within an RWUString and provide access to the code point at that location:
RWUStringIterator instances are returned from non-const RWUString objects.
RWUConstStringIterator instances are returned from const RWUString objects, and from instances of classes such as RWUBreakSearch (Chapter 7). These classes return RWUConstStringIterator instances to prevent the RWUString from being changed during processing.
You can construct an RWUConstStringIterator from a given RWUStringIterator, but there is no corresponding conversion from RWUConstStringIterator to RWUStringIterator. This preserves const correctness for the iterator classes.
To iterate over the code units in an RWUString, you can use the operator[] interface inherited from RWBasicUString.
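The difference between code units and code points can be illustrated with a minimal sketch. It deliberately uses the standard std::u16string instead of RWUString so that it is self-contained; that substitution is an assumption made for illustration, and the helper names isHighSurrogate, isLowSurrogate, and countCodePoints are hypothetical and not part of the Internationalization Module. The sketch assumes well-formed UTF-16 input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers, not part of the Internationalization Module.
inline bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLowSurrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Counts the code points in a UTF-16 string. operator[] indexes code
// units, so a surrogate pair occupies two indices but counts once.
std::size_t countCodePoints(const std::u16string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (isHighSurrogate(s[i]) && i + 1 < s.size() &&
            isLowSurrogate(s[i + 1])) {
            ++i;  // skip the low surrogate of the pair
        }
        ++count;
    }
    assert(count <= s.size());  // code points never outnumber code units
    return count;
}
```

For a string holding "a", U+1D11E, and "b", indexing by code unit visits four positions, while the count of code points is three. A code point iterator hides this bookkeeping from the caller.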
3.5.1 Accessing Code Points with Iterators
Code points within a given RWUString can be accessed in forward or reverse order. You can access the beginning of a string using the RWUString::beginCodePointIterator() method. Use operator*() on an iterator positioned at RWUString::beginCodePointIterator() to return the first code point in the string.
You can access the end of a string using the RWUString::endCodePointIterator() method. An iterator positioned at RWUString::endCodePointIterator() references the location just after the last code point in the string. Attempts to reference the code point from an iterator positioned at RWUString::endCodePointIterator() throw an RWUException with status code RWUIndexOutOfBoundsError. (See Chapter 9 for a discussion of error handling in the Internationalization Module.) Attempts to advance an iterator beyond RWUString::endCodePointIterator() throw the same type of exception.
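For example, assuming str is a non-const RWUString, the following code iterates forward over all code points in the string. The RWUStringIterator returned by str is converted to an RWUConstStringIterator, because the loop only reads the code points: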
for (RWUConstStringIterator it = str.beginCodePointIterator();
     it != str.endCodePointIterator();
     ++it) {
    // Do something with *it here
}
This code iterates backward over all code points in the string:
for (RWUConstStringIterator it = str.endCodePointIterator();
     it != str.beginCodePointIterator(); ) {
    --it;
    // Do something with *it here
}
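In the backward loop, the iterator is decremented before it is dereferenced, because the end position does not reference a code point. As a rough illustration of what stepping backward over code points involves, the following sketch decodes a std::u16string from last to first. It uses the standard library rather than RWUString (an assumption made so that the example is self-contained), the helper name codePointsBackward is hypothetical, and well-formed UTF-16 input is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the Internationalization Module.
// Collects the code points of a UTF-16 string from last to first.
std::u32string codePointsBackward(const std::u16string& s) {
    std::u32string out;
    std::size_t i = s.size();  // one past the last code unit
    while (i != 0) {
        --i;                   // step back before reading
        char32_t cp = s[i];
        // A low surrogate preceded by a high surrogate is one code point.
        if (cp >= 0xDC00 && cp <= 0xDFFF && i > 0 &&
            s[i - 1] >= 0xD800 && s[i - 1] <= 0xDBFF) {
            --i;
            cp = 0x10000 +
                 ((static_cast<char32_t>(s[i]) - 0xD800) << 10) +
                 (cp - 0xDC00);
        }
        out.push_back(cp);
    }
    assert(out.size() <= s.size());  // never more code points than units
    return out;
}
```

Each backward step moves over either one or two code units, which is why a code point iterator is more convenient here than indexing by code unit.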
3.5.2 Modifying Code Points with Iterators
Code points in an RWUString can be changed using an RWUStringIterator. For example, consider an RWUStringIterator named it, and a code point stored in an RWUChar32 named cp. The following statement changes the code point referenced by it to the value in cp:
*it = cp;
Note that this operation may change the code unit length of the original RWUString if a surrogate pair is replaced with a code point represented by a single code unit, or vice versa.
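The change in code unit length can be sketched as follows. The example uses std::u16string in place of RWUString so that it is self-contained; this is an assumption for illustration only, and it does not describe how RWUStringIterator is implemented. The helper names encodeUtf16 and replaceCodePoint are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encodes one code point as one or two UTF-16 units.
std::u16string encodeUtf16(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
    return out;
}

// Hypothetical helper: replaces the code point that starts at code unit
// index pos. The string grows or shrinks by one unit when a surrogate
// pair and a single-unit code point are exchanged.
void replaceCodePoint(std::u16string& s, std::size_t pos, char32_t cp) {
    assert(pos < s.size());
    const bool isPair = s[pos] >= 0xD800 && s[pos] <= 0xDBFF &&
                        pos + 1 < s.size() &&
                        s[pos + 1] >= 0xDC00 && s[pos + 1] <= 0xDFFF;
    s.replace(pos, isPair ? 2 : 1, encodeUtf16(cp));
}
```

Replacing U+1D11E (two code units) with "x" (one code unit) shortens the string by one code unit, and the reverse replacement lengthens it. Any code unit index computed before such an assignment may therefore be stale afterward.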
The following code replaces all carriage return characters (U+000D) in an RWUString named str with null characters (U+0000):
for (RWUStringIterator it = str.beginCodePointIterator();
     it != str.endCodePointIterator();
     ++it) {
    if (RWUChar32(0x000d) == *it) *it = RWUChar32(0x0000);
}
Code points in an RWUString cannot be changed using an RWUConstStringIterator. This class provides read-only access to an RWUString.