Module: Internationalization Module
Group: Unicode String Processing
Does Not Inherit
advanceCodePoints(), data(), difference_type, iterator_category,
operator size_t(), operator*(), operator++(),
operator--(), operator=(), pointer, reference,
RWUStringIterator(), size_type, value_type
#include <rw/i18n/RWUStringIterator.h>
RWUStringIterator provides read-write access to the code points encoded by the code units within an RWUString. Code points within a given string are accessed in forward or reverse order, starting from the beginning or end of the string, respectively.
Access the beginning of a string using RWUString::beginCodePointIterator(). Use operator*() on a begin iterator to return the first code point in the string.
Access the end of a string using RWUString::endCodePointIterator(). The end iterator references the location just after the last code point in the string.
To determine whether an iterator has reached the beginning or end of the string, compare it against the iterators returned by RWUString::beginCodePointIterator() and RWUString::endCodePointIterator() using the comparison operators.
Attempting to dereference or advance an iterator that is positioned past the end of the string, such as the iterator returned by RWUString::endCodePointIterator(), throws RWBoundsErr.
The code points in an RWBasicUString can be changed using an RWUStringIterator. For example, consider an RWUStringIterator iter, and a code point stored in an RWUChar32 named codePoint. This statement changes the code point referenced by iter to the value in codePoint:
*iter = codePoint;
Note that this operation may change the length of the original string if a code point encoded as a surrogate pair (two 16-bit code units) is replaced with a code point encoded as a single 16-bit code unit, or vice versa.
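The length change described above follows directly from the UTF-16 encoding rules. The following is a minimal standalone sketch, not library code: utf16Units() and lengthDelta() are hypothetical helpers introduced only for illustration, and char32_t stands in for RWUChar32.

```cpp
#include <cassert>

// Hypothetical helper (not part of the library): number of 16-bit code
// units UTF-16 needs for a code point. Values above 0xFFFF need a
// surrogate pair.
inline int utf16Units(char32_t cp)
{
    return cp > 0xFFFF ? 2 : 1;
}

// Hypothetical helper: change in string length, in code units, when the
// code point oldCp is overwritten with newCp.
inline int lengthDelta(char32_t oldCp, char32_t newCp)
{
    return utf16Units(newCp) - utf16Units(oldCp);
}
```

For instance, lengthDelta(0x1F600, 0x0041) is -1 (a surrogate pair shrinks to one code unit), while replacing U+000D with U+0000 leaves the length unchanged.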
Bounds Checking
Bounds checking always occurs when repositioning or dereferencing an iterator. Any attempt to dereference an iterator that is in the past-the-end condition throws RWBoundsErr.
Conversion Errors
An RWUStringIterator converts code units into code points. Code points with scalar values greater than 0xFFFF are encoded as a pair of surrogate code units within the string. An RWUStringIterator cannot advance over or dereference an incomplete surrogate pair. RWUStringIterator throws RWConversionErr if an incomplete surrogate pair is encountered.
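The conversion from code units to code points can be sketched in standalone C++. This is an illustration of the UTF-16 rules only, not the library's implementation: decodeAt() is a hypothetical helper, std::vector<char16_t> stands in for the string, and std::runtime_error and std::out_of_range stand in for RWConversionErr and RWBoundsErr.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

inline bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLowSurrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Hypothetical helper: decodes the code point that starts at code unit
// offset pos. Throws std::out_of_range (stand-in for RWBoundsErr) when pos
// is past the end, and std::runtime_error (stand-in for RWConversionErr)
// when a surrogate code unit has no partner.
inline char32_t decodeAt(const std::vector<char16_t>& s, std::size_t pos)
{
    if (pos >= s.size())
        throw std::out_of_range("position is past the end");

    const char16_t u = s[pos];
    if (isLowSurrogate(u))
        throw std::runtime_error("unpaired low surrogate");
    if (!isHighSurrogate(u))
        return u;  // a single code unit encodes the code point

    if (pos + 1 >= s.size() || !isLowSurrogate(s[pos + 1]))
        throw std::runtime_error("unpaired high surrogate");

    // Combine the two 10-bit payloads and add the 0x10000 offset.
    return 0x10000
         + ((static_cast<char32_t>(u) - 0xD800) << 10)
         + (static_cast<char32_t>(s[pos + 1]) - 0xDC00);
}
```

With the code units 0xD83D 0xDE00, decodeAt() yields the scalar value 0x1F600; with 0xD83D followed by 0x0041 it reports a conversion error.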
    #include <rw/i18n/RWUStringIterator.h>
    #include <rw/i18n/RWUString.h>
    #include <iostream>

    int main()
    {
        // "Hello,\r world.\r" (0x000d is the carriage return character)
        RWUChar16 buf[16] = {
            0x0048, 0x0065, 0x006c, 0x006c, 0x006f, 0x002c, 0x000d, 0x0020,
            0x0077, 0x006f, 0x0072, 0x006c, 0x0064, 0x002e, 0x000d, 0x0000
        };

        // Create the RWUString from the buffer
        RWUString str(buf);
        RWUStringIterator it = str.beginCodePointIterator();

        // Replace all carriage return characters with null characters
        for (; it != str.endCodePointIterator(); ++it) {
            if (*it == RWUChar32(0x000d))
                *it = RWUChar32(0x0000);
        } // for

        // Iterate forward over all code points in the string
        for (it = str.beginCodePointIterator();
             it != str.endCodePointIterator(); ++it) {
            std::cout << std::hex << int(*it) << std::endl;
        } // for

        return 0;
    } // main

    Results:
    ========
    48
    65
    6c
    6c
    6f
    2c
    0
    20
    77
    6f
    72
    6c
    64
    2e
    0
See also: RWBasicUString, RWUString, RWUConstStringIterator
typedef int difference_type;
Declares a conventional Standard C++ alias for the type used to represent iterator offsets and differences.
typedef RW_SL_STD(bidirectional_iterator_tag) iterator_category;
Declares this class to be a Standard C++ bidirectional iterator.
typedef value_type* pointer;
Declares a conventional Standard C++ alias for the value_type pointer.
typedef value_type& reference;
Declares a conventional Standard C++ alias for the value_type reference.
typedef size_t size_type;
Declares a conventional Standard C++ alias for the type used to represent sizes and indices.
typedef RWUChar32 value_type;
Declares a conventional Standard C++ alias for the value type returned by this iterator class.
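These member typedefs are what Standard C++ algorithms look up through std::iterator_traits. The sketch below is a standalone illustration, not the library class: CodePointCursor and countCodePoints() are hypothetical, std::vector<char16_t> stands in for the string, the input is assumed to be well-formed UTF-16, and no bounds checking is performed.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical read-only cursor that declares the same member typedefs.
class CodePointCursor {
public:
    typedef int                             difference_type;
    typedef std::bidirectional_iterator_tag iterator_category;
    typedef char32_t                        value_type;
    typedef value_type*                     pointer;
    typedef value_type&                     reference;
    typedef std::size_t                     size_type;

    CodePointCursor(const std::vector<char16_t>& s, size_type pos)
        : s_(&s), pos_(pos) {}

    // Decode the code point at the current code unit offset.
    value_type operator*() const {
        const char16_t u = (*s_)[pos_];
        if (u >= 0xD800 && u <= 0xDBFF)
            return 0x10000
                 + ((static_cast<char32_t>(u) - 0xD800) << 10)
                 + (static_cast<char32_t>((*s_)[pos_ + 1]) - 0xDC00);
        return u;
    }
    // Step forward over one code point (one or two code units).
    CodePointCursor& operator++() {
        const char16_t u = (*s_)[pos_];
        pos_ += (u >= 0xD800 && u <= 0xDBFF) ? 2 : 1;
        return *this;
    }
    // Step backward over one code point.
    CodePointCursor& operator--() {
        --pos_;
        const char16_t u = (*s_)[pos_];
        if (u >= 0xDC00 && u <= 0xDFFF) --pos_;
        return *this;
    }
    friend bool operator==(const CodePointCursor& a, const CodePointCursor& b) {
        return a.s_ == b.s_ && a.pos_ == b.pos_;
    }
    friend bool operator!=(const CodePointCursor& a, const CodePointCursor& b) {
        return !(a == b);
    }

private:
    const std::vector<char16_t>* s_;
    size_type pos_;
};

// std::distance finds difference_type and iterator_category through
// std::iterator_traits, which reads the member typedefs above.
inline int countCodePoints(const std::vector<char16_t>& s)
{
    return std::distance(CodePointCursor(s, 0), CodePointCursor(s, s.size()));
}
```

Because the category is bidirectional rather than random access, std::distance walks the range one code point at a time, which matches the variable-width nature of UTF-16.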
bool operator==(const RWUStringIterator& lhs, const RWUStringIterator& rhs);
Returns true if both iterators reference the same RWBasicUString, and are positioned at the same code point offset within the RWBasicUString; otherwise, false.
bool operator==(const RWUStringIterator& lhs, const RWUConstStringIterator& rhs);
Returns true if both iterators reference the same RWBasicUString, and are positioned at the same code point offset within the RWBasicUString; otherwise, false.
bool operator!=(const RWUStringIterator& lhs, const RWUStringIterator& rhs);
Returns true if the two iterators reference different RWBasicUString objects, or if the code point offsets referenced by the iterators are different; otherwise, false.
bool operator!=(const RWUStringIterator& lhs, const RWUConstStringIterator& rhs);
Returns true if the two iterators reference different RWBasicUString objects, or if the code point offsets referenced by the iterators are different; otherwise, false.
RWUStringIterator();
Constructs a null, non-dereferenceable iterator. The instance cannot be used until a dereferenceable iterator is assigned to it. Any attempt to reposition or dereference a null iterator throws RWBoundsErr.
RWUStringIterator(RWBasicUString& ustr);
Constructs an RWUStringIterator that is positioned at the first element of ustr, or past-the-end of ustr if ustr is empty.
RWUStringIterator(const RWUStringIterator& iter);
Constructs an RWUStringIterator that references the same string and offset as iter.
RWUStringIterator& operator=(const RWUStringIterator& iter);
Assignment operator. Changes self so it references the same string and offset as iter, and returns a reference to self.
RWUStringIterator& operator++();
Advances self to the next code point in the string and returns a reference to self.
Throws RWBoundsErr if self is already positioned one past the last code point in the string, or if self is null. The iterator position is left unchanged.
Throws RWConversionErr if the code unit sequence found at the current iterator position contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing. Self is left pointing at the code unit immediately following the unpaired surrogate code unit, or one past the last code unit of the string if the unpaired surrogate is the last code unit in the string.
RWUStringIterator operator++(int);
Advances self to the next code point in the string, and returns a copy of the previous value of self.
Throws RWBoundsErr if self is already positioned one past the last code point in the string, or if self is null. The iterator position is left unchanged.
Throws RWConversionErr if the code unit sequence found at the current iterator position contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing. Self is left pointing at the code unit immediately following the unpaired surrogate code unit, or one past the last code unit of the string if the unpaired surrogate is the last code unit in the string.
RWUStringIterator& operator--();
Moves self back to the previous code point in the string and returns a reference to self.
Throws RWBoundsErr if self is already positioned at the beginning of the string, or if self is null. The iterator position is left unchanged.
Throws RWConversionErr if the code unit sequence found at the current iterator position contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing. Self is left pointing at the unpaired code unit.
RWUStringIterator operator--(int);
Moves self back to the previous code point in the string and returns a copy of the previous value of self.
Throws RWBoundsErr if self is already positioned at the beginning of the string, or if self is null. The iterator position is left unchanged.
Throws RWConversionErr if the code unit sequence found at the current iterator position contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing. Self is left pointing at the unpaired code unit.
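The error behavior of the decrement operators can be modeled in standalone C++. This is a sketch under stated assumptions rather than the library's implementation: stepBack() is a hypothetical helper, the position is a plain code unit offset into a std::vector<char16_t>, and std::out_of_range and std::runtime_error stand in for RWBoundsErr and RWConversionErr.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

inline bool isHigh(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLow(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Hypothetical helper: moves pos back by one code point.
// Bounds error: pos is left unchanged.
// Conversion error: pos is left at the unpaired surrogate code unit.
inline void stepBack(const std::vector<char16_t>& s, std::size_t& pos)
{
    if (pos == 0 || pos > s.size())
        throw std::out_of_range("already at the beginning");

    std::size_t p = pos - 1;
    if (isHigh(s[p])) {               // high surrogate with no low after it
        pos = p;
        throw std::runtime_error("unpaired high surrogate");
    }
    if (isLow(s[p])) {
        if (p == 0 || !isHigh(s[p - 1])) {  // low surrogate with no high before it
            pos = p;
            throw std::runtime_error("unpaired low surrogate");
        }
        --p;                          // step over the complete pair
    }
    pos = p;
}
```

Starting from the end of the code units 0x0041 0xD83D 0xDE00 0x0042, successive calls leave pos at 3, 1, and 0; a further call reports a bounds error and leaves pos at 0.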
RWUChar32 operator*() const;
Returns the value of the code point currently referenced by self.
Throws RWBoundsErr if self is null or otherwise references a position outside the bounds of the string.
Throws RWConversionErr if the code unit sequence found at the current iterator position contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing.
RWUChar32Reference operator*();
Returns a reference object that provides read-write access to the code point located at the current iterator position. Invoked when dereferencing non-const RWUStringIterator objects. The result can be used as an RWUChar32 value, or as an l-value in an RWUChar32 assignment expression.
This method does not throw any exceptions, even if self is non-dereferenceable or refers to a malformed surrogate pair. However, the returned reference object throws RWBoundsErr or RWConversionErr if it is used as an r-value or l-value in an expression and these conditions are detected.
operator size_t() const;
Returns the current code unit offset of self.
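A code unit offset differs from a code point index whenever a surrogate pair precedes the position. The standalone sketch below illustrates the difference. It is not library code: codePointIndexAt() is a hypothetical helper that assumes well-formed UTF-16 in a std::vector<char16_t> and an offset that falls on a code point boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of code points that precede the given code
// unit offset. Each surrogate pair counts as one code point but occupies
// two code units.
inline std::size_t codePointIndexAt(const std::vector<char16_t>& s,
                                    std::size_t unitOffset)
{
    assert(unitOffset <= s.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < unitOffset; ++count) {
        const bool pair = s[i] >= 0xD800 && s[i] <= 0xDBFF && i + 1 < s.size();
        i += pair ? 2 : 1;
    }
    return count;
}
```

For the code units 0x0041 0xD83D 0xDE00 0x0042, the final code point sits at code unit offset 3 but has code point index 2.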
void advanceCodePoints(int offset);
Advances self by offset code points.
Throws RWBoundsErr if the move would place self before the first element or beyond one past the last element in the string. In that case, if offset is negative, the iterator is left pointing at the first element in the string; if offset is positive, the iterator is left pointing one past the last element in the string. RWBoundsErr is also thrown if self is null.
Throws RWConversionErr if an incomplete surrogate pair is encountered where either the high surrogate or the low surrogate code unit is missing. If offset is negative, the iterator is positioned at the code unit immediately preceding the un-paired surrogate code unit, or at the first code unit of the string if the un-paired surrogate is the first code unit in the string. If offset is positive, the iterator is positioned at the code unit immediately following the un-paired surrogate code unit, or one past the last code unit of the string if the un-paired surrogate is the last code unit in the string.
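The bounds behavior described above can be modeled in standalone C++. This sketch is an assumption-laden illustration, not the library's implementation: advanceBy() is a hypothetical helper, the input is assumed to be well-formed UTF-16 (so conversion errors are not modeled), and std::out_of_range stands in for RWBoundsErr.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: moves pos (a code unit offset) by offset code
// points. On a bounds error, pos is left at the nearest valid extreme:
// 0 for a negative offset, s.size() for a positive offset.
inline void advanceBy(const std::vector<char16_t>& s, std::size_t& pos, int offset)
{
    for (; offset > 0; --offset) {
        if (pos >= s.size()) {
            pos = s.size();
            throw std::out_of_range("advanced past the end");
        }
        const bool pair = s[pos] >= 0xD800 && s[pos] <= 0xDBFF
                          && pos + 1 < s.size();
        pos += pair ? 2 : 1;          // skip one code point
    }
    for (; offset < 0; ++offset) {
        if (pos == 0)
            throw std::out_of_range("advanced before the beginning");
        --pos;
        // If we landed on the low half of a pair, step to its high half.
        if (pos > 0 && s[pos] >= 0xDC00 && s[pos] <= 0xDFFF)
            --pos;
    }
}
```

Moving by 2 from offset 0 in the code units 0x0041 0xD83D 0xDE00 0x0042 lands on code unit offset 3, because the second code point occupies two code units.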
const RWUChar16* data() const;
Returns a pointer to the string contents at the location referenced by self.
The storage referenced by this pointer is owned by the RWBasicUString associated with this iterator. This storage may not be deleted or modified. The pointer becomes invalid if the RWBasicUString is modified or destroyed.
RWUStringIterator::RWUChar32Reference provides transparent read-write access to a code point referenced by an RWUStringIterator. The code points are made available for read access as RWUChar32 values via the operator RWUChar32() conversion.
The code points are made available for write access using two assignment operators. One assignment operator accepts an RWUChar16 value while the other accepts an RWUChar32 value. Both assignment operators properly update the state of both the referenced RWUString and the iterator from which the reference was created. Specifically, the assignment operators ensure that if an assignment is made that changes the length of the referenced RWUString, the RWUString will be updated, and the RWUStringIterator will be updated so as not to be invalidated by the operation. Note, however, that such an assignment may invalidate other iterators, if it changes the number of code units in the string.
RWUStringIterator::RWUChar32Reference objects are obtained from the non-const dereference operator on RWUStringIterator, and are typically used anonymously. For example:
    RWUChar16 buf[16] = {
        0x0048, 0x0065, 0x006c, 0x006c, 0x006f, 0x002c, 0x000d, 0x0020,
        0x0077, 0x006f, 0x0072, 0x006c, 0x0064, 0x002e, 0x000d, 0x0000
    };
    RWUString str(buf);
    RWUStringIterator it = str.beginCodePointIterator();

    // Replace all carriage return characters with null characters
    for (; it != str.endCodePointIterator(); ++it) {
        if (*it == RWUChar32(0x000d))
            *it = RWUChar32(0x0000);
    }
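The general proxy-reference technique behind this kind of class can be sketched in standalone C++. Everything here is hypothetical and for illustration only: Cursor and Cursor::Ref are not the library classes, std::vector<char16_t> stands in for the string, and well-formed UTF-16 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cursor over UTF-16 code units.
class Cursor {
public:
    // Proxy returned by operator*(): reading converts to char32_t,
    // assigning re-encodes the code point in the underlying buffer.
    class Ref {
    public:
        explicit Ref(Cursor& c) : c_(c) {}
        operator char32_t() const { return c_.read(); }
        Ref& operator=(char32_t cp) { c_.write(cp); return *this; }
    private:
        Cursor& c_;
    };

    Cursor(std::vector<char16_t>& s, std::size_t pos) : s_(s), pos_(pos) {}
    Ref operator*() { return Ref(*this); }
    Cursor& operator++() { pos_ += width(); return *this; }
    std::size_t offset() const { return pos_; }

private:
    // Code units occupied by the code point at the current position.
    std::size_t width() const {
        const char16_t u = s_[pos_];
        return (u >= 0xD800 && u <= 0xDBFF) ? 2 : 1;
    }
    char32_t read() const {
        assert(pos_ < s_.size());
        if (width() == 1) return s_[pos_];
        return 0x10000
             + ((static_cast<char32_t>(s_[pos_]) - 0xD800) << 10)
             + (static_cast<char32_t>(s_[pos_ + 1]) - 0xDC00);
    }
    // Replace the current code point; the buffer may grow or shrink by
    // one code unit, while pos_ stays valid because it is an offset.
    void write(char32_t cp) {
        assert(pos_ < s_.size());
        const std::ptrdiff_t p = static_cast<std::ptrdiff_t>(pos_);
        const std::ptrdiff_t w = static_cast<std::ptrdiff_t>(width());
        s_.erase(s_.begin() + p, s_.begin() + p + w);
        if (cp > 0xFFFF) {
            const char32_t v = cp - 0x10000;
            const char16_t pair[2] = {
                static_cast<char16_t>(0xD800 + (v >> 10)),
                static_cast<char16_t>(0xDC00 + (v & 0x3FF))
            };
            s_.insert(s_.begin() + p, pair, pair + 2);
        } else {
            s_.insert(s_.begin() + p, static_cast<char16_t>(cp));
        }
    }

    std::vector<char16_t>& s_;
    std::size_t pos_;
};
```

In this sketch the cursor survives a length-changing write because it stores an offset rather than a pointer. Any other cursor positioned later in the same buffer would now point at the wrong code unit, which mirrors the caution above about other iterators being invalidated.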
Public Member Operators
operator RWUChar32() const;
Returns the value of the code point referenced by self.
Throws RWBoundsErr if the iterator associated with self is invalid. Throws RWConversionErr if the iterator associated with self points to a code unit sequence that contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing.
RWUChar32Reference& operator=(RWUChar32 codePoint);
Replaces the code point referenced by self with the value given for codePoint and returns a reference to self.
If the operation changes the length of the original RWUString, then both the RWUString and the RWUStringIterator are updated to reflect the change. The iterator is not invalidated. Note, however, that such an assignment may invalidate other iterators, if it changes the number of code units in the string.
Throws RWBoundsErr if the iterator associated with self is invalid. Throws RWConversionErr if the iterator associated with self points to a code unit sequence that contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing.
RWUChar32Reference& operator=(RWUChar16 codeUnit);
Replaces the code point referenced by self with the value given for codeUnit and returns a reference to self.
If the operation changes the length of the original RWUString, then both the RWUString and the RWUStringIterator are updated to reflect the change. The iterator is not invalidated. Note, however, that such an assignment may invalidate other iterators, if it changes the number of code units in the string.
Throws RWBoundsErr if the iterator associated with self is invalid. Throws RWConversionErr if the iterator associated with self points to a code unit sequence that contains an incomplete surrogate pair where either the high surrogate or the low surrogate code unit is missing.
© Copyright Rogue Wave Software, Inc. All Rights Reserved.
Rogue Wave and SourcePro are registered trademarks of Rogue Wave Software, Inc. in the United States and other countries. All other trademarks are the property of their respective owners.