SourcePro® API Reference Guide

 

Searches text for occurrences of a specified Unicode string.

#include <rw/i18n/RWUStringSearch.h>

Public Member Functions

 RWUStringSearch (const RWUString &pattern, const RWUString &text, const RWUCollator &collator)
 
 RWUStringSearch (const RWUString &pattern, const RWUString &text, const RWUCollator &collator, const RWUBreakSearch &breakSearch)
 
 RWUStringSearch (const RWUString &pattern, const RWUString &text, const RWULocale &locale=RWULocale::getDefault(), RWUBreakSearch::BreakType breakType=RWUBreakSearch::CodePoint)
 
 RWUStringSearch (const RWUStringSearch &source)
 
 ~RWUStringSearch ()
 
void clearBreakSearch (void)
 
RWUConstStringIterator current (void) const
 
RWUConstStringIterator first (void)
 
RWUBreakSearch::BreakType getBreakType () const
 
const RWUCollator & getCollator (void) const
 
RWUConstSubString getMatch (void) const
 
size_t getMatchLength (void) const
 
RWUConstStringIterator getMatchStart (void) const
 
RWUString getPattern (void) const
 
const RWUString & getString (void) const
 
bool isMatch (const RWUConstStringIterator &position)
 
RWUConstStringIterator last (void)
 
RWUConstStringIterator next (void)
 
RWUConstStringIterator next (const RWUConstStringIterator &position)
 
RWUStringSearch & operator= (const RWUStringSearch &rhs)
 
RWUConstStringIterator previous (void)
 
RWUConstStringIterator previous (const RWUConstStringIterator &position)
 
size_t replace (RWUString &str, const RWUString &replacement, size_t occurrences=1)
 
void setBreakSearch (const RWUBreakSearch &bSearch)
 
void setCollator (const RWUCollator &collator)
 
void setPattern (const RWUString &pattern)
 
void setString (const RWUString &text)
 

Detailed Description

RWUStringSearch searches text for occurrences of a specified pattern string. The pattern string is not a pattern in the sense of a regular expression (see RWURegularExpression), but rather a string to be searched for.

RWUStringSearch allows for flexible, collation-based string searches, unlike searches performed by RWUString::index() and RWUString::subString(). RWUString uses simple bit-wise comparisons of the code units in the strings, but RWUStringSearch employs the rules encapsulated by an RWUCollator and an optional RWUBreakSearch to determine if and where a match occurs.
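The difference between a bit-wise comparison and a rule-based comparison can be illustrated with plain standard C++. The sketch below is not the SourcePro implementation: foldKey, countExactMatches, and countLooseMatches are hypothetical helpers, and ASCII case folding plus punctuation removal is only a rough stand-in for what a primary-strength RWUCollator with punctuation shifting might do.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a crude comparison key by dropping ASCII
// punctuation and folding case. A real collator derives its keys from
// locale-specific Unicode collation rules instead.
std::string foldKey(const std::string& s)
{
    std::string key;
    for (unsigned char c : s) {
        if (std::ispunct(c)) continue;
        key.push_back(static_cast<char>(std::tolower(c)));
    }
    return key;
}

// Counts non-overlapping occurrences using exact code-unit comparison,
// which is the kind of comparison the text attributes to RWUString.
std::size_t countExactMatches(const std::string& pattern,
                              const std::string& text)
{
    if (pattern.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + pattern.size()))
        ++count;
    return count;
}

// Counts occurrences after both strings are reduced to folded keys
// (assumption: this approximates a case- and punctuation-insensitive search).
std::size_t countLooseMatches(const std::string& pattern,
                              const std::string& text)
{
    return countExactMatches(foldKey(pattern), foldKey(text));
}
```

With the pattern "UTF-8" and the text "Utf8 and utf-8.", the exact count is 0 while the loose count is 2, because both spellings reduce to the same key.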

RWUStringSearch provides a number of options to search for occurrences of the pattern string in a text string:

With iterator-style searches, RWUStringSearch, like RWUBreakSearch, maintains a "current" position within the source string. A call to first() or last() sets the current position to the code unit offset just past that of the first or last match, respectively, and returns the location of the beginning of the match. Method next() advances the current position to the code unit offset immediately following that of the next match, and returns the location of the new match. Method previous() moves the current position to the beginning of the previous non-overlapping match, and returns the location of the new match.
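The forward half of this cursor behavior can be modeled in a few lines of standard C++. This is a simplified sketch under stated assumptions, not the library's code: MatchCursor is a hypothetical class, it compares code units exactly instead of using a collator, and it uses offsets where the real class returns RWUConstStringIterator objects (text.size() plays the role of the end iterator).

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the documented cursor semantics.
class MatchCursor {
public:
    // The pattern is copied; the text is held by reference only, so the
    // caller must keep the text alive for the lifetime of the cursor.
    MatchCursor(std::string pattern, const std::string& text)
        : pattern_(std::move(pattern)), text_(text), current_(0) {}

    // Offset just past the most recent match, or text.size() after a miss.
    std::size_t current() const { return current_; }

    // Restarts at the beginning of the text and finds the first match.
    std::size_t first()
    {
        current_ = 0;
        return next();
    }

    // Finds the next match at or after the current position, moves the
    // current position just past it, and returns where the match starts.
    std::size_t next()
    {
        std::size_t pos = pattern_.empty()
                              ? std::string::npos
                              : text_.find(pattern_, current_);
        if (pos == std::string::npos) {
            current_ = text_.size();
            return text_.size();
        }
        current_ = pos + pattern_.size();
        return pos;
    }

private:
    std::string pattern_;
    const std::string& text_;
    std::size_t current_;
};
```

A counting loop over this model has the same shape as the one in the example program: call next() until it returns the end position, incrementing a counter each time.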

Example
#include <rw/i18n/RWUStringSearch.h>
#include <rw/i18n/RWUCollator.h>
#include <rw/i18n/RWUConversionContext.h>
#include <iostream>

using std::cout;
using std::endl;

int
main()
{
    // Indicate that source and target strings are
    // encoded as UTF-8.
    RWUConversionContext context("UTF-8");

    // Create a pattern for which to search.
    RWUString pattern("UTF-8");

    // Create a string in which to search.
    RWUString text("Utf8 serializes a Unicode code point "
                   "as a sequence of one to four bytes. Table 3-1 of "
                   "The Unicode Standard shows the bit distribution used "
                   "in utf-8.");

    // Create a collator based on the "en" locale
    // that ignores differences in diacritics, case,
    // and punctuation.
    RWUCollator collator("en");
    collator.setStrength(RWUCollator::Primary);
    collator.enablePunctuationShifting(true);

    // Create an RWUStringSearch that will search for
    // occurrences of `pattern' in `text', according to
    // the string comparison rules contained within
    // `collator'.
    RWUStringSearch searcher(pattern, text, collator);

    // Count the number of occurrences of `pattern'.
    int count = 0;
    while (searcher.next() != text.endCodePointIterator()) ++count;
    cout << "Pattern was found " << count << " times." << endl;

    return 0;
} // main

Program output:

Pattern was found 2 times.

See also
RWUString, RWURegularExpression, RWUBreakSearch

Constructor & Destructor Documentation

RWUStringSearch::RWUStringSearch ( const RWUString & pattern,
    const RWUString & text,
    const RWUCollator & collator
)

Constructs an RWUStringSearch that searches for occurrences of pattern in text, using the string comparison rules encapsulated by collator. A distinct (deep) copy is made of the pattern string and the collator, but only a reference to the text string is stored.

Exceptions
RWUException: Thrown to report construction errors.

RWUStringSearch::RWUStringSearch ( const RWUString & pattern,
    const RWUString & text,
    const RWUCollator & collator,
    const RWUBreakSearch & breakSearch
)

Constructs an RWUStringSearch that searches for occurrences of pattern in text, using the string comparison rules encapsulated by collator. A substring is considered a match only if it falls on boundaries returned by breakSearch. This makes it possible, for example, to search for entire words or entire sentences.

A distinct (deep) copy is made of the pattern string, collator, and breakSearch, but only a reference to the text string is stored.
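The boundary requirement described for this constructor can be pictured as a filter applied to candidate matches. The following sketch assumes ASCII text and a simple alphanumeric notion of a word; isWordBoundary and findWholeWord are hypothetical helpers, whereas RWUBreakSearch determines boundaries from Unicode rules.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical boundary test: true when offset i separates a word
// character from a non-word character, treating both ends of the text
// as non-word. This checks ASCII alphanumerics only.
bool isWordBoundary(const std::string& text, std::size_t i)
{
    auto isWord = [](unsigned char c) { return std::isalnum(c) != 0; };
    bool before = i > 0 && isWord(text[i - 1]);
    bool after = i < text.size() && isWord(text[i]);
    return before != after;
}

// Returns the offset of the first occurrence of pattern whose start and
// end both fall on word boundaries, or text.size() if there is none.
std::size_t findWholeWord(const std::string& pattern,
                          const std::string& text)
{
    if (pattern.empty()) return text.size();
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1)) {
        if (isWordBoundary(text, pos) &&
            isWordBoundary(text, pos + pattern.size()))
            return pos;
    }
    return text.size();
}
```

In the text "other the theory", the pattern "the" occurs at offsets 1, 6, and 10, but only the occurrence at offset 6 starts and ends on a boundary, so it is the only one accepted.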

Exceptions
RWUException: Thrown to report construction errors.

RWUStringSearch::RWUStringSearch ( const RWUString & pattern,
    const RWUString & text,
    const RWULocale & locale = RWULocale::getDefault(),
    RWUBreakSearch::BreakType breakType = RWUBreakSearch::CodePoint
)

Constructs an RWUStringSearch that searches for occurrences of pattern in text. String comparisons are performed using a collator, instantiated on the specified locale, at the default strength for that locale. A substring is considered a match only if it falls on boundaries returned by an instance of RWUBreakSearch with the specified breakType.

A distinct (deep) copy is made of the pattern string, but only a reference to the text string is stored.

Exceptions
RWUException: Thrown to report construction errors.

RWUStringSearch::RWUStringSearch ( const RWUStringSearch & source)

Copy constructor. Constructs an RWUStringSearch that is a deep copy of source, with the exception of the search string. Self refers to the same search string referred to by source. The current position of self is set to that of source.

Exceptions
RWUException: Thrown to report construction errors.

RWUStringSearch::~RWUStringSearch ( )

Destructor.

Member Function Documentation

void RWUStringSearch::clearBreakSearch ( void  )

Stops the use of break search in matching by clearing any association between self and an RWUBreakSearch. This operation does not re-initialize or reset self.

RWUConstStringIterator RWUStringSearch::current ( void  ) const

Returns an RWUConstStringIterator to the current position maintained by self, just after the prior match.

RWUConstStringIterator RWUStringSearch::first ( void  )

Sets self's current position to just after the first pattern match within its source string, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator.

Exceptions
RWUException: Thrown to report any errors.

RWUBreakSearch::BreakType RWUStringSearch::getBreakType ( ) const

Returns the type of break searcher in use by this RWUStringSearch instance.

const RWUCollator& RWUStringSearch::getCollator ( void  ) const

Returns a const reference to the RWUCollator that self is using to perform string comparisons.

RWUConstSubString RWUStringSearch::getMatch ( void  ) const

Returns an RWUConstSubString containing the current match. The current match is the string matched by the last call to first(), last(), next(), or previous(). Returns an empty string if no match has yet been found, or if there are no remaining matches.

size_t RWUStringSearch::getMatchLength ( void  ) const

Returns the length in code units of the string matched by the last call to first(), last(), next(), or previous(). Returns 0 if no match has yet been found, or if there are no remaining matches.

RWUConstStringIterator RWUStringSearch::getMatchStart ( void  ) const

Returns a const string iterator to the position of the string matched by the last call to first(), last(), next(), or previous(). Returns the end iterator of the source string if no match has yet been found, or if there are no remaining matches.

RWUString RWUStringSearch::getPattern ( void  ) const

Returns a copy of the pattern string associated with self.

const RWUString& RWUStringSearch::getString ( void  ) const

Returns a const reference to the text string over which self is searching for its pattern.

bool RWUStringSearch::isMatch ( const RWUConstStringIterator & position)

Returns true if the specified position in the search string begins a match for the pattern string; otherwise, returns false.

Exceptions
RWUException: Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::last ( void  )

Sets self's current position to that of the last pattern match within its source string, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator.

Exceptions
RWUException: Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::next ( void  )

Finds the position of the match that appears after the current position. Sets self's new current position to just beyond the end of the match, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for iteration over a set of matches in a string.

Exceptions
RWUException: Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::next ( const RWUConstStringIterator & position)

Finds the position of the match that appears after the specified position. Sets self's new current position to just beyond the end of the match, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for iteration over a set of matches in a string.

Exceptions
RWUException: Thrown to report any errors.

RWUStringSearch& RWUStringSearch::operator= ( const RWUStringSearch & rhs)

Makes self a deep copy of rhs, with the exception of the text, or search, string. Self refers to the same search string referred to by rhs. The current position of self is set to that of rhs.

This operation is exception safe; if an exception is thrown, the state of self remains the same as it was prior to the attempted assignment.

Exceptions
std::bad_alloc: Thrown if memory resources are exhausted.
RWUException: Thrown to report other assignment errors.

RWUConstStringIterator RWUStringSearch::previous ( void  )

Finds the position of the match that appears fully before the current position. Sets self's new current position to that of the match, and returns the location of the match.

Note that only the nearest match that appears entirely before the current position is returned. For example, assume that the pattern is the, the search string is thethe, and the current position is 4. Although a match occurs at position 3, that match extends past offset 4; the nearest offset prior to offset 4 at which an entire match can be found is position 0. Therefore, position 0 is returned.

If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for backward iteration over a set of matches in a string.

Exceptions
RWUException: Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::previous ( const RWUConstStringIterator & position)

Finds the position of the match that appears fully before the specified position. Sets self's new current position to that of the match, and returns the location of the match.

Note that only the nearest match that appears entirely before the specified position is returned. For example, assume that the pattern is the, the search string is thethe, and the specified position is 4. Although a match occurs at position 3, that match extends past offset 4; the nearest offset prior to offset 4 at which an entire match can be found is position 0. Therefore, position 0 is returned.

If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for backward iteration over a set of matches in a string.
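The "entirely before" rule for previous() can be checked with a small model in standard C++. This is a sketch that assumes exact code-unit comparison and treats a match as entirely before a position when it ends at or before that offset; lastMatchEntirelyBefore is a hypothetical helper, and text.size() stands in for the end iterator.

```cpp
#include <cstddef>
#include <string>

// Returns the start offset of the nearest occurrence of pattern that ends
// at or before `position`, or text.size() when there is none.
std::size_t lastMatchEntirelyBefore(const std::string& pattern,
                                    const std::string& text,
                                    std::size_t position)
{
    // Clamp positions that lie beyond the end of the text.
    if (position > text.size()) position = text.size();

    // No room for a complete match before the position.
    if (pattern.empty() || position < pattern.size()) return text.size();

    // rfind considers only matches that start at or before its second
    // argument, so starting at position - pattern.size() guarantees the
    // whole match lies before `position`.
    std::size_t pos = text.rfind(pattern, position - pattern.size());
    return pos == std::string::npos ? text.size() : pos;
}
```

For the pattern the, the search string thethe, and position 4, this model returns 0, which agrees with the example in the text: the match at offset 3 is skipped because it extends past offset 4.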

Exceptions
RWUException: Thrown to report any errors.

size_t RWUStringSearch::replace ( RWUString & str,
    const RWUString & replacement,
    size_t occurrences = 1
)

Searches string str for matches with the pattern stored in self. Each match is replaced with the replacement string, up to the specified number of occurrences. The default number of occurrences to replace is one. To replace all occurrences of the pattern, specify 0 occurrences. Returns the number of occurrences actually replaced.
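The meaning of the occurrences argument, including the special value 0, can be sketched with std::string. This model is an illustration only: replaceMatches is a hypothetical free function that takes the pattern as a parameter (the real method uses the pattern stored in self), it compares code units exactly, and it assumes replacement proceeds from the start of str.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of replace(): rewrites up to `occurrences` matches of
// pattern in str (0 means "all") and returns how many were replaced.
std::size_t replaceMatches(std::string& str,
                           const std::string& pattern,
                           const std::string& replacement,
                           std::size_t occurrences = 1)
{
    if (pattern.empty()) return 0;

    std::size_t replaced = 0;
    std::size_t pos = str.find(pattern);
    while (pos != std::string::npos &&
           (occurrences == 0 || replaced < occurrences)) {
        str.replace(pos, pattern.size(), replacement);
        ++replaced;
        // Resume after the inserted text so the replacement itself is
        // never rescanned for further matches.
        pos = str.find(pattern, pos + replacement.size());
    }
    return replaced;
}
```

Called with the default argument on "a-b-c", pattern "-", and replacement "+", the model changes only the first hyphen and returns 1; passing 0 changes both and returns 2.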

void RWUStringSearch::setBreakSearch ( const RWUBreakSearch & bSearch)

Sets the RWUBreakSearch that self uses to determine if a pattern falls between two boundary locations. A distinct (deep) copy of the break searcher is saved. Self is not re-initialized, and the current position is not reset.

Exceptions
RWUException: Thrown to report any errors.

void RWUStringSearch::setCollator ( const RWUCollator & collator)

Sets the RWUCollator that self uses to perform string comparisons. A distinct (deep) copy of the collator is saved. Self is re-initialized with the new collator, and the existing pattern string, search string, and break searcher. The current position is reset.

Exceptions
RWUException: Thrown to report any errors.

void RWUStringSearch::setPattern ( const RWUString & pattern)

Sets the pattern string for self. A distinct (deep) copy of pattern is saved. Self is re-initialized with the new pattern string, and the existing search string, collator, and break searcher. The current position is reset.

Exceptions
RWUException: Thrown to report any errors.

void RWUStringSearch::setString ( const RWUString & text)

Sets the text string over which self searches for its pattern. Self is re-initialized with the new string, and the current pattern, collator, and break searcher. The current position is reset. Only a reference to the new search string is stored.

Exceptions
RWUException: Thrown to report any errors.

Copyright © 2023 Rogue Wave Software, Inc., a Perforce company. All Rights Reserved.