SourcePro C++ 12.0

SourcePro® C++ API Reference Guide




RWUStringSearch Class Reference
[Unicode String Processing]

Searches text for occurrences of a specified Unicode string.

#include <rw/i18n/RWUStringSearch.h>


Public Member Functions

 RWUStringSearch (const RWUString &pattern, const RWUString &text, const RWUCollator &collator)
 RWUStringSearch (const RWUString &pattern, const RWUString &text, const RWUCollator &collator, const RWUBreakSearch &breakSearch)
 RWUStringSearch (const RWUString &pattern, const RWUString &text, const RWULocale &locale=RWULocale::getDefault(), RWUBreakSearch::BreakType breakType=RWUBreakSearch::CodePoint)
 RWUStringSearch (const RWUStringSearch &source)
 ~RWUStringSearch ()
RWUStringSearch & operator= (const RWUStringSearch &rhs)
const RWUString & getString (void) const
void setString (const RWUString &text)
RWUString getPattern (void) const
void setPattern (const RWUString &pattern)
const RWUCollator & getCollator (void) const
void setCollator (const RWUCollator &collator)
RWUBreakSearch::BreakType getBreakType () const
void setBreakSearch (const RWUBreakSearch &bSearch)
void clearBreakSearch (void)
RWUConstStringIterator first (void)
RWUConstStringIterator last (void)
RWUConstStringIterator next (void)
RWUConstStringIterator next (const RWUConstStringIterator &position)
RWUConstStringIterator previous (void)
RWUConstStringIterator previous (const RWUConstStringIterator &position)
RWUConstStringIterator current (void) const
RWUConstSubString getMatch (void) const
bool isMatch (const RWUConstStringIterator &position)
RWUConstStringIterator getMatchStart (void) const
size_t getMatchLength (void) const
size_t replace (RWUString &str, const RWUString &replacement, size_t occurrences=1)

Detailed Description

RWUStringSearch searches text for occurrences of a specified pattern string. The pattern string is not a pattern in the sense of a regular expression (see RWURegularExpression), but rather a string to be searched for.

RWUStringSearch allows for flexible, collation-based string searches, unlike searches performed by RWUString::index() and RWUString::subString(). RWUString uses simple bit-wise comparisons of the code units in the strings, but RWUStringSearch employs the rules encapsulated by an RWUCollator and an optional RWUBreakSearch to determine if and where a match occurs.

RWUStringSearch provides a number of options to search for occurrences of the pattern string in a text string:

With iterator-style searches, RWUStringSearch, like RWUBreakSearch, maintains a "current" position within the source string. A call to first() or last() sets the current position to the code unit offset just past that of the first or last match, respectively, and returns the location of the beginning of the match. Method next() advances the current position to the code unit offset immediately following that of the next match, and returns the location of the new match. Method previous() moves the current position to the beginning of the previous non-overlapping match, and returns the location of the new match.

Examples

 #include <rw/i18n/RWUStringSearch.h>
 #include <rw/i18n/RWUCollator.h>
 #include <rw/i18n/RWUConversionContext.h>
 #include <iostream>
 
 using std::cout;
 using std::endl;
 
 int
 main()
 {
   // Indicate that source and target strings are
   // encoded as UTF-8.
   RWUConversionContext context("UTF-8");
 
   // Create a pattern for which to search.
   RWUString pattern("UTF-8");
 
   // Create a string in which to search.
   RWUString text("Utf8 serializes a Unicode code point "
    "as a sequence of one to four bytes.  Table 3-1 of "
    "The Unicode Standard shows the bit distribution used "
    "in utf-8.");
 
   // Create a collator based on the "en" locale
   // that ignores differences in diacritics, case,
   // and punctuation.
   RWUCollator collator("en");
   collator.setStrength(RWUCollator::Primary);
   collator.enablePunctuationShifting(true);
 
   // Create an RWUStringSearch that will search for
   // occurrences of `pattern' in `text', according to
   // the string comparison rules contained within
   // `collator'.
   RWUStringSearch searcher(pattern, text, collator);
 
   // Count the number of occurrences of `pattern'.
   int count = 0;
   while (searcher.next() != text.endCodePointIterator()) ++count;
   cout << "Pattern was found " << count << " times." << endl;
 
   return 0;
 } // main

Program output:

 Pattern was found 2 times.

See also:
RWUString, RWURegularExpression, RWUBreakSearch

Constructor & Destructor Documentation

RWUStringSearch::RWUStringSearch ( const RWUString & pattern,
const RWUString & text,
const RWUCollator & collator
)

Constructs an RWUStringSearch that searches for occurrences of pattern in text, using the string comparison rules encapsulated by collator. A distinct (deep) copy is made of the pattern string and the collator, but only a reference to the text string is stored.

Exceptions:
RWUException Thrown to report construction errors.

RWUStringSearch::RWUStringSearch ( const RWUString & pattern,
const RWUString & text,
const RWUCollator & collator,
const RWUBreakSearch & breakSearch
)

Constructs an RWUStringSearch that searches for occurrences of pattern in text, using the string comparison rules encapsulated by collator. A substring is considered a match only if it falls on boundaries returned by breakSearch. This makes it possible, for example, to search for entire words or entire sentences.

A distinct (deep) copy is made of the pattern string, collator, and breakSearch, but only a reference to the text string is stored.

Exceptions:
RWUException Thrown to report construction errors.

RWUStringSearch::RWUStringSearch ( const RWUString & pattern,
const RWUString & text,
const RWULocale & locale = RWULocale::getDefault(),
RWUBreakSearch::BreakType breakType = RWUBreakSearch::CodePoint
)

Constructs an RWUStringSearch that searches for occurrences of pattern in text. String comparisons are performed using a collator, instantiated on the specified locale, at the default strength for that locale. A substring is considered a match only if it falls on boundaries returned by an instance of RWUBreakSearch with the specified breakType.

A distinct (deep) copy is made of the pattern string, but only a reference to the text string is stored.

Exceptions:
RWUException Thrown to report construction errors.

RWUStringSearch::RWUStringSearch ( const RWUStringSearch & source )

Copy constructor. Constructs an RWUStringSearch that is a deep copy of source, with the exception of the search string. Self refers to the same search string referred to by source. The current position of self is set to that of source.

Exceptions:
RWUException Thrown to report construction errors.

RWUStringSearch::~RWUStringSearch ( )

Destructor.


Member Function Documentation

void RWUStringSearch::clearBreakSearch ( void   ) 

Stops the use of break search in matching by clearing any association between self and an RWUBreakSearch. This operation does not re-initialize or reset self.

RWUConstStringIterator RWUStringSearch::current ( void   )  const

Returns an RWUConstStringIterator to the current position maintained by self, just after the prior match.

RWUConstStringIterator RWUStringSearch::first ( void   ) 

Sets self's current position to just after the first pattern match within its source string, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator.

Exceptions:
RWUException Thrown to report any errors.

RWUBreakSearch::BreakType RWUStringSearch::getBreakType ( ) const

Returns the type of break searcher in use by this RWUStringSearch instance.

const RWUCollator& RWUStringSearch::getCollator ( void   )  const

Returns a const reference to the RWUCollator that self is using to perform string comparisons.

RWUConstSubString RWUStringSearch::getMatch ( void   )  const

Returns an RWUConstSubString containing the current match. The current match is the string matched by the last call to first(), last(), next(), or previous(). Returns an empty string if no match has yet been found, or if there are no remaining matches.

size_t RWUStringSearch::getMatchLength ( void   )  const

Returns the length in code units of the string matched by the last call to first(), last(), next(), or previous(). Returns 0 if no match has yet been found, or if there are no remaining matches.

RWUConstStringIterator RWUStringSearch::getMatchStart ( void   )  const

Returns a const string iterator to the position of the string matched by the last call to first(), last(), next(), or previous(). Returns the end iterator of the source string if no match has yet been found, or if there are no remaining matches.

RWUString RWUStringSearch::getPattern ( void   )  const

Returns a copy of the pattern string associated with self.

const RWUString& RWUStringSearch::getString ( void   )  const

Returns a const reference to the text string over which self is searching for its pattern.

bool RWUStringSearch::isMatch ( const RWUConstStringIterator & position )

Returns true if the specified position in the search string begins a match for the pattern string; otherwise, false.

Exceptions:
RWUException Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::last ( void )

Sets self's current position to that of the last pattern match within its source string, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator.

Exceptions:
RWUException Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::next ( const RWUConstStringIterator & position )

Finds the position of the match that appears after the specified position. Sets self's new current position to just beyond the end of the match, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for iteration over the matches in a string.

Exceptions:
RWUException Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::next ( void )

Finds the position of the match that appears after the current position. Sets self's new current position to just beyond the end of the match, and returns the location of the match. If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for iteration over the matches in a string.

Exceptions:
RWUException Thrown to report any errors.

RWUStringSearch& RWUStringSearch::operator= ( const RWUStringSearch & rhs )

Makes self a deep copy of rhs, with the exception of the text, or search, string. Self refers to the same search string referred to by rhs. The current position of self is set to that of rhs.

This operation is exception safe; if an exception is thrown, then the state of self remains the same as it was prior to the attempt to perform the assignment operation.

Exceptions:
std::bad_alloc Thrown if memory resources are exhausted.
RWUException Thrown to report other assignment errors.

RWUConstStringIterator RWUStringSearch::previous ( const RWUConstStringIterator & position )

Finds the position of the match that appears fully before the specified position. Sets self's new current position to that of the match, and returns the location of the match.

Note that only the nearest match that appears entirely before the specified position is returned. For example, assume that the pattern is "the", the search string is "thethe", and the specified position is 4. Although a match occurs at position 3, the nearest offset prior to offset 4 at which an entire match can be found is position 0. Therefore, position 0 is returned.

If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for backward iteration over the matches in a string.

Exceptions:
RWUException Thrown to report any errors.

RWUConstStringIterator RWUStringSearch::previous ( void )

Finds the position of the match that appears fully before the current position. Sets self's new current position to that of the match, and returns the location of the match.

Note that only the nearest match that appears entirely before the current position is returned. For example, assume that the pattern is "the", the search string is "thethe", and the current position is 4. Although a match occurs at position 3, the nearest offset prior to offset 4 at which an entire match can be found is position 0. Therefore, position 0 is returned.

If no match is found, sets self's current position to the end of the string, and returns the source string's end iterator. This method is intended to be used for backward iteration over the matches in a string.

Exceptions:
RWUException Thrown to report any errors.

size_t RWUStringSearch::replace ( RWUString & str,
const RWUString & replacement,
size_t occurrences = 1
)

Searches string str for matches with the pattern stored in self. Each match is replaced with the replacement string, up to the specified number of occurrences. The default number of occurrences to replace is one. To replace all occurrences of the pattern, specify 0 occurrences. Returns the number of occurrences actually replaced.

void RWUStringSearch::setBreakSearch ( const RWUBreakSearch & bSearch )

Sets the RWUBreakSearch that self uses to determine if a pattern falls between two boundary locations. A distinct (deep) copy of the break searcher is saved. Self is not re-initialized, and the current position is not reset.

Exceptions:
RWUException Thrown to report any errors.

void RWUStringSearch::setCollator ( const RWUCollator & collator )

Sets the RWUCollator that self uses to perform string comparisons. A distinct (deep) copy of the collator is saved. Self is re-initialized with the new collator, and the existing pattern string, search string, and break searcher. The current position is reset.

Exceptions:
RWUException Thrown to report any errors.

void RWUStringSearch::setPattern ( const RWUString & pattern )

Sets the pattern string for self. A distinct (deep) copy of pattern is saved. Self is re-initialized with the new pattern string, and the existing search string, collator, and break searcher. The current position is reset.

Exceptions:
RWUException Thrown to report any errors.

void RWUStringSearch::setString ( const RWUString & text )

Sets the text string over which self searches for its pattern. Self is re-initialized with the new string, and the current pattern, collator, and break searcher. The current position is reset. Only a reference to the new search string is stored.

Exceptions:
RWUException Thrown to report any errors.

© Copyright Rogue Wave Software, Inc. All Rights Reserved.
Rogue Wave and SourcePro are registered trademarks of Rogue Wave Software, Inc. in the United States and other countries. All other trademarks are the property of their respective owners.