Module: Essential Tools Module
Group: String Processing Classes
Does not inherit
#include <rw/wtoken.h>
RWWString str("a string of tokens", RWWString::ascii);
RWWTokenizer(str);  // Lex the above string
Class RWWTokenizer breaks a string into separate tokens, delimited by arbitrary "white space." It can be thought of as an iterator for strings and as an alternative to the C library function wcstok(), which has the unfortunate side effect of changing the string being tokenized.
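The side effect mentioned above can be observed with the standard wide-character tokenizer, std::wcstok() from <cwchar>. The following is a minimal sketch using only the C++ standard library; the helper name buffer_after_wcstok is hypothetical and is not part of the Essential Tools Module. It assumes the three-argument form of wcstok() defined by the C standard (some older compilers ship a two-argument variant).

```cpp
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper for illustration: tokenize a writable copy of `text`
// with std::wcstok and return what remains readable from the start of the
// buffer afterwards.
std::wstring buffer_after_wcstok(const std::wstring& text)
{
    // wcstok needs a writable, null-terminated buffer.
    std::vector<wchar_t> buf(text.begin(), text.end());
    buf.push_back(L'\0');

    wchar_t* state = nullptr;
    for (wchar_t* tok = std::wcstok(buf.data(), L" \t\n", &state);
         tok != nullptr;
         tok = std::wcstok(nullptr, L" \t\n", &state)) {
        // Each call overwrites the delimiter that ends `tok` with L'\0'.
    }

    // Reading the buffer as a C string now stops at the first null that
    // wcstok wrote, so only the first token is visible.
    return std::wstring(buf.data());
}
```

Calling buffer_after_wcstok(L"a string of tokens") yields only the first token, because the delimiters in the buffer have been overwritten with nulls. A tokenizer that hands back substrings of an unmodified string avoids this.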
None
#include <iostream>
#include <rw/wtoken.h>

int main ()
{
    RWWString a(L"Something is rotten in the state of Denmark");
    RWWTokenizer next(a);   // Tokenize the string a
    RWWString token;        // Will receive each token

    // Advance until the null string is returned:
    while (!(token=next()).isNull())
        std::cout << token << "\n";
    return 0;
}
Program output (assuming your platform displays wide characters as ASCII if they are in the ASCII character set):
Something
is
rotten
in
the
state
of
Denmark
RWWTokenizer(const RWWString& s);
Construct a tokenizer to lex the string s.
RWWSubString operator()();
Advance to the next token and return it as a substring. The tokens are delimited by any of the four wide characters in L" \t\n\0" (space, tab, newline, and null).
RWWSubString operator()(const wchar_t* s);
Advance to the next token and return it as a wide substring. The tokens are delimited by any wide character in s, or any embedded wide null.
RWWSubString operator()(const wchar_t* s,size_t num);
Advance to the next token and return it as a substring. The tokens are delimited by any of the first num wide characters in s. Buffer s may contain embedded nulls, and must contain at least num wide characters. Tokens will not be delimited by nulls unless s contains nulls.
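The delimiter-set behavior described for the operator() overloads (any of the first num characters acts as a delimiter, embedded nulls included, and runs of delimiters are skipped) can be modeled with the standard library. This is a sketch under stated assumptions, not the RWWTokenizer implementation: split_skipping_empty is a hypothetical helper built on std::wstring::find_first_of and find_first_not_of.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration: split `text` on any of the first
// `num` wide characters in `delims`, skipping empty fields. `delims` must
// point to at least `num` characters and may contain embedded nulls.
std::vector<std::wstring> split_skipping_empty(const std::wstring& text,
                                               const wchar_t* delims,
                                               std::size_t num)
{
    std::vector<std::wstring> tokens;
    std::size_t pos = 0;
    while (true) {
        // Skip any run of delimiters in front of the next token.
        std::size_t begin = text.find_first_not_of(delims, pos, num);
        if (begin == std::wstring::npos)
            break;                                  // nothing but delimiters left

        // The token ends at the next delimiter or at the end of the text.
        std::size_t end = text.find_first_of(delims, begin, num);
        if (end == std::wstring::npos)
            end = text.size();

        tokens.push_back(text.substr(begin, end - begin));
        pos = end;
    }
    return tokens;
}
```

With delims set to L",;", passing num as 2 splits on both characters, while passing 1 splits on the comma only. Because the count is explicit, a delimiter buffer such as L"\0" with num equal to 1 splits on embedded nulls.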
RWWString operator()(RWTRegex<char>& regex);
Returns the next token using a delimiter pattern represented by the regular expression pattern regex.
This method, unlike the other operator() overloads, allows a single occurrence of a delimiter to span multiple characters.
For example, consider the RWWTokenizer instance tok. The statement tok(RWWString("ab")) treats either a or b as a delimiter character. On the other hand, tok(RWTRegex<char>("ab")) treats the two-character pattern, ab, as a single delimiter.
This method consumes consecutive occurrences of delimiters and skips over any empty fields that may be present in the string. To obtain empty fields as well as non-empty fields, use the nextToken() method.
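The difference between a delimiter character set and a delimiter pattern can be illustrated with std::wregex from the C++ standard library. This sketch does not use RWTRegex; split_on_pattern is a hypothetical helper, and discarding empty pieces is an assumption chosen to mirror the skipping behavior described above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration: split `text` wherever `pattern`
// matches, discarding empty fields.
std::vector<std::wstring> split_on_pattern(const std::wstring& text,
                                           const std::wstring& pattern)
{
    const std::wregex delim(pattern);
    std::vector<std::wstring> tokens;

    // Submatch index -1 selects the text between matches instead of the
    // matches themselves.
    std::wsregex_token_iterator it(text.begin(), text.end(), delim, -1);
    const std::wsregex_token_iterator end;
    for (; it != end; ++it) {
        if (it->length() != 0)          // drop empty fields
            tokens.push_back(it->str());
    }
    return tokens;
}
```

For the input L"xabyabaz", the pattern L"ab" treats the two-character sequence as one delimiter and leaves the trailing "az" intact, whereas the character class L"[ab]" treats a and b individually and splits "az" as well.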
bool done() const;
Returns true if the last token from the search string has been extracted. If the last token has not been extracted, it will return false. When using the function call operator interface, this is the same as the last non-empty token having been returned.
RWWString nextToken();
Returns the next token using a default set of delimiter characters.
This method may return an empty token if there are consecutive occurrences of any delimiter character in the search string.
RWWString nextToken(const RWWString& str);
Returns the next token using a specified string of delimiter characters.
This method may return an empty token if there are consecutive occurrences of any delimiter character in the search string.
RWWString nextToken(const RWWString& str, size_t num);
Returns the next token using the first num code units from the given string str of delimiter characters.
This method may return an empty token if there are consecutive occurrences of any delimiter character in the search string.
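The way nextToken() can return empty tokens, and the way done() reports that the last token has been extracted, can be modeled with a small standard-library class. This is a sketch under stated assumptions, not the RWWTokenizer implementation: FieldCursor, next_field, and all_fields are hypothetical names, and the real class may treat leading or trailing delimiters differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only.
class FieldCursor {
public:
    explicit FieldCursor(std::wstring text) : text_(std::move(text)) {}

    // Return the field starting at the current position, delimited by any
    // of the first `num` characters of `delims`. Adjacent delimiters yield
    // an empty field instead of being skipped.
    std::wstring next_field(const std::wstring& delims, std::size_t num)
    {
        assert(num <= delims.size());
        if (done_)
            return std::wstring();

        std::size_t end = text_.find_first_of(delims.data(), pos_, num);
        if (end == std::wstring::npos) {
            done_ = true;                       // this is the last field
            return text_.substr(pos_);
        }
        std::wstring field = text_.substr(pos_, end - pos_);
        pos_ = end + 1;                         // step over one delimiter
        return field;
    }

    // True once the last field has been handed out.
    bool done() const { return done_; }

private:
    std::wstring text_;
    std::size_t pos_ = 0;
    bool done_ = false;
};

// Collect every field, empty ones included, looping until done().
std::vector<std::wstring> all_fields(const std::wstring& text,
                                     const std::wstring& delims)
{
    FieldCursor cursor(text);
    std::vector<std::wstring> fields;
    while (!cursor.done())
        fields.push_back(cursor.next_field(delims, delims.size()));
    return fields;
}
```

In this model, the input L"a,,b" with the delimiter L"," produces three fields, the middle one empty. Looping on done() rather than on an empty result is what lets empty fields pass through without ending the loop early.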
RWWString nextToken(RWTRegex<char>& regex);
Returns the next token using a delimiter pattern represented by a regular expression pattern.
Unlike the other nextToken() overloads, this method allows a single occurrence of a delimiter to span multiple characters.
For example, nextToken(RWWString("ab")) treats either a or b as a delimiter character. Conversely, nextToken(RWTRegex<char>("ab")) treats the two-character pattern ab as a single delimiter.
This method may return an empty token if there are consecutive matches of the delimiter pattern in the search string.
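Empty tokens between back-to-back pattern matches can be illustrated with std::wregex from the C++ standard library. This is a sketch that stands in for the regular-expression overload rather than calling it; fields_on_pattern is a hypothetical helper, and RWTRegex may differ in pattern syntax and edge cases.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration: split `text` wherever `pattern`
// matches and keep every field, including empty ones.
std::vector<std::wstring> fields_on_pattern(const std::wstring& text,
                                            const std::wstring& pattern)
{
    const std::wregex delim(pattern);
    std::vector<std::wstring> fields;

    // Submatch index -1 yields the text between consecutive matches.
    std::wsregex_token_iterator it(text.begin(), text.end(), delim, -1);
    const std::wsregex_token_iterator end;
    for (; it != end; ++it)
        fields.push_back(it->str());
    return fields;
}
```

For the input L"xababy" and the pattern L"ab", the two adjacent matches enclose an empty field, so three fields come back with the middle one empty.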
© Copyright Rogue Wave Software, Inc. All Rights Reserved.
Rogue Wave and SourcePro are registered trademarks of Rogue Wave Software, Inc. in the United States and other countries. All other trademarks are the property of their respective owners.