Enhanced String
SFL’s enhanced string is based on the basic_string<> class that is part of the Standard C++ Library. In comparison to the standard versions, SFL’s string offers the following advantages:
Conversion between different character sets
Implicit cast operator to C string (array of characters)
Formatting capabilities
Buffer allocation capabilities
Unicode compliance
The code for SFL’s enhanced string can be found in the <String\StringEx.h> header file under your SFL include directory.
The core of the implementation is in a new class called basic_string_ex<>. This class derives directly from std::basic_string. The signature of the class is:
Example 93 – Signature for class basic_string_ex<>
template <
    typename _CharType,
    typename _ConversionCharType,
    typename _Traits = char_traits_ex<_CharType, _ConversionCharType>,
    typename _A = std::allocator<_Traits>
>
class basic_string_ex :
    public std::basic_string<_CharType, _Traits, _A>
As you can see, the first difference in the template signature is that it takes not one but two character types. These will usually be the pair <char, wchar_t> or <wchar_t, char>. The first character type corresponds to actual elements of the string. For instance, a basic_string_ex<char, wchar_t> derives from basic_string<char>; therefore it is implemented as a sequence of elements of type char. The second character type enables conversions to be done from sequences of this character type to the native character type.
Another difference is that the _Traits template parameter defaults to a new class, char_traits_ex<>, rather than the standard char_traits<>. Class char_traits_ex<> is SFL’s extension of its Standard C++ Library counterpart: it adds conversion and formatting routines to the traits class, and these routines are used to implement the additional features of the extended string. Like basic_string_ex<>, the char_traits_ex<> template takes two character types as parameters.
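The two-character-type arrangement described above can be sketched with standard components only. The names traits_ex_sketch and string_ex_sketch are hypothetical stand-ins, not SFL classes, and the sketch parameterizes the allocator on the character type because that is what std::basic_string expects. It only shows how the second type can ride along in the traits and the string; no conversion logic is included.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for char_traits_ex<>: the standard traits for CharT,
// plus a record of the conversion character type.
template <typename CharT, typename ConvT>
struct traits_ex_sketch : std::char_traits<CharT> {
    typedef ConvT conversion_char_type;
};

// Hypothetical stand-in for basic_string_ex<>: stores CharT elements and
// remembers which character type it is meant to convert from.
template <typename CharT,
          typename ConvT,
          typename Traits = traits_ex_sketch<CharT, ConvT>,
          typename Alloc = std::allocator<CharT> >
class string_ex_sketch : public std::basic_string<CharT, Traits, Alloc> {
public:
    typedef std::basic_string<CharT, Traits, Alloc> base_type;
    typedef ConvT conversion_char_type;

    string_ex_sketch() : base_type() {}
    string_ex_sketch(const CharT* s) : base_type(s) {}
};

// Usage: the first parameter decides the element type of the sequence.
void signature_demo() {
    string_ex_sketch<char, wchar_t> narrow("abc");
    string_ex_sketch<wchar_t, char> wide(L"abc");
    assert(narrow.size() == 3);
    assert(wide.size() == 3);
    assert(sizeof(narrow[0]) == sizeof(char));
    assert(sizeof(wide[0]) == sizeof(wchar_t));
}
```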
Character Set Conversion
In addition to the constructors and assignment operators present in std::basic_string<>, basic_string_ex<> declares a set of conversion constructors and operators. These routines take a sequence of characters of the conversion character type and construct a string from it, using the normal API calls for conversion to and from Unicode. For example, the following code copies and converts the contents of a BSTR variable to the appropriate string type.
 
BSTR bstrSomeString = GetBstr();
foundation::string s(bstrSomeString);
This is not a supported feature of the standard string.
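As a rough illustration of how a conversion constructor of this kind can work, the sketch below builds a narrow string from a wide character sequence. It is not SFL code: narrow_string_sketch is a hypothetical class, and the portable std::wcstombs() routine is used in place of the Win32 conversion calls so that the example compiles anywhere.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical narrow string that also accepts wide character input.
class narrow_string_sketch : public std::string {
public:
    narrow_string_sketch() {}
    narrow_string_sketch(const char* s) : std::string(s) {}

    // Conversion constructor and assignment from the "other" character type.
    narrow_string_sketch(const wchar_t* ws) { assign_converted(ws); }
    narrow_string_sketch& operator=(const wchar_t* ws) {
        assign_converted(ws);
        return *this;
    }

private:
    void assign_converted(const wchar_t* ws) {
        // Worst case: every wide character needs MB_CUR_MAX bytes.
        std::size_t capacity = std::wcslen(ws) * MB_CUR_MAX + 1;
        std::vector<char> buffer(capacity);
        std::size_t written = std::wcstombs(&buffer[0], ws, capacity);
        if (written == static_cast<std::size_t>(-1))
            throw std::runtime_error("character not representable in the current locale");
        assign(&buffer[0], written);
    }
};

// Usage: wide input is converted on construction and on assignment.
void conversion_demo() {
    narrow_string_sketch s(L"hello");
    assert(s == "hello");
    s = L"abc";
    assert(s.size() == 3);
}
```

The conversion here depends on the current C locale; plain ASCII text converts in any locale.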
Casting
The standard definition of the basic_string type purposely omits any casting operator, on the theory that implicit conversions can cause problems in many situations. Instead, the standard defines the c_str() function to provide access to the internal character sequence managed by the string object.
It is often convenient, however, to have a casting operator so that the string object can be used naturally in calls to routines that take a constant C string, such as many of the Win32 API functions. For this reason, the SFL string does publish the casting operator. Casting is permitted only to a const character sequence.
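A minimal sketch of such a casting operator is shown below. The class castable_string_sketch and the function c_api_length() are hypothetical; the function stands in for any routine that takes a constant C string.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical string that converts implicitly to a const C string.
class castable_string_sketch : public std::string {
public:
    castable_string_sketch(const char* s) : std::string(s) {}

    // Only a const view is exposed, so callers cannot modify the contents.
    operator const char*() const { return c_str(); }
};

// Stand-in for a C-style API that expects a constant C string.
std::size_t c_api_length(const char* text) { return std::strlen(text); }

// Usage: the string object is passed directly, with no call to c_str().
void casting_demo() {
    castable_string_sketch title("Win32");
    assert(c_api_length(title) == 5);
    assert(std::strcmp(title, "Win32") == 0);
}
```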
Formatting and Buffering
SFL’s basic_string_ex<> offers formatting capabilities similar to the classic printf() in C. This is something notably absent from the standard string. The recommended method to achieve this using the Standard C++ Library is with string streams. Sometimes, however, it is more convenient to go back to the old-fashioned way, particularly if format strings need to be stored externally, such as in a resource file.
The signature of the format() routine is:
 
void format(const _CharType* lpFormat, ...);
The format() routine supports the same formatting codes as the printf() function. The result of the formatting operation is assigned to the string instance on which this method is called.
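One way a printf()-style format() member can be written on top of the standard library is sketched below, assuming a C++11 compiler. The class formattable_string_sketch is hypothetical, and the SFL implementation may work differently; the sketch measures the output with std::vsnprintf() first and then formats into a buffer of that size.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical string with a printf()-style format() member.
class formattable_string_sketch : public std::string {
public:
    void format(const char* fmt, ...) {
        va_list args;
        va_start(args, fmt);
        va_list args_copy;
        va_copy(args_copy, args);  // the list is consumed once per pass

        // First pass: ask how many characters the result needs.
        int needed = std::vsnprintf(nullptr, 0, fmt, args);
        va_end(args);
        if (needed < 0) {          // formatting error: leave the string empty
            va_end(args_copy);
            clear();
            return;
        }

        // Second pass: format into a buffer of the right size.
        std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(buffer.data(), buffer.size(), fmt, args_copy);
        va_end(args_copy);
        assign(buffer.data(), static_cast<std::size_t>(needed));
    }
};

// Usage: the formatted result replaces the current contents.
void format_demo() {
    formattable_string_sketch s;
    s.format("%s has %d items", "list", 3);
    assert(s == "list has 3 items");
}
```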
Many Win32 API routines require that you pass a previously allocated character array of a given size as an output parameter. This often involves allocating a character array on the stack just as a temporary buffer, and assigning the contents of that array to a string variable afterwards. A particular feature of MFC’s CString comes in handy in such cases: the GetBuffer() and ReleaseBuffer() pair of functions.
Our basic_string_ex implements similar functionality. The signatures of the methods involved are:
 
_CharType* get_buffer(unsigned int _N = 0);
_CharType* get_buffer_set_length(unsigned int _N = 0);
void release_buffer(unsigned int _N = 0);
The usage of these functions is the same as in MFC. Whenever a pre-allocated character sequence is required, call get_buffer(), specifying the size of the buffer. get_buffer() returns a non-const pointer to the internal character sequence, which is guaranteed to be at least the size specified. Alternatively, get_buffer_set_length() returns a buffer of exactly the size specified.
After the call to the external function, you can optionally call release_buffer() to release the space allocated in the buffer but not used by the actual contents. release_buffer() assumes that your string ends at the first null character; if your string has embedded null characters, another mechanism must be used to deallocate that space. A typical use looks like this:
 
string sItem;
int nres = ::LoadString(hResInst, stringId,
                        sItem.get_buffer(256), 256);
sItem.release_buffer();
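The same pattern can be tried outside Windows with the sketch below, which assumes C++11. The class buffered_string_sketch is a hypothetical approximation built on std::string, not the SFL implementation, and fill_buffer() is a made-up stand-in for an API such as ::LoadString that writes into a caller-supplied array.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical string exposing a writable buffer, in the spirit of
// get_buffer() and release_buffer().
class buffered_string_sketch : public std::string {
public:
    // Grow the string to at least n characters and hand out its storage.
    char* get_buffer(std::size_t n) {
        if (size() < n)
            resize(n);  // new characters are initialized to '\0'
        return &(*this)[0];
    }

    // Shrink the string so that it ends at the first null character.
    void release_buffer() { resize(std::strlen(c_str())); }
};

// Stand-in for a C API that fills a caller-supplied buffer and returns the
// number of characters written, not counting the terminator.
int fill_buffer(char* out, int capacity) {
    const char text[] = "Hello";
    const int length = static_cast<int>(sizeof(text)) - 1;
    if (capacity <= length)
        return 0;
    std::memcpy(out, text, sizeof(text));  // copies the terminator too
    return length;
}

// Usage: reserve space, let the API write into it, then trim the excess.
void buffer_demo() {
    buffered_string_sketch item;
    int written = fill_buffer(item.get_buffer(256), 256);
    assert(written == 5);
    assert(item.size() == 256);
    item.release_buffer();
    assert(item == "Hello");
}
```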
Type Definitions
The Standard C++ Library defines a type string, which is no more than a typedef for basic_string<char>; however, it is much more convenient and natural to use just the name string than the entire templatized symbol. In a similar fashion, SFL defines short names for the most commonly used string types. The Standard C++ Library’s string, however, is always defined to use 1-byte characters, regardless of whether the application uses the Unicode character set. SFL goes one step further and takes into account the standard way for a Windows application to define the character set it will use: the definition of string varies depending on whether the _UNICODE preprocessor macro is defined. For applications that need string processing for char or wchar_t types independently of the _UNICODE symbol, two permanent definitions are also included: cstring and wstring. The definition of each of these types is as follows:
cstring: String of ANSI characters. Always defined as basic_string_ex<char, wchar_t>.
wstring: String of wide (Unicode) characters. Always defined as basic_string_ex<wchar_t, char>.
string: Defined as a synonym of cstring if the _UNICODE preprocessor flag is not defined; otherwise defined as a synonym of wstring.
Remember that the string symbol we refer to here does not conflict with the string type in the Standard C++ Library as long as the names stay qualified: the former is within the stingray::foundation:: namespace, whereas the latter is in the std:: namespace. If you flatten both namespaces with using directives, a name ambiguity will occur.
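The ambiguity can be reproduced with a small sketch. The namespace sketch_foundation and its string class are hypothetical stand-ins for stingray::foundation and the SFL string; the offending lines are left in a comment so that the example still compiles.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the stingray::foundation namespace.
namespace sketch_foundation {
    class string : public std::string {
    public:
        string(const char* s) : std::string(s) {}
    };
}

// Fully qualified names never collide.
std::size_t qualified_names_demo() {
    std::string standard("std");
    sketch_foundation::string enhanced("sfl");
    assert(standard.size() == 3);
    assert(enhanced.size() == 3);
    return standard.size() + enhanced.size();
}

// Flattening both namespaces makes the unqualified name ambiguous:
//
//     using namespace std;
//     using namespace sketch_foundation;
//     string s("x");   // error: reference to 'string' is ambiguous
```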
A similar naming trick is included for string streams. SFL does not provide an enhanced string stream; however, it does define some convenient names depending on the _UNICODE symbol, just as explained before. Thus, we have:
cstringstream: Stream of ANSI characters. Always defined as basic_stringstream<char>.
wstringstream: Stream of wide (Unicode) characters. Always defined as basic_stringstream<wchar_t>.
stringstream: Defined as a synonym of cstringstream if the _UNICODE preprocessor flag is not defined; otherwise defined as a synonym of wstringstream.
These streams use the char_traits_ex classes as their traits parameters, so they are compatible with SFL’s string types.
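The selection mechanism for these short names can be sketched as follows. This is an approximation only: the namespace sketch_foundation is hypothetical, and plain std::basic_string and std::basic_stringstream stand in for the SFL types and their char_traits_ex parameters.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the SFL namespace.
namespace sketch_foundation {
    // Permanent names, independent of _UNICODE.
    typedef std::basic_string<char>          cstring;
    typedef std::basic_string<wchar_t>       wstring;
    typedef std::basic_stringstream<char>    cstringstream;
    typedef std::basic_stringstream<wchar_t> wstringstream;

    // Names that follow the character set chosen for the application.
#ifdef _UNICODE
    typedef wstring       string;
    typedef wstringstream stringstream;
#else
    typedef cstring       string;
    typedef cstringstream stringstream;
#endif
}

// Usage: the permanent names behave the same in either build.
void typedef_demo() {
    sketch_foundation::cstringstream narrow;
    narrow << "id=" << 42;
    assert(narrow.str() == "id=42");

    sketch_foundation::wstringstream wide;
    wide << L"n=" << 7;
    assert(wide.str() == L"n=7");
}
```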