HTTP and International Documents
The HTTP package allows documents in different locales and character sets to be downloaded and processed in a C++ application. Combined with the Internationalization Module of SourcePro Core, it provides a complete solution for working with documents in various character sets. Example 21 illustrates this: it retrieves a web page, reads the charset parameter from the Content-Type header of the reply, and converts the body of the message to UTF-8.
NOTE >> The following example uses classes from the Internationalization Module of SourcePro Core. For more information on the Internationalization Module, refer to the Internationalization Module User’s Guide and the SourcePro C++ API Reference Guide.
Example 21 – Using HTTP with the Internationalization Module
// Create a URL for the target web page.
RWURL url("http://www.amazon.co.jp/");

// Create a string to hold the charset, default to US-ASCII.
RWCString charset = "US-ASCII";

// Connect to the web server and retrieve the page specified.
RWHttpAgent agent;
RWHttpReply reply = agent.executeGet(url);

// Check and see if a Content-Type header is present.
RWHttpHeaderList headers = reply.getHeaders();
size_t index = headers.index("Content-Type");

if (index != RW_NPOS)
{
    // A Content-Type header is present, extract it.
    RWHttpContentTypeHeader ctHeader(headers[index]);

    // Check and see if a charset is present.
    RWCString tmp = ctHeader.getParameterValue("charset");
    if (!tmp.isNull())
    {
        // We found an alternate charset.
        charset = tmp;
    }
}

// Create converters from the original charset of the message
// to UTF-8.
RWUToUnicodeConverter fromMsgCharset(charset);
RWUFromUnicodeConverter toUtf8("UTF-8");

// Create a RWUString from the body of the message.
RWUString body(reply.getBody(), fromMsgCharset);

// Output the body of the message as UTF-8.
cout << body.toBytes(toUtf8) << endl;
return 0;
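
The charset-detection step in Example 21 can also be sketched in standard C++, which may help when checking what value a given Content-Type header should yield. This is only an illustration, not how the HTTP package does it: the helper name charsetFromContentType is hypothetical and not part of SourcePro, the US-ASCII fallback simply mirrors the default chosen in Example 21, and the parser makes the simplifying assumption that no semicolon appears inside a quoted parameter value.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of SourcePro): returns the value of the
// charset parameter in a Content-Type header value, or `fallback` when the
// parameter is absent or empty.
// Simplifying assumption: quoted values do not contain ';'.
std::string charsetFromContentType(const std::string& value,
                                   const std::string& fallback = "US-ASCII")
{
    // Work on a lowercase copy so the parameter name matches in any case.
    std::string lower(value);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });

    const std::string name = "charset=";

    // Parameters follow the media type, each introduced by ';'.
    std::string::size_type pos = lower.find(';');
    while (pos != std::string::npos) {
        // Skip the ';' and any whitespace before the parameter name.
        std::string::size_type start = lower.find_first_not_of(" \t", pos + 1);
        if (start == std::string::npos) {
            break;
        }
        // The parameter ends at the next ';' or at the end of the value.
        std::string::size_type end = lower.find(';', start);

        if (lower.compare(start, name.size(), name) == 0) {
            std::string::size_type first = start + name.size();
            std::string::size_type count =
                (end == std::string::npos) ? std::string::npos : end - first;

            // Take the value from the original string to preserve its case.
            std::string result = value.substr(first, count);

            // Trim trailing whitespace.
            std::string::size_type last = result.find_last_not_of(" \t");
            result.erase(last == std::string::npos ? 0 : last + 1);

            // Strip one pair of surrounding double quotes, if present.
            if (result.size() >= 2 &&
                result.front() == '"' && result.back() == '"') {
                result = result.substr(1, result.size() - 2);
            }
            return result.empty() ? fallback : result;
        }
        pos = end;
    }
    return fallback;
}
```

Called with the header value "text/html; charset=Shift_JIS", the helper returns Shift_JIS; called with "text/html", it returns the fallback. The result could then be used as the charset name when constructing the converter, in the same way Example 21 uses the charset variable.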