Internationalization and Localization
Sybase supports internationalization through its datatypes nchar, nvarchar, unichar, and univarchar. Datatypes nchar and nvarchar are used for storing characters in the National Language Character Set, and datatypes unichar and univarchar are used for storing data in UTF-16 encoding. For more information, please see the Sybase documentation.
On the client side, Sybase localizes with the help of six operating system environment variables: LC_CTYPE, LC_COLLATE, LC_TIME, LC_MESSAGE, LC_ALL, and LANG. The values of these environment variables change the way Sybase interprets and converts data and produces error messages.
LC_CTYPE – Indicates the character set to use for datatype conversions.
LC_COLLATE – Indicates the collating sequence to use when sorting and comparing character data.
LC_TIME – Indicates the date and time data representation to use for a datetime string, such as date and time formats, names in the native languages, and month and day abbreviations.
LC_MESSAGE – Indicates the language and character set to use for messages.
LC_ALL – Acts as a single environment variable for all four of the above. If LC_ALL or LANG is set, none of the four variables above is used.
LANG – Same as LC_ALL. It is used only if LC_ALL is not set.
The first four definitions are taken from the Sybase International Developer’s Guide for Open Client/Server.
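The precedence described above can be sketched in a few lines of standard C++. This is only an illustration of the rule as stated in this section, not Sybase's implementation. The function name effectiveLocaleValue is hypothetical, the map stands in for the process environment, and treating an empty value as "not set" is an assumption.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not part of Sybase or the DB Access Module.
// Returns the value that governs `category` (for example "LC_TIME")
// under the precedence described in the text:
//   1. LC_ALL, if set
//   2. LANG, if LC_ALL is not set
//   3. the category-specific variable, if neither of the above is set
// `env` stands in for the process environment so the rule is easy to test.
// Assumption: an empty value is treated the same as an unset variable.
std::string effectiveLocaleValue(const std::map<std::string, std::string>& env,
                                 const std::string& category)
{
    const char* const overrides[] = { "LC_ALL", "LANG" };
    for (const char* name : overrides) {
        auto it = env.find(name);
        if (it != env.end() && !it->second.empty()) {
            return it->second;
        }
    }
    auto it = env.find(category);
    return (it != env.end()) ? it->second : std::string();
}
```

In a real program, the map could be filled from std::getenv for the six variables listed above.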
Prerequisites
There are three prerequisites for using internationalization functionality with the DB Access Module for Sybase:
1. In order to use the Sybase datatypes unichar and univarchar, the Sybase Adaptive Server character set must be set to UTF-8.
2. When installing the Sybase client, all required locales, localized messages, and character sets must be installed.
3. The environment variables described in “Internationalization and Localization” should be set to values appropriate for the operating system locale. These values should match the vendor locale values listed in the locales.dat file in the directory <Sybase home directory>/locales. For information on setting up the correct values of the environment variables, please see the Sybase International Developer’s Guide for Open Client/Server.
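As a rough illustration of the third prerequisite, the sketch below checks whether an environment variable value appears as a vendor locale name in lines read from locales.dat. It assumes entries of the form "locale = name, language, character set" and ignores the platform sections of the file; verify both assumptions against the file in your installation. The names LocaleEntry, parseLocaleLine, and localeIsListed are hypothetical and are not part of Sybase or the DB Access Module.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one entry; field names are illustrative only.
struct LocaleEntry {
    std::string name;      // vendor locale name, compared with LC_ALL, LANG, ...
    std::string language;  // language name
    std::string charset;   // character set name
};

// Removes leading and trailing whitespace.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return std::string();
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parses one line. Returns true and fills `out` only if the line has the
// assumed form "locale = name, language, charset[, ...]".
bool parseLocaleLine(const std::string& line, LocaleEntry& out)
{
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || trim(line.substr(0, eq)) != "locale") {
        return false;  // section header, comment, or unrelated line
    }
    std::istringstream fields(line.substr(eq + 1));
    std::vector<std::string> parts;
    std::string part;
    while (std::getline(fields, part, ',')) {
        parts.push_back(trim(part));
    }
    if (parts.size() < 3) return false;
    out.name = parts[0];
    out.language = parts[1];
    out.charset = parts[2];
    return true;
}

// Returns true if `value` matches the locale name of any parsed entry.
// Simplification: platform sections are not distinguished.
bool localeIsListed(const std::vector<std::string>& lines, const std::string& value)
{
    for (const std::string& line : lines) {
        LocaleEntry entry;
        if (parseLocaleLine(line, entry) && entry.name == value) return true;
    }
    return false;
}
```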
Changing Locales on the Fly
If your application requires more than one locale, you can obtain different database connections with different locale settings using the RWDBSybCtLibEnvironmentHandle APIs, discussed in “System and Environment Handles to Sybase Specific Resources”. With the help of these APIs, you can localize data in different formats within the same application.
Data Binding
To fully use the internationalization features supported by Sybase, the Sybase datatypes nchar and nvarchar are bound to the C++ datatype RWDBMBString, and the Sybase datatypes unichar and univarchar are bound to the C++ datatype RWBasicUString.
When sending data to the server, the Sybase Open Client Client-Library converts the contents of nchar and nvarchar data to the server encoding. When receiving data from the server, it converts the contents to the client’s encoding. Hence, the contents of the nchar and nvarchar datatypes depend on the localization settings.
On the other hand, unichar and univarchar datatypes always store UTF-16 encoded data. For this reason, their contents are never internationalized or localized, and the Client-Library never converts this data when sending it to or receiving it from the server. When binding to the unichar or univarchar datatypes, data must be bound as UTF-16 encoded data; therefore, these server datatypes are bound to the C++ datatype RWBasicUString. RWBasicUString is a datatype in the Essential Tools Module designed to encapsulate UTF-16 encoded data. The unichar and univarchar datatypes can also be bound to the C++ datatype RWUString provided in the Internationalization Module.
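To make "UTF-16 encoded data" concrete, the following sketch encodes a single Unicode code point as UTF-16 code units using only the standard library. It does not use RWBasicUString; std::u16string stands in for a UTF-16 buffer, and the function name appendUtf16 is hypothetical.

```cpp
#include <string>

// Hypothetical helper, not part of the Essential Tools Module.
// Appends the UTF-16 encoding of code point `cp` to `out`.
// Returns false, leaving `out` unchanged, if `cp` is not a valid Unicode
// scalar value (above U+10FFFF or inside the surrogate range).
bool appendUtf16(char32_t cp, std::u16string& out)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return false;
    }
    if (cp < 0x10000) {
        // Basic Multilingual Plane: one 16-bit code unit.
        out.push_back(static_cast<char16_t>(cp));
        return true;
    }
    // Supplementary planes: a high surrogate followed by a low surrogate.
    cp -= 0x10000;
    out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
    out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    return true;
}
```

Characters outside the Basic Multilingual Plane occupy two code units, so the number of code units in a UTF-16 buffer is not always the number of characters.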
Please see “Datatypes” for more information on data binding.
Limitations
Internationalization functionality for the DB Access Module for Sybase carries the following limitations:
1. As per Sybase documentation, in order for unichar and univarchar datatypes to function correctly, Sybase Adaptive Server must use the UTF-8 character set.
2. Currently, the DB Access Module for Sybase does not support Unicode logins, passwords, or metadata.
RWWString
Sybase does not differentiate between multibyte and wide character strings while inserting or retrieving data from the server. During input binding, RWWString data is converted to RWDBMBString data before binding. In output binding, the data is received as an RWDBMBString and then converted to an RWWString, if required.
RWWStrings should not be used for binding with Sybase datatypes unichar or univarchar. Even if an RWWString contains UTF-16 encoded data, it is converted to RWDBMBString before binding, so the bound data is no longer UTF-16 encoded.
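The effect of a wide-to-multibyte conversion can be illustrated with the standard C library. The sketch below is not the conversion the DB Access Module performs. It uses std::wcstombs, whose result depends on the current C locale, to show why wide character data does not remain UTF-16 after such a step. The function name wideToMultibyte is hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not part of SourcePro.
// Converts `in` to a multibyte string using the current C locale.
// Returns false if some character cannot be represented in that locale.
// Note: conversion stops at the first embedded null character, if any.
bool wideToMultibyte(const std::wstring& in, std::string& out)
{
    // MB_CUR_MAX is the maximum number of bytes per character in the
    // current locale; one extra byte leaves room for the terminator.
    std::vector<char> buf(in.size() * MB_CUR_MAX + 1);
    const std::size_t n = std::wcstombs(buf.data(), in.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return false;
    }
    out.assign(buf.data(), n);
    return true;
}
```

The bytes produced follow the locale's multibyte encoding, which is why the result is suited to nchar and nvarchar columns rather than to unichar or univarchar columns.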
Date Formats
Date expressions are loaded into the database and extracted from the database in a locale-neutral manner. The DB Interface Module properly formats instances of RWDateTime and RWDate automatically. The application programmer is responsible for formatting these types outside the scope of DB Interface Module API calls.
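For formatting done by the application itself, one option is to format a broken-down time with an explicitly chosen std::locale, so that the output does not depend on the global locale. The sketch below uses only the standard library and a std::tm value; it does not use RWDateTime or RWDate, and the function name formatTimestamp is hypothetical.

```cpp
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the DB Interface Module.
// Formats `t` as "YYYY-MM-DD HH:MM:SS" using the given locale.
// The default is the classic "C" locale, which gives the same output
// regardless of the LC_* and LANG settings of the process.
std::string formatTimestamp(const std::tm& t,
                            const std::locale& loc = std::locale::classic())
{
    std::ostringstream os;
    os.imbue(loc);
    os << std::put_time(&t, "%Y-%m-%d %H:%M:%S");
    return os.str();
}
```

Passing a different std::locale object, where one is installed on the system, is one way to produce display output for a particular user without changing how values are exchanged with the database.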