
Open Client Client-Library/C Reference Manual


unichar datatype

The Open Client/Open Server 12.5 unichar datatype supports two-byte characters. This enables multilingual client applications and reduces the overhead associated with character-set conversions.

Designed like the Open Client/Open Server CS_CHAR datatype, CS_UNICHAR is a shared C programming datatype that can be used anywhere CS_CHAR is used. CS_UNICHAR stores character data in the 16-bit Unicode Transformation Format (UTF-16), that is, as two-byte characters.

The Open Client/Open Server CS_UNICHAR datatype corresponds to the Adaptive Server 12.5 UNICHAR fixed-width and UNIVARCHAR variable-width datatypes, which store two-byte characters in the Adaptive Server database.

Open Client 12.5 applications can also use this functionality on their own to convert other datatypes to and from CS_UNICHAR on the client side, even if the server cannot process two-byte characters.

New datatypes and capabilities

To send and receive two-byte characters, the client specifies its preferred byte order during the login phase of the connection. Any necessary byte swapping is performed on the server side.
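As an illustration of what byte swapping means for two-byte characters, the following is a minimal sketch in plain C. It is not the server's implementation. The type name unichar_unit and both function names are hypothetical stand-ins introduced here; unichar_unit plays the role of a single CS_UNICHAR code unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CS_UNICHAR: one 16-bit UTF-16 code unit. */
typedef uint16_t unichar_unit;

/* Exchange the high and low bytes of a single 16-bit code unit. */
unichar_unit swap_unit(unichar_unit u)
{
    return (unichar_unit)((u << 8) | (u >> 8));
}

/* Swap every code unit in place, converting a buffer between byte orders. */
void swap_units(unichar_unit *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        buf[i] = swap_unit(buf[i]);
    }
}
```

Applying swap_units twice to the same buffer restores the original data, which is why a single routine can serve both directions of the conversion.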

The Open Client ct_capability() parameters:

To access two-byte character data, Open Client/Open Server implements:

Setting the datatype field of the CS_DATAFMT structure to CS_UNICHAR_TYPE lets you use existing API calls such as ct_bind, ct_describe, and ct_param.

CS_UNICHAR uses the format bitmask field of CS_DATAFMT to describe the destination format.

For example, in the Client-Library sample program rpc.c, the BuildRpcCommand() function contains the following code, which describes the datatype:

...
strcpy(datafmt.name, "@charparam");
datafmt.namelen = CS_NULLTERM;
datafmt.datatype = CS_CHAR_TYPE;
datafmt.maxlength = CS_MAX_CHAR;
datafmt.status = CS_RETURN;
datafmt.locale = NULL;
...

In the example above, the character type is defined as datafmt.datatype = CS_CHAR_TYPE. Use an ASCII text editor to change the datafmt.datatype field to CS_UNICHAR_TYPE, as in the new uni_rpc.c sample program:

...
strcpy(datafmt.name, "@charparam");
datafmt.namelen = CS_NULLTERM;
datafmt.datatype = CS_UNICHAR_TYPE;
datafmt.maxlength = CS_MAX_CHAR;
datafmt.status = CS_RETURN;
datafmt.locale = NULL;
...

Because CS_UNICHAR is a UTF-16 encoded Unicode character datatype that stores each character in two bytes, the maximum length of a CS_UNICHAR string parameter sent to the server is one-half the maximum length of a CS_CHAR parameter, which stores each character in one byte.
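The halving can be made concrete with a small sketch. It assumes that the length limit is counted in bytes and that each character occupies exactly one two-byte unit. The type unichar_unit and both helper functions are hypothetical names introduced for illustration; they are not part of Client-Library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CS_UNICHAR: one two-byte character. */
typedef uint16_t unichar_unit;

/* Number of two-byte characters that fit in a buffer of max_bytes bytes. */
size_t unichar_capacity(size_t max_bytes)
{
    return max_bytes / sizeof(unichar_unit);
}

/* Byte length occupied by a string of n_chars two-byte characters. */
size_t unichar_byte_length(size_t n_chars)
{
    /* Guard against overflow in the multiplication below. */
    assert(n_chars <= SIZE_MAX / sizeof(unichar_unit));
    return n_chars * sizeof(unichar_unit);
}
```

For instance, with an illustrative limit of 256 bytes, unichar_capacity(256) yields 128 characters, half of the 256 one-byte characters that the same buffer would hold.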

Table 2-8 lists the CS_DATAFMT bitmask fields.

Table 2-8: CS_DATAFMT structure

Bitmask field      Description
CS_FMT_NULLTERM    The data is null-terminated with a two-byte Unicode null (0x0000).
CS_FMT_PADBLANK    The data is padded with two-byte Unicode blanks (0x0020) to the full length of the destination variable.
CS_FMT_PADNULL     The data is padded with two-byte Unicode nulls (0x0000) to the full length of the destination variable.
CS_FMT_UNUSED      No format information is provided.
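The effect of these format settings on a destination buffer can be sketched in plain C. This is an illustration of the described behavior only, not library code. The enum pad_mode, its members, the type unichar_unit, and the function fill_unichar are hypothetical names that mirror the bitmask fields; they do not carry the real CS_FMT_* values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CS_UNICHAR: one two-byte character. */
typedef uint16_t unichar_unit;

/* Hypothetical modes mirroring the bitmask fields; not the real values. */
enum pad_mode { PAD_NONE, PAD_NULLTERM, PAD_BLANK, PAD_NULL };

/*
 * Copy src_len units into dest (capacity dest_len units), then finish
 * the destination according to mode. Returns 0 on success, or -1 if the
 * data (plus a terminator, when requested) does not fit.
 */
int fill_unichar(unichar_unit *dest, size_t dest_len,
                 const unichar_unit *src, size_t src_len,
                 enum pad_mode mode)
{
    assert(dest != NULL && (src != NULL || src_len == 0));

    /* A terminator needs one extra unit beyond the data. */
    size_t needed = src_len + (mode == PAD_NULLTERM ? 1 : 0);
    if (needed > dest_len) {
        return -1;
    }
    for (size_t i = 0; i < src_len; i++) {
        dest[i] = src[i];
    }
    switch (mode) {
    case PAD_NULLTERM:   /* single two-byte null after the data */
        dest[src_len] = 0x0000;
        break;
    case PAD_BLANK:      /* two-byte blanks to the end of the buffer */
        for (size_t i = src_len; i < dest_len; i++) dest[i] = 0x0020;
        break;
    case PAD_NULL:       /* two-byte nulls to the end of the buffer */
        for (size_t i = src_len; i < dest_len; i++) dest[i] = 0x0000;
        break;
    case PAD_NONE:       /* no format information: leave the rest as is */
        break;
    }
    return 0;
}
```

Note that the terminator and the padding are each a full two-byte unit, so a buffer sized for one-byte characters would be too small by a factor of two.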

isql and bcp utilities

Both the isql and the bcp utilities automatically support unichar data if the server supports two-byte character data. bcp now supports 4K, 8K and 16K page sizes.

If the client's default character set is UTF-8, isql displays two-byte character data, and bcp saves it, in UTF-8 format. Otherwise, the data is displayed or saved as two-byte Unicode data in binary format.

Use isql -Jutf8 to set the client character set for isql. Use bcp -Jutf8 to set the client character set for the bcp utility.
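To show what presenting two-byte character data in UTF-8 involves, here is a minimal sketch of a standard UTF-16 to UTF-8 encoder in plain C. It is not how isql or bcp is implemented; the function name utf16_to_utf8 is a hypothetical name chosen for this example, and the input is assumed to be in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode UTF-16 code units as UTF-8. Returns the number of bytes written,
 * or -1 on an unpaired surrogate or when out is too small.
 */
long utf16_to_utf8(const uint16_t *in, size_t in_len,
                   unsigned char *out, size_t out_cap)
{
    size_t o = 0;
    assert(out != NULL && (in != NULL || in_len == 0));

    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= in_len || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF) {
                return -1;
            }
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;  /* low surrogate without a preceding high one */
        }

        /* Bytes required for this code point. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need > out_cap) {
            return -1;
        }
        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return (long)o;
}
```

A two-byte character can expand to as many as three UTF-8 bytes, so an output buffer of three bytes per input unit is always sufficient.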

Limitations

The server to which the Open Client/Open Server application connects must support two-byte Unicode datatypes and use UTF-8 as its default character set.

If the server does not support two-byte Unicode datatypes, it returns the error message:

"Type not found. Unichar/univarchar is not supported."

CS_UNICHAR does not support the conversion from UTF-8 to UTF-16 byte format for CS_BOUNDARY and CS_SENSITIVITY. All other datatype formats are convertible.

CS_UNICHAR does not provide C programming operations on UTF-16 encoded Unicode data, such as operations on Unicode character strings. For full support for Unicode character strings, you must use the Sybase product Unilib. See the Unilib Reference Manual, which is part of the Sybase Unicode Developers Kit 2.0.

