Open ServerConnect Programmer's Reference for COBOL
Chapter 2: Topics
Processing Japanese Client Requests
Note: The Japanese Conversion Module (JCM) is available for CICS only. If you are not using the JCM, you can skip this section.
Open ServerConnect can accept and process client requests written in Japanese if you have the JCM installed. The JCM is provided on a separate tape. It performs the workstation-to-mainframe and mainframe-to-workstation translations needed to process requests containing Japanese characters.
The Open ServerConnect environment must be customized to process Japanese requests. A system programmer customizes your environment when Open ServerConnect is installed. Open ServerConnect loads the customization module when TDINIT is called.
Customization information includes client login information from the client login packet that TRS forwards to the mainframe along with the client request. Among the client information contained in the login packet is the name of the client character set. See "The Login Packet" for details.
The following options are set during customization:
The use of this option depends on whether DBCS is used:
If the native language is Japanese, TDINIT loads the JCM.
An Open ServerConnect program can retrieve customization information with the function TDGETUSR.
Once the JCM is loaded, it gets control whenever an Open ServerConnect program receives a client request containing TDSCHAR or TDSVARYCHAR data. TDSCHAR and TDSVARYCHAR are the datatypes used to represent Japanese characters in workstation character sets. The JCM converts the workstation Japanese characters to the character set used on the mainframe. Once mainframe processing is completed, the JCM converts results back to the original workstation character set before returning them to the client.
The JCM uses translate tables to convert workstation characters to mainframe characters.
When an Open ServerConnect program receives a client request in Japanese that contains character datatypes, it gives control to the JCM. The JCM looks up the client character set in the translate tables.
Different brands of workstations use different character sets to represent double-byte characters. See "Character Sets" to learn which single-byte and double-byte character sets are supported at the workstation and at the mainframe.
Each character set used to handle Japanese characters has its own way of representing kanji or hankaku katakana characters and specifying lengths for Japanese character strings. While most of the differences are handled by the JCM, you need to understand a few of these differences in order to specify field lengths correctly. These differences are discussed in this section.
See Table 2-14: Length requirements in Japanese character sets and Table 2-15: Length-settings in Japanese character set conversions for information on character set differences in tabular form.
The following datatypes can be used with Japanese characters at the workstation:
The following datatypes can be used with Japanese characters at the mainframe:
Graphic datatypes are used with double-byte characters only.
Kanji characters always occupy 2 bytes.
Hankaku katakana characters are always represented as single-byte character-type data with datatypes of TDSCHAR or TDSVARYCHAR.
Kanji characters are represented as character-type data at the workstation, and as either character-type or graphic-type data at the mainframe. The length of a Japanese character string depends on which workstation is being used and whether the datatype is graphic or character.
Some character sets use a special indicator or code in character-type strings to announce that the following series of characters are double-byte characters. With kanji, this indicator is called a Shift Out (SO) code. An SO code marks the beginning of a double-byte kanji string. The end of the kanji string is marked by a Shift In (SI) code.
When setting field lengths for Japanese character strings, you must include room for these SO/SI codes.
When sending data from a mainframe to a workstation, you can replace SO/SI codes with blanks by calling the Gateway-Library function TDSETSOI before receiving or sending data.
Graphic datatypes do not use SO/SI codes.
WARNING! When receiving data from a workstation character set that does not use SO/SI codes, IBM_Kanji always inserts the SO/SI codes at the beginning and end of double-byte character strings. If the field length specification does not take this into account, and the length is just long enough for the data itself, some of the data is lost.
If a field contains mixed single-byte and double-byte data in more than one kanji string, an SO/SI pair exists for each kanji string.
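The SO/SI rules above reduce to simple arithmetic. The following Python sketch is illustrative only and is not part of Gateway-Library: the function name and the `(kind, count)` description of a field are hypothetical. It assumes 1 byte per single-byte character, 2 bytes per kanji, and one SO/SI pair for each contiguous kanji string.

```python
from itertools import groupby

SO_SI_OVERHEAD = 2  # one Shift Out byte plus one Shift In byte per kanji string


def ibm_kanji_char_length(runs):
    """Return the byte length of a character-type field that uses SO/SI codes.

    `runs` is a hypothetical description of the field contents: a list of
    (kind, count) pairs, where kind is "sbcs" (1 byte per character) or
    "dbcs" (2 bytes per kanji) and count is the number of characters.
    """
    total = 0
    # Merge adjacent runs of the same kind so that each contiguous kanji
    # string is charged exactly one SO/SI pair.
    for kind, group in groupby(runs, key=lambda run: run[0]):
        count = sum(n for _, n in group)
        if kind == "dbcs":
            total += 2 * count + SO_SI_OVERHEAD
        elif kind == "sbcs":
            total += count
        else:
            raise ValueError(f"unknown run kind: {kind!r}")
    return total


pure_kanji = [("dbcs", 4)]
mixed = [("sbcs", 3), ("dbcs", 2), ("sbcs", 1), ("dbcs", 1)]

print(ibm_kanji_char_length(pure_kanji))  # 4 kanji * 2 bytes + SO + SI
print(ibm_kanji_char_length(mixed))       # two kanji strings, so two SO/SI pairs
```

A field declared with only 8 bytes for the four-kanji case would be 2 bytes short once the SO/SI codes are inserted, which is the data-loss situation described in the warning.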
At the mainframe, the length of graphic-type strings is counted in double-byte (16-bit) characters. Thus, a string of 10 kanji characters has a length of 10.
At the workstation, the length of kanji character strings is counted in bytes. Thus, a string of 10 kanji characters has a length of 20.
The length of a hankaku katakana string is always represented in bytes, at both the workstation and the mainframe. A hankaku katakana character occupies one byte, except in eucjis.
The eucjis hankaku katakana character set uses an indicator (SS2) in character-type strings to announce that the next byte is occupied by a hankaku katakana. The SS2 indicator occupies one byte, and the hankaku katakana itself occupies one byte. As a result, the total length of each eucjis hankaku katakana character is two bytes.
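The workstation-side byte counts can be checked on any machine with Python. This sketch uses Python's standard `shift_jis` and `euc_jp` codecs as stand-ins for the workstation character sets; that mapping is an assumption made for illustration, and the helper names are hypothetical. The graphic length is simply the character count, as described for mainframe graphic-type strings.

```python
def workstation_length(text, codec):
    """Byte length of `text` as character-type data in a workstation character set."""
    return len(text.encode(codec))


def graphic_length(text):
    """Length of a graphic-type string: one unit per double-byte character."""
    return len(text)


kanji = "漢字変換"    # four kanji
hankaku = "ｱｲｳｴ"      # four hankaku (half-width) katakana

for name, codec in (("Shift-JIS", "shift_jis"), ("EUC-JIS", "euc_jp")):
    print(name,
          "kanji bytes:", workstation_length(kanji, codec),
          "hankaku bytes:", workstation_length(hankaku, codec))

# In EUC, each hankaku katakana is preceded by the single-shift byte SS2 (0x8E).
print(hex("ｱ".encode("euc_jp")[0]))

print("graphic length:", graphic_length(kanji))
```

Four kanji take 8 bytes in both codecs, while four hankaku katakana take 4 bytes in Shift-JIS and 8 bytes in EUC because of the SS2 prefix.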
The following datatypes are used with Japanese characters:
| Datatype | Used With | Uses SO/SI | Length Measures |
|---|---|---|---|
| TDSCHAR | DBCS and SBCS. | IBM Kanji: | For all character sets: |
| TDSGRAPHIC | DBCS only. | No. | Number of characters. |
When converting from a workstation Japanese character set to a mainframe Japanese character set, you frequently need to adjust the length. The adjustment depends on which character sets, datatypes, and language are being used.
The following table describes how Japanese characters are represented in supported character sets, and how their lengths are affected.
| Character Set | SBCS or DBCS | Datatype | Length Considerations | Example |
|---|---|---|---|---|
| EUC-JIS | DBCS (hankaku katakana) | character | Each 1-byte hankaku katakana character is preceded by a 1-byte SS2 indicator. As a result, each eucjis hankaku katakana character has a length of 2: the SS2 indicator and the hankaku katakana itself. | A string of 4 hankaku katakana occupies 8 bytes and has a length of 8. |
| EUC-JIS | DBCS | character | Each kanji character is 2 bytes long and has a length of 2. | A string of 4 kanji occupies 8 bytes and has a length of 8. |
| Shift-JIS | SBCS | character | Each hankaku katakana character is 1 byte long and has a length of 1. | A string of 4 hankaku katakana occupies 4 bytes and has a length of 4. |
| Shift-JIS | DBCS | character | Each kanji character is 2 bytes long and has a length of 2. | A string of 4 kanji occupies 8 bytes and has a length of 8. |
| IBM Kanji | DBCS | character | Each kanji character is 2 bytes long and has a length of 2. Each kanji string is also enclosed in 1-byte SO and SI codes, which add 2 to the length. | A string of 4 kanji occupies 10 bytes and has a length of 10. |
| IBM Kanji | DBCS | graphic | Each kanji character is a double-byte character and has a length of 1. | A string of 4 kanji occupies 8 bytes and has a length of 4. |
| IBM Kanji | SBCS | character | Each hankaku katakana character is 1 byte long and has a length of 1. | A string of 4 hankaku katakana occupies 4 bytes and has a length of 4. |
Table 2-15 illustrates length adjustments required for some workstation-to-mainframe and mainframe-to-workstation Japanese character set conversions. The lengths shown correspond to the four-character example strings in Table 2-14.
| Source Character Set | Source Datatype | Source Length | Target Character Set | Target Datatype | Target Length |
|---|---|---|---|---|---|
| EUC-JIS hankaku katakana | character | 8 | IBM Kanji hankaku katakana | character | 4 |
| EUC-JIS kanji | character | 8 | IBM Kanji | character | 10 |
| EUC-JIS kanji | character | 8 | IBM Kanji | graphic | 4 |
| Shift-JIS hankaku katakana | character | 4 | IBM Kanji hankaku katakana | character | 4 |
| Shift-JIS kanji | character | 8 | IBM Kanji | character | 10 |
| Shift-JIS kanji | character | 8 | IBM Kanji | graphic | 4 |
| IBM Kanji hankaku katakana | character | 4 | EUC-JIS hankaku katakana | character | 8 |
| IBM Kanji hankaku katakana | character | 4 | Shift-JIS hankaku katakana | character | 4 |
| IBM Kanji kanji | character | 10 | EUC-JIS kanji | character | 8 |
| IBM Kanji kanji | character | 10 | Shift-JIS kanji | character | 8 |
| IBM Kanji kanji | graphic | 4 | EUC-JIS kanji | character | 8 |
| IBM Kanji kanji | graphic | 4 | Shift-JIS kanji | character | 8 |
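The length values in Table 2-15 can be derived from a few rules. The following Python sketch is one way to express them; the function name and the labels `"eucjis"`, `"sjis"`, and `"ibm_kanji"` are hypothetical identifiers chosen for this example, and the calculation assumes a single contiguous string (so at most one SO/SI pair).

```python
def field_length(charset, script, datatype, n_chars):
    """Length to declare for one contiguous string of n_chars characters.

    charset:  "eucjis", "sjis", or "ibm_kanji" (labels used only in this sketch)
    script:   "kanji" or "hankaku" (hankaku katakana)
    datatype: "character" or "graphic"
    """
    if datatype == "graphic":
        # Graphic data is double-byte only and is counted in characters.
        if charset != "ibm_kanji" or script != "kanji":
            raise ValueError("graphic data applies to mainframe kanji only")
        return n_chars

    if script == "kanji":
        length = 2 * n_chars                 # 2 bytes per kanji
        if charset == "ibm_kanji":
            length += 2                      # SO and SI codes
        return length

    if script == "hankaku":
        # eucjis prefixes each hankaku katakana with a 1-byte SS2 indicator.
        return 2 * n_chars if charset == "eucjis" else n_chars

    raise ValueError(f"unknown script: {script!r}")


# Source and target lengths for four kanji sent from a Shift-JIS
# workstation to a mainframe character-type field:
source = field_length("sjis", "kanji", "character", 4)
target = field_length("ibm_kanji", "kanji", "character", 4)
print(source, target, target - source)
```

The difference between the two results is the adjustment a program must allow for when declaring the target field.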
Because differences among Japanese character sets can make a string longer or shorter after conversion, Gateway-Library includes the TDSETSOI function, which specifies whether SO/SI indicators are replaced with blanks (padded) or stripped.
When converting from a character set that uses SO/SI indicators to one that does not (for example, converting CHAR data from IBM Kanji to Shift-JIS kanji), you can use TDSETSOI to specify whether the SO/SI indicators are stripped or whether they are replaced with embedded blanks. When replaced with embedded blanks, the length does not change. When stripped, the length is reduced by two bytes for each kanji string.
If no strip option is set, the JCM automatically strips SO/SI indicators.
When TDSETSOI replaces SO/SI indicators with blanks, the blanks are positioned at the end of the field. For example, in an IBM Kanji CHAR field that contains four kanji, the first byte contains the SO indicator, and the tenth byte contains the SI indicator. After conversion to Shift-JIS kanji, the first eight bytes are occupied by kanji, and the blanks occupy bytes nine and ten.
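The effect of the two options on field length can be modeled outside the mainframe. The Python sketch below does not call TDSETSOI and performs no character translation; it treats the kanji bytes as opaque placeholders and only shows what happens to the SO/SI positions. The function name is hypothetical, and the byte values 0x0E (SO), 0x0F (SI), and 0x20 (blank in the target set) are assumptions based on the conventional codes.

```python
SO = 0x0E     # Shift Out: start of a double-byte string (assumed value)
SI = 0x0F     # Shift In: end of a double-byte string (assumed value)
BLANK = 0x20  # blank in the target workstation character set (assumed value)


def drop_so_si(field, pad_with_blanks):
    """Remove SO/SI codes from `field`; optionally pad the end with blanks.

    With pad_with_blanks=True the result keeps the original length, and the
    blanks sit at the end of the field. With False, the result is shorter
    by two bytes for each kanji string.
    """
    body = bytearray()
    removed = 0
    in_dbcs = False
    i = 0
    while i < len(field):
        byte = field[i]
        if not in_dbcs and byte == SO:
            in_dbcs, removed, i = True, removed + 1, i + 1
        elif in_dbcs and byte == SI:
            in_dbcs, removed, i = False, removed + 1, i + 1
        elif in_dbcs:
            body += field[i:i + 2]   # copy one double-byte character
            i += 2
        else:
            body.append(byte)        # copy one single-byte character
            i += 1
    if pad_with_blanks:
        body += bytes([BLANK]) * removed
    return bytes(body)


# Four placeholder kanji (K1..K4) in a 10-byte field with SO first and SI last.
source = bytes([SO]) + b"K1K2K3K4" + bytes([SI])

stripped = drop_so_si(source, pad_with_blanks=False)
padded = drop_so_si(source, pad_with_blanks=True)
print(len(source), len(stripped), len(padded))
print(padded)
```

In the padded result the eight kanji bytes come first and the two blanks occupy bytes nine and ten, matching the example in the text.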
By judicious use of TDSETSOI, you can minimize the length changes and calculations needed in Open ServerConnect programs. See "TDSETSOI" for details.
See "TDGETSOI" for information about how to query the SO/SI processing settings for a column or parameter.