ALPVMSI18N04_073 - I18N facilities Release Notes

© 2003 Hewlett-Packard Company.

Microsoft and Visual C++ are trademarks of Microsoft Corporation. UNIX is a trademark of The Open Group in the United States and other countries. All other product names mentioned herein may be trademarks of their respective companies.

Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

HP shall not be liable for technical or editorial errors or omissions contained herein. The information in this document is provided as is without warranty of any kind and is subject to change without notice. The warranties for HP products are set forth in the express limited warranty statements accompanying such products. Nothing herein should be construed as constituting an additional warranty.

Introduction

This document contains the release notes for the OpenVMS Internationalization (I18N) ECO patch kit on OpenVMS. This kit allows application developers to create international software. Applications obtain information about a language and a culture by reading it from 'locale' files. This kit contains all of the supported locale files. The kit also contains a set of Unicode codeset converters, which allow conversion between any supported codeset, including DEC Multinational Character Set and Microsoft Code Page 437, and any flavor of Unicode encoding: UCS-2, UCS-4 or UTF-8.

For more information, please see the C Run-Time Library Utilities Reference Manual in the OpenVMS documentation set.
Kit Name:

   ALPVMSI18N04_073

Kits superseded by this kit:

   None

Kit Dependencies:

   The following remedial kit(s) must be installed BEFORE installation of this, or any required kit:

      On OpenVMS V7.2-2 and V7.3, this kit requires a Compaq C Runtime kit installed which supports the GB18030 locale.

   In order to receive all the corrections listed in this kit, the following remedial kits should also be installed:

      None.

Kit Description:

Version(s) of OpenVMS to which this kit may be applied:

   OpenVMS Alpha

Files provided:

   The user selects which files to install, based on geography. Only files new in this kit will be listed.

   o SYS$I18N_LOCALE:*.*
     Locales for various locations, based on user selected options.

   o SYS$I18N_ICONV:*.*
     Converters for various character sets, based on user selected options.

   o New to this kit:

     SYS$I18N_LOCALE:
        JA_JP_DECKANJI2000.LOCALE        (DEC Kanji2000 Locale)

     SYS$I18N_ICONV:
        DECKANJI2000_EUCJP.ICONV         (DEC Kanji2000 -> Japanese EUC conversion)
        DECKANJI2000_ISO2022JP.ICONV     (DEC Kanji2000 -> ISO2022JP)
        DECKANJI2000_SDECKANJI.ICONV     (DEC Kanji2000 -> Super DEC Kanji)
        DECKANJI2000_SJIS.ICONV          (DEC Kanji2000 -> Shift JIS)
        EUCJP_DECKANJI2000.ICONV         (Japanese EUC -> DEC Kanji2000)
        SDECKANJI_DECKANJI2000.ICONV     (Super DEC Kanji -> DEC Kanji2000)
        SJIS_DECKANJI2000.ICONV          (Shift JIS -> DEC Kanji 2000)
        ISO2022JP_DECKANJI2000.ICONV     (ISO2022JP -> DEC Kanji 2000)

     SYS$I18N_ICONV, used within DECwindows Motif/Japanese:
        UCS-4_JISX-UDC-GLGR.ICONV
        UCS-4_JISX0201-KANA-GR.ICONV
        UCS-4_JISX0208-GL.ICONV
        UCS-4_JISX0208-GR.ICONV
        UCS-4_JISX0212-GL.ICONV
        UCS-4_JISX0212-GR.ICONV
        UTF-8_JISX-UDC-GLGR.ICONV
        UTF-8_JISX0201-KANA-GR.ICONV
        UTF-8_JISX0208-GL.ICONV
        UTF-8_JISX0208-GR.ICONV
        UTF-8_JISX0212-GL.ICONV
        UTF-8_JISX0212-GR.ICONV
        JISX-UDC-GLGR_UCS-4.ICONV
        JISX-UDC-GLGR_UTF-8.ICONV
        JISX0201-KANA-GR_UCS-4.ICONV
        JISX0201-KANA-GR_UTF-8.ICONV
        JISX0208-GL_UCS-4.ICONV
        JISX0208-GL_UTF-8.ICONV
        JISX0208-GR_UCS-4.ICONV
        JISX0208-GR_UTF-8.ICONV
        JISX0212-GL_UCS-4.ICONV
        JISX0212-GL_UTF-8.ICONV
        JISX0212-GR_UCS-4.ICONV
        JISX0212-GR_UTF-8.ICONV

Problems addressed in kit ALPVMSI18N04_073

   None

Problems addressed in previous ALPVMSI18N kits

VMSI18N function is usually delivered with the base OpenVMS operating system. The information is retained for convenience of customers.

o GB18030 - A Chinese character set

China passed a new Chinese character set standard, GB18030-2000, on March 17, 2000. GB18030-2000 is based on the existing GB2312-1980 character set and the GBK character set (a.k.a. Microsoft Code page 936) extended to 4-byte code points. (Note that GBK is not supported by OpenVMS yet.) That is, GB18030-2000 is a superset of the GBK and the GB2312-1980 character sets. This standard has more than enough code points to adopt all the characters defined in ISO/IEC 10646. Essentially, GB18030-2000 covers the same set of characters covered by the Unicode Version 3.0 and ISO/IEC 10646-2000 standards.

ALPVMSI18N01_073 brings the GB18030 support to OpenVMS V7.3.
ALPVMSI18N04_072 brings the GB18030 support to OpenVMS V7.2-2.

GB18030 components and codeset converters from the current kit are aimed at applications based on the X/Open internationalization model, according to which character classification, case conversion, and cultural data are retrieved at run-time from the current program's locale using the C Run-Time Library functions.

GB18030 support includes:

- zh_CN.GB18030 locale, a Chinese locale for the People's Republic of China (uses the GB18030 codeset, which extends GBK by means of 4-byte encoding).

- UTF8-30 locale, Unicode V3.0.

- Codeset converters. The following codeset converter pairs are available for converting Simplified Chinese characters between GB18030 and UCS formats.

  . UCS-2_GB18030, GB18030_UCS-2   Converting from and to UCS-2 format
  . UCS-4_GB18030, GB18030_UCS-4   Converting from and to UCS-4 format
  . UTF-8_GB18030, GB18030_UTF-8   Converting from and to UTF-8 format

o Euro I18N components

For OpenVMS V7.2, Euro I18N Components are shipped with the VMSI18N072 kit. The ALPVMSI18N01_071 kit brings the I18N Euro support to OpenVMS V7.1.

Euro I18N Components from the current kit are aimed at applications based on the X/Open internationalization model, according to which currency and international currency symbols are retrieved at run-time from the current program's locale using the C Run-Time Library functions.

The primary objective of this kit is to ease the conversion to the Euro currency sign for those applications which are currently using locales based on the ISO8859-1 (Latin-1) character set. This is because ISO8859-1 is the character set used by most (if not all) countries affected by Euro.

ISO8859-1-Euro Character Set

There are two standard character sets which include the Euro sign:

o ISO8859-15 (Latin-9)
o Unicode.

ISO8859-15 is similar to ISO8859-1, but not identical.
The problem with the ISO8859-15 character set is that it redefines not only the 0xa4 codepoint, which is defined as CURRENCY SIGN in ISO8859-1 and as EURO SIGN in ISO8859-15, but also a number of other characters. If an ISO8859-1-based application switches to ISO8859-15, it can result in different behavior (output, string collation, etc.) not only for the currency symbol, which is expected, but also for those characters which are defined differently in ISO8859-1 and ISO8859-15.

To address this compatibility problem, HP introduced a proprietary character set, ISO8859-1-Euro, which is the same as ISO8859-1 except for the 0xa4 codepoint, which is defined as EURO SIGN in ISO8859-1-Euro, the same definition as that in ISO8859-15. Since ISO8859-1-Euro and ISO8859-1 differ only by the 0xa4 codepoint, for an ISO8859-1-based application a switch to the ISO8859-1-Euro character set means a change in the currency symbol, and nothing else.

Euro Locales

Applications following the X/Open internationalization model select character sets through locales. Each locale is based on a particular character set. By setting the current program's locale via a call to the setlocale() C Run-Time Library function, the application sets its execution character set to the one the locale is based on.
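The relationship between ISO8859-1, ISO8859-1-Euro and ISO8859-15 described above can be illustrated with a small C sketch. It is not taken from the kit: the function names are hypothetical, the ISO8859-1-Euro mapping follows the definition given in these notes, and the ISO8859-15 differences are taken from the published ISO8859-15 code table rather than from this document.

```c
#include <stdint.h>

/* Hypothetical helper: map one ISO8859-1-Euro byte to a Unicode code point.
   Per these notes it is ISO8859-1 with 0xA4 redefined as EURO SIGN. */
uint32_t latin1_euro_to_ucs(unsigned char c)
{
    return (c == 0xA4) ? 0x20ACu : (uint32_t)c;
}

/* Hypothetical helper: map one ISO8859-15 byte to a Unicode code point.
   Eight positions differ from ISO8859-1 (assumption: standard code table). */
uint32_t iso8859_15_to_ucs(unsigned char c)
{
    switch (c) {
    case 0xA4: return 0x20ACu; /* EURO SIGN */
    case 0xA6: return 0x0160u; /* LATIN CAPITAL LETTER S WITH CARON */
    case 0xA8: return 0x0161u; /* LATIN SMALL LETTER S WITH CARON */
    case 0xB4: return 0x017Du; /* LATIN CAPITAL LETTER Z WITH CARON */
    case 0xB8: return 0x017Eu; /* LATIN SMALL LETTER Z WITH CARON */
    case 0xBC: return 0x0152u; /* LATIN CAPITAL LIGATURE OE */
    case 0xBD: return 0x0153u; /* LATIN SMALL LIGATURE OE */
    case 0xBE: return 0x0178u; /* LATIN CAPITAL LETTER Y WITH DIAERESIS */
    default:   return (uint32_t)c; /* same as ISO8859-1 */
    }
}

/* Count the byte values whose mapping differs from ISO8859-1,
   where every byte maps to the code point of the same value. */
int count_differences(uint32_t (*map)(unsigned char))
{
    int n = 0;
    for (int c = 0; c < 256; c++) {
        if (map((unsigned char)c) != (uint32_t)c)
            n++;
    }
    return n;
}
```

Calling count_differences() on the two mappings shows why the proprietary set is the less disruptive choice for an ISO8859-1-based application: one codepoint changes under ISO8859-1-Euro, while eight change under ISO8859-15.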
Beginning with OpenVMS V6.2, OpenVMS provides the following set of locales based on the ISO8859-1 character set:

   DA_DK_ISO8859-1     (Danish)
   DE_CH_ISO8859-1     (Swiss German)
   DE_DE_ISO8859-1     (German)
   EN_GB_ISO8859-1     (Great Britain)
   EN_US_ISO8859-1     (U.S.A)
   ES_ES_ISO8859-1     (Spanish)
   FI_FI_ISO8859-1     (Finnish)
   FR_BE_ISO8859-1     (Belgian French)
   FR_CA_ISO8859-1     (Canadian French)
   FR_CH_ISO8859-1     (Swiss French)
   FR_FR_ISO8859-1     (French)
   IS_IS_ISO8859-1     (Icelandic)
   IT_IT_ISO8859-1     (Italian)
   NL_BE_ISO8859-1     (Belgian Dutch)
   NL_NL_ISO8859-1     (Dutch)
   NO_NO_ISO8859-1     (Norwegian)
   PT_PT_ISO8859-1     (Portuguese)
   SV_SE_ISO8859-1     (Swedish)

The locales above are shipped in the VMSI18N0nn kit, where nn indicates the version of OpenVMS. For example, VMSI18N071 is for OpenVMS V7.1.

What the Euro I18N Components kit adds to the OpenVMS I18N infrastructure is a "shadow" set of locales based on the ISO8859-1-Euro character set. Throughout this document, locales based on the ISO8859-1-Euro character set are referred to as Euro locales. For each ISO8859-1-based locale, the present kit adds a corresponding Euro locale.

The difference between a "non-Euro" locale and its Euro counterpart is not only the character set the locale is based on; the locales also provide different definitions for the currency_symbol and int_curr_symbol items from the LC_MONETARY category. While a "non-Euro" locale defines the currency and international currency symbols in a country-specific manner, all Euro locales define the currency symbol to be the Euro sign and the international currency symbol to be the string "EUR".
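The two LC_MONETARY items named above can be read from a program with the standard localeconv() function. The following is a minimal sketch, not code from the kit: the helper name monetary_symbols() and its output layout are hypothetical, and it assumes a C library that provides snprintf().

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: write "<int_curr_symbol>|<currency_symbol>" for the
   named locale into out. Returns 0 on success, -1 if the locale cannot
   be selected. An empty locale_name ("") uses the environment settings. */
int monetary_symbols(const char *locale_name, char *out, size_t outlen)
{
    const struct lconv *lc;

    if (setlocale(LC_MONETARY, locale_name) == NULL)
        return -1;

    /* localeconv() reports the LC_MONETARY items of the current locale */
    lc = localeconv();
    snprintf(out, outlen, "%s|%s", lc->int_curr_symbol, lc->currency_symbol);
    return 0;
}
```

Passing a Euro locale name such as EN_GB_ISO8859-1-EURO would be expected to report the Euro definitions described above, while the standard "C" locale reports two empty strings.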
For example, the EN_GB_ISO8859-1 locale contains the following definitions:

   ##############
   LC_MONETARY
   ##############

   int_curr_symbol   "GBP"
   currency_symbol   "£"

while its EN_GB_ISO8859-1-EURO counterpart has the following:

   ##############
   LC_MONETARY
   ##############

   int_curr_symbol   "EUR"
   currency_symbol   "€"

The only thing an application using the strfmon() C Run-Time Library function for monetary values needs to do in order to switch to Euro is to set the Euro locale. This is shown in the following example:

   TEST.C
   ======
   #include <locale.h>
   #include <monetary.h>
   #include <stdio.h>

   int main(void)
   {
       char buffer[80];
       ssize_t res;
       double amount = 1.0;  /* one Euro */

       /*
       ** set locale according to the setting
       ** of the environment variables
       */
       if ( setlocale(LC_ALL, "") == NULL )
           perror("cannot set locale");

       /*
       ** call strfmon using the %i format specifier
       ** to get international currency symbol
       */
       res = strfmon(buffer, sizeof(buffer), "%i", amount);
       if ( res == -1 )
           perror("strfmon failed");
       else
           printf("output from strfmon: '%s'\n", buffer);

       return 0;
   }

   $ define/nolog lang EN_GB_ISO8859-1
   $ r test
   output from strfmon: '+GBP 1.00'
   $ define/nolog lang EN_GB_ISO8859-1-EURO
   $ r test
   output from strfmon: '+EUR 1.00'
   $

Note that strfmon() is not the only function which retrieves a currency symbol from the current program's locale; nl_langinfo(CRNCYSTR) can also be used to get a locale-specific currency symbol.

The LOCALE SHOW utility can be used to display information about a particular locale.
For example:

   $ define/nolog lang FR_FR_ISO8859-1
   $ locale show value int_curr_symbol
   "FRF"
   $ define/nolog lang FR_FR_ISO8859-1-EURO
   $ locale show value int_curr_symbol
   "EUR"
   $

The ALPVMSI18N01_071 kit provides a Euro locale for each ISO8859-1-based locale:

   DA_DK_ISO8859-1-EURO     (Danish Euro)
   DE_CH_ISO8859-1-EURO     (Swiss German Euro)
   DE_DE_ISO8859-1-EURO     (German Euro)
   EN_GB_ISO8859-1-EURO     (Great Britain Euro)
   EN_US_ISO8859-1-EURO     (U.S.A Euro)
   ES_ES_ISO8859-1-EURO     (Spanish Euro)
   FI_FI_ISO8859-1-EURO     (Finnish Euro)
   FR_BE_ISO8859-1-EURO     (Belgian French Euro)
   FR_CA_ISO8859-1-EURO     (Canadian French Euro)
   FR_CH_ISO8859-1-EURO     (Swiss French Euro)
   FR_FR_ISO8859-1-EURO     (French Euro)
   IS_IS_ISO8859-1-EURO     (Icelandic Euro)
   IT_IT_ISO8859-1-EURO     (Italian Euro)
   NL_BE_ISO8859-1-EURO     (Belgian Dutch Euro)
   NL_NL_ISO8859-1-EURO     (Dutch Euro)
   NO_NO_ISO8859-1-EURO     (Norwegian Euro)
   PT_PT_ISO8859-1-EURO     (Portuguese Euro)
   SV_SE_ISO8859-1-EURO     (Swedish Euro)

The ALPVMSI18N01_071 kit provides binary locale files along with locale source files and an ISO8859-1-EURO.CMAP character set description file (a.k.a. cmap file). If a Euro locale needs to be customized, the locale source file can be modified and a new binary locale file can be created using the LOCALE COMPILE utility. See the DEC C Run-Time Library Utilities Reference Manual for the format of the locale source file and the description of the LOCALE COMPILE utility.

In the future, HP is planning to add Unicode-based Euro locales and locales based on the ISO8859-15 character set.

New Codeset Converters

The ALPVMSI18N01_071 kit provides Unicode codeset converters for both the proprietary ISO8859-1-EURO character set and the standard ISO8859-15 character set. These codeset converters facilitate data exchange between OpenVMS systems and systems using Unicode-based and ISO8859-15-based solutions for Euro.
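As a rough illustration of how such a converter might be driven from a program, the sketch below wraps the standard iconv_open(), iconv() and iconv_close() calls in one helper. The helper name convert_buffer() is hypothetical, and the codeset names passed to it are an assumption: their accepted spelling depends on the platform and is not specified by this sketch.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert inlen bytes at in from codeset `from` to
   codeset `to`, writing at most outlen bytes to out.
   Returns the number of bytes written, or -1 on any failure. */
long convert_buffer(const char *from, const char *to,
                    const char *in, size_t inlen,
                    char *out, size_t outlen)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv() takes a non-const input pointer */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outlen;
    size_t rc;

    /* Note the argument order: target codeset first, source second */
    cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    return (long)(outlen - outleft);
}
```

Converting the single byte 0xA4 from ISO8859-15 to UTF-8 should produce the three-byte UTF-8 encoding of the Euro sign (0xE2 0x82 0xAC), which makes a convenient check that the right converter was selected.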
The codeset converters convert data coded in the ISO8859-1-EURO and ISO8859-15 character sets to/from any flavor of Unicode encoding supported by OpenVMS, as follows:

   ICONV files and corresponding codeset conversions
   -------------------------------------------------

   ICONV file                      codeset conversion
   ---------------                 -----------------------
   ISO8859-1-EURO_UCS-2.ICONV      from ISO8859-1-EURO to UCS-2
   ISO8859-1-EURO_UCS-4.ICONV      from ISO8859-1-EURO to UCS-4
   ISO8859-1-EURO_UTF-8.ICONV      from ISO8859-1-EURO to UTF-8

   UCS-2_ISO8859-1-EURO.ICONV      from UCS-2 to ISO8859-1-EURO
   UCS-4_ISO8859-1-EURO.ICONV      from UCS-4 to ISO8859-1-EURO
   UTF-8_ISO8859-1-EURO.ICONV      from UTF-8 to ISO8859-1-EURO

   ISO8859-15_UCS-2.ICONV          from ISO8859-15 to UCS-2
   ISO8859-15_UCS-4.ICONV          from ISO8859-15 to UCS-4
   ISO8859-15_UTF-8.ICONV          from ISO8859-15 to UTF-8

   UCS-2_ISO8859-15.ICONV          from UCS-2 to ISO8859-15
   UCS-4_ISO8859-15.ICONV          from UCS-4 to ISO8859-15
   UTF-8_ISO8859-15.ICONV          from UTF-8 to ISO8859-15

The codeset converters can be used by DEC C Run-Time Library functions from the iconv family of functions or by the ICONV CONVERT utility. See the DEC C Run-Time Library Utilities Reference Manual for the description of the ICONV CONVERT utility.

Kit Installation Rating

The following kit installation rating, based upon current CLD information, is provided to serve as a guide to which customers should apply this remedial kit. (Reference attached Disclaimer of Warranty and Limitation of Liability Statement)

INSTALLATION RATING:

   2 : To be installed by all customers using the following feature(s):

       DECwindows Motif/Japanese V1.3

Installation Instructions

Install this kit with the VMSINSTAL utility by logging into the SYSTEM account, and typing the following at the DCL prompt:

   $ @SYS$UPDATE:VMSINSTAL ALPVMSI18N04_073

The saveset location may be a tape drive, CD, or a disk directory that contains the kit saveset.

There are categories of locales that you may select to install.
You may select as many of them as you need.

   * Do you want European and US support [YES]?
   * Do you want GB18030 support [YES]?
   * Do you want Chinese support (not including GB18030) [YES]?
   * Do you want Japanese support [YES]?
   * Do you want Korean support [YES]?
   * Do you want Thai support [YES]?
   * Do you want Unicode support [YES]?

This kit also has an Installation Verification Procedure, which we recommend that you run to verify the correct installation of the kit. Instructions on running this IVP are given during the installation procedure.

No reboot is necessary after successful installation of the kit.