Chinese Computing Newsletter; October, 1999, v.1
================================================

Included in this issue:

* MS Word 2000 Chinese Support
* Perl 5.6 and Unicode
* Java 1.3 Includes Non-OS Input Method Support
* GNU Emacs 20.4 for Windows


* MS Word 2000 Chinese Support

Word 97 introduced a whole new level of multilingual support by using
Unicode as its encoding. With Office 2000, Microsoft Word improves its
multilingual support further by automatically reading and writing many
different encodings. The Open and Save As file dialogs include an
option for "Encoded Text". If selected, Word will give you the option
to save the document as text in one of several encodings, including GB,
HZ, EUC, Big5, UTF-8 and Unicode. It can also read files in any of
these encodings and convert them to Unicode.

Word 2000 also supports Simplified/Traditional Chinese conversion.
From the main menu, select Tools->Language->Chinese Translation. You
will have the option to select the conversion direction. Many East
Asian language formatting options are also available from
Format->Asian Layout.

Microsoft's CJK Input Methods, which formerly worked only with Internet
Explorer and Outlook, now also work with Office 2000. After
installation, the input language can be switched by pressing Alt+Shift.
Unfortunately, Microsoft only includes a BoPoMoFo method to input
traditional Chinese. Simplified Chinese uses pinyin. Character
selection is context-sensitive, i.e. the input method will try to
determine the proper character for a pinyin sound based on the word it
most likely belongs to.

Related Links:

Microsoft CJK Input Methods
http://www.microsoft.com/windows/ie/features/ime.asp


* Perl 5.6 and Unicode

Perl 5.6 will include the option of using UTF-8 as its internal
encoding. After including the "use utf8;" pragma in your Perl program,
string operations that once worked at the byte level will work at the
character level. Regular expressions, substr, length, etc.
will all work with characters. This makes working with Chinese as easy
as working with English. No word yet on whether support will be
included for reading and writing files in different encodings. Example
programs will be included in future newsletters.

Further Information:

In-depth Information on using Unicode in Perl 5.6
http://users.erols.com/eepeter/chinesecomputing/programming/perl-unicode.txt

Brief Discussion of Perl Unicode Support
http://www.perl.com/pub/1999/06/perl5-6.html#unicode

Description of using XML with Perl, the main motivation for adding
Unicode support to Perl
http://www.perl.com/pub/1998/11/xml.html


* Java 1.3 Includes Non-OS Input Method Support

Previous versions of the Java Foundation Classes included support for
communicating with the input methods of the operating system, but did
not have a standard way of including pure Java input methods. With the
introduction of the Input Method Engine SPI in Java 1.3, the need for
platform-independent input methods has finally been filled. With this
new specification, programmers can create their own input methods and
package them into a JAR file. Placing this JAR file in the lib/ext
subdirectory of the main Java directory will make the input method
available to all Java programs using that JVM. This will be useful to
programmers who want to write Chinese programs to be used on
non-Chinese platforms.

Further Information:

JavaSoft's Page on IME SPI
http://java.sun.com/products/jdk/1.3/docs/guide/imf/index.html

JavaSoft's Tutorial on Creating New Input Methods
http://java.sun.com/products/jdk/1.3/docs/guide/imf/spi-tutorial.html


* GNU Emacs 20.4 for Windows

Programmers looking for a free, powerful, multilingual text editor on
Windows 95/98/NT will want to check out the latest release of GNU Emacs
for Windows. With this release, the multilingual abilities of Mule have
been folded back into Emacs.
Emacs now supports the display of multiple languages at once, will
display Chinese directly without the need for helper programs if you
have Chinese fonts on your system, and has free input methods for
Chinese, including pinyin, zhuyin, English, 4corner, and Cantonese. In
addition, Emacs supports Japanese, Korean, many European languages, and
more. Add to this Emacs' ability to act as a development environment
for many different programming languages, including Perl, Java, C, and
Tcl/Tk, and it may be the only text editor you'll need.

Further Information:

GNU Emacs for Windows Page
http://www.gnu.org/software/emacs/windows/ntemacs.html

UTF-8 Support for Emacs
http://www.cs.ust.hk/~otfried/Mule

--------------------------------------------------------------------

Please send suggestions for future Chinese Computing Newsletter items
to erik@chinesecomputing.com.

Past issues of the newsletter can be accessed through the
www.chinesecomputing.com site.

To remove yourself from the list, send an e-mail to
newsletter@chinesecomputing.com. On the subject line write
"remove your-email-address".