Chinese Computing Newsletter; January, 2000, v.3
=================================================

CONTENTS
========
* Netscape 5.0 to Include Automatic Chinese Charset Detection
* IE 5.5 to Include Vertical Text Support
* Chinese Support in Windows 2000
* Chinese Government Leaning towards Adopting Linux
* iSilk.com Chinese Web Page Reader in Beta Testing
* IBM Chinese Translation/Search Engine Now On-line
* Code Sample of the Month: Converting between GB and HZ Using Perl

ARTICLES
========

** Netscape 5.0 to Include Automatic Chinese Charset Detection

Netscape 5.0 (codenamed Mozilla) will include automatic character set
detection for Chinese, much like it currently does for Japanese. When
the automatic detection option is set, web users can surf between
simplified and traditional Chinese pages without worrying about whether
the pages display properly.

Related Links
http://www.mozilla.org/quality/intl/chardetect.html
http://www.mozilla.org/projects/seamonkey/release-notes/m10.html#chardet

** Internet Explorer 5.5 to Include Vertical Text Support

Starting with the version 5.5 beta, Internet Explorer supports vertical
text display. Setting the style attribute "writing-mode" to "tb-rl"
makes the web page or region display text from top to bottom and right
to left, with wide characters (hanzi, etc.) upright and English and
other half-width glyphs rotated 90 degrees clockwise. When the beta was
tested, the vertical display worked, but the wide East Asian glyphs were
also rotated 90 degrees clockwise. Hopefully this will be fixed by the
final version.

Related Links
http://msdn.microsoft.com/workshop/author/dhtml/reference/properties/writingmode.asp

** Chinese Support in Windows 2000

Windows 2000 is Microsoft's first fully internationalized operating
system, with one set of binaries supporting all languages. With Unicode
as the underlying character set, creating applications that use Chinese
is simple.
Users of any language version of Windows 2000 will be able to use
Chinese input methods and set the default locale to Chinese. Windows
2000 includes support for multilingual text display along with other
Chinese-specific enhancements. Among these enhancements are Chinese font
updates, additional and improved input methods, and Chinese-specific
printer drivers.

Related Links
http://www.microsoft.com/GLOBALDEV/Non%20mirror/back%20up/International.asp
http://www.microsoft.com/globaldev/FAQs/Multilang.asp
http://www.microsoft.com/globaldev/articles/multilang.asp

** Chinese Government Leaning towards Adopting Linux

Several news reports have suggested that mainland China is leaning
towards adopting Linux in government ministries. One report says a
version of Linux called "Red Flag Linux" will become the new standard.

Related Links
http://www.virtualchina.com/news/jan00/0107/010799-linux.html
http://www.computercurrents.com/newstoday/00/01/14/news8.html
http://www.cnn.com/2000/TECH/computing/01/06/china.microsoft.reut/index.html
http://news.cnet.com/news/0-1003-200-1515595.html?tag=st.ne.1002.bgif.1003-200-1515595

** iSilk.com Chinese Web Page Reader in Beta Testing

iSilk.com is currently beta testing a new service to help English
speakers read Chinese web pages. After you type in the address of the
Chinese web site you are interested in, iSilk downloads the page and
annotates each Chinese word with its pronunciation and an English gloss.
Based on tests of the beta system, though, both the word segmentation
and the English definitions still need considerable improvement.

Related Links
http://www.isilk.com

** IBM Chinese Translation/Search Engine Now On-line

IBM recently put up a page that allows you to use Chinese to search for
information on English search engines on the web. Click on their link
for Yahoo, and Yahoo appears translated into Chinese.
You can then type your query in Chinese; the IBM program translates the
query into English, runs the search, and returns the results in Chinese.
You can view the translated web page in Chinese only or with English and
Chinese together.

Related Links
http://nativesearch.alphaworks.ibm.com:3106/
http://www.mandarintools.com/translate.html

** Code Sample of the Month: Converting between GB and HZ Using Perl

# Take a given string and convert any Hz sequences in it to the
# corresponding GB sequence
sub hz2gb {
    my($hzline) = @_;
    my($gbline) = "";
    my($hzlen) = length($hzline);
    my($i, $hzval1, $hzval2, $hzval);

    for ($i = 0; $i < $hzlen; $i++) {
        if (substr($hzline, $i, 1) eq "~") {
            if (substr($hzline, $i+1, 1) eq "{") {
                # Start of a GB run: skip "~{" and convert byte pairs
                $i += 2;
                while ($i < $hzlen) {
                    if (substr($hzline, $i, 2) eq "~}") {
                        $i++;    # the for loop's $i++ then skips the "}"
                        last;
                    } elsif (substr($hzline, $i, 1) eq "\n" or
                             substr($hzline, $i, 1) eq "\r") {
                        # A line break also ends GB mode
                        $gbline .= substr($hzline, $i, 1);
                        last;
                    }
                    # Set the high bit of each byte to get the GB bytes
                    $hzval1 = vec($hzline, $i, 8) + 0x80;
                    $hzval2 = vec($hzline, $i+1, 8) + 0x80;
                    $hzval = pack("C2", $hzval1, $hzval2);
                    $gbline .= $hzval;
                    $i += 2;
                }
            } elsif (substr($hzline, $i+1, 1) eq "~") {   # ~~ becomes ~
                $gbline .= "~";
                $i++;    # skip the second "~" so it is not copied again
            } else {   # false alarm
                $gbline .= substr($hzline, $i, 1);
            }
        } else {
            $gbline .= substr($hzline, $i, 1);
        }
    }
    return $gbline;
}

# Take a string containing GB characters and convert it to the
# corresponding Hz encoded string. Adjacent GB characters are all
# placed in a single Hz escape sequence (only one "~{"). Assumes
# well-formed GB input, where high-bit bytes always come in pairs.
sub gb2hz {
    my($gbline) = @_;
    my($hzline) = "";
    my($gblen) = length($gbline);
    my($i, $gbval1, $gbval2, $gbval);

    for ($i = 0; $i < $gblen; $i++) {
        if (vec($gbline, $i, 8) > 127) {
            $hzline .= "~{";
            # Convert byte pairs until the run of GB characters ends
            while ($i < $gblen and vec($gbline, $i, 8) > 127) {
                # Clear the high bit of each byte to get the Hz bytes
                $gbval1 = vec($gbline, $i, 8) - 0x80;
                $gbval2 = vec($gbline, $i+1, 8) - 0x80;
                $gbval = pack("C2", $gbval1, $gbval2);
                $hzline .= $gbval;
                $i += 2;
            }
            $hzline .= "~}";
            $i--;    # the for loop's $i++ then lands on the next unread byte
        } else {
            if (substr($gbline, $i, 1) eq "~") {
                $hzline .= "~~";   # ~ must be escaped
            } else {
                $hzline .= substr($gbline, $i, 1);
            }
        }
    }
    return $hzline;
}

# End Code Sample

--------------------------------------------------------------------

Please send suggestions for future Chinese Computing Newsletter items to
erik@chinesecomputing.com. Past issues of the newsletter can be accessed
through the www.chinesecomputing.com site. Feel free to redistribute the
newsletter for non-commercial use.

To remove yourself from the list, send an e-mail to
newsletter@chinesecomputing.com. On the subject line write
"remove your@email-address.com".