=head1 NAME

Mac::TEC - Interface to the MacOS Text Encoding Converter

=head1 SYNOPSIS

    use Mac::TEC;

    @encodings = Mac::TEC::GetEncodings;
    $convertedText = Mac::TEC::ConvertText($string, $fromCode, $toCode);

    @sniffers = Mac::TEC::GetSniffers;
    ($encoding, $error, $features) = Mac::TEC::SniffTextEncoding($string, $encodingI, $encodingII[, $encodingIII, ...]);

=head1 EXPORTS

Nothing

=head1 DESCRIPTION

This is a MacPerl interface to the MacOS Text Encoding Converter. The Text Encoding Converter is a system extension of Mac OS 8. To use this module you need the PPC or CFM-68K version of MacPerl 5.1.9r4 or MacJPerl 5.1.5r4J or higher, Iimori-san's TEC OSAX version 1.1.1, and the Text Encoding Converter version 1.2 installed in your System Folder.

This module is in alpha state.

=head2 Text Conversion

    @encodings = Mac::TEC::GetEncodings;

Returns a list of all available text converters. Text Encoding Converter 1.2 should return something like:

    macintosh, X-MAC-JAPANESE, X-MAC-CHINESETRAD, X-MAC-KOREAN,
    X-MAC-ARABIC, X-MAC-HEBREW, X-MAC-GREEK, X-MAC-CYRILLIC,
    X-MAC-DEVANAGARI, X-MAC-GURMUKHI, X-MAC-GUJARATI, X-MAC-THAI,
    X-MAC-CHINESESIMP, X-MAC-CENTRALEURROMAN, X-MAC-TURKISH,
    X-MAC-CROATIAN, X-MAC-ICELANDIC, X-MAC-ROMANIAN, X-MAC-UKRAINIAN,
    UNICODE-1-1, UNICODE-1-1-UTF-7, UNICODE-1-1-UTF-8, UNICODE-2-0,
    UNICODE-2-0-UTF-7, UTF-8, ISO-8859-1, ISO-8859-2, ISO-8859-5,
    ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, cp437, cp864,
    windows-1252, windows-1250, windows-1251, windows-1253,
    windows-1254, windows-1255, windows-1256, GB_2312-80,
    KS_C_5601-1987, ISO-2022-JP, ISO-2022-CN, ISO-2022-KR, EUC-JP,
    GB2312, X-EUC-TW, EUC-KR, Shift_JIS, KOI8-R, Big5, X-MAC-LATIN1,
    HZ-GB-2312

    $convertedText = Mac::TEC::ConvertText($string, $fromCode, $toCode);

Converts a string ($string) from one encoding ($fromCode) to another encoding ($toCode). You can use any of the available encodings. Of course, you won't be happy with the result if you try to convert Hebrew to Japanese: you just get a lot of "?"s.

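
The following is a minimal sketch of a conversion call, not part of the module itself. It assumes that the names returned by GetEncodings can be compared as plain strings with the names passed to ConvertText. The helper C<missing_encodings> and the sample string are hypothetical, and Mac::TEC is loaded defensively so the sketch also compiles outside MacPerl.

    use strict;

    # Hypothetical helper: return the wanted names that are absent
    # from the list of available encodings.
    sub missing_encodings {
        my ($available, @wanted) = @_;
        my %have = map { $_ => 1 } @$available;
        return grep { !$have{$_} } @wanted;
    }

    # Mac::TEC only exists under MacPerl, so load it defensively.
    if (eval { require Mac::TEC; 1 }) {
        my ($from, $to) = ('macintosh', 'ISO-8859-1');
        my @encodings = Mac::TEC::GetEncodings();
        my @missing = missing_encodings(\@encodings, $from, $to);
        die "Encoding(s) not available: @missing\n" if @missing;
        print Mac::TEC::ConvertText('Some Mac Roman text', $from, $to), "\n";
    }

Checking the names first gives a clear error message when an encoding is missing.
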
Because Apple Events are really slow, you don't want to process large files line by line. You can speed up your code by a factor of 10 or more if you concatenate several lines before you send them to ConvertText.

=head2 Code Detection

    @codedetectors = Mac::TEC::GetSniffers;

Returns a list of all available code detectors. Text Encoding Converter 1.2 should return something like:

    X-MAC-JAPANESE, X-MAC-CHINESETRAD, X-MAC-KOREAN, X-MAC-CHINESESIMP,
    GB_2312-80, KS_C_5601-1987, ISO-2022-JP, ISO-2022-CN, ISO-2022-KR,
    EUC-JP, GB2312, X-EUC-TW, EUC-KR, Shift_JIS, Big5, HZ-GB-2312

    ($encoding, $error, $features) = Mac::TEC::SniffTextEncoding($string, $encoding1, $encoding2, ...);

Tries to determine in which of the encodings ($encoding1 ... $encoding_n) the string ($string) is written. It returns the name of the encoding, the number of errors, and the number of encoding-specific characters. Errors are reported if your $string mixes different encodings (e.g. JIS and SJIS). Remember that the number of encoding-specific characters counts characters, not bytes.

The code-detection routines are only useful for Chinese, Japanese and Korean. Use them when you know a string is written in Japanese but can't figure out whether the encoding is JIS, SJIS, EUC-JP, etc.

=head1 AUTHOR

Andreas Marcel Riechert

(c) 1998 Andreas Marcel Riechert. All rights reserved. This program is free software; you can redistribute it under the same terms as Perl itself. Please see the Perl Artistic License.

=head1 SEE ALSO

=over

=item TEC OSAX Homepage

http://www.bekkoame.or.jp/~iimori/sw/TECOSAX.html (only in Japanese)

=item Apple's Technote 1102 (Text Encoding Converter 1.2)

http://devworld.apple.com/dev/technotes/tn/tn1102.html#textenc

=back

=cut
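
=head1 EXAMPLES

A minimal sketch of batching lines so that ConvertText is called once per chunk instead of once per line. The helper C<convert_in_chunks>, the chunk size of 200 lines, and the file name are assumptions made for illustration. The converter is passed in as a code reference, so the helper itself does not depend on Mac::TEC.

    use strict;

    # Hypothetical helper: join up to $chunk_size lines, convert each
    # chunk with a single call to $convert, and return the results.
    sub convert_in_chunks {
        my ($lines, $chunk_size, $convert) = @_;
        my @pending = @$lines;
        my @out;
        while (@pending) {
            my @chunk = splice(@pending, 0, $chunk_size);
            push @out, $convert->(join('', @chunk));
        }
        return @out;
    }

    # Usage under MacPerl (file name and encodings are examples only):
    #   open(IN, 'input.txt') or die "Cannot open input.txt: $!";
    #   my @lines = <IN>;
    #   close(IN);
    #   print convert_in_chunks(\@lines, 200,
    #       sub { Mac::TEC::ConvertText($_[0], 'X-MAC-JAPANESE', 'EUC-JP') });

Splitting only at line boundaries keeps each chunk made of whole lines. A suitable chunk size depends on the data and is best found by experiment.

=cut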