These are experimental modules to handle various Unicode issues. This is software under construction. Not even alpha state right now. More information on Unicode can be found at http://www.unicode.org Current modules are: Unicode::String - represent strings of Unicode chars. Unicode::CharName - look up character names Some of ideas to investigate for the Unicode modules are: o Fast encoding/decoding to various 8-bit char sets. Mapping table objects perhaps? o Fast convertion to other large char sets (east-asien). I don't know anything about this. o Composition/decomposition support: $u->decomp; # will decomposite as much as possible: "å" --> "a°" $u->comp; # will composite as much as possible: "a°" --> "å" Need separate routines or a special argument to distinguish between compatibility decomposition and canonical decomposition. The last one is a subset of the first one. o General Unicode string to number convertion (based on unidata number attributes) o Case convertions (lc, uc, ucfirst) last one should use title-case o Fast lookup of Unicode attributes (unidata lookup using XS) $u->isletter, $u->isupper, $u->islower,.... why do we need them when perl does not need them for normal text?? o There might be some support for the private area (i.e. adding case convertion and char properties to chars within the area). o Unicode tr-function, sprintf-function o Unicode string comparison functions: cmp(), le, eq,... o Unicode regular expressions: m// s/// split(//,..) o Unicode filehandles (automatic convertion from UTF-7/UTF-8/8-bit char set when reading,writing to filehandles) The following are examples of use of the current modules: use Unicode::String qw(latin1 utf8); $u = utf8("this is a string\n"); print $u->ucs4; print $u->utf16; print $u->utf8; print $u->utf7; print $u->latin1; print $u->hex; print latin1("naïve\n")->utf8; use Unicode::CharName qw(uname); print uname(ord('$')), "\n"; COPYRIGHT © 1997 Gisle Aas. All rights reserved. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.