NAME
Text::Undiacritic - remove diacritics from a string
VERSION
This document describes Text::Undiacritic 0.01
SYNOPSIS
use Text::Undiacritic qw(undiacritic);
$ascii_string = undiacritic( $czech_string );
DESCRIPTION
Changes characters with diacritics into their base characters.
Characters for which Unicode provides no decomposition are also changed
into their base characters.
For example, the characters named '... WITH STROKE', such as 'LATIN SMALL
LETTER L WITH STROKE', have no decomposition. For that character the
result is 'LATIN SMALL LETTER L'.
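The two cases described above can be illustrated with a minimal sketch that uses only the listed dependencies charnames and Unicode::Normalize. It is not the module's actual code: the function name strip_diacritics_sketch is hypothetical, and deriving the base character by cutting the Unicode character name at ' WITH ' is an assumption about how such a fallback could work.

```perl
use strict;
use warnings;
use charnames ':full';
use Unicode::Normalize qw(NFD);

# Hypothetical helper for illustration; not part of Text::Undiacritic.
sub strip_diacritics_sketch {
    my ($characters) = @_;

    # Case 1: Unicode provides a decomposition. Decompose canonically,
    # then drop the combining marks (general category Mn).
    my $result = NFD($characters);
    $result =~ s/\p{Mn}+//g;

    # Case 2: no decomposition exists (e.g. '... WITH STROKE').
    # Assumption: cut the character name at ' WITH ' and look up the
    # character that carries the shortened name.
    $result =~ s{([^\x00-\x7F])}{
        my $char = $1;
        my $base = $char;
        my $name = charnames::viacode(ord $char);
        if (defined $name && $name =~ s/ WITH .*\z//) {
            my $code = charnames::vianame($name);
            $base = chr $code if defined $code;
        }
        $base;
    }ge;

    return $result;
}
```

With this sketch, the string "\x{141}\x{F3}d\x{17A}" (LATIN CAPITAL LETTER L WITH STROKE, o with acute, d, z with acute) becomes "Lodz". Characters whose names contain no ' WITH ' part are left unchanged.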
Removing diacritics is useful for matching text independent of spelling
variants.
SUBROUTINES/METHODS
undiacritic
$ascii_string = undiacritic( $characters );
Removes diacritics from $characters and returns a simplified character
string.
The input must be a character string, i.e. a sequence of Unicode code
points, not encoded bytes.
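If the text arrives as encoded bytes, it has to be decoded first. The following sketch assumes the input is UTF-8 and uses the core Encode module, which is not one of this module's listed dependencies. The variable names are illustrative only, and the call to undiacritic is wrapped in eval so that the example also runs where Text::Undiacritic is not installed.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for a three-letter word: k, u with ring above, n with caron.
my $bytes = "k\xC5\xAF\xC5\x88";

# Decode the bytes into a character string (Unicode code points).
my $characters = decode('UTF-8', $bytes);

my $byte_length = length $bytes;       # counts bytes
my $char_length = length $characters;  # counts characters

# Pass the decoded string, not the raw bytes, to undiacritic.
# $plain stays undefined if the module cannot be loaded.
my $plain = eval {
    require Text::Undiacritic;
    Text::Undiacritic::undiacritic($characters);
};
```

Here the byte string has length 5 while the decoded string has length 3, which shows why the distinction matters: the accented letters only exist as single code points after decoding.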
DIAGNOSTICS
CONFIGURATION AND ENVIRONMENT
DEPENDENCIES
* version
* charnames
* Unicode::Normalize
INCOMPATIBILITIES
BUGS AND LIMITATIONS
This module has not been tried on scripts other than Latin, so it is
unknown whether it gives useful results for them.
AUTHOR
Helmut Wollmersdorfer
LICENSE AND COPYRIGHT
Copyright (c) 2007, Helmut Wollmersdorfer. All rights reserved.
This module is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.