NAME
    Lingua::Phoneme - MySQL-based accent lookups.

SYNOPSIS
    The first time, to install the dictionary, manually create a MySQL
    database whose name is given by $Lingua::Phoneme::DATABASE (by default
    this is "accents"):

        mysqladmin create accents

    Then run the following lines of Perl to install the DB from the
    MobyPron.txt file (or use build.pl):

        use Lingua::Phoneme;
        my $o = new Lingua::Phoneme(
            USERNAME => 'myusername',
            PASSWORD => 'mypassword',
        );
        $o->build;

    You can supply a parameter to "build" that should be the directory in
    which this module is located.

    Thereafter:

        use Lingua::Phoneme;
        my $o = new Lingua::Phoneme(
            USERNAME => 'myusername',
            PASSWORD => 'mypassword',
        );
        $_ = $o->phoneme("house");    # scalar context
        @_ = $o->phoneme("house");    # list context
        # array ref of phonemes, primary accent index, secondary accent index
        my ($ps,$p,$s) = $o->phoneme_accent("house");
        __END__

PREREQUISITES
    DBI.pm, ( DBD::mysql.pm or other DBD::* ).

DESCRIPTION
    This module is intended to provide information on the phonemes and
    stress of English-language words.

    Currently it uses the Moby Pronunciation Dictionary in a MySQL DB, but
    you can change the DB settings at construction time, and there is no
    reason why it can't be extended to other languages should dictionaries
    be made available.

NOTES ON THE DATABASE
    From the Moby README file:

    Each pronunciation vocabulary entry consists of a word or phrase field
    followed by a field delimiter of space and the IPA-equivalent field
    that is coded using the following ASCII symbols (case is significant).
    Spaces between words in the word or phrase or pronunciation field is
    denoted with underbar "_".
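As an illustration of the entry layout just described (a word field, a single space, then the pronunciation field, with underscores standing in for spaces inside either field), here is a minimal sketch of splitting one dictionary line. The helper name "split_entry" is hypothetical and is not part of Lingua::Phoneme; the sample entry is the "house" entry quoted elsewhere in this document.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::Phoneme: split one dictionary
# line into its word field and its pronunciation field.
sub split_entry {
    my ($line) = @_;
    chomp $line;
    # The first space is the field delimiter. Spaces inside a field are
    # written as underscores, so neither field contains a literal space.
    my ($word, $pron) = split / /, $line, 2;
    return unless defined $pron;          # no delimiter: not an entry
    (my $display = $word) =~ tr/_/ /;     # human-readable form of the key
    return ($word, $display, $pron);
}

my ($key, $display, $pron) = split_entry('house ,h/&//U/s');
print "$display: $pron\n";
```

The word field is returned both as stored (with underscores) and in a display form, since the stored form is what a database key would need.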
        /&/     sounds like the "a" in "dab"
        /(@)/   sounds like the "a" in "air"
        /A/     sounds like the "a" in "far"
        /eI/    sounds like the "a" in "day"
        /@/     sounds like the "a" in "ado" or the glide "e" in "system"
                (dipthong schwa)
        /-/     sounds like the "ir" glide in "tire" or the "dl" glide in
                "handle" or the "den" glide in "sodden" (dipthong little
                schwa)
        /b/     sounds like the "b" in "nab"
        /tS/    sounds like the "ch" in "ouch"
        /d/     sounds like the "d" in "pod"
        /E/     sounds like the "e" in "red"
        /i/     sounds like the "e" in "see"
        /f/     sounds like the "f" in "elf"
        /g/     sounds like the "g" in "fig"
        /h/     sounds like the "h" in "had"
        /hw/    sounds like the "w" in "white"
        /I/     sounds like the "i" in "hid"
        /aI/    sounds like the "i" in "ice"
        /dZ/    sounds like the "g" in "vegetably"
        /k/     sounds like the "c" in "act"
        /l/     sounds like the "l" in "ail"
        /m/     sounds like the "m" in "aim"
        /N/     sounds like the "ng" in "bang"
        /n/     sounds like the "n" in "and"
        /Oi/    sounds like the "oi" in "oil"
        /A/     sounds like the "o" in "bob"
        /AU/    sounds like the "ow" in "how"
        /O/     sounds like the "o" in "dog"
        /oU/    sounds like the "o" in "boat"
        /u/     sounds like the "oo" in "too"
        /U/     sounds like the "oo" in "book"
        /p/     sounds like the "p" in "imp"
        /r/     sounds like the "r" in "ire"
        /S/     sounds like the "sh" in "she"
        /s/     sounds like the "s" in "sip"
        /T/     sounds like the "th" in "bath"
        /D/     sounds like the "th" in "the"
        /t/     sounds like the "t" in "tap"
        /@/     sounds like the "u" in "cup"
        /@r/    sounds like the "u" in "burn"
        /v/     sounds like the "v" in "average"
        /w/     sounds like the "w" in "win"
        /j/     sounds like the "y" in "you"
        /Z/     sounds like the "s" in "vision"
        /z/     sounds like the "z" in "zoo"

    Moby Pronunciator contains many common names and phrases borrowed from
    other languages; special sounds include (case is significant):

        "A"     sounds like the "a" in "ami"
        "N"     sounds like the "n" in "Francoise"
        "R"     sounds like the "r" in "Der"
        /x/     sounds like the "ch" in "Bach"
        /y/     sounds like the "eu" in "cordon bleu"
        "Y"     sounds like the "u" in "Dubois"

    Words and
    Phrases adopted from languages other than English have the unaccented
    form of the roman spelling. For example, "etude" has an initial
    accented "e" but is spelled without the accent in the Moby Pronunciator
    II database.

INSTALLATION OF THE DATABASE
    See the build manpage.

CONSTRUCTOR new
    Accepts name/value pairs as a hash or hash-like structure:

    CHAT
        Real-time info about progress on "STDERR".

    DATABASE
        The name of the dictionary database to use. Defaults to "accents".

    DRIVER
        The "DBD::*" driver: defaults to "mysql".

    USER, PASSWORD
        Used to access the DB - no default values.

    HOSTNAME, PORT
        The host name and port used to access the database. Defaults are
        "localhost" and "3306".

METHOD &build ($optional_path_to_db)
    Calling this method will fill the database, dropping and re-making all
    tables if they already exist.

    Optionally, supply an argument which is the full path to the Moby
    Pronunciation dictionary file - the default is to use "MobyPron" in the
    "$perl/site/lib/Lingua/Phoneme/dict/EN" directory.

METHOD raw
    Accepts a database handle and a scalar holding the word to look up.

    Returns the raw Moby phoneme scalar from the DB, or "undef" on failure
    to find the word (not necessarily an error).

    You are advised to use other methods to look up data in the DB: if you
    do use this, note that the DB keys have _underscores_ instead of
    spaces. You can use the "&prepare" function to convert these.

METHOD phoneme ($word_to_lookup)
    Accepts a word to look up.

    Returns the phonemes of the word, as a scalar or array, depending on
    the calling context, or "undef" if the word isn't in the dictionary.

    The phoneme pattern is defined in the Moby documentation: see "NOTES ON
    THE DATABASE".

METHOD phoneme_accent ($word_to_lookup)
    Accepts a word to look up.

    Returns a reference to an array of the phonemes of the word, plus the
    index in that array of the primary accent, and if there is a secondary
    accent, its index too. Returns "undef" if the word isn't in the
    dictionary.
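The three values described above (phoneme array reference, primary accent index, optional secondary accent index) could be consumed along these lines. This is only a sketch: "mark_accents" is a hypothetical helper, not part of Lingua::Phoneme, and the sample values are illustrative stand-ins rather than real output from the module or the Moby dictionary.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::Phoneme: render a phoneme list
# with the primary-accent phoneme in [brackets] and the secondary-accent
# phoneme, if any, in (parentheses).
sub mark_accents {
    my ($phonemes, $primary, $secondary) = @_;
    return unless defined $phonemes;    # word was not in the dictionary
    my @out = @$phonemes;               # copy, so the caller's array is untouched
    $out[$primary]   = "[$out[$primary]]"   if defined $primary;
    $out[$secondary] = "($out[$secondary])" if defined $secondary;
    return join ' ', @out;
}

# Illustrative values only. With a populated database you would write:
#   my ($ps, $p, $s) = $o->phoneme_accent("house");
my ($ps, $p, $s) = ( [qw( h AU s )], 1, undef );
print mark_accents($ps, $p, $s), "\n";
```

Checking that the first return value is defined before dereferencing it covers the "word isn't in the dictionary" case.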
    The phoneme pattern is defined in the Moby documentation: see "NOTES ON
    THE DATABASE".

    Note that the Moby documentation describes the stress marks thus:

        "'" (uncurled apostrophe) marks primary stress
        "," (comma) marks secondary stress

    This is plainly reversed, as the entry for "house" is
    "house ,h/&//U/s".

SEE ALSO
    the DBI manpage, the DBD::mysql manpage, the Lingua::Rhyme manpage.

KEYWORDS
    Phoneme, phoneme, syllable.

ACKNOWLEDGMENTS
    The Moby dictionary was found at described as *Moby (tm) Pronunciator
    II...(22 June 93)* with the contact address: *3449 Martha Ct., Arcata,
    CA 95521-4884, USA, +1 (707) 826-7715*.

AUTHOR
    Lee Goddard

COPYRIGHT
    This module is Copyright (C) Lee Goddard, 10 June 2002.

    This is free software, and can be used/modified under the same terms as
    Perl itself.

    The Moby dictionary is Copyright (c) 1988-93, Grady Ward. All Rights
    Reserved.