A tool for automatic derivation of phone transitions for the creation of a diphone database for Sinhala text to speech synthesis

Date

2009

Publisher

Research Symposium 2009 - Faculty of Graduate Studies, University of Kelaniya

Abstract

Because conventional user interfaces such as the keyboard and monitor restrict how computers can be used, there is a pressing need for an alternative to the keyboard-and-screen interface that is widely in use at present. Speech technologies promise to be the next generation of user interfaces. In general, two technologies are needed for processing speech: speech recognition and speech synthesis. Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and it can be implemented in software, hardware, or both. Text-to-Speech (TTS) is one of the speech synthesis technologies. TTS can be defined as “the production of speech by machines, by way of the automatic phonetization of the sentences to utter”. Before a synthesizer can produce an utterance, several steps have to be completed. First, the right segments (units) have to be selected. The units commonly used include diphones, half-syllables, and triphones. Many synthesizers use diphones as their basic units of concatenation. A diphone is the transition between two speech sounds, obtained from natural speech. Creating a diphone database that contains all the sound transitions in the target language is therefore critical in diphone-based TTS synthesis.
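
To make the idea of "all the sound transitions" concrete, the following is a minimal sketch of how a candidate diphone list could be enumerated from a phone inventory. It is not the tool described in this abstract. The function name `derive_diphones`, the silence symbol `#`, and the four-phone toy inventory are assumptions introduced purely for illustration, and the toy inventory is not the Sinhala phone set.

```python
from itertools import product

SILENCE = "#"  # assumed boundary symbol; not taken from the source


def derive_diphones(phones, include_silence=True):
    """Return every ordered phone-to-phone transition as a 'left-right' label.

    This enumerates all pairs exhaustively; it does not filter out
    transitions that never occur in the target language.
    """
    units = list(dict.fromkeys(phones))  # drop duplicates, keep input order
    if include_silence:
        units = [SILENCE] + units
    return [
        f"{left}-{right}"
        for left, right in product(units, repeat=2)
        if not (left == SILENCE and right == SILENCE)  # skip silence-to-silence
    ]


# Placeholder inventory for illustration only; NOT the Sinhala phone set.
toy_phones = ["a", "i", "k", "m"]
diphones = derive_diphones(toy_phones)
print(len(diphones), diphones[:5])
```

For an inventory of n phones this exhaustive enumeration yields (n + 1)^2 - 1 candidates when silence is included. A practical diphone database would normally prune pairs that the language's phonotactics rule out, which this sketch does not attempt.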

Citation

Research Symposium; 2009 :122-123p
