Software – Speech

iSpeak For Children

iSpeak for Children is an interactive learning application for kids. It is widely accepted that multimedia can enhance children's learning. With speech recognition technology and a voice feedback system, learning the alphabet, shapes, objects, calculations, etc. becomes easier and more engaging. The current version of iSpeak offers the following speech-recognition-based (hands-free) activities:

  • Learning the English alphabet
  • Learning basic shapes
  • A quiz to identify various objects
  • A Tic-Tac-Toe game

The entire application is voice controlled, which is its most appealing feature for kids.

We are currently rolling out a demo version of the application with limited features. Users can download the demo free of charge and send us their feedback. The full version can be made available on request after the user has provided satisfactory feedback on the demo version.

Click here if you wish to download iSpeak for children.

A YouTube video demonstrating the features of iSpeak is available at http://www.youtube.com/watch?v=X-Pj5mWNxGs

IPA Help

The International Phonetic Alphabet (IPA) is a notational standard for the phonetic representation of all languages.

IPA Help is a computer program for learning to recognize, transcribe, and produce the sounds of the IPA. To help you associate symbols with sounds, IPA Help provides phonetic charts, lists of example words, and recordings of the phones and words.

You can download it from: http://www.sil.org/computing/ipahel/ipahelp_download.htm

Cool Edit

Cool Edit Pro is an audio editing application, now known as Adobe Audition.

Whether you're an audio engineer, web developer, multimedia creator, or musician, Cool Edit Pro is the software application that meets all your demanding needs.

You can download a free 21-day trial from: http://www.softpedia.com/progDownload/Cool-Edit-Pro-Download-2076.html

SP Wave

spwave is an audio file editor supporting several sound formats including WAV, AIFF, MP3, Ogg Vorbis, raw, and more. The program is designed for research use, so stability and usability are regarded as important. spwave runs on multiple platforms including Windows, Mac OS, and Linux.

spwave has the following features:

  • Support for multiple platforms: Windows, Mac OS, Linux (Motif, gtk), etc.
  • Support for WAV, AIFF, MP3, Ogg Vorbis, raw, and text files by using plug-ins.
  • Support for many sample formats (bits/sample): 8-bit, 16-bit, 24-bit, 32-bit, 32-bit float, 64-bit double.
  • Converting the sampling frequency and the bits/sample of a file.
  • Playing, zooming, cropping, deleting, extracting, etc. of a selected region.
  • Fade-in, fade-out, gain adjustment, channel swapping, etc. of a selected region.
  • Editing file information, including WAV and AIFF comments and MP3 ID3 tags.
  • Analysis of a selected region using several analysis types, e.g. spectrum, smoothed spectrum, phase, unwrapped phase and group delay.
  • Unlimited undo and redo.
  • Waveform extraction by drag & drop.
  • Opening files by drag & drop.
  • Autosaving of selected regions (you can do this by drag & drop also).
  • Saving positions and regions as labels.
  • Viewing several waveforms and setting regions synchronously.
  • Almost all processing is done internally at 64-bit precision.
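
As a rough illustration of what region operations such as fade-in, fade-out, gain adjustment, and channel swapping involve, here is a minimal Python sketch. It is not spwave's code: the function names (`apply_gain`, `fade`, `swap_channels`), the linear fade curve, and the representation of audio as plain lists of floats are all assumptions made for this example.

```python
def apply_gain(samples, gain_db):
    """Scale every sample by a gain given in decibels (hypothetical helper)."""
    factor = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude ratio
    return [s * factor for s in samples]


def fade(samples, start, end, fade_in=True):
    """Apply a linear fade to the region samples[start:end].

    A fade-in ramps the region from silence up to full level;
    a fade-out ramps it from full level down to silence.
    Samples outside the region are left unchanged.
    """
    out = list(samples)
    length = end - start
    for i in range(length):
        ramp = i / (length - 1) if length > 1 else 1.0
        out[start + i] *= ramp if fade_in else 1.0 - ramp
    return out


def swap_channels(frames):
    """Swap left and right in a list of (left, right) stereo frames."""
    return [(right, left) for left, right in frames]
```

For example, `fade([1.0] * 5, 0, 5)` ramps the five samples from 0.0 up to 1.0 in equal steps, and `apply_gain(samples, -6.0)` roughly halves the amplitude.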

For more information, see:
http://www-ie.meijo-u.ac.jp/~banno/spLibs/spwave/index.html

 

PRAAT

Praat is a free computer software package for the scientific analysis of speech in phonetics. It was designed, and continues to be developed, by Paul Boersma and David Weenink of the University of Amsterdam. It runs on a wide range of operating systems, including various versions of Unix, Linux, Mac, and Microsoft Windows. Praat offers a wide range of standard and non-standard procedures, including spectrographic analysis, speech synthesis (including articulatory synthesis), and neural networks.
The following are some of the operations that can be performed in it:

  • Create a speech object
  • Process a signal
  • Label a waveform
  • General analysis (waveform, intensity, sonogram, pitch, duration)
  • Spectrographic analysis
  • Intensity analysis
  • Pitch analysis
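
To give a feel for what intensity and pitch analysis compute, the following Python sketch measures the RMS level of a frame in decibels and estimates pitch by picking the strongest autocorrelation peak. It is only a sketch under stated assumptions: it does not use Praat or its scripting language, Praat's own algorithms are considerably more refined, and the function names (`sine`, `rms_db`, `estimate_pitch`) and the 75 to 500 Hz search range are choices made for this example.

```python
import math


def sine(freq, sample_rate, n):
    """Generate n samples of a unit-amplitude sine wave (test signal)."""
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]


def rms_db(samples, ref=1.0):
    """Root-mean-square level of a frame, in dB relative to `ref`."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_sq / (ref * ref))


def estimate_pitch(samples, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate pitch in Hz from the lag with the largest autocorrelation.

    Only lags corresponding to frequencies between fmin and fmax are
    searched. Returns None if no lag has a positive autocorrelation.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


tone = sine(200.0, 8000, 800)      # 0.1 s of a 200 Hz tone
level = rms_db(tone)               # about -3 dB for a unit-amplitude sine
pitch = estimate_pitch(tone, 8000)  # close to 200 Hz
```

On real speech, analysis of this kind is applied to short overlapping frames rather than to a whole recording, which yields intensity and pitch contours over time.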

PRAAT can be downloaded from here.
You can refer to this link for a PRAAT tutorial.

HTK (Hidden Markov Model Toolkit)

The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models (HMMs). HMMs can be used to model any time series, and the core of HTK is similarly general-purpose. HTK is primarily used for speech recognition research, although it has been used for numerous other applications, including research into speech synthesis, character recognition, and DNA sequencing. HTK is in use at hundreds of sites worldwide.
HTK consists of a set of library modules and tools available in C source form. The tools provide sophisticated facilities for speech analysis, HMM training, testing and results analysis. The software supports HMMs using both continuous density mixture Gaussians and discrete distributions and can be used to build complex HMM systems.

The HTK training tools are used to estimate the parameters of a set of HMMs using training utterances and their associated transcriptions. Then unknown utterances are transcribed using the HTK recognition tools. Much of the functionality of HTK is built into the library modules. These modules ensure that every tool interfaces to the outside world in exactly the same way. They also provide a central resource of commonly used functions.
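
The recognition step, finding the most likely hidden state sequence for a series of observations, is commonly done with the Viterbi algorithm. The Python sketch below shows that algorithm on a toy discrete HMM. It is not HTK code and does not use HTK's tools or file formats: the `viterbi` function, the two states, the three observation symbols, and all probabilities are made up for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, probability) for a discrete-output HMM."""
    # best[t][s] = (probability of the best path ending in state s at
    #               time t, previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max((best[-1][p][0] * trans_p[p][s], p)
                             for p in states)
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    probability = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, probability


# Toy model (all values hypothetical): is each frame silence or speech,
# given a coarse loudness label per frame?
states = ["silence", "speech"]
start_p = {"silence": 0.6, "speech": 0.4}
trans_p = {"silence": {"silence": 0.7, "speech": 0.3},
           "speech": {"silence": 0.4, "speech": 0.6}}
emit_p = {"silence": {"quiet": 0.5, "medium": 0.4, "loud": 0.1},
          "speech": {"quiet": 0.1, "medium": 0.3, "loud": 0.6}}

path, probability = viterbi(["quiet", "medium", "loud"],
                            states, start_p, trans_p, emit_p)
# path is ["silence", "silence", "speech"], with probability of about 0.01512
```

Practical recognizers work with log probabilities to avoid numerical underflow on long utterances and use continuous output densities rather than a small symbol table; this sketch omits both for brevity.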

HTK can be downloaded from here.

For more information on HTK, see:
1. www.eecs.yorku.ca/course_archive/2007-08/W/6328/Reading/htkbook31_part1.pdf

2. http://www.bas.uni-muenchen.de/forschung/publikationen/Schiel_HTK.txt
