Speech Recognition by Indexing and Sequencing

Authors

  • Simone Franzini, Department of Computer Science, University of Illinois at Chicago, 851 S. Morgan St., Chicago, IL 60607, USA
  • Jezekiel Ben-Arie, Department of Electrical and Computer Engineering, University of Illinois at Chicago, 851 S. Morgan St., Chicago, IL 60607, USA

Keywords:

example-based speech recognition; recognition by indexing and sequencing; RISq; compounded examples

Abstract

Recognition by Indexing and Sequencing (RISq) is a general-purpose, example-based method for classifying temporal vector sequences. We developed an advanced version of RISq and applied it to speech recognition, a task most commonly performed with Hidden Markov Models (HMMs) or Dynamic Time Warping (DTW). RISq differs substantially from both methods and offers several advantages over them: robust recognition can be achieved using only a few samples from the input sequence, and training can be carried out with one or more examples per class. This enables much faster training and also allows speech with a variety of accents to be recognized. Classification proceeds in two steps. First, the training samples closest to each input sample are identified and weighted with a parallel algorithm (indexing). Then, a maximum weighted bipartite graph matching is found between the input sequence and a training sequence, subject to an additional temporal constraint (sequencing). We discuss the application of RISq to speech recognition and compare its architecture and performance with those of Sphinx, a state-of-the-art speech recognizer based on HMMs.
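
The two-step classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian weighting, the `sigma` and `min_weight` parameters, the function names, and the reading of the temporal constraint as "matched pairs must keep their time order" are all assumptions made for illustration. The indexing step is written serially here, whereas the abstract describes a parallel algorithm.

```python
import math


def gaussian_weight(x, y, sigma=1.0):
    """Similarity of two feature vectors (assumed Gaussian of Euclidean distance)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def index_weights(input_seq, train_seq, sigma=1.0, min_weight=0.05):
    """Indexing (sketch): weight every (input sample, training sample) pair.

    Pairs whose weight falls below min_weight are zeroed, so only the
    training samples close to an input sample contribute.
    """
    table = []
    for x in input_seq:
        row = []
        for y in train_seq:
            w = gaussian_weight(x, y, sigma)
            row.append(w if w >= min_weight else 0.0)
        table.append(row)
    return table


def sequence_score(weights):
    """Sequencing (sketch): maximum total weight of a one-to-one matching
    between input and training samples that preserves temporal order.

    Solved by dynamic programming over prefixes of both sequences.
    """
    n = len(weights)
    m = len(weights[0]) if n else 0
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],                          # skip input sample i
                best[i][j - 1],                          # skip training sample j
                best[i - 1][j - 1] + weights[i - 1][j - 1],  # match i with j
            )
    return best[n][m]


def classify(input_seq, training):
    """Return the best-scoring class label and the per-class scores.

    training maps each label to a list of example sequences; a class may
    have a single example.
    """
    scores = {
        label: max(sequence_score(index_weights(input_seq, ex)) for ex in examples)
        for label, examples in training.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical one-dimensional "feature vectors", one example per class.
training = {
    "rising": [[(0.0,), (1.0,), (2.0,)]],
    "falling": [[(2.0,), (1.0,), (0.0,)]],
}
sparse_input = [(0.1,), (1.9,)]  # only two samples of the input sequence
label, scores = classify(sparse_input, training)
```

In this toy case both classes contain samples close to each input sample, so indexing alone cannot separate them; the order constraint in the sequencing step is what favours "rising", even though only two input samples are used.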

Published

2012-04-01

How to Cite

Simone Franzini, & Jezekiel Ben-Arie. (2012). Speech Recognition by Indexing and Sequencing. International Journal of Computer Information Systems and Industrial Management Applications, 4, 8. Retrieved from https://cspub-ijcisim.org/index.php/ijcisim/article/view/183

Section

Original Articles