Tandem MLNs based Phonetic Feature Extraction for Phoneme Recognition

Mohammed Rokibul Alam Kotwal; Foyzul Hassan; Ghulam Muhammad; Mohammad Nurul Huda

Authors

Mohammed Rokibul Alam Kotwal, United International University, Department of Computer Science and Engineering
Foyzul Hassan, United International University, Department of Computer Science and Engineering
Ghulam Muhammad, King Saud University, Department of CE, College of CIS
Mohammad Nurul Huda, United International University, Department of Computer Science and Engineering

Keywords:

multilayer neural network, hidden Markov model, automatic speech recognition, mel frequency cepstral coefficients, distinctive phonetic features, out-of-vocabulary

Abstract

This paper presents a method for automatic phoneme recognition in Japanese using tandem multilayer neural networks (MLNs). An accurate phoneme recognizer, or phonetic typewriter, plays an important role in current hidden Markov model (HMM)-based automatic speech recognition (ASR) systems: it extracts out-of-vocabulary (OOV) words and thereby addresses the OOV problem, which occurs when a new word is not in the word lexicon. The proposed method comprises three stages: (i) a first MLN converts acoustic features, mel frequency cepstral coefficients (MFCCs), into distinctive phonetic features (DPFs); (ii) a second MLN takes the DPFs combined with the acoustic features as input and outputs a 45-dimensional DPF vector with reduced context effect; and (iii) the 45-dimensional feature vector generated by the second MLN is fed into an HMM-based classifier to obtain more accurate phoneme strings from the input speech. Experiments on the Japanese Newspaper Article Sentences (JNAS) corpus in a clean acoustic environment show that the proposed method provides a higher phoneme correct rate and markedly better phoneme accuracy than a method based on a single MLN. Moreover, it requires fewer mixture components in the HMMs and consequently less HMM computation time.
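The data flow of the first two stages can be illustrated with a minimal sketch. It is not the authors' implementation: the MFCC dimension (39), the DPF dimension of the first MLN (45), the hidden-layer size (64), the sigmoid activations, and all function names are assumptions made for illustration, and the weights are random rather than trained. Only the 45-dimensional output of the second MLN comes from the abstract. The HMM classifier of stage (iii) is omitted.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def init_mln(n_in, n_hidden, n_out, rng):
    """Create random (untrained) weights for a one-hidden-layer MLN."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def mln_forward(x, p):
    """Forward pass: frames (T, n_in) -> outputs (T, n_out) in (0, 1)."""
    h = sigmoid(x @ p["W1"] + p["b1"])
    return sigmoid(h @ p["W2"] + p["b2"])


def tandem_dpf(mfcc, mln1, mln2):
    """Stages (i) and (ii): MFCC -> DPF, then [DPF, MFCC] -> refined DPF."""
    dpf1 = mln_forward(mfcc, mln1)                   # stage (i)
    combined = np.concatenate([dpf1, mfcc], axis=1)  # DPFs + acoustic features
    return mln_forward(combined, mln2)               # stage (ii)


# Hypothetical sizes; only the 45-dimensional final output is from the abstract.
N_MFCC, N_DPF, N_HIDDEN, N_FRAMES = 39, 45, 64, 100

rng = np.random.default_rng(0)
mln1 = init_mln(N_MFCC, N_HIDDEN, N_DPF, rng)
mln2 = init_mln(N_DPF + N_MFCC, N_HIDDEN, N_DPF, rng)

mfcc = rng.normal(size=(N_FRAMES, N_MFCC))  # stand-in for real MFCC frames
dpf = tandem_dpf(mfcc, mln1, mln2)
print(dpf.shape)  # (100, 45)
```

In a full system, each row of `dpf` would be passed, frame by frame, as the observation vector to the HMM-based classifier.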

