Please use this identifier to cite or link to this item: http://dx.doi.org/10.25673/4831
Title: Estimation of nonparametric probability density functions with applications to automatic speech recognition
Other Titles: Schätzung von nichtparametrischen Wahrscheinlichkeitsdichtefunktionen mit Anwendungen auf automatische Spracherkennung
Author(s): Schafföner, Martin
Granting Institution: Otto-von-Guericke-Universität Magdeburg
Issue Date: 2007
Extent: Online resource (PDF file: 138 pages, 1481 KB)
Type: Hochschulschrift (university thesis)
Type: PhDThesis
Language: English
Publisher: Otto von Guericke University Library, Magdeburg, Germany
URN: urn:nbn:de:101:1-201010182253
Subjects: Hochschulschrift (university thesis)
Online-Publikation (online publication)
Automatische Spracherkennung (automatic speech recognition)
Abstract: During the last decade, a new learning paradigm called Structural Risk Minimization (SRM), derived from Statistical Learning Theory, has become widely studied in machine learning. Machines implementing SRM, e.g., Support Vector Machines (SVMs) and Kernel Fisher Discriminants (KFDs), have been used very successfully to solve pattern recognition and function regression problems. SRM's ability to simultaneously minimize the risk of error on the training data and the complexity of the learning machine results in better generalization than plain Empirical Risk Minimization (ERM), especially if the amount of training data is limited.

The present work applies SRM to the problem of probability density function (PDF) estimation. When sequences of continuous-valued events are modeled with Hidden Markov Models (HMMs), e.g., in automatic speech recognition (ASR), PDFs model the emission probabilities of the HMMs' states. This thesis investigates and develops methods to efficiently train sparse kernel PDF models by regression of the empirical cumulative distribution function (ECDF). A new method is presented for obtaining a sparse approximation of the orthogonal least-squares regression solution by forward selection of relevant samples, using a novel memory-efficient thin update of the orthogonal decomposition. This method is evaluated on standard benchmark problems of up to five dimensions, where it outperforms traditional parametric Gaussian Mixture Models (GMMs) and performs comparably to the theoretically optimal, non-sparse Parzen-window PDF models. However, this method cannot be applied to PDF estimation for ASR because of the complexity of the ECDF in high dimensions. Instead, posterior class probabilities, calibrated from the outputs of binary discriminants such as SVMs or KFDs, are turned into class-conditional PDFs using Bayes' rule. This approach is tested within a monophone HMM ASR system on the Resource Management task, where it significantly outperforms traditional HMM-GMM systems, especially on randomly drawn, limited training sets, demonstrating the new models' improved generalization on small-sample problems.

To make these large-scale experiments feasible, a novel machine learning software library is presented. Its primary focus is on fast computation, simplicity in both expressing algorithms and extending functionality, and flexibility, so that algorithms' properties and advantages can be assessed properly. The library follows an object-oriented design and has been implemented in C++. For productivity, it is equipped with fine-grained tracing, an object-oriented persistence model, transparent error handling, and parallelization on distributed-memory computer clusters.
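As an illustration of the sparse kernel PDF estimation described in the abstract, the following minimal sketch fits a mixture of Gaussian-CDF kernels to the one-dimensional empirical CDF and selects kernel centres greedily from the training samples. Everything here is assumed for illustration (the function names, the Gaussian kernel choice, and the plain least-squares re-solve at every selection step); the thesis itself works with an orthogonal least-squares decomposition refreshed by a memory-efficient thin update and additionally constrains the solution to be a valid PDF, which this sketch does not.

# A minimal, assumed sketch of sparse kernel PDF estimation by regression of
# the empirical CDF with greedy forward selection of kernel centres.
# NOTE: unlike the thesis, this re-solves a plain least-squares problem at
# every step and does not enforce that the result is a valid PDF.
import numpy as np
from scipy.stats import norm


def ecdf(samples, grid):
    """Empirical CDF of `samples`, evaluated at the points in `grid`."""
    samples = np.sort(np.asarray(samples))
    return np.searchsorted(samples, grid, side="right") / len(samples)


def fit_sparse_kernel_pdf(samples, bandwidth, n_kernels, grid):
    """Greedily pick `n_kernels` centres from `samples`; return centres and weights."""
    samples, grid = np.asarray(samples), np.asarray(grid)
    target = ecdf(samples, grid)              # regression target F_n(t)
    candidates = list(range(len(samples)))    # every sample is a candidate centre
    chosen, design, weights = [], np.empty((len(grid), 0)), np.empty(0)
    for _ in range(n_kernels):
        best = None
        for j in candidates:
            # Candidate column: Gaussian CDF kernel centred on sample j.
            col = norm.cdf(grid, loc=samples[j], scale=bandwidth)[:, None]
            trial = np.hstack([design, col])
            w, *_ = np.linalg.lstsq(trial, target, rcond=None)
            err = np.sum((trial @ w - target) ** 2)
            if best is None or err < best[0]:
                best = (err, j, trial, w)
        _, j, design, weights = best          # keep the centre that fits best
        chosen.append(j)
        candidates.remove(j)
    return samples[chosen], weights


def kernel_pdf(t, centres, weights, bandwidth):
    """The model PDF is the derivative of the fitted kernel-CDF mixture."""
    return norm.pdf(np.asarray(t)[:, None], loc=centres, scale=bandwidth) @ weights

For example, with samples drawn from a bimodal distribution and a grid covering their range, fit_sparse_kernel_pdf(samples, bandwidth=0.3, n_kernels=10, grid=grid) returns ten selected centres with their least-squares weights, and kernel_pdf(grid, centres, weights, 0.3) evaluates the resulting density model on the grid.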
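The conversion of calibrated classifier posteriors into class-conditional emission densities mentioned in the abstract follows from Bayes' rule; the notation below is an assumed, generic formulation rather than the thesis's own. With P(c | x) the calibrated posterior of class c given feature vector x, P(c) the class prior (e.g., estimated from relative class frequencies in the training data), and p(x) the class-independent evidence:

\[
p(x \mid c) \;=\; \frac{P(c \mid x)\, p(x)}{P(c)} \;\propto\; \frac{P(c \mid x)}{P(c)}
\]

Because p(x) is the same for every class for a given x, it does not affect the comparison of state hypotheses, so the scaled likelihood P(c | x) / P(c) can serve directly as the HMM emission score.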
URI: https://opendata.uni-halle.de//handle/1981185920/10873
http://dx.doi.org/10.25673/4831
Open Access: Open access publication
Appears in Collections: Fakultät für Elektrotechnik und Informationstechnik

Files in This Item:
File: marschaffoener.pdf (1.48 MB, Adobe PDF)