dc.contributor.author: Ali Humayun, Mohammad
dc.contributor.author: Hameed, Ibrahim A.
dc.contributor.author: Muslim Shah, Syed
dc.contributor.author: Hassan Khan, Sohaib
dc.contributor.author: Zafar, Irfan
dc.contributor.author: Bin Ahmed, Saad
dc.contributor.author: Shuja, Junaid
dc.date.accessioned: 2019-07-01T09:47:12Z
dc.date.available: 2019-07-01T09:47:12Z
dc.date.created: 2019-06-30T10:30:18Z
dc.date.issued: 2019
dc.identifier.citation: Applied Sciences. 2019, 9 (9), . [nb_NO]
dc.identifier.issn: 2076-3417
dc.identifier.uri: http://hdl.handle.net/11250/2602970
dc.description.abstract: Automatic Speech Recognition (ASR) has achieved its best results for English, with end-to-end, neural-network-based supervised models. These supervised models need large amounts of labeled speech data to generalize well, which can be difficult to obtain for low-resource languages like Urdu. Most models proposed for Urdu ASR are based on Hidden Markov Models (HMMs). This paper proposes an end-to-end neural network model for Urdu ASR, regularized with dropout, ensemble averaging and Maxout units. Dropout and ensembles are averaging techniques over multiple neural network models, while Maxout units are neural network units that adapt their activation functions. Because labeled data is limited, Semi-Supervised Learning (SSL) techniques are also incorporated to improve model generalization. Speech features are transformed into a lower-dimensional manifold using an unsupervised dimensionality-reduction technique called Locally Linear Embedding (LLE). The transformed data, along with the higher-dimensional features, is used to train the neural networks. The proposed model also uses label-propagation-based self-training of the initially trained models and achieves a Word Error Rate (WER) 4% lower than the benchmark reported on the same Urdu corpus using HMMs. The decrease in WER after incorporating SSL is more significant with an increased validation data size. [nb_NO]
dc.language.iso: eng [nb_NO]
dc.publisher: MDPI [nb_NO]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.title: Regularized Urdu Speech Recognition with Semi-Supervised Deep Learning [nb_NO]
dc.type: Journal article [nb_NO]
dc.type: Peer reviewed [nb_NO]
dc.description.version: publishedVersion [nb_NO]
dc.source.pagenumber: 15 [nb_NO]
dc.source.volume: 9 [nb_NO]
dc.source.journal: Applied Sciences [nb_NO]
dc.source.issue: 9 [nb_NO]
dc.identifier.doi: https://doi.org/10.3390/app9091956
dc.identifier.cristin: 1708849
dc.description.localcode: © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). [nb_NO]
cristin.unitcode: 194,63,55,0
cristin.unitname: Institutt for IKT og realfag
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1


Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)