
dc.contributor.author: Gelderblom, Femke B.
dc.contributor.author: Kvam, Johannes
dc.contributor.author: Liu, Yi
dc.contributor.author: Myrvoll, Tor Andre
dc.date.accessioned: 2021-10-22T07:26:29Z
dc.date.available: 2021-10-22T07:26:29Z
dc.date.created: 2021-05-28T12:46:02Z
dc.date.issued: 2021
dc.identifier.isbn: 978-1-7281-7606-2
dc.identifier.uri: https://hdl.handle.net/11250/2824872
dc.description.abstract: This paper investigates the use of different room impulse response (RIR) simulation methods for synthesizing training data for deep neural network-based direction of arrival (DOA) estimation of speech in reverberant rooms. Different sets of synthetic RIRs are obtained using the image source method (ISM) and more advanced methods including diffuse reflections and/or source directivity. Multi-layer perceptron (MLP) deep neural network (DNN) models are trained on generalized cross correlation (GCC) features extracted for each set. Finally, models are tested on features obtained from measured RIRs. This study shows the importance of training with RIRs from directive sources, as resultant DOA models achieved up to 51% error reduction compared to the steered response power with phase transform (SRP-PHAT) baseline (significant with p<<.01), while models trained with RIRs from omnidirectional sources did worse than the baseline. The performance difference was specifically present when estimating the azimuth of speakers not facing the array directly. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.ispartof: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
dc.title: SYNTHETIC DATA FOR DNN-BASED DOA ESTIMATION OF INDOOR SPEECH [en_US]
dc.type: Chapter [en_US]
dc.description.version: acceptedVersion [en_US]
dc.source.pagenumber: 4390-4394 [en_US]
dc.identifier.doi: 10.1109/ICASSP39728.2021.9414415
dc.identifier.cristin: 1912501
dc.relation.project: Norges forskningsråd: 256753 [en_US]
dc.description.localcode: © IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. [en_US]
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1

