A character-based analysis of impacts of dialects on end-to-end Norwegian ASR

Parsons, Phoebe; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero

dc.contributor.author	Parsons, Phoebe
dc.contributor.author	Kvale, Knut
dc.contributor.author	Svendsen, Torbjørn Karl
dc.contributor.author	Salvi, Giampiero
dc.date.accessioned	2024-01-05T13:15:22Z
dc.date.available	2024-01-05T13:15:22Z
dc.date.created	2023-05-31T15:37:58Z
dc.date.issued	2023
dc.identifier.isbn	978-99-1621-999-7
dc.identifier.uri	https://hdl.handle.net/11250/3110151
dc.description.abstract	We present a method for analyzing character errors for use with character-based, end-to-end ASR systems, as used herein for investigating dialectal speech. As end-to-end systems are able to produce novel spellings, there exists a possibility that the spelling variants produced by these systems can capture phonological information beyond the intended target word. We therefore first introduce a way of guaranteeing that similar words and characters are paired during alignment, thus ensuring that any resulting analysis of character errors is founded on sound substitutions. Then, from such a careful character alignment, we find trends in system-generated spellings that align with known phonological features of Norwegian dialects, in particular, “r” and “l” confusability and voiceless stop lenition. Through this analysis, we demonstrate that cues from acoustic dialectal features can influence the output of an end-to-end ASR system.	en_US
dc.language.iso	eng	en_US
dc.publisher	University of Tartu	en_US
dc.relation.ispartof	Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
dc.relation.uri	https://aclanthology.org/2023.nodalida-1.47.pdf
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.subject	Språkteknologi	en_US
dc.subject	Language Technology	en_US
dc.subject	Talegjenkjenning	en_US
dc.subject	Speech recognition	en_US
dc.subject	Maskinlæring	en_US
dc.subject	Machine learning	en_US
dc.title	A character-based analysis of impacts of dialects on end-to-end Norwegian ASR	en_US
dc.title.alternative	A character-based analysis of impacts of dialects on end-to-end Norwegian ASR	en_US
dc.type	Chapter	en_US
dc.description.version	acceptedVersion	en_US
dc.subject.nsi	VDP::Datateknologi: 551	en_US
dc.subject.nsi	VDP::Computer technology: 551	en_US
dc.subject.nsi	VDP::Annen informasjonsteknologi: 559	en_US
dc.subject.nsi	VDP::Other information technology: 559	en_US
dc.source.pagenumber	467-476	en_US
dc.identifier.cristin	2150529
dc.relation.project	Norges forskningsråd: 322964	en_US
cristin.ispublished	true
cristin.fulltext	postprint

Files in this item

Name: NoDaLiDa_2023____A_character_b ...
Size: 861.1Kb
Format: PDF

This item appears in the following Collection(s)

Institutt for elektroniske systemer [2349]
Publikasjoner fra CRIStin - NTNU [38688]

Except where otherwise noted, this item's license is described as Attribution 4.0 International