Browsing NTNU Open by Author "Svendsen, Torbjørn Karl"
-
A Comparative Study of Deep Learning Techniques on Frame-Level Speech Data Classification
Sabzi Shahrebabaki, Abdolreza; Imran, Ali Shariq; Olfati, Negar; Svendsen, Torbjørn Karl (Journal article; Peer reviewed, 2019) This paper provides a comprehensive analysis of the effect of speaking rate on frame classification accuracy. Different speaking rates may affect the performance of the automatic speech recognition system yielding poor ... -
A Deep Learning Approach to Spoken Language Acquisition
Rugayan, Janine (Master thesis, 2021) The process of human spoken language acquisition is still being studied to this day. The most popular theory, from B.F. Skinner, describes the language learning of infants as a verbal behavior controlled by consequences. ... -
Acoustic Feature Comparison for Different Speaking Rates
Sabzi Shahrebabaki, Abdolreza; Imran, Ali Shariq; Olfati, Negar; Svendsen, Torbjørn Karl (Chapter, 2018) This paper investigates the effect of speaking rate variation on the task of frame classification. This task is indicative of the performance on phoneme and word recognition and is a first step towards designing voice-controlled ... -
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models
Sabzi Shahrebabaki, Abdolreza; Salvi, Giampiero; Svendsen, Torbjørn Karl; Siniscalchi, Sabato Marco (Journal article; Peer reviewed, 2021) We investigate the problem of speaker independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. In contrast with recent results in the literature, we argue ... -
An Analysis of Goodness of Pronunciation for Child Speech
Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero (Peer reviewed; Journal article, 2023) -
Better coding for learning speech representations
Rosberg, Sivert (Master thesis, 2022) This thesis investigates the possibility of using Non-autoregressive Predictive Coding (NPC) to learn speech representations. NPC is a self-supervised deep learning method that, in contrast to other common self-supervised ... -
A character-based analysis of impacts of dialects on end-to-end Norwegian ASR
Parsons, Phoebe; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero (Chapter, 2023) We present a method for analyzing character errors for use with character-based, end-to-end ASR systems, as used herein for investigating dialectal speech. As end-to-end systems are able to produce novel spellings, there ... -
Child Speech Recognition
Steinskog, Kristin Ottesen (Master thesis, 2021) Speech recognition for children is challenging because today's speech recognition systems are based on adult speech. Speech recognition can support the development of speech and language in children. It is therefore important to improve speech recog ... -
Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages
Olstad, Anne Marte Haug; Smolander, Anna; Strömbergsson, Sofia; Ylinen, Sari; Lehtonen, Minna; Kurimo, Mikko; Getman, Yaroslav; Grósz, Tamás; Cao, Xinwei; Svendsen, Torbjørn Karl; Salvi, Giampiero (Journal article; Peer reviewed, 2024) -
Decision Algorithm for Parking Sensors
Karami, Hossein (Master thesis, 2020) Studies show that up to one third of all urban congestion is caused by drivers looking for a place to park. American drivers spend an average of 17 hours a year searching for free parking spaces ... -
Detecting Parkinson's Disease from Speech Data using Wav2Vec-based Architecture
Schult, Julie Elisabeth; Ven, Laura Feøy (Master thesis, 2024) This master's thesis investigated whether it is feasible to detect Parkinson's disease using speech data and the pretrained model Wav2Vec 2.0. This project is the continuation of a specialization project carried out ... -
Detecting Parkinson's Disease from Speech Data using Wav2Vec-based Architecture
Ven, Laura Feøy; Schult, Julie Elisabeth (Master thesis, 2024) This master's thesis investigated whether it is feasible to detect Parkinson's disease using speech data and the pretrained model Wav2Vec 2.0. This project is the continuation of a specialization project carried out ... -
A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion
Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjørn Karl (Chapter, 2021) In this work, we investigate the problem of speaker independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. We claim that DNN vector-to-vector regression for ... -
Enhancement of Noisy Speech Using Deep Learning
Turøy, Ida; Mo, Kari Vikøren (Master thesis, 2021) Abstract will be available on 2024-01-07 -
Enhancement of Noisy Speech Using Deep Learning
Mo, Kari Vikøren; Turøy, Ida (Master thesis, 2021) Abstract will be available on 2024-01-11 -
Exploiting Foundation Models and Speech Enhancement for Parkinson’s Disease Detection from Speech in Real-World Operative Conditions
La Quatra, Moreno; Turco, Maria Francesca; Svendsen, Torbjørn Karl; Salvi, Giampiero; Orozco-Arroyave, Juan Rafael; Siniscalchi, Sabato Marco (Journal article; Peer reviewed, 2024) -
A Framework for Phoneme-Level Pronunciation Assessment Using CTC
Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero (Peer reviewed; Journal article, 2024) -
Improving Generalization of Norwegian ASR with Limited Linguistic Resources
Solberg, Per Erik; Ortiz Cabello, Pablo; Parsons, Phoebe; Svendsen, Torbjørn Karl; Salvi, Giampiero (Chapter, 2023) -
Low-resource speech recognition - Exploring methods of improving performance
Moum, August Høyen; Winnerdal, Skjalg (Master thesis, 2020) Building an accurate speech recognition system that generalizes sufficiently is no easy task. Limited amounts of transcribed speech data complicate this further, since the systems require large amounts of training data ...