Browsing Department of Electronic Systems by author "Salvi, Giampiero"
-
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models
Sabzi Shahrebabaki, Abdolreza; Salvi, Giampiero; Svendsen, Torbjørn Karl; Siniscalchi, Sabato Marco (Journal article; Peer reviewed, 2021) We investigate the problem of speaker independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. In contrast with recent results in the literature, we argue ... -
Beyond the Self: Using Grounded Affordances to Interpret and Describe Others' Actions
Saponaro, Giovanni; Jamone, Lorenzo; Alexandre, Bernardino; Salvi, Giampiero (Journal article; Peer reviewed, 2019) We propose a developmental approach that allows a robot to interpret and describe the actions of human agents by reusing previous experience. The robot first learns the association between words and object affordances by ... -
A character-based analysis of impacts of dialects on end-to-end Norwegian ASR
Parsons, Phoebe; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero (Chapter, 2023) We present a method for analyzing character errors for use with character-based, end-to-end ASR systems, as used herein for investigating dialectal speech. As end-to-end systems are able to produce novel spellings, there ... -
Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages
Olstad, Anne Marte Haug; Smolander, Anna; Strömbergsson, Sofia; Ylinen, Sari; Lehtonen, Minna; Kurimo, Mikko; Getman, Yaroslav; Grósz, Tamás; Cao, Xinwei; Svendsen, Torbjørn Karl; Salvi, Giampiero (Journal article; Peer reviewed, 2024) -
Comparative analysis of explainable machine learning prediction models for hospital mortality
Stenwig, Eline; Salvi, Giampiero; Salvo Rossi, Pierluigi; Skjaervold, Nils Kristian (Peer reviewed; Journal article, 2022) Background: Machine learning (ML) holds the promise of becoming an essential tool for utilising the increasing amount of clinical data available for analysis and clinical decision support. However, the lack of trust in ... -
Comparison of correctly and incorrectly classified patients for in-hospital mortality prediction in the intensive care unit
Stenwig, Eline; Salvi, Giampiero; Salvo Rossi, Pierluigi; Skjaervold, Nils Kristian (Peer reviewed; Journal article, 2023) Background: The use of machine learning is becoming increasingly popular in many disciplines, but there is still an implementation gap of machine learning models in clinical settings. Lack of trust in models is one of ... -
Development of a Data Characterization Software for a Gamma Spectroscopy System
Godø, Sofia Olivia Kathea (Master thesis, 2021) Based on its own ASIC, the IDE3421, the company IDEAS has developed a system that interprets spectrometer readings from several parallel CZT crystals. The GDS-100 system benefits from software that implements a calibration procedure ... -
A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion
Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjørn Karl (Chapter, 2021) In this work, we investigate the problem of speaker independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. We claim that DNN vector-to-vector regression for ... -
Dynamic Beam Search Decoding for Speech Recognition using Confidence Estimation Module
Stensgård, Kristian (Master thesis, 2022) Deep learning with neural networks has become increasingly popular over the past decade and has shown promising results in computer vision and Natural Language Processing (NLP) tasks, as well as in Automatic Speech Recognition (ASR). The areas of application ... -
Modeling the Interpretability of an End-to-End Automatic Speech Recognition System Adapted to Norwegian Speech
Lunde, Solveig Reppen (Master thesis, 2022) The purpose of this work was to model the interpretability of an automatic speech recognition system trained on Norwegian speech. The system is an end-to-end deep neural network that takes speech data as input and is trained to output ... -
NAAQA: A Neural Architecture for Acoustic Question Answering
Abdelnour, Jerome; Rouat, Jean; Salvi, Giampiero (Peer reviewed; Journal article, 2022) The goal of the Acoustic Question Answering (AQA) task is to answer a free-form text question about the content of an acoustic scene. It was inspired by the Visual Question Answering (VQA) task. In this paper, based on the ... -
Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation
Rugayan, Janine Lizbeth Cabrera; Svendsen, Torbjørn Karl; Salvi, Giampiero (Peer reviewed; Journal article, 2023) -
Performance Evaluation of Different Approaches to Motion Artifact Cancellation in Pulse Oximetry
Gjeraker, Johannes (Master thesis, 2020) Abstract will be available on 2023-06-27 -
Real-Time Object Detection for Control of Towed Marine Seismic Handling Systems
Høiberget, Magnus (Master thesis, 2021) -
Self-supervised vision-based detection of the active speaker as support for socially aware language acquisition
Stefanov, Kalin; Beskow, Jonas; Salvi, Giampiero (Peer reviewed; Journal article, 2020) This paper presents a self-supervised method for visual detection of the active speaker in a multiperson spoken interaction scenario. Active speaker detection is a fundamental prerequisite for any artificial cognitive ... -
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
Stefanov, Kalin; Beskow, Jonas; Salvi, Giampiero (Journal article; Peer reviewed, 2019) This paper presents a self-supervised method for visual detection of the active speaker in a multi-person spoken interaction scenario. Active speaker detection is a fundamental prerequisite for any artificial cognitive ... -
Semantically Meaningful Metrics for Norwegian ASR Systems
Rugayan, Janine Lizbeth Cabrera; Svendsen, Torbjørn Karl; Salvi, Giampiero (Peer reviewed; Journal article, 2022) Evaluation metrics are important for quantifying the performance of Automatic Speech Recognition (ASR) systems. However, the widely used word error rate (WER) captures errors at the word-level only and weighs each error ... -
Semi-supervised learning for Automatic Speech Recognition
Rahim, Felicia (Master thesis, 2020) This master's thesis investigates a speech recognition system trained on a partially annotated database within the field of automatic speech recognition (ASR). A deep neural network (DNN) classified states belonging to individual ... -
Sequence-to-sequence articulatory inversion through time convolution of sub-band frequency signals
Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjørn Karl (Peer reviewed; Journal article, 2020) We propose a new acoustic-to-articulatory inversion (AAI) sequence-to-sequence neural architecture, where spectral sub-bands are independently processed in time by 1-dimensional (1-D) convolutional filters of different ... -
Silent Speech Communication Using Facial Electromyography
Backsæther, Mathias Gullikstad (Master thesis, 2021) Language is invaluable to humans as a species, and speech as a means of communication enables collaboration between people every day. Nevertheless, there are various situations in which vocalized speech is not an option. Interest in ...