Transient Noise Event Detection in Recorded Speech
Transient noise event detection in speech is the task of detecting transient noise events, such as loud knocks, crumpling of paper, and other impulsive sounds, in recorded speech.

Transient event detection is relevant in many different fields, and much work has been done to find methods that work in specific applications. In speech processing there are a few well-established methods that work quite well on this problem. Mel-Frequency Cepstral Coefficients (MFCC), for instance, are among the most widely used features in speech processing, and are used in many commercial programs that have to deal with transient noise events, such as automatic speech recognizers. What is lacking is an extensive comparison of the existing methods that shows how well they perform against each other.

This work tests many of the existing methods from speech processing and other fields to see how they compare. The speech is taken from the TIMIT database, and a wide variety of transients are mixed in with the speech to test the detection performance of the different methods. Some tweaks to existing algorithms are proposed.

It turns out that a well-trained MFCC Gaussian Mixture Model (GMM) reaches almost 100% correct detection. Other methods, such as the change of the Short Term Energy in third-octave bands, achieve more than 95% correct detection at a much lower computational cost.
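To make the idea of a change in Short Term Energy in third-octave bands more concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes that band energies are approximated by grouping FFT bins rather than by a proper filter bank, and the frame length, hop size, band range, and the 15 dB threshold are illustrative values that do not come from the abstract. The function names are hypothetical.

```python
import numpy as np


def third_octave_edges(f_low=100.0, f_high=7000.0):
    """Lower and upper edges of third-octave bands centred on 1 kHz * 2^(k/3)."""
    k_min = int(np.ceil(3 * np.log2(f_low / 1000.0)))
    k_max = int(np.floor(3 * np.log2(f_high / 1000.0)))
    centres = 1000.0 * 2.0 ** (np.arange(k_min, k_max + 1) / 3.0)
    return centres * 2.0 ** (-1.0 / 6.0), centres * 2.0 ** (1.0 / 6.0)


def band_energies_db(signal, fs, frame_len=512, hop=256):
    """Short-term energy per frame and band in dB, shape (n_frames, n_bands)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    lo, hi = third_octave_edges()
    energies = np.empty((n_frames, len(lo)))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for b in range(len(lo)):
            # Assumption: sum the FFT bins that fall inside the band.
            in_band = (freqs >= lo[b]) & (freqs < hi[b])
            energies[i, b] = power[in_band].sum()
    # Small floor avoids log(0) for silent or empty bands.
    return 10.0 * np.log10(energies + 1e-12)


def detect_transients(signal, fs, threshold_db=15.0, frame_len=512, hop=256):
    """Return indices of frames whose mean band energy rises sharply.

    A frame is flagged when the energy increase over the previous frame,
    averaged across bands, exceeds threshold_db (an illustrative value).
    """
    e_db = band_energies_db(signal, fs, frame_len, hop)
    rise = np.diff(e_db, axis=0)      # change between consecutive frames
    score = rise.mean(axis=1)         # average rise across all bands
    return np.flatnonzero(score > threshold_db) + 1
```

Calling `detect_transients(x, fs)` on a mono signal `x` sampled at `fs` Hz returns frame indices; multiplying an index by `hop / fs` gives the approximate onset time in seconds. Averaging the rise across bands favours broadband impulsive events over the narrower-band onsets typical of voiced speech, which is one plausible reason a band-wise energy measure can be effective at low cost.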