Detection of Human Emotion from Noise Speech
Blekinge Institute of Technology, Faculty of Engineering, Department of Applied Signal Processing.
2020 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

Detecting human emotion from speech is a challenging task. Intonation, pitch, and loudness vary from one speaker to another, so these properties must be estimated accurately before an emotion can be detected. Some recordings also contain high background noise, which distorts the amplitude or pitch of the signal. Detailed knowledge of the properties of the speech signal is therefore essential for emotion detection.

Emotion detection from speech signals is a recent research field. One of the scenarios where it has been applied is situations in which human integrity and security are at risk. In this project we propose a set of features based on the signals obtained by decomposing speech with the discrete wavelet transform, in order to characterize different types of emotions, most of them negative, such as anger, happiness, sadness, and desperation. The features are measured under three conditions: (1) the original speech signals, (2) signals contaminated with noise or affected by a telephone channel, and (3) signals obtained after processing with a speech enhancement algorithm. According to the results, applying speech enhancement improves emotion detection compared with the results obtained when the speech signal is highly contaminated with noise.

Our objective is to use an artificial neural network, motivated by the fact that the brain, itself built from networks of neurons, is the most efficient machine for recognizing speech. Artificial neural networks also offer useful properties such as nonlinearity and high classification capability, which makes them a natural choice for building a machine that can detect emotion. Here we use a feedforward neural network, which is suitable for classification, with the sigmoid function as the activation function.
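The classifier described above can be illustrated with a minimal sketch of a feedforward network with sigmoid activations. The layer sizes, learning rate, squared-error loss, backpropagation training loop, and the toy two-dimensional feature vectors are all assumptions made for illustration; the abstract does not specify the architecture or the training procedure used in the thesis.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, split by sign so math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def init_network(sizes, rng):
    """One weight matrix per layer; each row ends with a bias term."""
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(network, features):
    """Return the activations of every layer, starting with the input."""
    activations = [list(features)]
    for layer in network:
        prev = activations[-1]
        # zip stops at len(prev), so the trailing bias is added separately.
        activations.append(
            [sigmoid(sum(w * x for w, x in zip(row, prev)) + row[-1]) for row in layer]
        )
    return activations


def train_step(network, features, target, lr=0.5):
    """One backpropagation step on squared error for a single example."""
    acts = forward(network, features)
    # Output error term: (o - t) * sigmoid'(z), with sigmoid'(z) = o * (1 - o).
    deltas = [(o - t) * o * (1.0 - o) for o, t in zip(acts[-1], target)]
    for idx in range(len(network) - 1, -1, -1):
        layer, inputs = network[idx], acts[idx]
        back = []
        if idx > 0:
            # Propagate the error to the previous layer before updating weights.
            back = [
                sum(row[i] * d for row, d in zip(layer, deltas)) * x * (1.0 - x)
                for i, x in enumerate(inputs)
            ]
        for row, d in zip(layer, deltas):
            for i, x in enumerate(inputs):
                row[i] -= lr * d * x
            row[-1] -= lr * d
        deltas = back


def predict(network, features):
    """Output-layer activations for one feature vector."""
    return forward(network, features)[-1]


def mean_squared_error(network, data):
    """Average squared error over (features, target) pairs."""
    total = 0.0
    for features, target in data:
        out = predict(network, features)
        total += sum((o - t) ** 2 for o, t in zip(out, target))
    return total / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up feature vectors with labels 0 or 1, for illustration only.
    data = [([0.1, 0.2], [0.0]), ([0.2, 0.1], [0.0]),
            ([0.9, 0.8], [1.0]), ([0.8, 0.9], [1.0])]
    net = init_network([2, 3, 1], rng)
    for _ in range(2000):
        for features, target in data:
            train_step(net, features, target)
    print("training error:", mean_squared_error(net, data))
```

In a real system the inputs would be the wavelet-based features extracted from speech, and the output layer would have one unit per emotion class rather than the single unit used here.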
The detection of human emotion from speech is achieved by training the neural network with features extracted from the speech. This requires reliable features, so background noise must first be removed from the speech. Background noise can be removed with filters; here the wavelet transform is the filtering technique used to remove the background noise and enhance the required features in the speech.
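As a rough illustration of wavelet-based noise removal and feature extraction, the sketch below decomposes a signal with a discrete wavelet transform, shrinks the detail coefficients, reconstructs the signal, and computes one energy value per sub-band as a candidate feature. The Haar wavelet, soft thresholding, the universal threshold rule, and the sub-band energy features are assumptions chosen for simplicity; the abstract does not state which wavelet family, thresholding rule, or feature definitions the thesis uses.

```python
import math
import random
import statistics

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """One DWT level: (approximation, detail), each half the input length."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert haar_step by interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def wavedec(signal, levels):
    """Multi-level decomposition, ordered [cA_n, cD_n, ..., cD_1]."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def waverec(coeffs):
    """Rebuild the signal from the output of wavedec."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_inverse_step(approx, detail)
    return approx


def soft_threshold(values, thr):
    """Shrink each coefficient toward zero by thr, zeroing the small ones."""
    return [math.copysign(max(abs(v) - thr, 0.0), v) for v in values]


def denoise(signal, levels):
    """Threshold the detail bands and reconstruct the signal."""
    coeffs = wavedec(signal, levels)
    # Noise level estimated from the finest detail band (median absolute value).
    sigma = statistics.median(abs(v) for v in coeffs[-1]) / 0.6745
    thr = sigma * math.sqrt(2.0 * math.log(len(signal)))
    cleaned = [coeffs[0]] + [soft_threshold(band, thr) for band in coeffs[1:]]
    return waverec(cleaned)


def band_energies(coeffs):
    """Energy (sum of squares) of each sub-band, usable as a feature vector."""
    return [sum(v * v for v in band) for band in coeffs]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 256
    # Synthetic stand-in for speech: a slow sine wave plus Gaussian noise.
    clean = [math.sin(2.0 * math.pi * 2.0 * i / n) for i in range(n)]
    noisy = [c + rng.gauss(0.0, 0.3) for c in clean]
    restored = denoise(noisy, 3)
    print("features:", band_energies(wavedec(restored, 3)))
```

Because the Haar transform used here is orthonormal, the sub-band energies add up to the total energy of the signal, so the feature vector describes how that energy is distributed across frequency bands.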

Place, publisher, year, edition, pages
2020. p. 67
Series
Blekinge Tekniska Högskola Forskningsrapport, ISSN 1103-1581
Keywords [en]
Neural Network, Activation Function, Fast Fourier Transform, Karhunen-Loeve Transform, speech enhancement, filtering, Wavelet Transform, Speech preprocessing, signal to noise ratio, shallow neural network
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:bth-19610
OAI: oai:DiVA.org:bth-19610
DiVA, id: diva2:1437250
Subject / course
ET2566 Master's Thesis (120 credits) in Electrical Engineering with emphasis on Signal processing
Educational program
ETASX Master of Science Programme in Electrical Engineering with emphasis on Signal Processing
Presentation
2020-02-28, Blekinge Institute of Technology, Karlskrona, 07:45 (English)
Supervisors
Examiners
Available from: 2020-06-09 Created: 2020-06-09 Last updated: 2020-06-09. Bibliographically approved

Open Access in DiVA

fulltext (2474 kB), 202 downloads
File information
File name: FULLTEXT01.pdf
File size: 2474 kB
Checksum (SHA-512): c48f7875137d91819275a164cd7718eb0de6fbba6967adc6675c7ffb7cd3592ab665d2a8ec5f5d19fbdf799194a94f592dbb48f5615cfb0bcd9d703bdd25a437
Type: fulltext
Mimetype: application/pdf

Detection of Human Emotion from Noise Speech (2815 kB), 120 downloads
File information
File name: FULLTEXT02.pdf
File size: 2815 kB
Checksum (SHA-512): 4a2f8f4f4ab6deae88bc7fde26ca8ac63713694bb4efb06c35b92d1475495d99681e4f0b265963ca988122b279fbd0c61ad1647260478b0074190c2562471d9c
Type: fulltext
Mimetype: application/pdf