Far-Field Wiener Beamforming and Source Localization in Frequency Domain
Blekinge Institute of Technology, School of Engineering.
2012 (English). Independent thesis, Advanced level (degree of Master (Two Years)). Student thesis.
Abstract [en]

In present-day conference environments where video recording is required, a set of cameras operated by a person is needed to track the active speaker during the discussion. To automate this procedure, different methods have been developed in acoustic and visual tracking. In this thesis, a robust speaker tracking system is developed based on two methods, Steered Response Power PHase Alignment Transform (SRP-PHAT) and Steered Response Kurtosis PHase Alignment Transform (SRK-PHAT), which compute the likelihood of each source position using generalized cross-correlation estimates between each pair of microphones. In hands-free speech applications in a smart room environment, the speech source is located at a distance from the microphones, and noise and reverberation strongly affect the estimate of the source location. The accuracy of the SRP-PHAT and SRK-PHAT methods in estimating the source location is limited by the time resolution of the PHAT-weighted function.
SRP-PHAT and SRK-PHAT have been implemented using a 2-element microphone array and a 4-element microphone array. To compare the methods in detail, their performance has been analyzed for 64, 128, and 256 subbands in a weighted overlap-add (WOLA) filter bank. The estimated time differences of arrival (TDOAs) and directions of arrival (DOAs) of SRP-PHAT and SRK-PHAT are compared with the original values to determine the best method for estimating the speech source location. The mean estimation error and standard deviation are calculated to determine the accuracy of the estimated TDOAs.
Wiener beamforming is implemented for removing noise and reverberation in a room environment using a 2-element microphone array. The performance of the method is analyzed using Signal-to-Noise Ratio (SNR) and Perceptual Evaluation of Speech Quality (PESQ). To improve the results obtained, a dereverberation procedure is also included in the Wiener beamforming method, and the improvement in PESQ values is discussed in Chapter 4. The performance of the Wiener beamforming method is tested for brown noise, babble noise, fan noise, and white noise.
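
As an illustration of the generalized cross-correlation step that both SRP-PHAT and SRK-PHAT build on, the following is a minimal full-band sketch in Python for a single microphone pair. It is not the thesis implementation: the thesis works in WOLA subbands, whereas this sketch uses one FFT over the whole signal. The function names `gcc_phat_tdoa` and `tdoa_to_doa`, the speed of sound of 343 m/s, and the far-field model tau = d sin(theta) / c (angle measured from broadside) are assumptions made for illustration only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def gcc_phat_tdoa(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1 (seconds) with GCC-PHAT.

    A positive result means x2 lags x1.
    """
    n = len(x1) + len(x2)  # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum with PHAT weighting: keep the phase only.
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically possible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


def tdoa_to_doa(tau, mic_distance, c=SPEED_OF_SOUND):
    """Convert a TDOA to a far-field DOA in degrees from broadside.

    Assumes a plane wave and a two-microphone array: tau = d*sin(theta)/c.
    """
    sin_theta = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))


if __name__ == "__main__":
    fs = 16000            # sampling rate in Hz (assumed)
    mic_distance = 0.2    # microphone spacing in metres (assumed)
    delay_samples = 5     # synthetic ground-truth delay

    rng = np.random.default_rng(0)
    x1 = rng.standard_normal(4096)
    x2 = np.concatenate((np.zeros(delay_samples), x1[:-delay_samples]))

    tau = gcc_phat_tdoa(x1, x2, fs, max_tau=mic_distance / SPEED_OF_SOUND)
    print("estimated TDOA (s):", tau)
    print("estimated DOA (deg):", tdoa_to_doa(tau, mic_distance))
```

Because the lag is read off the sample grid, the resolution of the estimate is one sample period, which is the time-resolution limit on accuracy that the abstract refers to. A steered-response method would evaluate such pairwise correlations over a grid of candidate source positions rather than taking a single peak.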

Place, publisher, year, edition, pages
2012. p. 80
Keywords [en]
Wiener Beamformer, SRP-Phat, SRK-Phat, Reverberation
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:bth-2491
Local ID: oai:bth.se:arkivexEE17C96451FE5401C1257A220080260D
OAI: oai:DiVA.org:bth-2491
DiVA, id: diva2:829770
Uppsok
Technology
Supervisors
Available from: 2015-04-22
Created: 2012-06-20
Last updated: 2015-06-30
Bibliographically approved

Open Access in DiVA

fulltext (2939 kB), 545 downloads
File information
File name: FULLTEXT01.pdf
File size: 2939 kB
Checksum (SHA-512): cb808aeb49250bda291aeab6f2b1374e05b7dfc65f69138b5660c32f13f313346becefc7ce00a247acad714d34204551f2af3672d984347b3822123320de1c38
Type: fulltext
Mimetype: application/pdf

Total: 546 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 220 hits