Blind Subband Beamforming With Time-Delay Constraints for Moving Source Speech Enhancement
2007 (English). In: IEEE Transactions on Audio, Speech, and Language Processing, ISSN 1558-7916, E-ISSN 1558-7924, Vol. 15, no 8, p. 2360-2372. Article in journal (Refereed), Published
Abstract [en]

A new robust microphone array method to enhance speech signals generated by a moving person in a noisy environment is presented. This blind approach is based on a two-stage scheme. First, a subband time-delay estimation method is used to localize the dominant speech source. The second stage performs speech enhancement, based on the acquired spatial information, by means of a soft-constrained subband beamformer. The novelty of the proposed method lies in treating the spatial spreading of the sound source as equivalent to a time-delay spreading, thus allowing the estimated intersensor time delays to be used directly in the beamforming operations. In comparison to previous approaches, this new method requires no special array geometry, knowledge of the array manifold, or acquisition of calibration data to adapt the array weights. Furthermore, such a scheme allows the beamformer to adapt efficiently to speaker movement. The robustness of the time-delay estimation of speech signals in high noise levels is improved by making use of the non-Gaussian nature of speech through a subband kurtosis-weighted structure. Evaluation in a real environment with a moving speaker shows promising results, with suppression levels of up to 16 dB for background noise and interfering (speech) signals, accompanied by relatively little speech distortion.
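
The two-stage idea in the abstract (estimate intersensor time delays, then use them directly to steer the array) can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: it works full-band rather than in subbands, uses plain cross-correlation instead of a kurtosis-weighted estimator, and replaces the soft-constrained beamformer with a basic delay-and-sum. The function names, the synthetic signals, and every parameter value are assumptions made for illustration only.

```python
import math
import random


def estimate_delay(ref, sig, max_lag):
    """Return the integer lag (in samples) at which sig best matches ref.

    A positive result means sig is a delayed copy of ref.
    Plain cross-correlation; an illustrative stand-in, not the paper's estimator.
    """
    best_lag, best_score = 0, -math.inf
    n = len(ref)
    for lag in range(-max_lag, max_lag + 1):
        # Correlate ref[i] with sig[i + lag] over the overlapping region.
        lo = max(0, -lag)
        hi = min(n, len(sig) - lag)
        score = sum(ref[i] * sig[i + lag] for i in range(lo, hi))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(channels, delays):
    """Align each channel by its estimated delay and average the aligned samples."""
    n = min(len(ch) for ch in channels)
    out = []
    for i in range(n):
        samples = [ch[i + d] for ch, d in zip(channels, delays)
                   if 0 <= i + d < len(ch)]
        out.append(sum(samples) / len(samples) if samples else 0.0)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic two-microphone scene (all values are assumptions):
# the second microphone receives the source 5 samples later.
random.seed(0)
N, TRUE_DELAY = 400, 5
source = [random.gauss(0.0, 1.0) for _ in range(N)]
mic0 = [s + random.gauss(0.0, 0.3) for s in source]
mic1 = [(source[i - TRUE_DELAY] if i >= TRUE_DELAY else 0.0)
        + random.gauss(0.0, 0.3) for i in range(N)]

# Stage 1: estimate the intersensor delay. Stage 2: steer with it.
delays = [0, estimate_delay(mic0, mic1, max_lag=20)]
enhanced = delay_and_sum([mic0, mic1], delays)
```

Because the estimated delay feeds the alignment step directly, no array geometry or calibration data enters the sketch, which is the property the abstract highlights. Re-estimating the delay on successive short frames would be one simple way to follow a moving speaker.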

Place, publisher, year, edition, pages
IEEE, 2007. Vol. 15, no 8, p. 2360-2372
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:bth-8479
DOI: 10.1109/TASL.2007.903309
Local ID: oai:bth.se:forskinfo978C238BCD0BA042C12574A500544EE6
OAI: oai:DiVA.org:bth-8479
DiVA, id: diva2:836203
Available from: 2012-09-18 Created: 2008-08-14 Last updated: 2017-12-04 Bibliographically approved

Open Access in DiVA

fulltext (1200 kB), 383 downloads
File information
File name: FULLTEXT01.pdf
File size: 1200 kB
Checksum (SHA-512): c2be519f3b4017cecd218e906fa80b0ed4f6909d9071b4a816e66b398d3c6ad9c5ded6c1ce8b33e482f84825b47c0177dbdb0626646191c7ba97ee76c325d7be
Type: fulltext
Mimetype: application/pdf

Authority records

Claesson, Ingvar
