ARMAS: Active Reconstruction of Missing Audio Segments
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science.
2021 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Background: Audio signal reconstruction using machine/deep learning algorithms has been explored increasingly in recent years and has many applications in digital signal processing. Many research works address audio reconstruction with linear interpolation, phase coding, or tone insertion techniques combined with AI models. However, there is no research work on reconstructing audio signals by fusing Steganoflage (an adaptive approach to image steganography) with AI models. This thesis therefore focuses on audio reconstruction that combines Steganoflage and AI models.

Objectives: This thesis aims to explore whether fusing the Steganoflage technique with machine/deep learning models can enhance audio reconstruction. Furthermore, the suitable models implemented with Steganoflage are analyzed and compared using performance metrics.

Methods: We conducted a systematic literature review (SLR) followed by an experiment to answer our research questions. The models implemented in the thesis were selected based on the results of the SLR. In the experiments, we fused RF (Random Forest), SVR (Support Vector Regression), and LSTM (Long Short-Term Memory) models with Steganoflage to explore possible enhancement of the reconstruction of lost audio signals. The models were then trained to estimate approximate reconstructions of the lost signals. Finally, we observed the performance of the models and compared the reconstructed audio signals with the original (ground-truth) signals using four performance metrics: Pearson linear correlation, PSNR, WPSNR, and SSIM.
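
As an illustration of two of the metrics named above, the following is a minimal sketch of Pearson linear correlation and PSNR between a ground-truth signal and a reconstruction. It is not the thesis's implementation: the function names `pearson` and `psnr` are hypothetical, the signals are assumed to be equal-length sequences of floats, and the `peak` value of 1.0 assumes samples normalised to the range [-1, 1]. WPSNR and SSIM are omitted because the abstract does not state how they were adapted to audio.

```python
import math


def pearson(reference, estimate):
    """Pearson linear correlation coefficient between two equal-length signals."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_e = sum(estimate) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    if var_r == 0 or var_e == 0:
        # Correlation is undefined for a constant signal.
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_r * var_e)


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum amplitude."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Toy example: a sine wave and a slightly attenuated copy standing in for a reconstruction.
original = [math.sin(2 * math.pi * k / 50) for k in range(200)]
reconstructed = [0.9 * s for s in original]
print(pearson(original, reconstructed), psnr(original, reconstructed))
```

Note that the attenuated copy correlates almost perfectly with the original while its PSNR stays finite, which is one reason to report a correlation measure alongside an error-based one.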

Results: The results from the SLR show that, among machine learning models, RF and SVR were the most commonly used for signal reconstruction and work well with time-series data. Among deep learning models, the recurrent neural network LSTM was the first choice, as the literature survey showed that the model is suitable for time-series forecasting. In the experiments, we found that the LSTM model performed better than the RF and SVR models. Moreover, reconstruction was better for a single short dropped region than for multiple dropped regions.
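
To make the distinction between a single dropped region and multiple dropped regions concrete, here is a small sketch of how such gaps could be simulated and then filled. It does not reproduce the thesis's models: the fill step is plain linear interpolation, one of the classical techniques mentioned in the background, used only as a stand-in. The helper names `drop_regions` and `linear_fill`, and the half-open `(start, stop)` region convention, are assumptions made for this illustration.

```python
def drop_regions(signal, regions):
    """Return a copy of `signal` with each half-open [start, stop) region set to None."""
    damaged = list(signal)
    for start, stop in regions:
        for i in range(start, stop):
            damaged[i] = None
    return damaged


def linear_fill(damaged):
    """Fill every run of None by interpolating linearly between its known neighbours."""
    filled = list(damaged)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        # Find the end of the current gap (first known sample after it).
        j = i
        while j < n and filled[j] is None:
            j += 1
        left = filled[i - 1] if i > 0 else None
        right = filled[j] if j < n else None
        if left is None and right is None:
            raise ValueError("signal has no known samples")
        # At the signal edges, hold the single available neighbour constant.
        if left is None:
            left = right
        if right is None:
            right = left
        steps = j - i + 1
        for k in range(i, j):
            t = (k - i + 1) / steps
            filled[k] = left + t * (right - left)
        i = j
    return filled


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
single_gap = drop_regions(signal, [(2, 4)])          # one dropped region
multi_gap = drop_regions(signal, [(1, 2), (4, 5)])   # two dropped regions
print(linear_fill(single_gap))
print(linear_fill(multi_gap))
```

On a linear ramp like this toy signal, interpolation recovers the samples exactly; real audio is not linear across a gap, which is the motivation for learned models such as those compared in the thesis.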

Conclusions: We conclude that Steganoflage, when fused with machine/deep learning models, enhances the reconstruction of lost audio signals. We also conclude that the LSTM model is more accurate than the RF and SVR models in reconstructing lost audio signals for a single drop region, on both short and long gaps. However, we observed that audio reconstruction for multiple drops needs improvement when the gaps are long. Further improvements can be made by exploring newer AI methods and optimization techniques to enhance the reconstructed audio signals.

Place, publisher, year, edition, pages
2021, p. 44
Keywords [en]
Audio Reconstruction, Audio Inpainting, Deep Learning, Machine Learning, Audio Steganography
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-22278
OAI: oai:DiVA.org:bth-22278
DiVA, id: diva2:1608536
Subject / course
DV2572 Master's Thesis in Computer Science
Educational program
DVACS Master of Science Programme in Computer Science
Presentation
2021-09-28, 11:00 (English)
Supervisors
Examiners
Available from: 2021-11-12 Created: 2021-11-03 Last updated: 2021-11-12 Bibliographically approved

Open Access in DiVA

ARMAS: Active Reconstruction of Missing Audio Segments (3173 kB), 777 downloads
File information
File name: FULLTEXT02.pdf
File size: 3173 kB
Checksum SHA-512: c7025ccd1fb5daa880015042fc2e92c7eda5b6f11673198a3ee99a6f87f565c48462e6d86c7a17f736eb1877d42dd82824a63934af1cf4c73aa1ea70243416de
Type: fulltext
Mimetype: application/pdf
