Stochastic Restoration of Heavily Compressed Musical Audio using GANs

2023-12-18 15:44:34

Quote

We introduce a Generative Adversarial Network (GAN) architecture for the restoration of MP3-encoded musical audio signals. We train different stochastic and deterministic generators on MP3s with different compression rates.
Using these models, we investigate if
1. restorations of the models considerably improve the MP3 versions,
2. we can systematically pick samples among the outputs of the stochastic generators which are closer to the original than those of the deterministic generators, and
3. the stochastic generators generally output higher-quality restorations than the deterministic generators.
To that end, we perform an extensive evaluation of the different experiment setups utilizing objective metrics and listening tests. We find that the models are successful in points 1 and 2, but the random outputs of the stochastic generators are approximately on a par with those of the deterministic models in overall quality, i.e., they do not improve it (point 3).
The proposed GAN architecture is based on dilated convolutions with skip connections, combined with a novel concept which we call Frequency Aggregation Filters. These are convolutional filters spanning the whole frequency range, which contribute to the stability of the training and constitute a consequent take on the problem of non-local correlations in the frequency spectrum. We also find that using so-called self-gating considerably reduces the memory requirement of the architecture by halving the number of input maps to each layer without degradation of the results. In order to prevent mode collapse, we propose a regularization that enforces a correlation between differences in the noise input and differences in the model output. As opposed to most other works (but in line with few other approaches using GANs and U-Net-based architectures), we input (and output) directly the (non-linearly scaled) complex-valued spectrum to the generator, eliminating the need to deal with phase information separately.
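The abstract describes the mode-collapse regularization only in words and does not give its formula. As a rough illustration of the idea, and not the authors' actual loss, the sketch below assumes one possible reading: for a batch of noise vectors and the corresponding generator outputs, compute all pairwise distances and penalize a low Pearson correlation between the two sets of distances. The function names (`l1_distance`, `pearson`, `diversity_penalty`) and the choice of L1 distance and Pearson correlation are assumptions made for this example.

```python
import math


def l1_distance(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def diversity_penalty(noises, outputs):
    """Hypothetical regularizer: 1 - corr(pairwise noise dists, pairwise output dists).

    Close to 0 when output differences track noise differences,
    1 when the outputs ignore the noise entirely (collapsed generator).
    Needs at least three samples so the distances can vary.
    """
    noise_dists, output_dists = [], []
    for i in range(len(noises)):
        for j in range(i + 1, len(noises)):
            noise_dists.append(l1_distance(noises[i], noises[j]))
            output_dists.append(l1_distance(outputs[i], outputs[j]))
    return 1.0 - pearson(noise_dists, output_dists)


if __name__ == "__main__":
    # Toy data standing in for noise inputs and flattened generator outputs.
    noises = [[0.0, 0.0], [1.0, 0.0], [3.0, 1.0]]
    collapsed = [[0.5, 0.5]] * 3                          # output ignores the noise
    responsive = [[2 * v for v in z] for z in noises]     # output follows the noise
    print(diversity_penalty(noises, collapsed))
    print(diversity_penalty(noises, responsive))
```

In an actual training loop a term like this would be computed on tensors in a differentiable framework and added to the generator loss with some weight; the pure-Python version here only shows the quantity being measured.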

Unofficial implementation: https://github.com/abreuwallace/Stochastic-Restoration-GAN

Notice