Self-Remixing


Official implementation of Self-Remixing

Overview:

Self-Remixing is an unsupervised framework for single-channel sound separation: instead of relying on isolated source references, a model's own separated outputs are shuffled and remixed into pseudo-mixtures, which the model is then trained to separate again. The framework supports both training from scratch and fine-tuning pre-trained separation models, making it applicable to a range of separation setups.
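The remixing step described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes separated estimates come as a `(batch, n_src, samples)` array and builds pseudo-mixtures by permuting each source slot independently across the batch and summing.

```python
import numpy as np

def remix(sources: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Shuffle each separated source across the batch, then sum the
    shuffled sources into new pseudo-mixtures.

    sources: (batch, n_src, samples) -- a teacher's separated estimates
    for a batch of input mixtures (shapes are illustrative assumptions).
    """
    batch, n_src, _ = sources.shape
    remixed = np.empty_like(sources)
    for s in range(n_src):
        # independent permutation per source slot
        perm = rng.permutation(batch)
        remixed[:, s] = sources[perm, s]
    # pseudo-mixtures for the model to separate again
    return remixed.sum(axis=1)

rng = np.random.default_rng(0)
est = rng.standard_normal((4, 2, 16000))  # 4 mixtures, 2 sources, 1 s @ 16 kHz
pseudo_mix = remix(est, rng)
print(pseudo_mix.shape)  # (4, 16000)
```

Because each permutation only rearranges sources within a slot, the total signal energy entering the pseudo-mixtures is conserved across the batch.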

Features:

  • Unsupervised Sound Separation: Trains separation models from mixtures alone, with no labeled (isolated-source) training data.
  • Flexible Training Options: Supports both training from scratch and fine-tuning existing pre-trained models.
  • Multiple Algorithms: Implements several unsupervised methods, including Mixture Invariant Training (MixIT) and RemixIT, so users can choose the best fit for their use case.
  • Supported Datasets: Works with public datasets including SMS-WSJ and the Free Universal Sound Separation (FUSS) dataset.
  • Efficient Logging: Integrates with Weights & Biases for logging and performance tracking during training.
  • Evaluation Metrics: Provides speech quality metrics and word error rate (WER) evaluation using Whisper Large v2.
  • Open Source: Released under the MIT License; parts of the code are adapted from ESPnet and Asteroid.
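To make the MixIT option above concrete: MixIT trains on a mixture of two mixtures and searches over all binary assignments of the separator's outputs to the two reference mixtures, keeping the assignment with the lowest reconstruction error. The sketch below uses a plain MSE criterion for brevity (MixIT implementations typically use SNR-based losses); it is an illustration, not the repository's loss code.

```python
import itertools
import numpy as np

def mixit_loss(est_sources: np.ndarray, mix1: np.ndarray, mix2: np.ndarray) -> float:
    """MixIT-style loss (MSE variant for illustration): try every binary
    assignment of estimated sources to the two reference mixtures and
    keep the assignment with the smallest total reconstruction error.

    est_sources: (n_src, samples); mix1, mix2: (samples,)
    """
    n_src = est_sources.shape[0]
    best = np.inf
    # enumerate all 2^n_src assignments of sources to mixtures
    for assign in itertools.product([0, 1], repeat=n_src):
        a = np.asarray(assign)
        # sum the sources assigned to each mixture (an empty set sums to 0)
        sum1 = est_sources[a == 0].sum(axis=0)
        sum2 = est_sources[a == 1].sum(axis=0)
        err = np.mean((sum1 - mix1) ** 2) + np.mean((sum2 - mix2) ** 2)
        best = min(best, err)
    return float(best)

# sanity check: estimates that exactly partition the two mixtures give zero loss
rng = np.random.default_rng(1)
srcs = rng.standard_normal((4, 100))
print(mixit_loss(srcs, srcs[:2].sum(axis=0), srcs[2:].sum(axis=0)))  # 0.0
```

The exhaustive search is exponential in the number of output sources, which is acceptable for the small source counts typical in MixIT setups.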
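For the WER evaluation mentioned above, Whisper supplies the hypothesis transcripts; WER itself is the word-level edit distance divided by the reference length. A minimal self-contained sketch of that metric (not the repository's evaluation code, which may apply text normalization first):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.
    Assumes a non-empty reference; no text normalization is applied."""
    r, h = ref.split(), hyp.split()
    # dp[j] = edit distance between the first i reference words and h[:j]
    dp = list(range(len(h) + 1))
    for i in range(1, len(r) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(h) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                          # deletion
                        dp[j - 1] + 1,                      # insertion
                        prev + (r[i - 1] != h[j - 1]))      # substitution
            prev = cur
    return dp[-1] / len(r)

print(wer("the cat sat on the mat", "the cat sat on the mat"))  # 0.0
print(round(wer("the cat sat", "the bat sat"), 3))  # one substitution: 0.333
```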