Overview:
Self-Remixing is an unsupervised framework for single-channel sound separation that trains models by separating mixtures and remixing the estimated sources. It supports both training from scratch and fine-tuning pre-trained separation models, which makes it applicable to a wide range of settings.
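The separate-then-remix cycle mentioned above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's implementation: the function name, the `(batch, source, time)` array layout, and the per-slot shuffling strategy are all assumptions.

```python
import numpy as np


def shuffle_and_remix(sources, rng):
    """Illustrative remixing step: shuffle separated sources across
    the batch and sum them into pseudo-mixtures.

    sources: (B, N, T) array -- B mixtures, each separated into N sources.
    Returns the (B, T) pseudo-mixtures and the per-slot permutations,
    which a training loop would need in order to undo the shuffle.
    """
    B, N, T = sources.shape
    # One independent permutation of the batch per source slot.
    perms = np.stack([rng.permutation(B) for _ in range(N)])  # (N, B)
    # Reorder each source slot across the batch, then re-stack.
    shuffled = np.stack([sources[perms[n], n] for n in range(N)], axis=1)
    # Summing the shuffled sources yields new pseudo-mixtures.
    pseudo_mixtures = shuffled.sum(axis=1)
    return pseudo_mixtures, perms
```

In a training loop, a second separator would process `pseudo_mixtures`, and its outputs would be un-shuffled with the inverse permutations so the original mixtures can be reconstructed as a training signal.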
Features:
- Unsupervised Sound Separation: Trains separation models by separating and remixing audio mixtures, with no need for labeled (isolated-source) training data.
- Flexible Training Options: Supports training from scratch as well as fine-tuning existing pre-trained models, giving users more control over their audio processing needs.
- Multiple Algorithms: Implements related unsupervised methods such as Mixture Invariant Training (MixIT) and RemixIT alongside Self-Remixing, so users can choose the best fit for their use case.
- Supported Datasets: Compatible with public datasets including SMS-WSJ and the Free Universal Sound Separation (FUSS) dataset, making experimentation and training straightforward.
- Efficient Logging: Integrates with Weights & Biases for logging and performance tracking during training.
- Evaluation Metrics: Provides speech quality metrics and word error rate (WER) evaluation using Whisper Large v2.
- Open Source: Released under the MIT License, making it freely available for use and modification; builds on code from the ESPnet and Asteroid repositories.
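To make the MixIT feature above concrete, the mixture-invariant idea can be sketched in a few lines: the separator receives a sum of two mixtures, and the loss is the best reconstruction over all ways of assigning estimated sources back to the two mixtures. This is a hedged sketch with a hypothetical function name, using a plain MSE in place of the SNR-style losses separation systems typically use.

```python
import itertools

import numpy as np


def mixit_loss(est_sources, mixtures):
    """Minimal mixture-invariant loss (MSE variant, for illustration).

    est_sources: (M, T) array of M estimated sources for the summed input.
    mixtures:    (2, T) array of the two original mixtures.
    Tries every assignment of sources to the two mixtures and returns
    the lowest reconstruction error.
    """
    num_sources = est_sources.shape[0]
    best = np.inf
    # Each assignment sends every estimated source to mixture 0 or 1.
    for assign in itertools.product([0, 1], repeat=num_sources):
        recon = np.zeros_like(mixtures)
        for src, mix_idx in zip(est_sources, assign):
            recon[mix_idx] += src
        best = min(best, np.mean((recon - mixtures) ** 2))
    return best
```

With a perfect separation of the summed input, some assignment reconstructs both mixtures exactly and the loss reaches zero; the exhaustive search over assignments is what makes the loss invariant to which mixture each source came from.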