SVCC23_FastSVC

Singing Voice Conversion Challenge 2023 Starter Kit: FastSVC Reimplementation

Overview:

The Singing Voice Conversion Challenge 2023 Starter Kit includes two systems: FastSVC and SVCC B02. FastSVC is a reimplementation of "FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation", with a few changes from the original system. SVCC B02 is a decomposed version of FastSVC that reduces training time. The starter kit provides access to pretrained models, datasets, and code for implementation.

Features:

  • FastSVC: a reimplementation of "Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation", with the following changes from the original system:
      • Uses only Harvest from the PyWorld toolkit as the pitch extractor, instead of three different pitch extractors
      • Uses fixed speaker embeddings, extracted beforehand by averaging the embeddings of each speaker's utterances
      • Replaces the discriminator with the one used in HiFiGAN, as recommended by the authors
      • Changes the upsampling scales to [5, 4, 4, 2], because the PPG extractor has a hop size of 160 (5 × 4 × 4 × 2 = 160)
  • SVCC B02: a decomposed version of FastSVC that reduces training time
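
Two of the changes listed above come down to simple arithmetic, which the following minimal sketch illustrates. It is not code from the starter kit: the names `PPG_HOP_SIZE`, `UPSAMPLE_SCALES`, `total_upsampling`, and `average_embedding` are hypothetical, and the sketch assumes that the generator upsamples frame-level PPG features to waveform samples, so the product of the upsampling scales should equal the PPG hop size.

```python
from math import prod

# Values taken from the feature list above; the constant names are hypothetical.
PPG_HOP_SIZE = 160
UPSAMPLE_SCALES = [5, 4, 4, 2]


def total_upsampling(scales):
    """Return the overall upsampling factor of a stack of upsampling layers."""
    return prod(scales)


def average_embedding(utterance_embeddings):
    """Average per-utterance embedding vectors into one fixed speaker embedding.

    Assumes every embedding is a sequence of floats with the same length.
    """
    if not utterance_embeddings:
        raise ValueError("need at least one utterance embedding")
    n = len(utterance_embeddings)
    dim = len(utterance_embeddings[0])
    return [sum(vec[i] for vec in utterance_embeddings) / n for i in range(dim)]


# Assumption: one PPG frame should expand to exactly one hop of waveform samples.
assert total_upsampling(UPSAMPLE_SCALES) == PPG_HOP_SIZE
```

Under this assumption, a PPG extractor with a different hop size would need a different set of scales whose product matches it. The averaging helper uses plain Python lists for readability; a real pipeline would more likely operate on arrays or tensors.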

Summary:

The Singing Voice Conversion Challenge 2023 Starter Kit includes two singing voice conversion systems, FastSVC and SVCC B02. FastSVC offers fast cross-domain singing voice conversion with feature-wise linear modulation and includes specific changes from the original system. SVCC B02 is a decomposed version of FastSVC designed to reduce training time. The starter kit provides access to pretrained models, datasets, and code for implementation.