To create a custom voice font, we are going to use Microsoft's Custom Voice service. To get started, go to https://speech.microsoft.com/portal and click on Custom Voice. On the Custom Voice page, click on New project:
Then, after giving your project a name and description, it is time to upload some audio files for training. At the time of writing, the best voice system, Neural Voice, is in private preview, which means you will have to request access to use it. If you have access to the Neural Voice feature, you will need 1 hour of voice data. Otherwise, you can use the standard voice training system, which produces a slightly lower-fidelity voice font. You can provide it with as little as 1 hour of audio samples, but to achieve high quality you will need 8 hours of audio.
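Before uploading, it can help to check how much audio you actually have against the 1-hour and 8-hour figures above. The following is a minimal sketch, not part of the Custom Voice service: it assumes your recordings are uncompressed WAV files in a single folder, and the function names and threshold constants are our own, chosen for illustration.

```python
import wave
from pathlib import Path

# Thresholds taken from the text above (hours of audio).
MINIMUM_HOURS = 1.0        # Neural Voice, or the lowest usable standard voice
HIGH_QUALITY_HOURS = 8.0   # standard voice at high quality


def wav_duration_seconds(path):
    """Return the length of one WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_hours(folder):
    """Sum the duration of every .wav file directly inside folder, in hours."""
    seconds = sum(wav_duration_seconds(p) for p in sorted(Path(folder).glob("*.wav")))
    return seconds / 3600.0


def describe_coverage(hours):
    """Map a total duration onto the thresholds quoted in the text."""
    if hours >= HIGH_QUALITY_HOURS:
        return "enough for a high-quality standard voice"
    if hours >= MINIMUM_HOURS:
        return "enough for Neural Voice or a basic standard voice"
    return "below the 1-hour minimum"
```

For example, `describe_coverage(total_hours("recordings"))` reports which threshold a hypothetical `recordings` folder meets. Only the standard library `wave` module is used, so compressed formats such as MP3 would need converting to WAV first.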
After creating a new project, you will be in Microsoft Speech Studio. First, click on Data, and then Upload data. Then, select audio only, unless you have some pre-transcribed audio:
Then, upload...