We'd like to create an audio style transfer solution to convert our spoken word audio files into a different voice that's cloned from a voice actor.
The deliverables would be the following:
1. Training set of words and phrases that the voice actor should speak and record in order to train a model (training input)
2. Model training using the input from #1 above
3. A way to invoke that model with spoken word audio as the input and receive the same spoken word audio in the cloned voice as the output
* We will need to be able to create numerous models for different voices. This is not a one-time training and output exercise.
* The quality of the audio output needs to be very good, so you need experience not only with deep learning but also with applying deep learning to audio style transfer.
This will be an initial engagement to create the solution, followed by ongoing work to improve and maintain the system.