[Day 212] Final Glaswegian TTS model
Hello :)
Today is Day 212!
A quick summary of today:
- creating a simple Glaswegian assistant app
Here is a link to the model on HuggingFace, along with its training results:
Now that we have the final 2-hour dataset, I was hoping for better results. Before, the generated audio had a slight accent but sounded robotic. The first thing I had to do was fix the HuggingFace Space where the previous version of the Glaswegian TTS was running. The issue was related to voice embeddings, and after a quick fix it was up again, and I loaded the latest glaswegian_tts model. Well now, it *does* sound better. There are still cases where it sounds robotic, but there is a definite improvement over the previous version, which was trained on around 30 minutes of audio compared to 2 hours now.
Next - create a full assistant app
Gradio and HF Spaces make it very easy -> here.
The pipeline: audio input -> transcribed using glaswegian_asr -> sent to gpt2 -> the answer from gpt2 is turned into speech and returned to the user
At the start I used gpt2, but as it's not that good, I switched to gpt-3.5-turbo.
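To make the flow concrete, here is a minimal sketch of how the three stages could be chained into a single function. It is not the actual code from the Space: `make_assistant`, `respond`, and the `fake_*` stand-ins are hypothetical names used only for illustration, and the real ASR, language model, and TTS calls would replace the stand-ins.

```python
from typing import Callable


def make_assistant(
    transcribe: Callable[[str], str],
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Callable[[str], bytes]:
    """Chain speech-to-text, a language model, and text-to-speech.

    All three stages are passed in as plain callables (an assumption of
    this sketch), so the language model can be swapped without touching
    the rest of the pipeline.
    """

    def respond(audio_path: str) -> bytes:
        # 1. Audio in -> text (glaswegian_asr would go here).
        text = transcribe(audio_path)
        # 2. Text -> reply; skip the model call if nothing was heard.
        if text.strip():
            reply = generate_reply(text)
        else:
            reply = "Sorry, I didn't catch that."
        # 3. Reply -> audio out (glaswegian_tts would go here).
        return synthesize(reply)

    return respond


# Hypothetical stand-ins so the sketch runs without any models.
def fake_transcribe(audio_path: str) -> str:
    return f"transcript of {audio_path}"


def fake_reply(text: str) -> str:
    return text.upper()


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


assistant = make_assistant(fake_transcribe, fake_reply, fake_synthesize)
result = assistant("clip.wav")
```

Because the language model is just one of the injected callables, moving from gpt2 to gpt-3.5-turbo only means passing a different `generate_reply`. In a Gradio app, a function like `respond` could be handed to `gr.Interface` as `fn`, with audio components for input and output, though the real synthesizer would need to return audio in a form Gradio accepts rather than the raw bytes used here.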
That is all for today!
See you tomorrow :)