[Day 211] 2 hour mark !!! Glaswegian dataset goal - accomplished! + whisper-small fine-tuned

 Hello :)
Today is Day 211!


A quick summary of today:
  • fine-tuning the final whisper-small model for Glaswegian  automatic speech recognition (ASR)


After yesterday, it the audio was 118 and I was not happy that we did not *officially* hit the 2 hour mark, so today I went on youtube and got a 3min clip from Limmy which I transcribed and added to our dataset. Here is a link to the final 2 hour Glaswegian dataset on HuggingFace.

After a few months of working on this side project we hit the set goal!~

After I got the dataset I started fine-tuning whisper-small. It took around 4 hours ~ and the fine-tuned model is ready and deployed for using on HuggingFace Spaces

Looking at the metrics, training could be improved. 

As for what is next - training a Text-To-Speech model using our final dataset. This was the hard part before because the voice at the end was, while understandable, a bit robotic. So this is the next step and hopefully we get better results now. 


That is all for today!

See you tomorrow :)

Popular posts from this blog

[Day 198] Transactions Data Streaming Pipeline Porject [v1 completed]

[Day 107] Transforming natural language to charts

[Day 54] I became a backprop ninja! (woohoo)