Posts

Showing posts from July 29, 2024

[Day 210] 118 minutes of Glaswegian accent audio clips

Hello :) Today is Day 210!

A quick summary of today: final preprocessing of audio clips to reach our audio dataset mark.

Final dataset for the Glaswegian voice assistant AI ( link to HuggingFace ). Today I preprocessed the final audio from 2 of Limmy's YouTube videos (Limmy accidentally kills the city and The writer of Saw called Limmy a ...).

Just an update on how the process goes now ~ Since our transcription AI is pretty good (according to my Glaswegian-speaking project partner), we pass the full raw audio to our fine-tuned Whisper model hosted on HuggingFace Spaces. Then the transcript is put into a docs file, where I first check it over for obvious mistakes and flag anything odd that I cannot make out even after re-listening to the audio. I then split it into sensible (small) bits while listening to the audio, like: (this is the start from Limmy accidentally kills the city) Then, using an audio tool, I cut the full audio into clips according to the cut text, then I match c