[Day 61] Stanford CS224N (NLP with DL): Machine translation, seq2seq + a side CDCGAN mini project
Hello :)
Today is Day 61!
A quick summary of today:
- Covered Lecture 7: machine translation, seq2seq, attention from Stanford CS224N
I will first cover the GAN story, and then share my notes from the lecture.
So... while watching and taking notes today, I started thinking: what if I could use my notes as training data for a model, and afterwards, whenever I want, give it raw string text and have it output that text in the format of my notes (in my handwriting)?
Well, I started looking around, and the first model architecture that came to mind was the GAN (specifically the conditional GAN). I remembered there was a GAN variant where, alongside the pictures, we can feed in the labels, and then generate on demand for a chosen label. In retrospect there are of course other options, but I decided to go with a GAN.
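To make the "conditional" part concrete, here is a minimal sketch of the idea as I understand it (the names and layer sizes here are mine, not from any particular repo): embed the class label and concatenate it with the noise vector, so at inference time you can ask the generator for a specific class.

```python
import torch
import torch.nn as nn

# Minimal sketch of a conditional generator (my own naming/sizes):
# embed the class label, concatenate it with the noise vector, and the
# generator learns to produce images of that class.
class ConditionalGenerator(nn.Module):
    def __init__(self, noise_dim=100, num_classes=10, embed_dim=10, img_dim=28 * 28):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, embed_dim)
        self.net = nn.Sequential(
            nn.Linear(noise_dim + embed_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),  # outputs in [-1, 1], matching normalized images
        )

    def forward(self, noise, labels):
        # the conditioning step: the label embedding rides along with the noise
        x = torch.cat([noise, self.label_embed(labels)], dim=1)
        return self.net(x)

# on-demand generation: "draw me a 7"
gen = ConditionalGenerator()
fake = gen(torch.randn(1, 100), torch.tensor([7]))  # shape (1, 784)
```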
For maybe 2 hours I busted my head trying to make a simple model with the EMNIST dataset (English characters), and I kept getting weird input size issues. After a bit I read online that the number of classes for EMNIST from PyTorch is a bit weird.
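For reference, this is roughly how I'd poke at it with torchvision: `EMNIST` requires a `split` argument, and each split has a different number of classes, which I assume is the "weird" part that bit me.

```python
from torchvision import datasets

# Sketch: print the class count per EMNIST split (this downloads the
# dataset archive). The counts are something to verify yourself, not a
# guarantee from me.
for split in ["byclass", "bymerge", "balanced", "letters", "digits", "mnist"]:
    ds = datasets.EMNIST(root="data", split=split, download=True)
    print(split, "->", len(ds.classes), "classes")

# Gotcha I believe I hit (hedging here): in the "letters" split the
# integer labels start at 1 rather than 0, so naive num_classes
# assumptions can break layer sizes.
```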
What is more, I saw this image with handwritten text, generated with a conditional deep conv GAN (repo link). My mind was hooked on the idea -> I want to do it too. But first I wanted to do it with just numbers, which is a bit simpler (and spared me the struggle of loading the EMNIST dataset).

Thanks to Aladdin Persson's YouTube channel, I got a refresher on how the GAN architecture works, put together a simple model, and started training in my Google Colab. After a few hours of adjusting params to optimize training and lower the loss, and also running inference to produce new number images, I had a working model. I was so happy. It felt soo long: first the EMNIST problems, and then just getting a conditional DCGAN to work.

Aaaand... my free Colab GPU time ran out in the middle of a run, and all was lost because I had not saved any weights :/ I did not even take a screenshot of the generated output numbers I got - and they were amazing, human-like.
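The lesson, in code form: a minimal sketch of the checkpointing I should have had from the start. `train_one_epoch` and the checkpoint path are placeholders for whatever training step and storage you use; on Colab, pointing the path at a mounted Google Drive folder means the weights survive the runtime dying.

```python
import torch

# Sketch of per-epoch checkpointing; `train_one_epoch` is a placeholder
# for whatever GAN training step you already have.
def train_with_checkpoints(gen, disc, opt_gen, opt_disc, loader,
                           num_epochs, train_one_epoch,
                           ckpt_path="cdcgan_ckpt.pt"):
    # On Colab, point ckpt_path at mounted Drive
    # (e.g. "/content/drive/MyDrive/cdcgan_ckpt.pt") so it outlives the runtime.
    for epoch in range(num_epochs):
        train_one_epoch(gen, disc, opt_gen, opt_disc, loader)
        torch.save({
            "epoch": epoch,
            "gen": gen.state_dict(),
            "disc": disc.state_dict(),
            "opt_gen": opt_gen.state_dict(),
            "opt_disc": opt_disc.state_dict(),
        }, ckpt_path)

def resume(gen, disc, opt_gen, opt_disc, ckpt_path="cdcgan_ckpt.pt"):
    # Restore models and optimizers after a dead runtime.
    ckpt = torch.load(ckpt_path)
    gen.load_state_dict(ckpt["gen"])
    disc.load_state_dict(ckpt["disc"])
    opt_gen.load_state_dict(ckpt["opt_gen"])
    opt_disc.load_state_dict(ckpt["opt_disc"])
    return ckpt["epoch"] + 1  # next epoch to run
```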