[Day 52] Learning more about transformers with Andrej Karpathy
Hello :) Today is Day 52! A quick summary of today: GPT from scratch with Andrej Karpathy, and re-watching his other tutorials.

Firstly, I will talk about what I learned from the GPT from scratch tutorial. Andrej Karpathy's videos introduced me to PyTorch for the first time. In the first video I saw, which was about building a neural network from scratch, I kinda got what was happening, but as he went deeper and deeper and started writing code in a PyTorch-like way (building a neural network at a lower level than I was used to from TensorFlow), I felt like I got slapped in the face hahaha. TF feels much higher level than PyTorch, and at first it was weird to have to write the whole dataset creation, model, training, eval, etc. by myself. But with some practice it grew on me, and now PyTorch feels more comfortable than TF. It feels more 'down-to-earth' ('down-to-python') haha.

So, on Day 46 I really tried to understand and learn transformers through KAIST's Professor Choi a...