Showing posts from March 14, 2024

[Day 72] Carnegie Mellon University - Advanced NLP Spring 2024 - assignment 1

Hello :) Today is Day 72! A quick summary of today: I found 11-711 Advanced NLP by CMU, and I am amazed. The course is still ongoing; they upload the lecture videos, and all the information is on the course website. I had a look at the syllabus to see what the first few lectures cover. Given that I have covered CS224N by Stanford, I felt confident, so I skimmed over the lectures just to check for any new info and then decided to jump into assignment 1. WOW! Assignment 1 felt great: it was challenging and made me read through research papers (such as the ones on RMS layer norm and Adam) to implement a Llama model from scratch! I uploaded all my assignment code to this GitHub repo, but the main files that needed to be implemented are classifier.py, optimizer.py, rope.py, and llama.py. Below I will go over how I implemented each file (and my struggles along the way). I am not that familiar with Llama, but this was a good exercise to get more familiar with the model by jumping right into the action.