Posts

Showing posts from February, 2024

[Day 59] Stanford CS224N (NLP with DL): Backprop and Dependency Parsing

Hello :) Today is Day 59! A quick summary of today: from the Stanford CS224N course, I covered:

Lecture 3: Backprop and Neural Networks
Lecture 4: Dependency Parsing

And below are my notes for both.

Lecture 3: Backprop
After Andrej Karpathy, backprop and I have a friend-friend relationship, so this felt like a nice overview of backprop in NNs (a small chain-rule sketch follows at the end of this post).

Lecture 4: Dependency Parsing
This one felt like a linguistics lesson: learning how people interpret language and how that logic was transferred to computers.

Tomorrow is RNNs' turn. Really exciting! That is all for today! See you tomorrow :)
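To make the chain-rule idea behind backprop a bit more concrete, here is a minimal sketch for a single sigmoid neuron with a squared loss. The toy values, the neuron, and the variable names are my own illustration, not taken from the lecture.

```python
import numpy as np

# Forward: z = w*x + b, a = sigmoid(z), L = (a - y)^2
# Backward: apply the chain rule to get dL/dw and dL/db.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 2.0, 1.0    # toy input and target
w, b = 0.5, -1.0   # toy parameters

# Forward pass
z = w * x + b
a = sigmoid(z)
L = (a - y) ** 2

# Backward pass: multiply local gradients along the path from L back to w and b
dL_da = 2 * (a - y)
da_dz = a * (1 - a)   # derivative of sigmoid
dL_dz = dL_da * da_dz
dL_dw = dL_dz * x     # dz/dw = x
dL_db = dL_dz * 1.0   # dz/db = 1

print(f"loss={L:.4f}, dL/dw={dL_dw:.4f}, dL/db={dL_db:.4f}")
```

Backprop in a full network is exactly this, repeated layer by layer and vectorised.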

[Day 58] Stanford CS224N (NLP with DL): Lecture 2 - Neural classifiers (diving deeper into word embeddings)

Hello :) Today is Day 58! A quick summary of today:

Covered Lecture 2 of Stanford's NLP with DL
Did assignment 1 on Google Colab, which covered some exercises on count-based and prediction-based methods (a small count-based sketch follows at the end of this post)
Read 6 papers about word embeddings and wrote down some basic summaries:

GloVe: Global Vectors for Word Representation
Improving Distributional Similarity with Lessons Learned from Word Embeddings
Evaluation methods for unsupervised word embeddings
A Latent Variable Model Approach to PMI-based Word Embeddings
Linear Algebraic Structure of Word Senses, with Applications to Polysemy
On the Dimensionality of Word Embedding

Firstly, I will share my notes from the lecture. Next are the short summaries of each paper. These were definitely interesting papers; I feel like I am learning the history of something big, and after a few lectures I will be in the present haha. These papers, which preceded the famous transformer and laid the groundwork for modern embeddings, were very interesting.
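As a rough illustration of the count-based side of the assignment, here is a small sketch that builds a word-word co-occurrence matrix over a made-up toy corpus and reduces it with SVD. The corpus, window size, and dimensionality are assumptions for illustration, not the assignment's actual data.

```python
import numpy as np

corpus = [
    "i like deep learning",
    "i like nlp",
    "i enjoy flying",
]
window = 1  # how many neighbours on each side count as co-occurring

# Build the vocabulary and an index for each word
tokens = [sent.split() for sent in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within the window
M = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                M[idx[w], idx[sent[j]]] += 1

# Reduce to k dimensions with SVD; scaled left singular vectors act as word vectors
U, S, Vt = np.linalg.svd(M)
k = 2
embeddings = U[:, :k] * S[:k]

for w in vocab:
    print(w, embeddings[idx[w]].round(2))
```

Prediction-based methods like word2vec and GloVe instead learn the vectors by optimising an objective, which is what the papers above dig into.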

[Day 57] Stanford CS224N - Lecture 1. Word vectors

Hello :) Today is Day 57! Quick summary of today:

Got introduced to word vectors with lecture 1 + a small showcase
Read the papers and took notes about:

Efficient Estimation of Word Representations in Vector Space (word2vec)
Distributed Representations of Words and Phrases and their Compositionality (negative sampling)

First I watched the lecture by Professor Manning, and then I read the papers, so some of the material overlapped, but there were still interesting new parts in each of the three.

1) Lecture notes
My notes from the lecture were not that long, so I will share them first. After the lecture, there was a small Colab document I could run through. In the lecture, we saw that the larger the dot product between the vectors of two words, the more similar the words are -> exp(u_o^T v_c), the numerator of the word2vec softmax, grows with that dot product. So in the Colab a model was loaded, and I searched for the word embeddings of bread and baguette, which in my view are similar words (a small sketch of this kind of check follows at the end of this post). In the below pic, if a 2 values from bo…
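Here is a sketch of the kind of comparison described above: loading pretrained word vectors with gensim and checking how close bread and baguette are. The specific model name is an assumption for illustration, not necessarily the one loaded in the Colab.

```python
import gensim.downloader as api

# Load pretrained GloVe vectors (assumed model name; the download is a few hundred MB)
model = api.load("glove-wiki-gigaword-100")

# Cosine similarity between two words I expect to be close
print(model.similarity("bread", "baguette"))

# Nearest neighbours of "bread" in the embedding space
print(model.most_similar("bread", topn=5))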