[Day 130] CS109: MAP, Naive Bayes, Logistic Regression
Hello :) Today is day 130! A quick summary of today: I covered Lecture 22: MAP, Lecture 23: Naive Bayes, and Lecture 24: Logistic Regression from Stanford's CS109. Later today, from 11pm to 3am (I believe), is the 3rd day of MLx Fundamentals, and I will share about those 2 sessions tomorrow.

Lecture 22: MAP

We saw a new method for optimization: gradient descent (or, for now, gradient ascent). We want to choose the theta that is the argmax of some function. Seeing the blue curve is nice, but what can we do if we cannot see it? Imagine we start with a random theta value: we look at the likelihood of that theta, we can compute the derivative at that point, and based on that we keep moving in the direction where the derivative is positive. When we reach the top, we don't actually know we are at the top, but when we check, the gradient is 0, so we assume we are at the top (a small sketch of this is below). In the real world, because we prefer to think in terms of minimizing a loss, we minimize the negative log likelihood instead. MLE is great, but it has ...
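To make the gradient-ascent picture concrete, here is a minimal sketch in Python (my own toy example, not from the lecture): it climbs the log-likelihood of the unknown mean theta of Gaussian data with known sigma = 1, stepping in the direction of the positive gradient and stopping once the gradient is (numerically) 0.

```python
import numpy as np

def log_likelihood(theta, data):
    # Gaussian log-likelihood with sigma = 1, dropping additive constants.
    return -0.5 * np.sum((data - theta) ** 2)

def gradient(theta, data):
    # Derivative of the log-likelihood above with respect to theta.
    return np.sum(data - theta)

rng = np.random.default_rng(0)
data = rng.normal(loc=2.5, scale=1.0, size=100)  # true mean is 2.5

theta = 0.0            # start from an arbitrary guess
learning_rate = 0.005  # small step size

for step in range(500):
    g = gradient(theta, data)
    theta += learning_rate * g   # ascend: follow the positive derivative
    if abs(g) < 1e-6:            # gradient ~ 0, so assume we are at the top
        break

print(f"estimated theta: {theta:.3f}  (sample mean: {data.mean():.3f})")
```

Flipping the sign of the objective and subtracting the gradient instead gives the "minimize the negative log likelihood" version, which is the gradient-descent framing we usually see in practice.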