[Day 199] Continuing with Build an LLM from scratch
Hello :) Today is Day 199! A quick summary of today: I saw that all the chapters of the book Build an LLM from scratch have now been published, so I decided to continue with it (after a few months of waiting).

I like that even though we are in chapter 5 (out of 7), the author still reminds us learners of the full process of going from input text to LLM-generated text, making sure we are all still on the same page.

The goal of this chapter is to train the model, because at the moment, when we try to generate some text, we get gibberish. Given the prompt "Every effort moves you", the untrained model continues it with "rentingetic wasnم refres RexMeCHicular stren" (the greedy decoding used for this is sketched in the first code block below).

By the way, the current (untrained) model has the following config (a rough version of it is in the second code block below).

How can the model learn? Its weights need to be updated so that they start to predict the target tokens. Here comes good ol' backpropagation, and it requires a loss function which calculates the difference between the desired and the actual output (i.e. how ...
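For context, generating text at this stage is plain greedy decoding: feed the current context through the model, take the highest-probability next token, append it, and repeat. The book has its own helper for this (generate_text_simple, if I remember the name correctly); the version below is just my rough paraphrase of the idea, not the book's exact code:

```python
import torch

def generate_greedy(model, idx, max_new_tokens, context_size):
    # idx: (batch, n_tokens) tensor of token IDs for the current context
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -context_size:]          # crop to the supported context length
        with torch.no_grad():
            logits = model(idx_cond)               # (batch, n_tokens, vocab_size)
        logits = logits[:, -1, :]                  # only the last position matters
        next_id = torch.argmax(logits, dim=-1, keepdim=True)  # greedy pick, (batch, 1)
        idx = torch.cat((idx, next_id), dim=1)     # append the new token and repeat
    return idx

# Usage sketch (assumes a GPT-style `model` and a tiktoken `tokenizer`):
# idx = torch.tensor(tokenizer.encode("Every effort moves you")).unsqueeze(0)
# out = generate_greedy(model, idx, max_new_tokens=6, context_size=256)
# print(tokenizer.decode(out.squeeze(0).tolist()))
```

With an untrained model this happily produces token salad like the output above, because the weights are still random.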
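As for the config mentioned above, here is roughly the 124M-parameter GPT setup the book uses at this point (values from memory, so treat them as my assumption rather than a quote):

```python
# Roughly the book's GPT_CONFIG_124M (values from memory, may differ slightly)
GPT_CONFIG_124M = {
    "vocab_size": 50257,     # size of the GPT-2 BPE tokenizer vocabulary
    "context_length": 256,   # max number of input tokens
    "emb_dim": 768,          # embedding / hidden dimension
    "n_heads": 12,           # attention heads per transformer block
    "n_layers": 12,          # number of transformer blocks
    "drop_rate": 0.1,        # dropout rate
    "qkv_bias": False,       # no bias terms in the query/key/value projections
}
```

If I recall correctly, the context_length is shortened from 1024 to 256 in this chapter to keep training feasible on modest hardware.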
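And a minimal sketch of the loss idea: the targets are the input tokens shifted one position to the left, and cross-entropy measures how much probability the model assigns to the correct next token at each position. The shapes assume a GPT-style model returning logits over the vocabulary; this is my illustration, not the book's exact functions:

```python
import torch.nn.functional as F

def next_token_loss(model, input_ids, target_ids):
    # input_ids:  (batch, n_tokens) token IDs fed to the model
    # target_ids: (batch, n_tokens) the same text shifted one token to the left
    logits = model(input_ids)            # (batch, n_tokens, vocab_size)
    loss = F.cross_entropy(
        logits.flatten(0, 1),            # (batch * n_tokens, vocab_size)
        target_ids.flatten(),            # (batch * n_tokens,)
    )
    return loss                          # lower loss = better next-token predictions
```

Backpropagation then uses the gradient of this loss to nudge the weights so that the correct next tokens get more probability.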