
[Day 76] Finishing the Retrieval-based LM talk, and learning about distillation, quantization and pruning

Hello :) Today is Day 76!

Finished sections 6 and 7, on multilingual retrieval-based LMs and on retrieval-based LMs' challenges and opportunities (ACL 2023), and covered lecture 11 of CMU 11-711 Advanced NLP - Distillation, Quantization, and Pruning.

Section 6: Multilingual retrieval-based LMs
Section 7: Challenges and opportunities

Lecture 11: Distillation, Quantization, and Pruning

Problem: the best models for NLP tasks are massive. So how can we cheaply, efficiently, and equitably deploy NLP systems with only a small cost to performance?
Answer: model compression

Quantization - keep the model the same but reduce the number of bits
Pruning - remove parts of the model while retaining performance
Distillation - train a smaller model to imitate the larger model

Quantization - no parameters are changed; they are just stored at up to k bits of precision. Amongst other methods, we can use post-training quantization, and we can even binarize the parameters and activations.

Pruning - a number of parameters are set to zero, and the rest are unchanged. There is magnitude pruning (Han et al., 2015; ...
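To make the quantization idea concrete, here is a minimal sketch of symmetric post-training int8 quantization of a single weight tensor. This is my own toy example, not code from the lecture; the single per-tensor scale is the simplest possible scheme.

import torch

def quantize_int8(w: torch.Tensor):
    # Symmetric post-training quantization: map the largest |weight| to 127,
    # then round everything onto the int8 grid. No parameters are retrained.
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    # Recover an approximation of the original float weights for computation.
    return q.float() * scale

w = torch.randn(4, 4)                            # pretend this is a trained weight matrix
q, scale = quantize_int8(w)
print((w - dequantize(q, scale)).abs().max())    # small but non-zero rounding error

Binarization is the extreme case of the same idea: each weight is reduced to its sign (a single bit) plus one shared scaling factor.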
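Magnitude pruning is similarly small to sketch: zero out the weights with the smallest absolute values and leave the rest untouched. Again a toy, unstructured version; the 50% sparsity level is just an example.

import torch

def magnitude_prune(w: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    # Set the `sparsity` fraction of smallest-magnitude weights to zero;
    # the surviving weights keep their original values.
    k = int(sparsity * w.numel())
    if k == 0:
        return w.clone()
    threshold = w.abs().flatten().kthvalue(k).values
    mask = (w.abs() > threshold).float()
    return w * mask

w = torch.randn(4, 4)
pruned = magnitude_prune(w, sparsity=0.5)
print((pruned == 0).float().mean())   # roughly half of the entries are now exactly 0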
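For distillation, "train a smaller model to imitate the larger model" usually means training the student on the teacher's softened output distribution in addition to the hard labels. A common Hinton-style loss looks roughly like this; the temperature and mixing weight are illustrative, not values from the lecture.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft part: KL divergence between the student's and the teacher's
    # temperature-softened distributions (scaled by T^2 to keep gradient
    # magnitudes comparable across temperatures).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard part: the usual cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(3, 5, requires_grad=True)   # toy batch of 3 examples, 5 classes
teacher = torch.randn(3, 5)
labels = torch.tensor([0, 2, 4])
print(distillation_loss(student, teacher, labels))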

[Day 75] Retrieval-based LMs training and applications

Hello :) Today is Day 75!

A quick summary of today:
Covered Section 4 (Retrieval-based LMs: Training) of the ACL 2023 talk
Covered Section 5 (Retrieval-based LMs: Applications)
Found out about NVIDIA GTC 2024, which is next week, and registered for some of the events

Firstly, as for the NVIDIA GTC courses that I registered for over the course of the conference: there is also a paid workshop on Building Transformer-Based Natural Language Processing Applications, but it is sold out, and even though I have joined the waiting list, my hopes are not high haha.

As for my notes on sections 4 and 5 of the retrieval-based LM talk:
Section 4: Retrieval-based LMs: Training
Section 5: Retrieval-based LMs: Applications

There are ~30 minutes of the talk left; I skimmed through them and they are not that long, but I will cover them tomorrow :)

That is all for today! See you tomorrow :)