Posts

Showing posts from August 4, 2024

[Day 216] Pipelines for XGBoost and CatBoost training, and using the models in the real-time inference pipeline

Hello :) Today is Day 216!

A quick summary of today:

- created pipelines to train XGBoost and CatBoost models
- added the XGBoost and CatBoost models' predictions to the real-time inference pipeline
- updated the model dictionary UI
- created the project README

The project submission deadline is the 11th of Aug, and after that we will make the repo public, but until then I can only share pictures from the project.

All pipelines, including today's:

Setting up the model training pipelines was easy; the tricky part was using the models. The models require OHE (one-hot encoded) data, i.e. dummy variables, so the approach I ended up with was this: from the development notebook, I took all the columns used in the model training data, to emulate an OHE dataset, and created a dict with key: col, value: 0. When a message gets processed, the code goes over the data and updates the 0 to a 1 if a particular value is there. It is easier to understand with an example - if the transaction has category: entertainment, then in the