Showing posts from July 28, 2024

[Day 209] Using Mage for pipeline orchestration in the KB project

Hello :) Today is Day 209!

A quick summary of today: creating Mage pipelines for the KB AI competition project

The repo after today

This morning and afternoon I went on a bit of a roll ~ I set up all the above pipelines in Mage. Below I will go over each one.

get_kaggle_data

This is just one block that downloads the data for the project from Kaggle using the kaggle Python package.

load_batch_into_neo4j

Gets the data loaded by the get_kaggle_data pipeline and inserts it into Neo4j. This is the fraudTrain.csv from the Kaggle website (because the fraudTest.csv will be used for the pseudo-streaming pipeline).

train_gcn

I tried to split this into more blocks, but with the way the code is structured at the moment, the best option was to do it all at once. That is: create a torch-geometric dataset, create node and edge index, train the model, test it, and save summary info. At the moment, because everything is local, I am using MLflow just for easy comparison but later (after the project s