[Day 214] The evaluation of my MLOps zoomcamp project arrived - max points

 Hello :)
Today is Day 214!


A quick summary of today:
  • documenting the KB project data using arrow.app
  • small repo changes
  • ML zoomcamp announcement


I tried to find some graph documentation tool for the KB project. The most popular I could find is this arrows.app application where we can create sample nodes with edges between and add attributes with descriptions. I created the below pic using that app


Also I made some small changes to the repo:

the readme I added:

  • Docker - used to containarise and self-host the below services
  • Mlflow - used for easy model comparison during development. Can be improved by using cloud services (like GCP) for hosting, database and artifact store
  • Mage - used for pipeline orchestration of the model training and real-time inference pipelines. Can be improved by hosting mage on the cloud
  • Neo4j - used as a Graph database to store transaction data as nodes and edges. Can be improved by using Neo4j's AuraDB (hosted on the cloud)
  • Kafka - used to ensure real-time transaction data processing
  • Grafana - used for real-time dashboard creating and monitoring. Can be improved by using Grafana's cloud services
  • Streamlit - used to host a model dictionary showing models, evaluation metrics, and feature importance graphs

Terraform can be used to provide Infrastructure as Code (IaC) for services that can be hosted on the cloud

I also removed .env from gitignore just so that reproducers can clearly see the env variables used. At the moment there is no actual secret variables so it is fine. 


A task I added to my TODOs is to figure out how to setup the neo4j db connection on grafana startup. I saw I could use a grafana.ini file, but I will figure out this tomorrow.


On another note, today the evaluation of my MLOps zoomcamp project came back.

I got 32, which I believe is the max. The points are given for: describing the problem, usage of cloud services, mlflow and exp. tracking, orchestration, model deployment, monitoring, reproducability and best practices. 

The certificate is not issued yet, but I will share it when it is sent. 


Today I saw that Alexey Grigorev, the creator of the DataTalksClub, announced the next cohort for their ML zoomcamp. It will cover the below:


I might sign up to see if I can learn something new, and keep my skills sharp ^^ 


That is all for today!

See you tomorrow :)

Popular posts from this blog

[Day 198] Transactions Data Streaming Pipeline Porject [v1 completed]

[Day 107] Transforming natural language to charts

[Day 54] I became a backprop ninja! (woohoo)