[Day 189] I finished the Car Insurance Fraud MLOps project. Thank you MLOps zoomcamp for teaching me so much!

 Hello :)
Today is Day 189!


A quick summary of today:
  • completed the project and wrote project description



The whole project and all info is on my repo

Here is a project diagram I created using lucid.app


Well ~ today I added some final things to my project. 


First I added terraform code to create GCP buckets for mlflow articats and raw data, and also start a VM.

The last one is kind of cool because when we start the VM, mlflow starts automatically as well because the code to start mlflow is included in the `metadata_startup_script` which is:

I also added the whole terraform init, plan and apply setup in the project's Makefile for easy set and start up. So now the make help looks like:


Next, I added pylint

Following the suggestions from MLOps zoomcamp, I added pylint and fixed a lot of whitespace, and import order lines of code as per pylint's suggestions. 

Next, I tried to add some tests using pytest

I added 5 simple tests to some of the prefect flows. 
The exact tests are:
  • test_read_new_data
  • test_predict
  • test_save_predictions
  • test_batch_model_predict
  • test_local_to_gcs
But there is work to be done for more thorough testing. One thing Mage.ai was nice for is that in every block of code, at the end of it there was a dedicated test place so adding tests was very easy (which is not the case for Prefect)

Next, I added python documentation

I used sphinx back in my placement year with Lloyds Banking Group, and I have been adding small docstrings to my code, so adding a sphinx python docs seemed like a nice little addition. I easily set it up, but I could not get github actions to automate this, so I used an alternative:

Use a new branch where the _build folder (normally not uploaded to github because it has many files) is uploaded and netlify watches that branch and updates the docs

Here is the link to the netlify hosted python docs for the project.


Finally, I added some basic git pre-commit hooks

  • trailing-whitespace
  • end-of-file-fixer
  • check-yaml
  • check-added-large-files
  • pytest-check
  • pylint

These are ran and checked before every commit, and the commit is stopped if one fails.



After this, I started putting things together and writing the README of the project, and creating the project diagram (top of this blog).

There are things to improve, but nevertheless DataTalksClub - THANK YOU SO MUCH! MLOps zoomcamp is such an amazing course, and it taught me amazing tools to build amazing things.

From now on, I will continue learning the ongoing LLM zoomcamp - hopefully I will build an amazing project from there as well!


That is all for today!

See you tomorrow :)

Popular posts from this blog

[Day 198] Transactions Data Streaming Pipeline Porject [v1 completed]

[미리 공부] 기초 통계 복습 (Day 1는 1월2일)

[Day 61] Stanford CS224N (NLP with DL): Machine translation, seq2seq + a side CDCGAN mini project