Showing posts from July 23, 2024

[Day 204] Transaction data EDA + MLflow & minIO docker setup

Hello :) Today is Day 204!

A quick summary of today: some EDA on the KB AI competition data, and setting up mlflow and minIO.

First, some basic cleaning and EDA on my part of the data for the Kukmin Bank project. These are the variables assigned to me: trans_date_trans_time, cc_num, merchant, category, amt, first, last, gender, street, city, state. There are no missing/null values.

[Figure: Box plot of log(amount) in Not Fraud vs Fraud]

[Figure: Distribution of Not Fraud vs Fraud]

[Figures: Other graphs]

Interesting ~ the state of Delaware has only Fraud transactions (of course this is not real data, but interesting nonetheless).

We should also do some basic transformation and cleanup before the raw data goes to the db. In my columns, an example case: every merchant name starts with fraud_, so removing that prefix would be fine, just for a bit more clarity.

On another note ~ mlflow and minIO

I found this website (in Korean but can be translated) that provides an easy "plug-n-play" Dockerfile and docker-compose seriv
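
Going back to the merchant cleanup idea from the EDA part: here is a rough sketch of what that step could look like before rows go to the db. It only uses the standard library, and everything named here (clean_merchant, clean_row, the log_amt field, the sample row) is made up for illustration, not taken from the actual project code. It also assumes the log in the box plot is the natural log and that amt is always positive.

```python
import math

# Prefix that the post says every merchant name starts with.
FRAUD_PREFIX = "fraud_"


def clean_merchant(name: str) -> str:
    """Drop a leading 'fraud_' prefix; names without it are returned unchanged."""
    name = name.strip()
    if name.startswith(FRAUD_PREFIX):
        return name[len(FRAUD_PREFIX):]
    return name


def clean_row(row: dict) -> dict:
    """Return a cleaned copy of one raw transaction row (hypothetical helper)."""
    cleaned = dict(row)
    cleaned["merchant"] = clean_merchant(row["merchant"])
    # Assumption: natural log of the amount, and amt > 0 for every row.
    cleaned["log_amt"] = math.log(float(row["amt"]))
    return cleaned


# Made-up sample row, only to show the shape of the data.
sample = {"merchant": "fraud_Example Shop", "amt": "100.0"}
result = clean_row(sample)
print(result["merchant"])
```

Checking for the prefix first (instead of blindly cutting the first six characters) keeps the step safe if a merchant name without fraud_ ever shows up.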