Posts

Showing posts from June 20, 2024

[Day 170] Uber data engineering project using GCP and Mage

Hello :) Today is Day 170!

A quick summary of today: decided to give mage.ai another go, this time alongside GCP.

Well... I obviously did not suffer enough with the countless problems I experienced when I was first learning about mage from DataTalksClub's MLOps zoomcamp, so I decided to do a cool-looking data engineering project [youtube]. The caveat is that I will use GCP, and I just hope I do not incur any major costs. I checked, and I have 21 days left and plenty of free credits.

Getting to the project

It uses the infamous NYC taxi dataset. Using lucid, I learned a bit about data dimension modelling. Then, using Python, some basic preprocessing was done on the raw data to convert it into the top 8 tables. I actually put everything so far on my github, and plan on writing nice readme documentation once everything is finished. Even though I started it after work today, I did not finish it because of, *again*, mage problems.

Before I get to the mage problems, a bit about GCP.