Showing posts from June 24, 2024

[Day 175] Learning about and using dbt cloud

Hello :) Today is Day 175!

A quick summary of today: basic SQL in BigQuery; continued Module 4 (analytics engineering and dbt) from the data eng camp and learned about dbt.

There were some bits from Module 3 that I had not finished: using BigQuery to create a partitioned and clustered table and seeing the impact that has on the amount of data processed. Using the infamous NYC taxi dataset, I ran some simple queries to create external tables in BigQuery and could see the effect of partitioning and clustering.

First, create a non-partitioned and a partitioned table in BigQuery. The dataset is organized by date, so we partition on one of the datetime columns. A simple WHERE filter between two dates processes significantly less data on the partitioned table than on the non-partitioned one.

I faced an issue here: after creating the partitioned table, the table details section in BigQuery said 0 partitions. I was really confused because the data I was partitionin