Pablo Martin 2023-10-31 17:22:51 +01:00
parent 2b6c385b8c
commit 7480222cc7
7 changed files with 49 additions and 13 deletions

@@ -0,0 +1,27 @@
+{{
+  config(
+    materialized = 'table'
+  )
+}}
+WITH fact_reviews AS (
+  SELECT *
+  FROM
+    {{ ref('fact_reviews') }}
+),
+full_moon_dates AS (
+  SELECT *
+  FROM
+    {{ ref('seed_full_moon_dates')}}
+)
+SELECT
+  fr.*,
+  CASE
+    WHEN fm.full_moon_date IS NULL THEN 'not full moon'
+    ELSE 'full moon'
+  END AS is_full_moon
+FROM
+  fact_reviews fr
+  LEFT JOIN full_moon_dates fm
+    ON (fr.review_date::date) = (fm.full_moon_date + interval '1' day)

@@ -0,0 +1,12 @@
+version: 2
+sources:
+  - name: airbnb
+    schema: raw
+    tables:
+      - name: listings
+        identifier: raw_listings
+      - name: hosts
+        identifier: raw_hosts
+      - name: reviews
+        identifier: raw_reviews

@@ -1,6 +1,6 @@
 WITH raw_hosts AS (
     SELECT *
-    FROM raw.raw_hosts
+    FROM {{ source ('airbnb', 'hosts')}}
 )
 SELECT
     id as host_id,

@@ -1,6 +1,6 @@
 WITH raw_listings AS (
     SELECT *
-    FROM raw.raw_listings
+    FROM {{ source ('airbnb', 'listings')}}
 )
 SELECT
     id AS listing_id,

@@ -1,6 +1,6 @@
 WITH raw_reviews AS (
     SELECT *
-    FROM raw.raw_reviews
+    FROM {{ source ('airbnb', 'reviews')}}
 )
 SELECT
     listing_id,

@@ -42,4 +42,9 @@ WHERE
Bear in mind that the strategy for determining what should be loaded is up to the engineer. Any SQL can be placed within the `if is_incremental()` block. In the example above, a date field signals the most recent date the table has already seen.
##
+## Sources and seeds
+Seeds are local files that you upload to a DWH from dbt. You place them as CSVs in the `seeds` folder.
+Sources are an abstraction layer on top of the input tables. They are not strictly necessary, but can help make the project more structured. To create sources, you create a `sources.yml` file and place it in the `models` dir.

@@ -106,11 +106,3 @@ dbt makes sense nowadays because the modern data stack makes transformations wit
- `dbt_project.yml`: header of the project, with stuff like versioning, the default profile for the project, the paths to different folders, etc.
This is a pic of the data flow we are going to build: ![img.png](../images/dataflow_overview.png)
-## Sources and seeds
-Seeds are local files that you upload to a DWH from dbt. You place them as CSVs in the `seeds` folder.
-Sources are an abstraction layer on top of the input tables.