Pablo Martin 2023-10-31 17:22:51 +01:00
parent 2b6c385b8c
commit 7480222cc7
7 changed files with 49 additions and 13 deletions

@ -0,0 +1,27 @@
{{
  config(
    materialized = 'table'
  )
}}
WITH fact_reviews AS (
    SELECT *
    FROM
        {{ ref('fact_reviews') }}
),
full_moon_dates AS (
    SELECT *
    FROM
        {{ ref('seed_full_moon_dates')}}
)
SELECT
    fr.*,
    CASE
        WHEN fm.full_moon_date IS NULL THEN 'not full moon'
        ELSE 'full moon'
    END AS is_full_moon
FROM
    fact_reviews fr
    LEFT JOIN full_moon_dates fm
    ON (fr.review_date::date) = (fm.full_moon_date + interval '1' day)

@ -0,0 +1,12 @@
version: 2
sources:
  - name: airbnb
    schema: raw
    tables:
      - name: listings
        identifier: raw_listings
      - name: hosts
        identifier: raw_hosts
      - name: reviews
        identifier: raw_reviews

@ -1,6 +1,6 @@
 WITH raw_hosts AS (
     SELECT *
-    FROM raw.raw_hosts
+    FROM {{ source ('airbnb', 'hosts')}}
 )
 SELECT
     id as host_id,

@ -1,6 +1,6 @@
 WITH raw_listings AS (
     SELECT *
-    FROM raw.raw_listings
+    FROM {{ source ('airbnb', 'listings')}}
 )
 SELECT
     id AS listing_id,

@ -1,6 +1,6 @@
 WITH raw_reviews AS (
     SELECT *
-    FROM raw.raw_reviews
+    FROM {{ source ('airbnb', 'reviews')}}
 )
 SELECT
     listing_id,

@ -42,4 +42,9 @@ WHERE
 Bear in mind that how to define the strategy to determine what should be loaded is up to the engineer. Any SQL can be placed within the `if is_incremental()` block. In the example above, we have a date field that easily signals what's the most recent date the table has currently seen.
-##
+## Sources and seeds
+Seeds are local files that you upload to a DWH from dbt. You place them as CSVs in the `seeds` folder.
+Sources are an abstraction layer on top of the input tables. They are not strictly necessary, but can help make the project more structured. To create sources, you create a `sources.yml` file and place it in the `models` dir.
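
As a hedged illustration of the two concepts in the note above, the sketch below shows how a model might reference a source and a seed. It assumes the `airbnb` source and the `seed_full_moon_dates` seed that appear elsewhere in this commit; the two file names in the comments are hypothetical, and each query would live in its own model file. Seeds are loaded into the warehouse with the `dbt seed` command before the models that reference them are run.

```sql
-- models/example_from_source.sql (hypothetical file name)
-- source() takes the source name and the table name declared in sources.yml.
SELECT *
FROM {{ source('airbnb', 'reviews') }}

-- models/example_from_seed.sql (hypothetical file name)
-- A seed is referenced with ref(), using the CSV file name without its extension.
SELECT *
FROM {{ ref('seed_full_moon_dates') }}
```

Calling `source()` instead of hard-coding `raw.raw_reviews` keeps the physical table name in one place, so a rename only needs a change in `sources.yml`.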

@ -105,12 +105,4 @@ dbt makes sense nowadays because the modern data stack makes transformations wit
 - `dbt_project.yml`: header of the project, with stuff like versioning, the default profile for the project, the paths to different folders, etc.
 This is a pic of the data flow we are going to build: ![img.png](../images/dataflow_overview.png)
-## Sources and seeds
-Seeds are local files that you upload to a DWH from dbt. You place them as CSVs in the `seeds` folder.
-Sources are an abstraction layer on top of the input tables.