4.4 KiB
How to SQL rollup
Introduction
Do you need to compute metric aggregations at different granularities and you’re tired to create tons of code just to get a Global category?
Introducing SQL rollup!
Instead of doing this:
with booking_count_per_state as (
select
booking_state,
count(1) as booking_count
from
intermediate.int_core__bookings
group by
1
),
global_bookings as (
select
'Global' as booking_state,
sum(booking_count) as booking_count
from
booking_count_per_state
group by
1
)
select *
from global_bookings
union all
select *
from booking_count_per_state
To get:
You can use ROLLUP to simplify the code:
select
coalesce(booking_state,'Global') as booking_state,
count(1) as booking_count
from
intermediate.int_core__bookings
group by
rollup(booking_state) -- NOT EQUIVALENT to group by rollup(1)!!!
and get the same results (note we didn’t order by any field):
Be aware that the use of numeric wildcards such as group by rollup(1) is not advised. Please read the next section to understand more.
Step-by-step guide (to avoid common mistakes)
By default, when rolling up, SQL will create a category with the aggregated value with a null name. If we follow the example with Bookings, we would have:
select
booking_state,
count(1) as booking_count
from
intermediate.int_core__bookings
group by
rollup(1) -- equivalent to group by rollup(booking_state)
That’s why in this case we used COALESCE to specify that the resulting value should be called Global, since it counts all Bookings.
However, we should be very, very conscious of null values. Let’s go for another example, in this case, on guest payments. Let’s count the total amount of payments per verification payment type. To start with, we can do:
-- SIMPLE QUERY
select
verification_payment_type,
count(1) as total_payments
from
intermediate.int_core__verification_payments icvp
group by
1 -- equivalent to group by verification_payment_type
order by
total_payments desc
And we see we have 2k rows that do not have any verification payment type set. Since we know that rollup will create another category with NULL, first things first, we should tackle these verification payments that do not have a type set. Intuitively, we can name this category as Unknown:
-- QUERY SETTING UNKNOWN TO NULL VALUES
select
coalesce(verification_payment_type, 'Unknown') as verification_payment_type,
count(1) as total_payments
from
intermediate.int_core__verification_payments icvp
group by
1 -- equivalent to group by coalesce(verification_payment_type, 'Unknown')
Now, if adding the rollup, we should get a new Null category with the total amount of payments:
-- QUERY SETTING UNKNOWN TO NULL VALUES WITH ROLLUP (1)
select
coalesce(verification_payment_type, 'Unknown') as verification_payment_type,
count(1) as total_payments
from
intermediate.int_core__verification_payments icvp
group by
rollup(1) -- equivalent to group by rollup(coalesce(verification_payment_type, 'Unknown'))
Note that we’re using group by rollup(1). This is equivalent of doing group by rollup(coalesce(verification_payment_type, 'Unknown'))
Cool but… how can we now re-name the resulting Null value to Global?
- We need to add another coalesce in the select clause,
- But we need to keep the previous aggregation
group by rollup(coalesce(verification_payment_type, 'Unknown'))
In other words:
-- QUERY SETTING UNKNOWN TO NULL VALUES WITH WELL CREATED ROLLUP
select
coalesce(coalesce(verification_payment_type, 'Unknown'),'Global') as verification_payment_type,
count(1) as total_payments
from
intermediate.int_core__verification_payments icvp
group by
rollup(coalesce(verification_payment_type, 'Unknown')) -- NOT EQUIVALENT to group by rollup(1)!!!
Et voilà!
The key aspect here is understanding that the field that we’re showing in the select clause is different from the field we’re using to conduct the group by rollup. Therefore, it’s extremely recommended to explicitly specify the group by rollup condition instead of using the numeric wildcards.
Additional Links
https://neon.tech/postgresql/postgresql-tutorial/postgresql-rollup






