sh-notion/notion_data_team_no_files/How to SQL rollup 1600446ff9c9806c9fdac2ff48e48bd3.md
Pablo Martin a256b48b01 pages
2025-07-11 16:15:17 +02:00

4.4 KiB
Raw Permalink Blame History

How to SQL rollup

Introduction

Do you need to compute metric aggregations at different granularities and youre tired to create tons of code just to get a Global category?

Introducing SQL rollup!

Instead of doing this:

with booking_count_per_state as (
select 
	booking_state, 
	count(1) as booking_count 
from 
	intermediate.int_core__bookings
group by 
	1
),
global_bookings as (
select 
	'Global' as booking_state, 
	sum(booking_count) as booking_count
from 
	booking_count_per_state
group by 
	1
)
select * 
from global_bookings
union all
select * 
from booking_count_per_state

To get:

image.png

You can use ROLLUP to simplify the code:

select 
	coalesce(booking_state,'Global') as booking_state, 
	count(1) as booking_count 
from 
	intermediate.int_core__bookings
group by 
	rollup(booking_state) -- NOT EQUIVALENT to group by rollup(1)!!!

and get the same results (note we didnt order by any field):

image.png

Be aware that the use of numeric wildcards such as group by rollup(1) is not advised. Please read the next section to understand more.

Step-by-step guide (to avoid common mistakes)

By default, when rolling up, SQL will create a category with the aggregated value with a null name. If we follow the example with Bookings, we would have:

select 
	booking_state, 
	count(1) as booking_count 
from 
	intermediate.int_core__bookings
group by 
	rollup(1) -- equivalent to group by rollup(booking_state)

image.png

Thats why in this case we used COALESCE to specify that the resulting value should be called Global, since it counts all Bookings.

However, we should be very, very conscious of null values. Lets go for another example, in this case, on guest payments. Lets count the total amount of payments per verification payment type. To start with, we can do:

-- SIMPLE QUERY
select 
	verification_payment_type, 
	count(1) as total_payments
from 
	intermediate.int_core__verification_payments icvp 
group by 
	1 -- equivalent to group by verification_payment_type
order by 
	total_payments desc 

image.png

And we see we have 2k rows that do not have any verification payment type set. Since we know that rollup will create another category with NULL, first things first, we should tackle these verification payments that do not have a type set. Intuitively, we can name this category as Unknown:

-- QUERY SETTING UNKNOWN TO NULL VALUES
select 
	coalesce(verification_payment_type, 'Unknown') as verification_payment_type,
	count(1) as total_payments
from 
	intermediate.int_core__verification_payments icvp 
group by 
	1 -- equivalent to group by coalesce(verification_payment_type, 'Unknown')

image.png

Now, if adding the rollup, we should get a new Null category with the total amount of payments:

-- QUERY SETTING UNKNOWN TO NULL VALUES WITH ROLLUP (1)
select 
	coalesce(verification_payment_type, 'Unknown') as verification_payment_type,
	count(1) as total_payments
from 
	intermediate.int_core__verification_payments icvp 
group by 
	rollup(1) -- equivalent to group by rollup(coalesce(verification_payment_type, 'Unknown'))

image.png

Note that were using group by rollup(1). This is equivalent of doing group by rollup(coalesce(verification_payment_type, 'Unknown'))

Cool but… how can we now re-name the resulting Null value to Global?

  • We need to add another coalesce in the select clause,
  • But we need to keep the previous aggregation group by rollup(coalesce(verification_payment_type, 'Unknown'))

In other words:

-- QUERY SETTING UNKNOWN TO NULL VALUES WITH WELL CREATED ROLLUP
select 
	coalesce(coalesce(verification_payment_type, 'Unknown'),'Global') as verification_payment_type,
	count(1) as total_payments
from 
	intermediate.int_core__verification_payments icvp 
group by 
	rollup(coalesce(verification_payment_type, 'Unknown')) -- NOT EQUIVALENT to group by rollup(1)!!!

image.png

Et voilà!

The key aspect here is understanding that the field that were showing in the select clause is different from the field were using to conduct the group by rollup. Therefore, its extremely recommended to explicitly specify the group by rollup condition instead of using the numeric wildcards.

Additional Links

https://neon.tech/postgresql/postgresql-tutorial/postgresql-rollup