lots of stuff

This commit is contained in:
counterweight 2025-01-16 23:48:56 +01:00
parent b0e3b1c3e0
commit 2123b52371
Signed by: counterweight
GPG key ID: 883EDBAA726BD96C


@ -18,42 +18,53 @@
<hr> <hr>
<section> <section>
<h2>I want code defined dashboards so badly</h2> <h2>I want code defined dashboards so badly</h2>
<p>For decades, Data teams and data gardeners in general have used specific tools to build dashboards. This <p>Analysts build dashboards. These are also called reports, data tools, data products, and another
are also called reports, data tools, data products, and another gazillion funny names.</p> gazillion funny names.</p>
<p>For the sake of clarity, when I say dashboard here, I refer to some piece of software that people look at
on a screen where they text, tables and charts, as well as a few controls to play with what they can see <p>For the sake of clarity, when I say dashboard here, I refer to some software-driven wizardry that puts
(filters and selectors, mostly).</p> pixels on a screen, displaying text, tables, and charts, as well as a few controls to play with what
data they can see (filters and selectors, mostly). Some human will read this in hope of understanding
data for something of use.</p>
<p> <p>
Analysts and other species are expected to build lots of this. Some business person needs to know stuff, Analysts and other species are expected to build lots of them. Some business person needs to know stuff,
so the analyst goes and builds a dashboard the person can look at. so the analyst goes and builds a dashboard so the person can look at it and know the stuff he needs to.
</p> </p>
<p> <p>
My description of a dashboard is very open, so there's a million ways to technically implement one. The My description of a dashboard is very open ended, technology wise. So there's a million ways to
thing is, analysts are analyts, not software engineers. If I ask my good colleague Uri, who is a technically implement one. The thing is, analysts are analysts, not software engineers. If I ask my good
wonderful analyst, to code you up your KPIs dashboard from scratch with React, he's going to take colleague Uri, who is a wonderful analyst, to code you up your KPIs dashboard from scratch with React,
somewhere between six weeks and two years. Not great. he's going to take somewhere between six weeks and two years. Not great.
</p>
<p>For decades, Data teams and data gardeners in general have used specific tools to build dashboards, which
are designed around the idea that an analyst is not a software engineer. This being the case, we've had
multiple generations of tools that have tried to make it simpler for people to build dashboards. The
whole point of them is it shouldn't take knowing linked lists and big O notation to declare "put what
comes out of this <code>SELECT * FROM thingie</code> into a pretty line chart".
</p> </p>
<p> <p>
This being the case, we've had multiple generations of tools that have tried to make it simpler for These tools act as abstractions on the low-level details of rendering a screen with charts, and like all
people to build dashboards. The whole point of them shouldn't take knowing linked lists and big O abstractions, they will be opinionated, restrict your freedom and <a
notation to say "put what comes out of this `SELECT * FROM thingie` into a line chart". href="https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/" target="_blank"
</p> rel="noopener noreferrer">be leaky</a> to some degree. In many cases, this is a great trade-off,
<p> because the analyst (and his boss) really doesn't give a damn whether the stroke of the x-axis is 3px or
These tools are abstractions, and like all abstractions, they will be opinionated, restrict your freedom 5px
and <a href="https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/" target="_blank" thick. But he does care about going from select to chart instantly.
rel="noopener noreferrer">be leaky</a> to some degree.
</p> </p>
<p> <p>
At the time I'm writing this, I feel the most popular tools out there to do this work are Looker, At the time I'm writing this, I feel the most popular tools out there to do this work are Looker,
Tableau, Power BI. And all of them have something I hate deeply: <strong>You can't put a dashboard in a Tableau, Power BI. All of them do a decent job in solving the above mentioned trade-off. Yet all three,
text file, and you can't version control it with Git <a href="#footnote-1">[1]</a>.</strong> along with all the previous generations I've lived through, have something in common I hate deeply:
<strong>You can't store a dashboard in a text file, and you can't version control it with Git <a
href="#footnote-1">[1]</a>.</strong>
</p> </p>
<p> <p>
Instead, these tools will always ask you to build your dashboard through point and click, drag and drop Instead, these tools will always ask you to build your dashboard through point and click, drag and drop
interfaces, and hide them behind proprietary formats or just store it in their platform and not let you interfaces, and then hide the underlying definition behind proprietary formats or even worse, just store
see the raw thing behind. it in their platform and not let you see the raw thing under the hood. You nasty nasty Google, you
gifted us mortals with LookML to build Explores and then threw us back in hell when time came to plot.
</p> </p>
<p>This causes an unfortunate set of limitations: <p>This pattern of keeping the dashboard definitions kidnapped and away from us causes an unfortunate set of
limitations:
<ul> <ul>
<li>You can't have a nice developer flow with Pull Requests, diffs, reviews, etc.</li> <li>You can't have a nice developer flow with Pull Requests, diffs, reviews, etc.</li>
<li>You can't elegantly reuse bits of dashboards across multiple ones, nor make mass refactors across a <li>You can't elegantly reuse bits of dashboards across multiple ones, nor make mass refactors across a
@ -73,15 +84,19 @@
scale at all. And my experience has taught me that even a small organization can produce dozens of scale at all. And my experience has taught me that even a small organization can produce dozens of
dashboards pretty fast with basic data reporting needs. dashboards pretty fast with basic data reporting needs.
</p> </p>
<p>Which brings me to my craving which heads this post: I want code defined dashboards so badly.</p> <p>Which brings me to my craving which heads this post: <strong>I want code defined dashboards so
badly.</strong></p>
<p>Picture a tool (I'm going to call it my dream tool) where you can define a dashboard in a plain text file <p>Picture a tool (I'm going to call it my dream tool) where you can define a dashboard in a plain text file
with some formal structure. The file with some formal structure. The file defines what data gets pulled in (queries), how it gets displayed
defines what data gets pulled in (queries), how it gets displayed (visuals and formatting), and some (visuals and formatting), and some other additional elements (filters and controls, text for
other additional elements (filters and controls, text for descriptions, headers, etc.). Possibly, the descriptions, headers, etc.). Possibly, the syntax also allows for metadata and comments, allowing the
syntax also allows for metadata and comments, allowing the analyst to document the dashboard in the same analyst to document the dashboard in the same file, instead of in some external tool that will
file, instead of in some external tool that will eventually drift away from reality because I'm lazy and eventually drift away from reality because I'm lazy and can't have two windows open at the same time. My
can't have two windows open at the same time. My dream tool is capable of reading this file and making a dream tool is capable of reading this file, figuring out how to get the needed data, and rendering a
dashboard out of it.</p> dashboard out of it. Notice I still don't care whether the stroke of the x-axis is 3px or 5px thick. I
don't need the full power (and responsibility) of web development. Just that the whole deal is stored in
a plain text file.
</p>
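<p>To make the idea a bit more tangible, here is a minimal sketch of what such a plain text definition could look like and how a tool might read it. Everything in it is an assumption made up for illustration: the JSON layout, the key names (<code>queries</code>, <code>visuals</code>, <code>filters</code>), the sample query and the helper functions. It is not the format of any existing tool.</p>

```python
import json

# Hypothetical dashboard definition. The structure and every key name are
# invented for illustration; no real tool is known to use this exact format.
DASHBOARD_TEXT = """
{
  "title": "Weekly orders",
  "owner": "uri",
  "description": "Orders completed per week.",
  "queries": {
    "orders_per_week": "SELECT week, COUNT(*) AS orders FROM completed_orders GROUP BY week"
  },
  "filters": [
    {"name": "country", "type": "selector"}
  ],
  "visuals": [
    {"type": "line_chart", "query": "orders_per_week", "x": "week", "y": "orders"}
  ]
}
"""


def load_dashboard(text):
    """Parse the definition and check that it is internally consistent."""
    spec = json.loads(text)
    for key in ("title", "queries", "visuals"):
        if key not in spec:
            raise ValueError(f"missing required key: {key}")
    # Every visual must point at a query that is defined in the same file.
    for visual in spec["visuals"]:
        if visual["query"] not in spec["queries"]:
            raise ValueError(f"visual references unknown query: {visual['query']}")
    return spec


def describe(spec):
    """Return a plain text outline of the dashboard (a stand-in for rendering)."""
    lines = [spec["title"]]
    for visual in spec["visuals"]:
        lines.append(
            f"{visual['type']}: {visual['y']} by {visual['x']} (query: {visual['query']})"
        )
    return lines


spec = load_dashboard(DASHBOARD_TEXT)
for line in describe(spec):
    print(line)
```

<p>The point of the sketch is only that the whole dashboard (data, visuals, filters and documentation) fits in one file that Git can diff. A real tool would replace <code>describe</code> with actual querying and chart rendering.</p>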
<p> <p>
Okay, so let's assume my dream tool exists and works like that. What have we gained? A lot of things! Okay, so let's assume my dream tool exists and works like that. What have we gained? A lot of things!
</p> </p>
@ -102,23 +117,49 @@
</li> </li>
<li> <li>
Even though my dashboard files might be useless without my dream tool because I can't render them, Even though my dashboard files might be useless without my dream tool because I can't render them,
just owning the file gives me a nice degree of sovereignty. You can cut my off your service, or I just owning the file gives me a nice degree of sovereignty. You can cut me off your service, or I
might get tired of paying your license, but I still have my repo with my files which tells me what might get tired of paying your license, but I still have my repo with my files which tells me what
data I was consuming, how I was presenting it, who owned it, etc. The barrier for me to move off the data I was consuming, how I was presenting it, who owned it, etc. The barrier for me to move off the
dream tool, potentially rolling my own if needed, is lowered. Actually, a sufficiently well designed dream tool, potentially rolling my own if needed, is lowered. Actually, a sufficiently well designed
standard for defining these dashboards could even lead to a kind of protocol that multiple tools could standard for defining these dashboards could even lead to a kind of protocol that multiple tools could
adhere to: can you imagine using one tool, getting tired of it, and being able to migrate all your adhere to: can you imagine using one tool, getting tired of it, and being able to migrate all your
dashboards to a new one just like that? I know a guy or two who have gone through the hell of a dashboards to a new one just like that? I know a guy or two who have gone through the hell of a
visualization tool migration how are probably having a cathartic breakdown just with this thought. visualization tool migration in their companies who are probably having a cathartic breakdown just
by witnessing this idea.
</li>
<li>
I could even make dashboards a first-class feature of my app, and check them <em>in the same
repository</em> where I develop my app. Bobby tables changed <code>fulfilled_orders</code> into
<code>completed_orders</code> in the latest MySQL migration and forgot to update the dashboard?
Not to worry, our CI tests caught that before it hit <code>master</code>.
</li> </li>
</ul> </ul>
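<p>The CI scenario from the last bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not a real tool: it uses an in-memory SQLite database as a stand-in for the MySQL schema, treats <code>fulfilled_orders</code> and <code>completed_orders</code> as table names, and the <code>broken_queries</code> helper and the query names are hypothetical.</p>

```python
import sqlite3


def broken_queries(conn, queries):
    """Return {query_name: error_message} for queries the schema rejects.

    Each query is prefixed with EXPLAIN so the database has to compile it
    against the current schema without returning the query's actual rows.
    """
    broken = {}
    for name, sql in queries.items():
        try:
            conn.execute(f"EXPLAIN {sql}")
        except sqlite3.Error as exc:
            broken[name] = str(exc)
    return broken


# Stand-in for the schema after the migration: the table is now named
# completed_orders and fulfilled_orders no longer exists.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE completed_orders (id INTEGER, week TEXT)")

# Queries as they might be extracted from dashboard definition files.
dashboard_queries = {
    "stale_dashboard": "SELECT week, COUNT(*) FROM fulfilled_orders GROUP BY week",
    "updated_dashboard": "SELECT week, COUNT(*) FROM completed_orders GROUP BY week",
}

result = broken_queries(conn, dashboard_queries)
for name, error in result.items():
    print(f"{name}: {error}")
```

<p>A CI job could fail the build whenever this returns anything, which is only possible because the queries live in files next to the migrations.</p>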
<p>
<ul> I'm hopeful this won't remain sci-fi for long. The world of data tooling has changed quite a bit in
<li>the past and current landscape</li> the past decade, with a strong influence from software engineering and open source software. I remember
<li>the issues and what i crave for</li> ten years ago the companies I was working with would bleed licenses for proprietary databases, proprietary
<li>green shoots</li> data warehouses, proprietary ETL tools, proprietary governance tools and proprietary visualization tools.
</ul> The idea that someone working in Business Intelligence (how this world was known before it turned into
<p id="#footnote-1" class="footnote"><em>[1] You actually can put a Power BI dashboard in Git, but it's simply "data") had to know how to use git seemed extraterrestrial. Now we have plenty of more modern
alternatives that allow a small company to easily run a full data stack in a simple VM with no licensing
costs. Composable, dev-friendly, open source, high quality tools have already conquered storage,
querying, transformations, ETL and governance. It's only a matter of time before the slot of the
visualization tool gets conquered as well.
</p>
<p>
There are actually some extremely young tools which are already exploring ideas similar to this. <a
href="https://evidence.dev" target="_blank" rel="noopener noreferrer">evidence.dev</a> is the best
one I've come across so far. And even though it's young and still needs plenty of polishing and
feature-richness to be up there with the big boys that dominate the landscape, I think it's enough for
small teams as it is today.
</p>
<p>
I hope that, some time soon, I can look back at this post and say my dream of code defined dashboards
has become a very mundane and boring reality, and that my junior analysts wherever I'm working at the
time think I'm a caveman when I tell them we used to update the name of a field across 231 Tableau
dashboards by opening, pointing and clicking for seventeen days when someone changed the field name in
the DWH.</p>
<p id="footnote-1" class="footnote"><em>[1] You actually can put a Power BI dashboard in Git, but it's
quite useless since the best format they quite useless since the best format they
can offer you is an ocean of unreadable JSON-strings-within-JSONs you would never dare to touch can offer you is an ocean of unreadable JSON-strings-within-JSONs you would never dare to touch
without Power BI desktop, without Power BI desktop,