pablohere/public/writings/i-want-code-defined-dashboards-so-badly.html


2025-01-14 16:33:14 +01:00
<!DOCTYPE HTML>
<html>
<head>
<title>Pablo here</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="../styles.css">
</head>
<body>
<main>
<h1>
Hi, Pablo here
</h1>
<p><a href="../index.html">back to home</a></p>
<hr>
<section>
<h2>I want code defined dashboards so badly</h2>
<p>For decades, data teams and data gardeners in general have used specific tools to build dashboards. These
are also called reports, data tools, data products, and another gazillion funny names.</p>
<p>For the sake of clarity, when I say dashboard here, I refer to some piece of software that people look at
on a screen where they see text, tables and charts, as well as a few controls to play with what they can see
(filters and selectors, mostly).</p>
<p>
Analysts and other species are expected to build lots of these. Some business person needs to know stuff,
so the analyst goes and builds a dashboard the person can look at.
</p>
<p>
My description of a dashboard is very open, so there's a million ways to technically implement one. The
thing is, analysts are analysts, not software engineers. If I ask my good colleague Uri, who is a
wonderful analyst, to code up your KPI dashboard from scratch with React, he's going to take
somewhere between six weeks and two years. Not great.
</p>
<p>
This being the case, we've had multiple generations of tools that have tried to make it simpler for
people to build dashboards. Their whole point is that it shouldn't take knowing linked lists and big O
notation to say "put what comes out of this `SELECT * FROM thingie` into a line chart".
</p>
<p>
These tools are abstractions, and like all abstractions, they will be opinionated, restrict your freedom
and <a href="https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/" target="_blank"
rel="noopener noreferrer">be leaky</a> to some degree.
</p>
<p>
At the time I'm writing this, I feel the most popular tools out there to do this work are Looker,
Tableau and Power BI. And all of them have something I hate deeply: <strong>You can't put a dashboard in a
text file, and you can't version control it with Git <a href="#footnote-1">[1]</a>.</strong>
</p>
<p>
Instead, these tools will always ask you to build your dashboard through point and click, drag and drop
interfaces, and then hide it behind a proprietary format, or just store it in their platform and not let
you see the raw thing behind it.
</p>
<p>This causes an unfortunate set of limitations:</p>
<ul>
<li>You can't have a nice developer flow with Pull Requests, diffs, reviews, etc.</li>
<li>You can't elegantly reuse bits of dashboards across multiple ones, nor make mass refactors across a
large number of dashboards.</li>
<li>You can't parse the dashboard programmatically, so you can't integrate with tooling you might roll
yourself.</li>
</ul>
<p>
As with all engineering work, there are no silver bullets and the design decisions these tools have made
do have some pros. For example, people can build a dashboard with these tools without knowing how to use
Git. The drag and drop approach lowers the barrier for analysts and business profiles so that they can
build dashboards without knowing much (or anything) about coding in strict syntaxes.
</p>
<p>
But when you start to have a lot of dashboards (dozens, hundreds, even thousands), this approach doesn't
scale at all. And my experience has taught me that even a small organization can produce dozens of
dashboards pretty fast with basic data reporting needs.
</p>
<p>Which brings me to the craving that heads this post: I want code defined dashboards so badly.</p>
<p>Picture a tool (I'm going to call it my dream tool) where you can define a dashboard in a plain text file
with some formal structure. The file
defines what data gets pulled in (queries), how it gets displayed (visuals and formatting), and some
other additional elements (filters and controls, text for descriptions, headers, etc.). Possibly, the
syntax also allows for metadata and comments, allowing the analyst to document the dashboard in the same
file, instead of in some external tool that will eventually drift away from reality because I'm lazy and
can't have two windows open at the same time. My dream tool is capable of reading this file and making a
dashboard out of it.</p>
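<p>
To make this less abstract, here is a minimal sketch of what such a file and its loader could look like.
Everything in it is an assumption of mine for illustration: the JSON format, the key names
(`metadata`, `queries`, `filters`, `visuals`), the table name and the `load_dashboard` helper. A real
format would probably be something friendlier to comments, like YAML or TOML. The only point is that
queries, visuals, filters and documentation all live in one parseable text file.
</p>

```python
import json

# Hypothetical dashboard file contents. Every key name here is an
# assumption made for illustration, not an existing format.
DASHBOARD_TEXT = """
{
  "title": "Weekly KPIs",
  "metadata": {
    "owner": "uri",
    "description": "Core KPIs for the weekly review."
  },
  "queries": {
    "signups": "SELECT week, signups FROM dwh.weekly_signups"
  },
  "filters": [
    {"name": "week", "type": "date_range"}
  ],
  "visuals": [
    {
      "type": "line_chart",
      "title": "Signups per week",
      "query": "signups",
      "x": "week",
      "y": "signups"
    }
  ]
}
"""


def load_dashboard(text):
    """Parse a dashboard definition and check its internal references."""
    dashboard = json.loads(text)
    # The file must at least say what it is, what it pulls and what it shows.
    for section in ("title", "queries", "visuals"):
        if section not in dashboard:
            raise ValueError(f"missing required section: {section}")
    # Every visual must point at a query defined in the same file.
    for visual in dashboard["visuals"]:
        if visual["query"] not in dashboard["queries"]:
            raise ValueError(
                f"visual {visual['title']!r} uses unknown query {visual['query']!r}"
            )
    return dashboard


dashboard = load_dashboard(DASHBOARD_TEXT)
for visual in dashboard["visuals"]:
    # A renderer would run the query and draw the chart; here we only
    # show that the definition is fully readable by a program.
    print(visual["type"], "from", dashboard["queries"][visual["query"]])
```

<p>
The rendering part is deliberately left out: the sketch only shows that a plain text definition can be
loaded and sanity-checked with a few lines of code.
</p>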
<p>
Okay, so let's assume my dream tool exists and works like that. What have we gained? A lot of things!
</p>
<ul>
<li>Because the dashboard is plain text with a formal structure, I can parse it and work on it
programmatically. If a column name has changed in some DWH table, I can scan a gazillion dashboards
to look for all the places where it gets used. Actually, I can regularly parse all my dashboards and
produce a structured summary of "dashboard X uses tables A, B, C" to constantly monitor my
dependencies. The metadata and documentation in the dashboard itself also allows for further
organization tricks. For instance, I can specify an owner for each dashboard, ensuring I know who to
call if it breaks.
</li>
<li>
I can manage the dashboards in Git and leverage all the nice workflows, CI and other practices
around it. I can place automated checks that run on each PR to ensure conventions and standards
(e.g. "every dashboard must have the metadata field 'owner' specified"). I can see who changes what
and when. I can easily roll back screwups from my junior analysts.
</li>
<li>
Even though my dashboard files might be useless without my dream tool because I can't render them,
just owning the files gives me a nice degree of sovereignty. You can cut me off your service, or I
might get tired of paying your license, but I still have my repo with my files, which tell me what
data I was consuming, how I was presenting it, who owned it, etc. The barrier for me to move off the
dream tool, potentially rolling my own if needed, is lowered. Actually, a sufficiently well designed
standard for defining these dashboards could even lead to a kind of protocol that multiple tools could
adhere to: can you imagine using one tool, getting tired of it, and being able to migrate all your
dashboards to a new one just like that? I know a guy or two who have gone through the hell of a
visualization tool migration who are probably having a cathartic breakdown just with this thought.
</li>
</ul>
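<p>
The first two points can be sketched in a few lines. This is a rough illustration under my own
assumptions, not how any existing tool works: the dashboards are assumed to be already parsed into
dictionaries with hypothetical `metadata` and `queries` keys, the file and table names are made up,
and tables are found with a naive regular expression where a real tool would use a proper SQL parser.
</p>

```python
import re

# Hypothetical parsed dashboard files, keyed by file name. The structure
# (metadata / queries) is an assumption made for illustration.
DASHBOARDS = {
    "weekly_kpis.json": {
        "metadata": {"owner": "uri"},
        "queries": {
            "signups": "SELECT week, signups FROM dwh.weekly_signups",
            "revenue": (
                "SELECT o.week, SUM(o.amount) FROM dwh.orders o "
                "JOIN dwh.customers c ON c.id = o.customer_id GROUP BY o.week"
            ),
        },
    },
    "churn.json": {
        "metadata": {},
        "queries": {"churn": "SELECT month, churned FROM dwh.customers"},
    },
}

# Naive table extraction: whatever identifier follows FROM or JOIN.
# Good enough for a sketch, not for arbitrary SQL.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def tables_used(dashboard):
    """Return the sorted list of tables referenced by a dashboard's queries."""
    tables = set()
    for sql in dashboard["queries"].values():
        tables.update(TABLE_PATTERN.findall(sql))
    return sorted(tables)


def dashboards_using(table, dashboards):
    """Return the dashboards that would be hit by a change to `table`."""
    return sorted(
        name for name, dashboard in dashboards.items()
        if table in tables_used(dashboard)
    )


def missing_owner(dashboards):
    """Return the dashboards that break the 'owner is mandatory' convention."""
    return sorted(
        name for name, dashboard in dashboards.items()
        if not dashboard.get("metadata", {}).get("owner")
    )


# Dependency summary: "dashboard X uses tables A, B, C".
for name, dashboard in sorted(DASHBOARDS.items()):
    print(name, "uses", ", ".join(tables_used(dashboard)))

# Impact analysis before touching a DWH table.
print("affected by dwh.customers:", dashboards_using("dwh.customers", DASHBOARDS))

# The kind of check a CI job could run on every PR.
print("dashboards without owner:", missing_owner(DASHBOARDS))
```

<p>
With the made-up data above, both dashboards show up as depending on `dwh.customers`, and `churn.json`
gets flagged for not having an owner. A CI job would simply fail the PR when that last list is not
empty.
</p>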
<ul>
<li>the past and current landscape</li>
<li>the issues and what I crave</li>
<li>green shoots</li>
</ul>
<p id="footnote-1" class="footnote"><em>[1] You actually can put a Power BI dashboard in Git, but it's
quite useless, since the best format they can offer you is an ocean of unreadable
JSON-strings-within-JSONs that you would never dare to touch without Power BI Desktop, much less parse
yourself. There's no way to leverage Git properly with it other than commit and rollback.</em></p>
<hr>
<p><a href="../index.html">back to home</a></p>
</section>
</main>
</body>
</html>