<!DOCTYPE HTML>
<html>

<head>
<title>Pablo here</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="../styles.css">
</head>


<body>
<main>
<h1>
Hi, Pablo here
</h1>
<p><a href="../index.html">back to home</a></p>
<hr>
<section>
<h2>I want code defined dashboards so badly</h2>
<p>Analysts build dashboards. These are also called reports, data tools, data products, and a gazillion
other funny names.</p>

<p>For the sake of clarity, when I say dashboard here, I mean some software-driven wizardry that puts
pixels on a screen, displaying text, tables, and charts, as well as a few controls to play with what
data gets shown (filters and selectors, mostly). Some human will look at it in the hope of understanding
the data well enough to do something useful with it.</p>
<p>
Analysts and other species are expected to build lots of them. Some business person needs to know stuff,
so the analyst goes and builds a dashboard so the person can look at it and know the stuff he needs to.
</p>
<p>
My description of a dashboard is very open-ended, technology-wise, so there are a million ways to
implement one. The thing is, analysts are analysts, not software engineers. If I ask my good
colleague Uri, who is a wonderful analyst, to code up your KPI dashboard from scratch with React,
he's going to take somewhere between six weeks and two years. Not great.
</p>
<p>For decades, data teams and data gardeners in general have used specific tools to build dashboards,
tools designed around the idea that an analyst is not a software engineer. We've had
multiple generations of them, each trying to make it simpler for people to build dashboards. The
whole point is that it shouldn't take knowing linked lists and big O notation to declare "put what
comes out of this <code>SELECT * FROM thingie</code> into a pretty line chart".
</p>
<p>
These tools act as abstractions over the low-level details of rendering a screen with charts, and like all
abstractions, they are opinionated, restrict your freedom and <a
href="https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/" target="_blank"
rel="noopener noreferrer">are leaky</a> to some degree. In many cases, this is a great trade-off,
because the analyst (and his boss) really doesn't give a damn whether the stroke of the x-axis is 3px or
5px thick. But he does care about going from select to chart instantly.
</p>
<p>
At the time I'm writing this, I feel the most popular tools for this work are Looker,
Tableau and Power BI. All of them do a decent job of handling the trade-off above. Yet all three,
along with all the previous generations I've lived through, have something in common that I hate deeply:
<strong>You can't store a dashboard in a text file, and you can't version control it with Git <a
href="#footnote-1">[1]</a>.</strong>
</p>
<p>
Instead, these tools always ask you to build your dashboard through point-and-click, drag-and-drop
interfaces, and then hide the underlying definition behind proprietary formats or, even worse, just store
it in their platform and don't let you see the raw thing under the hood. You nasty, nasty Google: you
gifted us mortals with LookML to build Explores and then threw us back into hell when the time came to plot.
</p>
<p>This pattern of keeping the dashboard definitions kidnapped and away from us causes an unfortunate set of
limitations:</p>
<ul>
<li>You can't have a nice developer flow with Pull Requests, diffs, reviews, etc.</li>
<li>You can't elegantly reuse bits of dashboards across multiple ones, nor make mass refactors across a
large number of dashboards.</li>
<li>You can't parse the dashboard programmatically, so you can't integrate with tooling you might roll
yourself.</li>
</ul>
<p>
As with all engineering work, there are no silver bullets, and the design decisions these tools have made
do have some pros. For example, people can build a dashboard with these tools without knowing how to use
Git. The drag-and-drop approach lowers the barrier for analysts and business folks so that they can
build dashboards without knowing much (or anything) about coding in strict syntaxes.
</p>
<p>
But when you start to have a lot of dashboards (dozens, hundreds, even thousands), this approach doesn't
scale at all. And my experience has taught me that even a small organization with basic reporting needs
can produce dozens of dashboards pretty fast.
</p>
<p>Which brings me to the craving that heads this post: <strong>I want code defined dashboards so
badly.</strong></p>
<p>Picture a tool (I'm going to call it my dream tool) where you can define a dashboard in a plain text file
with some formal structure. The file defines what data gets pulled in (queries), how it gets displayed
(visuals and formatting), and some additional elements (filters and controls, text for
descriptions, headers, etc.). Possibly, the syntax also allows for metadata and comments, letting the
analyst document the dashboard in the same file, instead of in some external tool that will
eventually drift away from reality because I'm lazy and can't have two windows open at the same time. My
dream tool is capable of reading this file, figuring out how to get the needed data, and rendering a
dashboard out of it. Notice I still don't care whether the stroke of the x-axis is 3px or 5px thick. I
don't need the full power (and responsibility) of web development. I just need the whole deal to be stored in
a plain text file.
</p>
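<p>To make this less abstract, here is a minimal sketch of what such a file and its loader could look like.
Everything in it is an assumption of mine for illustration: the JSON format, the field names
(<code>queries</code>, <code>visuals</code>, <code>filters</code>, <code>metadata</code>) and the
<code>load_dashboard</code> helper are hypothetical and don't belong to any existing tool. I picked JSON only
because Python can parse it with the standard library. The sketch stops at loading and validating the
definition; rendering is the part I'd happily leave to the dream tool.</p>
<pre>
```python
import json
from dataclasses import dataclass, field

# Hypothetical dashboard definition. The format and every field name are
# assumptions made for this sketch, not the syntax of any real tool.
DASHBOARD_TEXT = """
{
  "title": "Weekly KPIs",
  "metadata": {"owner": "uri", "description": "Core sales numbers"},
  "queries": {
    "orders_per_day": "SELECT order_date, COUNT(*) AS orders FROM fulfilled_orders GROUP BY order_date"
  },
  "filters": [{"name": "country", "type": "select"}],
  "visuals": [
    {"type": "line_chart", "query": "orders_per_day", "x": "order_date", "y": "orders"}
  ]
}
"""


@dataclass
class Visual:
    type: str   # e.g. "line_chart"
    query: str  # name of a query declared under "queries"
    x: str
    y: str


@dataclass
class Dashboard:
    title: str
    metadata: dict
    queries: dict
    visuals: list
    filters: list = field(default_factory=list)


def load_dashboard(text):
    """Parse a definition and check that every visual points at a declared query."""
    raw = json.loads(text)
    visuals = [Visual(**v) for v in raw["visuals"]]
    for visual in visuals:
        if visual.query not in raw["queries"]:
            raise ValueError(f"visual uses undeclared query {visual.query!r}")
    return Dashboard(
        title=raw["title"],
        metadata=raw.get("metadata", {}),
        queries=raw["queries"],
        visuals=visuals,
        filters=raw.get("filters", []),
    )


dashboard = load_dashboard(DASHBOARD_TEXT)
print(dashboard.title, "has", len(dashboard.visuals), "visual(s), owner:", dashboard.metadata["owner"])
```
</pre>
<p>Note how little the file says about pixels: it names a query, a chart type and two columns, and that's it.
Whatever the real syntax ends up being, that level of detail is all I'm asking for.</p>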
<p>
Okay, so let's assume my dream tool exists and works like that. What have we gained? A lot of things!
</p>
<ul>
<li>Because the dashboard is plain text with a formal structure, I can parse it and work on it
programmatically. If a column name has changed in some DWH table, I can scan a gazillion dashboards
to look for all the places where it gets used. Actually, I can regularly parse all my dashboards and
produce a structured summary of "dashboard X uses tables A, B, C" to constantly monitor my
dependencies. The metadata and documentation in the dashboard itself also allow for further
organization tricks. For instance, I can specify an owner for each dashboard, ensuring I know who to
call if it breaks.
</li>
<li>
I can manage the dashboards in Git and leverage all the nice workflows, CI and other practices
around it. I can place automated checks that run on each PR to enforce conventions and standards
(e.g. "every dashboard must have the metadata field 'owner' specified"). I can see who changes what
and when. I can easily roll back screwups from my junior analysts.
</li>
<li>
Even though my dashboard files might be useless without my dream tool because I can't render them,
just owning the files gives me a nice degree of sovereignty. You can cut me off from your service, or I
might get tired of paying for your license, but I still have my repo with my files, which tell me what
data I was consuming, how I was presenting it, who owned it, etc. The barrier for me to move off the
dream tool, potentially rolling my own if needed, is lowered. Actually, a sufficiently well designed
standard for defining these dashboards could even lead to a kind of protocol that multiple tools could
adhere to: can you imagine using one tool, getting tired of it, and being able to migrate all your
dashboards to a new one just like that? I know a guy or two who have gone through the hell of a
visualization tool migration in their companies and who are probably having a cathartic breakdown just
by witnessing this idea.
</li>
<li>
I could even make dashboards a first-class feature of my app, and check them <em>into the same
repository</em> where I develop my app. Bobby Tables changed <code>fulfilled_orders</code> into
<code>completed_orders</code> in the latest MySQL migration and forgot to update the dashboard?
Not to worry, our CI tests caught that before it hit <code>master</code>.
</li>
</ul>
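<p>A couple of the points above fit in a few lines of code, so here is a rough sketch of the dependency
summary and the CI checks. It assumes hypothetical JSON dashboard files with <code>metadata</code> and
<code>queries</code> fields, treats <code>fulfilled_orders</code> and <code>completed_orders</code> as table
names, and uses made-up file paths and a made-up <code>KNOWN_TABLES</code> set standing in for the real schema.
The table extraction is a deliberately naive regex; a serious version would want a proper SQL parser.</p>
<pre>
```python
import json
import re

# Hypothetical dashboard files, keyed by path. In a real repository these
# would be read from disk; the format and field names are assumptions.
DASHBOARD_FILES = {
    "dashboards/sales.json": json.dumps({
        "metadata": {"owner": "uri"},
        "queries": {"orders": "SELECT order_date, total FROM fulfilled_orders"},
    }),
    "dashboards/support.json": json.dumps({
        "metadata": {},
        "queries": {"tickets": "SELECT opened_at FROM tickets JOIN agents ON agents.id = tickets.agent_id"},
    }),
}

# Tables that exist after the latest (hypothetical) migration.
KNOWN_TABLES = {"completed_orders", "tickets", "agents"}

# Naive table extraction: the identifier right after FROM or JOIN.
# This misses subqueries, CTEs and quoted names; it is only a sketch.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def tables_used(dashboard):
    """Return the set of table names referenced by a dashboard's queries."""
    found = set()
    for sql in dashboard.get("queries", {}).values():
        found.update(TABLE_PATTERN.findall(sql))
    return found


def check_dashboards(files, known_tables):
    """Return human-readable problems; an empty list means the checks pass."""
    problems = []
    for path, text in sorted(files.items()):
        dashboard = json.loads(text)
        if not dashboard.get("metadata", {}).get("owner"):
            problems.append(f"{path}: missing metadata field 'owner'")
        for table in sorted(tables_used(dashboard) - known_tables):
            problems.append(f"{path}: unknown table '{table}'")
    return problems


# "dashboard X uses tables A, B, C"
for path, text in sorted(DASHBOARD_FILES.items()):
    print(path, "uses", sorted(tables_used(json.loads(text))))

# What a CI job could print before failing the PR.
for problem in check_dashboards(DASHBOARD_FILES, KNOWN_TABLES):
    print("FAIL", problem)
```
</pre>
<p>With the sample data, the sales dashboard gets flagged for still pointing at <code>fulfilled_orders</code>
and the support dashboard for having no owner. Wire something like this into a PR check and Bobby's migration
never reaches <code>master</code> unnoticed.</p>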
<p>
I'm hopeful this won't remain sci-fi for long. The world of data tooling has changed quite a bit in
the past decade, with a strong influence from software engineering and open source software. I remember
that ten years ago the companies I was working with would bleed money on licenses for proprietary databases,
proprietary data warehouses, proprietary ETL tools, proprietary governance tools and proprietary visualization tools.
The idea that someone working in Business Intelligence (what this world was called before it turned into
simply "data") had to know how to use Git seemed extraterrestrial. Now we have plenty of more modern
alternatives that allow a small company to easily run a full data stack in a simple VM with no licensing
costs. Composable, dev-friendly, open source, high quality tools have already conquered storage,
querying, transformations, ETL and governance. It's only a matter of time before the slot of the
visualization tool gets conquered as well.
</p>
<p>
There are actually some extremely young tools that are already exploring ideas similar to this. <a
href="https://evidence.dev" target="_blank" rel="noopener noreferrer">evidence.dev</a> is the best
one I've come across so far. And even though it's young and still needs plenty of polish and
more features to be up there with the big boys that dominate the landscape, I think it's enough for
small teams as it is today.
</p>
<p>
I hope that, some time soon, I can look back at this post and say my dream of code defined dashboards
has become a very mundane and boring reality, and that my junior analysts, wherever I'm working at the
time, think I'm a caveman when I tell them we used to update the name of a field across 231 Tableau
dashboards by opening, pointing and clicking for seventeen days whenever someone changed the field name in
the DWH.</p>
<p id="footnote-1" class="footnote"><em>[1] You actually can put a Power BI dashboard in Git, but it's
quite useless, since the best format they can offer you is an ocean of unreadable
JSON-strings-within-JSONs you would never dare to touch without Power BI Desktop,
much less parse yourself. There's no way to leverage Git properly with it other than commit and
rollback.</em></p>
<hr>
<p><a href="../index.html">back to home</a></p>
</section>
</main>

</body>

</html>