“It is easier to ship recipes than cakes and biscuits.” -John Maynard Keynes
As an Analytics Engineer, you may hear this a lot: ‘Wait, so you just make dashboards and reports, right?’ It’s common but often based off a deep misunderstanding about the analytics process. Let me attempt to clarify.
As an Analytics Engineer, you are responsible for moving the analytics and data processes into the layer above the data warehouse. Essentially, your task is to create something from the basic set of data and shape it into something both actionable and useful. Naturally, this task lends itself to a broad and narrow set of skills. And it is probably why people are so quick to assume you just ‘make reports and dashboards.’ They see the result of the work, not the process that created that report and dashboard. They did not see you toil with a dataset too wide and at the wrong granularity. They did not see you automate the ETL of said table into an aggregated table with security, alerting and logging baked into the process. And, after all that, they did not see you worry about font, color, and styling of the dashboard. Because the effects of a terrible design are catastrophic for your message.
An Analytics Engineer must be, primarily, a technologist and thinker. With the sheer volume of options available (throughout the Analytics and Data layers), context and task shifting are an understatement. It’s common to spend most of the day at the data (SQL) layer, while spending a smaller portion of time creating and implementing a dashboard tool. And it’s that balance of time which allows for functional dashboards.
One vital piece that is worth mentioning (and possibility elaborating on) is how this field is not solely locked into one technology or stack (for example: LAMP stack, cloud vendors or analytics tools); rather, this field should be able to use all the tools and should leverage its own, more abstract, stack unrelated to a technology. So, today it might be MySQL but tomorrow it could be Redshift or the like. Therein rests the challenge: being able to adapt to the tool at hand without missing a beat.
Your data (information) is a set of free-flowing, dynamic instructions about how business (or whatever) is understood. It will never be perfect. In fact, you don’t want it to be; if it’s perfect (which, again, is impossible) there’s no room for improvement or self-reflection. What’s more, it will lack creative impulse: you don’t think freely if something is ‘perfect’ and done.
The goal should be to bring the certification/governance process *to* the data. If you must wait, collect, meet, agree and on and on, there is a critical piece missing: data should be certified from what it produces (or its many derivations). How does this work? How can you certify a csv file? Simple: Alert, Integrate and Monitor your Analytics Infrastructure.
Essentially, if nothing is created from the data, why would there be a need to certify it? Once something is created, then you relentlessly certify, in flight, what is being produced. It’s a sort of fact-checking, data-alerting mechanism that’s completely possible with the right framework. Which leads to the next point…
Collect data about patterns of usage
If you’re not analyzing usage patterns, you’re missing valuable data. With all analytics, there are reasons for why (1) a specific set of data is selected and (2) what the user is attempting to do with the data. You can easily keep this metadata in AWS S3 (with a good lifecycle policy) or store for potential later use somewhere else. The point is that if you aren’t understanding *why* then you are only seeing one side of the coin.
Keep everything and then figure out what to do with it and then how to certify it.
Here’s a quick tip for Tableau users out there who may have tried to get their individual date filter to automatically choose the max data while (here’s the catch) still allowing for the drop down with other dates in it. Sure, it’s possible to use the LOD calculations to get you a Boolean but that would be too easy. Sometimes, the people just want to max date selected in their filter automatically. Oh, and, this also works for dynamic parameters.
Here are the steps:
SQL to query for max date in column (this is the field used in your filter)
Match the date format (eg: mm-d-yyyy or mm-dd-yyyy)
Update the workbook section with the filter
The little bit of code you’ll need to dynamically do this will look like:
While there are numerous and exceptional benefits in administering Tableau Server via the GUI, the hidden gem is its capability for automation and integration. In simple terms, automating as much of the administration and monitoring makes for a very happy Tableau user base. In this session, you’ll learn how having at least some automation can make your environment faster and leaner.
We’ll automate everything from user provisioning (and removal), auditing views, securing content, and ‘Garbage Collection’ (or just removing old content). Want integration too?! We’ll show you how to reach pretty much anything via the REST API and trigger extracts via tools like Slack.
Oh, one more thing. We’ll show a new platform called Tableau Working Wax: or the ability to automatically generate reports and deliver them to the people.
In the end, you’ll be on your way to a fully automated and healthy Tableau Server infrastructure.