- Workbook Performance
- Workbook Metadata
Well, I’m happy to announce a third and final piece to the data model: Geo data. This is often a missing piece to the general ‘performance’ discussion on Tableau Server. What’s more, if your organization serves users both internally and externally, then performance becomes a big first impression. For example, you want to know why some users in various locations might mention a view is slow to load even though it’s been tested at, say, less than two seconds. Adding the geography (and ISP) of the people requesting the view makes the data (and workbooks) a bit easier to diagnose. BONUS, you can also understand the most requested views, http status codes (500s, etc.) and more.
Geo data: steps to implement
- Pull the Apache logs (I used Logentries and its open API to pre-filter before exporting).
- Parse the logs for basic columns: ip, datetime, request, port, etc
- Get distinct list of IPs and lookup
- Add to database (choose your flavor)
Connect the dots
The final table in the model had a few steps to it. It’s an IP helper table that stores the information for users who request views (city, state, latitude, longitude, etc.). Rather than look up each IP, I’ll check this table before and omit it from the lookup if we already have it stored. Once the basic script is complete, we automated it and then join the previous two tables together in order to understand performance of a view as it relates to geography. So, you’re SQL to join them all together might look something like this:
SELECT w.site ,w.user ,w.workbook ,a.datetimemst ,a.ip ,a.httpstatus ,g.city ,g.region ,g.country ,g.loc ,sum(w.elapsed_sec) as load_time FROM tableau_workbook_perf as w JOIN tableau_apache_perf as a on w.req = a.requestid JOIN tableau_apache_geo as g on g.ip = a.ip GROUP BY w.site ,w.user ,w.workbook ,a.ip ,a.httpstatus ,g.city ,g.region ,g.country ,g.loc
Now that we have all the pieces together, it’s time to visualize it back in Tableau (ah, yes, the circle of life, or data). We can now take the highly viewed workbooks and understand if performance is a local/geographical issue and/or a time issue. When we combine with the TWB Internals metadata, well, we have a very complete picture of a workbooks value.