As a freelancer, I've got some alerts set up with Airtasker, such that if any new dashboard-related gigs get posted, I get pinged. Airtasker can be a bit of a "race to the bottom" where people (including myself) post relatively simple tasks that they just want sorted for a minimum amount of money.
I've taken gigs this way and they have led to bigger and better things, so it's not all bad. If nothing else, it's great for putting yourself out there (although you have to watch out for shady university students demanding you finish their assignment for $20).
Tableau tutoring is something I've done multiple times (both privately and through AirTasker). I thought I may as well share some of the common questions that I have come across. So this might turn into a bit of blog post series...
This blog post is about one Airtasker gig that brought up a Tableau issue that I seemed to have encountered (in one form or another) multiple times. It's an issue that gets right to the heart of why many people find Tableau incredibly frustrating.
The poster included two pictures to illustrated her question.
I didn't end up getting assigned this gig (Drats!), but here is the neat and tidy version of what I would have liked to convey.
To get us started, I can quickly generate dummy data using R (see the gist here), with monthly Budgets and daily Actuals, to resemble their data.
Feel free to download the resulting Excel file, here (if you would like to follow along).
We can plonk this into Tableau and set things up a little.
In their pictures, they have something different to the above. Instead of just using Amount, they have several calculated fields that display the amount for several specific time periods:
- Current month
- Current year
- Current YTD (fiscal)
- Current year to go (YTG) (fiscal)
This is how I would create these.
The first step is to create a new column that is just the current date repeated all the way down.
I think of calculated fields as new columns added to the data source table -unless you're dealing with LOD calculations -in which case it's more like a joined aggregated table -but LODs are a whole different discussion).
Then we make a series of Boolean (True / False) columns that will say whether the Date in each row meets any of our four time period conditions.
Current month and year are simple enough, using the datetrunc() function.
But the financial year ones require a bit more finesse.
- Establish what the current financial year is, according to the "today" column (we are talking the Australian financial year, here).
- Use virtually the same formula, to establish each row's financial year, according to the Date column.
- Add in the relevant Boolean tests.
Nice! It's also a good idea to double check these Booleans with a table, like below:
Seems all good.
Our next step is to utilise these Boolean columns to only contain the Amount coloumn value, if their condition is met. We can do this simply, by adding an IF formula into each.
Then drag each of these down into the Measures section.
We can now drag each of them onto our viz, and we've got something looking pretty darn close to the poster's images.
Finding a Solution
According to the original pictures, there are two parts to her request:
1. add in a Variance column
2. remove the "Actuals" and "Variance" from the Year to Go (YTG) section.
This is precisely this sort of apparently easy thing, that is very difficult to achieve in Tableau and drives people back to using Excel, where they can easily add in or hide a column.
Figuring out "the Tableau way" requires a good understanding of how Tableau "thinks."
The Quick Win Solution
The (seemingly) easiest solution is just to add subtotals to the table.
As you can see, the new total column is just the addition of the Actuals and Budget columns.
In order to get the variance, we would have to create a "proxy" Amount column, where Actual values were made negative.
Then we would do a "Replace References", so that all of our newly created time-specific formulas linked back to the proxy instead the original Amount column.
This works okay. The format bar also allows us to change the header label from "Total" to "Variance".
But we have very limited control over this new Variance column.
Besides from formatting its colour and alignment, there's nothing we can change about it. If we wanted to show it as a percentage, we wouldn't be able to do that.
This solution would only really satisfy the most easy-going of clients.
Another tempting avenue is to right click on each of our measure pills and select "Quick Table Calculation" and "Difference"...
This changes our columns from being two (Actual & Budget) to being just one (the difference). It is very easy to muck around in this area for hours before realising that, in this case, it won't lead us to a working solution.
We might throw in the towel and go ahead and just manually aggregate our data, in a pre-processing step (maybe using SQL to make a View or even a scheduled Tableau Prep flow) that would append the Variance values to our aggregated raw data.
Here is a quick video of such a process, using Tableau Prep.
This sort of thing should really only ever be a last resort.
Creating a whole bunch of single-use data sources is a bad habit. Especially when just starting out.
Sometimes it is unavoidable and the creation of them is fast and simple. But doing this all the time might stop you from picking up all the different tips and tricks that allow the functionality you were aiming for to be produced off the original data source.
Additionally, the scheduling of this process could become a bit of a rod for your own back. If we were working with a database source, it's possible to create a View with the data in such a correct form. But, if we were working off a live service without some intermediary location (where we could build such a View), then we'd be up for a whole lot more leg work in setting this up.
If the report was only refreshed monthly, admittedly, the addition of a Prep flow into the regular process probably wouldn't be so much of a big deal.
The Real Solution
There is a way we can get pretty close to our client's request, operating off the original dataset.
What we have to do is create three new (actual, budget and variance) measures for each of our four time-periods.
Having to make all these (12 measures) sounds like such a headache... but I'm going to show you a nifty shortcut.
Firstly, we make two helper columns.
The trick to making multiple calculated fields, is the little known drag and drop method.
First, create the names for your new fields as comments. Modify each of the formulas accordingly. Then drag each section (including the comment / title) over to the measures pane.
Ta da! The comment becomes the measures name. Mind blown, right?
Making the Variance calculated fields is slightly different. Remember, thinking in terms of what Tableau sees, there are no rows where we would see a budget and an actual amount value.
This is determined by the SAP Version column. Any particular row is either "ACTUALS" or "BUDGET". Therefore using [current month actual] - [current month budget] as the formula will not work for us.
Instead, the formula needs to be an aggregation, dependent upon the context of the visualisation. We need to use SUM([current month actual]) - SUM([current month budget]).
I put all these new calculations into a folder for neatness. Let's take a look at them.
Oh no! At this point, I realised that the "is Actual" column wasn't working!
The SAP Version had "ACTUALS" instead of "ACTUAL". Luckily, I only had to update the "is Actual" column and not every single one of those "... actual" calculated fields that we'd just made. Phew! A good lesson to minimize repeated code, wherever possible.
Okay. Back on track.
Chucking all these calculated fields into the viz, we are presented with a new set of challenges.
For each of new measures, I want to change the header aliases from, for example, "Current Month Actual" to "Actuals", it won't let me do this more than once because it cannot have repeated aliases.
In response, I had to resort to putting trailing spaces on subsequent aliases, eg "Acutals ". I also formatted these headers to be left aligned and to never "wrap text" (so they would all look the same).
Admittedly this is a pretty minor issue.
Using all these 12 newly calculated measures, we now have the flexibility to modify the variance column as we would like (as in, we can change it to being a percentage of the budget). We can also remove the Actual and and Variance for the Year To Go section.
As an informative table, however, this is pretty lackluster and it is missing a critical part from the client's request.
We can't add those "Month ($m)", Year to Date ($m)" headers above where it says CAPEX OPEX (there is no column in our source table that would allow this). Our current setup is just have a whole bunch of numbers, with no clear differentiation between the timeframes.
We could go off and make any number of cool charts, with these new measures but let's say our client was a real stickler to either get the table looking the way she wanted or to know conclusively that what she wanted simply wasn't possible.
After all this hard work, are we going to admit defeat?
Talk to our client. Try to persuade her of an alternative solution that you know how to do.
Maybe they don't need to have each of those time periods displayed in the one table. We could create a parameter to only show selected timeframes, one at a time.
Then use the parameter selector to drive which amount is being displayed.
A little header formula, would give us the title we needed.
If they absolutely must have them all on the one page, there is a solution available.
We can achieve this by making a dashboard containing a series of sheets with only the measures of each timeframe.
We would also add a highlight action, so that rows highlighted in one sheet would also highlight in the other 3. Reassuring that everything was correctly lined up.
If the actual data had many categories, we would also have to deal with those scroll bars, because these would not scroll in a synchronized fashion. There is a hack, available for this, explained pretty well in the following video.
That allows us to restrict the number of rows shown on each sheet, via a central parameter. But you do end up with a horizontal scrolling bar, which is a bit annoying. There might be some other way of improving the scrolling experience.
This demo showcases using a parameter. But I think it isn't much better.
Aggregating Up and Down
One final trick we might show our client, is expanding and contracting the tables by the Reporting levels. A hierarchy, unfortunately, won't work across the multiple sheets in our dashboard.
The simplest solution (copied from this Youtube Video) is to add a report level parameter.
And then replace the actual Reporting Levels with a calculated field that will show the levels (or be blank) according to the parameter's setting.
This gives us the ability to aggregate up or down (across all the sheets) by the different reporting levels.
There also might be "Set action" maneuvers as well, to investigate. To enable "click to expand" functionality. But the parameter slider seems satisfactory, to me.
So there you have it.
It seems that we have finally gotten to something that (in theory) should satisfy our client's initial request. You can download the Tableau Workbook from Tableau public.
In the process, we have also shown her a whole range of aspects to Tableau, many of which she may have not seen before.
There are heaps of places to take this now, as in visualizations that would take far less cognitive load to understand.