In the past, I have written about what you can do with a brand new (flat file) dataset that lands in your inbox, with the instructions being only "Please analyse this."
I am coming to suspect that having a useful collection of dashboard templates at the ready is well worth the effort, so that drilling down to the meat and potatoes of unspecific requests becomes far less daunting.
In other words, being able to shoot back, "Let's take a look at this data together, with this exploratory dashboard I've just whipped up", is a much better response than heading off on an uncertain deep dive all by yourself.
This post looks into the situation where a client asks:
"What was the benefit of us doing X?"
It's a similarly open-ended request. They probably aren't quite ready for "answers", but rather a simple tool that provides an overview, guides them towards operationalising their hypothesis a bit better, and gives them a clearer idea of the statistics they're going to need to evaluate their gut feel.
Having a template at the ready is going to be a great help. But, of course (I'm not crazy), preparing for every possible scenario is not feasible. So hopefully what follows is generalisable enough for many scenarios.
This dataset was part of a machine learning competition ~4 years back. Basically, each row is a customer and contains a number of details about how they engaged with the AirBnB app (and whether they ended up making a booking or not).
The (made-up) premise I'm running with for this example is that AirBnB made a change to its customer onboarding process somewhere in Dec 2013, and now (~6 months later) they want to review whether this change improved revenue and user acquisition.
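As a minimal sketch of how such a premise could be wired into the data, the old/new flag is just a date comparison against the cutover. The column names here are illustrative assumptions, not the competition dataset's actual schema:

```python
import pandas as pd

# Hypothetical sketch: derive an "old method" / "new method" flag from a
# Dec 2013 cutover date. Column names are assumptions for illustration.
users = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "date_account_created": pd.to_datetime(
        ["2013-10-05", "2013-12-20", "2014-02-11", "2013-11-30"]),
})

CUTOVER = pd.Timestamp("2013-12-01")
users["method"] = (users["date_account_created"] >= CUTOVER).map(
    {True: "new", False: "old"})
```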
What follows is a neat little "what's the difference" finder that allows you to investigate this sort of question from a range of different angles, to get an initial overview.
The dashboard is also up on Tableau Public.
Here we can see the counts / revenue for a range of variables and, in the table at the bottom, the differences observed between the new and the old onboarding methods.
Because I manufactured this "new method / old method" column, any differences observed are entirely due to existing fluctuations in the data.
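For illustration, the kind of comparison the differences table performs can be sketched in pandas, on a toy frame with assumed column names: take each segment's share of users under each method, then difference the two.

```python
import pandas as pd

# Toy data with assumed columns: each row is a user, tagged with the
# onboarding method they went through and the device segment they belong to.
users = pd.DataFrame({
    "method": ["old"] * 5 + ["new"] * 5,
    "device": ["iPhone", "iPhone", "Android", "Android", "Web",
               "iPhone", "iPhone", "iPhone", "Android", "Web"],
})

# Share of each device segment within each method.
shares = (users.groupby("method")["device"]
               .value_counts(normalize=True)
               .unstack(fill_value=0))

# Change in segment share, old -> new, in percentage points.
diff_pp = (shares.loc["new"] - shares.loc["old"]) * 100
```

Whether "the iPhone segment jumped ~9%" means a change in share (percentage points, as here) or a relative change in counts is exactly the sort of thing worth pinning down with the client.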
But this is not such an easy question to answer...
The main challenge here is attribution. We might begin by trying to see if we had an increase in user numbers. We did. But in the first months of the "new method" there is a mixture of new and old methods.
Also, the increase observed could be due to other factors (more people telling their buddies), and the new method deployed could actually have been a hindrance to the growth.
So this really limits the conclusions we are able to draw (with certainty). We might, instead, look into how the qualities of the customers changed. And that is what the table and the different toggles allow you to do.
Table of differences
The table shows us the % change, in selected customer segments, when transitioning from the old method to the new.
Above we can see that, with the new method, there was a ~9% jump in the iPhone users segment.
We can also add a further level of detail by way of the toggles (produced via the method detailed in this post).
Now we can see that the proportion of iPhone users who did not end up making a booking increased by ~7%. And we can go further and deeper, if we so please.
We can also make selections in the other charts, to filter / refine the results in our differences table.
Several improvements might make the end-result a little bit more digestible for the client.
- Having the number format of the metric column adjust dynamically, so that when the metric is set to "Income" it aggregates to thousands and includes a $ sign. Currently this is a new feature request, but it can be manually enabled by the following process.
- Rather than having all those separate toggle switches, multi-select parameters would make things much easier, if Tableau introduced them. This seems to be a feature request.
- Statistical significance. It is possible to do a test of significance within Tableau (I think by calculating Z-scores). But with so many ad-hoc (chi-square) tests occurring, I doubt this would be a valid approach to use. It's probably better to do a separate analysis in R, to detect the areas where the greatest differences occur (I'm thinking a logistic regression with higher-order interactions).
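On the significance point, here is a minimal sketch of what each ad-hoc comparison in the table implicitly is: a single 2x2 chi-square test (statistic only, computed by hand rather than via a library; the counts are invented). With many such comparisons, a multiple-testing correction, or the single regression model suggested above, would be needed.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (no continuity correction) for the 2x2 table
    [[a, b], [c, d]], e.g. rows = old/new method, cols = booked/not booked."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Invented counts: old method -> 120 booked / 380 not; new -> 150 / 350.
stat = chi_square_2x2(120, 380, 150, 350)

# Compare against the 5% critical value for 1 degree of freedom (~3.84).
# A lone test like this ignores the multiple-comparisons problem entirely.
significant_at_5pct = stat > 3.84
```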
This dashboard is primarily for data exploration, rather than actually reporting interesting findings.
It's something to whip up quickly, so that you have a practical tool to clarify with the client what they're really interested in (before investing too much of your own time guessing what that might be).
Tools such as Tableau and PowerBI (and even PivotTables) are great for having conversations with non-technical stakeholders, where you can actually investigate the data "on the fly".
Sure, knowing your selected tool well enough to do this with only minor hesitation is not easy. But if you chance it (and succeed), it can go a long way towards having business conversations that are analytics-informed (and vice versa).