How can business intelligence (BI) launch your data efforts, and pave the way for your first data science hire?
When companies want to start analyzing their data, they're often tempted to hire a data scientist as a first step. It should really be step four, after they have:
If you haven't heard of it before, BI is the data analysis that was fashionable before data science arrived: you summarize your data (roll-ups and counts) or split it apart (pivot tables, cubes, breakdowns) or even make some mild projections (trend analysis).
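To make "summarize" and "split apart" concrete, here is a minimal sketch in plain Python. The sales records, the field layout, and the function names `roll_up` and `pivot` are all made up for illustration; a spreadsheet or BI package would do the same work with a pivot table.

```python
from collections import defaultdict

# Hypothetical sales records: (region, quarter, amount). Not real data.
sales = [
    ("East", "Q1", 1200.0),
    ("East", "Q2", 900.0),
    ("West", "Q1", 400.0),
    ("West", "Q2", 650.0),
    ("East", "Q1", 300.0),
]


def roll_up(records):
    """Summarize: total amount and number of sales per region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, _quarter, amount in records:
        totals[region] += amount
        counts[region] += 1
    return dict(totals), dict(counts)


def pivot(records):
    """Split apart: one row per region, one column per quarter, summed amounts."""
    table = defaultdict(lambda: defaultdict(float))
    for region, quarter, amount in records:
        table[region][quarter] += amount
    return {region: dict(columns) for region, columns in table.items()}


totals, counts = roll_up(sales)
table = pivot(sales)

for region, columns in sorted(table.items()):
    print(region, columns, "total:", totals[region], "sales:", counts[region])
```

Nothing here is more than adding and grouping, which is the point: the math stays simple enough that anyone can check it by hand.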
Overall, BI has come to mean data analysis that looks "backward" at the past, whereas data science involves a lot of looking "forward" to predict the future. [1]
Given everything you hear about data science, then, why bother with BI? It's just a bunch of counting and summaries, right? Don't let the simple math fool you. My colleague Greg Reda (@gjreda) and I talk about this sort of thing quite a bit [2], and in a recent discussion, we explored some ways that BI can help a company before they try to tackle data science:
A company definitely needs to know which products and services make the most money, or what time of day most people visit the online store, or which product features customers use the most. Those answers require BI's counting and binning, not data science's predictive models.
These are not "fancy" and "predictive" insights, but they're the sort that can shed a lot of light on a company's activities and opportunities.
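As a rough sketch of that counting and binning, the snippet below answers two of those questions with nothing but Python's standard library. The visit timestamps, the order rows, and the helper names are invented for the example, and it assumes timestamps arrive as ISO 8601 strings.

```python
from collections import Counter
from datetime import datetime

# Hypothetical data: store visits as ISO 8601 timestamps, orders as (product, revenue).
visits = [
    "2023-03-01T09:15:00",
    "2023-03-01T09:48:00",
    "2023-03-01T14:02:00",
    "2023-03-02T09:30:00",
]
orders = [("widget", 25.0), ("gadget", 90.0), ("widget", 25.0), ("gizmo", 40.0)]


def visits_by_hour(timestamps):
    """Bin visits by hour of day (0-23) and count each bin."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


def revenue_by_product(order_rows):
    """Sum revenue per product."""
    revenue = Counter()
    for product, amount in order_rows:
        revenue[product] += amount
    return revenue


busiest_hour, visit_count = visits_by_hour(visits).most_common(1)[0]
top_product, top_revenue = revenue_by_product(orders).most_common(1)[0]

print(f"Busiest hour: {busiest_hour}:00 ({visit_count} visits)")
print(f"Top product: {top_product} ({top_revenue:.2f} in revenue)")
```

Note that the product sold most often (the widget) is not the one that brings in the most money (the gadget), which is exactly the kind of thing a simple count surfaces.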
To do BI well, you'll need to organize and document your data sources (how that data is created, and what each field or column means) and test your data infrastructure (make sure the right people can access the right data as needed).
These are just good practices, anyway -- they fall squarely into "unsexy-but-totally-necessary" -- but there's an added benefit: a proper data infrastructure is a precursor to doing effective data science. Those data scientists can't analyze data they can't find, nor can they provide meaningful analyses if they don't understand what each field really means. If you've done BI well, neither of these will be an issue when you move into data science.
The simple math behind BI techniques makes it easier to identify data entry and data storage problems because you can compare those results to institutional knowledge. You'll know, for example, that you've only sold goods within the US, so any charts that show sales in the UK are wrong. And since you've been in business since 2010, that sale dated 1912 is a typo or indicative of a wider data collection problem.
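Those two checks can be written down as a small script. This is only a sketch: the record layout, the `find_suspect_rows` helper, and the exact founding date of January 1, 2010 are assumptions made for the example (the text above only says "since 2010").

```python
from datetime import date

# Institutional knowledge, encoded as assumptions for this example.
ALLOWED_COUNTRIES = {"US"}   # goods have only been sold within the US
FOUNDED = date(2010, 1, 1)   # assumed start date; adjust to the real one

# Hypothetical sales rows.
sales = [
    {"id": 1, "country": "US", "sold_on": date(2015, 6, 1)},
    {"id": 2, "country": "UK", "sold_on": date(2016, 2, 9)},
    {"id": 3, "country": "US", "sold_on": date(1912, 4, 15)},
]


def find_suspect_rows(rows, today):
    """Return (row id, reason) pairs for rows that contradict what we know."""
    problems = []
    for row in rows:
        if row["country"] not in ALLOWED_COUNTRIES:
            problems.append((row["id"], "unexpected country"))
        if not (FOUNDED <= row["sold_on"] <= today):
            problems.append((row["id"], "date outside business lifetime"))
    return problems


suspects = find_suspect_rows(sales, today=date(2024, 1, 1))
for row_id, reason in suspects:
    print(f"row {row_id}: {reason}")
```

A flagged row doesn't tell you whether the cause is a typo or a broken collection process, but it tells you where to start looking.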
If BI's summaries and roll-ups fail, there's no way the higher-end data science techniques will work. Those problems are tougher to catch when you skip BI and go straight to data science. Some predictive models are less transparent than adding and binning, so you don't always know why they have made a given decision. This can lead you to believe a model is working when it's really not ... and you then make business decisions on those results ... and your data science efforts become costly in more ways than one.
The best thing about BI is that you can get a good head start in a spreadsheet tool like Excel before you shell out for a fancier BI package. You already have Excel, right? So long as your data is in neat rows and columns somewhere, and it's small enough to fit into a desktop app, you get that first dose of BI for free. You can make charts and pivot tables to your heart's content.
(Spreadsheets are more powerful than we often give them credit for. I wrote about this a while back.)
Resist the temptation to move forward on data science efforts before you've prepared that BI foundation. People will try to tell you that it's not fancy, but it can still provide valuable insights into your business activities. It's fast and cheap to implement compared to data science, and having a proper BI setup will make those first data scientist hires effective in a shorter time frame.
Are you having trouble building or growing your data science team? I want to help. I can help. Please [contact me](/contact/) to start the discussion.

[1] BI and data science are both tools for analyzing data to improve your business (or, as I like to call it, analyzing data for fun and profit) so I've always found it odd that people try to shun the former while promoting the latter. If something improves your business, do you really care that it's not very sophisticated?

[2] No, really, we talk -- OK, "rant" -- about data quite a bit. Don't worry, though. We promise that we won't release a podcast. It's for your good as well as ours.