What is a data strategy, and why do I need one?

Are you planning your firm's first data project? Whether you call it Big Data, data science, analytics, or business intelligence, you're likely under pressure to do something with data because it's supposed to help your company gain a competitive advantage.

There is indeed some truth to the idea that data can be used to drive decisions and build products that increase profits, reduce costs, or reduce risks. There is also a lot of work between the idea and the end result. Even considering the new opportunities for reward, building an analytics team and running data projects are time-consuming, expensive affairs. Adding to the pressure, your first steps strongly influence your direction in the long run. You'd do well to think it through.

So, then, what's your plan?

You do have one, don't you?

In some companies I've encountered, the extent of the "plan" is to hire a couple of data scientists and turn them loose. Or buy some expensive data-crunching hardware. And maybe not even in that order.

If you expect to be in this for the long haul, if you want to improve your chances of a smooth success, you're going to need a plan. That plan is called a data strategy.

the what

A data strategy is a road map for your company's data efforts. It describes what to do and when to do it, based on your business model, present state, goals, and concerns. Most of all, it identifies the purpose: how and why your company will put data analysis to use.

On closer look, it:

Like any plan, a data strategy is full of forward-looking statements. For those who question the value of such a speculative document, ask yourself: would you rather have a plan that falls apart, or no plan at all? At least in the former case, you know when you're off-course.

Now that you know what goes into a data strategy, what does it look like? The final product matters less than the journey: it can be a rough outline in a shared document, or a formal report, or anything in between. It may be a lengthy affair, or it may sketch out just one or two proof-of-concept projects. The most important part is that you've thought through where you want to go and why it makes sense to go there, and that you have at least an initial idea of how to get there.

the why

Simply put: drawing a road map of data efforts means you'll know what to expect, and when. You stand a much better chance of avoiding costly mistakes, ranging from technical problems ("we don't yet have a way to collect the data we'll need"), to staffing problems ("we've just hired a data scientist, but we're six months away from having meaningful data for them to analyze"), to legal or PR problems ("it turns out, our use of data runs afoul of local laws").

The point on staffing is especially relevant, since so many companies complain about how difficult it is to hire data scientists. Experienced data scientists tell me that they can tell when a company doesn't have a plan, and that they refuse to work in such companies. To skip the data strategy, then, is to sabotage your own hiring efforts. Data scientists tend to be motivated by tangible progress toward a goal, and they find the absence of such progress demotivating, even disorienting.

If you turn this around, developing a strategy means your first data scientists can hit the ground running. They will appreciate landing in a shop that has a solid understanding of what's possible with data, and has already lined up the first projects to tackle.

(As a side note: I have a hunch that the supply/demand imbalance here is rooted in companies overestimating their need for data scientists, which is in turn rooted in a lack of a solid data strategy... but I digress...)

The most important reason to develop a data strategy up-front is that you're going to do it anyway. You may not call it "developing our data strategy," true. You may instead know it as "getting halfway through a project before realizing we keep getting lost," or "I wish we'd known that we were nowhere near ready for this," or "we need to lay off some people because we overstaffed," or even "wow, we burned a ton of money and have nothing to show for it." By that point you are operating under time pressure, so what are the chances you will do it well? Better to map it out early, as a separate exercise, when you have time to think.

the how

In short, the process to develop your data strategy is: explore how to use data in your company in ways relevant to your goals, then line up the list of projects to execute. If you don't have the necessary data expertise in-house, you'll want to work with an outside consultant to guide you.

The slightly longer version:

First, explore your business model. What does your company want to achieve? What are your business goals? From there, get a handle on how data analysis could improve your business and help you to reach those goals. You'll want to express those goals in terms of questions, such as: "can we segment our user base in a way that is useful for marketing efforts?"

Next, you'll need to determine what data you'll need to answer those questions. Do you already have it in-house? Can you buy it from a third party? If you can't get it directly, can you approximate it with data you can access? While checking for sources, consider unlikely candidates such as your homegrown applications or your IT infrastructure: they can generate a lot of useful data, but it's possible you don't actively collect it. (For example, maybe you throw away your web server logs after just a couple of days.)

Also -- and this is a very important step that some companies skip -- decide what data not to collect, to spare yourself some trouble later on. This isn't just for companies that are subject to HIPAA, PCI, and other formal regulations. Even a social network can find itself in hot water if, say, end-users find out you've been surreptitiously tracking their location or copying their phone's address book.

For data you already have in-house, perform an assessment so you'll understand whether it's up to snuff. Is it patchy? Are there lots of weird values? Are the collection mechanisms unreliable? These questions all amount to, "is this data really suitable for our purposes?"
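As a rough illustration of such an assessment, here is a minimal Python sketch that counts missing and implausible values in a single field. The names `ColumnReport` and `assess_column`, the sample ages, and the plausible range of 0 to 120 are all assumptions made up for this example; a real assessment would cover many more checks.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ColumnReport:
    total: int
    missing: int
    out_of_range: int

    @property
    def missing_fraction(self) -> float:
        # Share of records with no value at all ("is it patchy?").
        return self.missing / self.total if self.total else 0.0


def assess_column(values: Sequence[Optional[float]], low: float, high: float) -> ColumnReport:
    """Count missing values and values outside the plausible range [low, high]."""
    missing = sum(1 for v in values if v is None)
    # "Weird values": present, but outside what the business considers plausible.
    out_of_range = sum(1 for v in values if v is not None and not (low <= v <= high))
    return ColumnReport(total=len(values), missing=missing, out_of_range=out_of_range)


# Hypothetical sample: customer ages pulled from an in-house system.
ages = [34, None, 27, -1, 45, None, 230, 52]
report = assess_column(ages, low=0, high=120)
print(report, f"missing fraction: {report.missing_fraction:.2f}")
```

If a quarter of a field is missing, or a noticeable share of it is implausible, that is a signal to look at the collection mechanism before planning any analysis on top of it.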

With all of this in hand, you're ready to develop a list of projects. For each project, you'll note the questions to answer, the business value of having those answers, the required data, and the specific technical skills and infrastructure you'll need to succeed. Except in very rare cases, your first projects will lay out infrastructure, collection mechanisms, and data storage. After that, you'll execute projects to analyze the data. Finally, you'll act on the results of each analysis: you could use them to make a decision ("offer certain perks to retain our best customers") or operationalize your findings ("feed these new features into the website's recommendation engine").
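One possible way to keep such a project list is as a small set of structured records. The sketch below is only an illustration: the `DataProject` fields mirror the items named above, while the `Phase` ordering, the `execution_order` helper, and the sample projects are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Phase(IntEnum):
    # Assumed ordering: infrastructure first, then analysis, then acting on results.
    INFRASTRUCTURE = 1
    ANALYSIS = 2
    ACTION = 3


@dataclass
class DataProject:
    name: str
    phase: Phase
    questions: List[str]
    business_value: str
    required_data: List[str]
    skills_and_infrastructure: List[str] = field(default_factory=list)


def execution_order(projects: List[DataProject]) -> List[DataProject]:
    # sorted() is stable, so projects within a phase keep their listed order.
    return sorted(projects, key=lambda p: p.phase)


# Hypothetical road map, deliberately listed out of order.
roadmap = [
    DataProject("segment users", Phase.ANALYSIS,
                ["Can we segment our user base in a way that is useful for marketing?"],
                "better-targeted campaigns", ["web server logs", "purchase history"],
                ["clustering", "SQL"]),
    DataProject("retain web server logs", Phase.INFRASTRUCTURE,
                [], "makes the segmentation analysis possible", ["web server logs"],
                ["log shipping", "storage"]),
    DataProject("offer perks to best customers", Phase.ACTION,
                [], "customer retention", ["segment assignments"]),
]

ordered_names = [p.name for p in execution_order(roadmap)]
print(ordered_names)
```

A shared document or spreadsheet with the same columns would serve just as well; the point is that every project carries its questions, value, data, and skill requirements with it.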

When lining up projects, be sure to compare the cost of implementation against the expected value of having the answers. It's entirely possible for a data project's price tag to far exceed its value, and developing a plan will help to expose costs ahead of time. You needn't even have precise figures here; being within an order of magnitude on your estimates will still yield insight.
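To show how far rough numbers can go, here is a small Python sketch that compares cost and value as ranges rather than precise figures. The `Estimate` class, the `compare_cost_to_value` function, the three-way verdict, and every dollar amount are assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A rough range; order-of-magnitude precision is enough here."""
    low: float
    high: float


def compare_cost_to_value(cost: Estimate, value: Estimate) -> str:
    # Value clears cost even in the pessimistic case.
    if value.low > cost.high:
        return "likely worth doing"
    # Cost exceeds value even in the optimistic case.
    if value.high < cost.low:
        return "likely not worth doing"
    # The ranges overlap, so rough estimates cannot settle it.
    return "unclear, refine the estimates"


# All figures below are invented for illustration.
projects = {
    "user segmentation": (Estimate(50_000, 150_000), Estimate(500_000, 2_000_000)),
    "recommendation features": (Estimate(400_000, 1_000_000), Estimate(20_000, 100_000)),
    "churn forecast": (Estimate(100_000, 300_000), Estimate(200_000, 800_000)),
}

verdicts = {name: compare_cost_to_value(cost, value) for name, (cost, value) in projects.items()}
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```

Only the overlapping case calls for sharper estimates. The other two are decided even when each figure could be off by a wide margin.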

Working through these steps can be daunting when you don't have a lot of data experience on your team. If you're stuck, or if you'd prefer an outside view, talk to me and I can help you. Depending on the size of your company, your available data, and a few other factors, developing a strategy can take as little as a few days.

what next?

As the line goes, "No plan survives first contact with the enemy," but you still need a plan. Given the price tag associated with data scientists, tools, and failed projects, it's not worth diving into data projects without first thinking them through.

Resist the pressure from leadership, teammates, or even the business press. Develop a solid data strategy to improve your chances of success in your data efforts.

Are you ready to develop your company's data strategy? Contact me to get started.

Many thanks to Bobby Norton and others for reviewing this article.
