Roles on Your Analytics Team
2015-02-10 | tags: AI employment

(This is the second post in the "I get this question a lot" series. Feel free to check out the first one, "Do I Need a Hadoop Cluster?")

"Data scientist" is supposed to be the sexy job of the century, and "data science" -- or Big Data, or analytics, or whatever term you prefer -- is supposed to advance every business by leaps and bounds. There's truth in both statements, but I sometimes meet people who insist on going about this the hard way. Two key (and closely related) mistakes companies make are to assume they need only data scientists, and to try to hire several data scientists before there's enough work for them to do.

These problems are two sides of the same coin, and you can solve them both by understanding the various roles of a data science team. In the same way an operating theatre needs more than surgeons, and a trading firm needs more than traders, a successful data science effort needs more than data scientists. A proper analytics team involves several roles, each with different responsibilities.

All the roles you'll need

What, then, are the roles you need for a well-rounded data science team and successful analytics efforts?

Everyone here works to put data to use for the Customer, who represents some business function or business unit (or even several business units). While the Customer is not part of the analytics team, per se, their needs ultimately determine why the company even needs an analytics team and what that team will do.

If you've been counting along with me, the Data Scientist is one of seven key roles and sits in the middle of this list. Data scientists are responsible to the roles ahead of them in the list, and are customers of the roles that follow them.

That last point bears special mention. It's why the list of roles includes your IT staff. Your first forays into data may be small, isolated affairs that have minimal interaction with the rest of your IT staff; but over time, expect your data efforts to become regular patrons of your existing infrastructure and tap into your app/dev stack.

Mapping those roles to people

Seven roles sounds like quite a menagerie. Some might say that is simply too many people for a company's first analytics project. Maybe. But remember, these are roles, not people. (Likely, though, "IT Staff" already exists as a group of people.) In the early proof-of-concept stages one person will likely take multiple roles. You simply won't have enough data work to engage seven people on a full-time basis. Also, taking such a lightweight approach will permit you to move quickly and nimbly, both of which are key elements of proof-of-concept projects.

Take care in how you assign these roles to people. Consider business need, but also align according to skills and incentives. I've seen it done the other way -- the Data Scientist who spends most of their time playing Data Engineer, to the point they can't do the analysis work; the company that wants a Data Scientist who will also manage the production-grade Hadoop cluster -- and it's a recipe for unhappiness.

Most of all, prepare for growth: make it easy for someone to spin out one of their roles to another person when the time comes.

Let's try an example

Let's say your company has identified a Customer's need and is ready to explore its first data effort. You'll probably go far on a single, well-rounded Data Scientist and a Champion/Sponsor, plus a little help from the in-house IT Staff. When I say "well-rounded" Data Scientist, I emphasize someone who has the skills to fulfill the Data Engineer role and handle some minor Tool Admin responsibilities. They can split the Data Lead role with the Champion/Sponsor.

Expect the company to quickly develop an appetite for data. This will saturate the first Data Scientist with work, and it may be tempting to simply hire more Data Scientists to handle the workload. Not so fast! See whether your lone Data Scientist has been filling any of these other roles, and bring in new people accordingly.

Most likely, you'll first need to spin out the Data Engineer work from the Data Scientist. You'll also need more involvement from your IT staff, both the ops crew to set up regular data exports and your developers to add new data collection taps in your homegrown apps. Also, over time, the Tool Admin duties will spin out from their existing owner (be it the first Data Engineer or the IT ops crew) to a separate person. Finally, the first Data Scientist would ideally have leadership potential such that they could manage the analytics team as it grows.

Moving forward

It can be exciting to launch data science efforts in your company. Just remember that you'll need more than data scientists to make this happen, and you'll need to grow the team in the proper order. Build a well-rounded team, both in terms of technical roles and leadership abilities, to improve your chances of success.

I offer consulting services on just this matter -- everything from data strategy to serving as interim analytics lead -- and would be keen to hear how we could work together. Contact me to get started.


Many thanks to Ken Gleason, Tim Knight, and Joshua Ulrich for reviewing this article.
