Undervalued Practices in ML/AI, Part 4: Project Execution
2021-04-12 | tags: data literacy AI

This post is part of a series on undervalued practices in ML/AI.

13 - Start with BI.

A common -- and unfortunate -- misconception is that since AI gets so much media attention these days, it must be inherently more valuable than BI. Companies therefore reason that they should skip straight to AI. This is, put simply, a bad idea.

If you have never done data analysis, BI is a very important first step. Before you try to build models to predict the future, why not review your past and present? All of those transactions, website visits, and mobile-app taps can tell you a lot about your company. Ad-hoc analyses can challenge key assumptions, while dashboards and reports can keep everyone up to date and on the same page.

Another benefit of BI is that, so long as your data is clean and correct, you can trust BI. Just about everything you see through a BI tool is a fact because you're counting, grouping, and reshaping information about what has already happened. Compare that to ML/AI, where there's inherent uncertainty because you're trying to make predictions.
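To make the "counting, grouping, and reshaping" point concrete, here is a minimal Python sketch of a BI-style summary. The transaction records and the field names (month, channel, amount) are made up for illustration, and the summarize_by helper is hypothetical; a real BI tool or SQL query would do the same kind of work at scale.

```python
from collections import defaultdict

# Hypothetical transaction records; the field names are illustrative only.
transactions = [
    {"month": "2021-01", "channel": "web", "amount": 120.0},
    {"month": "2021-01", "channel": "mobile", "amount": 80.0},
    {"month": "2021-02", "channel": "web", "amount": 200.0},
    {"month": "2021-02", "channel": "web", "amount": 50.0},
]


def summarize_by(records, key):
    """Group records by `key` and return order count and total revenue per group."""
    summary = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for record in records:
        group = summary[record[key]]
        group["orders"] += 1
        group["revenue"] += record["amount"]
    return dict(summary)


# Two views of the same past activity: by month and by channel.
monthly = summarize_by(transactions, "month")
by_channel = summarize_by(transactions, "channel")

for month, stats in sorted(monthly.items()):
    print(month, stats["orders"], stats["revenue"])
```

Every number in that output is a tally of something that already happened. There is no estimate or confidence interval involved, which is the contrast with a predictive model.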

Not only is BI important, but it's a prerequisite to doing AI well. BI will help you build out data infrastructure and test your data before you try to build AI models on it. It will also guide you on what sort of predictive models would even be useful for you.

14 - Develop a solid data infrastructure.

Speaking of which, investing in data infrastructure up-front will pay dividends.

I won't go into too much detail here, as I've covered this in a post on data infrastructure as well as my series on best practices borrowed from algorithmic trading.

The short version is: have the discipline to scope out how you collect, store, and retrieve data. Then, label and review that data. Doing so will make your analysts and data scientists more efficient in their work, which means your company will see better, faster, and more reliable results from its data projects. And at a lower time and dollar cost, to boot.

15 - Get to know your data.

Your data scientists will do their best work when they know more about the data itself. This sounds obvious, but companies sometimes skip this step in their haste to dive into ML/AI.

Having a robust data infrastructure will shorten the time your data scientists spend finding and accessing the data. An up-to-date data dictionary will tell them what each field means, so they won't have to guess.

Your data scientists will also need time to explore that data, look for problems, and spot opportunities. Instead of throwing that new data scientist into running analyses and building models, mark their first couple of weeks as data exploration time. They'll thank you, and you'll thank yourself.
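As a rough idea of what a first pass at data exploration might look like, here is a small Python sketch that checks a dataset against a data dictionary. Everything in it is an assumption for illustration: the field names, the sample rows, and the profile helper are hypothetical, and real exploration goes well beyond counting gaps.

```python
# Hypothetical data dictionary: field name -> what the field means.
data_dictionary = {
    "customer_id": "Unique customer identifier",
    "signup_date": "Date the account was created (YYYY-MM-DD)",
    "plan": "Subscription tier",
}

# Hypothetical sample rows, as they might come back from a query.
rows = [
    {"customer_id": 1, "signup_date": "2020-03-01", "plan": "basic", "ref_code": "A1"},
    {"customer_id": 2, "signup_date": None, "plan": "pro", "ref_code": None},
    {"customer_id": 3, "signup_date": "2021-01-15", "plan": None, "ref_code": None},
]


def profile(rows, dictionary):
    """Count missing values per field and list fields the dictionary doesn't cover."""
    fields = sorted({name for row in rows for name in row})
    missing = {
        name: sum(1 for row in rows if row.get(name) is None) for name in fields
    }
    undocumented = [name for name in fields if name not in dictionary]
    return {"missing": missing, "undocumented": undocumented}


report = profile(rows, data_dictionary)
print(report["missing"])
print(report["undocumented"])
```

Here the report flags ref_code as a field nobody has documented, along with the gaps in signup_date and plan. Those are the kinds of questions you want raised in week one, not midway through a modeling effort.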

We can tie this back to a point from the previous post, about embedding data scientists in product teams. If the data scientists know the data inside-out, they can spot problems before you get too far down the road on a modeling effort.

Imagine a data scientist reviewing a product idea, and pointing out that your data won't support that out of the box. "We'd need to do XYZ, which would probably cost an extra $10k to start and maybe another $80k over the year. What if instead we try ..."

Would you rather hear this early on in the product discussions, or long after the fact, when the product team has signed off on the idea and kicked it over the proverbial wall to the data scientists to implement their part? If your data scientists already know your data, you stand a much better chance of living the former scenario, which gives you ample time to course-correct.

16 - Don't just automate; innovate.

Automation can be tremendously valuable. It allows a business to offload dull, repetitive work to machines that can perform their duties around the clock. Those machines can quickly scale up or down according to demand, which is tougher to do with teams of people.

Whereas application development achieves automation by translating business rules into code, ML/AI models automate fuzzier decisions that rely on inference: "Is this transaction legitimate?" "What kind of document is this?" "Does this image contain the desired object?" (This is why I sometimes refer to AI-based automation as "software++.")

Automation is a matter of improving your existing business procedures. You can also use AI to find new ways of doing business. Under the right circumstances, ML/AI can drive this innovation in your company just as well as it drives automation.

Innovation is understandably more difficult to come by than automation. Still, it can be a real game-changer. If you're only using ML/AI to automate decisions, and not to innovate, you're missing out.

(My post on "Automation versus Innovation in ML/AI" explores this in greater detail.)