This article is part of a series. In Part 1, I outlined the premise: ML/AI shops can borrow tips and best practices from how algorithmic trading ("algo trading") shops operate. The rest of the articles explore those ideas in more detail.
No business is immune to risk, yet people in the ML/AI world seem reluctant to talk about it. That's unfortunate, as many of their businesses are ill-prepared should those risks materialize. By comparison, risk is a very mature subject in the field of algorithmic ("algo") trading -- it has to be, because there's so much money at stake -- so learning from traders' best practices can help us develop risk awareness for the ML/AI field.
A risk is a potential change that will carry consequences, and exposure reflects how much you'll suffer those consequences should that risk become a reality. You identify and handle risks by asking "what if?" and "what next?" [1] questions, respectively, in relation to your business objective and constraints.
Risk is always top of mind for algo traders, notably the financial risk of losing money due to how they chose to buy, sell, or hold shares of a certain stock. A common "what if?" scenario for traders, then, is: "What if this share price moves up or down? And what if that's a sharp, sudden movement?" If the price jumps before they're able to buy, or if the price sinks after they've bought, they lose money. [2] The traders' exposure to that risk, then, is how much of that stock they wanted to buy (as part of a wider trading strategy) or how much they already hold and were planning to sell at a profit.
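To make "exposure" concrete, here's a toy calculation of my own (the numbers and the Python sketch are invented for illustration, not taken from any trading system):

```python
# Toy illustration: quantifying exposure to an adverse price move.
# All figures are hypothetical.

position = 10_000        # shares currently held
entry_price = 52.00      # dollars paid per share
current_price = 49.75    # price after the market moves against us

# The unrealized loss if the trader had to sell right now (see footnote 2).
unrealized_pnl = position * (current_price - entry_price)
print(f"Unrealized P&L: ${unrealized_pnl:,.2f}")  # Unrealized P&L: $-22,500.00
```

The larger the position, the larger the exposure -- which is why risk limits so often cap position size.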
Once traders identify their exposure to a financial risk, they can determine how to handle it: avoid the trade altogether, reduce their exposure (say, by buying fewer shares, or by placing a stop-loss order [3]), transfer the risk to another party, or simply accept it and move on.
Price movements are an example of financial risk. Traders also consider "what if?" questions around operational risks, which involve the mechanics of the business operation. "What if we've detected a problem in our market data feeds?" or "What if we've stopped receiving market data altogether?" In that case it's unsafe both to keep trading ("we'll assume prices haven't moved") and to stop trading ("we'll halt trading for now"), since either path exposes the traders to the kind of financial risk mentioned above.
(This underscores why traders need to monitor their data feeds for problems, monitor their models for signs that they've gone awry, and enable human oversight in case models misbehave.)
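As a rough sketch of what such a feed monitor might look like (this is my own minimal example; the feed interface and the 5-second threshold are assumptions):

```python
# Minimal watchdog sketch: flag a market data feed that has gone quiet.
import time

MAX_SILENCE_SECONDS = 5.0  # how long we tolerate no fresh ticks

def check_feed(last_tick_timestamp: float) -> str:
    """Classify the feed based on how stale its most recent tick is."""
    silence = time.time() - last_tick_timestamp
    if silence > MAX_SILENCE_SECONDS:
        # Don't trust stale prices: pause trading and page a human.
        return "halt"
    return "ok"
```

The same pattern -- watch for staleness, then escalate to a person -- applies just as well to the data pipelines feeding an ML/AI model.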
Traders who understand and manage their risk tend to come out ahead. Those who don't, well, they don't trade very long.
Risk-handling holds such a prominent place in the trading world that every shop develops risk practices. A sufficiently large operation even has a dedicated risk management office, whose job is to make sure that the firm doesn't collapse due to trades gone awry. For example, the risk management office may set rules that limit how much money a given trader is allowed to invest in a given sector.
In short, the traders' risk management flow is to acknowledge that a risk exists, determine its impact, and then figure out what to do about it. That first step is critical. A trader's plan is a hope, not a promise from the universe, so they have to find ways to manage that risk if they want to stay in business. By comparison, plenty of ML/AI shops like to pretend that nothing can go wrong. That only delays the inevitable.
Few ML/AI shops are in a position to quantify their risk as easily as traders can. But you should still take the time to perform a risk assessment to identify your exposures. Too often, people fall into the trap of only considering their preferred, ideal path. My colleague Joshua Ulrich says it best: "Good risk management requires thinking about how you may deviate from that path, and measuring how far you have to stray before you need to worry."
To begin your risk assessment, consider the three main types of risk every AI shop faces: data source risk, ML/AI model risk, and business model risk.
1 - Data source risk: Building ML/AI models requires training data, which exposes you to a variety of risks around how you acquire that data.
Who provides your data? If you collect it yourself, do you have monitors to confirm that your collection processes are still running and that the data is being stored properly? Maybe you pull that data from an external vendor. Do you have a relationship with a backup vendor, in case your main source suddenly goes out of business? And maybe your "vendor" is really a site you're scraping. What's your plan for when they get wise to what you're doing and disable your access? (Or, worse yet, if they quietly feed you bad data?)
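One common mitigation is to line up that backup source ahead of time and fail over automatically. Here's a minimal sketch, assuming hypothetical vendor endpoints (the URLs and the empty-payload check are placeholders, not a real API):

```python
# Sketch: try the primary data vendor first, fall back to the backup.
import json
import urllib.request

PRIMARY_URL = "https://primary-vendor.example.com/data"  # placeholder
BACKUP_URL = "https://backup-vendor.example.com/data"    # placeholder

def fetch_records() -> list:
    """Return records from the first vendor that responds sensibly."""
    for url in (PRIMARY_URL, BACKUP_URL):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                records = json.load(resp)
            if records:  # guard against an empty (or quietly bad) payload
                return records
        except Exception:
            continue  # in real life: log the failure before moving on
    raise RuntimeError("All data sources failed; alert a human.")
```

A real version would also validate the payload's schema, since a vendor that quietly feeds you bad data is just as dangerous as one that disappears.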
It's common to use third-party data labeling services these days. That kind of data enrichment is another flavor of data acquisition: you are indeed receiving data -- your data, plus some new fields -- from someone else. Your business is exposed to the risk of that vendor closing down, being bought out by one of your competitors, or even becoming a competitor if they are using or reselling your data on the side.
If you don't keep an eye on your data source risks, you could quickly lose your company's ML/AI function. And if ML/AI is not just a department but your entire company, then losing access to data could mean shutting down altogether.
2 - ML/AI model risk: Every model is wrong some of the time. How much does each wrong answer cost you? And does that cost accrue in a purely linear fashion? Or is there some point past which the model's wrong answers push you into far worse territory, and even trigger a cascade? (Think beyond pure dollar cost. Also consider customer service headaches and PR fiascos.) Can the model ever be so wrong, so often, that it puts you out of business?
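As a back-of-the-envelope illustration of that nonlinearity (my own, with invented figures), you can model a small per-error cost plus a one-time cascade penalty once errors pile up:

```python
# Toy cost model: each wrong answer has a small fixed cost, but past some
# threshold a cascade (refunds, support load, PR cleanup) adds a large
# one-time penalty. All figures are invented.

COST_PER_ERROR = 12.50       # dollars per individual wrong answer
CASCADE_THRESHOLD = 1_000    # monthly errors we can absorb quietly
CASCADE_PENALTY = 250_000.0  # cost of the resulting fiasco

def monthly_error_cost(n_errors: int) -> float:
    cost = n_errors * COST_PER_ERROR
    if n_errors > CASCADE_THRESHOLD:
        cost += CASCADE_PENALTY
    return cost

print(monthly_error_cost(900))    # 11250.0  -- linear territory
print(monthly_error_cost(1_100))  # 263750.0 -- cascade territory
```

The exact shape of the curve matters less than knowing where your own threshold sits.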
You can identify and address a lot of model risk by applying human experience: on the planning side, you get enough people involved to see where the model might go awry and then proactively build controls to limit its impact. Then, after deployment, you make sure that you have enough humans around to override the model when it produces bad decisions. (I explore this in detail in "Monitor Your Models" and "Provide Padding Around Your ML/AI Models".)
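One simple form of that padding is a confidence gate: the model acts on its own only when it's sufficiently sure, and everything else lands in a human review queue. A minimal sketch, assuming a hypothetical model interface and an arbitrary threshold:

```python
# Sketch: route low-confidence predictions to a human instead of acting
# on them automatically. The 0.90 floor is an assumption for illustration.

CONFIDENCE_FLOOR = 0.90

def decide(prediction: str, confidence: float) -> str:
    """Act on confident predictions; escalate everything else."""
    if confidence >= CONFIDENCE_FLOOR:
        return f"auto: {prediction}"
    return "escalate: send to human review queue"

print(decide("approve", 0.97))  # auto: approve
print(decide("approve", 0.62))  # escalate: send to human review queue
```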
If you sort out these details early on, during the planning and model development stages, you may even determine that it's too risky to deploy such a model to begin with. That's a tough sell to the leadership team -- no one likes having to scrap their plans -- but it's better to tell them "we shouldn't do this" ex ante than to let them find out the hard way ex post.
3 - Business model risk: Even if your ML/AI models exhibit solid performance, how your company makes money may get you in trouble. A number of data business models involve the (often surreptitious) collection, analysis, and distribution of information about people.
Today, what you're doing is completely legal and widely (albeit begrudgingly) accepted by the public. What about tomorrow, or next week? Similar to how traders ask "what if?" questions about movements in share prices, how often do you ask questions about movements in laws or public sentiment? If your business model walks that line, then all it takes is a small shift to put you on the wrong side of it.
Risk mitigation for your business model can be a complicated affair. The most common practice, it seems, is to play the risk-acceptance card and handle the problem if and when it arrives. A more proactive approach would be to hedge -- positioning yourself for a quick shift to a new business model, or an orderly shutdown, when the time comes -- or, frankly, to not build a shady data business to begin with.
Your company may not need to establish a formal risk office just yet, but it still helps to work through possible issues before they arise. Follow the algo traders' example and start by acknowledging that risks exist. Check for exposures around your data, your ML/AI models, and your business model to prevent a surprise from throwing you off-track.
Many thanks to Joshua Ulrich and other colleagues for reviewing this post and providing feedback. Any errors that remain are mine.
Part of why the "what next?" question is so important is that risky events can cascade: one effect leads to a new risk, which leads to a new effect, and so on. ↩︎
Technically, they only lose money in this case if they really need to buy/sell at that moment. Until that point, it's an unrealized loss. Traders can hold off on buying/selling until the price moves in a more favorable direction. Also, the mitigation strategy here would be different for the buy versus the sell case, since the latter would be more a matter of opportunity cost. ↩︎
Traders can automate this by placing a stop-loss order, which will sell the shares when the price falls below a certain threshold. They can also program their execution algos to watch the price and take action accordingly. ↩︎