Not All Datasets Are Created Equal
2022-04-11 | tags: risk AI

Not all of your datasets are created equal.

Some datasets are of higher quality than others, true. Maybe you've cleaned them up, or there were stronger checks in place to ensure consistency at the point of collection.

And sure, some datasets have proven more useful to you. Say, portions of your CRM that you've turned into successful marketing campaigns. Or data that has improved your retail site's recommendation engine.

Some of these datasets may have even greater direct revenue potential, because you've been able to sell them or to sell data products derived from them.

But I'm talking about something different: those datasets that carry greater risk than others.

These are the datasets that are more likely to get you in trouble. A change in data privacy laws can suddenly put you on the wrong side of regulators. Or lax security standards can lead to a leak of sensitive data, which will certainly cause a PR headache. Doubly so if you'd never disclosed that you were collecting that data in the first place.

These risky datasets are making money for you today. But they may cost you money down the line.

Looking for trouble

It may help to answer some questions as you review your datasets:

Your answers to these questions will help you rate each dataset according to the risk it carries. From there, you can compare the risk of holding or using that data to the value or revenue it generates through related business processes and ML models. You may find that some datasets aren't worth the trouble.
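To make that rating step concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `DatasetReview` class, the dataset names, the dollar figures, and the three yes/no questions (loosely based on the privacy, security, and disclosure concerns mentioned earlier) are placeholders, not an official checklist. The score is simply the share of questions answered "yes"; a real review would likely weight some questions more heavily than others.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetReview:
    """One dataset, its estimated value, and answers to yes/no risk questions."""
    name: str
    annual_value: float  # hypothetical estimate of revenue tied to this dataset
    risk_answers: dict = field(default_factory=dict)  # question -> True ("yes") / False

    def risk_score(self) -> float:
        """Share of risk questions answered 'yes', from 0.0 to 1.0."""
        if not self.risk_answers:
            return 0.0
        return sum(self.risk_answers.values()) / len(self.risk_answers)


# Placeholder datasets and placeholder questions, for illustration only.
reviews = [
    DatasetReview("crm_contacts", 120_000.0, {
        "affected_by_privacy_law": True,
        "weak_security_controls": False,
        "collection_undisclosed": False,
    }),
    DatasetReview("site_clickstream", 40_000.0, {
        "affected_by_privacy_law": True,
        "weak_security_controls": True,
        "collection_undisclosed": True,
    }),
    DatasetReview("product_catalog", 80_000.0, {
        "affected_by_privacy_law": False,
        "weak_security_controls": False,
        "collection_undisclosed": False,
    }),
]

# Riskiest datasets first, so risk and value can be read side by side.
ranked = sorted(reviews, key=lambda r: r.risk_score(), reverse=True)
for r in ranked:
    print(f"{r.name}: risk={r.risk_score():.2f}, value={r.annual_value:,.0f}")
```

In this made-up example, the clickstream data lands at the top of the risk ranking while carrying the lowest value, which is the kind of dataset that may not be worth the trouble.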

Not all data is good data

When you're building or acquiring a new dataset, be sure to rate it in terms of its risk/reward tradeoff. Data that requires extra protection, or that may cause you trouble in the long run, "costs" more (ergo, is worth less) than it may appear on the surface.
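One rough way to put a number on that "costs more, is worth less" idea is to subtract the cost of protecting the data and the expected cost of an incident from the value the data brings in. The sketch below assumes a simple expected-loss model (probability times cost); the function name, its parameters, and the figures are all hypothetical, and the formula is one possible simplification, not a standard valuation method.

```python
def risk_adjusted_value(annual_value: float,
                        protection_cost: float,
                        incident_probability: float,
                        incident_cost: float) -> float:
    """Value left after protection costs and the expected cost of an incident.

    All inputs are estimates for the same time period (say, one year).
    """
    if not 0.0 <= incident_probability <= 1.0:
        raise ValueError("incident_probability must be between 0 and 1")
    # Expected loss: chance of an incident (leak, fine, ...) times what it would cost.
    expected_loss = incident_probability * incident_cost
    return annual_value - protection_cost - expected_loss


# Hypothetical figures: a dataset that looks like it is worth 100,000 per year
# on the surface, but needs 20,000 of extra protection and carries a 25% chance
# of a 200,000 incident.
net = risk_adjusted_value(
    annual_value=100_000.0,
    protection_cost=20_000.0,
    incident_probability=0.25,
    incident_cost=200_000.0,
)
print(f"Risk-adjusted value: {net:,.0f}")
```

With these made-up inputs, most of the surface value disappears once protection and expected loss are counted. The hard part in practice is estimating the probability and the cost of an incident, not the arithmetic.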

I'll dig deeper into data valuation approaches in a future post.
