Data Ethics for Leaders: A Risk Approach (Part 5)

This post is part of a series on approaching data ethics through the lens of risk: exposure, assessment, consequences, and mitigation.


Over the past three posts -- Part 2, Part 3, and Part 4 -- we've explored questions that frame a risk assessment and uncover ethical problems in how your company collects and analyzes data. Your next step is to devise a mitigation strategy that details how you would address those problems.

To mitigate those risks, you can either modify (or outright cancel) your plans for data projects in order to avoid trouble, or you can stay the course and prepare yourself to handle the inevitable fallout. The ideas in this post focus on preventative medicine for companies in the former category.

Most data ethics problems are rooted in someone being surprised. Either a company has surprised people ("hey, we've done something dodgy with data about you!") or people have surprised a company ("we thought this idea was OK, but everyone thinks it's creepy!"). Surprises are bad for business because they are a distraction and they introduce unwelcome uncertainty. If you have to mount an emergency PR campaign and call in your legal team to handle a problem, that surprise has turned into an expensive distraction.

To cut down on surprises to your business, cut down on surprises that involve your customers, end-users, and even the law.

Honesty is a good policy ...

The first way to prevent a surprise is to let people know what you're doing with data about them. I mean, let them really know what you're doing. People get upset when they feel they've been tricked, and that usually happens when you tell them less-than-truths about what you're up to.

Do you plan to sell their personal data to advertisers? Tell them. Will you treat someone's personal photos as training data for your image recognition models? Tell them. The list goes on and on.

When you practice this kind of informed consent, people who interact with you will know precisely what you're doing with data about them. They will have a clear path to not engage with you if they are uncomfortable.

Publishing a privacy policy helps, but only if it is clear and concise. It doesn't count if you're hiding your data efforts in reams of vague legal-speak.

... but purity is better

You can further avoid surprises by employing what I call a pure business model: you only collect and use data for the sake of providing the service. That data stays between you and the customer, and you don't have a side-business of selling that data to others.

Let's take hotels as an example. A hotel is in the business of accepting money in exchange for rooms. It, understandably, has to collect names, billing details, and travel dates in order to see whether rooms are available and to collect payment for a stay. In a pure business model, the hotel wouldn't then turn around and sell that guest information to advertisers.

Running a pure business model is straightforward for the hotel because the bookings, not the data, are its primary source of revenue. It can therefore decide to not sell (or stop selling) its guest list but still remain in operation.

This is tougher for a company that was built for the express purpose of collecting and reselling data. With these "data honeypot" business models, the only choices are to rewrite the entire business model, shut down altogether, or remain firmly in the "wait till we get into trouble and deal with it then" camp.

How well do you know your vendors?

Even if you have committed to a pure business model, your service providers may have other plans.

When you use a third-party service for accounting, payroll, sending reminders, or event registration, who else gets to use that data? Is that service using data provided by your customers and your employees to build their business?

This unintended loophole can harm your reputation. You can claim to have a pure business model, but customers will eventually figure out that you've (indirectly) shared their information. Expect them to complain to you, not to your third-party service. And don't expect them to be too sympathetic.

To mitigate the risk of a service provider (mis)using your customers' details, have your legal team thoroughly review those privacy policies and note any opportunities for misconduct.

Thoroughly "red-team" your ideas

It's bad enough when you know that you're taking a risk and then you get in trouble. It's far worse when you have good intentions but someone else has other plans.

Bad actors are constantly looking for ways to misuse otherwise-innocuous services. You can uncover potential problems and devise solutions by performing a red-team exercise on your data projects.

We explained red-teaming in Part 3 but, as a recap: put yourself in the bad guys' shoes and ask yourself how you would (mis)use your company's service or product to cause harm. Then, ask yourself how you could thwart those attempts but still retain the core business idea.

It can be unpleasant to modify or scrap a plan because of something you uncover while red-teaming. That's still far better than making front-page news because someone else found and exploited the problem first.

Default to private

Running a red-team exercise costs time, effort, and money. It can also run into overtime when a seemingly innocuous idea spirals off into innumerable vectors of misuse.

A prime example is when you want to display a person's data in a public or semi-public arena. You may have the most innocent intentions in doing so, such as showing users of your platform who else lives nearby, or displaying the profiles of high-scoring users in your learning app.

The best way to stay out of trouble in this case is to ... not display the data by default. There are so many ways to misuse publicly-available data, and it can be difficult to uncover them -- even in a red-teaming exercise -- unless you have had a lot of interaction with bad actors.

You can always let your users opt-in to displaying their information on your platform. (I emphasize opt-in and not opt-out.) When you default to displaying information, you open yourself up to misuse and the negative press spotlight that follows. You also lose the trust of your users, who have to keep wondering what's the next foolish thing you're going to do with their information.
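To make the opt-in principle concrete, here is a minimal sketch (in Python, with hypothetical field names) of what "default to private" can look like in a profile-settings model: every visibility flag starts out off, and nothing is displayed until the user flips it on themselves.

    from dataclasses import dataclass

    # Hypothetical visibility settings for a platform like the ones described above.
    # Every flag defaults to False: nothing is shown publicly until the user opts in.
    @dataclass
    class ProfileVisibility:
        show_on_leaderboard: bool = False   # e.g. high-scoring users in a learning app
        show_to_nearby_users: bool = False  # e.g. "who else lives nearby"
        show_real_name: bool = False

    def visible_fields(settings: ProfileVisibility) -> list[str]:
        """Return only the fields the user has explicitly opted in to display."""
        return [name for name, enabled in vars(settings).items() if enabled]

    # A brand-new user displays nothing until they change a setting themselves.
    print(visible_fields(ProfileVisibility()))  # -> []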

In conclusion ...

Congratulations on reaching the end of this series. You are now prepared to kick off a data ethics risk assessment in your company. By uncovering problems before they grow out of control, you will improve customers' trust in your services and spare yourself the unwanted press spotlight.

(This post is based on materials for my workshop, Data Ethics for Leaders: A Risk Approach. Please contact me to deliver this workshop in your company.)
