
The competitive edge of ethical AI

Fri 28 Jun 2019 | Michael Natusch

Responsible use of AI is poised to become a key business differentiator. Techerati spoke to Michael Natusch, global head of AI at Prudential, to learn about the benefits that come with responsible AI development

For smaller companies seeking to replicate the successes that tech giants have had with deep learning, it can be tempting to emulate Google, Facebook and co’s approach to training neural networks – feeding models with untold amounts of raw data. Lacking access to the billions of users from whom these giants extract data, building image databases by scraping may seem like the only option for companies wishing to join the AI brigade.

But why chart this ethically questionable course, especially amid growing awareness of data privacy issues? Michael Natusch, global head of AI at Prudential, says obtaining data with meaningful consent is a differentiator that brings in more clients and improves business outcomes.

Data sourcing

Assembling image sets is the first stage in training neural networks. Organisations can either scrape this data from public sources online, without the consent of the individuals to whom the data relates, or source it first-hand from individuals who provide active and meaningful consent.

Three common perceptions combine in a perfect storm to encourage organisations to take the first route: first, the received wisdom that the accuracy of a model is proportional to the amount of data it is fed during the training phase; second, that it is too time-consuming and costly for organisations to collect data themselves; third, that individuals are unwilling to provide data unless an organisation’s intent is disguised in tomes of T&Cs. Natusch takes aim at all three.

Be a spearfisher, not a trawler

First, relying on masses of scraped data is a lazy approach that can lead to counterproductive outcomes, Natusch says. Indiscriminately harvested data invites models to latch onto spurious correlations, producing headline ‘accuracies’ that do not reflect reality. When organisations have to source data themselves, the demands of efficiency push them to be more thoughtful about the data they actually require, resulting in models that produce more suitable recommendations, benefiting both organisations and their customers, he says.

“Rather than going on this big fishing exercise, collecting everything and shouting ‘yippee’ if something comes out with a high correlation, I’m a big believer in being forced to think harder about what data really needs to go in [to a model] and why do I really want that data,” he says.
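Natusch’s point about spurious correlations is easy to demonstrate. The toy sketch below (a hypothetical illustration, not code from the interview) screens 10,000 purely random features against a random target; even though there is no real signal anywhere, the best of them correlates strongly by chance alone.

```python
# Toy demonstration of the "fishing exercise" problem: screen enough
# irrelevant features against a target and some will correlate strongly
# by pure chance. (Hypothetical illustration, not Prudential's code.)
import numpy as np

rng = np.random.default_rng(seed=0)
n_samples, n_features = 100, 10_000

target = rng.normal(size=n_samples)                   # outcome to "predict"
features = rng.normal(size=(n_samples, n_features))   # pure noise, no signal

# Pearson correlation of each feature column with the target
corrs = np.array([np.corrcoef(features[:, j], target)[0, 1]
                  for j in range(n_features)])

print(f"strongest correlation found: {np.abs(corrs).max():.2f}")
# Typically around 0.4 despite zero true relationship -- a spurious
# "yippee" moment of exactly the kind Natusch warns against.
```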

It is also not true that millions of images are required to train competent models. For most organisations, a market-ready application can be developed with 10,000 images, Natusch says.
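One reason 10,000 images can go a long way is transfer learning: rather than training a network from scratch, an organisation can fine-tune a model pretrained on a large public dataset, so only the final classification layer has to be learned from the consented images. The sketch below shows the idea with PyTorch and torchvision; the article does not say what tooling Prudential uses, and the dataset path and class layout are hypothetical.

```python
# Minimal fine-tuning sketch: adapt an ImageNet-pretrained ResNet-18 to a
# modest, consented image set. (Assumes PyTorch/torchvision; hypothetical.)
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder of ~10,000 consented images, one subfolder per class
train_set = datasets.ImageFolder("data/consented_images", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():              # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))  # new head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:             # one epoch shown; repeat as needed
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```

Because the backbone already encodes general visual features, a dataset of this size is often enough to reach production-grade accuracy on a narrow task.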

No pain, no gain

Building primary databases of 10,000 images is costly, painful and time-consuming, but there are creative solutions that can ease the burden, making it practical for medium-sized companies.

Events that draw in big crowds, or partnerships with firms that operate customer-facing branches, present great opportunities to source data from volunteers in an ethical fashion.


For instance, to train an upcoming AI application (full disclosure: Natusch asked that we not detail the product in question), Prudential asked attendees at one of the company’s annual summer bike rides to pose for selfies. 3,000 people volunteered.

“We didn’t try and force anybody and we were very open about what we were trying to do and how the data was going to be used,” he says.

“It’s not just because we’re nice people or something like that (although we like to think we are). It’s also a question of competitive advantage.

“Ethical behaviour, privacy, security and compliance will only become more important and we want to be seen right from the start as a player who does things the right way. Is it costly and painful? Yes. But we have the consent of our customers, and quite frankly that is truly valuable.”

Communicate mutual benefits

In the context of Trump, Brexit, and Cambridge Analytica, the willingness of members of the public to hand over personal data is at an all-time low. But that’s not to say there aren’t enough individuals within reach who will cooperate if a firm is transparent in its aims.

After all, firms should have user interests at heart (if they don’t, they should seriously question why they are building their application in the first place). Communicating these goals should lead to a queue of volunteers – as Prudential discovered at its bike ride.

“We had people queuing up to have their picture taken. They were joking about how they’re going to use the picture or what bad things we could possibly be doing. But it was light-hearted and they still queued up,” Natusch says.

“How personal data is used is increasingly at the forefront of people’s minds. And I genuinely think that’s the way it should be, and that [organisations] shouldn’t be panicking about it.

“There will always be talk about consent: ‘Do you have consent to use data? Yes or no?’ And in many ways, that’s kind of a low bar. Actually, what I want is more than consent. What I think I saw [at the bike ride] was more than consent, it was excitement.”

Internal data

Ethical AI development also extends to how internal data is handled. A company like Prudential, which boasts millions of health insurance customers in Asia alone, holds highly sensitive data: insurance laws mandate that hospital bills are extremely detailed, down to the names of the surgeons who perform operations.

An ethical approach to incorporating this data into AI development involves patience: using only what is necessary and taking the time to apply anonymisation and encryption. If Prudential released an application without carefully anonymising this data, all it would take is one breach to expose its customers.

Organisations should think carefully about the right anonymisation approach for each use case, especially as anonymisation itself is a developing and complex field, Natusch says.

“I don’t yet have an approach where I say: ‘this is our gold standard’. Because our use cases are so different in nature,” he says.

“For instance, a model that transfers data back and forth from a wearable device or from a smartphone is very different from a model that will only ever run internally in the back office. The key is to try different approaches. So it’s really very much the use case that directs our approach.”
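To make “using only what is necessary” concrete, the minimal sketch below drops fields a model does not need, such as surgeon names, and replaces the direct identifier with a keyed hash so records can still be linked without exposing whom they belong to. The field names and key handling are hypothetical, and, as Natusch stresses, the right approach depends on the use case.

```python
# Minimal pseudonymisation sketch for a hospital-bill record before it is
# used in model training. Hypothetical field names; one possible approach,
# not a gold standard -- the right choice depends on the use case.
import hashlib
import hmac

SECRET_KEY = b"placeholder-key-store-in-a-vault"  # assumption: keyed hashing

def pseudonymise(record: dict) -> dict:
    """Return a training-safe copy of a hospital-bill record."""
    needed = {"diagnosis_code", "procedure_code", "length_of_stay_days"}
    out = {k: v for k, v in record.items() if k in needed}
    # A keyed hash lets records from the same patient be linked together
    # without exposing the raw identifier itself.
    out["patient_ref"] = hmac.new(
        SECRET_KEY, record["patient_id"].encode(), hashlib.sha256
    ).hexdigest()
    return out

bill = {
    "patient_id": "MY-0012345",
    "surgeon_name": "Dr. Tan",        # mandated on the bill, not needed here
    "diagnosis_code": "K35.8",
    "procedure_code": "47.09",
    "length_of_stay_days": 3,
}
print(pseudonymise(bill))
```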

Shared responsibility

The presence of these technical solutions means nothing unless they are folded into company culture. Proper business processes – privacy, security, ethics, and compliance by design – are where trustworthy AI lives or dies. The burden therefore falls on everybody who may encounter data, and those people need to be identified at the start of any and all projects.

“The worst thing you can do is say ‘it’s the [responsibility of] the data scientist or the AI expert’. Then it’s too late. It is truly everybody,” Natusch says.

The future of AI will comprise players who either march on irresponsibly or are guided by the principles of trustworthy AI. If more land in the former category, the “tech-lash” will only worsen, making it difficult for benign organisations to pursue their projects.

It is necessary for the tech industry as a whole to bake in trustworthy AI by default, and that starts with reining in the cavalier attitude to data sourcing and establishing shared responsibility. Many can learn from Prudential’s readiness to conjure creative solutions.

“We have a very clear choice: we can either ignore this, just plough ahead regardless, and hope there’s no blowback coming from that, or we can be very mindful of what we do and have the conversation openly and publicly. And from our point of view, we would rather be in the second camp,” Natusch says.

“Embrace ethics, privacy and security. Be seen to be driving it and have those conversations with your business colleagues and don’t let them force the ownership on you. They need to think about it too.”

Experts featured:

Michael Natusch

Global Head of AI
Prudential

Tags:

artificial intelligence, data science, data sourcing, ethics