Books

Why Big Data is a big deal

Jinoy Jose P | Updated on January 12, 2018 Published on January 22, 2017

Title: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
Author: Cathy O’Neil
Publisher: Crown
Price: ₹699

A data expert scans mathematical models to show how they trigger inequality



Three statisticians are out hunting. They find an antelope. The first statistician shoots but misses his target by a metre to the left. The second guy fires, only to miss by a metre to the right. The third doesn’t fire but shouts victoriously, “On the average we got it!”

Now, here’s a question. How many software coders does it take to change a light bulb? None. Because it’s a hardware problem. Now, imagine an illegitimate marriage of statistics and coding. What do you get out of the alliance? Enter weapons of math destruction — a breed of data-driven software that can change the way we live, work and consume thanks to the absurd fact that, like these jokes, it is far removed from reality. That’s pretty much how Cathy O’Neil sees things in her meticulously researched work Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.

Omnipresent, omnipotent

Big Data is not an alien phrase any more. Not only data pundits, but marketers, psychologists, academics and policymakers try to exploit Big Data, which, according to the computing industry, is the most important natural resource of the day. It’s collated and clinically analysed in almost all influential, money-making sectors now — retail, consumer goods, military, governance and even gaming. Myriad tools built on complex mathematical models exist today that find patterns in how people buy, search, converse and express despair, to arrive at correlations that mean, in most cases, business.

But this business is not done fairly, according to O’Neil. She should know: she is a mathematician and former quantitative analyst with a Wall Street hedge fund. Big Data analytics is harmless and neutral, it is said, because machine learning has no agenda. So the apostles of the Big Data revolution call it holistic and trustworthy. In 2013, in their seminal work, Big Data: A Revolution That Will Transform How We Live, Work, and Think, Viktor Mayer-Schonberger and Kenneth Cukier wrote that the “ground beneath our feet was shifting”. Old certainties, they said, were being questioned. Big Data demands a fresh discussion of the nature of decision-making, destiny and justice. A worldview we thought was made of causes, they warned, is being challenged by a preponderance of correlations. Their book heralded the death of the subject-matter expert, and foresaw a world where correlations replaced expert judgment in decision-making.

But what was in store, according to O’Neil, was a catastrophe. She gives the example of predictive policing algorithms. These models analyse patterns of past crimes to predict where future crimes will occur. Ideally, police reach these areas and try to check crime. But the “fundamental problem” with the concept is that it reinforces already uneven and racist policing practices. The data that form the basis of these algorithms reflect a broken, prejudiced system; so the models often target the vulnerable, not the guilty.

All the bad signals

Today, Big Data is manipulated to pump up prejudices, to cajole and force the underprivileged into decisions that ruin their future, and to nudge psychologically vulnerable buyers into unnecessary, wasteful purchases. Big Data analyses, as done today by machines programmed by humans, trigger inequality. The privileged, O’Neil says, are processed more by people, while the masses (read the poor and the downtrodden) get ranked, categorised, appraised and evaluated by machines. That’s the big problem with Big Data. Its practices lack a level playing field. The game is not fair. And it’s time it was made fair.

To do that, we must first deconstruct the mathematical models that run the software peddling Big Data analytics. This process starts with understanding the people who build these models, and how they choose what’s important enough to include. They simplify the world into a toy version, writes O’Neil, one that can be easily understood and from which we can infer important facts and actions. We expect it to handle only one job, and accept that it will occasionally act like a clueless machine, one with enormous blind spots. O’Neil gives the example of Google Maps: when we ask for directions, it models the world as a series of roads, tunnels and bridges. “It ignores the buildings, as they aren’t relevant to the task.”

Therein lies the rub. A model’s blind spots reflect the judgments and priorities of its creators. The umpteen Big Data tools that scan customers, professionals, children and the aged to arrive at inferences that help businesses, educational institutions or hospitals enhance their performance are so alarmingly dotted with blind spots that they make the idea of an inclusive, ethical world seem a myth. This is not good for democracy, O’Neil feels, because no model can capture all of the real world’s complexity or the nuances of human communication. And by blindly relying on prejudiced models to analyse data, companies exploit the vulnerable because they know “vulnerability is worth gold”.

Privacy goes kaput

Such criminal use of Big Data also reflects a philosophical disconnect. The Internet, which generates, distributes and helps collate large chunks of data, was once seen as a platform where anonymity was positively celebrated. In the early dotcom days, the saying went that on the Internet, “nobody knows you’re a dog”. But now it’s the exact opposite, writes O’Neil. “We are ranked, categorised, and scored in hundreds of models, on the basis of our revealed preferences and patterns.” This, she warns, establishes a powerful basis for legitimate ad campaigns, but it also fuels their predatory cousins: ads that pinpoint people in great need and sell them false or overpriced promises.

This feast of inequality can be checked only if Big Data is managed wisely. Yes, there are rare cases where its tools offer key insights. But for those cases to multiply, Big Data tools must stop treating us like machine parts in the workplace. O’Neil urges civil society to police these weapons of math destruction, to “tame and disarm them”. Math deserves better than WMDs, and democracy does too.

MEET THE AUTHOR

Cathy O’Neil is a data scientist who blogs at mathbabe.org. She earned a PhD in mathematics from Harvard and worked for the hedge fund DE Shaw. She is the author of Doing Data Science.

