Data and ethics: With Big Data comes big responsibility
This feature first appeared in the Spring 2015 issue of Certification Magazine.
Big Data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. And Big Data may be as important to business and society as is the Internet. The more data, the more accurate the analyses on that data become. These successes snowball as more accurate analyses lead to better decisions, and better decisions turn into greater operational efficiencies at reduced cost and risk.
So what exactly is Big Data? In 2001, industry analyst Doug Laney (currently with Gartner) defined Big Data with his three V’s: volume, velocity, and variety.
Volume: Many factors contribute to the increase in data volume, including:
● Transaction-based data stored through the years
● Unstructured data streaming in from social media
● Increasing amounts of sensor and machine-to-machine data being collected
In the past, excessive data volume was a storage issue. But with decreasing storage costs, other issues emerge, including how to determine relevance within large data volumes and how to use analytics to create value from relevant data.
Velocity: Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors, and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations.
Variety: Data today comes in all types of formats, including but not limited to:
● Structured, numeric data in traditional databases
● Information created from line-of-business applications
● Unstructured text documents
● Stock ticker data
● Financial transactions
Managing, merging, and governing different varieties of data is something with which many organizations still grapple. To get a better handle on the impact of Big Data on business and society, let us turn to industry experts to get their views.
Big Data and Profits
Big Data is “the next frontier for innovation, competition, and productivity,” says McKinsey & Company.
Brooks Bell is the founder and CEO of Brooks Bell Inc., the premier firm focused exclusively on enterprise-level testing, personalization, and optimization services. Bell states that companies and executives rushing into data collection and analysis expecting immediate payoffs are bound to be disappointed. Most companies are years away from being able to effectively profit from data — and not simply from a lack of technology. Instead, some entrenched challenges need to be addressed before Big Data can have real impact.
Take, for example, the gut-driven approach to strategy that pervades the business world. Leadership by the “highest paid person’s opinion” is a common organizational weakness that should ultimately be remedied by Big Data. That will only happen, however, when the mindset also shifts so that a person’s decisions are based on real data, not gut assumptions. Simply having more data will not be enough to overturn this mentality and could even make the transition more difficult.
Big Data and Society
Jim Fruchterman, founder and CEO of Benetech, has stated that Big Data is all the rage in Silicon Valley. From Facebook to Netflix, companies are tracking and analyzing our searches, our purchases, and just about every other online activity that will give them more insight into who we are and what we want. And though they use the massive sets of data they collect to help create a better experience for their consumers (such as customized ads or tailored movie recommendations), their primary goal is to use what they learn to maximize profits. But Big Data can also create positive social change.
Bookshare, a social enterprise operated by Benetech, last year processed requests for more than 1.3 million downloads of accessible books through its online library, to over 200,000 people with disabilities such as blindness and severe dyslexia. Benetech already collects a great deal of information such as which books are downloaded most, but its delivery model has been similar to that of print textbooks: “Here it is; hope it’s useful!” Benetech doesn’t know if the student ever gets past Chapter 1.
Or didn’t. Benetech recently launched a new feature for Bookshare that allows students to read books within a web browser instead of needing additional software or tools. Over the next few years, Benetech will be able to collect (ethically and legally, with proper respect for privacy) and analyze the many millions of interactions its users are having with these books. It will learn how (or whether) textbooks get used, and which approaches to a specific learning objective work best.
Social entrepreneurs should focus on using Big Data for the social good. Of course, data has to be collected in ways that match our value systems and respect ethics, privacy and informed consent. Benetech’s experience in collecting information about human rights violations and about people with disabilities, two highly sensitive areas, shows that this can be done.
Big Data or Big Brother?
Viktor Mayer-Schönberger, a professor at the Oxford Internet Institute, says that Big Data analysis is a powerful tool that has the potential to shape every part of our society, from health care and education to urban planning and protecting the environment. But as with every powerful tool, it has a dark side, too: It threatens privacy protection and human volition.
Let’s start with the promise of Big Data for society. First, we can capture much more information about a phenomenon than ever before, sometimes even close to all of it. For instance, rather than sequencing just a small portion of a person’s DNA, we can now sequence all of it. As a consequence, we are no longer restricted to examining a small sample of data. This gives us detail and granularity we never had before.
Second, because we have so much data available, we can accept a bit of messiness in the data we analyze. When we only had a small subset of data available, we needed to make sure that this data was highly accurate. With a lot of data at our disposal, a bit of inexactitude is permissible. More and messy trumps small and clean.
Third, and most important, we can now discover highly valuable and previously unknown connections between information. Big Data correlations help Amazon and Netflix recommend products to their customers. Correlations are at the heart of Google’s translation service as well as its spell checker.
And predictive maintenance based on Big Data correlations lets companies predict when a car engine part needs to be replaced before the part actually breaks. Big Data allows companies to tell us not why something will happen, but what will happen, and to do so early enough for us to act.
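To make the idea of correlation-based recommendation concrete, here is a minimal Python sketch. It is not how Amazon or Netflix actually build their systems; it only counts how often items turn up together in a handful of invented purchase histories. The basket data and the function names `co_occurrence` and `recommend` are hypothetical, chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, invented for illustration only.
baskets = [
    {"diapers", "baby wipes", "detergent"},
    {"diapers", "baby wipes"},
    {"detergent", "fabric softener"},
    {"diapers", "baby wipes", "fabric softener"},
]


def co_occurrence(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(item, baskets, top_n=2):
    """Return up to top_n items most often bought together with `item`."""
    counts = co_occurrence(baskets)
    related = {b: n for (a, b), n in counts.items() if a == item}
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(related.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


print(recommend("diapers", baskets))  # ['baby wipes', 'detergent']
```

The sketch surfaces what tends to be bought together without saying anything about why, which is the distinction drawn above. It also hints at the privacy question: even this toy version only works because someone kept a record of every basket.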
MIT Professor John Guttag puts an emotional spin on the good that can come from using Big Data. As the head of MIT’s Electrical Engineering and Computer Science Department, he argues that Big Data must use personal electronic medical records in order to prevent the spread of deadly hospital infections. “Progress in health care is too important and too urgent to wait for [the controversy over] privacy to be solved,” he said. “I’m for privacy, but not at the cost of avoidable pain, suffering and death.”
Big Data and the Future
Big Data is a big deal, says White House adviser John Podesta, head of the presidential study on the future of privacy and Big Data and the keynote speaker at an MIT workshop.
“We’re undergoing a revolution in the way that information about our purchases, our conversations, our social networks, our movements and even our physical identities are collected, stored, analyzed and used,” he said.
“On Facebook there are some 350 million photos uploaded and shared every day,” he said. “On YouTube, 100 hours of video are uploaded every minute, and we’re only in the very nascent stage of the Internet of things, where our appliances will communicate with each other and sensors will be nearly ubiquitous.” Podesta went on to say that users of Big Data will be able not only to analyze our past behavior, but also to predict our future behavior.
Online retailer Amazon recently got a patent for what it calls “anticipatory shipping,” delivering products you want even before you buy them. Now, customers just press a button to place an order for more laundry detergent, or a customer can opt to have 150 diapers delivered at the beginning of each month.
“How should we think about individuals’ sense of identity when data reveals things about them they didn’t even know about themselves?” Podesta asked. “In this [presidential] study, we want to explore the capabilities of Big Data analytics but also the social and policy implications of that capability.”
Big Data: Good or bad? That question promises to be the hottest ethical question of the 21st century. Either way, it reminds us all to use discretion in what we post on the Internet, what we purchase, and how we interface with technology. In our lifetimes, we will be both positively and negatively affected by the power of Big Data.