This is the first piece in a series that will examine the economic disparity between the profits being generated through use of consumers' aggregated data (as realized by any number of software firms), and the price that consumers pay when those firms fail to protect that information.
Data is a peculiar thing. It doesn’t exist until you capture it, and it isn’t useful until you can compare it to some other data. Comparing data points to one another is like crossing a series of stepping stones toward an insight, and ultimately toward knowledge. Over time – in theory – these insights can create a body of knowledge that contributes to improvements in our daily lives.
Take, for example, your daily commute. Knowing how far you’ve traveled toward your destination is good, and better if you know how far away your destination was to begin with. Further, knowing how much time it took to travel that distance might even allow you to make a good estimate of your expected arrival time. If you make this trip often then over time you may begin to refine your estimates even further, and develop insights about the times of day when travel is quicker (or slower). Eventually these insights may accumulate into a mental model that describes optimal routes, times, and modes of travel that you can take. To the degree that this model minimizes your travel time, your data collection and analysis may have made a small improvement to your life.
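The arrival estimate described above comes down to a small piece of arithmetic, sketched here in Python. This is only an illustration: the function name `estimate_minutes_remaining` and the sample figures are invented for the example, and it assumes that your average pace so far holds for the rest of the trip.

```python
def estimate_minutes_remaining(total_miles, miles_traveled, minutes_elapsed):
    """Estimate minutes left, assuming the average pace so far stays constant."""
    if miles_traveled <= 0 or minutes_elapsed <= 0:
        raise ValueError("need some distance and time traveled to make an estimate")
    if miles_traveled > total_miles:
        raise ValueError("distance traveled cannot exceed the total distance")
    pace = minutes_elapsed / miles_traveled  # minutes per mile so far
    return (total_miles - miles_traveled) * pace


# Hypothetical commute: 20 miles in total, 8 miles covered in 16 minutes.
remaining = estimate_minutes_remaining(20, 8, 16)
print(f"About {remaining:.0f} minutes to go")  # About 24 minutes to go
```

Neither distance nor elapsed time says much alone; the estimate only appears once the two are compared.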
If this data allows you to improve your life in some observable way then it must have value. You could state that value in terms of time saved and give it a dollar figure derived from the effective hourly earnings from your job. Or perhaps you save gas by spending less time on the road, which has an obvious cash value. This is all nice in theory, but until the improvements to your routine become real, the value of the data itself is extremely difficult to determine. We lack the ability to measure the value of raw data, yet with some relatively simple arithmetic we can readily observe the real value of the insights and improvements that data enables. This value gap results in improper treatment of raw, granular data, and lies at the heart of the ongoing debate about data security and personal privacy.
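For the commute example, the "relatively simple arithmetic" might look like the following Python sketch. The function name `annual_commute_savings`, its parameters, and every figure in the usage example are hypothetical, chosen only to show the calculation; valuing saved time at your hourly wage is an assumption taken from the paragraph above, not an established accounting rule.

```python
def annual_commute_savings(minutes_saved_per_day, hourly_wage, commute_days,
                           gallons_saved_per_day=0.0, price_per_gallon=0.0):
    """Dollar value of a faster commute over one year of commuting days."""
    # Time saved, priced at the commuter's effective hourly earnings.
    time_value = minutes_saved_per_day * hourly_wage * commute_days / 60
    # Fuel saved, priced at the pump.
    fuel_value = gallons_saved_per_day * price_per_gallon * commute_days
    return time_value + fuel_value


# Hypothetical figures: 10 minutes and a quarter gallon saved per day,
# a $30 hourly wage, $4.00 per gallon, 240 commuting days a year.
savings = annual_commute_savings(10, 30.0, 240,
                                 gallons_saved_per_day=0.25,
                                 price_per_gallon=4.0)
print(f"${savings:.2f} per year")  # $1440.00 per year
```

Note that the calculation prices the improvement, not the raw observations of distance and time that made the improvement possible.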
We use data as currency – as a medium of exchange – in our daily lives. For example, our personal identities are established by a well-known piece of data that we call a Social Security Number. This 9-digit identifier allows us entry into a vast financial and regulatory system that is structured around each person’s unique, verifiable identity. This financial regime requires employers, banks, and credit card companies to verify the identities of their employees and customers, which is done through an exchange of data – of a Social Security Number – with the employer, bank, or credit card company. In return for this exchange, one may become eligible to hold a job that pays a salary, or to hold a bank account for money storage, or to obtain a credit card and enjoy the powers of purchasing through leveraged debt.
As simple as it may seem on the surface, this Social Security Number is a citizen’s small claim to a patch of land on a trusted system of government regulation. This system is the bedrock upon which our financial system rests. Without this little piece of data you would be deemed untrusted; your participation in the financial system would be limited to cash transactions only.
Another example exists in the healthcare space, where our identities (based again on the SSN) allow us to create medical records containing medical history and critical health information. Since each person’s medical history is unique and in many cases very private, it’s important that we have a trusted system to verify a patient’s identity. By keeping records in a consistent, verifiable way, doctors can track our health over time. Medical record storage and transfer eliminates the need for us to keep and carry cumbersome records of our own (which would be inconvenient) or to depend on our own judgment and memory to reproduce medical history (which would be terribly unreliable). Without the use of an identifier, we would be unable to participate in a health system that provides continuous, lifelong care. Our doctor visits would be limited to one-off occasions and emergencies only, with no way for a provider to identify and treat long-term health concerns.
In both examples, data provides a base layer of trust, upon which we build other layers of trusted information and knowledge. These layers allow two parties to trust one another, even though proof of one’s identity is only implied by the Social Security Number. This method of structuring data to organize people into a system allows for broad application of measures that, on balance, improve people’s lives. Generally speaking, an individual who participates in the banking and healthcare systems is far better off than one who does not.
These examples are offered as idealized abstractions of what the financial and healthcare systems should be. Discussions of trust and verification are meant to describe the intention of the SSN as a tool, not to express opinions about the state of either the financial sector or the healthcare system. Neither is exempt from criticism, and both suffer from major deficiencies that warrant serious attention. It so happens that many of these deficiencies are rooted in a systemic mistreatment of important data, which will be discussed in the coming parts of this series.
The activities and influences of data collection and analytics expand far beyond these two bedrock examples. The expansion has resulted in new business models that are delivering new profits. In order to understand this rapidly diversifying landscape, it’s important to examine the benefits that are attracting new entrants, and why they keep coming.