Swimming in the ocean of big data

By Aaron Menenberg, a member of the Praescient Initiatives team and a Technology Fellow at the Institute for the Study of War, March 4, 2013, Originally Posted by Praescient Analytics

At last week’s Georgetown Law/Journal of National Security and Policy conference entitled “Swimming in the Ocean of Big Data: National Security in the Age of Unlimited Information,” Professor Paul Ohm referenced a now well-known quote, “data is the new oil.” If you’re a data-head like we are, you shake your head a bit when you hear it. After all, oil does not generally carry positive associations. But since growing amounts of data are producing significant leaps in knowledge, much as oil produced economic and technological leaps, perhaps the comparison can be helpful. Just as oil has enormous positive impacts when produced and used responsibly, so too does data.

There was great debate at the Georgetown conference about the duties that both the technology and legal communities have in ensuring that the use of big data respects the civil and personal liberties enshrined in law. One of the nuances emphasized was that the output of data algorithms is limited to providing inputs for humans to make decisions, and that the possession and analysis of data, big or small, does not in and of itself produce bad outcomes. Only bad or irresponsible human analysis precipitates those outcomes. So long as data systems are built to ensure that users meet the requirements of law, those laws can be both followed and enforced.

The challenge in meeting this technology requirement is having the legal requirements in place to begin with. Updating our laws on civil liberties and privacy to reflect the exponential growth and usage of data is an incredibly difficult process, and it has moved far more slowly than the rate of growth in data collection and usage. This has led many to believe that new laws curtailing the very use of big data ought to be established. Yet despite the historic rates of growth, the challenge of acting responsibly with information is not new: imagine the widespread doubt faced by those responsible for the 1890 US Census, who were tabulating the results by machine for the first time, especially when the count returned a population significantly smaller than many expected.

The advancement of the census from punch cards in 1890 to an approach that now uses algorithms to estimate population has produced more accurate results than manual head counts, and government policies are now based on human analysis of the results of those algorithms. Big data is being used, and can continue to be used, in the service of national security in accordance with the legal regimes established to protect civil liberties and personal privacy. All it takes to advance the technological protection of those liberties is to advance the legal protections themselves.