With the massive amounts of information, perspectives and opinions being published online, the debate on the right to privacy (more specifically, the informational component of that right) as it relates to big data can easily become misguided and misinformed. The concern for keeping the debate on privacy and big data on the right track stems from the fact that providing a legal opinion on big data operations requires a sufficiently clear understanding of just what big data really is. Only then can a proper legal assessment be conducted. Once big data is properly understood, one should readily agree that the privacy implications of big data operations, at least from a legal perspective, must be assessed at a micro/nano level, i.e. against the nuts and bolts that structure the industrious big data machine.
Big data and privacy are both multifaceted concepts that cannot easily be defined. Some have recently described big data as “mathematical models to spot patterns”. While defining big data in these general terms is not necessarily wrong, it conflates into one vague notion the various phases and processes that really comprise big data, one of which is analytics, i.e. the process of running complex algorithms over an infinitely scalable amount of data. Defining big data by the analytics process alone, however, misses the BIG picture of just how big big data really is. So how, if at all, can big data be defined? Is defining big data even necessary for legal purposes? Or can one really assess the privacy implications of big data per se?
I recently had the occasion to write a lengthy, comparative legal analysis of big data operations as they relate to the right to informational privacy in Canada, the U.S., and Europe. I also had the occasion to write on other interesting issues, such as the definition of privacy and the emerging tort of invasion of privacy in Canada and other Commonwealth countries, as well as the regulation of cross-border data flows and privacy as it relates to the Internet. This last article can be viewed by clicking here. The two previous papers will soon be published. All three papers will be combined to provide a comprehensive, in-depth look at the topic of big data and the right to informational privacy in my upcoming book entitled “An International Perspective on Big Data and the Right to Informational Privacy”. Suffice it to say that big data is a very broad, nebulous concept that cannot be defined within a blog post.
For our purposes, one interesting way to look at big data is to look at the numbers. Thus, in my work, I define big data as the 2,405,518,376 Internet users who are generating 2.5 quintillion bytes of data every day; the 43,441 petabytes of IP traffic flowing on a global communication system in 24 hours; the 2.8 million emails being sent every second; the 571 new websites being created every minute; the 500,000 social networking websites competing on the Internet every day; the 2.5 billion Facebook likes and 175 million Tweets in a day; and the over 5 billion mobile phones in use and 700,000+ apps available for download. These are but a few numbers that help us grasp just how big big data really is.
Looking at the numbers is of course only one of many ways to look at big data. Another way is to ask how, or from what technologies, these big numbers are being generated. A three-layered (physical, logical, application), borderless, global and decentralized communication system might be a logical starting point. One can then take an even closer look at, say, the physical layer of the Internet: bits of 1s and 0s being transmitted through the airwaves or over fiber optic cable. From this point, one can more easily and accurately assess the true data privacy and security issues associated with big data.
One can also look at the technological devices used to access our global communication system: smartphones, laptops, tablets, desktops, etc. One can likewise look at the increasing number of things capable of connecting to this network (RFID tags, CCTV cameras, smart meters, corporate security systems, etc.). All of the above can then be looked at from a social perspective: being observed on a CCTV camera when walking out of my downtown condo, financial data being collected and transmitted when taking clients out for breakfast, etc.
That said, some authors have disagreed with the statement that anonymity is not necessary in big data operations. Neither position is categorically right or wrong. The truth is, it depends on a multitude of factors. Important considerations, however, are the purpose for which and the way in which big data operations are being run, as well as the context. For instance, individual privacy considerations may yield to more important societal considerations where big data operations are used to prevent a global epidemic. The same may not be true in a purely commercial context where big data operations are being used to increase revenues. Also, and perhaps more importantly, some take advantage of data streams and modern technologies to build highly precise profiles of individual consumers so as to be in a position to present them with “an offer they can’t refuse”. Others use the infinitely scalable amounts of big data to create immense data pools and then run complex analytics in order to glean broader market trends and insights. Still other big data operations aren’t necessarily concerned with collecting personal or sensitive consumer data.
In short, big data is a broad and rather nebulous concept. The debate on the right to informational privacy, from a legal perspective and as it relates to big data operations, has to stay grounded by being anchored to something concrete and precise. This is more easily accomplished by conducting the legal analysis at the micro/nano level. Furthermore, any legal analysis must inevitably be highly responsive to how big data operations actually work. Thus, looking at the design of the technologies being used to listen, collect, aggregate, store and analyze the big numbers that comprise big data is one of many ways to approach the debate. (By Frank LeSieur, So. Sc., LL.L., LL.B., LL.M. – Security Law).