Wednesday, March 25, 2015

Emerson and Connection Science

Facts are facts, or so the saying goes, but some facts are different. In a sense, all facts are also undifferentiated raw data. Yet some factual information is legally considered personal data and a variety of protections, rights, obligations and other expectations may apply. Other facts may be considered "Public Record" and even published freely on the Internet as "Open Data".

The nature and definition of public and private facts is highlighted in a simple and powerful way by Emerson: 
"In like manner all public facts are to be individualized, all private facts are to be generalized. Then at once History becomes fluid and true, and Biography deep and sublime." Emerson, Ralph Waldo. Essays - First Series
In a very literal sense, the personal data of an individual is "generalized" when de-identified and aggregated with a lot of other like data. This process is common and considered a basic pillar of privacy protection. 
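
As a rough illustration of what "generalizing" private facts can look like in practice, the sketch below drops a direct identifier, coarsens the remaining attributes, and publishes only group counts. The records, field names, and the `min_group_size` threshold are all hypothetical assumptions made for this example; a simple threshold like this is not, on its own, a guarantee against re-identification.

```python
from collections import Counter

# Hypothetical personal records; names and fields are illustrative assumptions.
records = [
    {"name": "A. Jones", "zip": "02139", "age": 34, "mode": "bus"},
    {"name": "B. Smith", "zip": "02142", "age": 37, "mode": "bus"},
    {"name": "C. Lee", "zip": "02139", "age": 31, "mode": "bus"},
    {"name": "D. Patel", "zip": "02474", "age": 52, "mode": "bike"},
]


def generalize(record):
    """Drop the direct identifier (name) and coarsen the other attributes."""
    decade = (record["age"] // 10) * 10
    return (record["zip"][:3] + "**", f"{decade}-{decade + 9}", record["mode"])


def aggregate(records, min_group_size=2):
    """Count generalized records and suppress groups below the threshold."""
    counts = Counter(generalize(r) for r in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}


summary = aggregate(records)
```

Here the three bus riders collapse into a single group with a count of three, while the lone cyclist is suppressed because a group of one would point straight back to an individual.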

Distinguishing public from private facts can, in some cases, be challenging. And generalization methods for protecting private personal information have serious limits as more and more options emerge for re-identification. 

Nonetheless, the phrasing of this basic principle by Emerson offers fresh perspectives for potential new approaches to fair information management. A system that individualized public facts well would presumably have collected and rendered usable much related data in a way that sheds light on individual context and the surrounding individual situation. 

Individualization of data goes inward and outward. Examples of individualization that go inward and deeper include adding specific identity attribute data of a user account or data subject. Adding ever finer-grained details of the data itself, such as noting the precise file size or attaching cryptographic hash digests that are unique to that single blob of data, can individuate an otherwise similar-looking file from all other files.
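
A minimal sketch of this inward detailing, using Python's standard `hashlib` module, might look like the following. The `fingerprint` helper and the sample file contents are assumptions made for illustration only.

```python
import hashlib


def fingerprint(data: bytes) -> dict:
    """Return fine-grained details that distinguish this blob of data."""
    return {
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Two hypothetical files of the same size that differ by a single character.
a = fingerprint(b"route,riders\n1,120\n")
b = fingerprint(b"route,riders\n1,121\n")
```

The two blobs have the same size but different digests, so the digest separates them where the size alone cannot. Note that a digest individuates content rather than copies: two byte-for-byte identical files produce the same digest.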

Perhaps more important than the inward detailing is the outward individualization of public facts, because it can provide social meaning and more cues for business, legal and technical understanding of the facts in their actual context. One approach is to include or link to external data such as timestamps at the nanosecond scale or fine-grained location data. Absolute location (e.g., latitude and longitude) can itself be significantly fine-tuned and detailed with proximity data, e.g., from nearby Bluetooth connection attempts, RFID readers and wireless hotspot routers. Information about the business or legal context of a given piece of data likewise provides significant individualization. Documenting the owner of the device that created the data, the project that collected the data or the entity that funded the activity that created the data might allow a citizen at a public data portal to find and understand transportation, educational or other types of data in a much more meaningful and valuable way.

Public facts can be individualized by including, as metadata or as links in documentation, associations with other data that shed light on the relevant surrounding circumstances. These associations can be discovered and computed by software when presented in a structured, standard manner. Individualization of public facts published as open data can be powerfully achieved by noting, in a standard way, the key people and interactions relevant to the data. Linking data to relevant people and transactions allows all subsequent users of the data to sort, filter and search based on widely varied and flexible sets of factors, revealing broader situations and intertwining circumstances.
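
To make the idea of structured associations concrete, here is a small sketch of open data entries that carry links to people and transactions, plus a helper that filters and sorts on them. The `PublicFact` class, the `related_to` function, and the catalog entries are hypothetical and do not reflect any particular open data standard.

```python
from dataclasses import dataclass, field


@dataclass
class PublicFact:
    """A published dataset entry with links to its surrounding context."""
    title: str
    topic: str
    people: set = field(default_factory=set)        # key people or entities
    transactions: set = field(default_factory=set)  # funding, contracts, etc.


def related_to(facts, person=None, transaction=None):
    """Filter facts by linked person and/or transaction, sorted by title."""
    matches = [
        f for f in facts
        if (person is None or person in f.people)
        and (transaction is None or transaction in f.transactions)
    ]
    return sorted(matches, key=lambda f: f.title)


# Hypothetical catalog entries, invented for illustration.
catalog = [
    PublicFact("Bus ridership 2014", "transportation",
               {"City Transit Office"}, {"grant-001"}),
    PublicFact("School enrollment 2014", "education",
               {"School District"}, {"grant-001"}),
    PublicFact("Bike lane survey", "transportation",
               {"City Transit Office"}, {"contract-007"}),
]

by_office = related_to(catalog, person="City Transit Office")
by_grant = related_to(catalog, transaction="grant-001")
```

Because the links are structured, the same catalog can be viewed by who produced the data or by what funded it. Here the shared grant connects a transportation dataset to an education dataset that a search by topic alone would never place side by side.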

Emerson may have unknowingly offered a very timely and important functional capability goal for the current wave of open data adoption. As cities, states and federal agencies push more and more data out, anchoring to the axiom as stated by Emerson may provide the missing design requirements needed to build out systems that are both fair and functional. Generalizing and appropriately obfuscating private facts such as personal data is a must and always worth repeating. The balancing parallel admonition to render public facts individualized completes a coherent and buildable design requirement and can serve as a kind of axiom for those investing in deployment of large-scale open data projects around the world.

Open data about public facts is individualized when it includes the documentation and links needed to relate it to other relevant data. This is fundamental to making data usable, understandable and of value, because it enables others to find, filter, sort and cross-tabulate any combination of relevant data.

Individualizing data about public facts enables connections to context, meaning, insight and value. 

Individualizing open data is fundamentally a matter of Connection Science.






