I use Big Data every day. I don’t have Hadoop, a Data Warehouse, ETL, or a big analytical engine. But I use search engines, which are indexes of web pages from around the world, to discover related and unrelated facts. I use Twitter and LinkedIn, which aggregate the ideas of millions of people, to understand the sentiments of the people I follow. And I make decisions, and mistakes, with this information every day.
We all do. And in that context, we are all Big Data users and abusers, and we can identify with larger enterprises that are also confronting vast streams of information from every corner of the globe, created by individuals, communities, corporations, and governments. We as individuals never had industrial data management applications. We never had Data Governance Councils, Stewards, or Data Management professionals. So we’ve been selecting data streams first and using the ultimate analytical engine – our brains – to integrate that information, glean trends, and make decisions.
What’s new about Big Data is that large enterprises are copying the information processes that We The People use every day. They are selecting streams first, aggregating them second, determining the application third, and making decisions fourth. Judging the consequences of decisions… later, if at all. Organizations around the world are deciding to retain information much longer because of a belief that latent, slow-developing trends may lie dormant in that information and be discovered much later.
But combining vast volumes of information and long retention cycles with high-velocity decision-making has the potential to do as much enormous damage as enormous good. And we know from experience that decision-making is often influenced by cyclical trends, personal prejudice, and national dogma. Countercyclical views can be marginalized. Whistle-blowers can be fired.
But Big Data also offers an historic opportunity for Data Management. For too long, the people in this industry have been seen as back-office archivists recording the deeds and attributes of heroic business leadership in dingy databases in large glass-house mainframes and data warehouses. They have taken a back seat to application developers and business analysts, who first and foremost collect the requirements of business users for new applications, features, and functions.
But Big Data changes all of that. It makes information sources and streams more important than applications, features, and functions. It changes the emphasis in value creation and puts the onus on Information Management to produce better sources and streams, easier aggregation and integration, and information products that any user can leverage in any application they wish.
It’s large enterprises automating the way We The People use online information every day, and the power and consequences of this paradigm shift are profound and potentially quite scary.
We need Information Governance over every part of Big Data to assure that organizations can answer these fundamental questions:
1. Can we trust our sources?
2. Do we know where they came from?
3. How do we verify the authenticity of the information?
4. Can we verify how the information will be used?
5. What decision options do we have?
6. What is the context for each decision?
7. Can we simulate the decisions and understand the consequences?
8. Will we record the consequences and use that information to improve our Big Data information gathering, context, analysis, and decision-making processes?
9. How will we protect all of our sources, our processes, and our decisions from theft and corruption?
10. (via David Bartholomew) When an error or exception is discovered, how do we recover without incurring massive work re-engineering our streams, integration, analysis, and decision-making automation?
This morning, the Information Governance Community began discussing these issues in a global teleconference moderated by IDC. We have just scratched the surface of these issues and have much more to discuss. We have agreed to create a new category – Big Data – in our Maturity Model to provide organizations with new methods to benchmark their Big Data Governance maturity. But we also agreed that our existing Maturity Model categories still apply, and that we need to update them to include Big Data issues and questions.
I believe this is critical work. Big Data is an enormous opportunity to make information the arbiter of value creation in the Information Age. But it is also an enormous risk because the same solutions can be used to make dangerous and destructive decision-making a high volume, high velocity science.
Every new technology can be used for both good and evil. Join the Information Governance Community to help ensure Big Data serves the best possible uses.