Wednesday, February 22, 2012

Metadata: the answer to the digital dilemma


For one of our clients (a national broadcast news media organization), we manage 2.2 petabytes of digital assets, a collection that is growing by roughly 200 terabytes a month.  At this rate we are eclipsing anything imagined 10 years ago, thanks to technology advances, high definition, and the speed at which information is made available and shared.  Storage companies like EMC are preparing for even faster growth, speculating that we will see 35 trillion additional gigabytes of data added to the world before 2020…or 35 zettabytes (a word that didn't exist prior to 1991…and forward-thinking even then).
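To make the scale concrete, here is a small back-of-the-envelope sketch of the figures above. It assumes decimal (SI) units and, purely for illustration, that the archive keeps growing at a flat 200 terabytes a month; the variable names are ours, not part of any tool we use.

```python
# Decimal (SI) storage units, expressed in bytes.
GB = 10**9
TB = 10**12
ZB = 10**21

# Figures quoted in the post.
archive_bytes = 2200 * TB         # 2.2 petabytes under management
monthly_growth_bytes = 200 * TB   # roughly 200 terabytes added per month

# 35 trillion gigabytes, converted to zettabytes.
projected_zb = (35 * 10**12 * GB) / ZB

# Assumption: growth stays linear at 200 TB per month (it may well accelerate).
months_to_double = archive_bytes / monthly_growth_bytes

print(f"35 trillion GB = {projected_zb:.0f} ZB")
print(f"Months until the archive doubles: {months_to_double:.0f}")
```

Under that flat-growth assumption the archive doubles in under a year, which is why the storage question cannot be put off.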

Handling trillions of bytes each month, whether creating, meta-tagging, consolidating or replicating, means that we are part of this digital dilemma: what to do with all the data.  Our archives are bursting at the seams with physical assets that have been digitized, backed up and protected…so now we are paying not only for the digital storage but also for the physical storage that keeps the original asset alive, just in case the digital asset fails (and its backup fails too).

In the physical world we have organized systems (LC, Dewey, stacks, cards, etc.) that all hold the "metadata" about the physical asset.  If all else fails, the asset itself carries enough information to re-create that metadata; in the digital world, that isn't always the case.  Date, author, photographer, videographer, producer, actor, etc.…these are all things that can be lost forever if the digital asset isn't properly meta-tagged when it is originally digitized.  Too many skip or skimp on this critical step.  Saving a buck on the metadata can cost you thousands later.  Lost productivity hours are just the tip of the iceberg: if you don't have the right metadata on the digital asset, you risk losing it for good…in the heap of trillions and trillions of bytes that are spread across trillions of DATs loaded in millions of SANs globally.

Even if you are only talking about a single organization, the numbers are staggering.  Unstructured data is likely to become the largest single expense for businesses, even surpassing staff, within the next 10 years.  In 2002 the total amount of information created in the world was 5 exabytes.  By 2006 that number was 160 exabytes.  Today, according to Geohive and Facebook, facebook.com is as large as the entire internet was in 2004.  YouTube estimates that 35 hours of video are uploaded to the site every minute.  Pingdom estimates that roughly 300 billion emails are sent daily!

According to EMC, the world's information is doubling every two years.  In 2011 the world created a staggering 1.8 zettabytes.  By 2020, they estimate, the world will generate 50 times the amount of information and 75 times the number of "information containers," while the IT staff available to manage it will grow less than 1.5 times.  This means that proper meta-tagging will be critical to managing your digital assets successfully.  New "information taming" technologies such as de-duplication, compression, and analysis tools are driving down the cost of creating, capturing, managing, and storing information; in 2011 that cost was one-sixth of what it was in 2005.  However, even with proper de-duplication and "taming," the growth of the internet is speeding up, not slowing down.

So what's bigger than a zettabyte?  A yottabyte, which is 1000 zettabytes.  While that seems a long way off, it will be here before we know it, and we'll be off to add a new word to the dictionary.

To be able to process, search, absorb or synthesize that data, you must have exceptional metadata.  Without it, think of your video or image asset as a grain of sand at the bottom of the ocean.  You can describe the shape, color and size of that grain of sand all you want (after the fact), but the odds of coming up with the exact grain you were after are almost impossible to comprehend (by 2015 they are estimated to be 1 in 1.25E+22)…and they are stacked against you more and more by the second.  In fact, imagine that while you are searching the beach for that grain of sand, 15.6 million beach volleyball courts' worth of sand is being added to it.  With the right metadata you are able to instantly search for that grain of sand, dive in and come up with the thousands of possible grains that match the description…and then drill down from there to get to your target.  That's the real power and value of what we offer our clients…our staff, working across the world, make information findable, putting the power behind the search and allowing trillions of bytes to be sifted to find that one specific item.
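The "search, then drill down" idea can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not a description of any production system: the `Asset` fields, the sample catalog, and the helper names `search` and `with_keyword` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """A digital asset described by a few hypothetical metadata fields."""
    asset_id: str
    kind: str            # e.g. "video" or "image"
    created: str         # ISO date the asset was captured
    photographer: str
    keywords: frozenset


def search(assets, **criteria):
    """Return the assets whose metadata equals every given field value."""
    return [
        a for a in assets
        if all(getattr(a, field) == value for field, value in criteria.items())
    ]


def with_keyword(assets, keyword):
    """Drill down: keep only the assets tagged with the given keyword."""
    return [a for a in assets if keyword in a.keywords]


# Made-up sample catalog, for illustration only.
catalog = [
    Asset("A-001", "video", "2011-03-11", "J. Rivera", frozenset({"press conference"})),
    Asset("A-002", "video", "2011-03-11", "M. Chen", frozenset({"harbor", "storm"})),
    Asset("A-003", "image", "2011-03-11", "M. Chen", frozenset({"harbor"})),
    Asset("A-004", "video", "2012-01-05", "J. Rivera", frozenset({"harbor"})),
]

# First pass: a broad metadata match narrows the catalog to candidates.
candidates = search(catalog, kind="video", created="2011-03-11")

# Second pass: drill down from the candidates to the target.
target = with_keyword(candidates, "harbor")

print([a.asset_id for a in candidates])
print([a.asset_id for a in target])
```

None of this works if the fields were never filled in at digitization time: an asset with no date, creator, or keywords simply never shows up among the candidates.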
