By Jeffrey Abbott
February 19, 2015 01:30 PM EST
Water, water everywhere and nothing to drink. Today I traveled from Boston to San Jose, CA. With stunningly clear weather and a window seat, I watched the landscape change from a frozen blanket of white covering the entire Northeast and Great Lakes, to the dry and rugged Rockies that are oddly snow-free, to the nearly empty reservoirs of California, whose bleached sidewalls reveal our failure to balance supply and demand for natural resources. The picture here is the Utah Wasatch range that’s home to Snowbird and Alta, which usually get among the most snow of any U.S. ski areas (it looks more like May than February right now). This year, you’ll find far more snow in New England. This trip brings me to the biggest gathering of big data practitioners of the year, and although I see empty reservoirs, I see lots of data lakes.
In fact, judging by the top big data vendors, the notion of a data lake has moved past the skepticism, rejection, and second-guessing that plague all new tech concepts. Vendors, customers, and industry experts have found common ground around the idea that the data lake can relieve the challenges of the data warehouse. The big question is where the data lake fits with the data warehouse. Is it a teammate, a leader, a follower, or a full-on replacement?
The data lake, although it suffers from a bad name, leverages new technologies and approaches to accommodate both structured and unstructured data from a range of sources without the need to categorize, classify, or label that data when it’s captured. In other words, because technologies such as Hadoop enable data to be ingested with high efficiency, we can now store it without already knowing how we’ll use it.
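To make "store it now, decide how to use it later" a bit more concrete, here is a minimal sketch in plain Python. It is not Hadoop and not any vendor's API; the names `raw_lake`, `ingest`, and `read_with_schema` are hypothetical, invented purely for illustration. The idea it tries to show is that nothing is classified at capture time, and structure is only imposed when someone reads the data with a question in mind.

```python
import json

# Hypothetical stand-in for a data lake: records are kept exactly as they arrived.
raw_lake = []

def ingest(record: str) -> None:
    """Store the raw record as-is; nothing is parsed or labeled at capture time."""
    raw_lake.append(record)

def read_with_schema(fields):
    """Apply a schema at read time: parse what parses as JSON, keep only the requested fields."""
    rows = []
    for raw in raw_lake:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unstructured text; this particular reader skips it
        if isinstance(doc, dict):
            # Fields missing from a record come back as None instead of failing.
            rows.append({f: doc.get(f) for f in fields})
    return rows

# Structured and unstructured data land in the same place, untouched.
ingest('{"user": "a", "amount": 12.5, "channel": "web"}')
ingest('free-form support note: customer asked about pricing')
ingest('{"user": "b", "amount": 3.0}')

# Only now, with a use case in hand, do we decide which fields matter.
sales = read_with_schema(["user", "amount"])
```

A different use case could call `read_with_schema(["channel"])` against the same stored records, or a text-mining reader could pick up the free-form note that this reader skips. The stored data never has to change to support the new question.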
Although so many vendors are rushing to position their capabilities to build you a data lake, many of them are missing the primary reason why their customers are slow to adopt. The challenge is that the promised value of a data lake falls into two distinct categories. The first is easy. It’s the cost savings side. It’s the efficiency derived from a better way to store massive amounts of both structured and unstructured data. And although that matters, it’s… well… boring. The second is what actually gets business leaders interested: new products, services, markets, customers, business models, partnerships, revenue streams, etc. And those are exactly the right types of use cases for big data analytics and data lakes.
But in order for business leaders to sign off on major investments, they need numbers, metrics, KPIs, ROI, time-to-value, opportunity cost, economies of scale, etc. And for big data, they need to understand the analytics use cases that will result in insight that advances their strategic initiatives. They need this before committing to making a major shift in how they “afford” IT, in hopes of turning it from a cost center into a revenue center.
From Day 1 at the 2015 Strata Conference in San Jose, it’s apparent that the data lake has moved from an experiment that runs alongside a data warehouse to a better approach for ingesting and storing data that has untapped value. The critical first step is to determine where and how to apply the analytics capabilities. Many studies show that identifying use cases for big data is the biggest obstacle to big data adoption. EMC has addressed this with a Big Data Vision Workshop. This infographic explains the process.