Traditionally companies invest into software that has been proven to meet their needs and has a clear ROI. This model falls apart when disruptive technology such as Big Data comes around. Most CIOs have started to hear about Big Data and based on their position on the spectrum of conservative to progressive they have either started to think about investing or have already started investing. The challenge these CIOs face is not so much whether they should invest into Big Data or not but what they should do with it. Large companies have complex landscapes that serve multiple LOBs and all these LOBs have their own ideas about what they want to get out of Big Data. Most of these LOB executives are even more excited about the potential of Big Data but are less informed about the upstream technical impact and the change of mindset that IT will have to go through to embrace it. But, these LOBs do have a stronger lever – money to spend if they see that technology can help them accomplish something that they could not accomplish before.
As more and more IT executives get excited over the potential of Big Data they are underestimating the challenges to get access to meaningful data in single repository. Data movement has been one of the most painful problems of a traditional BI system and it continues to stay that way for Big Data systems. A vast majority of companies have most of their data locked into their on-premise systems. Not only it is inconvenient but it’s actually impractical to move this data to the cloud for the purposes of analyzing it if Big Data platform happens to be a cloud platform. These companies also have a hybrid landscape where a subset of data resides in the cloud inside some of the cloud solutions that they use. It’s even harder to get data out from these systems to move it to either a cloud-based or an on-premise Big Data platform. Most SaaS solutions are designed to support ad hoc point-to-point or hub and spoke REST-ful integration but they are not designed to efficiently dump data for external consumption.
Integrating semantics is yet another challenge. As organizations start to combine several data sources the quality as well as the semantics of data still remain big challenges. Managing semantics for single source in itself isn’t easy. When you add multiple similar or dissimilar sources to the mix this challenge is further amplified. It has been the job of an application layer to make sense out of underlying data but when that layer goes away the underlying semantics become more challenging.
If you’re a vendor you should work hard thinking about business value of your Big Data technology – not what it is to you but what it could do for your customers. The spending pie for customers hasn’t changed and coming up with money to spend on (yet another) technology is quite a challenge. My humble opinion on this situation is that vendors have to go beyond technology talk and start understanding the impact of Big Data, understand the magnitude of these challenges, and then educate customers on the potential and especially help them with a business case. I would disagree with people who think that Big Data is a technology play/sale. It is not.
Photo Courtesy: Kurtis Garbutt