
The Roadmap to ‘Hadoop in the Cloud’
The Twitter ball started rolling again just now. Matt Asay posed an interesting question about Forrester suggesting Hadoop isn’t a great fit for the cloud. (Even) without context, Vijay Vijayasankar and I started firing off questions and answers, which inevitably led to my promise of writing down the transition plan for it. Here it is. I’ll start bottom-up, from […]

Teradata Labs President Scott Gnau discusses the evolving data analytics market
With all the noise around newer technologies such as Hadoop, it would be easy to assume that the data analytics space is new and totally dominated by the NewSQL/NoSQL tools pouring out of the world’s startups. It would be easy. But it would also be wrong. Data analytics, business intelligence, and related ideas are not new. […]

Rise Of Big Data On Cloud
Growing up as an engineer and as a programmer, I was reminded every step along the way that resources, computing as well as memory, are scarce. Programs were designed around these constraints. Then the cloud revolution happened and we told people not to worry about scarce computing. We saw the rise of MapReduce, Hadoop, and countless other […]

Visualisation – the key that unlocks data’s value?
As the Big Data hype machine continues its relentless attempt to gobble everything in its path, new business units and entire new domains buying into the promise find themselves faced with unanticipated data volume and complexity. They see the potential for data-based decision making, but still face (short-term?) challenges in actually managing, analysing or interpreting […]

GigaOM Pro report on Hadoop and cluster management
My latest piece of work for GigaOM Pro just went live. Scaling Hadoop clusters: the role of cluster management is available to GigaOM Pro subscribers, and was underwritten by StackIQ. Thanks to everyone who took the time to speak with me during the preparation of this report. As the blurb describes, From Facebook to Johns […]

Crunching the numbers in search of a greener cloud
Although sometimes portrayed as a big computer in the sky, the reality of cloud computing is far more mundane. Clouds run on physical hardware, located in data centres, connected to one another and to their customers via high speed networks. All of that hardware must be powered and cooled, and all of those offices must […]

The Right Storage, the Right Cloud
We spend a lot of time in this blog talking about the architecture of elastic infrastructure clouds (EIC) like AWS and our own Open Cloud System. We contrast this against the architecture of enterprise virtualization clouds (EVC) like VCE’s Vblock. Nowhere are these differences more obvious than when you look at how storage systems are […]

4 Big Data Myths – Part II
This is the second and last part of this two-part series on Big Data myths. If you haven’t read the first part, check it out here. Myth # 2: Big Data is an old wine in new bottle. I hear people say, “Oh, that Big Data, we used to call it BI.” One […]

4 Big Data Myths – Part I
It was cloud then and it’s Big Data now. Every time there’s a new disruptive category it creates a lot of confusion. These categories are not well-defined. They just catch on. What hurts the most is the myths. This is the first part of my two-part series to debunk Big Data myths. Myth # 4: […]

Next Iteration Of PaaS: Microsoft Game Plan
In January, I proposed a simple model for the next iteration of PaaS, called Intelligent Platforms, which is centered around data. As we move into a world dominated by Big Data, with mobile devices and various sensors churning out data several orders of magnitude beyond even the petabyte scale, data is the new oil for […]

Big Data needs Big Collection and Big Execution
Big Data is the new buzz, it seems, and I must say I have been sceptical of it since I first saw the very word – or phrase, what is it? As an IT architect, I’ve always equated data with databases, and information with applications – and knowledge with the people on top of these […]

Coming To A Place Near You: A Private Cloud Spiked With Big Data
[Image: Netflix similarity map] Yesterday, I moderated a couple of panels at the Big Data Cloud event. I have been a keynote speaker, panelist, moderator, and participant for many conferences in the last few years. It has always been a pleasure to see the cloud and big data becoming more and more mainstream. Here are my […]

Early Signs Of Big Data Going Mainstream
Today, Cloudera announced a new $40m funding round to scale their sales and marketing efforts and a partnership with NetApp, under which NetApp will resell Cloudera’s Hadoop as part of their solution portfolio. Both of these announcements are critical to where the cloud and Big Data are headed. Big Data going mainstream: Hadoop and MapReduce are not […]

Gluster Adds Hadoop Support To Offer An Open Source Petabyte Storage Alternative
Last week, Gluster (previous CloudAve coverage), a provider of open source storage solutions, announced the beta version of their next release with support for open source Hadoop. With this announcement, Gluster aims to be the open source alternative to proprietary storage solutions in the petabyte age. GlusterFS will use standard filesystem APIs available in Hadoop […]

Hadoop Looms Big After The Hadoop Summit
Hadoop Summit 2011 was held this week in Santa Clara, and it highlighted how Hadoop has matured in the past few years. Hadoop is an open source project under the Apache Software Foundation which aims to solve the storage and processing of big data. Based on Google’s MapReduce, Hadoop has emerged as a major platform in […]