Hadoop Summit 2011 was held this week in Santa Clara, and it highlighted how much Hadoop has matured in the past few years. Hadoop is an open source project under the Apache Software Foundation that aims to solve the storage and processing of big data. Based on Google’s MapReduce, Hadoop has emerged as a major platform in the space, and leading enterprises are using it to handle their data. An ecosystem of vendors, offering anything from products built around the Hadoop platform to third-party support, has emerged around Hadoop, making it the Linux of the big data world. This week's Hadoop Summit highlighted this growing importance, with some exciting news coming out of the ecosystem partners.
Some of the key highlights from the summit are:
- Yahoo spins off Hortonworks, a company focussed on commercializing Hadoop, with the support of Benchmark Capital. Hortonworks is led by Eric Baldeschwieler, who was formerly a VP of Software Engineering at Yahoo. Since Hortonworks employs many core Hadoop developers, it will also play a key role in Hadoop development. In fact, Hortonworks has signed up Yahoo as its first client, offering third-tier support on Hadoop. Expect it to put tremendous pressure on competing companies like Cloudera and MapR, leading to rapid innovation around Hadoop.
- Cloudera (previous CloudAve coverage), touted as the Red Hat of Hadoop by pundits some time back, announced the release of Cloudera Enterprise 3.5 with some nifty features aimed at making the Hadoop experience seamless and secure. Cloudera Enterprise is a full lifecycle management and automated security solution for Apache Hadoop deployments. Some of the key features of Cloudera Enterprise 3.5 are: a Service and Configuration Manager (SCM), which radically simplifies the deployment and management of the range of Hadoop services available in the Cloudera distribution (the SCM tool is also available as a free download that lets anyone set up a 50-node Hadoop cluster in minutes); an Activity Manager offering not only a real-time view into Hadoop systems but also a historical view of Hadoop jobs; and additional enhancements to the Resource Manager and Authorization Manager.
- Pervasive Software (previous CloudAve coverage), the Austin-based integration provider with a big data push, announced Pervasive TurboRush for Hive, a product that makes Hive queries run faster on less hardware. I have already written about Pervasive’s big plans for big data with its DataRush platform. Pervasive TurboRush for Hive is the first of many accelerators, powered by the DataRush platform, that Pervasive is planning to release. Users can get a 2-4X speedup with Pervasive TurboRush for Hive, which is expected to speed up intelligence gathering in the big data world.
- MapR, the Silicon Valley-based company commercializing Hadoop, announced the release of two products: a free version of its Hadoop distribution and a commercial version. MapR claims that its distribution offers 2-5X performance improvements and tremendous cost savings by cutting the hardware needed in half. MapR’s distribution includes some proprietary components, unlike some of the competing distributions.
When LexisNexis announced the open sourcing of its core big data platform, it was evident that Hadoop had gone big in the enterprise space. This week’s Hadoop Summit confirmed the trend: Hadoop is the platform to beat as we go further into the big data world.