Last week Gluster (previous CloudAve coverage), a provider of open source storage solutions, announced the beta version of its next release, which adds support for open source Hadoop. With this announcement, Gluster is aiming to be the open source alternative to proprietary storage solutions in the petabyte age. GlusterFS will use the standard filesystem APIs available in Hadoop to offer new storage options for Hadoop deployments. Even though I don’t follow storage vendors as closely as other infrastructure providers, I am following Gluster because of its open source roots and the aggressiveness with which it wants to participate in the cloud marketplace. I am also excited by its support for the OpenStack project (previous CloudAve coverage).
For those who are not following GlusterFS, it is an open source, POSIX compliant, software based storage solution that runs on top of a pool of commodity hardware and scales on demand. According to Gluster, it can scale out to petabytes and delivers throughput comparable to enterprise grade proprietary systems. It can be deployed either on premise or on public clouds. Since it is POSIX compliant, existing applications and tools can use it through ordinary file operations, so it fits easily into an existing IT environment.
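To make the POSIX point a bit more concrete, here is a minimal sketch of what "fits into an existing IT environment" can look like in practice: plain file I/O with no storage-specific client library. The function name `write_and_read` and the file name are hypothetical, and the example uses a temporary directory as a stand-in for wherever a GlusterFS volume might be mounted, so treat it as an illustration rather than anything Gluster ships.

```python
import os
import tempfile


def write_and_read(mount_point: str, name: str, payload: bytes) -> bytes:
    """Write a file with ordinary POSIX-style calls, then read it back.

    `mount_point` is an assumption: any directory works, including a
    directory where a POSIX compliant volume happens to be mounted.
    """
    path = os.path.join(mount_point, name)
    tmp_path = path + ".tmp"

    # Write to a temporary name first and flush it to stable storage.
    with open(tmp_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())

    # Move the finished file into place under its final name.
    os.replace(tmp_path, path)

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # A temporary directory stands in for a real mount point here.
    with tempfile.TemporaryDirectory() as stand_in:
        data = write_and_read(stand_in, "report.csv", b"id,value\n1,42\n")
        print(data.decode(), end="")
```

Nothing in the code knows or cares what filesystem sits underneath, which is the practical appeal of POSIX compliance for existing applications.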
GlusterFS support for Hadoop makes it the first open source, POSIX compliant file and object storage solution for Hadoop. Any MapReduce application is compatible with GlusterFS. It can also co-exist with HDFS, thereby opening up the data in any Hadoop deployment to any file or object based application. This is a pretty neat implementation, IMHO. Gluster will be working with Hortonworks, the newly formed Yahoo spinoff created to monetize Hadoop.
Some of the features of this new release include:
- Faster access to data
- High Availability through N-Way replication
- Increased flexibility in sizing
- POSIX compliant NAS
- Open Source
It will be interesting to see how organizations tap into GlusterFS for their big data needs.