Image by Yodel Anecdotal via Flickr
Yahoo has been pretty active in Cloud Computing research. They are the biggest contributors to Hadoop, which is fast becoming a significant enabler of Cloud Computing, and they have played a significant role in the development of Pig, a high-level language for Hadoop. Apart from these development efforts, they have partnered with Carnegie Mellon University, giving its researchers access to a supercomputing cluster to analyze several million web documents. They then joined hands with Computational Research Laboratories in India to tap into its supercomputer and help scientists perform data-intensive computing research. They also partnered with HP and Intel for an open collaboration among industry, academia and government. In fact, they have even reached out to undergraduate institutions to make Cloud Computing a part of the curriculum. Their Cloud Computing research website can be found here.
Late last week, Yahoo announced an expansion of their partnership with Carnegie Mellon University, adding three more universities to the group: the University of California at Berkeley, Cornell University and the University of Massachusetts at Amherst. Along with Carnegie Mellon, these universities will tap into Yahoo’s supercomputing cluster to conduct large-scale systems software research and explore new applications that analyze Internet-scale data sets, ranging from voting records to online news sources.
Yahoo’s supercomputing cluster, called M45, has been operational since 2007. It runs Hadoop, an open source distributed file system and parallel execution environment that enables its users to process massive amounts of data. The cluster has approximately 4,000 processor cores and 1.5 petabytes of storage. Scientists at Carnegie Mellon University have been using it for more than a year now and have conducted research on more than 200 million web documents. Their research, including a performance comparison of the Hadoop file system with other parallel computing file systems, has resulted in many academic papers.
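For readers who have not seen the programming model behind Hadoop's parallel execution environment, here is a minimal sketch of the map/shuffle/reduce idea in plain Python, using the classic word-count example. It is only an illustration and runs in a single process: it does not use Hadoop's actual API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own hypothetical ones. On a real cluster such as M45, the map and reduce steps would be spread across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all counts for one word into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of document strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["yahoo runs hadoop", "hadoop runs on M45"]
    print(word_count(docs))
```

The appeal of this model is that each map call looks at one document and each reduce call looks at one key, so the work can be split up without the pieces needing to talk to each other.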
According to Shankar Sastry, dean of the College of Engineering at the University of California, Berkeley, this partnership will help them in many different fields: processing large-scale data such as voting records, online news sources and polling data; conducting computationally intensive econometrics research, which combines economic theory with statistics to analyze and test large-scale economic relationships; and even wildlife preservation, biodiversity and the management of renewable sources of energy.
The impact of this announcement is huge and will be felt in the years to come. I have always had a special liking for Yahoo because of their role in supporting Open Source projects. Now, with their role in Hadoop and their participation in Cloud Computing research, my respect for them has increased many-fold. Having realized the impact of their research on society, I really want them to succeed.