Derrick Harris, writing on GigaOm Pro (subscription required), argues that social networks like Facebook or Twitter cannot live on public clouds:
The real irony here is that, while social networks are cutting-edge
web-based services, there is little symbiosis with the other
cutting-edge web-based paradigm getting lots of attention lately: cloud
computing. Social network sites can’t look to the cloud to help them
solve their infrastructural woes.
He argues that the bandwidth costs and latency involved in handling large databases will be exorbitant, forcing social networking companies to build their own infrastructure.
I was somewhat surprised by his outright dismissal of Cloud Computing when it comes to delivering social network applications. My view is the opposite of Derrick’s. In fact, it is my opinion that social networks should tap into Clouds to add reliability to their offerings. There are a few things I want to point out in this discussion.
- I think his argument about the bandwidth costs of handling databases is a bit of hand-waving. He doesn’t give any numbers, so I cannot argue decisively here, but I wouldn’t put so much emphasis on bandwidth costs. First, providers like Amazon usually don’t charge for bandwidth as long as the packets stay within their network perimeter, so calls from the web server to the database server don’t cost anything. Second, by using technologies like memcached, we can reduce the number of interactions with the database. Unless I am missing something really obvious here, I don’t see this as a big enough problem to ignore the Clouds entirely. Latency within a Cloud provider’s infrastructure is usually no different from latency inside a datacenter.
- Also, we don’t have to treat the use of Clouds as a “with us or against us” problem. Social networks need not use the Clouds for everything (though I would, personally, like to push them toward complete adoption of public clouds). They could use CDNs for delivering static content, or tap into the Cloud for message queues, and so on. There are many ways to partition infrastructure between on-premise and Clouds and thereby minimize the money spent on it.
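To make the memcached point above concrete, here is a minimal sketch of the cache-aside pattern in Python. It is only an illustration under stated assumptions: a plain dict stands in for a memcached client, another dict stands in for the database, and the names `CacheAsideStore`, `get_profile` and `update_profile` are hypothetical, not taken from any real social network's code.

```python
class CacheAsideStore:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, database):
        self.database = database  # dict standing in for a real database
        self.cache = {}           # dict standing in for a memcached client
        self.db_reads = 0         # counts how often the database is actually hit

    def get_profile(self, user_id):
        key = f"profile:{user_id}"
        if key in self.cache:
            # Cache hit: no database round trip at all.
            return self.cache[key]
        # Cache miss: read from the database once, then remember the result.
        self.db_reads += 1
        profile = self.database[user_id]
        self.cache[key] = profile
        return profile

    def update_profile(self, user_id, profile):
        # Write to the database, then drop the stale cache entry so the
        # next read fetches the fresh value.
        self.database[user_id] = profile
        self.cache.pop(f"profile:{user_id}", None)


store = CacheAsideStore({42: {"name": "alice"}})
for _ in range(1000):
    store.get_profile(42)
print(store.db_reads)  # 1: only the first of the 1000 reads reached the database
```

A real deployment would also need expiry times and a strategy for cache misses under heavy load, but the basic effect is the same: repeated reads of popular data stop generating database traffic.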
In my opinion, the biggest obstacle faced by social networks is scalability, and that depends entirely on how the application platform is architected. I wouldn’t put the blame on the public cloud infrastructure. In fact, I don’t see how public clouds would be worse than managed hosting infrastructure (which, I suppose, is what Twitter uses). We have seen many social apps on Facebook hosted on public cloud infrastructure with great success. Jaiku may not have a user base like Twitter’s, but it is now hosted on a public cloud (Google App Engine) and it could scale up to the level of Twitter without any problems.
Nati Shalom, founder of Gigaspaces, addressed this question more than a year ago:
Twitter is no different than many other web apps that have become overnight successes.
He identifies their problem exactly:
Success was much bigger and faster
then they imagined. Not surprisingly. the architecture was not
designed for scalability and they are now forced to go through the painful process
of scaling their architecture.
He then goes on to show that this scalability problem can be solved with their Gigaspaces platform. The most important point in his post is that when their solution is coupled with a public cloud platform like Amazon EC2, the problem becomes much easier to solve.
With services such as Amazon EC2,
and other cloud environments, this can be made even simpler, as we can have a
pre-configured image and hardware ready for deployment. All we need to do is just
deploy our business logic.
I will conclude this post with Nati Shalom’s own conclusion, which makes my point without my adding anything extra:
today’s frameworks architectures, we
don’t have to go through the same painful experience. We can build
architectures from the get-go. I would even argue that it takes less
time to build applications with this approach than the traditional