Facebook supports its social network of 500 million users with LAMP software infrastructure. This open source approach, also used by Google, Twitter, Yahoo and others, is a departure from the proprietary products offered by Microsoft, Oracle and IBM.
Facebook takes some heat from Google and others in the
social networking community for being a walled garden that keeps the
information users share locked inside its own confines.
But most of the software infrastructure that supports
Facebook activities is in fact open source, free for programmers to adopt,
customize and use. Google, Twitter, Yahoo and other Internet companies today
also support their platforms with open-source products.
Royal Pingdom took time in June to
catalogue Facebook's software infrastructure and noted that the social network
is largely a LAMP (Linux, Apache, MySQL and PHP) Website, albeit with some
modifications.
Facebook uses Sun Microsystems' open-source database MySQL for persistent
storage, Memcached for distributed memory caching between its Web servers
and MySQL servers, and the distributed storage system Cassandra for inbox
search.
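To illustrate how a memory cache can sit between Web servers and a database, here is a minimal Python sketch of the common "cache-aside" pattern. It does not show Facebook's actual code or the real Memcached or MySQL client APIs. Plain dictionaries stand in for both systems, and the names FakeDatabase and CachedStore are hypothetical.

```python
# Cache-aside sketch: plain dicts stand in for Memcached and MySQL.
# All class and key names here are hypothetical, for illustration only.


class FakeDatabase:
    """Stand-in for a MySQL server; counts how often it is queried."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.queries = 0

    def fetch(self, key):
        self.queries += 1
        return self.rows.get(key)


class CachedStore:
    """Checks the in-memory cache first and falls back to the database."""

    def __init__(self, database):
        self.database = database
        self.cache = {}  # stand-in for a Memcached cluster

    def get(self, key):
        if key in self.cache:
            return self.cache[key]        # cache hit: no database query
        value = self.database.fetch(key)  # cache miss: read persistent storage
        if value is not None:
            self.cache[key] = value       # populate the cache for next time
        return value

    def update(self, key, value):
        self.database.rows[key] = value   # write to persistent storage
        self.cache.pop(key, None)         # invalidate the stale cached copy


db = FakeDatabase({"user:1": "Alice"})
store = CachedStore(db)
first = store.get("user:1")   # miss: reads the database
second = store.get("user:1")  # hit: served from the cache
```

In this sketch, repeated reads of the same key reach the database only once, which is the reason for placing a cache in front of persistent storage.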
Facebook also created and released as open source
HipHop for PHP, a source-code transformer that turns PHP into C++, which is
then compiled into native code for Facebook's Web servers.
"Not only is Facebook using (and contributing to)
open-source software such as Linux, Memcached, MySQL, Hadoop, and many others,
it has also made much of its internally developed software available as open
source," Royal Pingdom noted, citing HipHop, Cassandra, Thrift and Scribe as
well as the Tornado Web server.
Other Internet companies use many of the same open-source
tools Facebook employs. For example, Twitter
uses Cassandra on its geo and research teams.
Facebook and Twitter also use Hadoop, a MapReduce
implementation that makes it possible to perform calculations on massive
amounts of data. In fact, Hadoop seems to be used by every Internet company
worth its salt, from Google to Amazon and Amazon Web Services to Yahoo.
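The MapReduce model itself can be sketched in a few lines of single-process Python. This is an illustration of the idea only, not Hadoop's API (Hadoop is written in Java and spreads these phases across many machines). The function names map_phase, shuffle, reduce_phase and word_count are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["open source", "open data"])
# counts == {"open": 2, "source": 1, "data": 1}
```

Because each map call looks at only one document and each reduce call at only one key, a real cluster can run many of them in parallel, which is what makes the approach practical for very large data sets.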
Yahoo is a heavy Hadoop user, burning through more than
100,000 CPUs in 36,000 computers for its Web search and ad platforms. The
company contributed to the open-source project last month by
releasing Hadoop with Kerberos security and a workflow engine for Hadoop.
Why, after years of enterprises powered by proprietary
software infrastructure from Microsoft, Oracle, HP, IBM and others, are big
Internet companies writing or adopting open source with such gusto?