eWEEK Labs Walk-Through: the Internet Archive

By eWEEK | Posted 2012-05-28


Brewster Kahle, founder of the Internet Archive, helped start a supercomputer company called Thinking Machines that built systems for searching large text collections. The Internet Archive has one of the machines from Thinking Machines onsite at one


eWEEK Labs Walk-Through: the Internet Archive - StorageTek

The Internet Archive formerly used a StorageTek TimberWolf 9710 tape library, which saved money but suffered from slow access speeds.


eWEEK Labs Walk-Through: the Internet Archive - HP Desktops

As recently as 2002, the Internet Archive was using ordinary HP desktops stacked on top of one another. IT managers replaced the machines' original disks with 160GB disks to increase their storage capacity.


eWEEK Labs Walk-Through: the Internet Archive - John Berry

John Berry, VP of Operations at the Internet Archive, stands in front of racks of petaboxes: machines developed by the archive to store and process one petabyte of information in environments that require low power and high density. The petaboxes, whic


eWEEK Labs Walk-Through: the Internet Archive - Archive System Checker

Nagios, an enterprise-class open-source network monitoring application, monitors the status of more than 16,000 checks running on more than 800 machines in the Internet Archive's primary cluster.


eWEEK Labs Walk-Through: the Internet Archive - Petabox Catalog

The Petabox Catalog manages thousands of tasks running across the cluster, balancing workloads and tracking job progress.


eWEEK Labs Walk-Through: the Internet Archive - Petabox Control Panel

The Petabox Control Panel provides a Web interface for configuring and modifying settings at the cluster, rack, node and partition levels.

