The UIS Beowulf Cluster: High Performance Computing
The Advanced Research Computing Group
Imagine you're in charge of shelving new books in a library. In this library, books aren't shelved by subject, or author, or even by title; they're shelved according to which pages have the word "got" printed on them. Each time a new book comes in, before you can tell where it belongs, you must list all the places that the word "got" appears in the text. Then you must compare that list to the "got lists" for all the other books in the library and find the closest matches. Only then will you know where to shelve it.
Now imagine that you're classifying proteins. Each protein is a chain of hundreds of amino acids; you've got to compare each new sequence to the known sequences of proteins in your database and find the closest matches. But you've already got 900,000 proteins in your database, and each month thousands of new proteins requiring classification come pouring in. This makes shelving books at our fantasy library look like child's play. Even after you write software to automate the tasks of identifying and matching the proteins, how do you get it to check through those 900,000 entries in a timely manner? Baris Suzek of the Protein Information Resource (PIR) is answering these questions, and a large part of the answer is Abba, the Department of Chemistry/UIS Beowulf cluster.
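To make the matching step concrete, here is a minimal Python sketch of a closest-match search. It is only an illustration, not the PIR's software: the toy sequences, the names `similarity`, `closest_matches`, and `KNOWN`, and the use of the standard library's `difflib.SequenceMatcher` as a similarity score are all assumptions made for this example. Real protein classification relies on dedicated sequence-alignment tools.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b).ratio()

def closest_matches(query, database, top_n=3):
    """Score the query against every entry and keep the top_n best matches."""
    scored = [(similarity(query, seq), name) for name, seq in database.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(name, round(score, 3)) for score, name in scored[:top_n]]

# Hypothetical toy database; a real one would hold hundreds of thousands of entries.
KNOWN = {
    "protein_a": "MKTAYIAKQRQISFVKSHFSRQ",
    "protein_b": "MKTAYIAKQRQISFVKAHFSRQ",
    "protein_c": "GSHMLEDPVRWTTQACNNKLLY",
}

if __name__ == "__main__":
    for name, score in closest_matches("MKTAYIAKQRQISFVKSHFSRQ", KNOWN, top_n=2):
        print(name, score)
```

Every new sequence has to be scored against every stored one, so the work grows with the size of the database. That is why a single computer struggles with 900,000 entries.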
What is a Beowulf cluster?
Originally conceived at NASA, a Beowulf cluster is a high performance computer system consisting of several small, fast computers (nodes) harnessed together and controlled by a single computer (the master node). In this way, all the nodes can work in parallel, generating tremendous computing power. The individual nodes and the master node are referred to collectively as a Beowulf cluster.
According to Woonki Chung of Georgetown's Advanced Research Computing (ARC) program, a Beowulf cluster provides great power at relatively low cost by using "commodity off-the-shelf" components: cheap and fast PCs; affordable, high-bandwidth internal networking; the Linux operating system; and parallel programming software.
The important thing to remember is that while each individual node may not be particularly powerful, hooking eight or more of them together does create a very powerful system.
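The division of labor between the master node and the other nodes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Abba is programmed: threads on one machine stand in for separate nodes, and the function names (`split_into_chunks`, `node_task`, `master`) are hypothetical. A real cluster would pass messages between machines using parallel programming software, and Python threads show only the pattern, not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(items, n_chunks):
    """Divide items into n_chunks contiguous slices of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def node_task(chunk, query):
    """Work done by one node: count entries in its slice that contain the query."""
    return sum(1 for entry in chunk if query in entry)

def master(entries, query, n_nodes=8):
    """Master node: give each worker one slice, then combine the partial counts."""
    chunks = split_into_chunks(entries, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(node_task, chunks, [query] * n_nodes))
    return sum(partials)

if __name__ == "__main__":
    entries = ["abc", "bcd", "cde", "xyz"] * 25  # toy data standing in for a database
    print(master(entries, "bc"))
```

Each worker handles only about one-eighth of the entries, and the master simply adds up the partial results. The same split-and-combine idea is what lets several modest machines behave like one powerful system.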
The Georgetown Abba cluster
Georgetown's Abba cluster was built when Professor Miklos Kertesz of the chemistry department apprised UIS of the need for a local high performance computer cluster. Partnering with the Department of Chemistry and Dell Computers, UIS obtained the necessary components.
Composed of eight rack-mountable Dell PowerEdge 1550 servers, each with dual 800 MHz Pentium III processors, the Abba cluster runs the Scyld Beowulf Cluster Operating System. It was created by Woonki Chung and is maintained by Arnie Miles, systems administrator at ARC.
Using the Abba cluster
The Abba cluster will be used primarily for Department of Chemistry computation, but it may also be made available for other research projects. Baris Suzek was able to test the system and reported that Abba vastly shortened his processing times: jobs that could have taken weeks on a stand-alone computer took just days on Abba. If you are interested in using a Beowulf cluster for your research, please visit the Abba Web site and contact moores@georgetown.edu.