November 7, 2011

NSF funds massive data ‘pipeline’

Johns Hopkins to build blazingly fast scientific computer network

Financed by a $1.2 million National Science Foundation grant, one of the world’s fastest and most advanced scientific computer networks will be built on The Johns Hopkins University’s Homewood campus, with support from the University of Maryland, College Park. Each day, the network will be capable of transferring in and out of the university an amount of data equivalent to 80 million file cabinets filled with text.

The grant was announced last week by U.S. Sen. Barbara Mikulski of Maryland, who is chair of the Commerce, Justice and Science Appropriations Committee.

The network will allow for the transfer and analysis of the kind of complex and massive data sets being produced today in scientific fields such as astrophysics, medical research, genomics and turbulence modeling, according to Johns Hopkins physicist and computer scientist Alexander Szalay, one of the lead researchers on the new grant.

“Computer science has drastically altered the way we do science and the science that we do, and this networking capability is a crucial part of that,” said Szalay, the Alumni Centennial Professor in the Krieger School’s Henry A. Rowland Department of Physics and Astronomy and director of the university’s Institute for Data Intensive Engineering and Science. “This NSF-funded network will be one of the nation’s first public 100-gigabit-per-second Internet connections and will allow us to move data sets thousands of times bigger than we previously thought possible. Johns Hopkins will finally have world-class computing facilities.”

This installation, supported by the NSF’s Office of Cyberinfrastructure, will allow Johns Hopkins to receive huge data sets from Google, Oak Ridge National Laboratory and the San Diego Supercomputer Center, among others, according to Szalay, who is a co-principal investigator on the grant along with Jonathan Bagger and Mark Robbins, both professors in the Department of Physics and Astronomy at Johns Hopkins.

The new system will be housed in a powerful, energy-efficient computing center in the Bloomberg Center for Physics and Astronomy on the Homewood campus, in a space that once served as the mission control center for NASA’s Far Ultraviolet Spectroscopic Explorer satellite. This transformation is being supported by a $1.3 million stimulus grant administered through the National Science Foundation.

Housed in the new space will be the Homewood High-Performance Cluster, which brings together the resources of investigators in both the Krieger School of Arts and Sciences and the Whiting School of Engineering to create a powerful and adaptive co-op facility that is designed to support large-scale computations on the Homewood campus.

Also housed in the new center will be the Data-Scope, a powerful cluster of computers capable of handling colossal sets of information. The cluster will be able to handle five petabytes of information, which is the equivalent of 66.5 years of HDTV data. (To put this in context, 50 petabytes would equal the entire written work of humankind, from the beginning of history until now, in all languages.)
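For a rough sense of scale, the 100-gigabit-per-second figure quoted above can be turned into a daily volume with simple arithmetic. The sketch below is only a back-of-the-envelope upper bound: it assumes the link runs at its full nominal rate around the clock with no protocol overhead, and it uses decimal petabytes (10^15 bytes). The function names are illustrative and do not come from the project.

```python
# Back-of-the-envelope transfer arithmetic. Assumes the link runs at its
# full nominal rate with no protocol overhead, which real transfers do not.
LINK_BITS_PER_SECOND = 100e9      # 100 gigabits per second (from the article)
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_PETABYTE = 1e15         # decimal petabyte (assumption)


def petabytes_per_day(bits_per_second: float) -> float:
    """Upper bound on data moved in one day at the given line rate."""
    bytes_per_day = bits_per_second / 8 * SECONDS_PER_DAY
    return bytes_per_day / BYTES_PER_PETABYTE


def days_to_transfer(petabytes: float, bits_per_second: float) -> float:
    """Minimum number of days needed to move the given volume."""
    return petabytes / petabytes_per_day(bits_per_second)


if __name__ == "__main__":
    print(f"{petabytes_per_day(LINK_BITS_PER_SECOND):.2f} PB per day")
    print(f"{days_to_transfer(5, LINK_BITS_PER_SECOND):.2f} days for 5 PB")
```

Under these assumptions the link tops out at roughly one petabyte per day, so moving a data set the size of the Data-Scope's five-petabyte capacity would take several days at best.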

The new apparatus will allow Johns Hopkins researchers—as well as those at other institutions, including universities and national laboratories such as Los Alamos and Oak Ridge—to conduct research directly in the database.

“This new National Science Foundation grant will facilitate lightning-fast connections to the Internet, which, together with our new NSF-funded computer facility, will allow Johns Hopkins researchers to lengthen their lead in data-intensive science and engineering,” Bagger said.

The network will be supported by the regional Mid-Atlantic Crossroads research and engineering network at the University of Maryland, College Park. About $950,000 of the grant money comes directly to Johns Hopkins, and the remaining $250,000 goes to the University of Maryland.

Related websites

Computing at Johns Hopkins:

A seismic leap for science

A Space to switch on land

Institute for Data Intensive Engineering and Science (IDIES)

IDIES Homewood High-Performance Cluster

IDIES Research