Computing Advice from Colwell, National Science Foundation Leader
by Rita R. Colwell, Ph.D.
One of my favorite yardsticks of wisdom comes from Ralph Waldo Emerson, writing over a century ago. “The invariable mark of wisdom,” he said, “is to see the miraculous in the common.”
As scientists, engineers, and educators, we are privileged to have our lives infused with the miraculous. Discovery, learning and innovation are paths we travel daily. Many of us have seen our work transformed in unimagined ways by the power and breadth of the information and communications revolution that we are all a part of.
I’ll describe the National Science Foundation’s (NSF) vision of cyberinfrastructure for the future, and then provide some examples to illustrate why the time is ripe for action. The first wave of the information and communications technology revolution has reshaped the once familiar landscape of the economy and has forced us to clear new paths in research, education, and business. It has swept across every field of research, and changed forever our scientific and educational horizons. New frontiers of knowledge, unimagined only a few years ago, are now open to us.
For decades, NSF has been steadily crystallizing the idea of a center that brings together diverse skills, tools, and perspectives to focus laser-like on scientific and technological problems. From this came the original science and technology centers, the engineering research centers, and the supercomputing centers. Centers in new and promising areas of research are burgeoning.
Now we look toward a grander scale: the TeraGrid, a distributed facility that will let computational resources be shared among widely separated groups. This will be the most advanced computing facility available to scientists for all types of research in the US—exceptional not just in computing power but also as an integrated facility. Teams of researchers working within and across disciplines are coming together to lay the foundations for a cyberinfrastructure revolution.
More than any other fields, astronomy and physics have already benefited from the supercomputing revolution. Databases in astronomy and physics are currently orders of magnitude larger than those in neuroscience, earthquake engineering, or ecology. But not for long! Planned instruments and observational platforms will boost these figures sky-high in the years ahead.
Now let’s move from the far reaches of the cosmos to our own dynamic planet and to life on earth. The Northridge earthquake of 1994 reminded us of our vulnerability, and spurred the pace of disaster research. A variety of tools will help us understand the forces causing earthquakes and their destructive consequences. The Network for Earthquake Engineering Simulation is a 21st century model of collaboration—literally, a laboratory without walls or clocks. Researchers from across the United States will be able to operate equipment and observe experiments from anywhere on the net. They will study how building design, advanced materials and other measures can minimize earthquake damage and loss of life.
From earth science, I turn to the life sciences. Our new information and communications tools, combined with advances in molecular biology, fueled the second great scientific revolution of the last century: genomics. The jump from the tiny genome of the first bacterium sequenced, Haemophilus influenzae, with 1.8 million base pairs, to the 3.12 billion base pairs that make up the human genome was a leap of enormous magnitude. Researchers from Celera Genomics, who helped sequence the human genome, estimate that assembly of the 3.12 billion base pairs of DNA required 500 million trillion sequence comparisons. Completing the human genome project might have taken years longer without the terascale power of our newest computers. Sequencing is underway on the parasite that causes malaria, on many of the world’s major food crops, and on the mouse. The age of biotechnology lies before us.
A challenge now is to describe gene function, and to unravel the structure and function of proteins. It can take milliseconds for a nascent protein to fold into its functional conformation. Until recently, it took 40 months of computer time to simulate that folding. With new terascale computer systems, operating at one trillion operations per second, we have reduced that time to one day. That’s roughly 1,000 times faster.
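As a rough check on that figure, the arithmetic can be sketched in a few lines of Python. The 30-day month and the helper name `speedup` are assumptions made purely for illustration; they do not come from the simulations themselves.

```python
# Back-of-the-envelope check of the folding-simulation speedup.
# Assumption: an average month is taken as 30 days (not stated in the text).
DAYS_PER_MONTH = 30


def speedup(old_months: float, new_days: float) -> float:
    """Return how many times faster the new run is than the old one."""
    old_days = old_months * DAYS_PER_MONTH
    return old_days / new_days


if __name__ == "__main__":
    # 40 months of computer time reduced to a single day.
    factor = speedup(old_months=40, new_days=1)
    print(f"About {factor:,.0f} times faster")
```

Under these assumptions the ratio comes out near 1,200, which agrees with the figure of 1,000 as an order of magnitude.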
Simulation and visualization can reveal what experiment cannot. Klaus Schulten and colleagues at the University of Illinois at Urbana-Champaign studied water molecules passing single-file through a channel of the membrane protein aquaporin, each doing a mid-channel flip. This mechanism blocks damaging hydrogen ions from entering the cell, while allowing water to pass through at up to a billion molecules per second. When impaired, aquaporins play a role in cataracts and diabetes.
This year, the Nobel Prize was awarded to three scientists for pioneering work that established a homely worm, Caenorhabditis elegans, as a model for neuroscience. Today, our imaging techniques, such as MRI and CAT, are producing a wealth of data on the human brain. Supercomputing projects are breaking new paths and opening the frontiers for the complex study of cognitive and behavioral neurobiology. Erich Jarvis is investigating the neurobiology of vocal communication in songbirds to determine how vocal learning and associated brain structures evolved.
Vocal learning is the ability to imitate sounds. It is present in only six groups of animals: three groups of birds (parrots, hummingbirds and songbirds) and three groups of mammals (bats, cetaceans and humans). Evidence suggests that vocal learning evolved independently in all six groups over 65 to 70 million years. Perception and production of song in these groups are accompanied by anatomically distinct patterns of gene expression. Jarvis hopes to develop a model for how the brain generates, perceives, and learns behavior. His work draws on a broad spectrum of fields, integrating behavioral, anatomical, electrophysiological, molecular biological and bioinformatics techniques. It could advance our knowledge of how humans learn language, of brain dysfunction, and of the evolution of intelligence.
These surprising connections—from molecular structure to behavior to cognition to gene expression to neuroanatomy—give us a taste of the extraordinary complexity—and potential for insight—that a biocomplexity perspective provides. A robust, flexible, and comprehensive cyberinfrastructure will give us the foundation we need to make rapid progress in understanding even our human complexities.
Data, computing speed, and networks are steps on the path to wisdom—they do not constitute wisdom. We now know that providing a secure homeland will increasingly depend on understanding other cultures—their ideas and attitudes—as well as advancing cyber security, and developing antidotes to combat biological and chemical threats.
The world of vast distances and differences is shrinking, and soon every part of the globe will seem as close as our own back yard. We need to keep our eyes on that future and plan now for the time when we are all next-door neighbors. That will define science and engineering for a 21st-century society. Cyberinfrastructure will help take us there and beyond.
Dr. Colwell is Director of the National Science Foundation.