
Witness Testimony

Mr. Charlie Catlett
Senior Fellow
Computation Institute Argonne National Laboratory
9700 S. Cass Avenue
Lemont, IL, 60439

Online Pornography: Closing the Doors on Pervasive Smut.
Subcommittee on Commerce, Trade, and Consumer Protection
May 6, 2004
10:00 AM


Good morning, Mr. Chair and Members of the Committee. Thank you for allowing me this opportunity to comment on the use of the Internet and distributed computing technologies. I am Charles E. Catlett, a senior fellow at the Computation Institute at the University of Chicago and Argonne National Laboratory. I am the executive director of the NSF TeraGrid project, which is constructing one of the world's most powerful distributed computing systems, scheduled to be completed in October of this year. I am also the founding chair of the Global Grid Forum, an international standards body that brings together distributed computing researchers, commercial software providers, and end users to create software standards for distributed computing on the Internet. I have been involved in the evolution of the Internet since 1984, doing research in both advanced network technologies and the practical applications that these technologies enable. My work has been aimed at providing increasingly powerful information technology tools for the science and education community.

I am also a father of three, and I pay very close attention to what my children are able to do with the Internet and Peer-to-Peer software in particular. I am very encouraged by your interest in these issues, which involve very complex technology and which have far-reaching impact on our Nation, and I am honored to speak with you about this technology.

I have prepared some brief remarks regarding what types of applications are possible with the increasing availability of broadband Internet and distributed computing software capabilities, and several examples of the kind of benefit we are seeing from these capabilities.

1. Peer-to-Peer and "Grid" Computing

Many terms have substantial overlap and cause confusion in discussions about the Internet and related software, so I would like to start with straightforward definitions of four such terms.

"Distributed computing" is a general term that refers to any set of computers that work together, using a network, to provide some form of capability. Most distributed computing software used on the Internet falls into three categories:

"Client-Server" computing involves a person using a program (a "client") on a home or office computer, interacting over a network with a larger computer, or "server." The server provides information, applications, or services to many clients. A Web browser is an illustration of a client, and the Google search site is an example of a server. Thus the Web is essentially a client-server system.

"Peer-to-Peer" could fairly be described as "client-to-client" computing, where the participating clients run on home or office computers, and where there may be tens of thousands or even millions of computers involved in sharing information or computing capabilities.

"Grid" is a term that is used increasingly often to refer to what we might call "server-to-server" computing. In a Grid system, shared resources such as powerful servers, databases, or scientific instruments are integrated to support applications that need powerful capabilities not available at a single location. Users of Grid systems may access them via client-server approaches.

All three forms of distributed computing share the Internet as their communications utility, and have many attributes in common. It is also difficult to classify many applications into only one of these three categories, because the most powerful applications tend to combine aspects of all three forms.
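As a rough illustration of these definitions, the sketch below labels each connection in an application by the kinds of endpoints it joins, and then lists the forms that one application uses. The names `Role`, `classify_link`, and `forms_used`, and the example application, are hypothetical and chosen only for this illustration; real systems are rarely this tidy.

```python
from enum import Enum


class Role(Enum):
    CLIENT = "client"  # a program on a home or office computer
    SERVER = "server"  # a shared resource: powerful server, database, instrument


def classify_link(a: Role, b: Role) -> str:
    """Name the form of distributed computing for one link between two endpoints."""
    if a is Role.CLIENT and b is Role.CLIENT:
        return "peer-to-peer"   # "client-to-client"
    if a is Role.SERVER and b is Role.SERVER:
        return "grid"           # "server-to-server"
    return "client-server"      # one client, one server


def forms_used(links):
    """Return the set of forms an application uses across all of its links."""
    return {classify_link(a, b) for a, b in links}


# Hypothetical application: users reach a server, servers share work with
# each other, and users also exchange files directly.
example_links = [
    (Role.CLIENT, Role.SERVER),
    (Role.SERVER, Role.SERVER),
    (Role.CLIENT, Role.CLIENT),
]
print(sorted(forms_used(example_links)))
```

The example application falls into all three categories at once, which is the classification difficulty described above.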

For this reason, it is important to consider a wide range of application types in order to determine the impact that would be felt with the introduction of regulations aimed at a particular software genre. This is not unlike the work that we do in the Global Grid Forum, where we consider the broader impact of any changes to a protocol or interface standard.

As with other Internet technologies such as the World Wide Web, it is difficult to predict what new applications will be enabled with new capabilities. Peer-to-peer technology is a good example, and in the research community we find a number of promising applications that are being developed and evaluated.

We see potential uses of "peer-to-peer" technology in many venues where information - whether scientific, clinical, or educational data - is shared among a large population of potential users. For example, the "OceanStore" project at the University of California-Berkeley is using peer-to-peer techniques to provide highly available, virtually "indestructible" storage systems that assume the underlying servers will be neither reliable nor secure. Groove Networks is a commercial software firm that uses peer-to-peer technology to create secure collaboration services for distributed teams, allowing individuals to work closely "together" despite being spread across many time zones. And many Grid applications share some aspects of peer-to-peer, as I discuss below.

2. Practical Scientific Applications Using Distributed Computing Technologies

I would like to focus on three applications of distributed computing technology. These are illustrative of the type of applications being developed on today's Internet, each of which uses a variety of distributed computing technologies. In each of these cases, peer-to-peer software has the potential to extend data sharing capabilities to a much broader audience than the current scientific collaborations; however, none are using peer-to-peer software today.

The first involves predicting and responding to severe weather, which causes hundreds of lost lives and some $13B in economic loss annually. Here I describe the work of Professor Kelvin Droegemeier, director of the Center for Analysis and Prediction of Storms, and his colleagues. Weather applications are aimed at improving the nation's infrastructure for predicting and preparing for severe weather.

The second involves biomedical research aimed at understanding brain-related disease ranging from Alzheimer's to attention deficit disorder. Dr. Mark Ellisman, director of the Biomedical Informatics Research Network (BIRN), and collaborators at twelve U.S. universities are using the Internet to create highly secure, nation-wide research and clinical data-sharing capabilities. The BIRN project uses the Internet and distributed computing and information technologies to create infrastructure aimed at improving biomedical research by enabling researchers throughout the United States to collaborate on large-scale studies of human disease with unique, multi-resolution tools. BIRN uses technology that is quite similar to peer-to-peer software, albeit with much greater control over security, access and authorization.

The third application is the analysis of urban air quality and airflow, seeking to understand the impact of both existing pollutants and potential effects of airborne toxins from events such as fires or explosions. This application includes the work of Dr. Alan Huber and colleagues from the Environmental Protection Agency's National Exposure Research Laboratory, Argonne National Laboratory's Environmental Assessment Division, and Fluent, Inc., a commercial software provider.

2.1 Severe Weather Prediction and Early Warning

The Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma engages in basic and applied research in storm-scale data assimilation and numerical weather prediction, with several ongoing programs in collaboration with colleagues around the country. The work is aimed at integrating weather sensors and computer models, using high-performance computers and the Internet, to rapidly model evolving weather patterns in order to predict destructive storms in time to provide advance warning.

Beginning in 1998, for instance, CAPS worked with the University Corporation for Atmospheric Research (UCAR) Unidata Program, the University of Washington, the National Severe Storms Laboratory (NSSL), and the WSR-88D Operational Support Facility (now the Radar Operations Center ROC) to establish the Collaborative Radar Acquisition Field Test (CRAFT) project. The goal of CRAFT was to demonstrate the real time compression and Internet-based transmission of NEXRAD data from multiple radars with a view toward nationwide implementation. CAPS is currently working with the National Weather Service to transition the CRAFT system into an operational service.

Figure 1 (panels: Computer Simulation; Mobile Doppler Radar): Comparison of a computer model with radar data, showing characteristic "hook echo" indicating conditions conducive to formation of tornados. The figure illustrates the advanced capabilities of today's weather models. CAPS uses the Internet to integrate sensors, databases, and computers in order to provide advanced, precise warning of severe weather.

To further advance sensor capabilities, CAPS is working with the Center for Collaborative Adaptive Sensing of the Atmosphere (CASA) at the University of Massachusetts at Amherst, to revolutionize the remote sensing of the lower troposphere, initially via inexpensive, low-power, phased array Doppler radars placed on cell towers and buildings. A unique component of this project is that the sensors interact with one another, using the Internet to dynamically adjust their characteristics to sense multiple atmospheric phenomena while meeting multiple end user needs in an optimal manner. These communications and data sharing techniques are similar to what is typically classified as peer-to-peer.

Computer models have been used to predict long-term weather trends for several years. However, in order to predict severe weather with sufficient precision and in a time frame to allow for early warning, high-performance computing systems are essential. Several years ago CAPS developed computer-based storm prediction capabilities to identify severe thunderstorm activity with roughly 4 hours' notice. This amount of time was sufficient, for example, to inform airlines of pending thunderstorms at major hubs, allowing those airlines to delay flights prior to takeoff in order to ensure that landing would be possible, greatly reducing the cost of diverting aircraft once in the air.

Today CAPS also leads an NSF-funded project called Linked Environments for Atmospheric Discovery (LEAD), which aims to create capabilities for analysis tools, forecast models, and data repositories to function as dynamically adaptive, on-demand systems. These systems will change configuration rapidly and automatically in response to the evolving weather, responding immediately to user decisions based on the weather problem at hand, and enabling the steering of remote observing systems to optimize data collection and forecast/warning quality. The goal of such systems is to provide precise information about the predicted path of destructive weather, such as tornados, in a timeframe that permits citizens to prepare for, rather than react to, such weather.

2.2 Nationwide Sharing of Biomedical Research Data

The Biomedical Informatics Research Network (BIRN) is an initiative sponsored by the National Institutes of Health (NIH) and National Center for Research Resources (NCRR). BIRN fosters large-scale biomedical science collaborations by utilizing emerging distributed computing technologies and the Internet, including applications distributed among high-performance computers, databases, and new software and data integration capabilities developed within the project and elsewhere.

The BIRN currently involves a consortium of 12 universities and 16 research groups participating in three testbed projects centered on the brain imaging of human neurological disease and associated animal models. Some BIRN groups are working on large-scale, cross-institutional imaging studies on Alzheimer's disease, depression, and schizophrenia using structural and functional magnetic resonance imaging (MRI). Others are studying animal models relevant to multiple sclerosis, attention deficit disorder, and Parkinson's disease through MRI, whole brain histology, and high-resolution light and electron microscopy.

These studies are being used to drive the definition, construction, and daily use of a "federated data system." Federation makes biological data held at geographically separated Internet sites appear as a single, unified, and persistent data archive. Data is securely accessed across institutional boundaries, addressing issues of data privacy and automatic translation of data formats. Most of the groups participating in the BIRN have traditionally conducted independent investigations on relatively small populations, using site-specific software tools.
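To make the idea of federation concrete, the following is a minimal sketch and not a description of BIRN's actual software. It assumes each site keeps records in its own format, supplies a translator into a shared format, and applies its own access list before releasing data. All names here (`SiteArchive`, `FederatedArchive`, the field names, and the users) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class SiteArchive:
    """One institution's data store, with a translator into a shared format."""
    name: str
    records: List[dict]
    translate: Callable[[dict], Record]
    authorized_users: frozenset

    def query(self, user: str) -> List[Record]:
        # Each site enforces its own access policy before releasing any data.
        if user not in self.authorized_users:
            return []
        return [self.translate(r) for r in self.records]


class FederatedArchive:
    """Presents several site archives as if they were one archive."""

    def __init__(self, sites):
        self.sites = list(sites)

    def query(self, user: str, predicate=lambda record: True) -> List[Record]:
        results = []
        for site in self.sites:
            for record in site.query(user):
                if predicate(record):
                    # Tag each record with the site it came from.
                    results.append({**record, "site": site.name})
        return results


# Two hypothetical sites that store the same kind of data under different field names.
site_a = SiteArchive(
    name="site_a",
    records=[{"subject": "s1", "modality": "MRI"}],
    translate=lambda r: {"subject_id": r["subject"], "modality": r["modality"]},
    authorized_users=frozenset({"alice"}),
)
site_b = SiteArchive(
    name="site_b",
    records=[{"id": "s2", "scan_type": "fMRI"}],
    translate=lambda r: {"subject_id": r["id"], "modality": r["scan_type"]},
    authorized_users=frozenset({"alice", "bob"}),
)

federation = FederatedArchive([site_a, site_b])
for record in federation.query("alice"):
    print(record)
```

In this sketch a researcher issues one query and receives records from every site that authorizes her, already translated into a common format, while the data itself stays at the institution that holds it.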

The promise of the BIRN is the ability to test new hypotheses through the analysis of larger patient populations and unique multiresolution views of animal models through data sharing and the integration of site independent resources for collaborative data refinement. To accomplish these goals, the BIRN project will continue to rely on innovative distributed computing technologies on the Internet.

2.3 Air Quality and Impact of Airborne Biological or Toxic Agents

Understanding the pathway of toxic air pollutants from source to human exposure in urban areas is of critical interest to the US Environmental Protection Agency, and has been an ongoing activity. Rapid assessments of risk, such as the migration of toxic gases related to major fires or chemical spills, are vital to first responders, local officials, federal officials, and the public. The scientific shortcomings are especially serious for incidents that occur in an urban center where the understanding of airflow around large buildings is poor.

Computational fluid dynamics (CFD) simulations have long been used in the aerospace and automotive industries to evaluate airflow around planes and cars, and increasingly in biomedical applications such as the modeling of blood flow through the heart. CFD techniques also have the potential to be employed to describe the flow of pollutants (be they a plume from an event such as an explosion or fire, or be they the dispersion of some pollutant or agent) in the complex terrain that our urban areas represent.

EPA scientists in the National Exposure Research Laboratory are working with Argonne National Laboratory's Environmental Assessment Division and Fluent, Inc. in a computational laboratory setting to test and use high-fidelity CFD simulations of the spread and transport of contamination in urban building environments. In addition, the EPA-Argonne collaboration will explore the possibility of developing or adapting the products from CFD simulations to support rapid exposure and risk models that could guide urban emergency response and emergency management for chemical, biological or radiological attacks or accidents.

As part of this investigation, EPA and Argonne scientists will use the Internet to exchange databases, simulation results, and other types of data. Experiments will be done using several forms of distributed computing on the Internet. One approach to be explored is the use of Fluent's Remote Simulation Facility, a Web-based "portal" that allows users to upload data from their computers to run a simulation on Fluent's computers. Another will be to use supercomputers at Argonne in client-server mode, and a third approach will attempt to couple Fluent's portal with supercomputers in NSF's TeraGrid project. All of these approaches are likely to be useful for some types of work, and some may employ technology that has functionality similar to peer-to-peer software.

Figure 3: The EPA's National Exposure Research Laboratory, Argonne National Laboratory's Environmental Assessment Division, and Fluent, Inc. are exploring airflow models to understand the flow of pollutants in urban areas and to evaluate the potential for using distributed computing systems to provide airflow models to guide emergency response. (Image courtesy of Alan Huber, EPA)

Concluding Remarks

As a father and as a citizen I am very concerned about the availability of inappropriate material on the Internet. I would like to make several comments specifically regarding H.R. 2885.

The proposed requirements for clear and prominent notice would, in my view, be useful for software in general. Typical software end-user license agreements are incomprehensible to average people. We have standard labels on food products to help consumers determine nutritional value and dangers. A similar program for software could be designed to cover the disclosure proposed in H.R. 2885 as well as disclosure regarding privacy and security risks.

It is also very important to provide the user with clear and prominent notice regarding information sharing status and to require that file sharing be explicitly enabled at the user's discretion, rather than enabled without their knowledge. Indeed some of the peer-to-peer software I have seen in recent months has already moved in this direction.

The proposed "do-not-install" beacon is an interesting idea, and I would encourage the committee to engage leaders from the software industry in exploring this idea and possible implementations.

I note that the H.R. 2885 definition for "peer-to-peer" software, as written, covers nearly all Internet software that I am aware of, including Web software, instant messaging software, and file transfer programs. In addition, the exclusion of software that is "marketed and distributed primarily for the operation" of networks implies that functionality built into computer operating systems (such as Windows or MacOS) would be excluded from these requirements. In practice, this would place a greater software engineering and support burden on small companies developing Internet software than would be placed on large companies adding new functions to operating systems software. This would put small software companies at a distinct disadvantage relative to their larger competitors.

In summary, I would like to commend this committee for taking on this complex set of issues. I would also respectfully encourage the committee to engage leaders and experts from the software industry to work together toward achieving what I believe to be a common goal of protecting our children, and our privacy, while continuing to encourage innovation in this country using the Internet. This will require much more precision in the definition of the software to be regulated. Thank you very much for the opportunity to speak with you.

Biographical Sketch: Charles E. Catlett

Charles E. Catlett is a senior fellow at the University of Chicago and Argonne National Laboratory, executive director of the TeraGrid project, and chair of Global Grid Forum. He was the initial network architect of the TeraGrid network, the world's highest capacity open research network, operating at 4 times the capacity of the most advanced national backbone networks. TeraGrid, an NSF-funded project, is using this network to deploy a 25 teraflops computational Grid system integrating computers, storage systems, visualization facilities, and scientific instruments located in nine institutions nation-wide. TeraGrid partners include the University of Chicago and Argonne National Laboratory, Caltech, the National Center for Supercomputing Applications (University of Illinois at Urbana-Champaign), the Pittsburgh Supercomputing Center (University of Pittsburgh and Carnegie Mellon University), the San Diego Supercomputer Center (University of California-San Diego), Indiana University, University of Texas, Oak Ridge National Laboratory, Purdue University, and Georgia Institute of Technology. Scientists use the nation-wide Internet2 network, Abilene, to access TeraGrid from hundreds of universities and laboratories.

Charlie is the founding chair of the Global Grid Forum (GGF), the preeminent Grid middleware standards body, established in 1999, with over 50 technical groups developing software specifications, best practices, and informational documents for distributed computing. Several thousand GGF participants come from over 30 countries and some 200 organizations, including over 100 companies. As founding chair, Charlie led the development of GGF's standards processes, organization, governance, and culture.

In 1999, with $8M in funding from the State of Illinois, Charlie directed the I-WIRE project, creating a dedicated fiber optic facility connecting 10 universities and commercial telecommunications hubs in Illinois to support advanced distributed applications research and educational projects. I-WIRE enables projects such as the Starlight international optical networking hub, the NSF TeraGrid Backplane network, and the NSF-funded Optiputer project.

Prior to joining Argonne in 2000, Charlie was chief technology officer at the National Center for Supercomputing Applications (NCSA). Charlie was part of the original team that established NCSA in 1985 and his role included the early development of the NSFNET, which deployed Department of Defense networking technology from the ARPANET project to create a national research and education network. NSFNET formed the basis for today's Internet through its commercialization in the early 1990s.

With Larry Smarr, NCSA's founding director, Charlie co-authored a seminal paper in 1992, "Metacomputing," in the Communications of the ACM, which contributed to the concept of Grid computing. That same year Charlie's paper "In Search of Gigabit Applications," published in IEEE Network, received the Fred W. Ellersick award for best paper in an IEEE journal. His most recent publication is "Standards for Grid Computing: Global Grid Forum," in the inaugural issue of the Journal of Grid Computing.

Charlie can be reached by email (cec@uchicago.edu) or telephone (+1-630-252-7867). His often out of date web page is http://www.ggf.org/people/catlett/

3. More Information on Projects Cited

The TeraGrid Project (http://www.teragrid.org) is funded by the National Science Foundation (NSF) Directorate for Computer and Information Science and Engineering (CISE) under the direction of Dr. Peter Freeman. http://cise.nsf.gov/

OceanStore. University of California-Berkeley, Computer Science Division, Prof. John D. Kubiatowicz et al., http://oceanstore.cs.berkeley.edu/

Groove Networks. Groove Networks, Inc. Ray Ozzie et al., http://www.groove.net

Center for Analysis and Prediction of Storms, (CAPS): http://www.caps.ou.edu/

Biomedical Informatics Research Network (BIRN): http://www.nbirn.net/

Environmental Protection Agency National Exposure Research Laboratory: http://www.epa.gov/nerl/

Argonne National Laboratory Environmental Assessment Division: http://www.ead.anl.gov/

Fluent, Inc. Remote Simulation Facility: http://www.fluent.com/software/rsolve/

Global Grid Forum (GGF): http://www.ggf.org

Internet2: http://www.internet2.edu

Starlight: http://www.startap.net/starlight

Optiputer: http://www.optiputer.net/

I-WIRE (Illinois Wired Infrastructure for Research and Education): http://www.i-wire.org/
