Witness Testimony
Mr. Charlie Catlett
Senior Fellow, Computation Institute
Argonne National Laboratory
9700 S. Cass Avenue
Lemont, IL, 60439
Online Pornography: Closing the Doors on Pervasive Smut.
Subcommittee on Commerce, Trade, and Consumer Protection
May 6, 2004
10:00 AM
Good morning, Mr. Chair and Members of the Committee. Thank you for allowing me
this opportunity to comment on the use of the Internet and distributed computing
technologies. I am Charles E. Catlett, a senior fellow at the Computation
Institute at the University of Chicago and Argonne National Laboratory. I am the
executive director of the NSF TeraGrid project, which is constructing one of the
world's most powerful distributed computing systems, scheduled to be completed
in October of this year. I am also the founding chair of the Global Grid Forum,
an international standards body that brings together distributed computing
researchers, commercial software providers, and end users to create software
standards for distributed computing on the Internet. I have been involved in the
evolution of the Internet since 1984, doing research in both advanced network
technologies and the practical applications that these technologies enable. My
work has been aimed at providing increasingly powerful information technology
tools for the science and education community.
I am also a father of three, and I pay very close attention to what my children
are able to do with the Internet and Peer-to-Peer software in particular. I am
very encouraged by your interest in these issues, which involve very complex
technology and which have far-reaching impact on our Nation, and I am honored to
speak with you about this technology.
I have prepared some brief remarks regarding what types of applications are
possible with the increasing availability of broadband Internet and distributed
computing software capabilities, and several examples of the kind of benefit we
are seeing from these capabilities.
1. Peer-to-Peer and "Grid" Computing
Many terms have substantial overlap and cause confusion in discussions about the
Internet and related software, so I would like to start with straightforward
definitions of four such terms.
"Distributed computing" is a general term that refers to any set of
computers that work together, using a network, to provide some form of
capability. Most distributed computing software used on the Internet falls into
three categories:
"Client-Server" computing involves a person using a program (a
"client") on a home or office computer, interacting over a network
with a larger computer, or "server." The server provides information,
applications, or services to many clients. A Web browser is an illustration of a
client, and the Google search site is an example of a server. Thus the Web is
essentially a client-server system.
"Peer-to-Peer" could fairly be described as
"client-to-client" computing, where the participating clients run on
home or office computers, and where there may be tens of thousands or even
millions of computers involved in sharing information or computing capabilities.
"Grid" is a term that is used increasingly often to refer to what we
might call "server-to-server" computing. In a Grid system, shared
resources such as powerful servers, databases, or scientific instruments are
integrated to support applications that need powerful capabilities not available
at a single location. Users of Grid systems may access them via client-server
approaches.
All three forms of distributed computing share the Internet as their
communications utility, and have many attributes in common. It is also difficult
to classify many applications into only one of these three categories, because
the most powerful applications tend to combine aspects of all three forms.
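The three definitions above can be illustrated with a minimal sketch that simply counts who communicates with whom in each model. This is an illustration only, under simplifying assumptions: the function names and machine names are hypothetical, every peer is assumed able to reach every other peer, and the Grid example assumes users enter through one of the servers in client-server fashion.

```python
from itertools import combinations


def client_server_links(clients, server):
    """Client-server: every client talks only to the one server."""
    return {(client, server) for client in clients}


def peer_to_peer_links(peers):
    """Peer-to-peer: any participant may talk directly to any other.

    Assumption for illustration: all pairs are reachable.
    """
    return set(combinations(sorted(peers), 2))


def grid_links(servers, clients, entry_point):
    """Grid: servers are integrated with one another, and users reach
    the shared resources through a client-server entry point."""
    server_links = set(combinations(sorted(servers), 2))
    user_links = {(client, entry_point) for client in clients}
    return server_links | user_links


# Hypothetical machine names, chosen only for the example.
homes = ["home1", "home2", "home3", "home4"]
web = client_server_links(homes, "search-site")
p2p = peer_to_peer_links(homes)
grid = grid_links(["lab-a", "lab-b", "lab-c"], homes, "lab-a")

print(len(web), len(p2p), len(grid))  # prints: 4 6 7
```

The Grid case reuses both of the other patterns, which is one way to see why many real applications do not fit neatly into a single category.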
For this reason, it is important to consider a wide range of application types
in order to determine the impact that would be felt with the introduction of
regulations aimed at a particular software genre. This is not unlike the work
that we do in the Global Grid Forum, where we consider the broader impact of any
changes to a protocol or interface standard.
As with other Internet technologies such as the World Wide Web, it is difficult
to predict what new applications will be enabled with new capabilities.
Peer-to-peer technology is a good example, and in the research community we find
a number of promising applications that are being developed and evaluated.
We see potential uses of "peer-to-peer" technology in many venues
where information - whether scientific, clinical, or educational data - is
shared among a large population of potential users. For example, the "OceanStore"
project at the University of California-Berkeley is using peer-to-peer
techniques to provide highly available, virtually "indestructible"
storage systems that assume the underlying servers will be neither reliable nor
secure. Groove Networks is a commercial software firm that uses peer-to-peer
technology to create secure collaboration services for distributed teams,
allowing individuals to work closely "together" despite being spread
across many time zones. And many Grid applications share some aspects of
peer-to-peer, as I discuss below.
2. Practical Scientific Applications Using Distributed Computing Technologies
I would like to focus on three applications of distributed computing technology.
These are illustrative of the type of applications being developed on today's
Internet, each of which uses a variety of distributed computing technologies. In
each of these cases, peer-to-peer software has the potential for extending data
sharing capabilities to a much broader audience than the current scientific
collaborations; however, none are using peer-to-peer software today.
The first involves predicting and responding to severe weather, which causes
hundreds of lost lives and some $13B in economic loss annually. Here I describe
the work of Professor Kelvin Droegemeier, director of the Center for Analysis
and Prediction of Storms, and his colleagues. Weather applications are aimed at
improving the nation's infrastructure for predicting and preparing for severe
weather.
The second involves biomedical research aimed at understanding brain-related
diseases ranging from Alzheimer's to attention deficit disorder. Dr. Mark
Ellisman, director of the Biomedical Informatics Research Network (BIRN), and
collaborators at twelve U.S. universities are using the Internet to create
highly secure, nation-wide research and clinical data-sharing capabilities. The
BIRN project uses the Internet and distributed computing and information
technologies to create infrastructure aimed at improving biomedical research by
enabling researchers throughout the United States to collaborate on large-scale
studies of human disease with unique, multi-resolution tools. BIRN uses
technology that is quite similar to peer-to-peer software, albeit with much
greater control over security, access and authorization.
The third application is the analysis of urban air quality and airflow, seeking
to understand the impact of both existing pollutants and potential effects of
airborne toxins from events such as fires or explosions. This application
includes the work of Dr. Alan Huber and colleagues from the Environmental
Protection Agency's National Exposure Research Laboratory, Argonne National
Laboratory's Environmental Assessment Division, and Fluent, Inc., a commercial
software provider.
2.1 Severe Weather Prediction and Early Warning
The Center for Analysis and Prediction of Storms (CAPS) at the University of
Oklahoma engages in basic and applied research in storm-scale data assimilation
and numerical weather prediction, with several ongoing programs in collaboration
with colleagues around the country. The work is aimed at integrating weather
sensors and computer models, using high-performance computers and the Internet,
to rapidly model evolving weather patterns in order to predict destructive
storms in time to provide advance warning.
Beginning in 1998, for instance, CAPS worked with the University Corporation for
Atmospheric Research (UCAR) Unidata Program, the University of Washington, the
National Severe Storms Laboratory (NSSL), and the WSR-88D Operational Support
Facility (now the Radar Operations Center, ROC) to establish the Collaborative
Radar Acquisition Field Test (CRAFT) project. The goal of CRAFT was to
demonstrate the real-time compression and Internet-based transmission of NEXRAD
data from multiple radars with a view toward nationwide implementation. CAPS is
currently working with the National Weather Service to transition the CRAFT
system into an operational service.

Figure 1 (panels: Computer Simulation; Mobile Doppler Radar): Comparison of a
computer model with radar data, showing characteristic "hook echo" indicating
conditions conducive to formation of tornados. The figure illustrates the
advanced capabilities of today's weather models. CAPS uses the Internet to
integrate sensors, databases, and computers in order to provide advance, precise
warning of severe weather.

To further advance sensor capabilities, CAPS is working with the Center for
Collaborative Adaptive Sensing of the Atmosphere (CASA) at the University of
Massachusetts at Amherst, to revolutionize the remote sensing of the lower
troposphere, initially via inexpensive, low-power, phased array Doppler radars
placed on cell towers and buildings. A unique component of this project is that
the sensors interact with one another, using the Internet to dynamically adjust
their characteristics to sense multiple atmospheric phenomena while meeting
multiple end user needs in an optimal manner. These communications and data
sharing techniques are similar to what is typically classified as peer-to-peer.
Computer models have been used to predict long-term weather trends for several
years. However, in order to predict severe weather with sufficient precision and
in a time frame to allow for early warning, high-performance computing systems
are essential. Several years ago CAPS developed computer-based storm prediction
capabilities to identify severe thunderstorm activity with roughly four hours'
notice. This amount of time was sufficient, for example, to inform airlines of
pending thunderstorms at major hubs, allowing those airlines to delay flights
prior to takeoff in order to ensure that landing would be possible, greatly
reducing the cost of diverting aircraft once in the air.
Today CAPS also leads an NSF-funded project called Linked Environments for
Atmospheric Discovery (LEAD), which aims to create capabilities for analysis
tools, forecast models, and data repositories to function as dynamically
adaptive, on-demand systems. These systems will change configuration rapidly and
automatically in response to the evolving weather, responding immediately to
user decisions based on the weather problem at hand, and enabling the steering
of remote observing systems to optimize data collection and forecast/warning
quality. The goal of such systems is to provide precise information about the
predicted path of destructive weather, such as tornados, in a timeframe that
permits citizens to prepare for, rather than react to, such weather.
2.2 Nationwide Sharing of Biomedical Research Data

The Biomedical Informatics Research Network (BIRN) is an initiative sponsored by
the National Institutes of Health (NIH) and National Center for Research
Resources (NCRR). BIRN fosters large-scale biomedical science collaborations by
utilizing emerging distributed computing technologies and the Internet,
including applications distributed among high-performance computers, databases,
and new software and data integration capabilities developed within the project
and elsewhere.
The BIRN currently involves a consortium of 12 universities and 16 research
groups participating in three testbed projects centered on the brain imaging of
human neurological disease and associated animal models. Some
BIRN groups are working on large-scale, cross-institutional imaging studies on
Alzheimer's disease, depression, and schizophrenia using structural and
functional magnetic resonance imaging (MRI). Others are studying animal models
relevant to multiple sclerosis, attention deficit disorder, and Parkinson's
disease through MRI, whole brain histology, and high-resolution light and
electron microscopy.
These studies are being used to drive the definition, construction, and daily
use of a "federated data system." Federation makes biological data
held at geographically separated Internet sites appear as a single, unified
and persistent data archive. Data is accessed securely across institutional
boundaries, with attention to data privacy and to automatic translation of data
formats. Most of the groups participating in the BIRN have traditionally
conducted independent investigations on relatively small populations, using
site-specific software tools.
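The idea of a federated data system can be sketched in a few lines. What follows is a minimal illustration, not a description of the BIRN software: the class names, field names, user names, and records are all hypothetical. It assumes each site keeps records in its own format, supplies its own translation into a shared schema, and enforces its own list of authorized users.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Site:
    """One participating institution (hypothetical structure)."""
    name: str
    records: List[dict]                # stored in the site's own format
    to_common: Callable[[dict], dict]  # translates a record to the shared schema
    authorized_users: set              # each site controls its own access


class FederatedArchive:
    """Presents many separate sites as if they were one archive."""

    def __init__(self, sites):
        self.sites = list(sites)

    def query(self, user, predicate):
        results = []
        for site in self.sites:
            if user not in site.authorized_users:
                continue  # the site's access policy is respected
            for raw in site.records:
                record = site.to_common(raw)  # automatic format translation
                if predicate(record):
                    results.append({**record, "site": site.name})
        return results


# Two hypothetical sites with different local record formats.
site_a = Site(
    name="site_a",
    records=[{"subj": "A1", "modality": "MRI"}],
    to_common=lambda r: {"subject": r["subj"], "modality": r["modality"]},
    authorized_users={"alice"},
)
site_b = Site(
    name="site_b",
    records=[{"SubjectID": "B7", "Scan": "mri"}, {"SubjectID": "B8", "Scan": "em"}],
    to_common=lambda r: {"subject": r["SubjectID"], "modality": r["Scan"].upper()},
    authorized_users={"alice", "bob"},
)

archive = FederatedArchive([site_a, site_b])
mri_for_alice = archive.query("alice", lambda r: r["modality"] == "MRI")
mri_for_bob = archive.query("bob", lambda r: r["modality"] == "MRI")
```

In this sketch a researcher authorized at both sites sees MRI records from both, in one format, while a researcher authorized at only one site sees only that site's records. The data itself never leaves the control of the institution that holds it.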
The promise of the BIRN is the ability to test new hypotheses through the
analysis of larger patient populations and unique multiresolution views of
animal models through data sharing and the integration of site independent
resources for collaborative data refinement. To accomplish these goals, the BIRN
project will continue to rely on innovative distributed computing technologies
on the Internet.
2.3 Air Quality and Impact of Airborne Biological or Toxic Agents
Understanding the pathway of toxic air pollutants from source to human exposure
in urban areas is of critical interest to the US Environmental Protection
Agency, and has been an ongoing activity. Rapid assessments of risk, such as the
migration of toxic gases related to major fires or chemical spills, are vital to
first responders, local officials, federal officials, and the public. The
scientific shortcomings are especially serious for incidents that occur in an
urban center where the understanding of airflow around large buildings is poor.
Computational fluid dynamics (CFD) simulations have long been used in the
aerospace and automotive industries to evaluate airflow around planes and cars,
and increasingly in biomedical applications such as the modeling of blood flow
through the heart. CFD techniques also have the potential to be employed to
describe the flow of pollutants (be they a plume from an event such as an
explosion or fire, or be they the dispersion of some pollutant or agent) in the
complex terrain that our urban areas represent.
EPA scientists in the National Exposure Research Laboratory are working with
Argonne National Laboratory's Environmental Assessment Division and Fluent, Inc.
in a computational laboratory setting to test and use high fidelity CFD
simulations of the spread and transport of contamination in urban building
environments. In addition, the EPA-Argonne collaboration will explore the
possibility of developing or adapting the products from CFD simulations to
support rapid exposure and risk models to potentially guide urban emergency
response and emergency management for chemical, biological or radiological
attacks or accidents.
As part of this investigation, EPA and Argonne scientists will use the Internet
to exchange databases, simulation results, and other types of data. Experiments
will be done using several forms of distributed computing on the Internet. One
approach to be explored is the use of Fluent's Remote Simulation Facility, a
Web-based "portal" that allows users to upload data from their
computers to run a simulation on Fluent's computers. Another will be to use
supercomputers at Argonne in client-server mode, and a third approach will
attempt to couple Fluent's portal with supercomputers in NSF's TeraGrid project.
All of these approaches are likely to be useful for some types of work, and some
may employ technology that has functionality similar to peer-to-peer software.

Figure 3: The EPA's National Exposure Research Laboratory, Argonne National
Laboratory's Environmental Assessment Division, and Fluent, Inc. are exploring
airflow models to understand the flow of pollutants in urban areas and to
evaluate the potential for using distributed computing systems to provide
airflow models to guide emergency response. (Image courtesy of Alan Huber, EPA)
Concluding Remarks
As a father and as a citizen I am very concerned about the availability of
inappropriate material on the Internet. I would like to make several comments
specifically regarding H.R. 2885.
The proposed requirements for clear and prominent notice would, in my view, be
useful for software in general. Typical software end user license agreements are
incomprehensible to average people. We have standard labels on food products to
help consumers determine nutritional value and dangers. A similar program for
software could be designed to cover the disclosure proposed in H.R. 2885 as well
as disclosure regarding privacy and security risks.
It is also very important to provide the user with clear and prominent notice
regarding information sharing status and to require that file sharing be
explicitly enabled at the user's discretion, not without their knowledge. Indeed,
some of the peer-to-peer software I have seen in recent months has already moved
in this direction.
The proposed "do-not-install" beacon is an interesting idea, and I
would encourage the committee to engage leaders from the software industry in
exploring this idea and possible implementations.
I note that the H.R. 2885 definition for "peer-to-peer" software, as
written, covers nearly all Internet software that I am aware of, including Web
software, instant messaging software, and file transfer programs. In addition,
the exclusion of software that is "marketed and distributed primarily for
the operation" of networks implies that functionality built into computer
operating systems (such as Windows or MacOS) would be excluded from these
requirements. In practice, this would place a greater software engineering and
support burden on small companies developing Internet software than would be
placed on large companies adding new functions to operating systems software.
This would put small software companies at a distinct disadvantage relative to
their larger competitors.
In summary, I would like to commend this committee for taking on this complex set
of issues. I would also respectfully encourage the committee to engage leaders
and experts from the software industry to work together toward achieving what I
believe to be a common goal of protecting our children, and our privacy, while
continuing to encourage innovation in this country using the Internet. This will
require much more precision in the definition of the software to be regulated.
Thank you very much for the opportunity to speak with you.
Biographical Sketch: Charles E. Catlett
Charles E. Catlett is a senior fellow at the University of Chicago and Argonne
National Laboratory, executive director of the TeraGrid project, and chair of
the Global Grid Forum. He was the initial network architect of the TeraGrid network,
the world's highest capacity open research network, operating at 4 times the
capacity of the most advanced national backbone networks. TeraGrid, an
NSF-funded project, is using this network to deploy a 25 teraflops computational
Grid system integrating computers, storage systems, visualization facilities,
and scientific instruments located in nine institutions nation-wide. TeraGrid
partners include the University of Chicago and Argonne National Laboratory,
Caltech, the National Center for Supercomputing Applications (University of
Illinois at Urbana-Champaign), the Pittsburgh Supercomputing Center (University
of Pittsburgh and Carnegie Mellon University), the San Diego Supercomputer Center
(University of California-San Diego), Indiana University, University of Texas,
Oak Ridge National Laboratory, Purdue University, and Georgia Institute of
Technology. Scientists use the nation-wide Internet2 network, Abilene, to access
TeraGrid from hundreds of universities and laboratories.
Charlie is the founding chair of the Global Grid Forum (GGF), the preeminent
Grid middleware standards body, established in 1999, with over 50 technical
groups developing software specifications, best practices, and informational
documents for distributed computing. Several thousand GGF participants come from
over 30 countries and some 200 organizations including over 100 companies. As
founding chair, Charlie led the development of GGF's standards processes,
organization, governance, and culture.
In 1999, with $8M in funding from the State of Illinois, Charlie directed the
I-WIRE project, creating a dedicated fiber optic facility connecting 10
universities and commercial telecommunications hubs in Illinois to support
advanced distributed applications research and educational projects. I-WIRE
enables projects such as the Starlight international optical networking hub, the
NSF TeraGrid Backplane network, and the NSF-funded Optiputer project.
Prior to joining Argonne in 2000, Charlie was chief technology officer at the
National Center for Supercomputing Applications (NCSA). Charlie was part of the
original team that established NCSA in 1985 and his role included the early
development of the NSFNET, which deployed Department of Defense networking
technology from the ARPANET project to create a national research and education
network. NSFNET formed the basis for today's Internet through its
commercialization in the early 1990s.
With Larry Smarr, NCSA's founding director, Charlie co-authored a seminal paper
in 1992, "Metacomputing," in the Communications of the ACM, which
contributed to the concept of Grid computing. That same year Charlie's paper
"In Search of Gigabit Applications," published in IEEE Network,
received the Fred W. Ellersick award for best paper in an IEEE journal. His most
recent publication is "Standards for Grid Computing: Global Grid
Forum," in the inaugural issue of the Journal of Grid Computing.
Charlie can be reached by email (cec@uchicago.edu) or telephone
(+1-630-252-7867). His often out-of-date web page is http://www.ggf.org/people/catlett/
3. More Information on Projects Cited
The TeraGrid Project (http://www.teragrid.org) is funded by the National Science
Foundation (NSF) Directorate for Computer and Information Science and
Engineering (CISE) under the direction of Dr. Peter Freeman. http://cise.nsf.gov/
OceanStore. University of California-Berkeley, Computer Science Division, Prof.
John D. Kubiatowicz et al., http://oceanstore.cs.berkeley.edu/
Groove Networks. Groove Networks, Inc. Ray Ozzie et al., http://www.groove.net
Center for Analysis and Prediction of Storms (CAPS): http://www.caps.ou.edu/
Biomedical Informatics Research Network (BIRN): http://www.nbirn.net/
Environmental Protection Agency National Exposure Research Laboratory: http://www.epa.gov/nerl/
Argonne National Laboratory Environmental Assessment Division: http://www.ead.anl.gov/
Fluent, Inc. Remote Simulation Facility: http://www.fluent.com/software/rsolve/
Global Grid Forum (GGF): http://www.ggf.org
Internet2: http://www.internet2.edu
Starlight: http://www.startap.net/starlight
Optiputer: http://www.optiputer.net/
I-WIRE (Illinois Wired Infrastructure for Research and Education): http://www.i-wire.org/