National Laboratory for Applied Network Research
Measurement and Operations Analysis Team

First Quarterly Report
April 1998 to June 1998

  1. Summary

    At a time when the Internet is expanding dramatically in global reach, performance, and degree of interconnection, our understanding of it at the systemic level is declining as its complexity increases and its manageability decreases. Today's network services are assessed in perceptions and probabilities, drifting from the mathematical assessment of early and simpler networks toward the unpredictable complexity of a biological organism. As a result, more and more service providers fight user perceptions that cast their services in a bad light in terms of reported performance, arguing that their component network is in good shape and that the problem must be somewhere else. At the same time, participants in the global Internet often disregard a common fate-sharing model in which the union of ISPs has to provide predictable and verifiable performance.

    A quick check of the global routing tables at http://moat.nlanr.net/ASPL shows that the path between BGP-advertised Internet entities commonly traverses four, five, or more Autonomous Systems, which map roughly to service providers. This illustrates that the service parameters of a single ISP are important, but typically do not reflect the end-to-end performance a user can expect.
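
    As a rough illustration of this kind of check, the sketch below tallies AS-path lengths for a few made-up paths. The input format (one space-separated AS path per string), the function names, and the sample AS numbers (taken from the range reserved for documentation) are assumptions for illustration; this is not the code behind the ASPL pages.

```python
from collections import Counter

def as_path_length(path: str) -> int:
    """Count ASes on a path, collapsing consecutive repeats (AS-path prepending)."""
    hops = path.split()
    return sum(1 for i, asn in enumerate(hops) if i == 0 or asn != hops[i - 1])

def length_histogram(paths):
    """Map path length -> number of paths with that length."""
    return Counter(as_path_length(p) for p in paths)

# Hypothetical sample paths, not real routing data.
sample_paths = [
    "64500 64501 64502 64503",
    "64500 64501 64501 64501 64504 64505",  # 64501 is prepended twice
    "64500 64506 64507 64508 64509",
]
hist = length_histogram(sample_paths)
for length in sorted(hist):
    print(f"{length} ASes: {hist[length]} path(s)")
```

    Collapsing repeated AS numbers keeps prepending, a common traffic engineering practice, from inflating the count of distinct providers on a path.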

    In response to these issues, NLANR is attempting to create a network analysis infrastructure, focusing on the vBNS sites, but extending to include the HPC community. It is not the intent to give all-encompassing answers to these problems, but to gather more insight and knowledge about the inner workings of the Internet. The NLANR measurement and analysis activity is complemented by collaboration with CAIDA (Cooperative Association for Internet Data Analysis). While NLANR's scope of work focuses on the academic research and HPC agenda and clientele, CAIDA focuses their collaborative measurements and network analysis on private Internet service providers.

    The objective of an analysis infrastructure is to operate beyond just obtaining network measurements -- because it is not good enough to create data silos containing large amounts of measurement data that is rarely looked at. It is critical to define analysis objectives early on, while retaining enough flexibility to adapt to new and extended requirements. The results of the NLANR network measurement and analysis activities are documented at http://moat.nlanr.net.

  2. Measurement objectives

    The parameter space for Internet measurements has to encompass at least three dimensions:

    This is different from more confined services, such as the circuits of a telecom voice connection, with guaranteed service at a constant, relatively low bandwidth, and a bell-curve duration distribution around a few minutes. The Internet encompasses a much broader area, making it critical to understand workloads and performances.

    For the NLANR measurement and analysis work, four areas of measurement data are currently being considered: passive workload profile assessments, active performance measurement, SNMP/MIB based statistics data, and stabilities and status of Internet routing.

    Activities in those areas will be part of the NLANR work scope, though the initial focus will be on passive workload profile assessments, given that there is quite a bit of activity in the other areas at sites throughout the Internet.


    1. Passive workload profile assessments

      Passive measurements are non-invasive relative to the observed networking environment. They will not impact the performance of the network. An example of this type of passive measurement is the OCXmon/Coral monitors, which tap into the light of a fiber interconnection by means of optical splitters, and collect packet header traces.

      Packet header traces can generate an immense amount of data, and, unless there is a compelling reason to keep the whole data set, it is usually critical to abstract the data as quickly and locally as possible; after that, of course, the information contained in the full traces is lost. Abstractions can happen at a central data collation location, or, if desirable (to avoid high volumes of data transfer), at the location where the data is being collected. Sometimes it may be necessary to further abstract already abstracted data, which may include processing at both the data-generating site and a central collection site.

      If data abstraction happens at the location where the data is being collected, the two options are to do it in real-time or non-real time.

      Real-time abstractions allow for a continuous operation, but are limited by the availability of local CPU and memory resources to accommodate the analysis at the speed in which new data is coming in. This becomes an especially difficult issue when complex analyses for in-depth understanding of the networking environment are desired, or in high traffic load situations. In such cases the local collection/analysis machine turns into a bottleneck.

      A non-real-time analysis assumes staggered data collection and data analysis phases, followed by a phase in which the analyzed data is transferred to a central location and the data collection is restarted. However, even in this case, highly complex analyses can take a very long time. This methodology reflects a reality sampled in time, as the measurement phases are interleaved with the analysis phases.

      The ability to do real-time and continuous analysis conflicts with the need for increasingly complex analysis requirements:

      The speed of the data collection facility limits the complexity of the analysis it can perform while still keeping up with the data the monitor generates in parallel. Hence careful consideration must be given to where and when analysis will happen. In the most complex cases, packet traces have to be collected and analyzed offline, which has a major data volume and transmission impact on the network in high performance collection environments. Packet traces also allow raw, non-abstracted data to be retained for later use with new analysis concepts. Long-duration packet traces can help to establish a baseline of how the measured environment behaves over a longer period of time, and whether there are long-term variations in the traffic profile.

      The prime example of a passive measurement tool for the NLANR project is OCXmon/Coral, which allows OC3 and OC12 based environments to be monitored, and traffic traces to be analyzed, without the measurement itself affecting the network. Strategic network-centric locations of significant traffic aggregation will form the nucleus for deployment of these data collection tools.

      An analysis application of OCXmon considers active flows, based on work done at SDSC starting in the early 1990s.

      This model of flows considers transaction impacts as the network sees them, and is useful for a first level of traffic aggregation properties. It assumes that a flow starts when specific criteria are met and continues until a timeout expires for that particular flow, at which time the timeout is subtracted from the flow duration. Flows can be defined, and possibly aggregated, relative to multiple criteria, such as source hosts, destination hosts, host reductions to networks, quintuples of hosts, protocol, ports, and so on. Flows allow insight into the packets, bytes, and durations of network transactions, relative to the specified flow criteria.
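
      The timeout-based flow model described above can be sketched as follows. This is a minimal illustration, not the OCXmon flow code: the record layout, the names, and the 64-second timeout are assumptions. Taking a flow's duration as the time between its first and last packet is equivalent to subtracting the timeout from the time at which the flow is declared expired.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    key: tuple          # flow criteria, here (src, dst, protocol, sport, dport)
    first: float        # timestamp of the first packet
    last: float         # timestamp of the most recent packet
    packets: int = 0
    octets: int = 0

    @property
    def duration(self) -> float:
        # Expiry time minus timeout equals the time of the last packet.
        return self.last - self.first

def aggregate_flows(packets, timeout=64.0):
    """packets: iterable of (timestamp, key, length), sorted by timestamp.

    The 64 s default is an assumed value for illustration.
    """
    active = {}
    finished = []
    for ts, key, length in packets:
        flow = active.get(key)
        if flow is not None and ts - flow.last > timeout:
            finished.append(flow)      # idle for longer than the timeout
            flow = None
        if flow is None:
            flow = active[key] = Flow(key, first=ts, last=ts)
        flow.last = ts
        flow.packets += 1
        flow.octets += length
    finished.extend(active.values())   # flush flows still open at end of trace
    return finished

# Hypothetical packets using documentation addresses.
A = ("192.0.2.1", "198.51.100.7", 6, 1025, 80)
B = ("192.0.2.9", "198.51.100.7", 17, 5000, 53)
trace = [(0.0, A, 40), (1.0, A, 1500), (2.0, B, 40), (100.0, A, 40), (101.5, A, 40)]
flows = aggregate_flows(trace)
```

      Changing what goes into the key (for example, reducing hosts to networks) changes the aggregation level without touching the rest of the logic.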

      It should be pointed out that there are other alternatives for workload profile measurements, such as Cisco flow monitors, which are not entirely passive, insofar as they impact the packet forwarding performance.

      An initial OC3mon/Coral test environment utilized the OC3mon firmware's ability to retransmit received data on the transmit line, avoiding the need for optical splitters:

      This allowed for controlled traffic flows and accounting for each packet passing by the measurement machine. It also helped find and fix a number of problems with the already available, more experimental version of a Unix OC3mon, and bring it toward a deployable state.

      Coral packet traces use a crl format similar to the one developed by Joel Apisdorf (MCI) for the DOS version of OC3mon (60 bytes per packet):

      This is different from the fr (24 bytes per packet) format we used in previous analysis activities:

      or fr+ (44 bytes per packet), another version from our previous work:

      We have been using these two formats for network analysis activities at SDSC for a number of years. The newer crl format is less compact, but includes more data per packet and allows multiple interface sources to be identified in a trace. Over time we would like to generalize and modularize the ability to use new analysis code:
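
      To make the compactness trade-off concrete, the following sketch estimates trace size and trace data rate from the per-packet record sizes quoted above (60, 24, and 44 bytes). It ignores any file-level headers, and the packet counts and rates are illustrative assumptions rather than measurements.

```python
# Per-packet record sizes in bytes, as quoted in the text.
RECORD_BYTES = {"crl": 60, "fr": 24, "fr+": 44}

def trace_bytes(fmt: str, packets: int) -> int:
    """Approximate trace size, ignoring any file-level header."""
    return RECORD_BYTES[fmt] * packets

def trace_rate_mbps(fmt: str, packets_per_second: float) -> float:
    """Rate at which trace data is produced, in Mbps."""
    return RECORD_BYTES[fmt] * 8 * packets_per_second / 1e6

# Illustrative load: one million packets, and a 40,000 packets/s stream.
for fmt in RECORD_BYTES:
    size_mb = trace_bytes(fmt, 1_000_000) / 1e6
    rate = trace_rate_mbps(fmt, 40_000)
    print(f"{fmt:>3}: {size_mb:.0f} MB per million packets, {rate:.2f} Mbps at 40k pps")
```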

      Based on a library (libcoral) with well-defined interfaces, it should be possible to use the same analysis executable code to read traces off an interface or a file, and to do the analysis either in real-time or offline. Work for this is currently under way. This would also include the ability to do result visualization, either local to the traffic collector, in a distributed mode between the collector and a remote visualization engine, or as offline analysis/visualization from collected files:
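
      The idea of one analysis routine serving both live and offline input can be sketched with a source-independent record iterator. Everything here is hypothetical: the 10-byte record layout, the function names, and the interface are invented for illustration and are not the libcoral API.

```python
import io
import struct
from typing import BinaryIO, Iterable, Iterator, Tuple

# Hypothetical record: timestamp (double) and packet length (uint16), network order.
RECORD = struct.Struct("!dH")

def read_records(stream: BinaryIO) -> Iterator[Tuple[float, int]]:
    """Yield (timestamp, length) records from any file-like byte stream."""
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return
        yield RECORD.unpack(chunk)

def count_bytes(records: Iterable[Tuple[float, int]]) -> dict:
    """An analysis module: it only sees an iterable of records, not their origin."""
    packets = octets = 0
    for _ts, length in records:
        packets += 1
        octets += length
    return {"packets": packets, "octets": octets}

sample = [(0.0, 40), (0.1, 1500)]

# Offline use: records come from a stored trace (an in-memory file here).
stored = io.BytesIO(b"".join(RECORD.pack(ts, n) for ts, n in sample))
offline = count_bytes(read_records(stored))

# Real-time use: records come from any iterator, e.g. one fed by a capture card.
live = count_bytes(iter(sample))
```

      Because the analysis module depends only on the record iterator, the same code path runs against a capture interface, a file, or a network stream from a remote collector.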

      Measurement topology

      We have begun to deploy OC3mon measurement machines to a number of sites:

      A great deal of appreciation is due to those sites for their willingness to work with us. Several are not only willing to house a monitor, but are also interested in involving students and faculty in the analysis research. The sites marked green are up and operational. The sites became operational on:

      • 7 March 1998: NASA-Ames (AIX/MAE-West or NREN)
      • 22 April 1998: Argonne National Laboratory
      • 12 May 1998: MCNC (first of the new OC3mon hardware design)
      • 17 June 1998: SDSC (FDDI monitor on SDSC DMZ)
      • 30 June 1998: NCAR
      • 2 July 1998: Florida
      • 5 July 1998: Texas
      • 17 July 1998: Old Dominion University
      • 24 July 1998: Ohio State University

      The red location has a machine on site, but not yet installed. We are in discussions with the blue site about potential deployment, if resources permit. The white sites have expressed interest in collaboration. Of the white sites, UCLA is an extremely likely deployment site. The University of Washington would be an important site, but will do "Packet over SONET," and is willing to be an early test site as such equipment becomes available.

      An expanded list of the vBNS HPC sites can be found in http://www.vbns.net/logical.html.

      An early testbed for OC3mon deployment was possible due to our collaboration with people at NASA-Ames. The NASA-Ames people (under Bill Jones) have been very cooperative for a number of years, initially in the context of allowing for FDDI-based analysis at FIX-West.

      The diagram above shows the OC3mon testbed, at the NASA Ames Inter-eXchange (AIX) and the MAE-West network interconnection of Metropolitan Fiber Systems. The two environments use four striped OC3 channels to interconnect Digital Gigaswitches, and an OC3mon measurement machine collects packet headers off the second of the four OC3 stripes.

      OCXmon passive traffic tracing

      OCXmon is a suite of flexible, affordable, high performance network statistics collection tools. Current development is focused on OC3 (Coral/OC3) and OC12 (Coral/OC12), though other possibilities are under consideration. This summary focuses on OC3mon for OC3 networks.

      Coral is a continuation of the MCI OC3mon activity (which was based on measurement and analysis tools assessing active flows for Ethernet and FDDI developed at SDSC), but adds further analysis functionality. A more comprehensive presentation of the OCXmon environment can be found at http://moat.nlanr.net/Presentations/NAI.

      The initial development was done by Joel Apisdorf of MCI, for a DOS platform with a pair of FORE cards (one for each traffic direction) used as intelligent and programmable front-ends. Jon Dugan of NCSA, using initial work by Eric Hoffman, adapted the code to the FreeBSD 2.2.5 Unix platform. Jon used Joel Apisdorf's FORE firmware unmodified, and included his own code to interface the FORE controllers and the firmware to the flow analysis code. Independent code to write header traces to hard disk also exists for FreeBSD.

      OC3mon utilizes optical splitters to tap non-intrusively into optical fiber, to take a small fraction of the light for its own purposes, but without impacting the operation of the network. The observed fiber may have ATM switches or ATM hosts at its ends.

      The monitoring system is distributed. The downloaded software on the ATM cards excerpts packet headers, then pushes them across the PCI bus into main memory. The main processor then has the responsibility for post processing, including writing the data to hard disk.

      The ATM card subsystem shares the host bus with the memory and the rest of the host components, but has to be able to become bus master for both cards to control the bus for the data transfers.

      Several modules are available to be downloaded into the FORE ATM cards. These are tailored for different requirements. For example, while pca200e.bin is the standard module for the collection of the first cell of each packet, skipgigx.bin may have to be used on interconnections between Digital Gigaswitch nodes, in order to skip over the MAC level information.

      Once the ATM cards have delivered data into the main memory, the host processor can access the data, and decide what to do with it. This may include a real-time analysis function (such as real-time flow abstractions and analysis), collecting the raw data to a hard disk file, or staggered collection and post processing.

      Two reference implementations for OC3mon exist, one for Unix at http://moat.nlanr.net/Coral, and one for DOS at http://www.mci.net/~apisdorf/coral. This software and the analysis components are still evolving, and there will be changes over time.

      A number of results of the passive monitoring activity can be found at http://moat.nlanr.net.

      OCXmon Packet generator

      As an essential tool for verifying OCXmon performance, Joel Apisdorf has extended the functionality of the FORE front-end card in OC3 monitors to be able to generate cells based on an input file read by the host processor.

      The file input format used is similar to the packet trace format that OCXmon generates. Such a file can also easily be generated from a program.

      The need for a high performance OC3 packet generator arose when test machines connected to a data collector were able to fully utilize an OC3 connection for large packets, but were only able to generate about 40,000 minimum sized packets per second -- a fraction of the OC3 bandwidth.
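
      A back-of-the-envelope calculation shows why 40,000 packets per second is only a fraction of what an OC3 link can carry. The sketch assumes ATM over an OC-3c payload of 149.76 Mbps, 53-byte cells with 48 bytes of payload, AAL5 with its 8-byte trailer, and a 40-byte minimum IP packet; the report does not state which encapsulation was used on the measured links, so both the no-header case and an 8-byte LLC/SNAP header are shown.

```python
import math

OC3C_PAYLOAD_BPS = 149.76e6   # SONET STS-3c payload rate available for cells
CELL_BYTES = 53               # ATM cell: 5-byte header + 48-byte payload
CELL_PAYLOAD_BYTES = 48
AAL5_TRAILER_BYTES = 8

def cells_per_second() -> float:
    """Maximum ATM cell rate on an OC-3c link."""
    return OC3C_PAYLOAD_BPS / (CELL_BYTES * 8)

def cells_per_packet(ip_bytes: int, encap_bytes: int = 0) -> int:
    """Cells needed for one AAL5 frame (payload is padded to whole cells)."""
    frame = ip_bytes + encap_bytes + AAL5_TRAILER_BYTES
    return math.ceil(frame / CELL_PAYLOAD_BYTES)

def max_packets_per_second(ip_bytes: int, encap_bytes: int = 0) -> float:
    return cells_per_second() / cells_per_packet(ip_bytes, encap_bytes)

for label, encap in (("no encapsulation header", 0), ("LLC/SNAP header", 8)):
    pps = max_packets_per_second(40, encap)
    print(f"{label}: {pps:,.0f} packets/s max; 40,000 is {40_000 / pps:.0%} of that")
```

      Under these assumptions, 40-byte packets could arrive at several times the 40,000 packets per second that the test hosts were able to produce, which is the load a dedicated generator is needed to reproduce.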

    2. Active performance measurements

      Common applications of active performance measurements assess host reachability, packet loss, and throughput. The throughput measurements may be windowed (e.g., using TCP) or non-windowed (e.g., plain UDP). These kinds of activities are becoming more and more widespread, as any "common user" can quite easily run such tests from their own host to other hosts. An increasing number of Internet sites attempt to assess Internet performance figures by probing many sites from a handful of machines. They then turn these results into "Internet Weather Reports."

      NLANR will measure performance parameters from within the network, i.e., via probes deployed within the infrastructure itself. Strategic locations will be selected to be of use to our objectives. However, NLANR has an opportunity here to architect a next generation of active measurement techniques, in the confines of the high performance vBNS environment and its sites.

      Tony McGregor, working on NLANR projects while on sabbatical from the University of Waikato in New Zealand, is taking the lead on designing the framework and implementation for the NLANR active monitoring. He recently began this sabbatical, and we hope to report significant progress in the next quarterly report.

    3. SNMP/MIB based statistics data

      An obvious candidate for (close to) passive data collection is using SNMP data from participating routers. In the vBNS context this has to consider what is already being collected by the MCI vBNS team, how often the collections happen, and at what locations. The NLANR measurement and analysis activity, while attempting to be an additional source of information to the community, should be complementary to the MCI work.

      As a result of this, the MCI vBNS team (thanks largely to MCI's Kevin Thompson) has been sending the whole day's SNMP tree to an NLANR machine since mid-May. However, not much analysis has been undertaken. The data is being stored, and an undocumented prototype of a VRML-based utilization object was created in the early days of receiving the data:

      The axes are measurement site, usage percentile, and usage value, with the first graph using a linear scale, and the second a logarithmic one. The blue plane represents a T1 (1.544 Mbps) utilization level, and the red plane represents DS3 (45 Mbps).
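
      The usage values and percentiles on such a plot can be derived from periodic SNMP counter samples. The sketch below is an assumption about how that might be done, not a description of the prototype: it assumes 32-bit octet counters (such as ifInOctets) polled at known times, allows for a single counter wrap between samples, and uses a nearest-rank percentile. The sample values are made up.

```python
import math

COUNTER32_MODULUS = 2 ** 32
T1_BPS = 1.544e6

def octet_delta(previous: int, current: int) -> int:
    """Difference between two Counter32 samples, allowing for one wrap."""
    return (current - previous) % COUNTER32_MODULUS

def utilization_bps(samples):
    """samples: list of (unix_time, octet_counter) pairs in time order."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append(octet_delta(c0, c1) * 8 / (t1 - t0))
    return rates

def percentile(values, fraction: float):
    """Nearest-rank percentile, e.g. fraction=0.95 for the 95th percentile."""
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]

# Hypothetical five-minute samples; the counter wraps between the 2nd and 3rd.
samples = [(0, 4_294_000_000), (300, 4_294_900_000), (600, 32_704), (900, 57_932_704)]
rates = utilization_bps(samples)
print("median bps:", percentile(rates, 0.5))
print("peak as a fraction of T1:", max(rates) / T1_BPS)
```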

    4. Stabilities and status of Internet routing

      Internet routing considers today's complex mesh of interconnections between component networks and service providers. Since the ARPAnet days of a hierarchically structured core network are long over, loops across networks can form, and local change events can get globally distributed.

      The volume of routed identifiers (IP address/mask pairs) is a concern, as well as their churn. While the sheer size of the tables increases the burden on the packet forwarding engines, the churn impacts the ability to keep the global routing system current, and also introduces traffic loads as routing information is being exchanged.

      The routing table system also reflects the addressable network entities that can be reached, and what IP address space they utilize. A snapshot evaluation is demonstrated in http://moat.nlanr.net/IPaddrocc.
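
      One simple snapshot measure is the share of the IPv4 address space covered by the routed prefixes. The sketch below is illustrative only and is not the method behind the IPaddrocc page; the function names and the sample prefixes (documentation ranges) are assumptions. Overlapping and nested prefixes are merged first so that no address is counted twice.

```python
import ipaddress

def covered_addresses(prefixes) -> int:
    """Number of distinct IPv4 addresses covered by a list of prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    merged = ipaddress.collapse_addresses(networks)  # removes overlaps
    return sum(net.num_addresses for net in merged)

def ipv4_fraction(prefixes) -> float:
    """Share of the whole 32-bit address space that the prefixes cover."""
    return covered_addresses(prefixes) / 2 ** 32

# Hypothetical routing table excerpt; the /25 is nested inside the first /24.
routes = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24", "203.0.113.0/25"]
print(covered_addresses(routes), "addresses,", f"{ipv4_fraction(routes):.2e} of IPv4 space")
```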

      Several months ago we created a 3-D rendered sample path visualization based on AS numbers from collected BGP data:

      More explanation on this can be found in http://moat.nlanr.net/ASx

      Further BGP/routing related information can be found in:

      In addition, we have been running a read-only BGP session with the vBNS since mid-May, to capture routing changes in a log file. However, no analysis on that file has been undertaken so far.


    Another important area of consideration is measurement and analysis in web areas, such as those covered by the IRCache activity. While this is not a direct part of NLANR/MOAT, both projects will be able to benefit from each other.

    The goal of these activities is to produce public analysis results, data sets, and methodologies of use to the Internet community. This is different from the CAIDA objectives (where they focus on using analysis results, data sets, and methodologies in an inter-ISP environment, but not necessarily for public distribution).

  3. Overall network analysis infrastructure

    To successfully design a system for the analysis infrastructure, layers of functions will have to be considered:

    The overall analysis infrastructure requires sets of machines and tools with various functionalities, including:

    The following depicts a conceptual implementation framework:

    The analysis infrastructure is supported by miscellaneous monitors which collect data, possibly perform local analysis, and send the results to a centralized collector for further analysis and storage. Some of the data will be sensitive in nature, e.g., it may contain IP addresses for which privacy issues have to be considered. Machines that may contain such sensitive information are marked in red. The yellow boxes refer to data that has lost some sensitivity as a result of the initial aggregation, but may still have to be treated as potentially sensitive. Green functions would contain freely distributable and desensitized data.

    Initial collection of data and local analysis will happen at the measurement location, followed by central analysis and data cataloging, and archived on NLANR hard disk, SDSC HPSS storage, or other media.

    The analysis system provides data for NLANR researchers doing network analysis on behalf of the project. It also provides for public dissemination of data and information regarding the network properties.

    This system will evolve further during the duration of this project, and the diagrams above should be viewed in the context of a conceptual framework only.

    For the time being, we have a 400 MHz machine used for the "first stage storage and computation" with about 45 GB of disk space. The "external presentations" server is moat.nlanr.net with its web interface, and about 15 GB of disk space.

    For the data being collected so far we have three different organizational descriptors:

    moat.nlanr.net has the collected and public data available on the web server, with the dimensions in a hierarchical file structure based on the data origin, project name, and data collection date. A program runs once per hour to create the appearance of a web-based "datacube" that allows arbitrary access to the three dimensions.

    The datacube can be accessed via "http://moat.nlanr.net/Datacube"
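
    The way a three-level directory hierarchy can be presented as a cube with arbitrary access along each dimension can be sketched as follows. The path layout (Data/origin/project/date/file) is inferred from the single Datacube URL quoted elsewhere in this report; the other paths, the function names, and the indexing approach are hypothetical and do not describe the actual hourly program.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def cube_key(path: str):
    """Return (origin, project, date), assuming .../Data/<origin>/<project>/<date>/<file>."""
    parts = PurePosixPath(path).parts
    i = parts.index("Data")
    origin, project, date = parts[i + 1:i + 4]
    return origin, project, date

def build_index(paths):
    """Map each (origin, project, date) cell to the files stored in it."""
    index = defaultdict(list)
    for path in paths:
        index[cube_key(path)].append(path)
    return index

def select(index, origin=None, project=None, date=None):
    """Return files matching the given dimensions; None acts as a wildcard."""
    wanted = (origin, project, date)
    return sorted(
        path
        for key, files in index.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
        for path in files
    )

paths = [
    "Datacube/Data/SDC/Bspread/980726/901490404.crl-bspread.xmgr",
    "Datacube/Data/SDC/Flows/980726/example.txt",       # hypothetical entry
    "Datacube/Data/ANL/Bspread/980727/example.xmgr",    # hypothetical entry
]
index = build_index(paths)
print(select(index, project="Bspread"))
```

    Fixing one dimension and leaving the other two open gives a slice of the cube, for example all projects and dates for one origin.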

  4. Summary for this reporting period

    Over the past months progress has been made in developing an NLANR Network Analysis Infrastructure (http://moat.nlanr.net/NAI; slide presentation at http://moat.nlanr.net/Presentations/NAI) in the context of the NLANR (http://www.nlanr.net, specifically http://moat.nlanr.net) project. This activity should help us to derive a better understanding of systemic service models and metrics of the Internet. This includes passive measurements based on analysis of packet traces to, e.g., derive workload profiles, active measurements which probe service properties, SNMP information from participating servers, and Internet routing related information based on BGP data.

    A number of people from several sites are participating in this activity, especially those collaborating on deploying measurement machines (some of whom have expressed interest in student and faculty involvement on the analysis side). With the current deployment of passive measurement machines (nine fully operational, one waiting to be installed, and several more sites in discussion), we are getting closer to a usable infrastructure for this kind of activity.

    For the machines that are operational, we are currently running automated data collections plus post processing and a rudimentary web interface. This is modeled after the collections we started doing at FIX-West a few years ago, which are available at http://www.nlanr.net/NA/FIX/Stats/West/index.html. For some minutes per hour a packet trace is collected, analysis software runs on the trace, and the results are transferred to a server in San Diego and made available on the web server at http://moat.nlanr.net/Datacube. A number of packet traces, some fairly recent, some old, can be found at http://moat.nlanr.net/Traces. A range of other information can be found at the main http://moat.nlanr.net location.

    A recent analysis method was based on some questions raised by Jay Dombrowski (SDSC) regarding better insight into burstiness behaviors (e.g., see http://moat.nlanr.net/Datacube/Data/SDC/Bspread/980726/901490404.crl-bspread.xmgr). This looks at bps rates for different observation durations, but requires xmgr to display the results.
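
    The effect of the observation duration on the measured rate can be illustrated with a small sketch. It is not the bspread analysis itself: the input layout (integer microsecond timestamps and packet lengths), the window sizes, and the sample traffic are assumptions. The same packets are binned into windows of different lengths, and the peak rate is reported for each window size.

```python
from collections import defaultdict

def rates_bps(packets, window_us: int):
    """Rate in bps for each non-empty window; packets are (timestamp_us, length_bytes)."""
    bits = defaultdict(int)
    for ts_us, length in packets:
        bits[ts_us // window_us] += length * 8
    return [b * 1_000_000 / window_us for _, b in sorted(bits.items())]

def peak_by_window(packets, windows_us=(10_000, 100_000, 1_000_000)):
    """Peak rate seen at each observation duration (10 ms, 100 ms, 1 s by default)."""
    packets = list(packets)
    return {w: max(rates_bps(packets, w)) for w in windows_us}

# Hypothetical traffic: a 1 ms burst of ten 1500-byte packets, then one straggler.
burst = [(i * 100, 1500) for i in range(10)] + [(900_000, 1500)]
for window, peak in peak_by_window(burst).items():
    print(f"{window / 1000:>6.0f} ms window: peak {peak / 1e6:.3f} Mbps")
```

    For this sample the peak falls from 12 Mbps at 10 ms to 0.132 Mbps at one second, which is the kind of spread between short and long observation durations that indicates bursty traffic.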

    The newly deployed machines are based on the OC3mon work that Joel Apisdorf (MCI) did, and which followed the network analysis work performed at SDSC for a number of years (particularly the analysis work on active flows). However, we usually use a Unix (FreeBSD) adaptation to the OC3mon front-end, which allows us greater flexibility in the more research-oriented environment we are faced with. The Unix port was initially done by Jon Dugan of NCSA.

    The deployment sites and status are depicted in the image on http://moat.nlanr.net. The installed machines are FreeBSD based Coral/OC3mon monitors (rack-mountable, as described in http://moat.nlanr.net/OC3mon-monitors), with the exception of the machine at SDSC, which collects via an FDDI interface while producing Coral-compatible output. Pictures of some of the installations can be found in http://moat.nlanr.net/NAIImages, and we would like to include pictures from other sites as well.

    We have also purchased several machines for future active measurements that are similar to the OC3mon monitors (except without FORE cards or splitters), and Tony McGregor has begun development on those machines.

    We have started to put the machinery together as a system, including labeling the different sensitivities (red/yellow/green) of data. We are very cautious about protecting privacy and security for the measurement and analysis activities. Since we only consider packet headers, and not content, the main privacy exposure is IP addresses, which we encode and/or collapse if made available. We also attempt to protect the machines security-wise by disabling everything but ssh for external services.
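
    The two ways of reducing the sensitivity of IP addresses mentioned above, collapsing and encoding, can be sketched as follows. These are generic techniques chosen for illustration, not the project's actual procedure: the /24 collapse length, the keyed SHA-256 hash, the 16-character token length, and the placeholder key are all assumptions.

```python
import hashlib
import hmac
import ipaddress

def collapse(address: str, prefix_len: int = 24) -> str:
    """Collapse a host address to its enclosing network (assumed /24 here)."""
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.network_address)

def encode(address: str, key: bytes) -> str:
    """Replace an address with a keyed hash; the same input gives the same token."""
    packed = ipaddress.ip_address(address).packed
    return hmac.new(key, packed, hashlib.sha256).hexdigest()[:16]

secret = b"example-key-kept-off-the-public-server"  # hypothetical key
print(collapse("192.0.2.77"))            # host detail removed
print(encode("192.0.2.77", secret))      # stable pseudonym for per-host counting
```

    Collapsing discards host-level detail outright, while a keyed encoding preserves the ability to count distinct hosts without revealing them, as long as the key stays private.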

    The aforementioned image also depicts significant storage needs, for which we are planning to purchase a large file system. A RAID-based evaluation unit, to be shared between multiple SCSI hosts, did not work out for multiple host architectures.

    Multiple parties cannot do measurements at the same time, unless we create some strict scheduling. While the measurements have to be centrally organized (to protect privacy of the data and security of the site), there is certainly a lot of room for collaboration and new ideas. It would be valuable if we could arrange to get more students and faculty involved, especially since some of the sites have already expressed an interest.

    We created a nai@nlanr.net mailing list to promote common discussions and focus on analysis activities. This list is based on the e-mail from sites that expressed interest in deployment/collaboration, including the vBNS, NSF, NGI, and I2 people, plus a few others.

    Current list members include:

    NASA Ames:
    Mark Foster
    Wm. Prichard Jones
    Lance Tatman

    SDSC/UCSD:
    Jay Dombrowski
    George Polyzos

    NCAR/UCB:
    Scot Colburn
    Basil Irwin
    Caren Litvanyi
    Marla Meehl

    ANL:
    Linda Winkler

    Startap:
    Andrew Schmidt

    Merit:
    Abha Ahuja
    Craig Labovitz

    OARnet/OSU:
    Mark Fullmer
    Eugene Wallis

    Texas GigaPOP:
    Stan Barber
    Charles Chambers
    Farrell E Gerbode
    Lennart Johnsson

    FloridaNet:
    Matt Grover
    Ken Hays
    Dave Pokorney

    MCNC/NCREN:
    John Bass
    Mark Johnson
    Massimo Strazzeri

    UCLA:
    Mario Gerla
    Ronn Ritke
    Lixia Zhang

    ODU:
    Glen Wheless

    CSUSB:
    Yasha Karant

    Vanderbilt University:
    Esfandiar Zafar

    University of Pennsylvania:
    Robert Hollebeek

    vBNS:
    Joel Apisdorf
    John Jamison
    Greg Miller
    Kevin Thompson
    Rick Wilder

    NSF:
    Javad Boroumand
    Bill Decker
    Steve Goldstein
    Don Mitchell

    NGI:
    Phil Dykstra

    Internet2:
    Guy Almes
    Steve Corbato
    David L. Wasley

    Others:
    Fred Baker
    Noah Breslow
    Steve Feldman
    Ian Graham
    Todd Kaloudis
    Ahmed Mokhtar

    NLANR:
    Kathy Benninger
    Mark Gates
    Jamshid Mahdavi

    SDSC NLANR or CAIDA:
    Nancy Bachman
    Hans-Werner Braun
    Jeff Brown
    Jambi Ganbar
    Sean McCreary
    Tony McGregor
    Tracie Monk
    David Moore
    Evi Nemeth
    Mike Tesch
    Jenniffer Woodson

  5. Hiring

    Mike Tesch (formerly working for CAIDA) is developing a Unix version of OC12mon, based on the OC12 card and interface that Joel Apisdorf is using. We have discussed the need for a defined interface between the collection and the application: an interface that can serve as a common denominator between multiple data collection processes, as well as analysis modules. Mike is a student from the University of Wisconsin, Madison, working with us until January 1999.

    Jeff Brown is an NLANR summer intern working with Mike on measurement and analysis software. His objective is to derive a modular approach, where analysis software can be used either in real-time (on the measurement machines), or offline on collected packet header traces. Jeff will start with fs2flows as an initial module, and create a boilerplate. Jeff and Mike will work closely together on these two projects, so that all of the collection/interface/analysis modules will interoperate smoothly.

    Tony McGregor is working on NLANR projects while on sabbatical from the University of Waikato in New Zealand. Tony will take the lead on designing a framework and implementation for the NLANR active measurements for the vBNS network, as seen from client sites. The contained vBNS environment and collaboration with MCI (as the vBNS service provider) provide a great opportunity to think about "next generation" measurement strategies.

    Brynjar Viken, a Ph.D. student in the Department of Telematics at the Norwegian University of Science and Technology, will be joining NLANR for six months, beginning in September, 1998. The working title of his Ph.D. thesis is "Traffic in High Capacity Networks," and his current research focuses on measurements of Internet traffic and integrated services in the Internet. Last year he worked on projects for Uninett, the Norwegian academic network for research and education, where he developed software to perform NetFlow measurements on Cisco routers. We would like to involve him in the areas of SNMP or routing, but don't yet have a feel for the exact nature of what he will be working on.

  6. Outreach

    Hans-Werner Braun attended the Internet2 Second GigaPoP Operators Workshop ["GO2"] in Research Triangle Park, North Carolina, and made an MBONE presentation on OC3mons at the vBNS Techs meeting on Monday, June 1st, 1998.

    He and Jenniffer Woodson attend the weekly NLANR conference calls.