Bill St. Arnaud: Scaling issues on Internet networks


From: Joerg Micheel (joerg@cs.waikato.ac.nz)
Date: Mon Jan 15 2001 - 13:18:08 PST


My apologies to people already on Bill's list, but this thought-provoking
paper really is looking for an answer from the Internet Data Analysis
community.

        -- Joerg
---------------------------------------------------------------------------

For more information on this item please visit the CANARIE CA*net 3 Optical
Internet program web site at http://www.canet3.net/news/news.html
-------------------------------------------

[This paper can also be viewed at the CANARIE web site at
http://www.canet3.net/library/papers.html - BSA]

Scaling Issues on Internet Networks

Bill St. Arnaud, CANARIE Inc
bill.st.arnaud@canarie.ca

(Draft --- Draft ---Draft --- For Discussion Purposes Only)

You are free to distribute this paper as long as the following
disclaimer is included:

“This is a discussion paper intended to provoke further thought and debate
on the implications of Internet growth and scaling. These arguments and
opinions presented here are mine alone and do not necessarily reflect those
of the CANARIE board, management or members.”

January 13, 2001
Abstract

The Internet continues to experience phenomenal growth with traffic volumes
doubling approximately every year. It is postulated in this paper that there
are in fact two drivers causing this growth – increased usage and an
increasing number of connections to the Internet. The number of connections
“N” is defined by the number of simultaneous TCP or UDP sessions on the
network connected computers. While the number of users and connected
computers may grow linearly, peer-to-peer applications and other services
will significantly increase the size of N. Consequently N is unbounded and
will continue to grow with the number of connected computers and by the
average number of simultaneous sessions per computer which is only limited
by Moore’s Law. As a consequence, the current hierarchical, centrally managed
network paradigm for carrying Internet traffic may face serious scaling
issues due to a fundamental law of graph theory referred to as the “N
squared” phenomenon, where the number of interconnections increases with the
square of the number of nodes on the network. One possible outcome of this
phenomenon is that the capacity at the core of any Internet network will
continue to increase even if the number of customers and the size of the
offered load remain constant. The “N squared” phenomenon is further
aggravated on the Internet as compared to traditional networks because of
its lack of hierarchical applications, greater temporal self similarity
(“burstiness”), wider spatial self similarity (“slosh”) and significantly
greater distance invariance. E-mail may continue to remain the killer
application for the Internet as it puts far greater topological stress on
the network than its traffic volumes would otherwise indicate. Further
research needs to be carried out on this phenomenon, but if it proves valid,
alternative network architectures need to be investigated. A hierarchical
network architecture originally designed for a single hierarchical
application with a mature and relatively stable N (voice telephony) may not
map well to a network made up of essentially peer to peer applications
either at the server level (e-mail, news, web) or user level (Napster,
Gnutella). One possible alternative network architecture that is more
closely aligned to the overlay application architecture and which may be
less susceptible to the N squared phenomenon is one where connections radiate
from intelligent network devices at the edge which interconnect to each
other on a peer-to-peer basis. The significant economic advantage of
direct peering between ISPs is an early example of how this approach might
have already arisen in the marketplace in response to the N squared
challenge. This “edge radiating” network architecture with direct peering
interconnections may now be extended beyond large ISPs to other
organizations given the recent availability of customer owned dark fiber and
wavelengths. This architectural approach may provide significant new
business opportunities for those carriers interested in selling “pipes” to
Internet exchange points rather than providing a managed Internet service.
It may also underpin architectural concepts for “community” based fiber
networks interconnecting businesses, schools and other organizations to a
“community” based carrier neutral exchange point. This architecture seems
similar to that which exists in other “natural” networks and in particular,
the brain where neurons interconnect to each other with long tentacles
called axons. As such, the architectural approach might also open some new
areas of research in self aware networks, artificially intelligent grids,
optical grids, object oriented networks and other “network” based
approaches to problem solving.

Introduction

Current Internet networks are built on a network architecture paradigm that
remains unchanged from the early days of telephony, where traffic is
aggregated in a classic “snowflake” pattern of hierarchical nodes. This
network architecture has worked well for traditional telephony, where there is
a single hierarchical application (telephony), a considerable degree of local
traffic versus long distance and where large multiplexing ratios can be
achieved at each node. However, increasingly network engineers are
beginning to wonder if this network architecture model will scale with the
huge growth in traffic volumes on the Internet. There has been some
anecdotal evidence that the core infrastructure of the Internet may actually
be growing faster than the offered load. If this proves to be a systemic
condition then network engineers and researchers may have to re-evaluate the
fundamental network topology of the Internet.

To understand some of the underlying conditions that may be driving this
growth we must first return to some basic principles on the structure of
networks.

What is N?

We know from basic graph theory that the number of lines required to
fully interconnect a set of nodes on a network (the number of nodes is
often referred to as “N”) is related to the square of the number of nodes.
The exact formula for the number of interconnecting lines between N nodes
is N*(N-1)/2. For large values of N this approximates to (N**2)/2, so it
grows in proportion to N**2. This basic relationship in networks is called
the “N squared” phenomenon because the number of interconnections increases
with the square of the number of nodes. For large N, this can have a
dramatic effect on network topology and cost.
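
The formula above can be checked with a short sketch. The function name
full_mesh_links is only an illustration and does not come from the paper.

```python
def full_mesh_links(n: int) -> int:
    """Number of point-to-point lines needed to fully interconnect n nodes."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


for n in (4, 10, 100, 1000):
    links = full_mesh_links(n)
    # The ratio links / n**2 approaches 0.5 as n grows.
    print(n, links, round(links / n**2, 3))
```

Going from 100 to 1000 nodes multiplies the node count by 10 but the
number of lines by roughly 100.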

But what are the nodes “N” in an Internet network? Is “N” the number of
humans using the network, the number of devices connected to the network, or
some other parameter? Understanding what is “N” is fundamental to the
arguments presented in this paper.

In a telephone network N is very easy to define. It is the number of
telephone lines in the network. It is important to note that “N” in a
telephone network is not the number of telephones per se. Although there may
be many telephones in a home or office, the number of simultaneous phone
calls is limited by the number of available telephone lines.

Telephone lines for the most part connect to telephones, which for the most
part are used by humans. Because telephones can generally only be used by
humans, the growth in the number of telephone lines in a traditional
telephone network is ultimately limited by the number of humans.

Human brains are not very effective at multitasking voice conversations.
Because humans can typically carry on only one voice conversation at a
time, there is a limit to the number of possible connections in a telephone
network to a given destination, which is ultimately related to the number
of humans at that destination.

For example, it would be impossible for a million people to call Washington
DC and all talk to the President of the United States at the same time.
Therefore telephone companies implement “call blocking” services throughout
the network to limit the number of simultaneous connections to Washington DC
based on the number of people living in the capital, and ultimately to just one
connection to the President of the United States.

On the Internet, on the other hand, it is quite conceivable that a million
connections could be made to the White House web server. The important
thing to note here is that although there may be “one” server, the number of
connections “N” is significantly greater. If each one of these connections
used the same bandwidth as a voice call then the network capacity to the
White House web server would be a million times that of the network
capacity to the President’s telephone.

So on the Internet “N” is not set by the number of connected computers but
by the number of simultaneous sessions that can be sustained on those
computers. More importantly, with computers the number of simultaneous
connections is only limited by the processing power of the computer. As
computers grow ever more powerful they will be able to support greater and
greater numbers of simultaneous connections. Accordingly, on the Internet
the size of N is only limited by the number of connected computers and their
respective processing power. Processing power is related to CPU power, which
according to the famous Moore’s Law doubles every 18 months, although memory
bandwidth, disk size and disk speed also affect overall processing power.
Consequently, even if the number of computers connected to a network remains
the same, the “connection” load will increase as the computers increase in
power and speed, because the same number of computers can support more
simultaneous sessions.
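
As a rough illustration of this argument, the sketch below holds the number
of computers fixed and lets the sessions each can sustain double every 18
months. The assumption that session capacity tracks CPU power one-for-one
is a simplification made here for illustration; the paper does not give a
formula, and the function and parameter names are hypothetical.

```python
def session_count(computers: int, sessions_per_computer: float,
                  months: float, doubling_months: float = 18.0) -> float:
    """Estimate N (simultaneous sessions) after a number of months.

    Assumes, purely for illustration, that the sessions each computer can
    sustain double every doubling_months (a Moore's Law style growth).
    """
    growth = 2.0 ** (months / doubling_months)
    return computers * sessions_per_computer * growth


# A fixed population of 1000 computers, each starting with 2 sessions.
for months in (0, 18, 36, 54):
    print(months, session_count(1000, 2, months))
```

Under these assumptions N quadruples in three years with no new computers
attached to the network.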

There are significant implications of “N” being equal to the number of
simultaneous sessions and not the number of connected computers. For
example many people still connect to the Internet through a dial up line.
Because of the limited bandwidth of dial up connections they can only
support a small number of simultaneous connections. Therefore the impact of
computers with dial up connections on the size and growth of the Internet is
limited.

Another example is Internet appliances. Although there are many
predictions of thousands of wireless devices and appliances connecting to
the Internet, the impact they will have on network capacity may be limited.
These devices will generally be limited in computing capability and usually
have very narrow bandwidth connections to the network, and as such they will
likely only be able to support a small number of simultaneous connections.
However, their impact will grow as wireless bandwidth increases and more
powerful devices come on to the market.

On the other hand peer-to-peer applications and server applications like
e-mail or network news can support many tens if not hundreds of connections.
Take Napster for example. Although it is not a true peer-to-peer
application the impact on networks is the same as with any true peer-to-peer
application. A typical Napster user will click on several songs to download
at the same time. Meanwhile in the background Napster is serving up songs
from the user’s hard drive to other computers on the network. The result is
that a single peer-to-peer application can support many simultaneous
connections.

Communication between e-mail servers is another classic peer to peer
application. Most computer users don’t send e-mail directly from their
computer to another user. Instead they send their mail to a “server” that
then forwards and redistributes the e-mail to another set of servers that
hold mail for the designated recipients. In essence, e-mail is a peer to
peer application with at least one level of hierarchy.

With the explosion of list servers and “spam”, e-mail servers can easily have
several tens if not hundreds of connections open at the same time. And
because there is only one level of hierarchy it is quite common for the same
e-mail message to be replicated and transmitted many times over and over
again across the same network links.

Today a home computer may only be operating one or two simultaneous
sessions – mostly sending or receiving mail and/or surfing the web. With
Napster that same computer could be supporting a half dozen or more
simultaneous connections uploading and downloading music files. Eventually,
as home LANs and wireless network appliances become prevalent within the
home, that same computer gateway could be supporting hundreds of simultaneous
connections. As a result in the coming years we could see a dramatic
increase in “N” which is much greater than the number of connected computers
and even greater still than the number of paying customers.

The impact of “N” on network topology

An increase in N will have a significant impact on the growth of network
topology even if the offered load remains the same, where the offered load
is defined as the average total volume of bits delivered by the
customer to the network. To understand why, let us take a very simple
example.

Ted, Mary, Alice and John have their own private telephone network. There
is no switch in this network so on each desk there are 3 telephones with
dedicated telephone lines to every other individual in the network. As a
result there would be 6 separate telephone lines connecting all 4 speakers.
Because of limitations of the human brain only one telephone can be used at
one time so at any one time there can only be one circuit in use between
two speakers. If each individual is talking to one other individual then at
any given time there would be a maximum of 2 circuits in use out of a possible
total of 6 circuits in this network.

But one day Ted decides to be clever and tries to carry on three telephone
conversations at the same time. He calls Mary, Alice and John and then
speaks to each one of them in turn in rapid succession. Because he has only
one larynx he cannot carry out multiple simultaneous conversations. So even
though his traffic volume is the same he has now tied up three telephone
circuits as opposed to the one circuit when he was talking to a single
individual previously.

This simple example is exactly analogous to what is happening when a
computer has multiple simultaneous sessions across the network. Imagine now
if Ted, Mary, Alice and John each decided to hold simultaneous conversations
with the two remaining individuals as well, so that everyone is talking to
all three of the others. This would result in the full mesh of 6
circuits being fully utilized between all 4 speakers. It is important to
note that overall the actual traffic volume remains unchanged from when they
spoke to each other individually but the number of circuits in use has gone
from 2 to 6.
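
The four-person example can be reproduced with a small sketch that counts
how many dedicated lines are occupied by a given set of conversations. The
helper name circuits_in_use is an illustration only.

```python
from itertools import combinations

PEOPLE = ["Ted", "Mary", "Alice", "John"]


def circuits_in_use(conversations):
    """Count the distinct dedicated lines occupied by two-party conversations."""
    return len({frozenset(pair) for pair in conversations})


# Every possible pair: the full mesh of dedicated lines.
full_mesh = list(combinations(PEOPLE, 2))
# Everyone talks to exactly one other person.
one_each = [("Ted", "Mary"), ("Alice", "John")]
# Ted alone time-slices between three calls.
ted_clever = [("Ted", p) for p in PEOPLE if p != "Ted"]

print(len(full_mesh))               # lines installed
print(circuits_in_use(one_each))    # circuits busy, one conversation each
print(circuits_in_use(ted_clever))  # circuits busy, Ted on three calls
print(circuits_in_use(full_mesh))   # circuits busy, everyone talks to everyone
```

The amount of speech is the same in every case; only the number of
circuits tied up changes.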

It is clear from this simple example that the effect of increasing N is not so
much greater traffic load but a dramatic increase in topology load. For a
service provider an increase in topology can be much more expensive than
an increase in traffic load. With increased traffic load a network operator
only has to increase the size of the pipes in proportion to the actual usage.
With increased N, however, all the pipes interconnecting potential users
have to be increased in size by some function related to the square of N.

It is important to note that even if the offered traffic load (packets per
second) remains the same an increase in N (number of TCP or UDP sessions)
will result in an increased number of connections in the network. Because
network engineers cannot predict with certainty when a circuit will be used
they must reserve spare capacity in each circuit. This results in network
capacity growing in some relationship to the square of N regardless of whether
the traffic volume remains the same.

Now in the real world computers are also limited in the number of other
computers that they can talk to simultaneously and by the size of the access
pipes. Network engineers take advantage of this fact and introduce
multiplexing to minimize the N squared growth in the network. But inevitably
as more powerful computers are attached to the network which can support a
greater number of simultaneous connections and as access pipes get larger
these networks will not only see an increased load per connection, but more
connections per computer which drives up the cost of the core
infrastructure.

Clearly peer-to-peer applications will have a major impact on the number of
simultaneous connections. A typical computer today may have only one or two
simultaneous sessions going at one time e.g. web and e-mail. However with
peer-to-peer applications like Napster and Gnutella the computer could be
handling 10s if not 100s of simultaneous connections distributing music and
other data to other computers around the world.

The impact of “N” on network traffic growth and revenues

All Internet networks are seeing dramatic growth in traffic volumes. But it
is important to distinguish whether that increased traffic volume is being
driven by actual increased usage or by the N squared phenomenon. If the
increased traffic volume is driven by increased usage then service providers
should be able to realize greater revenues with only a moderate increase in
infrastructure cost. However, if the increase in traffic volume is driven by
the N squared phenomenon then for every increase in traffic volume there
will be a corresponding increase in infrastructure cost related to N
squared.

Let us look at e-mail as a simple example. Two or three years ago most of us
probably only received 20 to 30 e-mails a day. Today, many people routinely
receive hundreds of e-mails a day. The increased e-mail volume is generally
not because these people are sending more and more e-mail back and forth
with the same correspondents of two or three years ago. The largest
increase in volume is because they are sending and receiving e-mail with
many more correspondents around the world. List servers and “spam” have
also significantly contributed to the e-mail volume.

But it is the increased number of correspondents that significantly
increases the costs of delivering this e-mail for the service provider. If
in fact the e-mail volume had increased to the same 20 or 30 correspondents
that there were two or three years ago then the service provider would only
have to build larger pipes to those correspondents to accommodate the
increased traffic volume. However, since most of the increased traffic
volume is from new correspondents, many new pipes have to be built to
support the communication of e-mail to those new correspondents.

Web pages over time have also become richer in content and typically involve
many connections. With content servers it is quite common for each image,
icon, applet and banner to be fetched over a separate connection, which,
as with e-mail, increases the topological stress.

Although e-mail traffic has considerably less volume than web or file
transfer traffic its impact on N squared can possibly be much greater. Web
and file transfer traffic tends to have a much greater locality of source
than e-mail traffic e.g. many people go to the same web sites. As a result a
wider range of e-mail correspondents over time can have significantly
greater impact on N squared. In terms of topology cost e-mail may still be
the number one killer application for the Internet. This echoes similar
findings by Andrew Odlyzko in his paper “Content is not King”
http://www.research.att.com/%7Eamo/doc/recent.html where he concludes that
historically, connectivity has mattered much more than content and it is
connectivity, not content, that drives revenues.

It is a general rule of thumb that most Internet networks double their
traffic volumes every year. For the longest time it was assumed that this
growth was directly user driven i.e. we are all using the Internet more and
more in our every day lives. But it is hard to imagine a human being
increasing their usage of the Internet at a compounded annual growth rate of
100%, year after year after year.

Let us look at an excellent example provided by Andrew Odlyzko in a recent
paper “Internet Growth: Myth and Reality, Use and Abuse”
http://www.cisp.org/imp/november_2000/odlyzko/11_00odlyzko.htm . In that
paper he shows year-to-year traffic volumes across the Atlantic for the
Swiss research network called “SWITCH”. The SWITCH network connects up most
of the universities and research centres in Switzerland. For the past 5
years the trans-Atlantic link has shown consistent year-to-year growth in
traffic of 87% per year.

As opposed to the consumer or even the business market, the growth in
computers and connections to the Internet by universities has largely
matured over the past few years. The universities were the first to connect
to the Internet so their growth in new computers and new connections to the
Internet would be significantly less than in the consumer market which is
still in the early stages of Internet connectivity. So where is this
compounded growth coming from if it is not from new connections to the
network? Why would a mature university network with very little growth in
the number of students, computers or connections to the Internet exhibit the
same growth characteristics as a consumer or business network where the
number of network computers is still climbing dramatically?

Undoubtedly a portion of the annual traffic growth is coming from new
applications and from greater use of the Internet by faculty and students
alike. But if the growth were purely from increased usage, that would mean that
for every hour spent on the Internet in 1996 a person would have to spend close
to 16 hours in the year 2000 if traffic doubled every year, or about 12 hours
at 87% per year. We do know in fact
that a significant portion of the growth from 1996 to 1998 came from the
transition from mostly text based traffic to largely web based graphics
traffic. But now that web based traffic has been well established and available to
everyone for the past two years, why hasn’t the traffic growth levelled off
on the SWITCH network?
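
The compounding behind the hours-per-person comparison can be worked
through as follows. This is only a sketch of the arithmetic, treating 1996
to 2000 as four full years of growth; the function name is illustrative.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Total growth factor after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


# Four years of growth at the SWITCH rate and at a full annual doubling.
print(round(growth_multiple(0.87, 4), 1))
print(round(growth_multiple(1.00, 4), 1))
```

At 87% per year the multiple over four years is a little over 12, and at
100% per year it is exactly 16.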

The growth in the number of simultaneous connections may give us a better
clue to the source of this steady annual increase in traffic. A compounded
annual growth rate of 87% per year is a faster rate of growth than “N
squared”. Could it be that this growth rate reflects not greater usage by
humans but in fact an increase in the average number of connections per
computer, resulting in a total number of “N squared” connections across the
Atlantic, each of which is carrying traffic equal to one connection in 1996?

Unfortunately as yet we do not have any statistical data to refute or
support this interpretation of the data. However if there is a direct
correlation between the number of connections and increased traffic volume
then the implications are ominous for existing Internet service providers
and consumers of bandwidth alike:

1. For organizations, even if there are no new applications, Internet
traffic will continue to grow regardless of whether the number of employees and/or
computers at the organization remains constant; and
2. For service providers the topology of networks will continue to increase
in size even with a fixed number of customers and with no increase in
offered load.

If Internet networks start to exhibit the N squared phenomenon to any degree then
these networks will simply not scale for any large values of N. If the
growth of N is accelerating, the value of N squared is going to grow
even more dramatically.

What is fascinating to note is that with large values of N, the economies of
scale of large networks are inverted. If a network infrastructure must grow
at some function related to the square of N, presumably the network costs
will also grow at a corresponding rate. Accordingly a network with N-1
users will be less expensive than a network with N users, particularly for
large values of N. If both networks are connected to the same initial set
of customers the smaller network will always be able to force cost pressure
against the larger network.
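
The inversion can be illustrated with a deliberately crude cost model in
which the operator pays a fixed amount for every pair of nodes. Both the
model and the names network_cost and cost_per_user are assumptions made
here for illustration; the paper only says that cost grows at “some
function related to the square of N”.

```python
def network_cost(n: int, cost_per_link: float = 1.0) -> float:
    """Hypothetical cost model: cost_per_link for each of the n*(n-1)/2 node pairs."""
    return cost_per_link * n * (n - 1) / 2


def cost_per_user(n: int, cost_per_link: float = 1.0) -> float:
    """Average cost carried by each of the n users under the model above."""
    return network_cost(n, cost_per_link) / n


for n in (10, 100, 1000):
    print(n, network_cost(n), cost_per_user(n))
```

Under this model the cost per user is (N-1)/2, so it rises with N instead
of falling, and the network with N-1 users is always slightly cheaper per
user than the network with N.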

There are probably a number of additional factors, but the inverse economics
of N-1 may be the driver for ISPs to establish direct peering amongst
themselves and thereby substantially reduce their Internet transit fees to a
much larger ISP. It has been demonstrated in an excellent paper by William
Norton http://www.nanog.org/mtg-0010/tree.doc that direct peering can
dramatically reduce Internet transit fees from a larger upstream provider.

The effect of “N squared” on DSL and Cable modem networks in particular
could be pernicious. These networks are deployed on a flat monthly billing
model based on average expected customer usage. Even if the traffic load
remains constant peer-to-peer applications on these networks could drive up
costs of the core infrastructure of the carrier.

Side Bar: The Impact of N on suburban road congestion

The N squared phenomenon is not limited to communication networks. It
is also easily evidenced in the congestion on our suburban roads and
highways. Before WWII most businesses and factories were located in the
center of cities. As a result highly efficient and hierarchical public
transportation systems were possible for moving citizens to and from the
core of the city.

But the wide scale availability of automobiles after the war led to building
of suburbs. At first the suburban roads were not congested as there were
only a small number of residents (i.e. small N) living in the suburbs. But
as the suburban population increased congestion grew dramatically on the
surrounding roads. Because there is no “center” in the suburbs most suburban
roads around major cities are continually congested in all directions and
far out of proportion to the number of residents.

To get away from this congestion developers are building new communities
further and further out from the city. This is a classic small N solution,
but ultimately self defeating. Initially because N is small congestion is
light. But the cycle soon starts to repeat itself as N becomes larger and
road congestion again increases.

In the past this congestion has been blamed on a number of factors such as
cheaper housing, people’s love of the car, the tragedy of the commons,
political pressure from developers, etc. But in fact it may more accurately
reflect a classic N squared phenomenon. Whether it is packets on the Internet or
cars on suburban highways, they all obey the same immutable laws of N squared
connectivity.

Other Internet Traffic Characteristics that affect Topology Growth

Traditional telephone networks have a number of distinct advantages over the
Internet in terms of scaling and growth:

1. The key application (telephony) has a hierarchical server model;
2. 80% of the connections are local; and
3. High multiplex efficiencies are possible with voice traffic.

The Internet has probably only one major advantage in that it is
fundamentally a connectionless packet based network. As a result when
“connections” are established on the Internet bandwidth is not “locked” up
as in a traditional connection oriented network. But there still is a cost
when TCP or UDP connections are established on the Internet, even when they
are idle. Because network engineers cannot predict with certainty when these
connections will be used they must build in reserve capacity on each link so
that congestion does not occur when traffic does flow. The calculation of
necessary reserve capacity is based on queuing theory and a number of other
factors and as we shall see in the following discussion these combine to
exacerbate the N squared phenomenon of Internet networks.

1. Hierarchical versus Non Hierarchical Applications. In a traditional
telephone system there is a hierarchy of telephony servers commonly called
switches that route telephone calls across the network. Rather than a local
neighbourhood switch setting up the circuit all the way to the destination,
the call request is handed off to a hierarchy of switches which in turn assess
if sufficient capacity exists to the next higher switch in the hierarchy,
and if so, establish the connection to that switch. Such a hierarchical
application is well suited to a hierarchical network model. In fact on the
telephone network they map to each other so precisely that often the
“network” topology is indistinguishable from the “application” topology.

On the Internet there is a loose hierarchy of routers which can forward
packets, but in general there is no hierarchy of application services. Most
application servers at best have one level of hierarchy (e.g. e-mail) and
they establish direct peer to peer connections with similar servers located
anywhere else on the planet. The consequence of this lack of hierarchy is that
many times the same data is transmitted repeatedly over the network. There
have been attempts to compensate for this lack of hierarchy by deploying mirroring
and cache servers. This obviously only works for web based traffic and is
not effective at all for e-mail traffic or peer to peer applications.

2. Local versus Long Distance Traffic. In a traditional telephone network a
common rule of thumb is that 80% of the traffic is local. On Internet
networks, the local versus long distance ratio has been completely reversed.
As a rule of thumb, only 20% of Internet traffic is considered local. In
fact the ratio may be significantly less than that, particularly at the
metro level.

3. Poisson versus Self Similar Traffic Patterns. Because traffic
on traditional telephone networks follows a Poisson distribution, increasingly
higher multiplex efficiencies can be achieved at aggregating nodes closer
to the center of the snowflake [MOLL89].

It has been well documented that Ethernet and Internet traffic
exhibit what is called self similar behaviour [LELA94] and [PAXS95].
The self similar characteristics of Internet and Ethernet traffic
mean that large buffers would be required in order to obtain multiplexing
efficiencies equivalent to those of the Poisson distributed traffic
of traditional telephone networks. However these buffers
are impractical due to the large delays that would be incurred. As a
consequence network engineers have increased either the number or the size
of output ports at any given node. This, of course, reduces the multiplexing
efficiency of the network. As the incoming traffic exhibits increasingly self
similar characteristics the multiplexing efficiency must decrease in order to
maintain the same delay in the network queues.

4. Spatial Self Similarity. Another facet of the self similarity of Internet
traffic is that it probably acts in two dimensions – the temporal and
spatial. If traffic arriving at a multiplexer exhibits self similar
characteristics in the temporal domain then presumably the traffic on each
leg of the multiplexer output will exhibit the same behaviour. If this
traffic is then observed spatially it will be seen to “slosh” from one leg
to the other. This sloshing imposes an additional cost on the network
topology as greater capacity has to be built into each leg of the
multiplexer than would otherwise be predicted by Poisson distributed
traffic.
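
The multiplexing advantage of Poisson traffic described in item 3 can be
sketched numerically. For a Poisson distributed load the variance equals
the mean, so the spare capacity needed above the mean shrinks, relative to
the mean, as more traffic is aggregated. The provisioning rule used below
(mean plus k standard deviations) is an assumption chosen for illustration
and is not taken from the paper or from [MOLL89].

```python
import math


def poisson_headroom(mean_load: float, k: float = 3.0) -> float:
    """Relative spare capacity when provisioning mean + k * sigma.

    For a Poisson distributed load sigma = sqrt(mean_load), so the
    headroom as a fraction of the mean is k / sqrt(mean_load).
    """
    if mean_load <= 0:
        raise ValueError("mean_load must be positive")
    return k * math.sqrt(mean_load) / mean_load


# Aggregating more calls at a node lowers the relative headroom required.
for calls in (10, 100, 1000, 10000):
    print(calls, round(poisson_headroom(calls), 3))
```

Self similar traffic does not smooth out in this way when aggregated, which
is why the same efficiencies are not available on Internet links.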

Traditional networks with their application hierarchies of voice switches,
high degree of locality and large multiplex efficiencies have been able to
mitigate the “N squared” phenomena of their networks. In addition, on a
traditional telephone network because only one user can speak to one other
user there was an immediate multiplexing gain of 1/N. As a result on some
traditional networks aggregation could be logarithmic to the number of
users. This of course would allow for significant economies of scale.

The Internet has none of these advantages. So compounding the N squared
phenomenon is the fact that far fewer network efficiencies can be obtained
for Internet traffic. But probably the biggest issue confronting the
Internet is “abuse of the commons” because there are no differentiated
charges for those customers who generate a lot of “N squared” traffic.

Billing models and the impact on N squared

The traditional telephone network has a time and distance billing system.
So when someone makes a long distance telephone call, they are charged not
only for the duration of the call but also for its relative distance. A call
from New York to Japan is more expensive than a call from New York to
Washington. This means that the telephone companies receive direct
compensation from those customers who place the greatest number of long
distance calls to the greatest number of destinations. In fact this model
encouraged telephone companies to market long distance services that
increased usage but minimized N squared demands on their networks, e.g.
“friends and family” long distance packages.

Unfortunately with the Internet there is no equivalent billing mechanism.
All packets are priced the same (although there is often a volume discount)
regardless of where they are destined. As a result “spammers” and other
large users of the network get a free ride because they do not have to pay
the distance or N squared cost component of delivering their packets.
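
The difference between the two billing models can be sketched with a toy
calculation. Everything here is a hypothetical illustration: the Flow record,
the rates and the two example customers are assumptions chosen only to show
the shape of the argument, not actual tariffs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    units: float        # call minutes or megabytes sent (hypothetical unit)
    distance_km: float  # distance to the destination


def flat_charge(flows, rate_per_unit=0.05):
    """Internet-style charge: every unit costs the same, wherever it goes."""
    return sum(f.units * rate_per_unit for f in flows)


def time_distance_charge(flows, rate_per_unit_km=0.0001):
    """Telephone-style charge: each unit is weighted by the distance it travels."""
    return sum(f.units * f.distance_km * rate_per_unit_km for f in flows)


# Two customers sending the same total volume (1000 units).
local_user = [Flow(units=1000, distance_km=50)]
spread_user = [Flow(units=10, distance_km=5000) for _ in range(100)]

if __name__ == "__main__":
    for name, flows in (("local", local_user), ("spread", spread_user)):
        print(name, flat_charge(flows), time_distance_charge(flows))
```

Under the flat model both customers pay the same amount. Under the time and
distance model the customer whose traffic is spread over many distant
destinations pays far more, which is the compensation the telephone model
provides and the Internet model lacks.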

Possible solutions for N squared networks

It is interesting to note that in the early days of the telephone industry
there was a similar N squared problem. When telephone systems were first
deployed there was a myriad of competing companies deploying independent
telephone lines and networks. As a result it was quite common to have
several different telephones on an office desk – each connected to a
different telephone company.

This was clearly not scalable. So government regulators were convinced that
the only rational solution was a “natural” monopoly and as a consequence the
famous Bell telephone system was created.

But even this monopoly network ran into N squared scaling problems with the
use of human operators to make all telephone connections. At one time it
was forecast that a large percentage of the working population would have to
become telephone operators in order to maintain a functioning telephone
system. However electrical switches were invented which largely eliminated
the need for so many telephone operators.

Internet service providers today are probably at the early stages of dealing
with an N squared problem comparable to that of the early days of the
telephony industry. Already many network managers are complaining of the huge
capital costs that they are incurring because of the dramatic growth in
Internet traffic and the number of users. This is most likely the first
symptom of the N squared phenomenon. As an example, the following e-mail
exchange by Mike O’Dell, Chief Scientist for UUnet, on the NANOG list further
illustrates the challenge of the N squared phenomenon:

“…for UUNET's network to handle the 100% increase in gigabit/sec offered
load over 12 months, the gigabits/sec-route-miles capacity of the network
must increase 100% about every 4 months…. The deep intuition about network
growth dynamics developed over the years with voice networks simply does not
yield workable results when applied to very large data networks which
exhibit huge dynamic ranges of traffic slosh and the astounding doubling of
offered load every year. (and this is still the case even given how few
people currently enjoy "broadband" access)”
http://www.interesting-people.org/200011/0058.html
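
The arithmetic implied by the two doubling periods in this quotation can be
worked through in a few lines. This is a sketch that takes the quoted figures
at face value (load doubling every 12 months, capacity doubling every 4
months); the function name is hypothetical.

```python
import math


def annual_growth_factor(doubling_period_months: float) -> float:
    """Growth factor over 12 months for a quantity that doubles
    every doubling_period_months months."""
    return 2.0 ** (12.0 / doubling_period_months)


# Figures taken from the quotation above.
load_factor = annual_growth_factor(12)      # offered load, gigabits/sec
capacity_factor = annual_growth_factor(4)   # gigabits/sec-route-miles

# Exponent k such that capacity grows as load ** k over the same period.
exponent = math.log(capacity_factor) / math.log(load_factor)

if __name__ == "__main__":
    print("load grows by a factor of", load_factor, "per year")
    print("capacity grows by a factor of", capacity_factor, "per year")
    print("capacity scales roughly as load to the power", round(exponent, 2))
```

On these figures the required capacity grows about eight times in a year
while the offered load only doubles. In other words capacity must rise much
faster than linearly with load, which is the behaviour this paper attributes
to the N squared phenomenon.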

The fundamental problem in today’s networks is that the cost of the N
squared phenomenon is not being passed on to customers. As mentioned
previously, as long as there is an increase in the number of simultaneous
connections, the required network capacity will continue to grow even if the
number of customers and the offered load remain constant.

Accordingly there are a number of possible solutions where the costs of N
squared might be passed on directly to consumers:

1. Develop billing systems that take into account the N squared costs. One
possible solution is to develop a metering concept for the Internet, similar
to that of the telephone network, where customers are billed on the total
number of packets, the destination of the packets and the total number of
simultaneous sessions. However implementing this solution with today’s
technology and traffic volumes would seem problematic.

2. Assume that new optical network technologies will drop in cost faster
than the growth of N squared. This approach seems problematic for two
reasons. First, although optical networks have unquestionably reduced the
per bit cost of delivering Internet traffic, this reduction still pales
against the juggernaut of N squared, particularly for large N. Secondly, the
inverse economies of scale with large N mean that a competitor with N-1
customers can purchase the same optical technology and deliver a network at
lower cost.

3. Develop hierarchy based applications for e-mail and other services. This
also seems problematic given the history of the Internet and the development
of an increasing number of peer to peer applications.
Another aspect of this approach would be to build “walled gardens” and limit
users to services as deemed valuable by the service provider.

4. Develop new network topologies based on the direct peering
interconnection model where the costs of the N squared phenomenon can be
passed on directly to the consumer.

It would seem that the first three possible solutions would be impractical,
although undoubtedly there will be attempts to build networks based on those
concepts. Perhaps there is a more rational and scalable approach that is
consistent with the very openness and non-proprietary nature of the
Internet.

An Alternate Network Topology

There is one possible network architecture that might allow the costs of N
squared to be passed on directly to the consumer without special billing
mechanisms. In this network architecture the customer is responsible for
building the connections to support the N squared growth. Rather than
buying an “Internet” service per se from a “network cloud”, the customer
instead purchases a number of dedicated links radiating out from the
customer to a number of Internet exchange points where they interconnect
with other like-minded customers. In this case if there is an increased
traffic load due to N squared phenomena the “customer” rather than the
Internet service provider is responsible for building out new connections to
other Internet exchange points. Rather than looking like a traditional
snowflake the network starts to look like a mesh of interconnecting
radiating stars.

In addition with customer owned pipes the N squared mesh can be collapsed to
a single point. Rather than a carrier building out a network cloud to the
customer and assuming the responsibility of carrying the N Squared traffic,
it may be economically more efficient for the customer to purchase long
pipes to a single point for the interchange of traffic.
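
The saving from collapsing the mesh can be illustrated by counting links. The
sketch below assumes, purely for illustration, that a full mesh needs one
link per pair of customers and that the alternative needs one customer owned
pipe per customer per exchange point; the function names are hypothetical.

```python
def full_mesh_links(n: int) -> int:
    """Links needed to connect n customers directly to one another."""
    return n * (n - 1) // 2


def exchange_point_links(n: int, exchange_points: int = 1) -> int:
    """Links needed when each of n customers runs one pipe to each
    of the given number of exchange points."""
    return n * exchange_points


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_links(n), exchange_point_links(n),
              exchange_point_links(n, exchange_points=3))
```

The mesh grows with the square of the number of customers, while the
radiating pipes grow linearly, and each customer decides how many exchange
points it is worth paying to reach.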

In essence this network architecture is already in place with smaller
Internet service providers who interconnect to each other at common peering
points. In this case the “customer” is the ISP. Generally these ISPs do
not charge for the exchange of traffic between themselves, but in most cases
they still have to purchase Internet “transit” service from a major ISP to
interconnect to the global Internet. It has been demonstrated in an
excellent paper by William Norton http://www.nanog.org/mtg-0010/tree.doc
that this network architecture can dramatically reduce Internet transit fees
from a larger upstream provider. For some time the demise of many of the
smaller ISPs has been predicted. But perhaps their continued survival is a
classic vindication of the inverse economies of scale with N squared growth
in the Internet.
And as William Norton has noted in his paper “the amount of traffic
exchanged with a peer is less important than the amount of traffic
“expected” to be exchanged in the future.”

Although smaller ISPs should be able to undercut the costs of the larger
ISPs, they in turn will suffer cost pressures from their larger customers
who can implement similar solutions for their network connectivity. The
question then is how far down in size this architectural model will scale.
It seems to work well for “customers” who are ISPs. Will it scale for large
enterprise customers? Will it scale for individual homeowners?

Interestingly this architectural approach also seems to be more in line with
some new ideas in “natural” networks. A recent article in the New York
Times http://www.nytimes.com/2000/12/26/science/26WEBS.html reports that many
natural networks, such as those of molecules in a cell, of species in an
ecosystem, and of people in a social group, seem to organize themselves so
that most nodes have very few links, and a tiny number of nodes, called hubs,
have many links.

The most fascinating aspect of this architectural approach is the striking
resemblance to the neuron architecture of the brain. The neurons in the
brain have long tentacles called axons which interconnect to neurons and
other cells throughout the brain and the human body. The neurons are also
covered with much smaller “dendrites” which in most cases bring data
signals to the neuron. Has mother nature already discovered the most
efficient network architecture for distributed intelligence? And if so do
concepts of self aware networks and networked artificial intelligence enter
into the picture? What impact on network design and engineering will there
be if networks radiate out from the intelligent component of the network,
which today is the server at the edge, rather than the switch in the middle?
What will happen if networks are built on the same lines as the direction of
application flows?

Dr. Larry Smarr in a recent article in the New York Times has also speculated
on these very possibilities - The Soul of the Ultimate Machine -
http://www.nytimes.com/2000/12/10/technology/10SMAR.html

CANARIE is also leading an active research program on these issues with a
concept called customer empowered networks. See
http://www.canet3.net/library/presentations/CAnet4DesignDocument-Sept00.ppt
for more details. As a result of this research CANARIE is proposing the
deployment of an optical network where customer controlled wavelengths and
dark fiber are routed by the customers at the edge of the network to support
massive direct peering. Leading edge optical technology in this case is
required not to support bandwidth, but rather connectivity.

In this architectural approach of radiating wavelengths, the traditional
network requirements for restoration and protection are not necessary. The
loss of a single radiating arm is not catastrophic as in a traditional
hierarchical network. As a consequence simpler optical network
architectures can be deployed.

It is important to note that “customer controlled” wavelengths does not mean
that the customer deploys their own optical network. Instead it is expected
that customers will purchase their wavelengths and fiber from commercial
carriers and other suppliers. However, rather than the carrier offering a
managed service, the customer controls the routing of the wavelength to an
Internet exchange point of their choice.

In this proposed network architecture the customers will initially be the
GigaPOPs located in each province. But over time it is
hoped that customers will be individual universities and ultimately
individual workstations on the campus. The proposed CA*net 4 network will
give Canadian researchers and industry partners the ability to study
alternative approaches to the N squared phenomenon and develop network
product solutions that might enable the large deployment of scalable
networks. Already in pursuit of that objective CANARIE is leading a
research effort to develop a new version of the Border Gateway Protocol
(BGP) called Optical BGP (OBGP)
http://www.canet3.net/library/papers/OpticalBGPNetworks.doc to enable
customers at the edge to set up and tear down their own wavelengths in order
to enable massive direct peering with as many like-minded networks as
possible.

This concept of customer owned wavelengths and edge radiating optical
networks also opens areas of research in “optical grids” where large high
end applications, such as in high energy physics, astronomy, etc., establish
their own set of interconnected wavelength grids. It is also possible to
conceive of wavelengths as programming “objects” with inheritance and classes
that can be treated as standard components of a middleware toolkit –
sometimes referred to as “object oriented networking (OON)”. The
construction of large planetary computers of thousands of distributed
processors interconnected by wavelengths might also be conceivable.

Impact on the Commercial Internet

Clearly large commercial Internet service providers will face some serious
challenges if the arguments presented here prove true. On the other
hand, carriers that do not sell Internet service, but instead only sell
pipes, will be clear beneficiaries. Operators of carrier neutral Internet
exchange facilities should also benefit from N squared growth in the Internet.

But the reality is that large ISPs will always be needed to provide global
Internet connectivity. It would be difficult to imagine every small ISP and
large enterprise establishing a massive web of radiating bandwidth pipes to
all other similar organizations located around the world.

One possible hybrid model is that as community based fiber networks become
more prevalent small community based ISPs could offer Internet services
locally to the community as they would be less susceptible to N squared
costs. ISPs and large commercial enterprises usually host most of the
common peer to peer application servers that generate N squared traffic such
as e-mail, web hosting, caching, etc. As such local fiber or wireless
community networks connected to these servers can be deployed at relatively
low cost. The community based ISPs could then purchase dedicated wavelengths
to a number of Internet exchange points within the community and elsewhere
in the region or internationally. Ultimately they would still have to
purchase some volume of transit service from a large global ISP. In this
case, the large global ISP could bill not only for traffic volume, but for
topology costs as well.

In some ways the situation is analogous to electrical power systems. For
small customers electrical utilities only bill based on the consumed
amperage as this gives a fair approximation of the power consumption. For
large consumers of power such as factories, utilities bill not only for the
amperage used but also for the change in phase of the delivered power.

Perhaps Internet usage could also be calculated based on some complex number
made up of packets transmitted and average “spread” in the number and
distance of the packet destination.
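
As a rough sketch of what such a complex usage figure might look like, the
example below puts the packet count in the real part and a “spread” term in
the imaginary part, by analogy with the electrical case. The definition of
spread (number of destinations multiplied by mean distance), the rates and
the function names are all assumptions made for illustration; the paper does
not define them.

```python
def usage_metric(packets: float, destinations: int,
                 mean_distance_km: float) -> complex:
    """Hypothetical usage figure: volume in the real part,
    spread (destinations x mean distance) in the imaginary part."""
    spread = destinations * mean_distance_km
    return complex(packets, spread)


def bill(metric: complex, volume_rate: float, spread_rate: float) -> float:
    """Charge each component of the usage figure at its own rate."""
    return metric.real * volume_rate + metric.imag * spread_rate


if __name__ == "__main__":
    # Same packet volume, very different spread.
    local = usage_metric(packets=1_000_000, destinations=5, mean_distance_km=100)
    wide = usage_metric(packets=1_000_000, destinations=5000, mean_distance_km=3000)
    for name, metric in (("local", local), ("wide", wide)):
        print(name, metric, bill(metric, volume_rate=1e-5, spread_rate=1e-6))
```

Two customers with identical packet counts then receive different bills, with
the difference coming entirely from the spread component.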

Large global ISPs will not disappear. At some point there will be an economic
balance between a smaller ISP or organization owning wavelengths to an IX
point versus the costs of operating a wide geographic network. As the cost
of optical technology declines with an increasing number of wavelengths per
fiber, presumably over time the economics will tilt in favour of the smaller
organization.

Further Areas of Research

The N squared phenomenon is not new. It has been a basic feature of all
networks since the first two telephones. However with traditional
telephone networks it was never a serious issue because of the large
locality of traffic, hierarchical applications and the high multiplexing
ratios that were possible on such networks. But the Internet is different.
And network architectures that were designed for the good old telephone
simply may not scale for a fundamentally new type of traffic and a
fundamental change in the way we communicate.

Further mathematical research on, and empirical data about, the growth of
the N squared phenomenon are needed. How quickly are Internet
backbones experiencing N squared traffic growth? Will new applications
enhance N squared Internet traffic or decrease it? Will broadband services
to the home with DSL, cable modems and higher exacerbate N squared and the
self similar nature of Internet traffic? Will we have to develop network
topologies and protocols so that eventually every network device will be
able to set up direct BGP peering with other devices and networks?

References

MOLL89 Molloy, M., Fundamentals of Performance Modeling. New York:
MacMillan, 1989

LELA94 Leland W., Taqqu, M., Willinger, W., and Wilson D. “On the Self
similar Nature of Ethernet Traffic”, Proceedings of SIGCOMM 93, September
1993

PAXS95 Paxson, V. and Floyd, S. “Wide Area Traffic: The Failure of Poisson
Modeling” IEEE/ACM Transactions on Networking, June 1995

Acknowledgements

I would particularly like to thank the following people for their comments
and contributions to this discussion piece: Andrew Odlyzko, John Bourne, Ken
Hayward, Francois Menard, Charlie Catlett, Gary Finley, Jim Yuan and Ben
Bacque.

The author would also like to thank Industry Canada and the Government of
Canada. This paper would not have been possible without their generous
support for
advanced Internet organizations like CANARIE.

-------------------------------------
To subscribe or unsubscribe to the CANARIE-NEWS list please send e-mail to:

majordomo@canarie.ca

In the body of the e-mail:

subscribe testnet
end

-------------------------------------

These news items and comments are mine alone and do not necessarily reflect
those of the CANARIE board or management.

Bill St. Arnaud
Senior Director Network Projects
CANARIE
bill.st.arnaud@canarie.ca
+1 613 785-0426

