IP Traffic and Quality of Service Management in Wired and Wireless Networks.

Research report
© 2003 Anton Vinokurov, anton@netams.com


Table of contents:

1. Introduction
2. Present tools
3. What was done
4. How it works together
5. Future work
6. Conclusion
7. References

In this document I present some results of my research in IP traffic management and point out some problems in this area that are still unsolved. I start with a short crash course on the subject, then show what has already been done in industry and science, what I have done myself and what I am doing now. Finally, I describe what should be done next.

1. Introduction

According to [1], information is knowledge traveling between endpoints (humans, computers) via networks (wire, air, fiber). Human-perceptible data has to be converted into digital form and sent by some device over a route. The best-known network is the Internet, and the protocol (convention) used for data transmission there is IP. Traffic is the set of many simultaneous endpoint-to-endpoint conversations observed at every network node: router, switch, PC. Because most networks are stochastic and unpredictable by nature, conversations (flows) compete strongly for the various network resources (bandwidth, delay). Any kind of traffic could potentially dominate the others, which is probably not what the end user wants. There should be a way to manage flows according to their role in the overall traffic, and to offer better forwarding quality and services to some of them. Another task is to give the administrator an exact, sharp and smart way to see what is going on in the network right now, to reconfigure it, and to count flows and traffic for further analysis, planning and billing. Wired networks are well studied: a lot of theoretical and practical experience exists and tools have been developed. Wireless networks (especially high-speed ones) are only making their first steps, and a lot of research and development remains to be done to improve their manageability.

It is a well-known fact that personal computer performance is constantly increasing. The amount of data exchanged between endpoints also grows dramatically. At the same time, the throughput of communication channels grows more slowly, and has to be shared among all users of the network. Every network consists of media over which data is moved, and devices responsible for routing logic, signal amplification and retransmission. Every large network, such as the Internet, consists of high-speed and low-speed paths, and there are potentially bottlenecks on the way between two endpoints. At a bottleneck, the amount of data to be transferred is higher than the path can carry because of media or processing speed limitations. Some data packets will be dropped and lost, some delayed or delivered out of order. For legacy applications such as FTP file transfer and e-mail this is not an issue, but for modern technologies like Voice over IP [2], or for mission-critical data such as banking transactions or online ticket booking, a management technology is required.
Recent developments in silicon have resulted in high-speed CPUs. Most of the processing can now be done in relatively cheap software routers based on conventional personal computers. The burst of popularity of the Linux [3] and FreeBSD [4] operating systems is an example. A modern Pentium 4 based PC with two network adapters can route hundreds of megabits per second of IP traffic for the price of a hardware router with two 2 Mbit/s serial ports and a 40 MHz processor on board. The next advantage of PC routers is the ability to run, for free, any of the thousands of applications developed by the UNIX community. One can easily write one's own tool and integrate it into the routing platform. Because the routing process is implemented in software, traffic management and QoS management tools can be tightly coupled with the existing packet switching path. The UNIX operating system gives a high level of control and troubleshooting, and a lot of statistical data and simulation results can be used in further development and research.

There are a lot of management products on the market. All of them are expensive. Their algorithms are hard-coded and hidden: nobody knows how they actually work. It is practically impossible to extend the functionality or add a new feature. For example, to manage traffic and QoS with the CiscoWorks product suite you first have to buy a Cisco [5] hardware device. With Allot [6] or Packeteer solutions you can use any routing or switching device, but you have to buy separate hardware just for management purposes. The SNMP-based HP OpenView, SNMPc and UCD-SNMP software products are mainly for monitoring, and a lot of manual work is required to give them enforcement ability.

Some tools have been developed for IP traffic monitoring, accounting and filtering on UNIX-based routers. Most of them are based on the firewalling mechanisms provided by the UNIX kernel itself. Traffic data is collected in kernel or user space and then stored in a plain text file or a database for further analysis. Several billing products rely on external data acquisition and processing. Freeware tools are weak in interface and manageability; moreover, they are suited only for small networks with little traffic and modest configuration needs. None of the known commercial tools with a web interface comes with source code, so they cannot be improved by the user. The design of UNIX tools is usually inaccurate and undependable, with no redundancy or scalability.

Quality of Service management tools are practically unknown. There are simple implementations of various queuing techniques in Linux, and "dummynet" in FreeBSD. The most advanced product is the ALTQ management tool for BSD kernels [7]. In all of them QoS management is separated from traffic management itself (accounting, filtering, policing and access control), and no integrated freeware product exists. At the same time, an "integrated" tool is a good idea, since it is reasonable to keep all traffic management information about subnets or hosts (rules, limits, filters) in the same place as the QoS management information (allowed bandwidth and delay mark).

Ethernet and IP tend to grow in wired networks, and the same holds for wireless ones. Practically every new mass-market wireless data device has an Ethernet interface and can easily be connected to the wired part of the network. The over-the-air nature of wireless communications requires strong data encryption, recovery and reliability mechanisms that are much more complex than for wired applications. Bandwidth is more expensive there, and applications are more demanding of the quality of the data transfer path provided. That is why traffic and QoS management for wireless communications is a new, interesting and complicated area of research.

2. Present tools

A lot of bandwidth management, queue management and accounting tools exist. One can easily find many freeware projects by searching the sourceforge.net web site. Here I would like to discuss some of the successful tools.

ipfw/dummynet is a part of the FreeBSD operating system kernel. Originally made as a firewalling mechanism, ipfw quickly became a basic platform for further packet processing management, e.g. divert sockets, forwarding and traffic shaping. Generally speaking, ipfw is one of the points on the Ethernet and IP packet path inside the UNIX kernel (fig.1). It consists of a kernel module (static or loadable), ipfw.ko, and a user-space configuration utility, /sbin/ipfw. They interact with each other via a raw socket. The ipfw subsystem is configured by a set of "rules", which the user-space utility reads one by one, parses, packs into "messages" and sends to the kernel. For every packet processed by the kernel, a table lookup is done and, if a rule matches, the corresponding action is performed (e.g. pass, drop, count, forward etc.).

       ^     to upper layers   V
       |                       |
       +----------->-----------+
       ^                       V
   [ip_input]              [ip_output]        net.inet.ip.fw.enable=1
       |                       |
       ^                       V
 [ether_demux]    [ether_output_frame]          net.link.ether.ipfw=1
       |                       |
       +-->--[bdg_forward]-->--+         net.link.ether.bridge_ipfw=1
       ^                       V
       |      to devices       |        
Fig.1. Packet path with ipfw subsystem (taken from FreeBSD ipfw man page)
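
The rule lookup described above can be illustrated with a small first-match sketch in Python. It is not ipfw code: the Rule class, its fields and the evaluate function are hypothetical names, and real ipfw rules match many more fields than source and destination address. The sketch assumes the behaviour documented for ipfw that a "count" rule only updates its counters and lets the search continue, while "allow" and "deny" end the search.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class Rule:
    number: int
    action: str                 # "allow", "deny" or "count"
    src: str = "0.0.0.0/0"      # source network in CIDR form
    dst: str = "0.0.0.0/0"      # destination network in CIDR form
    packet_count: int = 0
    byte_count: int = 0

    def matches(self, src, dst):
        return (ip_address(src) in ip_network(self.src)
                and ip_address(dst) in ip_network(self.dst))


def evaluate(rules, src, dst, length, default="deny"):
    """Walk the rules in numeric order and return the final action."""
    for rule in sorted(rules, key=lambda r: r.number):
        if not rule.matches(src, dst):
            continue
        rule.packet_count += 1
        rule.byte_count += length
        if rule.action != "count":   # "count" only accounts, search goes on
            return rule.action
    return default                   # nothing terminal matched
```

With a rule set such as Rule(100, "count", dst="10.0.0.5"), Rule(200, "deny", src="192.168.1.0/24"), Rule(300, "allow"), a packet to 10.0.0.5 is counted by rule 100 and then allowed or denied by the later rules.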

Newer versions of the ipfw subsystem (ipfw2) introduce several attractive features such as keepalives, handling of non-IPv4 packets, Ethernet-level filtering and others. The ipfw subsystem as a whole makes it possible to build a flexible firewall or packet processing engine using a well-defined and predictable set of rules. Dummynet is the part of ipfw that provides simple queuing and shaping disciplines. According to an ipfw rule, any matching packet can be placed into a FIFO pipe or a WF2Q+ queue with preset bandwidth, delay and packet drop probability values. It is also possible to specify the queue size as well as Random Early Detection (RED) parameters, which are useful for controlling TCP traffic bursts. While ipfw/dummynet is very popular in the BSD community, no Graphical User Interface exists for ipfw, and for a large network with a complex topology and policy the management task can become very complex. The traditional UNIX way of using a small set of scripts does not work here, because one big database of users, policies, rules, traffic statistics and events is needed; all of this has to be maintained manually or written specially for the existing ipfw technology.

Linux tc (Traffic Control) looks much like FreeBSD's dummynet, but it is implemented as an additional kernel module and has its roots in the routing module. Many queuing disciplines exist, and they are controlled in much the same way as in FreeBSD. This toolset also lacks manageability in complex network scenarios.

ALTQ (ALTernate Queueing) is a set of kernel patches and userspace utilities for BSD-like kernels. As a result, ALTQ works under the NetBSD, OpenBSD and FreeBSD operating systems. While ipfw requires the IP or Ethernet forwarding path to be intercepted, ALTQ works at the interface level and requires the interface drivers to be modified (fig.2). Fortunately, this can be done by replacing a limited set of macros with new ones:

**old-style**
int
ether_output(ifp, m0, dst, rt0)
{
        ...
        s = splimp();
        /* the driver checks the queue and drops the packet itself */
        if (IF_QFULL(&ifp->if_snd)) {
                IF_DROP(&ifp->if_snd);
                splx(s);
                m_freem(m);
                return (ENOBUFS);
        }
        IF_ENQUEUE(&ifp->if_snd, m);
        ifp->if_obytes += m->m_pkthdr.len;
        if (m->m_flags & M_MCAST)
                ifp->if_omcasts++;
        if ((ifp->if_flags & IFF_OACTIVE) == 0)
                (*ifp->if_start)(ifp);
        splx(s);
        return (error);
}

**new-style**
int
ether_output(ifp, m0, dst, rt0)
{
        ...
        /* save flags and length before the mbuf is handed to the queue */
        mflags = m->m_flags;
        len = m->m_pkthdr.len;
        s = splimp();
        /* the queuing discipline decides whether to enqueue or drop */
        IFQ_ENQUEUE(&ifp->if_snd, m, &pktattr, error);
        if (error != 0) {
                splx(s);
                return (error);
        }
        ifp->if_obytes += len;
        if (mflags & M_MCAST)
                ifp->if_omcasts++;
        if ((ifp->if_flags & IFF_OACTIVE) == 0)
                (*ifp->if_start)(ifp);
        splx(s);
        return (error);
}
Fig.2. ALTQ replacement for drivers [8]


Because of tighter control over the output queues, bandwidth control can be performed with higher precision than in ipfw; more than ten queuing disciplines exist and one can implement one's own quite easily. In addition, many experimental features are included: RED with In/Out, Conditioner, Marker and so on. Generally speaking, ALTQ is meant more for research than for daily network administration. There is also no single management console or GUI. Currently, a significant part of the ALTQ code is used in the KAME IPv6 project and in the pf mechanism of the OpenBSD distribution. Although an effort has been made to port ALTQ to the upcoming FreeBSD 5.x tree [8], the prospects of this toolset are doubtful.

Typically a freshly installed FreeBSD or Linux distribution occupies about 1 gigabyte of disk space. It consists of the kernel, modules, user-space utilities, documentation and man pages, and sources (kernel only or the whole system). A binaries-only system without documentation occupies about 150 megabytes, and most of its utilities are rarely used. Efforts have been made to construct a smaller distribution that fits on a floppy disk or a small flash memory device. One of the most popular results is PicoBSD [9], which is based on FreeBSD 3.x. It consists of a stripped version of the UNIX kernel built without modules, and about twenty utilities such as ps, ssh, cat, sh, ping, ifconfig and others. All these utilities are statically compiled into one binary file using the crunchgen tool. A set of hard links is used to access the individual utilities. The whole system is packed into a single archive that fits on a standard 3.5" floppy. After startup this image is unpacked into a memory file system. Such a tiny distribution can run on Intel 386 or 486 hardware with at least 4 megabytes of RAM, making a flexible router with several interfaces at a small cost. Unfortunately, this project is winding down as the cost of flash devices falls dramatically. It is now preferable to have a bigger image (around ten megabytes) with advanced network functionality and more utilities included. This problem was partially solved in the proprietary "Dateliner DL1" appliance OS image described later.

Accounting of network traffic is a well-known task for the network administrator. Many packages have been developed to automate this process, but no universal and complete tool exists. The most common way is to use the counters provided by the operating system: a rather simple set of scripts periodically examines the kernel counters and updates a log file. This is how the ip_acct and ipCount utilities are built. Unfortunately, this approach gives little control over the accounting process in a big network. Another way is to store the accounting information in an SQL database, which makes further data processing possible. The trafshow and aaa+fw utilities work this way. Further improvement of such tools led to the development of the next generation of utilities. The ng_ipacct [15] tool works much like ip_acct but runs in kernel space using the netgraph layer, which significantly improves processing speed and reduces delays. Many utilities work by constructing and processing NetFlow-like [14] traffic accounting statistics (NeTraMet, flow-tools, EHNT). They are widely used in Cisco router environments where a large amount of information has to be processed.

From this brief review you can see that many software solutions useful for traffic management exist. However, most of them are still in development or no longer supported, and it is too hard to build a monolithic management system from such a broad variety of different products. This situation has pushed me to form my own vision of how it SHOULD look and to develop my own set of hardware and software solutions, described in the next section.

3. What was done

My goal is to create a cheap, PC-based router/bridge with Ethernet interfaces capable of both IP traffic accounting and filtering and of traffic management (queues and QoS policing). The first thing to consider is the operating system under which everything will work. The second is the management software, and the last is the hardware platform.
MS Windows is well suited for business, home and office applications and is very popular. But it was not designed as a packet processing OS. Although there is a lot of traffic management software on the market (CheckPoint Firewall NG, MS Internet Security and Acceleration Server etc.) and a DDK and SDK are available, it is hard to develop a freeware tool suitable for both research and practice without deep knowledge of the operating system internals. The task is further complicated by the absence of operating system and driver sources.
The Linux operating system is very popular and has full source code for both kernel and utilities. It is widely used as a server and software router OS because of its high stability and performance. However, it is oriented primarily towards end-user or server tasks, and its networking facilities are not perfect. The fuss around Linux is rather annoying, while the developer community still tends to think of it as a student project.
Real-time operating systems like LynxOS or VxWorks are fine from the performance, stability and compactness points of view. But their sources are expensive, the amount of software existing for these platforms is small, and it is doubtful that a successful project could be based on them.
Various BSD systems like FreeBSD, NetBSD and OpenBSD have proved their excellent stability and performance over many years. Based on the 4.4BSD-Lite2 system, they are free and distributed with full sources. The development model is sound and allows developers to stay in touch with the latest versions. The code is primarily network-oriented rather than user-oriented, and the number of NIC drivers, network utilities and packages is huge. In my research I usually use the latest -STABLE and -CURRENT branches of the FreeBSD operating system (interoperability with Linux and other BSDs is often possible).

My research activity started in 1997 with a small Pentium-133 based server running the FreeBSD 2.2.5 operating system, used for small network routing, web and e-mail services. The task was to account for per-user Internet traffic in a small department network (around 50 computers). At that time there was practically no such software, and I was forced to write my own tool. Its name is "ipCount". It was written in the C programming language. The CGI web interface was written in Perl, and it was my first experience with system programming and SQL.

ipCount uses the built-in accounting features provided by the OS ipfw utility. It simply puts a corresponding "ipfw count ip from any to A.B.C.D via XX" rule into the kernel chain, periodically obtains the counter values and processes them. All values are stored in an SQL database for reliability and for subsequent analysis. The web interface allows rules to be added and removed, and displays summary information for a selected period of time or IP address. This design has a lot of drawbacks. First, it was oriented only towards FreeBSD and only towards accounting: neither filtering nor porting to the Linux platform was possible. Security and performance were also poor, and no scalability was provided at all.

That is why, after two years of supporting ipCount, next-generation software was developed and named aaa+fw. The aaa+fw (Authentication, Authorization and Accounting plus Firewall) package was written in C++ using a threaded model. This resulted in design simplicity as well as platform independence: aaa+fw can run on both FreeBSD and Linux. The packet processing engine uses the "ipfw divert" technology available in the latest FreeBSD kernels. The Linux version uses the "IPQ" library (tightly connected with the iptables utility). The package has a built-in scheduler, a telnet server for in-band management, an HTML page generator, and quota and MAC address control features. The traffic data itself still resides in an SQL database (the MySQL server was chosen because of its popularity and simplicity). Compared with ipCount, all hosts can be logically combined into "groups", processing speed is higher, and many useful features such as "show me current traffic activity" were added. Unfortunately, there was neither traffic filtering nor sub-protocol (TCP, ICMP etc.) processing. Because of internal design limitations, it was decided to write a new tool from scratch instead of rewriting aaa+fw.
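
The counter polling used by ipCount can be sketched in a few lines of Python. This is not the original C code: the function names are hypothetical, and the sketch assumes the text was captured from "ipfw show", whose lines carry the rule number, the packet counter, the byte counter and then the rule body. Only "count ip from any to ADDRESS" rules are considered, and a counter that went backwards is treated as having been reset.

```python
def parse_counters(show_output):
    """Map destination address -> byte counter for 'count ... to ADDR' rules."""
    counters = {}
    for line in show_output.splitlines():
        fields = line.split()
        # expected layout: number packets bytes count ip from any to ADDR [via IF]
        if len(fields) < 9 or fields[3] != "count" or fields[7] != "to":
            continue
        counters[fields[8]] = int(fields[2])
    return counters


def traffic_delta(previous, current):
    """Bytes seen per address since the previous poll."""
    deltas = {}
    for addr, value in current.items():
        old = previous.get(addr, 0)
        # a smaller value means the kernel counter was zeroed in between
        deltas[addr] = value - old if value >= old else value
    return deltas
```

A polling loop would call parse_counters on each new snapshot, pass the previous and current dictionaries to traffic_delta, and write the resulting per-address byte counts to the database.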

The design of the new software took around half a year, and at the beginning of 2002 the NeTAMS package was first presented [10]. The name stands for Network Traffic Accounting and Management Software. Currently it is my only traffic management tool that is supported and still in development, although it has not been decided yet whether it will be used as a starting point for further development. NeTAMS is a multithreaded UNIX application (daemon) written in C++. The whole package includes the netams daemon itself, the netamsctl command-line utility for executing single commands, the flowprobe and ipfw2netflow utilities for external flow data acquisition, a Perl API and several CGI scripts for web management, logins and statistics presentation. Startup scripts, example configuration files and documentation are also included.

Several basic "building blocks" used in NeTAMS design are listed below:
  • OID is the Object Identifier, a unique key of every object, used for indexing and access in the SQL tables.
  • UNIT is the primary management object. It represents one of the typical network objects: host (single IP address), subnet (network address and mask), cluster (several IP addresses) or group (a collection of units of any type, including nested groups).
  • POLICY defines a rule used in IP packet processing. Any field in the packet header can be compared with a given policy rule according to the policy target definition: IP address, protocol, port, port group, presence of a given IP address in a list (or file) and others.
  • SERVICE is a logically separate single thread of program execution. It does its own well-defined job and works with the common internal program structures.
  • DATA-SOURCE is a service responsible for bringing traffic data into the program.
  • STORAGE is responsible for storing traffic information, as well as packet monitoring or supplementary data, in an SQL or HASH database.
NeTAMS starts by opening and processing the specified configuration file. During processing, a number of parameters are set (e.g. debug level, the list of administrators allowed to manage the NeTAMS instance, etc.). Then the services are created and configured one by one as specified (fig.3). The main service, processor, holds all unit and policy definitions as well as the default policies and storage numbers.



Fig.3. NeTAMS framework
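
The UNIT and POLICY building blocks listed above can be pictured with the following Python sketch. NeTAMS itself is written in C++ and its real class layout is not shown in this report, so every class, field and method name here is an assumption made for illustration only; the policy is reduced to a protocol and a set of ports.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class Unit:
    oid: int
    name: str
    kind: str                                       # "host", "subnet", "cluster" or "group"
    addresses: list = field(default_factory=list)   # IP strings, or CIDR strings for a subnet
    members: list = field(default_factory=list)     # nested Unit objects for a group

    def contains(self, addr):
        ip = ip_address(addr)
        if self.kind == "group":                    # groups may be nested
            return any(member.contains(addr) for member in self.members)
        if self.kind == "subnet":
            return any(ip in ip_network(net) for net in self.addresses)
        return any(ip == ip_address(a) for a in self.addresses)


@dataclass
class Policy:
    oid: int
    name: str
    protocol: str = None            # e.g. "tcp"; None matches any protocol
    ports: frozenset = frozenset()  # empty set matches any port

    def matches(self, protocol, port):
        if self.protocol is not None and protocol != self.protocol:
            return False
        return not self.ports or port in self.ports
```

For example, a group built from a host unit and a subnet unit answers contains() for addresses of either member, and Policy(4, "smtp", "tcp", frozenset({25})) matches only TCP packets on port 25.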


After the configuration file has been processed, NeTAMS consists of:
  1. A set of data structures (typically joined into one-way linked lists): the units list, the policies list, PolicyData, the list of current connections, users, scheduled tasks and others.
  2. A number of threads of execution (services), each waiting for an event (sleeping) or blocked on a system call (network read, disk or terminal I/O).
The services share a common set of structures. Each structure is protected by a set of mutexes or rwlocks to manage simultaneous access. Great effort was made to prevent unnecessarily long locking in time-critical parts of the code. During most of the program's activity, only a small part of the data needs to be changed. The majority of the other data is practically static and can be accessed in a read-only, non-blocking manner.
Every policy defines a "matching rule", a set of conditions to be checked against every incoming IP packet. Each unit has its own set of associated "firewalling" and/or "accounting" policies grouped into PolicyData linked lists (fig.4). A PolicyData object consists of "pstat" structures, each containing accounting information for a given "unit-policy-period" combination. All data is accounted over a period of time that starts at a fixed moment and lasts until the current time. These periods start at the beginning of the current flow period (a value defined in the processor service, usually 5 minutes), of the current hour, day, week and month, and at the moment the unit was first created (total). A pstat has input and output byte counters (unsigned long long integers capable of storing up to 2^64 bytes) and the begin and end times of the period (time_t variables with 1 second accuracy).



Fig.4. Units, Policies and PolicyData structures hierarchy
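
The per-period counters can be sketched as follows. This is an illustration in Python, not the NeTAMS C++ code: the names Pstat, PolicyData, period_start and account are hypothetical, the six periods are the ones visible in the fig.5 listing, and a week is assumed to start on Monday (which agrees with the 25.08.2003 week start shown there). The flow and total counters simply start when the object is created.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


def period_start(kind, now):
    """Beginning of the calendar period of the given kind that contains now."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if kind == "hour":
        return now.replace(minute=0, second=0, microsecond=0)
    if kind == "day":
        return midnight
    if kind == "week":
        return midnight - timedelta(days=now.weekday())   # Monday
    if kind == "month":
        return midnight.replace(day=1)
    raise ValueError(kind)


@dataclass
class Pstat:
    start: datetime
    bytes_in: int = 0
    bytes_out: int = 0


class PolicyData:
    CALENDAR = ("hour", "day", "week", "month")

    def __init__(self, now):
        self.stats = {kind: Pstat(period_start(kind, now)) for kind in self.CALENDAR}
        self.stats["flow"] = Pstat(now)
        self.stats["total"] = Pstat(now)

    def account(self, now, bytes_in=0, bytes_out=0):
        """Add traffic and return the (kind, Pstat) pairs whose period just ended."""
        closed = []
        for kind in self.CALENDAR:
            start = period_start(kind, now)
            if start != self.stats[kind].start:       # e.g. a new hour began
                closed.append((kind, self.stats[kind]))
                self.stats[kind] = Pstat(start)
        for stat in self.stats.values():
            stat.bytes_in += bytes_in
            stat.bytes_out += bytes_out
        return closed
```

The list returned by account() is what would be handed to the storage layer: counters of periods that have just ended and must be saved before they are reset.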


An example of stored accounting data (output of "show list full name XXX" command) is shown here (fig.5):

>show list full name proxy-cust
OID: 051F16 Name: proxy-cust Type: group    Parent: <>        
 SYST policy is not set
   FW policy list is empty
 ACCT policy: OID    NAME       CHECK        MATCH       
              14643C all-ip     84579285     84579285    
 26.08.2003 13:22:02 flow       in: 41696        out: 1619        
 23.03.2002 17:19:16 total      in: 164358578606 out: 39305378779 
 01.08.2003 00:00:00 month      in: 7073602536   out: 1403639347  
 25.08.2003 00:00:00 week       in: 595480552    out: 124961629   
 26.08.2003 00:00:00 day        in: 115408356    out: 30581932    
 26.08.2003 13:00:00 hour       in: 14849814     out: 2888387     
              141949 smtp       84579285     1088137     
 --.--.---- --:--:-- flow       in: 0            out: 0           
 02.09.2002 21:23:28 total      in: 145505443    out: 5009268585  
 01.08.2003 00:00:00 month      in: 5770783      out: 225938212   
 25.08.2003 00:00:00 week       in: 582363       out: 24655848    
 26.08.2003 00:00:00 day        in: 110095       out: 5015169     
 26.08.2003 13:00:00 hour       in: 12036        out: 448587      
              142AE8 pop3       84579285     2090041     
 --.--.---- --:--:-- flow       in: 0            out: 0           
 02.09.2002 21:23:28 total      in: 7669368341   out: 225519089   
 01.08.2003 00:00:00 month      in: 329764801    out: 9519170     
 25.08.2003 00:00:00 week       in: 29606393     out: 1050438     
 26.08.2003 00:00:00 day        in: 11168438     out: 287353      
 26.08.2003 13:00:00 hour       in: 241371       out: 21919       
              146255 web        84579285     55496108    
 26.08.2003 13:21:57 flow       in: 84178        out: 26858       
 02.09.2002 21:23:28 total      in: 90304075799  out: 12285348758 
 01.08.2003 00:00:00 month      in: 5713133303   out: 769209459   
 25.08.2003 00:00:00 week       in: 509450324    out: 68697239    
 26.08.2003 00:00:00 day        in: 91729261     out: 17422352    
 26.08.2003 13:00:00 hour       in: 8893292      out: 1657179     
Fig.5. Output of the "show list full" NeTAMS command


Every "lookup-time" seconds (a parameter specified in the configuration file, typically 5 seconds) the processor service looks through all units in the units list and checks all accounting PolicyData structures. For each structure whose flow has expired or whose time period has changed (e.g. a new day begins), all data is saved into new "storage messages" and sent to the processing queue. Messages are the primary way to move accounting information to and from the storage database. The reason for introducing messages is performance. The previous software (ipCount and aaa+fw), like many other accounting utilities, uses a synchronous model of storing data. Each packet processing loop can result in a flow flush followed by a database write cycle. This operation is time-consuming and may block the whole loop for a significant amount of time, which leads to high packet delay and jitter. The newer NeTAMS design is oriented towards performance: all data to be flushed is organized into "messages" and put onto special queues. This is a relatively fast process. Packet processing then continues without spending further time or processor resources. All "save" operations are instead performed asynchronously by other threads (storage services) (fig.6):



Fig.6. NeTAMS synchronous and asynchronous storage models


Although the actual database save may require time-expensive disk I/O or process switching, only the storage thread is temporarily blocked, and all new write messages are stored in the FIFO queue without delaying the main packet loop.
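
The asynchronous storage model can be demonstrated with a producer/consumer sketch in Python. It only mirrors the idea described above; the function names are hypothetical, a plain list stands in for the SQL database, and a None sentinel (an assumption of this sketch) is used to stop the storage thread.

```python
import queue
import threading


def storage_service(messages, database):
    """Storage thread: take messages off the FIFO and do the slow write."""
    while True:
        message = messages.get()      # blocks only this thread
        if message is None:           # sentinel: shut down
            break
        database.append(message)      # stands in for an SQL INSERT


def run_demo(flushes):
    """Feed flush records through the queue and return what was stored."""
    messages = queue.Queue()          # FIFO between packet loop and storage
    database = []
    worker = threading.Thread(target=storage_service, args=(messages, database))
    worker.start()
    for record in flushes:            # packet loop: enqueue and carry on
        messages.put(record)
    messages.put(None)
    worker.join()
    return database
```

The packet loop pays only for the put() call; the write itself happens in the storage thread, in the same order in which the messages were queued.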
But how does data get into NeTAMS? There should be several (at least one) services of the "data-source" type. They are responsible for obtaining packets from the operating system, analyzing headers, incrementing byte counters, filtering, and returning packets back to the system. Three types of data sources are implemented:
  1. An IP packet obtained from the operating system firewall mechanisms. It can be the packet itself or a copy of it. In the first case, NeTAMS may modify the packet and decide whether or not to return it to the system. For a copy, only accounting is done while the original packet passes through the kernel untouched.
  2. An Ethernet (or, more generally, data-link layer) packet that arrives via a specified network interface and is captured using the well-known libpcap library.
  3. A UDP packet with Cisco NetFlow version 5 traffic statistics coming from a nearby Cisco router, or from a PC with flowprobe/ipfw2netflow or a similar tool installed.
Having several traffic source types gives additional flexibility to the network administrator. One can choose whether or not to filter a traffic flow. When filtering, any access control based on policy, quota, MAC address, authorization or something else can easily be implemented. There can be several data sources per NeTAMS process, obtaining data from different interfaces or from nearby routers. Each data-source thread processes incoming packets one by one in order to:
  1. Find a single unit or a group that matches the IP src/dst header fields.
  2. Find the list of matched policies within this unit, separately for filtering and for accounting purposes.
  3. Decide whether to filter (drop) the packet or pass it back to the kernel. This is useful only with the "ip-traffic" data source type.
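
The three steps above can be put together in a short Python sketch. It is deliberately simplified and rests on assumptions not stated in this report: units and policies are plain dictionaries with hypothetical keys, a unit is a single network, a policy is a protocol/port pair, and a matching firewall policy is taken to mean "drop".

```python
from ipaddress import ip_address, ip_network


def policy_matches(policy, packet):
    """A policy matches when its protocol and port are unset or equal."""
    return (policy["proto"] in (None, packet["proto"])
            and policy["port"] in (None, packet["port"]))


def process_packet(packet, units):
    """Account the packet and return "pass" or "drop"."""
    verdict = "pass"
    for unit in units:
        net = ip_network(unit["net"])
        # step 1: the unit matches if either address belongs to it
        if (ip_address(packet["src"]) not in net
                and ip_address(packet["dst"]) not in net):
            continue
        # step 2: update every matching accounting policy
        for policy in unit["acct"]:
            if policy_matches(policy, packet):
                policy["bytes"] += packet["length"]
        # step 3: assumed semantics, a matching firewall policy drops
        if any(policy_matches(p, packet) for p in unit["fw"]):
            verdict = "drop"
    return verdict
```

With a copy-based data source the returned verdict would simply be ignored, since only the accounting side effect matters there.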
Unit matching is rather simple and requires comparing the SRC and DST fields of the IP packet header with the corresponding unit data. Policy matching is more difficult because of the variety of possible policy types. It can be a simple IP protocol/port match as well as a more complex scenario such as a port range, a port list, or a match against a specific unit or a list of subnets from a specific prefix file. Early releases of the NeTAMS data-source services had a shared linked list of all defined units, and of all stored prefixes for policies of the "target prefix file XXX" type. This resulted in a linear comparison of every unit and every subnet, one by one, with the incoming IP header fields. For simple networks (fewer than 100 PCs) routed by a strong hardware platform running NeTAMS this was not an issue: the average system load was less than 10% even at rush hours. But for a larger network with 3000 units defined, each packet header was checked against all IP addresses twice (for both SRC and DST fields), resulting in high CPU load and greater processing delay.
This limitation was fixed by introducing a two-way linked list of four-level-deep binary search trees, which is a perfect storage structure for IP addresses and subnets. A search requires only four comparisons and binary shifts to build a linked list of matched units. This takes very few operations and dramatically increases processing performance, with a corresponding decrease in CPU load. Currently a NeTAMS process with 3000+ units on a router with a constant ~70 Mbit/s traffic stream and 2x1GHz Intel Xeon processors consumes only 5-8% of CPU power. The processing improvement is described in detail in [18].
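
The exact layout of the NeTAMS lookup tree is given in [18] and not in this report, so the following Python sketch shows only one plausible reading of "four levels, four comparisons and shifts": a tree with one level per octet of the IPv4 address. The class and function names are hypothetical, and the sketch accepts only octet-aligned prefixes (/8, /16, /24, /32).

```python
def octets(addr):
    """Split a dotted-quad string into four integers using shifts."""
    value = 0
    for part in addr.split("."):
        value = (value << 8) | int(part)
    return [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]


class UnitTree:
    def __init__(self):
        self.root = {"units": [], "children": {}}

    def insert(self, addr, prefix_len, unit):
        if prefix_len not in (8, 16, 24, 32):
            raise ValueError("sketch supports only octet-aligned prefixes")
        node = self.root
        for octet in octets(addr)[:prefix_len // 8]:
            node = node["children"].setdefault(octet, {"units": [], "children": {}})
        node["units"].append(unit)

    def lookup(self, addr):
        """Collect all units on the path; at most four steps per address."""
        matched, node = [], self.root
        for octet in octets(addr):
            node = node["children"].get(octet)
            if node is None:
                break
            matched.extend(node["units"])
        return matched
```

Whatever the number of units, a lookup touches at most four nodes, which is the property that replaces the linear scan over all units described above.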



Fig.7. Private network connected to upstream ISP
A typical small network diagram is shown here. The outbound (Internet) channel is provided by a LAN or DSL modem connected to the nearest ISP. There is a server acting as a firewall and NAT router, and possibly as a corporate mail, web proxy or file server. It is the single point of control for all network management tasks: authentication, access control, policing, accounting etc. It is very attractive to run this router under the free Linux or FreeBSD operating system, where the NeTAMS daemon can manage the majority of networking tasks. From the administrator's point of view the management process should be simple, open and self-evident. A web-based management console is preferable to one based on text files. Thanks to its open architecture, NeTAMS has two ways of presenting statistics for management. The first is the html service, which periodically creates static HTML pages with traffic statistics (fig.8). These pages (separate sub-trees for the administrator and for the users) can be accessed through a properly configured HTTP server. Some sort of access control and encryption is usually implemented (.htaccess, AuthType Basic, HTTP over SSL).
For other management tasks such as traffic statistics visualization, unit and policy management, access control and others, an Application Programming Interface (API) was made. Because practically everything can be done via the text-based TELNET-like interface, this API is a simple high-level gateway between the running NeTAMS daemon and a user-level management application, for example a Perl CGI script.



Fig.8. Statistics page automatically generated by the HTML service


Here is an example of a "show version" Common Gateway Interface application which uses the netams_api.pl interface:

#!/usr/bin/perl
#
# netams_example.cgi
#
# Here is an example of how NeTAMS <=> TCPIP <=> CGI <=> WWW works
# This script provides output of "show version" NeTAMS command
# You should provide a correct host/user/password combination to log in
# By commenting/uncommenting some lines, you can use a command-line version
# of this script.
#
# No responsibility and no support for this script are given!
# (c) 2002, Anton Vinokurov 
#-------------------------------------------------------------------------
# $Id: netams_example.cgi,v 1.2 2003/02/26 13:22:26 anton Exp $

use CGI; # comment this for command-line version
require "netams_api.pl";

$cgi=new CGI; # comment this for command-line version
print $cgi->header(-type=>'text/plain',-expires=>'now'); 

netams_login("localhost", 20001, "anton", "aaa");
netams_send("show version");
$result=netams_readline();

print "\n$result\n\n";
# for command-line version, uncomment next line and comment previous
# print "$result\n";
netams_logout();
Fig.9. Simple CGI script which uses NeTAMS API

A few additional services are included to simplify the administrator's life. Every user or user group can be assigned a fixed amount of traffic allowed for transfer, based on a period of time or a policy. When this amount is exceeded, a notification mail message can be sent. This behavior is much like the standard UNIX filesystem quota management, and the NeTAMS service is likewise named quota. All the quota metadata is stored in the SQL database and can be managed via the web interface (the Admintool CGI script included in the NeTAMS distribution). The other service is login, which makes it possible to grant or stop access based on an external authentication process (for example, a CGI script located at the gateway site). Access is granted for a fixed amount of time or until a given inactivity timeout expires. Strict binding of MAC (hardware) addresses to IP addresses and password management policies are implemented. All this information is also stored in an SQL table and is ready for management by external scripts. Other useful NeTAMS features are traffic snapshots, performance monitoring (for example, current packets per second and packet delay counters), and per-packet monitoring into a file or SQL table. These can be useful, for example, in debugging, performance analysis and profiling.
NeTAMS is an actively developed program distributed under the terms of the BSD license. It is included in the FreeBSD [11] Ports Collection as well as in the ASPLinux [12] distribution.
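The quota behavior described above (count traffic, then notify once when the allowance is exceeded) can be sketched as follows. This is an illustration only: the Quota record and the account function are hypothetical names, and the real service keeps its metadata in the SQL database rather than in memory.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    limit_bytes: int        # traffic allowed in the current period
    used_bytes: int = 0     # traffic accounted so far
    notified: bool = False  # whether the notification was already requested


def account(quota, packet_bytes):
    """Add one packet to the counter and report the resulting state."""
    quota.used_bytes += packet_bytes
    if quota.used_bytes <= quota.limit_bytes:
        return "ok"
    if not quota.notified:
        quota.notified = True
        return "notify"   # the caller would send the mail message here
    return "over"


quota = Quota(limit_bytes=1000)
print([account(quota, size) for size in (600, 600, 1)])  # ['ok', 'notify', 'over']
```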

Given such software technology for network management, a hardware platform is needed to run it. Several factors affect the platform selection:
  1. Reliability
  2. Cost
  3. Manageability
Standard PC-based systems are very attractive for cost reasons. One is not required to develop a hardware platform from scratch; it is much cheaper to select one of the thousands of devices available on the market. The well-known PC-compatible platform gives unique flexibility in component selection. The hardware box itself should be smaller than a conventional PC case and preferably rack- or wall-mountable. The main board should be in mATX or a smaller form factor, and there should be some space for additional components. For reliability, the number of fans should be reduced as much as possible, and rotating hard drives should be omitted as well.
For specific tasks, our network gateway should have some additional features on board, namely:
1. A voice gateway module with analog or digital telephone interfaces, to connect to the central router using VoIP technology.
2. An additional 10/100 Ethernet interface, or several switched interfaces, for a server computer or LAN connection.
After a careful analysis of existing pre-built solutions and barebones, the decision was made to build our own device from a set of common components. The device is named "Dateliner DL1" [13]. Its layout is shown in fig.10.



Fig.10. Dateliner DL1 system board layout


The device is packed into a shallow 19-inch rack-mount case 1.5 units (67 mm) high. It contains an EPIA mainboard made by VIA, which has a Mini-ITX form factor (17x17 cm) and a standard ATX power connector. The board also has a 533MHz x86-compatible CPU, 128MB of SDRAM memory, one PCI slot, one built-in 10/100 Ethernet port based on a Realtek chip, and other components usually found on an ordinary Pentium-class mainboard. As a storage device, a 32MB Flash-based IDE module is used because of its high reliability and low price. Finally, the device is equipped with a voice-over-IP gateway and an Ethernet switch providing simultaneous LAN and PSTN connectivity, and with a Flex-ATX power supply for the mainboard, IDE disk, gateway and switch modules. All these devices use the +12V or +5V voltages provided by a standard power supply.



Fig.11. Dateliner DL1 rack mountable device


All the connectors (except the RS232 console port) are grouped on a module located at the front of the device for easier management and expansion (fig.11). Following the typical network diagram (fig.7), one can simply plug a DSL modem (or a wireless modem/router), a LAN segment, and a key system or analog phone line into a single device.
The software located on the flash device consists of the operating system itself and various network management utilities. In Dateliner DL1, the following well-reasoned combination is used:

Operating system: FreeBSD 5.1-RELEASE specially stripped to fit into small flash disk. About 20 common process, file system, security and other utilities are included also.
Routing and bridging: Standard static routing built into the kernel. Zebra routing package with RIP, OSPF and BGP routing protocols. Ethernet bridging with 802.1q VLANs.
QoS management: ALTQ package pre-built into the kernel, altqd and altqstat utilities. Dummynet traffic shaper. Netperf and IPerf measurement utilities.
Firewall and NAT: IPFW and NAT utilities (part of the kernel and OS)
Accounting and policing: NeTAMS in ip-filter, netflow and libpcap modes
Voice gateway support: GNU Gatekeeper for H.323 Voice-over-IP
Management: Mini-HTTPD over SSL for the web management, SSH server for the remote console


4. How it works together.

The IP traffic management process consists of the following steps:
  1. Defining a network topology
  2. Determining data paths and available network resources
  3. Determining the major data flow and its needs in network resources
  4. Applying the data flow policies (e.g. access rules, QoS parameters)
  5. Accounting the data which is transferred
  6. Monitoring the interesting flows in more details
The first three steps come from the Network Technical Plan. It should define exactly how many users there will be in the network and what access technology will be used. Fixing these points gives a first estimate of the hardware and throughput needed in the network. Defining the types of traffic going through the network indicates the QoS requirements. Here are two short examples:

Scenario 1

Planned:
Number of users: 100
Location: within the building
Applications: file and print sharing, video on demand
Solution:
Technology: FastEthernet from desktops to switches, Gigabit Ethernet between switches and to servers
QoS: IEEE 802.1p Ethernet Priority, IP multicasting (PIM dense mode)

Scenario 2

Planned:
Number of users: 100
Location: small offices located up to 5 km from the main building
Applications: file sharing, Internet access, voice over IP to remote sites
Solution:
Technology: Fixed Broadband Wireless Access station at main site, IEEE 802.11b or 802.16-capable devices, FastEthernet at main and remote sites
QoS: Traffic management devices at both main and remote sites, VoIP gateways with IP ToS prioritization, VLANs

Because of the high throughput of switched wired Local Area Networks, QoS is not an issue as long as the network load is not too high. In contrast, a relatively low-speed wireless network sharing a common medium (ether) requires more careful QoS planning and traffic management. Here is a typical network diagram for scenario 2.



Fig.12. Multiservice Point-to-Multipoint wireless network


Compared with the more common LAN environment, a wireless point-to-multipoint network has some critical differences:
  1. Both the base station LAN and the subscriber station LAN have a switched throughput of around 100 Mbit/s, while the wireless part of the network has only 2-30 Mbit/s shared between all participants.
  2. All traffic goes through the Base Station, which makes it a very attractive point at which to place the maximum number of management tools and technologies.
  3. The wireless resource is very expensive in terms of throughput and delay. The number of retransmissions should be minimal, and the traffic delivery policy should have maximum flexibility.
  4. The error and packet loss ratio is around 10^-5 even with ARQ and FEC, compared to a typical 10^-9.
Without QoS enforcement tools, one subscriber station could potentially acquire all the wireless resources, making the word "guarantee" completely inapplicable. Imagine a Voice over IP session between customer site 1 (subscriber 1) and the base station. While the wireless medium is lightly used, VoIP RTP packets arrive on time without any loss or significant delay. But what if somebody at customer site 2 (subscriber 2) starts an FTP data transfer using the ReGet program (multiflow traffic)? Because TCP keeps increasing its sending rate until it meets congestion, the whole bandwidth will be occupied by the data, leaving only a little space for the RTP flow. Overload of the output interface queues at the Base Station will lead to increased RTP packet delay, delay variation and even packet loss. For the data stream this is not an issue - TCP will soon retransmit the lost packets - but for the voice stream the results could be severe.
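To make the voice-versus-FTP scenario concrete, here is a minimal sketch of one possible remedy, a strict-priority output queue with two bands. It is an illustration under stated assumptions, not a description of any particular product: the class name StrictPriorityQueue and the packet labels are hypothetical.

```python
from collections import deque


class StrictPriorityQueue:
    """Two-band output queue: the voice band is always served first."""

    def __init__(self):
        self.voice = deque()
        self.data = deque()

    def enqueue(self, packet, is_voice):
        band = self.voice if is_voice else self.data
        band.append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None when both bands are empty."""
        if self.voice:
            return self.voice.popleft()
        if self.data:
            return self.data.popleft()
        return None


queue = StrictPriorityQueue()
for number in range(3):
    queue.enqueue(f"ftp-{number}", is_voice=False)   # backlog from subscriber 2
queue.enqueue("rtp-0", is_voice=True)                # voice from subscriber 1

order = []
packet = queue.dequeue()
while packet is not None:
    order.append(packet)
    packet = queue.dequeue()
print(order)  # ['rtp-0', 'ftp-0', 'ftp-1', 'ftp-2']
```

With a single FIFO queue the RTP packet would wait behind the whole FTP backlog; here it leaves first, at the cost of possibly starving the data band if voice traffic is not itself limited.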

Given the network topology (1) and the available wireless resources (2), as well as a prediction of what kind of traffic will flow there (3), the administrator should apply (4) the necessary policies to the network devices. Unfortunately, only a small number of wireless hardware devices are fully QoS-capable, which is why the Dateliner DL1 device is put in place.
Besides the common network tasks (routing, firewalling, and access control) it can:
1. Classify traffic flows
2. Apply traffic class or priority based on a classification
3. Shape traffic flow based on a class identifier or manage an output queue
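The three capabilities listed above can be sketched together as a classifier feeding a per-class token bucket shaper. This is a simplified model, not the ALTQ or Dummynet configuration actually used on the device: the RTP port range, the class names, and the functions classify and forward are assumptions made for illustration, and time is passed in explicitly to keep the example deterministic.

```python
RTP_PORTS = range(16384, 32768)   # assumed RTP port range, for illustration only


def classify(packet):
    """Step 1: map a packet (a plain dict here) to a traffic class."""
    if packet["proto"] == "udp" and packet["dst_port"] in RTP_PORTS:
        return "voice"
    return "bulk"


class TokenBucket:
    """Step 3: allow at most rate bytes per second, with a limited burst."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, size, now):
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


def forward(packet, shapers, now):
    """Steps 2 and 3: tag the packet with its class, then shape by class."""
    packet["class"] = classify(packet)
    shaper = shapers.get(packet["class"])
    if shaper is None:
        return True   # classes without a shaper are not rate limited
    return shaper.allow(packet["size"], now)


shapers = {"bulk": TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)}
ftp = {"proto": "tcp", "dst_port": 21, "size": 1500}
rtp = {"proto": "udp", "dst_port": 16500, "size": 200}
print(forward(dict(ftp), shapers, now=0.0))   # True: the burst allowance covers it
print(forward(dict(ftp), shapers, now=0.5))   # False: only 500 bytes refilled
print(forward(dict(rtp), shapers, now=0.5))   # True: voice is not shaped
```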

The accounting (5) and monitoring (6) tasks can easily be solved using NeTAMS software running on the Dateliner DL1 hardware platform. It is easy to collect all accounting data in a single place at the Base Station side using NetFlow or a raw SQL stream. One web console can be located there to manage all NeTAMS processes and to collect all traffic statistics for management, performance analysis and troubleshooting, accounting and billing. It is also a good place for web management of all Dateliner DL1 devices. Traffic monitoring can be achieved with the monitor service, which is built into NeTAMS, as well as by viewing special performance counters.

5. Future work

It is clear that the present model could be significantly improved. Several modifications should be made to make Dateliner and NeTAMS better. For the Dateliner box, the following is needed:
  1. More flexible and rich web interface
  2. Additional support for more network adapters, including wireless 802.11a/b/g
  3. Flexible interfaces and modules that make onsite replacement possible
  4. More ports on the onboard VoIP device, probably a digital port interface card
  5. More support for telephony protocols and telephony management in the web interface
  6. Configuration file archiving for easier backup/recovery
  7. Software upgrade procedure and system installer
There are several issues in the NeTAMS software. The major problem is the performance bottleneck. The only way to apply rules is not to "listen" on the interface but to "pass" all the data through the NeTAMS data-source service. At present this is achieved with the following model:



Fig. 13. FreeBSD user-kernel I/O with divert socket


The ipfw kernel layer simply forwards all traffic to the divert socket, where the user-space NeTAMS daemon processes it and forwards it back to the kernel. This results in two context switches: one from kernel to user and another from user to kernel. This is done for EVERY packet going through the IP stack. If the system load is high, i.e. many processes are executing, this results in two additional switches: one for the software interrupt into the kernel when a packet is received and another back to the process that was executing previously. With a heavy amount of traffic the system load will be high, and the simplest way to lower it is to increase processor power. Unfortunately, this is too crude a solution. Another issue is the lack of QoS policy enforcement: there is no place to apply it within the NeTAMS management platform.
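A rough estimate shows why the per-packet cost matters. The sketch below takes the ~70 Mbit/s stream mentioned earlier in this report and assumes an average packet size of 500 bytes; that size is an assumption chosen for illustration, not a measured value, and the function name is hypothetical.

```python
def context_switches_per_second(bits_per_second, average_packet_bytes,
                                switches_per_packet=2):
    """Estimate divert-socket context switches for a given traffic rate."""
    packets_per_second = bits_per_second / (8 * average_packet_bytes)
    return packets_per_second * switches_per_packet


rate = context_switches_per_second(70_000_000, 500)
print(f"{rate:.0f} context switches per second")  # 35000 context switches per second
```

Under these assumptions the divert path alone adds tens of thousands of context switches every second, and smaller packets make the figure proportionally worse.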

Kernel integration - netgraph
One of the solutions proposed for further NeTAMS development is to put most of the IP packet processing into the kernel itself. There is a well-known mechanism for this named netgraph. It allows additional kernel modules to be built and installed on the packet path inside the kernel using simple mechanisms. To minimize context switching, the forwarding and QoS policy decision will be made once, on the first packet of each unique flow (an optimal flow estimator would have to be constructed; this is a topic for separate research). All subsequent packets will be forwarded and accounted using these "cached" policies without unnecessary context switching (fig.14). When a flow expires by timeout, its accounting data will be forwarded back to the NeTAMS data-source service.



Fig. 14. FreeBSD user-kernel I/O with netgraph module


In addition, a netgraph module could be a perfect place to implement an output queue strategy and apply packet priorities, because of the fine-resolution kernel timer available there, among other things.
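The cached-policy idea described above (decide once per flow, account every packet, report when the flow expires) can be modeled in a few lines. This is a user-space sketch of the logic only, not a netgraph module: the FlowCache class, the flow key layout and the decide callback are hypothetical, and the proposed flow estimator is replaced by a plain idle timeout.

```python
class FlowCache:
    """Cache one policy decision per flow and account packets against it."""

    def __init__(self, decide, idle_timeout):
        self.decide = decide              # slow-path policy function (assumed)
        self.idle_timeout = idle_timeout  # seconds without packets before expiry
        self.flows = {}                   # flow key -> cached entry
        self.slow_path_calls = 0

    def handle(self, key, size, now):
        """Process one packet; only the first packet of a flow hits the slow path."""
        entry = self.flows.get(key)
        if entry is None:
            self.slow_path_calls += 1
            entry = {"policy": self.decide(key), "packets": 0,
                     "bytes": 0, "last_seen": now}
            self.flows[key] = entry
        entry["packets"] += 1
        entry["bytes"] += size
        entry["last_seen"] = now
        return entry["policy"]

    def expire(self, now):
        """Remove idle flows and return their accounting records."""
        expired = []
        for key in list(self.flows):
            entry = self.flows[key]
            if now - entry["last_seen"] >= self.idle_timeout:
                expired.append((key, entry["packets"], entry["bytes"]))
                del self.flows[key]
        return expired


cache = FlowCache(decide=lambda key: "pass", idle_timeout=30)
flow = ("10.1.2.3", "192.0.2.7", "tcp", 40000, 80)
for second in range(3):
    cache.handle(flow, size=100, now=second)
print(cache.slow_path_calls)   # 1
print(cache.expire(now=40))    # [(('10.1.2.3', '192.0.2.7', 'tcp', 40000, 80), 3, 300)]
```

In the proposed design the records returned by expire would be the accounting data sent back to the NeTAMS data-source service.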

This approach will separate packet processing from accounting, but a lot of "tightly coupled" services remain in the NeTAMS process itself: HTML page creation, monitoring, quota processing etc. The obvious advantage of this coupling is the ability to share common data structures protected by mutexes. But if a small bug exists in one service, it could crash the whole process. Also, some services could require more processing power, which is then taken away from the others. A good idea is to separate all services into processes. This will result in a more reliable design and, what is very attractive, in the ability to run some parts of the NeTAMS system ON SEPARATE COMPUTERS.

Distributed model - PVM
Two things should be done:
1. Select the data transport
2. Define the data exchange protocol

"Parts" of the NeTAMS system could communicate with each other:
1. Within a single process
2. Between processes running on one host
3. Between processes running on multiple hosts

The hardware architecture and operating system on these hosts could be different. I have little doubt that one of the transport technologies from loosely coupled clusters should be used. Several inter-process communication techniques have long been developed for research purposes. The need for intensive data sharing during heavy numerical calculations led to the appearance of MPI and PVM. Parallel Virtual Machine (PVM) is a good choice because of the flexibility it gives to the programmer and the number of platforms supported. It works as a transport sublayer for applications that want to exchange data regardless of the operating system or hardware used, within one process or many.

Failover/cluster PC
Implementing such a high-speed inter-process transport, in combination with "spreading" services over a network, will make it possible to create failover NeTAMS installations. A single system could run over two collocated hosts sharing a common database and configuration. If one of the hosts fails, all processing will be taken over transparently by the other.

Web management
The existing Admintool script is a first effort to simplify NeTAMS management. It can be quite hard to manage a network with 1000 hosts over a Command Line Interface. In theory, ANY operation could be done via web interface scripts, from the creation of "users" to the application of QoS policies, monitoring and performance analysis. Although the design of such an interface is a routine task, it requires careful programming. Currently, the NeTAMS Application Programming Interface, implemented as the netams_api.pl module, works with text-based commands. A faster and more reliable solution is to use binary data transfer with some sort of crash protection. This could easily be done once a PVM data transport interface is implemented for the whole system.

6. Conclusion

In this document I have tried to point out some of the major traffic management problems in wired and wireless networks and to suggest my own solutions: Dateliner and NeTAMS. These hardware and software systems have grown out of my research after much hard thinking. During this process many problems arose and many original ideas appeared. Some of them require additional research effort, and others require intensive testing and analysis by the community.

I would like to thank all the people who helped me in my work.
The "Dateliner DL1" project is supported by my employer, Dateline Communications.
The ipCount, aaa+fw and NeTAMS projects were developed on the network provided by the Inorganic Chemistry Division, Chemistry Department, Lomonosov Moscow State University.
The NeTAMS project is actively supported by Yury N. Shkandybin.
Research on QoS in wireless networks is stimulated by Dr. Sergey L. Portnoy, Alvarion Inc.

7. References

[1] W. Stallings, "Wireless Communications & Networks", ISBN: 0130408646
[2] VoIP, http://www.cisco.com/en/US/tech/tk652/tk701/tech_digests_list.html
[3] Linux, http://www.linux.org
[4] FreeBSD, http://www.freebsd.org/
[5] Cisco Systems, http://www.cisco.com
[6] Allot Communications, http://www.allot.com/
[7] ALTQ project, http://www.csl.sony.co.jp/~kjc/software.html
[8] ALTQ in FreeBSD-CURRENT, http://www.rofug.ro/projects/freebsd-altq/
[9] PicoBSD and others, http://people.freebsd.org/~picobsd/
[10] NeTAMS project, http://www.netams.com/
[11] FreeBSD Ports Collection, http://www.freebsd.org/ports/index.html
[12] ASPLinux distribution, http://www.asplinux.com/
[13] Dateliner DL1 project, http://www.dateline.ru/tech/dateline/index-en.html
[14] NetFlow, http://www.cisco.com/warp/public/cc/pd/iosw/ioft/neflct/tech/napps_wp.htm
[15] ip_acct and ng_ipacct, ftp://ftp.wuppy.net.ru/pub/FreeBSD/local/kernel/ng_ipacct/

