I work in computer networking, with a particular interest in the
modelling and measurement of tele-traffic, especially the TCP/IP
packet data flowing over the Internet. In general terms the aim
is to understand in greater detail how the traffic sources and network
structure and protocols interact, with a view to making the network,
and end applications, more efficient. This has led to work in a
number of seemingly different areas, including statistical estimation
and clock synchronisation.

Software clocks in computers are based
on local hardware synchronised to more accurate remote clocks.
Currently the NTP system is used to synchronise hosts to
remote servers across the Internet. The stability of modern PC
hardware, however, actually supports higher accuracy and robustness than
NTP currently delivers. We are developing a replacement for the NTP
clients and servers based on new principles, in particular the need to
distinguish between *difference clocks* and *absolute clocks*, and the
associated primacy of rate stability over absolute clock error.
The RAD difference clock, for example, can measure RTTs to under a microsecond, even
if connectivity to the time server is lost for periods of over a week!
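
To make the distinction concrete, here is a minimal Python sketch of the difference-clock idea (an illustration only, not the RADclock implementation): time intervals are obtained by scaling raw counter deltas by an estimated tick period, so only rate stability matters, and losing the time server does not corrupt interval measurements.

```python
import time

class DifferenceClock:
    """Minimal sketch of a difference clock (illustrative, not RADclock).

    A difference clock measures time *intervals* by scaling raw counter
    deltas with an estimated counter period. Only the rate matters, so
    interval accuracy survives long server outages, unlike an absolute
    clock whose offset must also be kept in check."""

    def __init__(self, period_estimate_s):
        # Estimated seconds per counter tick, e.g. obtained by comparing
        # the counter against a reference over a long baseline
        # (a hypothetical calibration step for this sketch).
        self.period = period_estimate_s

    def read_counter(self):
        # Stand-in for a raw hardware counter such as the TSC;
        # perf_counter_ns() is used here purely for illustration.
        return time.perf_counter_ns()

    def interval(self, c_start, c_end):
        # Elapsed time between two raw counter readings.
        return (c_end - c_start) * self.period

# Example: measuring an RTT with the difference clock.
clock = DifferenceClock(period_estimate_s=1e-9)  # assumed 1 GHz counter
t0 = clock.read_counter()
# ... send probe, receive reply ...
t1 = clock.read_counter()
rtt = clock.interval(t0, t1)
```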

The SyncLab Project aims to provide a complete new system for network timing. Currently, client software is available for Linux and BSD Unix which can connect to existing NTP servers. Download details, documentation and a number of publications can be found on the project page.

This project has been made possible in part by grants from
the Australian Research Council, the Cisco University Research Program Fund at Silicon Valley
Community Foundation, two Google Research Grant Awards, and a partnership with Symmetricom Inc. (now Microsemi).

By Network Inference we mean the application of sophisticated statistical
techniques to translate imperfect network measurement data into an
understanding of the operation, mechanisms, state, use, performance, and
fairness of the network. For example, the inference techniques of Network
Tomography use probes like X-rays to look `inside' the network body to
locate overloaded links. Such a capability is valuable across the spectrum
of network users: for the Internet public, to determine who is responsible
for slow downloads; for network operators, to troubleshoot their networks;
and for regulators, to police compliance with Service Level Agreements.
I am active in the following three directions within network inference.

*Active Probing* Here test packets or `probes' are injected into the network, collected at a set of receivers around the network edge, and inferences made on the end-to-end path based on measured end-to-end delays and/or losses. My interests in this area range from the underlying measurement infrastructure, through the `heuristic' design of effective probe streams and their analysis, to the rigorous application of queueing theory to active probing problems. My colleagues in this area include Attila Pásztor, François Baccelli, Sridhar Machiraju, and Jean Bolot. The current focus is on the theoretical side, trying to build up a science of convex networks, a property which will allow optimal probing strategies to be well defined and devised.
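
As a toy illustration of the kind of inference involved (not a method from any particular paper), the sketch below splits measured probe RTTs into a propagation baseline and a queueing excess, using the common minimum-filter heuristic; the RTT values are made up for the example.

```python
def queueing_delays(rtts):
    """Given probe RTTs (seconds) over a fixed path, split each into a
    propagation baseline and a queueing excess. The baseline is taken
    to be the minimum observed RTT, a standard active-probing heuristic
    which assumes at least one probe found the queues empty."""
    baseline = min(rtts)
    return [r - baseline for r in rtts]

rtts = [0.0213, 0.0208, 0.0251, 0.0209, 0.0230]  # illustrative measurements
print(queueing_delays(rtts))  # excess delay attributable to queueing
```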

*Network Tomography* Whereas in active probing inference probes typically follow a single end-to-end path which is modelled as a sequence of queues, by Network Tomography we mean a class of inversion problems (which may or may not involve probing) which is much more ambitious in the spatial dimension (multiple sources and receivers over the network) but treats nodes using simple black box models for loss or delay. For example, a link may be characterised simply by a single number, a loss probability. My work in this area primarily concerns multicast probes which flow from a single source to multiple receivers, with copies being made at each branch point, tracing out a measurement tree in the process. I have worked on loss, delay, and topology inference in this context, with a major focus on generalising beyond the classical simplifying assumptions of perfect spatial and temporal independence. My colleagues here include Vijay Arya, Nick Duffield, François Baccelli, and Rhys Bowden.
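
The flavour of the inversion can be seen in the simplest possible tree. The sketch below implements the classical moment estimator for a two-receiver multicast tree under the independence assumptions mentioned above; the simulated pass probabilities are made up for the example.

```python
import random

def two_leaf_loss_tomography(seen1, seen2):
    """Classical moment estimator for the simplest multicast tree: one
    shared link from the source to a branch point, then one link to each
    of two receivers. seen1/seen2 are per-probe booleans recording
    whether each multicast probe arrived at receiver 1 / receiver 2.

    Under spatial and temporal independence, with pass probabilities A
    (shared link) and B1, B2 (receiver links):
        P(seen at 1) = A*B1,  P(seen at 2) = A*B2,
        P(seen at both) = A*B1*B2,
    hence A = p1*p2/p12, B1 = p12/p2, B2 = p12/p1."""
    n = len(seen1)
    p1 = sum(seen1) / n
    p2 = sum(seen2) / n
    p12 = sum(a and b for a, b in zip(seen1, seen2)) / n
    return p1 * p2 / p12, p12 / p2, p12 / p1

# Quick check against a simulation with made-up pass probabilities.
random.seed(1)
A, B1, B2 = 0.95, 0.90, 0.80
seen1, seen2 = [], []
for _ in range(100000):
    shared = random.random() < A
    seen1.append(shared and random.random() < B1)
    seen2.append(shared and random.random() < B2)
print(two_leaf_loss_tomography(seen1, seen2))  # close to (0.95, 0.90, 0.80)
```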

*Route Tracking (advanced Traceroute)* One of the oldest probe-based inference tools is traceroute, which makes use of features of the TCP/IP/ICMP protocol suite to trace out the IP-level path between a source and a destination. However, because of load balancing, a high proportion of routes in the Internet today have multiple branches, and failing to take this into account can produce meaningless topology inferences. Paris Traceroute is a generalised traceroute tool which attempts to trace routes as they really are, whether branched or not. I work with Paris Traceroute researchers Renata Teixeira, Timur Friedman, Christophe Diot, and Ítalo Cunha in applying statistical ideas to the problem of controlling the error in what is effectively topology estimation, and in efficiently tracking (branched) route changes over time.
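
The statistical control problem can be conveyed by a simplified stopping rule in the spirit of multipath route tracing; the rule and threshold below are an illustration, not the published Paris Traceroute algorithm. The idea: keep probing a hop until the hypothesis that one additional, equally loaded next-hop interface exists can be rejected.

```python
import math

def probes_needed(k_seen, alpha=0.05):
    """Simplified multipath-detection stopping rule (illustrative only):
    having observed k_seen distinct next-hop interfaces at a hop, how
    many consecutive probes must return only those interfaces before we
    reject, at level alpha, the hypothesis that k_seen + 1 equally
    likely interfaces exist? If there were k+1, each probe misses the
    unseen one with probability k/(k+1)."""
    k = k_seen
    return math.ceil(math.log(alpha) / math.log(k / (k + 1)))

for k in range(1, 6):
    print(k, probes_needed(k))  # probe budget grows roughly linearly in k
```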

In resource-constrained environments
such as within core Internet routers, accurate measurement of traffic
features and statistics can be difficult. Two canonical approaches to
fast approximate measurement are sampling of the data, and sketching:
the use of compact data structures which are fast to
update but which store information imperfectly. My work in this
area has focussed on the measurement of the flow size distribution (the number of packets in a flow such as a TCP connection). This is an important metric for numerous applications, including traffic modelling, management, and attack detection.

We evaluate data collection mechanisms in a Fisher Information framework, comparing various sampling and sketching approaches in order to determine which inherently captures the most information about the distribution. We developed the Dual Sampling (DS) and the Optimised Flow Sampling Sketch (OFSS) (see below for OFSS code) methods which are both capable of being implemented at high speed. This work is with Paul Tune.
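
As a toy illustration of the Fisher Information framework, not the calculation from the papers themselves, the sketch below computes the per-flow Fisher information matrix of a small flow size distribution under flow sampling with probability q; inverting it gives a Cramér-Rao lower bound on estimator variance, which is how competing collection mechanisms can be compared.

```python
import numpy as np

def fim_flow_sampling(theta, q):
    """Per-flow Fisher information matrix for a flow size distribution
    theta = (theta_1, ..., theta_K) under flow sampling with probability
    q. Illustrative assumption: each sampled flow reveals its exact
    size, so the FIM is q times the multinomial FIM in the K-1 free
    parameters: I_ij = q * (delta_ij / theta_i + 1 / theta_K)."""
    theta = np.asarray(theta, float)
    I = np.diag(1.0 / theta[:-1]) + 1.0 / theta[-1]
    return q * I

theta = [0.5, 0.3, 0.2]   # toy distribution over flow sizes 1..3
I = fim_flow_sampling(theta, q=0.1)
crb = np.linalg.inv(I)    # per-flow Cramér-Rao bound on (theta_1, theta_2)
print(crb)
```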

Packet traffic has scale invariance features, in particular long range
dependence (LRD), which impacts on network performance, performance
analysis, accuracy of simulation, and parameter estimation. My underlying
interest has been in traffic modelling, but in the analysis of real data
the need for more powerful estimation tools naturally arises, and I have
also worked extensively in this area. Much of the work here involves
wavelets and is in collaboration with Patrice Abry from the Signal
Analysis group of the Ecole Normale Supérieure de Lyon. We confirmed that
fractal traffic is real - not just an artifact of poor estimation tools -
and introduced wavelet analysis to the area. Other colleagues include
Patrick Flandrin, Murad Taqqu, Walter Willinger, and Matthew Roughan.
Associated Matlab code is available for download at the links below.

With my former student Nicolas Hohn, and with Patrice Abry,
I developed models of packet arrivals based on cluster
point processes which describe a number of features of backbone traffic parsimoniously.
One conclusion is that TCP flows can be treated as independent in the Internet core!
Another is that multifractal models are not needed to describe some of
the other features, which may appear to be scaling if the statistical methods
used to examine them are not powerful enough.
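
A toy simulation can convey the structure of such flow-based cluster models; the distributions and parameter values below are illustrative stand-ins, not the fitted models from this work.

```python
import numpy as np

def cluster_arrivals(T, flow_rate, mean_pkts, mean_gap, seed=0):
    """Toy Poisson cluster process in the spirit of flow-based packet
    models: flow start times form a Poisson process of rate flow_rate;
    each flow carries a geometric number of packets (mean mean_pkts)
    with exponential inter-packet gaps (mean mean_gap). All choices
    here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    arrivals, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / flow_rate)        # next flow start
        if t > T:
            break
        n_pkts = rng.geometric(1.0 / mean_pkts)      # packets in this flow
        gaps = rng.exponential(mean_gap, size=n_pkts - 1)
        times = t + np.concatenate(([0.0], np.cumsum(gaps)))
        arrivals.extend(times[times <= T])
    return np.sort(np.array(arrivals))

pkts = cluster_arrivals(T=1000.0, flow_rate=5.0, mean_pkts=20, mean_gap=0.01)
print(len(pkts), "packet arrivals generated")
```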

The Flow Sampling Sketch, or FSS, is a skampling
method, that is, a hybrid between sampling and sketching, which allows
the flow size distribution to be estimated with very low resource
requirements in both time and memory. The OFSS method is an optimally
tuned/calibrated FSS with a statistical performance which is within a constant factor of
that of Flow Sampling, which is known to be optimal.
The OFSS Matlab code allows
the critical calibration parameter of OFSS, pf*, to be calculated
for any given input load alpha. It also gives the associated
amount of information, and the minimal variance, of any estimator using the
OFSS method. More details are given on the code page.
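
For intuition, here is a much-simplified Python sketch of the skampling idea (not the released OFSS code; the names p_f and n_counters are illustrative): flows are sampled by hashing their key, so every packet of a flow shares one decision, and sampled flows update a small shared counter array rather than a per-flow table.

```python
import zlib

class FlowSamplingSketch:
    """Toy sampling/sketching hybrid for flow sizes, inspired by (but far
    simpler than) the FSS described above. Colliding flows share a
    counter and are conflated; a real estimator must correct for this."""

    def __init__(self, n_counters, p_f):
        self.counters = [0] * n_counters
        self.p_f = p_f

    def add_packet(self, flow_key):
        h = zlib.crc32(flow_key.encode())
        if h / 2**32 < self.p_f:                         # hash-based flow sampling
            self.counters[h % len(self.counters)] += 1   # O(1) sketch update

    def counter_histogram(self):
        # Histogram of counter values: the raw material from which the
        # flow size distribution would be estimated.
        hist = {}
        for c in self.counters:
            hist[c] = hist.get(c, 0) + 1
        return hist

# p_f = 1.0 here so the two toy flows are certainly sampled.
fss = FlowSamplingSketch(n_counters=1024, p_f=1.0)
for key in ["10.0.0.1:80"] * 5 + ["10.0.0.2:443"] * 2:   # two toy flows
    fss.add_packet(key)
print(fss.counter_histogram())
```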

The second
order estimation code includes a number of related capabilities:

- Matlab code for wavelet based scaling parameter estimation, including estimation of the Hurst parameter of self-similar processes, and the two (yes two!) parameters of long-range dependence (essential for confidence intervals on mean estimates!); a minimal sketch of the core idea appears after this list.
- Code for the special prefiltering necessary in the study of discrete data.
- A function for the automatic selection of the lower cutoff scale.
- A handy tool for the investigation of stationarity; it incorporates the test of the `constancy of scaling'.
- An on-line version in C, for data of arbitrary length.
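
For readers unfamiliar with the approach, the sketch below (a minimal, unweighted Haar version in Python, not the downloadable Matlab code) shows the core of second-order wavelet scaling estimation: regress the log of the per-octave wavelet energy against the octave to recover the Hurst parameter.

```python
import numpy as np

def logscale_diagram(x, j_min=3, j_max=10):
    """Minimal second-order wavelet scaling estimator using a Haar
    transform. At each octave j, S_j is the average squared detail
    coefficient; for long-range dependent data
        log2(S_j) ~ (2H - 1) * j + c
    over large scales, so a regression of log2(S_j) on j estimates the
    Hurst parameter H. This sketch uses an unweighted fit; the full
    estimator weights by scale and returns confidence intervals."""
    a = np.asarray(x, float).copy()
    js, logS = [], []
    for j in range(1, j_max + 1):
        n = len(a) // 2
        if n < 2:
            break
        d = (a[0:2*n:2] - a[1:2*n:2]) / np.sqrt(2)   # Haar details, octave j
        a = (a[0:2*n:2] + a[1:2*n:2]) / np.sqrt(2)   # approximation for next octave
        if j >= j_min:
            js.append(j)
            logS.append(np.log2(np.mean(d**2)))
    slope = np.polyfit(js, logS, 1)[0]
    return (slope + 1) / 2                           # H from slope = 2H - 1

# Sanity check: white noise has H = 0.5 (flat logscale diagram).
print(logscale_diagram(np.random.default_rng(0).normal(size=2**16)))
```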

The multifractal
estimation code goes beyond second order:

- Tools for the wavelet based analysis of Multifractals and other MultiScaling processes.
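
In the same spirit, here is a minimal Haar-based sketch of going beyond second order (again an illustration, not the downloadable code): estimate the scaling exponents zeta(q) from wavelet structure functions; a strictly concave zeta(q), rather than a linear one, signals multifractality.

```python
import numpy as np

def wavelet_structure_functions(x, qs=(1, 2, 3, 4), j_min=3, j_max=9):
    """Empirical wavelet structure functions S_q(j) = mean(|d_{j,k}|^q).
    For scaling processes log2 S_q(j) ~ zeta(q) * j + c_q, so the slope
    at each order q estimates zeta(q). Haar conventions as in the
    second-order example above."""
    a = np.asarray(x, float).copy()
    details = {}
    for j in range(1, j_max + 1):
        n = len(a) // 2
        if n < 2:
            break
        d = (a[0:2*n:2] - a[1:2*n:2]) / np.sqrt(2)
        a = (a[0:2*n:2] + a[1:2*n:2]) / np.sqrt(2)
        if j >= j_min:
            details[j] = d
    js = sorted(details)
    zeta = {}
    for q in qs:
        logS = [np.log2(np.mean(np.abs(details[j])**q)) for j in js]
        zeta[q] = np.polyfit(js, logS, 1)[0]   # slope estimates zeta(q)
    return zeta

print(wavelet_structure_functions(np.random.default_rng(1).normal(size=2**16)))
```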

Details of the RADclock
project (formerly known as the TSCclock) and the current release can be found on the old
RADclock page and the new (work in progress) SyncLab Project site.