Active Probing Software Documentation Page
Note: This page and the documentation linked to it are currently under construction.
Table of Contents
- Where can I find details about the linux packet sending programs?
- How to gain accurate timing using the TSC register
- Some comments on TSC based timestamping
- How can I read the TSC register directly in my programs?
- What changes were made to the original kernel to produce the 2.4.14-tsc kernel?
- tcpdump and TSC based timestamps
- How do I create probe stream files to be used by the packet sender?
- What are the output file formats used by the packet sending applications?
- Post processing utilities for tcpdump, .dt and .tscdt files
- Post processing utilities for round trip time tcpdump and .dt files
- How do I access the conversion constant via the shared memory segment set up by tscskewd?
Where can I find details about the linux packet sending programs?
Refer to the man pages for linuxps(8), tscskewd(8), readtscskew(1). Details about the post-processing utilities can be found in the section Post processing utilities for tcpdump, .dt and .tscdt files
How to gain accurate timing using the TSC register
The accuracy of the timestamps generated using the TSC depends on the stability of the CPU clock source and on how accurately the actual clock rate is known, since that rate is used to convert the timestamps to standard time units. Measurements have shown that the stability of desktop PC CPU clock sources is typically better than 0.1 PPM. The clock rate can be estimated (clock calibration) using a reference clock of known accuracy. Probably the easiest way to calibrate is to use the standard software clock of the system, synchronised to an NTP server (preferably a primary, stratum 1, server). The CPU clock rate estimate is then based on counting the clock cycles in an interval of T seconds, the length of which depends on the accuracy of the reference time source. If e is the accuracy of the reference clock, for example 1 ms for an NTP synchronised software clock (assuming synchronisation to a primary NTP server), then T must be approximately 6 hours to get an estimate with an accuracy of 0.1 PPM ( 2e/T < 0.1 PPM (1e-7) ).
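The relation 2e/T < 0.1 PPM can be turned around to give the shortest usable calibration interval for a given reference clock. The following is a minimal sketch of that arithmetic only; the function name min_calibration_interval is ours and is not part of the package.

```c
/* Minimum calibration interval T (in seconds) such that 2*e/T does not
 * exceed the target relative accuracy.
 *   ref_error_s          - accuracy e of the reference clock, in seconds
 *   target_rel_accuracy  - wanted relative accuracy, e.g. 1e-7 for 0.1 PPM
 * Hypothetical helper for illustration; not part of the package. */
double min_calibration_interval(double ref_error_s, double target_rel_accuracy)
{
    return 2.0 * ref_error_s / target_rel_accuracy;
}
```

With e = 1 ms and a target of 1e-7 this gives 20000 seconds, a little under 6 hours, which is consistent with the figure quoted above.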
To calibrate your clock using the method described above, use the tscskewd application included in this package. This application continuously measures and updates two registers ( count[CPU clock cycle] and time[microsec] ) which are used by other applications to convert raw TSC timestamps to standard time units. The conversion is done off-line (tscdttodt and stscdttodt); this gives higher performance and, because the raw data is kept, allows higher reliability and accuracy. If the accuracy of your synchronised NTP clock is worse than 1 ms, average the values reported by the tscskewd utility. After the 7 hour initialization period one estimate, based on a 7 hour period T, is reported every 96 seconds; 11 estimates are also reported during the initialization period to provide quick but less accurate conversion constant estimates.
Note: variations in the estimate can be caused by changes in the rate of the TSC clock (which on desktop PCs, in our experience, are usually below 0.1 PPM and are caused for example by variations in the room temperature). However, we have observed in a number of cases that larger variations were caused by NTP's attempts to correct for earlier wrong offset corrections. Errors in the estimate due to this second cause can be filtered out simply by taking long term averages. A good TSC conversion estimate remains accurate to 0.1 PPM for many days or even weeks if there are no large changes in the external conditions.
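As an illustration of the long term averaging mentioned above, the sketch below averages a series of (count, time) estimates into a single clock rate in cycles per microsecond. It assumes you have collected the reported pairs into two arrays yourself; the function name mean_tsc_rate and the choice of a plain arithmetic mean are our assumptions, not something tscskewd prescribes.

```c
#include <stddef.h>

/* Mean CPU clock rate, in cycles per microsecond, over n estimates.
 * count[i] is a cycle count and time_us[i] the matching duration in
 * microseconds. Entries with a zero duration are skipped.
 * Hypothetical helper for illustration; not part of the package. */
double mean_tsc_rate(const unsigned long long *count,
                     const unsigned long long *time_us, size_t n)
{
    double sum = 0.0;
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (time_us[i] == 0)    /* estimate not initialised yet */
            continue;
        sum += (double)count[i] / (double)time_us[i];
        used++;
    }
    return used ? sum / (double)used : 0.0;
}
```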
For more details on clock accuracy, calibration and synchronisation see our publications and articles.
Some comments on TSC based timestamping
tcpdump uses libpcap for packet capture. Under Linux the timestamp of a received packet is generated by the Linux kernel (in the netif_rx function of the net/core/dev.c file):
#ifndef CONFIG_CPU_IS_SLOW
    if (skb->stamp.tv_sec == 0)
        get_fast_time(&skb->stamp);
#else
    skb->stamp = xtime;
#endif
(from the 2.2.14 kernel source; 2.4.14 doesn't have the CONFIG_CPU_IS_SLOW option)
This means that on hosts with Pentium class processors, where the TSC register is available, a timestamp is generated using the get_fast_time routine. The timestamp placed in skb->stamp is accessible via an ioctl call ( ioctl(sock,SIOCGSTAMP,&stamp) ).
The pcap_read function in pcap-linux.c of the libpcap package uses this ioctl call to obtain the timestamp of the received packets.
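The ioctl call above can be tried from any program that receives packets on a socket. The following minimal sketch receives one datagram and then asks the kernel for the timestamp of that packet. The helper name recv_with_stamp is ours, and including <linux/sockios.h> for SIOCGSTAMP is an assumption about your headers (on older systems the constant may already come in via <sys/socket.h>).

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <linux/sockios.h>   /* SIOCGSTAMP */

/* Receive one datagram on sock and fetch the timestamp the kernel
 * stored for it. Returns the payload length, or -1 on error.
 * Hypothetical helper for illustration; not part of the package. */
long recv_with_stamp(int sock, void *buf, size_t len, struct timeval *stamp)
{
    ssize_t n = recv(sock, buf, len, 0);

    if (n < 0)
        return -1;
    /* timestamp of the last packet passed to the user on this socket */
    if (ioctl(sock, SIOCGSTAMP, stamp) < 0)
        return -1;
    return (long)n;
}
```

On a standard kernel the returned struct timeval holds a normal time of day. With the modified driver or kernel described below, the same structure carries the raw TSC value instead and has to be converted before use.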
In our TSC based receiver monitor application for Linux we take advantage of the check on whether a timestamp has already been placed in skb->stamp. A new network driver is created which places a TSC based timestamp in skb->stamp when executing its rx interrupt handling routine. This operation is very quick as only the raw TSC register value is stored; no conversion to standard time units is performed. The timestamping in this case is not only quicker but is also performed earlier and in a higher priority context than in the standard Linux solution. As a result we get a more accurate timestamp, significantly less affected by the 'noise' introduced by the operating system's scheduling. The most significant component of the remaining timestamping inaccuracy is the interrupt latency, that is, how quickly the operating system lets the NIC driver's receive routines start; in our experience this is in the range of tens of microseconds in the worst case under Linux. This 'noise' can be further reduced using the RT-Linux based receiver.
An example of a modified NIC driver is included in the available tsc kernel packages (see drivers/net/3c59x_tsc.c in linux-2.2.14-tsc or in linux-2.4.14-tsc). The additions to the original 3c59x.c (from Linux kernel 2.2.14) are marked by '// Attila'. If you are using a different NIC then based on this example most probably you can create a modified version of your driver as well.
NOTE: If you wish to use the modified driver you need to change your configuration files as well (on RedHat /etc/conf.modules change alias eth0 3c59x to 3c59x_tsc).
NOTE: If you use a different card, you can still get TSC based timestamps by using the above mentioned modified kernel versions, in which the timestamp generation in the kernel has been modified to be TSC based.
NOTE: If you are using the modified driver and run tcpdump on the network interface that uses it, or if you are using a modified kernel, the timestamps reported by tcpdump are going to be WRONG. The problem will disappear as soon as you change the configuration back to the original driver and reboot (or just remove the tsc module and load the original one) or, in the case of a modified kernel, reboot with your original kernel. It is also possible to download a modified libpcap library and compile a new tcpdump application which overcomes this problem. This modified version should work fine with both standard and TSC based timestamps. The modified libpcap library can be used by other packet capturing applications as well. Only the pcap-linux.c file has been changed in the libpcap library; for details see the source code.
NOTE: The resolution of the TSC timestamps is determined by the CPU clock cycle duration (1/CPU rate) while the resolution of the standard timestamps (e.g. those used by tcpdump) is limited to 1 microsecond.
How can I read the TSC register directly in my programs?
If you are using the Pentium architecture or some modern AMD architectures, then you should have access to the free-running TSC register (which is reset each time the computer is rebooted). Using this register in conjunction with tscskewd(8) it is possible to get considerably better timestamping than with the standard gettimeofday() system call used by most programs. Architectures that support the TSC provide an extremely light-weight and fast assembly instruction, rdtsc. An example of how to use it follows.
In order to keep your code readable, we recommend that you put in your ".h" definition file a macro for the rdtsc assembly command. It should look like the following:
#define rdtsc(hi,lo) __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi))
This will load the high 32 bits of the TSC register into the first argument passed to rdtsc and the low 32 bits of the TSC register into the second argument. This macro can then be used in any of your code (where you include the definition file containing the rdtsc macro) as follows:
struct tsc_stamp {
    unsigned int upper_32;  /* high 32 bits of the TSC */
    unsigned int lower_32;  /* low 32 bits of the TSC */
};

int main(void)
{
    struct tsc_stamp my_stamp;

    rdtsc(my_stamp.upper_32, my_stamp.lower_32);
    return 0;
}
Obviously this code does nothing interesting but read the TSC value. You will then want to do something with the value you have obtained, for instance convert it to a time value via the conversion constant estimated by tscskewd. For more information about reading out the conversion constant from your program, read the section How do I access the conversion constant via the shared memory segment set up by tscskewd?. For an example of how to use the conversion constant and your raw TSC stamps to produce accurate times, refer to the tscdttodt.c file available in the TSC linux packet sender package.
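For arithmetic on TSC values it is usually more convenient to hold the two halves in a single 64-bit integer. The sketch below shows one way to do that with the same rdtsc macro; the helper names tsc_join and read_tsc are ours, and the use of the fixed width types from <stdint.h> is our choice rather than something the package requires.

```c
#include <stdint.h>

#define rdtsc(hi,lo) __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi))

/* Join the two 32-bit halves returned by rdtsc into one 64-bit count.
 * Hypothetical helper for illustration; not part of the package. */
uint64_t tsc_join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

#if defined(__i386__) || defined(__x86_64__)
/* Read the full 64-bit TSC value (x86 processors only). */
uint64_t read_tsc(void)
{
    uint32_t hi, lo;

    rdtsc(hi, lo);
    return tsc_join(hi, lo);
}
#endif
```

The difference between two read_tsc() results is then a cycle count that can be converted to time with the conversion constant.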
Stay tuned, as we hope to be releasing shortly a handy tool to convert TSC stamps into timestamps which should also simplify this process.
What changes were made to the original kernel to produce the 2.4.14-tsc kernel?
Three files were modified in order to improve the timing of the standard 2.4.14 kernel. The modified kernel is given the name 2.4.14-tsc. All of the changes that have been made in these files are marked with "// Attila". The three files that have been modified are as follows (click on the file name to view the source code):
- net/core/dev.c
- This is the device independent networking code, where Linux normally timestamps received packets. The code has been modified to check whether a (more accurate) timestamp has already been put in by the NIC driver; if so it leaves that timestamp alone, otherwise it puts in a TSC based timestamp instead of the standard gettimeofday() based one. The sending routine has also been modified so that sent packets are passed to any taps that have been set up only if the packet is successfully sent to the card (this is less important).
- drivers/net/3c59x_tsc.c
- This is the network card driver for 3COM 3c59x cards. It is an example of a modified NIC driver which adds TSC timestamping functionality. The functions that were modified are vortex_start_xmit, boomerang_start_xmit, vortex_rx and boomerang_rx. The most important modifications are those to the packet receiving code, i.e. the functions vortex_rx and boomerang_rx. They are modified to read the current value of the TSC counter as soon as they are invoked and then, once a buffer for the received packet has been allocated, to copy the TSC counter value into the received socket buffer's timeval data structure called stamp, which can be read out later by user level functions. This is used by tscreceiver to get more accurate timestamps via the ioctl system call. Note that, when using a normal Linux kernel, the stamp timeval structure within the socket buffer structure would normally be filled in by the device independent code, namely the code in dev.c (as described above). By moving this operation into the driver, the accuracy of the resulting timestamp is improved significantly.
- net/sched/sch_generic.c
- This file has only been slightly modified to correspond to the changes in dev.c mentioned above regarding the sending of "copies" to any active network taps. This change ensures that copies are only made once the original packet has already been delivered.
All of the changes in the three files mentioned above are marked with "// Attila". Search for this if you are interested in viewing the changes in the source code of the modified linux package.
tcpdump and TSC based timestamps
The 'tscreceiver' application can only be used as a Receiver Monitor. The recommended Sender Monitor application for this Linux based Active Probing Package is tcpdump.
If you are using the standard Linux sender, you can simply use the standard tcpdump application as Sender Monitor. However, if you are using the modified TSC kernel, you will find that the "timestamps" are now raw TSC counter values and not times. For this reason, the modified tcpdump/libpcap was written. It detects TSC counter values and converts them to normal timestamps.
If you wish to benefit from the higher accuracy of TSC based timestamping, install the modified version of the libpcap library and compile a new tcpdump application. Please extract both archive packages (libpcap-0.4tsc.tar.gz and tcpdump-3.4.tar.gz) into the same directory (e.g. /usr/src), then first compile libpcap (./configure then make) and then tcpdump.
To convert TSC timestamps online to standard time units, the modified tcpdump application requires the 'tscskewd' application to be running. For this purpose it is recommended that 'tscskewd' be run with constant inputs and not in NTP mode. Refer to the tscskewd man page for details on how to do this.
tcpdump outputs can be converted to the preferred '.dt' output file format using the 'tcpdumptodt' application. tcpdump should be run with the -tt, -s 54 and -x options and its output stored in a file which is later given as input to 'tcpdumptodt'. An example tcpdump command with these options and some additional filtering options:
tcpdump -tt -s 54 -x host 10.0.0.104 and port 7775 > tcpdumpfile.txt
How do I create probe stream files to be used by the packet sender?
In general the packets of a probe stream can be defined by their:
- packet type
- packet size
- departure time
- time to live (max. number of hops)
The sender tools in this package use UDP packets with a packet serial number placed in the first four bytes of the UDP payload. As a result the packet size can typically be varied between 32 and 1500 bytes. The time to live field of probes which are meant to reach the receiver should be set to -1.
This package is using the '.bin' binary file format to store probe stream definitions. These files are 64 bit floating point (C: double) arrays with num_of_probes x 3 elements (time, size, ttl).
New probe stream files can easily be created using Matlab or simple C programs.
To create a probe stream using Matlab just follow these steps:
1. p = ones(<num_of_probes>, 3);
2. p(:,1) = <inter_departure_time_vector>;
3. p(:,2) = <packet_size_vector>;
4. p(:,3) = <ttl_vector>; % usually just -1
NOTE: set the TTL values to -1 if you don't want to limit your probes with the time-to-live field.
Now save p as a double precision binary file using for example
5. pswrite('teststream.bin',p);
where pswrite is a simple Matlab function (pswrite.m):
function y = pswrite(x,a)
% pswrite - creates a binary file of timing information
% for rtSend and linuxSend applications
%
% pswrite('filename',idt_array)
%
% idt_array - array of inter departure times
%
fid = fopen(x,'wb');
fwrite(fid,a,'double');
y = fclose(fid);
Note: if you are running Matlab on a different platform you may encounter problems due to the different 'double' representations (e.g. byte order).
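A probe stream file can also be written from a simple C program. The sketch below writes the same data that pswrite would write for an n x 3 matrix: Matlab's fwrite stores a matrix column by column, so the file holds all the times first, then all the sizes, then all the TTL values. The function name write_probe_stream is ours, and we assume the sender expects exactly the layout produced by pswrite, in the byte order of the machine that writes the file.

```c
#include <stdio.h>
#include <stddef.h>

/* Write n probes as an n x 3 array of doubles in column order
 * (all times, then all sizes, then all TTLs), i.e. the order in which
 * Matlab's fwrite stores a matrix. Returns 0 on success, -1 on error.
 * Hypothetical helper for illustration; not part of the package. */
int write_probe_stream(const char *path, const double *idt,
                       const double *size, const double *ttl, size_t n)
{
    FILE *f = fopen(path, "wb");
    int ok;

    if (f == NULL)
        return -1;
    ok = fwrite(idt, sizeof(double), n, f) == n &&
         fwrite(size, sizeof(double), n, f) == n &&
         fwrite(ttl, sizeof(double), n, f) == n;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Fill the ttl array with -1 if the probes should not be limited by the time-to-live field.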
What are the output file formats used by the packet sending applications?
There are two output formats used by the packet sending applications. These are as follows:
- '.tscdt'
- TSC based timestamps created by the rtps.o and rttscreceiver combined Receiver & Receiver Monitor application
- '.dt'
- the 'standard' timestamp file. Each timestamp is stored in 64 bits in 32bit.32bit format (sec:remainder). The first value of the file is 1 (for compatibility with PSIM), which is followed by num_of_probes timestamps in the format specified above. Lost packets are represented as -1 ( 0xffffffffffffffff ).
Some Matlab files supporting the analysis of the '.dt' files are included in the package, and described in the section Post processing utilities for tcpdump, .dt and .tscdt files.
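If you want to inspect '.dt' values from C rather than Matlab, the sketch below splits one 64-bit value into its two 32-bit parts and recognises the lost packet marker. It assumes that the seconds occupy the upper 32 bits and the remainder the lower 32 bits of the value as read on your machine; the helper names are ours, and the unit of the remainder is not interpreted here. Check dt_ts.m for the authoritative decoding.

```c
#include <stdint.h>

#define DT_LOST 0xffffffffffffffffULL   /* lost packet marker (-1) */

/* Non-zero if the value marks a lost packet. */
int dt_is_lost(uint64_t raw)
{
    return raw == DT_LOST;
}

/* Seconds part, assuming it is stored in the upper 32 bits. */
uint32_t dt_seconds(uint64_t raw)
{
    return (uint32_t)(raw >> 32);
}

/* Remainder part, assuming it is stored in the lower 32 bits. */
uint32_t dt_remainder(uint64_t raw)
{
    return (uint32_t)(raw & 0xffffffffULL);
}
```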
Post processing utilities for tcpdump, .dt and .tscdt files
There are a number of post processing utilities provided in the linux sender packages which allow tcpdump, .dt and .tscdt output files to be analysed (mostly using Matlab). The linux packet sender packages are found on the active probing software page. These utilities are as follows:
- dt_ts.m
- This is a Matlab utility for post processing .dt files. It reads in the .dt file which you give as an argument to the function in Matlab and converts it into time values which are stored in an array. It also detects lost packets and gives an idti array which holds the indices of the values which are valid inter-departure times. For more information check the source file.
- tscdttodt
- This is a C program which converts .tscdt files to .dt files, which can then be read into Matlab using the utility above.
- tcpdumptodt
- This is a C program which converts tcpdump outputs to .dt files which can then be used in Matlab via the dt_ts routine above. Refer to the section tcpdump and TSC based timestamps for information about how tcpdump must be run for this program to be able to successfully read in the timestamps.
- 1secx10.bin
- This is not a program but an example binary probe stream file which sends an initial probe + 10 probe packets with 1 second inter-departure time and a TTL of -1. It is provided as an example of how to use the packet sender.
Post processing utilities for round trip time tcpdump and .dt files
There are a number of post processing utilities provided in the rtt post processing package which allow tcpdump and .dt output files to be further analysed (mostly using Matlab). The rtt post processing packages are found on the active probing software page. These utilities are as follows:
- dt_rtt.m
- This is a Matlab utility for post processing .dt files. It reads in the .dt file which you give as an argument to the function in Matlab and converts it into time values which are stored in an array of (hops,queries). It also detects lost packets and marks them in the output dt array as NaN for further processing. When comparing with the .dt files generated by the DAG monitor it is essential that the lost packets correspond, i.e. that the monitor and the sender/receiving program detect the same lost packets in the same order. For more information check the source file.
- mean_dt.m
- This is a Matlab utility that takes two input arrays of round trip times, i.e. one from the sender/receiver and one from the monitor, and takes the difference between them, marking further NaN values where a round trip time cannot be calculated. It then returns the mean difference between the two arrays over the number of queries, per hop. Click on the title to see the MATLAB file.
- tcpdump2dt-rtt.c
- This is a C program which converts two tcpdump outputs, one for the receiver stream and one for the sender stream, to a .dt output file. It takes an optional number of hops argument (defaults to 1), which is written as the first value of the .dt file. The .dt file can then be used in Matlab via the dt_rtt routine above. The shell script below produces the two tcpdump outputs from the DAG monitor stream .d3h file: a receiver tcpdump obtained by filtering for ICMP responses and a sender tcpdump obtained by filtering for UDP packets of a given packet size. Click on the title to see the C code.
- dag2rtt_tcpdump.sh
- This is the shell script that prepares the sender and receiver tcpdump streams, based on the DAG monitor .d3h file, to be input to the tcpdump2dt-rtt program. It takes the .d3h trace file, the receive host (e.g. www.kth.se) and a UDP packet size as its arguments. It also takes an optional argument for the number of hops for tcpdump2dt-rtt. The DAG 3.5E software package uses a different packet filter, dagconvert, from the DAG 3.2E package, which uses dagbpf. At present the shell script is configured for use with the DAG 3.5E software package; however, by commenting out the two dagconvert filter lines and uncommenting the two dagbpf lines, the script can also be used with the DAG 3.2E software package. Click on the title to see the script.
How do I access the conversion constant via the shared memory segment set up by tscskewd?
This can be done using the IPC shared memory functions available from <sys/ipc.h> and <sys/shm.h>. Note that tscskewd uses the shm_key 731127 to identify itself. An example of reading out the conversion constant within a C program is given below:
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/mman.h>

#define SHMTSC_KEY 731127

/* shared memory segment variables */
int shmflg;
volatile char *shmtsc;
struct shmid_ds shmid_ds_buf;
int shmtscid;
/* The offsets below follow the original layout, which steps through the
   segment in units of sizeof(long); this must match the tscskewd build. */
volatile unsigned long *pref_tsc_countl;
volatile unsigned long *pref_tsc_counth;
volatile unsigned long *pref_tsc_time;
volatile char *pwflag;

unsigned long long count = 0; /* tsc conversion constant - count */
unsigned long long time = 0;  /* tsc conversion constant - time */

void fetch_cc(void);
void shm_cleanup(void);

int main(void)
{
    fetch_cc();
    fprintf(stdout, "tscskewd shm: count: %llu time: %llu\n", count, time);
    shm_cleanup();
    return 0;
}

void fetch_cc(void)
{
    void *addr;

    /* create (or open) the shm segment */
    shmflg = IPC_CREAT | 0777;
    shmtscid = shmget(SHMTSC_KEY, 1024, shmflg);
    if (shmtscid < 0) {
        fprintf(stderr, "Shm init failed\n");
        exit(-1);
    }

    /* attach shm */
    addr = shmat(shmtscid, NULL, 0);
    if (addr == (void *)-1) {
        fprintf(stderr, "Shm attach failed\n");
        exit(-1);
    }
    shmtsc = addr;

    /* try to prevent swapping - works only for super-user */
    shmctl(shmtscid, SHM_LOCK, 0);

    pref_tsc_countl = (volatile unsigned long *)shmtsc;
    pref_tsc_counth = (volatile unsigned long *)(shmtsc + sizeof(long));
    pref_tsc_time = (volatile unsigned long *)(shmtsc + 2 * sizeof(long));
    pwflag = shmtsc + 3 * sizeof(long);

    /* check shm status: if we are the only process attached,
       tscskewd is not running */
    shmctl(shmtscid, IPC_STAT, &shmid_ds_buf);
    if (shmid_ds_buf.shm_nattch == 1) {
        fprintf(stderr, "tscskewd not running\n");
        shm_cleanup();
        exit(-1);
    } else {
        /* wait while tscskewd is writing */
        while (pwflag[0] == 'w') {}
        count = pref_tsc_counth[0];
        count = count << 32;
        count = count | pref_tsc_countl[0];
        time = pref_tsc_time[0];
        if (!count || !time) {
            fprintf(stderr, "Warning: tscskewd not initialised, please wait for a count and time value to be established\n");
            shm_cleanup();
            exit(-1);
        }
    }
}

void shm_cleanup(void)
{
    /* unlock memory */
    munlockall();

    /* detach the shared memory segment */
    shmdt((char *)shmtsc);

    /* check shm status */
    shmctl(shmtscid, IPC_STAT, &shmid_ds_buf);
    if (shmid_ds_buf.shm_nattch == 0) {
        /* nobody else is attached: destroy the shm segment */
        shmctl(shmtscid, IPC_RMID, 0);
    }
}
Using these two values (which together form the conversion constant) it is possible to convert raw TSC counter values into timestamps. For an example of how to do this refer to the tscdttodt.c file available in the TSC linux packet sender package.
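As a rough illustration of the conversion, the sketch below scales a difference of two raw TSC values by the ratio time/count, where count is a number of CPU clock cycles and time the corresponding duration in microseconds. The function name tsc_to_usec is ours and the simple proportional scaling is an assumption about how the constant is meant to be applied; tscdttodt.c remains the reference.

```c
/* Convert a difference of raw TSC values (ticks) to microseconds,
 * given that count clock cycles correspond to time_us microseconds.
 * Returns a negative value if count is zero (constant not available).
 * Hypothetical helper for illustration; not part of the package. */
long double tsc_to_usec(unsigned long long ticks,
                        unsigned long long count,
                        unsigned long long time_us)
{
    if (count == 0)
        return -1.0L;
    /* floating point is used so that ticks * time_us cannot overflow */
    return (long double)ticks * (long double)time_us / (long double)count;
}
```

For example, with count = 1000000000 and time = 1000000 (a 1 GHz clock measured over one second), a difference of 1000000000 ticks converts to 1000000 microseconds.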