James Healy and Lawrence Stewart
Centre for Advanced Internet Architectures,
Swinburne University of Technology, Melbourne, Australia
CRICOS number 00111D
9th October, 2007

----------------------------------------------
OVERVIEW
----------------------------------------------

DPD (Deterministic Packet Discard) v1.0 is a FreeBSD 7 kernel module that
introduces predictable packet loss into TCP connections flowing through the
kernel. It is particularly useful in TCP research, where it allows direct
comparison of behaviour triggered by a loss event that always occurs at the
same point in the connection.

Whilst it was developed for and has primarily been tested on FreeBSD 7, DPD
should work on FreeBSD 5 and 6 as well.

-----------------------
LICENCE
-----------------------

The DPD code is released under a BSD licence. It leverages some of the hash
code written by Tobias Weingartner, which is released under a BSD licence.
Refer to the licence headers in each source file for further details.

-----------------------
USAGE
-----------------------

Make sure you have the FreeBSD system sources installed under /usr/src.
These can be installed using sysinstall if they are not currently installed.

To build the module, simply run:

    make

To load the compiled module into the running kernel, run the following
command as root:

    kldload ./dpd.ko

To unload the module from the running kernel, run the following as root:

    kldunload dpd

To delete all artifacts created by compiling the module, run:

    make clean

-----------------------
CONFIGURATION
-----------------------

DPD utilises the FreeBSD sysctl interface to export its configuration
variables to user-space. DPD provides 4 sysctl configuration variables.
The names of the 4 variables and their default values on load are:

    net.inet.dpd.enabled=0
    net.inet.dpd.lose_in=1000
    net.inet.dpd.lose_out=1000
    net.inet.dpd.processing_direction=inout

The net.inet.dpd.enabled variable controls whether the module actively
monitors TCP flows and drops packets as configured. By default, the value is
set to 0, which means the module will not drop any packets. Having the module
loaded with net.inet.dpd.enabled set to 0 will have no impact on the
performance of the network stack, as the packet filtering hooks are only
inserted when net.inet.dpd.enabled is set to 1.

To enable the module's operations, run the following command as root:

    sysctl net.inet.dpd.enabled=1

The net.inet.dpd.lose_in and net.inet.dpd.lose_out variables control the set
of packet numbers to drop for each TCP connection. For example, the default
value of 1000 means that, when enabled, DPD will drop the 1000th packet of
each TCP flow passing through the kernel.

These variables take a comma-separated list of packet numbers and ranges of
numbers of the form:

    x,y-z

The x refers to an individual packet number, whilst the y-z refers to an
inclusive range of packet numbers to drop. For example, a list like
"1,50-52,1000" would cause DPD to drop the 1st, 50th, 51st, 52nd and 1000th
packet of each TCP flow.

To change the set of inbound packet numbers to lose to "1,50-52,1000", run
the following command as root:

    sysctl net.inet.dpd.lose_in=1,50-52,1000

The net.inet.dpd.processing_direction variable controls the way the "lose_in"
and "lose_out" variable settings affect the packets that actually get dropped.
There are four possible values this variable can be set to: "in", "out",
"inout" or "combined".
To change the processing direction to "in" or "out", run either of the
following commands as root:

    sysctl net.inet.dpd.processing_direction=in
    sysctl net.inet.dpd.processing_direction=out

Setting net.inet.dpd.processing_direction to "in" or "out" will result in
only packets in the inbound or outbound direction respectively being dropped.
The set of packet numbers to drop is determined by the "lose_in" or
"lose_out" variable respectively.

Setting net.inet.dpd.processing_direction to "inout" will result in inbound
packets being dropped according to the setting of the "lose_in" variable, and
outbound packets being dropped according to the setting of the "lose_out"
variable.

To change the processing direction to "inout", run the following command as
root:

    sysctl net.inet.dpd.processing_direction=inout

Setting net.inet.dpd.processing_direction to "combined" will result in
packets being dropped independent of the direction they are travelling. For
example, you can use the "combined" processing direction to drop the 1000th
packet of each flow regardless of whether the packet is travelling inbound or
outbound. The set of packets to drop when using "combined" mode is only
affected by the "lose_in" variable.

To change the processing direction to "combined", run the following command
as root:

    sysctl net.inet.dpd.processing_direction=combined

----------------------------------------------
OUTPUT FORMAT
----------------------------------------------

DPD outputs useful information to the console at module load and unload time.
All output is written in plain ASCII text.

Note: The "\" present in the example load and unload messages in this section
indicates a line continuation and is not part of the actual message.

The module load message is written to the terminal when the module is loaded
into the running kernel. The text below shows an example module load message.
The fields are tab delimited key-value pairs which provide some basic
information.

module_load_time_secs=1187923474 module_load_time_usecs=587926 \
dpdver=1.0

Field descriptions are as follows:

module_load_time_secs: Time at which the module was loaded, in seconds
    since the UNIX epoch.
module_load_time_usecs: Time at which the module was loaded, in
    microseconds since module_load_time_secs.
dpdver: Version of DPD.

The module unload message is written to the terminal when the module is
unloaded from the running kernel. The text below shows an example module
unload message. The fields are tab delimited key-value pairs which provide
statistics about DPD operations since the module was loaded.

module_unload_time_secs=1191464985 module_unload_time_usecs=409504 \
num_inbound_tcp_pkts=156771 num_outbound_tcp_pkts=208565 \
total_tcp_pkts=365336 num_inbound_skipped_pkts_malloc=0 \
num_outbound_skipped_pkts_malloc=0 num_inbound_skipped_pkts_icb=0 \
num_outbound_skipped_pkts_icb=0 total_skipped_tcp_pkts=0 \
flow_list=136.186.229.252:52059-136.186.229.102:5001;dropped_in=1;dropped_out=1,

Field descriptions are as follows:

module_unload_time_secs: Time at which the module was unloaded, in seconds
    since the UNIX epoch.
module_unload_time_usecs: Time at which the module was unloaded, in
    microseconds since module_unload_time_secs.
num_inbound_tcp_pkts: Number of TCP packets that traversed up the network
    stack. This only includes inbound TCP packets during the periods when
    DPD was enabled.
num_outbound_tcp_pkts: Number of TCP packets that traversed down the
    network stack. This only includes outbound TCP packets during the
    periods when DPD was enabled.
total_tcp_pkts: The sum of num_inbound_tcp_pkts and num_outbound_tcp_pkts.
num_inbound_skipped_pkts_malloc: Number of inbound packets that were not
    processed because of failed malloc() calls.
num_outbound_skipped_pkts_malloc: Number of outbound packets that were not
    processed because of failed malloc() calls.
num_inbound_skipped_pkts_icb: Number of inbound packets that were not
    processed because of failure to find the IP control block associated
    with the packet.
num_outbound_skipped_pkts_icb: Number of outbound packets that were not
    processed because of failure to find the IP control block associated
    with the packet.
total_skipped_tcp_pkts: The sum of all skipped packet counters.
flow_list: A CSV list of TCP flows that were seen since the module was
    loaded. Each flow entry in the CSV list is formatted as
    "ip1:port1-ip2:port2;dropped_in=X;dropped_out=Y", where X represents
    the number of inbound packets for the flow that were dropped, and Y
    represents the number of outbound packets for the flow that were
    dropped. The ordering of the flow endpoints is arbitrary. If there are
    no entries in the list (i.e. no packets belonging to TCP flows were
    processed), the value will be blank. If there is at least one entry in
    the list, a trailing comma will always be present.

----------------------------------------------
KNOWN LIMITATIONS
----------------------------------------------

Current known limitations of the DPD software and any relevant workarounds
are outlined below:

1. The module does not handle IPv6 at all, including TCP carried within IPv6
   packets. It would be relatively straightforward to patch the code to make
   this possible, but there is currently no simple workaround.

2. The hash table used within the code is sized to hold 65536 flows. This is
   not a hard limit, because chaining is used to handle collisions within
   the hash table structure. However, we suspect (based on analogies with
   other hash table performance data) that the hash table lookup performance
   (and therefore the module's packet processing performance) will degrade
   in an exponential manner as the number of unique flows handled in a
   module load/unload cycle approaches and surpasses 65536.

3. The module does not currently provide a means to configure recurring
   packet loss.
   This functionality is on the TODO list for a future version.

4. The module currently must be run on one of the TCP flow endpoints in
   order to function correctly. It would be useful to modify DPD so that it
   could run on a transparent bridge similar to the FreeBSD Dummynet module.
   This functionality is on the TODO list for a future version.

5. If using DPD on a machine that is also running SIFTR or other modules
   utilising the PFIL architecture (e.g. IPFW), the order in which you load
   the modules is important. You should kldload DPD first, as this will
   ensure TCP packets are dropped by DPD before any of the other modules
   "see" and process them.

----------------------------------------------
RELATED READING
----------------------------------------------

This software was developed as part of the NewTCP research project at
Swinburne University's Centre for Advanced Internet Architectures. More
information on the project is available at:

    http://caia.swin.edu.au/urp/newtcp/

A number of software tools and technical reports related to experimental TCP
research in general are available respectively at:

    http://caia.swin.edu.au/urp/newtcp/tools.html
    http://caia.swin.edu.au/urp/newtcp/papers.html

At the time of writing, the following software tools may be of interest:

    Statistical Information For TCP Research (SIFTR)

At the time of writing, the following reports may be of interest:

    CAIA technical report 070824A:
    "Characterising the Behaviour and Performance of SIFTR v1.1.0"

    CAIA technical report 070717B:
    "Tuning and Testing the FreeBSD 6 TCP Stack"

    CAIA technical report 070622A:
    "An Introduction to FreeBSD 6 Kernel Hacking"
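
----------------------------------------------
EXAMPLE: PARSING DPD LISTS AND OUTPUT
----------------------------------------------

The following Python sketch is not part of DPD and does not reproduce the
kernel module's own parsing. It illustrates, under stated assumptions, how
the list syntax accepted by net.inet.dpd.lose_in and net.inet.dpd.lose_out
expands into packet numbers, and how an unload message could be split into
its counters and flow entries. The function names expand_drop_list and
parse_unload_message are hypothetical. The parser assumes the whole unload
message has been captured as a single tab-delimited line, without the "\"
continuation markers, as described in the OUTPUT FORMAT section.

```python
def expand_drop_list(spec):
    """Expand a list such as "1,50-52,1000" into sorted packet numbers.

    Illustration only: duplicates collapse here, and how DPD itself treats
    overlapping entries is not stated in this document.
    """
    numbers = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            # "y-z" is an inclusive range of packet numbers.
            low, high = (int(part) for part in item.split("-", 1))
            numbers.update(range(low, high + 1))
        else:
            # "x" is an individual packet number.
            numbers.add(int(item))
    return sorted(numbers)


def parse_unload_message(line):
    """Split one unload message into (counters, flows).

    Assumes the fields are separated by tabs and that every field other
    than flow_list holds an integer, as in the example unload message.
    """
    fields = {}
    for token in line.strip().split("\t"):
        if not token:
            continue
        # Split on the first "=" only: flow_list values contain "=" too.
        key, _, value = token.partition("=")
        fields[key] = value

    flows = []
    # The trailing comma yields an empty final entry, which is skipped.
    for entry in fields.pop("flow_list", "").split(","):
        if not entry:
            continue
        endpoints, *counters = entry.split(";")
        flow = {"endpoints": endpoints}
        for counter in counters:
            name, _, count = counter.partition("=")
            flow[name] = int(count)
        flows.append(flow)

    stats = {key: int(value) for key, value in fields.items()}
    return stats, flows


if __name__ == "__main__":
    print(expand_drop_list("1,50-52,1000"))
    message = (
        "module_unload_time_secs=1191464985\ttotal_tcp_pkts=365336\t"
        "flow_list=136.186.229.252:52059-136.186.229.102:5001;"
        "dropped_in=1;dropped_out=1,"
    )
    stats, flows = parse_unload_message(message)
    print(stats)
    print(flows)
```

Comparing the dropped_in and dropped_out counts of each flow against the
length of the expanded list is one way to check that a test run lost the
packets that were configured.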