Tuning TCP for Linux 2.4 and 2.6
NB: Recent versions of Linux (version 2.6.17 and later) have full autotuning with 4 MB maximum buffer sizes. Except in some rare cases, manual tuning is unlikely to substantially improve the performance of these kernels over most network paths, and is not generally recommended
Since autotuning and large default buffer sizes were released progressively over a succession of different kernel versions, it is best to inspect and only adjust the tuning as needed. When you upgrade kernels, you may want to consider removing any local tuning.
All system parameters can be read or set by accessing special files in the /proc file system. E.g.:
cat /proc/sys/net/ipv4/tcp_moderate_rcvbuf
If the parameter tcp_moderate_rcvbuf is present and has value 1 then autotuning is in effect. With autotuning, the receiver buffer size (and TCP window size) is dynamically updated (autotuned) for each connection. (Sender side autotuning has been present and unconditionally enabled for many years now).
The per connection memory space defaults are set with two 3 element arrays:
/proc/sys/net/ipv4/tcp_rmem - memory reserved for TCP rcv buffers
/proc/sys/net/ipv4/tcp_wmem - memory reserved for TCP snd buffers
These are arrays of three values: minimum, initial and maximum buffer size. They are used to set the bounds on autotuning and balance memory usage while under memory stress. Note that these are controls on the actual memory usage (not just TCP window size) and include memory used by the socket data structures as well as memory wasted by short packets in large buffers. The maximum values have to be larger than the BDP of the path by some suitable overhead.
With autotuning, the middle value just determines the initial buffer size. It is best to set it to some optimal value for typical small flows. With autotuning, excessively large initial buffer waste memory and can even hurt performance.
If autotuning is not present (Linux 2.4 before 2.4.27 or Linux 2.6 before 2.6.7), you may want to get a newer kernel. Alternately, you can adjust the default socket buffer size for all TCP connections by setting the middle tcp_rmem value to the calculated BDP. This is NOT recommended for kernels with autotuning. Since the sending side is autotuned, this is never recommended for tcp_wmem.
The maximum buffer size that applications can request (the maximum acceptable values for SO_SNDBUF and SO_RCVBUF arguments to the setsockopt() system call) can be limited with /proc variables:
/proc/sys/net/core/rmem_max - maximum receive window
/proc/sys/net/core/wmem_max - maximum send window
The kernel sets the actual memory limit to twice the requested value (effectively doubling rmem_max and wmem_max) to provide for sufficient memory overhead. You do not need to adjust these unless your are planing to use some form of application tuning.
NB: Manually adjusting socket buffer sizes with setsockopt() disables autotuning. Application that are optimized for other operating systems may implicitly defeat Linux autotuning.
The following values (which are the defaults for 2.6.17 with more than 1 GByte of memory) would be reasonable for all paths with a 4MB BDP or smaller (you must be root):
echo 1 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf
echo 108544 > /proc/sys/net/core/wmem_max
echo 108544 > /proc/sys/net/core/rmem_max
echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
Do not adjust tcp_mem unless you know exactly what you are doing. This array (in units of pages) determines how the system balances the total network buffer space against all other LOWMEM memory usage. The three elements are initialized at boot time to appropriate fractions of the available system memory.
You do not need to adjust rmem_default or wmem_default (at least not for TCP tuning). These are the default buffer sizes for non-TCP sockets (e.g. unix domain and UDP sockets).
All standard advanced TCP features are on by default. You can check them by:
cat /proc/sys/net/ipv4/tcp_timestamps
cat /proc/sys/net/ipv4/tcp_window_scaling
cat /proc/sys/net/ipv4/tcp_sack
Linux supports both /proc and sysctl (using alternate forms of the variable names - e.g. net.core.rmem_max) for inspecting and adjusting network tuning parameters. The following is a useful shortcut for inspecting all tcp parameters:
sysctl -a | fgrep tcp
For additional information on kernel variables, look at the documentation included with your kernel source, typically in some location such as /usr/src/linux-
If you would like to have these changes to be preserved across reboots, you can add the tuning commands to your the file /etc/rc.d/rc.local .
Autotuning was prototyped under the Web100 project. Web100 also provides complete TCP instrumentation and some additional features to improve performance on paths with very large BDP.