Diffstat (limited to 'doc')
-rw-r--r-- doc/helper/conntrackd.conf 22
-rw-r--r-- doc/manual/conntrack-tools.tmpl 214
-rw-r--r-- doc/misc/README 187
-rw-r--r-- doc/misc/clusterip.sh 254
4 files changed, 569 insertions, 108 deletions
diff --git a/doc/helper/conntrackd.conf b/doc/helper/conntrackd.conf
index 6ffe008..efa318a 100644
--- a/doc/helper/conntrackd.conf
+++ b/doc/helper/conntrackd.conf
@@ -3,11 +3,21 @@
#
Helper {
- # Before this, you have to make sure you have registered the `ftp'
- # user-space helper stub via:
+ #
+ # Set up the userspace helpers when the daemon is started. If unset,
+ # you have to manually set up the user-space helper stub, e.g.
#
# nfct add helper ftp inet tcp
#
+ # This new setting simplifies new deployments, so it is recommended to
+ # turn it on. On existing deployments, make sure to remove the nfct
+ # command invocation since it is not required anymore.
+ #
+ # Default: no (for backward compatibility reasons)
+ # Recommended: yes
+ #
+ Setup yes
+
Type ftp inet tcp {
#
# Set NFQUEUE number you want to use to receive traffic from
@@ -73,7 +83,7 @@ Helper {
}
}
Type mdns inet udp {
- QueueNum 6
+ QueueNum 5
QueueLen 10240
Policy mdns {
ExpectMax 8
@@ -81,7 +91,7 @@ Helper {
}
}
Type ssdp inet udp {
- QueueNum 5
+ QueueNum 6
QueueLen 10240
Policy ssdp {
ExpectMax 8
@@ -89,7 +99,7 @@ Helper {
}
}
Type ssdp inet tcp {
- QueueNum 5
+ QueueNum 7
QueueLen 10240
Policy ssdp {
ExpectMax 8
@@ -97,7 +107,7 @@ Helper {
}
}
Type slp inet udp {
- QueueNum 7
+ QueueNum 8
QueueLen 10240
Policy slp {
ExpectMax 8
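Reviewer note on the queue renumbering in this file: in the hunks shown, the old values assigned queue 5 to both ssdp entries, while the new values (5, 6, 7, 8) give each of the four helpers its own queue. A quick way to spot repeated values in a configuration is sketched below in shell. `find_duplicate_queues` is a hypothetical helper name, not part of conntrack-tools; the sample fragment only reproduces the four entries touched by this diff; and whether a shared queue number is actually a problem for a given setup is an assumption to verify.

```shell
#!/bin/sh
# Print every QueueNum value that appears more than once in a config file.
# find_duplicate_queues is a hypothetical helper, not part of conntrack-tools.
find_duplicate_queues()
{
	awk '$1 == "QueueNum" { print $2 }' "$1" | sort -n | uniq -d
}

# Sample fragment using the queue numbers from before this change
sample=$(mktemp)
cat > "$sample" <<'EOF'
Helper {
	Type mdns inet udp {
		QueueNum 6
	}
	Type ssdp inet udp {
		QueueNum 5
	}
	Type ssdp inet tcp {
		QueueNum 5
	}
	Type slp inet udp {
		QueueNum 7
	}
}
EOF

# prints: 5
find_duplicate_queues "$sample"
rm -f "$sample"
```

Running the same function against a file with the new values (5, 6, 7, 8) prints nothing.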
diff --git a/doc/manual/conntrack-tools.tmpl b/doc/manual/conntrack-tools.tmpl
index 739b7f1..822dd49 100644
--- a/doc/manual/conntrack-tools.tmpl
+++ b/doc/manual/conntrack-tools.tmpl
@@ -19,7 +19,7 @@
</authorgroup>
<copyright>
- <year>2008-2012</year>
+ <year>2008-2020</year>
<holder>Pablo Neira Ayuso</holder>
</copyright>
@@ -35,10 +35,8 @@
</legalnotice>
<releaseinfo>
- This document details how to install and configure the
- <ulink url="http://conntrack-tools.netfilter.org">conntrack-tools</ulink>
- &gt;= 1.4.0. This document will evolve in the future to cover new features
- and changes.</releaseinfo>
+ This document details how to install and to configure the <ulink url="http://conntrack-tools.netfilter.org">conntrack-tools</ulink>.
+ </releaseinfo>
</bookinfo>
@@ -46,21 +44,13 @@
<chapter id="introduction"><title>Introduction</title>
- <para>This document should be a kick-off point to install and configure the
- <ulink url="http://conntrack-tools.netfilter.org">conntrack-tools</ulink>.
- If you find any error or imprecision in this document, please send an email
- to the author, it will be appreciated.</para>
+<para>This documentation describes how to install and configure the <ulink url="http://conntrack-tools.netfilter.org">conntrack-tools</ulink>.</para>
- <para>In this document, the author assumes that the reader is familiar with firewalling concepts and iptables in general. If this is not your case, I suggest you to read the iptables documentation before going ahead. Moreover, the reader must also understand the difference between <emphasis>stateful</emphasis> and <emphasis>stateless</emphasis> firewalls. If this is not your case, I strongly suggest you to read the article <ulink url="http://people.netfilter.org/pablo/docs/login.pdf">Netfilter's Connection Tracking System</ulink> published in <emphasis>:login; the USENIX magazine</emphasis>. That document contains a general description that should help to clarify the concepts.</para>
-
-<para>If you do not fulfill the previous requirements, this documentation is likely to be a source of frustration. Probably, you wonder why I'm insisting on these prerequisites too much, the fact is that if your iptables rule-set is <emphasis>stateless</emphasis>, it is very likely that the <emphasis>conntrack-tools</emphasis> will not be of any help for you. You have been warned!</para>
+<para>This documentation assumes that the reader is familiar with basic firewalling and Netfilter concepts. You also must understand the difference between <emphasis>stateless</emphasis> and <emphasis>stateful</emphasis> firewalls. Otherwise, please read <ulink url="http://people.netfilter.org/pablo/docs/login.pdf">Netfilter's Connection Tracking System</ulink> published in <emphasis>:login; the USENIX magazine</emphasis> for a quick reference.</para>
</chapter>
<chapter id="what"><title>What are the conntrack-tools?</title>
- <para>The conntrack-tools are a set of free software tools for GNU/Linux that allow system administrators interact, from user-space, with the in-kernel <ulink url="http://people.netfilter.org/pablo/docs/login.pdf">Connection Tracking System</ulink>, which is the module that enables stateful packet inspection for iptables. Probably, you did not hear about this module so far. However, if any of the rules of your rule-set use the <emphasis>state</emphasis> or <emphasis>ctstate</emphasis> iptables matches, you are indeed using it.
- </para>
-
<para>The <ulink url="http://conntrack-tools.netfilter.org">conntrack-tools</ulink> package contains two programs:</para>
<itemizedlist>
@@ -72,17 +62,18 @@
</listitem>
</itemizedlist>
- <para>Although the name of both tools is very similar - and you can blame me for that, I'm not a marketing guy - they are used for very different tasks.</para>
+<para>Mind the trailing <emphasis>d</emphasis>: <emphasis>conntrack</emphasis> is the command line utility and <emphasis>conntrackd</emphasis> is the daemon.</para>
</chapter>
<chapter id="requirements"><title>Requirements</title>
- <para>You have to install the following software in order to get the <emphasis>conntrack-tools</emphasis> working. Make sure that you have installed them correctly before going ahead:</para>
+<para>If you are using the Linux kernel that your distribution provides, then you most likely can skip this.</para>
+
+<para>If you compile your own Linux kernel, then please make sure the following options are enabled.</para>
+
+<para>You require a <ulink url="http://www.kernel.org">Linux kernel</ulink> version &gt;= 2.6.18.</para>
- <itemizedlist>
- <listitem>
- <para><ulink url="http://www.kernel.org">Linux kernel</ulink> version &gt;= 2.6.18 that, at least, has support for:</para>
<itemizedlist>
<listitem>
<para>Connection Tracking System.</para>
@@ -123,19 +114,47 @@
</itemizedlist>
</listitem>
</itemizedlist>
- <note><title>Verifying kernel support</title>
- <para>
- Make sure you have loaded <emphasis>nf_conntrack</emphasis>, <emphasis>nf_conntrack_ipv4</emphasis> (if your setup also supports IPv6, <emphasis>nf_conntrack_ipv6</emphasis>) and <emphasis>nf_conntrack_netlink</emphasis>.
- </para>
- </note>
- </listitem>
+
+<note><title>Validating Linux kernel support</title>
+<para>You can validate that your Linux kernel supports the <emphasis>conntrack-tools</emphasis> through <emphasis>modinfo</emphasis>.</para>
+
+ <programlisting>
+ # modinfo nf_conntrack
+filename: /lib/modules/5.2.0/kernel/net/netfilter/nf_conntrack.ko
+license: GPL
+alias: nf_conntrack-10
+alias: nf_conntrack-2
+alias: ip_conntrack
+depends: nf_defrag_ipv6,libcrc32c,nf_defrag_ipv4
+retpoline: Y
+intree: Y
+name: nf_conntrack
+vermagic: 5.7.0+ SMP preempt mod_unload modversions
+parm: tstamp:Enable connection tracking flow timestamping. (bool)
+parm: acct:Enable connection tracking flow accounting. (bool)
+parm: nf_conntrack_helper:Enable automatic conntrack helper assignment (default 0) (bool)
+parm: expect_hashsize:uint
+parm: enable_hooks:Always enable conntrack hooks (bool)
+</programlisting>
+
+<para>Make sure <emphasis>nf_conntrack_netlink</emphasis> is also available.</para>
+</note>
+
+<para>You also need to install the following library dependencies:</para>
+
+ <itemizedlist>
<listitem>
- <para>libnfnetlink: the netfilter netlink library use the official release available in <ulink url="http://www.netfilter.org">netfilter.org</ulink></para>
+ <para>libnfnetlink: the netfilter netlink library. Use the official release available at <ulink url="http://www.netfilter.org/projects/libnfnetlink">netfilter.org</ulink>.</para>
</listitem>
<listitem>
- <para>libnetfilter_conntrack: the netfilter netlink library use the official release available in <ulink url="http://www.netfilter.org">netfilter.org</ulink></para>
+ <para>libnetfilter_conntrack: the netfilter connection tracking library. Use the official release available at <ulink url="http://www.netfilter.org/projects/libnetfilter_conntrack">netfilter.org</ulink>.</para>
</listitem>
</itemizedlist>
+
+<note><title>Installing library dependencies</title>
+<para>Your distribution most likely also provides packages for this software, so you do not have to compile it yourself.</para>
+</note>
+
</chapter>
<chapter id="Installation"><title>Installation</title>
@@ -148,18 +167,8 @@
(non-root)$ make
(root) # make install</programlisting>
-<note><title>Fedora Users</title>
- <para>If you are installing the libraries in /usr/local/, do not forget to do the following things:</para>
- <itemizedlist>
- <listitem><para>PKG_CONFIG_PATH=/usr/local/lib/pkgconfig; export PKG_CONFIG_PATH</para></listitem>
- <listitem><para>Add `/usr/local/lib' to your /etc/ld.so.conf file and run `ldconfig'</para></listitem>
- </itemizedlist>
- <para>Check `ldd' for trouble-shooting, read <ulink url="http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html">this</ulink> for more information on how libraries work.</para>
-</note>
-
-<note><title>Verifying kernel support</title>
- <para>To check that the modules are enabled in the kernel, run <emphasis>`conntrack -E'</emphasis> and generate traffic, you should see flow events reporting new connections and updates.
- </para>
+<note><title>Installing conntrack and conntrackd</title>
+<para>Your distribution most likely also provides packages for this software, so you do not have to compile it yourself.</para>
</note>
</chapter>
@@ -174,7 +183,7 @@
tcp 6 431698 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34849 dport=993 packets=244 bytes=18723 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34849 packets=203 bytes=144731 [ASSURED] mark=0 use=1
</programlisting>
-<para>The command line tool <emphasis>conntrack</emphasis> can be used to display the same information:</para>
+<para>You can list the existing flows using the <emphasis>conntrack</emphasis> utility via the <emphasis>-L</emphasis> command:</para>
<programlisting>
# conntrack -L
tcp 6 431982 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34846 dport=993 packets=169 bytes=14322 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34846 packets=113 bytes=34787 [ASSURED] mark=0 use=1
@@ -182,25 +191,23 @@
conntrack v1.4.6 (conntrack-tools): 2 flow entries have been shown.
</programlisting>
-<para>You can natively filter the output without using <emphasis>grep</emphasis>:</para>
+ <para>The <emphasis>conntrack</emphasis> syntax is similar to <emphasis>iptables</emphasis>.</para>
+
+<para>You can filter the listing without using <emphasis>grep</emphasis>:</para>
<programlisting>
# conntrack -L -p tcp --dport 993
tcp 6 431982 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34846 dport=993 packets=169 bytes=14322 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34846 packets=113 bytes=34787 [ASSURED] mark=0 use=1
conntrack v1.4.6 (conntrack-tools): 1 flow entries have been shown.
</programlisting>
-<para>Update the mark based on a selection, this allows you to change the mark of an entry without using the CONNMARK target:</para>
+<para>You can update the ct mark, extending the previous example:</para>
<programlisting>
# conntrack -U -p tcp --dport 993 --mark 10
tcp 6 431982 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34846 dport=993 packets=169 bytes=14322 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34846 packets=113 bytes=34787 [ASSURED] mark=10 use=1
conntrack v1.4.6 (conntrack-tools): 1 flow entries have been updated.
</programlisting>
-<para>Delete one entry, this can be used to block traffic if:</para>
-<itemizedlist>
- <listitem><para>You have a stateful rule-set that blocks traffic in INVALID state.</para></listitem>
- <listitem><para>You set <emphasis>/proc/sys/net/netfilter/nf_conntrack_tcp_loose</emphasis> to zero.</para></listitem>
-</itemizedlist>
+<para>You can also delete entries:</para>
<programlisting>
# conntrack -D -p tcp --dport 993
@@ -208,7 +215,14 @@ conntrack v1.4.6 (conntrack-tools): 1 flow entries have been updated.
conntrack v1.4.6 (conntrack-tools): 1 flow entries have been deleted.
</programlisting>
-<para>Display the connection tracking events:</para>
+<para>
+This allows you to block TCP traffic if:</para>
+<itemizedlist>
+ <listitem><para>You have a stateful rule-set that drops traffic in INVALID state.</para></listitem>
+ <listitem><para>You set <emphasis>/proc/sys/net/netfilter/nf_conntrack_tcp_loose</emphasis> to zero.</para></listitem>
+</itemizedlist>
+
+<para>You can also listen to the connection tracking events:</para>
<programlisting>
# conntrack -E
[NEW] udp 17 30 src=192.168.2.100 dst=192.168.2.1 sport=57767 dport=53 [UNREPLIED] src=192.168.2.1 dst=192.168.2.100 sport=53 dport=57767
@@ -218,20 +232,23 @@ conntrack v1.4.6 (conntrack-tools): 1 flow entries have been deleted.
[UPDATE] tcp 6 432000 ESTABLISHED src=192.168.2.100 dst=66.102.9.104 sport=33379 dport=80 src=66.102.9.104 dst=192.168.2.100 sport=80 dport=33379 [ASSURED]
</programlisting>
-<para>You can also display the existing flows in XML format, filter the output based on the NAT handling applied, etc.</para>
+<para>There are many options, including support for XML output, more advanced filters, and so on. Please check the manpage for more information.</para>
</chapter>
<chapter id="settingup"><title>Setting up conntrackd: the daemon</title>
- <para>The daemon <emphasis>conntrackd</emphasis> supports two working modes:</para>
+ <para>The <emphasis>conntrackd</emphasis> daemon supports three modes:</para>
- <itemizedlist>
+ <itemizedlist>
<listitem>
- <para><emphasis>State table synchronization</emphasis>: the daemon can be used to synchronize the connection tracking state table between several firewall replicas. This can be used to deploy fault-tolerant stateful firewalls. This is the main feature of the daemon.</para>
+ <para><emphasis>State table synchronization</emphasis>, to synchronize the connection tracking state table between several firewalls in High Availability (HA) scenarios.</para>
</listitem>
<listitem>
- <para><emphasis>Flow-based statistics collection</emphasis>: the daemon can be used to collect flow-based statistics. This feature is similar to what <ulink url="http://www.netfilter.org/projects/ulogd/">ulogd-2.x</ulink> provides.</para>
+ <para><emphasis>Userspace connection tracking helpers</emphasis>, for layer 7 Application Layer Gateways (ALG) such as DHCPv6, MDNS, RPC, SLP and Oracle TNS, as an alternative to the in-kernel connection tracking helpers that are available in the Linux kernel.</para>
+ </listitem>
+ <listitem>
+ <para><emphasis>Flow-based statistics collection</emphasis>, to collect flow-based statistics as an alternative to <ulink url="http://www.netfilter.org/projects/ulogd/">ulogd2</ulink>, although <emphasis>ulogd2</emphasis> allows for more flexible statistics collection.</para>
</listitem>
</itemizedlist>
@@ -239,15 +256,12 @@ conntrack v1.4.6 (conntrack-tools): 1 flow entries have been deleted.
<sect2 id="sync-requirements"><title>Requirements</title>
- <para>In order to get <emphasis>conntrackd</emphasis> working in synchronization mode, you have to fulfill the following requirements:</para>
+ <para>If you would like to configure <emphasis>conntrackd</emphasis> to work in state synchronization mode, then you require:</para>
<orderedlist>
<listitem>
- <para>A <emphasis>high availability manager</emphasis> like <ulink url="http://www.keepalived.org">keepalived</ulink> that manages the virtual IPs of the
- firewall cluster, detects errors, and decide when to migrate the virtual IPs
- from one firewall replica to another. Without it, <emphasis>conntrackd</emphasis> will not work appropriately.</para>
- <para>The state synchronization setup requires a working installation of <ulink url="http://www.keepalived.org">keepalived</ulink>, preferibly a recent version. Check if your distribution comes with a recent packaged version. Otherwise, you may compile it from the sources.
+ <para>A working installation of <ulink url="http://www.keepalived.org">keepalived</ulink>, preferably a recent version. Check if your distribution comes with a recent packaged version. Otherwise, you may compile it from the sources.
</para>
<para>
@@ -342,7 +356,7 @@ conntrack v1.4.6 (conntrack-tools): 1 flow entries have been deleted.
</sect2>
-<sect2 id="sync-pb"><title>Active-Backup setup</title>
+<sect2 id="sync-pb"><title>Active-Backup setups</title>
<note><title>Stateful firewall architectures</title>
<para>A good reading to extend the information about firewall architectures is <ulink url="http://1984.lsi.us.es/~pablo/docs/intcomp09.pdf">Demystifying cluster-based fault-tolerant firewalls</ulink> published in IEEE Internet Computing magazine.
@@ -380,19 +394,19 @@ conntrack v1.4.6 (conntrack-tools): 1 flow entries have been deleted.
</sect2>
-<sect2 id="sync-aa"><title>Active-Active setup</title>
+<sect2 id="sync-aa"><title>Active-Active setups</title>
<para>The Active-Active setup consists of having more than one stateful
- firewall replicas actively filtering traffic. Thus, we reduce the resource
- waste that implies to have a backup firewall which does nothing.</para>
+ firewall actively filtering traffic. This reduces the resource waste
+ implied by keeping a spare backup firewall.</para>
<para>We can classify the type of Active-Active setups in several
families:</para>
<itemizedlist>
<listitem>
- <para><emphasis>Symmetric path routing</emphasis>: The stateful firewall
- replicas share the workload in terms of flows, ie. the packets that are
+ <para><emphasis>Symmetric path routing</emphasis>: The stateful firewalls
+ share the workload in terms of flows, i.e. the packets that are
part of a flow are always filtered by the same firewall.</para>
</listitem>
<listitem>
@@ -406,24 +420,20 @@ conntrack v1.4.6 (conntrack-tools): 1 flow entries have been deleted.
</listitem>
</itemizedlist>
- <para>As for 0.9.8, the design of <emphasis>conntrackd</emphasis> allows you
- to deploy an symmetric Active-Active setup based on a static approach.
- For example, assume that you have two virtual IPs, vIP1 and vIP2, and two
- firewall replicas, FW1 and FW2. You can give the virtual vIP1 to the
- firewall FW1 and the vIP2 to the FW2.
+ <para><emphasis>conntrackd</emphasis> allows you to deploy a symmetric
+Active-Active setup based on a static approach. For example, assume that you
+have two virtual IPs, vIP1 and vIP2, and two firewall replicas, FW1 and FW2.
+You can give the virtual vIP1 to the firewall FW1 and the vIP2 to the FW2.
</para>
- <para>Unfortunately, you will have to wait for the support for the
- Active-Active setup based on dynamic approach, ie. a workload sharing setup
- without directors that allow the stateful firewall share the filtering.</para>
-
- <para>On the other hand, the asymmetric scenario may work if your setup
- fulfills several strong assumptions. However, in the opinion of the author
- of this work, the asymmetric setup goes against the design of stateful
- firewalls and <emphasis>conntrackd</emphasis>. Therefore, you have two
- choices here: you can deploy an Active-Backup setup or go back to your
- old stateless rule-set (in that case, the conntrack-tools will not be
- of any help anymore, of course).</para>
+ <para>The asymmetric path scenario is hard: races might occur between state
+ synchronization and packet forwarding. If you would like to deploy an
+ Active-Active setup with an asymmetric multi-path routing configuration,
+ then make sure the same firewall <emphasis>forwards</emphasis> packets
+ coming in the original and the reply directions. If you cannot guarantee
+ this and you still would like to deploy an Active-Active setup, then you
+ might have to consider downgrading your firewall ruleset policy to stateless
+filtering.</para>
</sect2>
@@ -895,32 +905,13 @@ maintainance.</para></listitem>
<para>The following steps describe how to enable the RPC portmapper helper for NFSv3 (this is similar for other helpers):</para>
<orderedlist>
-<listitem><para>Register user-space helper:
-
-<programlisting>
-nfct add helper rpc inet udp
-nfct add helper rpc inet tcp
-</programlisting>
-
-This registers the portmapper helper for both UDP and TCP (NFSv3 traffic goes both over TCP and UDP).
-</para></listitem>
-
-<listitem><para>Add iptables rule using the CT target:
-
-<programlisting>
-# iptables -I OUTPUT -t raw -p udp --dport 111 -j CT --helper rpc
-# iptables -I OUTPUT -t raw -p tcp --dport 111 -j CT --helper rpc
-</programlisting>
-
-With this, packets matching port TCP/UDP/111 are passed to user-space for
-inspection. If there is no instance of conntrackd configured to support
-user-space helpers, no inspection happens and packets are not sent to
-user-space.</para></listitem>
<listitem><para>Add configuration to conntrackd.conf:
<programlisting>
Helper {
+ Setup yes
+
Type rpc inet udp {
QueueNum 1
QueueLen 10240
@@ -952,6 +943,25 @@ for inspection to user-space</para>
</listitem>
+<listitem><para>Run conntrackd:
+<programlisting>
+# conntrackd -d -C /path/to/conntrackd.conf
+</programlisting>
+</para>
+</listitem>
+
+<listitem><para>Add iptables rule using the CT target:
+
+<programlisting>
+# iptables -I OUTPUT -t raw -p udp --dport 111 -j CT --helper rpc
+# iptables -I OUTPUT -t raw -p tcp --dport 111 -j CT --helper rpc
+</programlisting>
+
+With this, packets matching port TCP/UDP/111 are passed to user-space for
+inspection. If there is no instance of conntrackd configured to support
+user-space helpers, no inspection happens and packets are not sent to
+user-space.</para></listitem>
+
</orderedlist>
<para>Now you can test this (assuming you have some working NFSv3 setup) with:
diff --git a/doc/misc/README b/doc/misc/README
new file mode 100644
index 0000000..7d0a1ae
--- /dev/null
+++ b/doc/misc/README
@@ -0,0 +1,187 @@
+= Setting up active-active load-sharing hash-based stateful firewall =
+ by Pablo Neira Ayuso <pablo@netfilter.org> in 2010
+
+If you want to know more about this configuration and other firewall
+architectures, please read:
+
+* Demystifying cluster-based fault-tolerant firewalls.
+ IEEE Internet Computing, 13(6):31-38, December 2009.
+ Available at: https://perso.ens-lyon.fr/laurent.lefevre/pdf/IC2009_Neira_Gasca_Lefevre.pdf
+
+== 0x0 intro ==
+
+Under this directory you can find a script that allows you to set up a simple
+active-active hash-based load-sharing firewall cluster based on the iptables'
+cluster match.
+
+== 0x1 testbed ==
+
+My testbed looks like the following:
+
+ ---------- eth1 eth2 ----------
+ client A ------| |--- firewall 1 ----| |
+ (192.168.0.2) | switch | (.0.5) (.1.5) | switch |--- server
+ | | | | (192.168.1.2)
+ client B ------| |--- firewall 2 ----| |
+ (192.168.0.11) ---------- (.0.5) (.1.5) ----------
+ eth1 eth2
+
+The firewalls perform SNAT to masquerade clients. Note that both cluster
+firewalls have the same IP addresses. For administrative purposes, it is
+a good idea that each firewall has its own IP address to SSH into it; make
+sure you add the appropriate rule to skip the cluster match rule-set!
+More comments: although the picture shows two switches, I'm actually
+using one and I separated the clients and the server in two different
+VLANs.
+
+The script also sets a multicast MAC address that is the same for both
+firewalls so that the switch floods the same packets to both firewalls.
+Using a multicast MAC address is an RFC violation [1], since network nodes
+must not include multicast MAC addresses in ARP replies, but:
+
+ a) it is the only way I found so far to obtain the behaviour from my
+ HP procurve switches.
+
+ b) the VRRP MAC address range is not supported appropriately by switch
+ vendors, at least by my HP procurve switches. If switch vendors
+ support this MAC address range appropriately, they will handle them
+ as multicast MAC addresses. As of 2011 I did not find any switch handling
+ VRRP MAC address range as multicast ports (they still handle them as
+ normal unicast MAC addresses, therefore my solution does not work with
+ two nodes with the same VRRP MAC address).
+
+The cluster match relies upon the Connection Tracking System (conntrack).
+Thus, traffic coming in the reply direction which does not belong to this node
+is labeled as INVALID for TCP and ICMP protocols. The scripts add a rule to
+drop this traffic to avoid possible packet duplication. For UDP traffic,
+you will have to add a rule to drop NEW traffic in the reply direction
+because conntrack considers it valid. If you don't do this, both nodes
+may accept reply traffic, thus, sending duplicated packets to the client,
+which is not what you want.
+
+During my last experiments, I was using the Linux kernel 2.6.37 in the
+firewalls and the server. Everything you need to set up this configuration
+is available in stock Linux kernels. No external patches with new features
+are required.
+
+== 0x2 running scripts ==
+
+Copy the script to each node, then adjust the script variables to your
+configuration.
+
+On firewall 1:
+firewall1# ./clusterip-node1.sh start
+
+On firewall 2:
+firewall2# ./clusterip-node2.sh start
+
+== 0x3 trouble-shooting ==
+
+Some troubleshooting may help to understand how this setup works. Check
+the following if you experience problems:
+
+1) Check that multicast MAC addresses are assigned to the NICs:
+
+firewall1$ ip maddr
+[...]
+2: eth1
+[...]
+ link 01:00:5e:00:01:01 static
+3: eth2
+[...]
+ link 01:00:5e:00:01:02 static
+
+The scripts add the multicast MAC addresses to the NICs. If this
+is not done, the traffic will be discarded by the firewalls'
+networking stack.
+
+2) ICMP ping the server from one of the clients:
+
+client$ ping -c 1 192.168.1.2
+PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
+64 bytes from 192.168.1.2: icmp_seq=1 ttl=63 time=0.220 ms
+
+--- 192.168.1.2 ping statistics ---
+1 packets transmitted, 1 received, 0% packet loss, time 0ms
+rtt min/avg/max/mdev = 0.220/0.220/0.220/0.000 ms
+
+If this does not work, make sure the firewalls are including the
+multicast MAC address in their ARP replies. You can check this
+by looking at the neighbour cache:
+
+client$ ip neighbour
+[...]
+192.168.0.5 dev eth1 lladdr 01:00:5e:00:01:01 REACHABLE
+
+server$ ip neighbour
+[...]
+192.168.1.5 dev eth1 lladdr 01:00:5e:00:01:02 REACHABLE
+
+firewall$ ip neighbour
+[...]
+192.168.0.5 dev eth1 lladdr 01:00:5e:00:01:01 REACHABLE
+192.168.1.5 dev eth2 lladdr 01:00:5e:00:01:02 REACHABLE
+
+3) Test TCP connections: you can use netcat to start simple connections
+between the client and the server.
+
+You can also use intensive HTTP traffic generation to test performance
+like injectX.c and httpterm from Willy Tarreau:
+
+http://1wt.eu/tools/inject/
+http://1wt.eu/tools/httpterm/
+
+clientA:~/http-client-benchmark# ./client -t 60 -u 200 -G 192.168.1.2:8000
+# hits hits/s ^h/s ^bytes kB/s errs rst tout mhtime
+ 266926 26692 26766 3881270 3779 0 0 0 0.237
+ 294067 26733 27141 3935621 3785 0 0 0 0.176
+
+clientB:~/http-client-benchmark# ./client -t 30 -u 40 -G 192.168.1.2:8020
+# hits hits/s ^h/s ^bytes kB/s errs rst tout mhtime
+ 53250 17750 17368 2518448 2513 0 0 0 0.240
+ 70766 17691 17516 2539907 2505 0 0 0 0.297
+
+^h/s is the current number of HTTP requests per second. This means
+that you get ~45000 HTTP requests per second. In my setup, with only
+one firewall active I get ~27000 HTTP requests per second. We obtain
+extra performance of ~66%, not that bad 8-).
+
+I have configured httpterm to send objects of 0 bytes over HTTP
+to obtain the maximum number of HTTP flows. This is the worst case
+scenario in firewall load.
+
+I forgot to mention that I set CPU affinity for NICs IRQs. I've got
+two cores, one for each firewall NIC.
+
+== 0x4 report successful setups ==
+
+My testbed is composed of low-cost, basic, five-year-old HP ProLiant
+systems, so the numbers are not great. I like knowing about numbers;
+I'd appreciate it if you dropped me a line to tell me the numbers that
+you get and your experience.
+
+== 0x5 conclusions and future works ==
+
+The cluster match allows you to set up load-sharing hash-based stateful
+firewalls, which is a way to avoid having a spare backup firewall as
+happens in classical Primary-Backup setups.
+
+Still, there is some pending work to fully integrate conntrackd and HA
+managers with it (in case that you want high availability, of course).
+
+-o-
+
+[1] More specifically, it's an RFC 1812 (section 3.3.2) violation.
+It's been reported that this is a problem for CISCO routers:
+http://marc.info/?l=netfilter&m=128810399113170&w=2
+
+Michele Codutti: "The problem is the multicast MAC address that these
+routers doesn't "like". They discard any incoming packet with MAC
+multicast address to be compliant with RFC1812. The only documented
+(by Cisco) workaround is to put a fixed arp entry with the multicast
+address that maps the clustered IP in the router."
+
+If you keep reading the mailing thread, the reported problem affected
+Cisco 7200 VXR.
+
+--02/02/2010
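Reviewer note on the figures in section 0x3 of the README above: the ~66% gain follows from the rounded rates quoted there (~45000 per second with both firewalls versus ~27000 with one). A minimal shell sketch of that arithmetic, where `gain_percent` is a hypothetical helper and integer division truncates the result:

```shell
#!/bin/sh
# Percentage gain of a combined rate over a single-node rate.
# gain_percent is a hypothetical helper; integer division truncates.
gain_percent()
{
	echo $(( ($1 - $2) * 100 / $2 ))
}

# Rounded figures quoted in the README; prints: 66
gain_percent 45000 27000

# Sum of the second ^h/s sample of both clients (27141 + 17516)
# against the same single-firewall baseline
gain_percent $(( 27141 + 17516 )) 27000
```

The second call uses the raw samples from the benchmark tables instead of the rounded total, so it lands slightly below the quoted figure.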
diff --git a/doc/misc/clusterip.sh b/doc/misc/clusterip.sh
new file mode 100644
index 0000000..911f676
--- /dev/null
+++ b/doc/misc/clusterip.sh
@@ -0,0 +1,254 @@
+#!/bin/sh
+
+#
+# (C) 2009-2011 by Pablo Neira Ayuso <pneira@us.es>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+
+#
+# Here, you can find the variables that you have to change.
+#
+
+# enable this for debugging
+LOG_DEBUG=0
+
+# number of cluster node (must be unique, from 1 to N cluster nodes)
+NODE=1
+
+# this is the real MAC address of eth1
+REAL_HWADDR1=00:18:71:68:f2:34
+
+# this is the real MAC address of eth2
+REAL_HWADDR2=00:11:0a:60:e7:32
+
+#
+# These variables MUST have the same values in both cluster nodes
+#
+
+# number of nodes that belong to this cluster
+TOTAL_NODES=2
+
+# this is the cluster multicast MAC address of eth1
+MC_HWADDR1=01:00:5e:00:01:01
+
+# this is the cluster multicast MAC address of eth2
+MC_HWADDR2=01:00:5e:00:01:02
+
+# cluster IP address of eth1
+ADDR1=192.168.0.5/24
+
+# cluster IP address of eth2
+ADDR2=192.168.1.5/24
+
+# random seed for hashing
+SEED=0xdeadbeef
+
+start_cluster_address()
+{
+ # set cluster IP addresses
+ ip a a $ADDR1 dev eth1
+ ip a a $ADDR2 dev eth2
+ # set cluster multicast MAC addresses
+ ip maddr add $MC_HWADDR1 dev eth1
+ ip maddr add $MC_HWADDR2 dev eth2
+ # mangle ARP replies to include the cluster multicast MAC addresses
+ arptables -I OUTPUT -o eth1 --h-length 6 \
+ -j mangle --mangle-mac-s $MC_HWADDR1
+ # mangle ARP request to use the original MAC address (otherwise the
+ # stack drops this packet).
+ arptables -I INPUT -i eth1 --h-length 6 --destination-mac \
+ $MC_HWADDR1 -j mangle --mangle-mac-d $REAL_HWADDR1
+ arptables -I OUTPUT -o eth2 --h-length 6 \
+ -j mangle --mangle-mac-s $MC_HWADDR2
+ arptables -I INPUT -i eth2 --h-length 6 --destination-mac \
+ $MC_HWADDR2 -j mangle --mangle-mac-d $REAL_HWADDR2
+}
+
+stop_cluster_address()
+{
+ # delete cluster IP addresses
+ ip a d $ADDR1 dev eth1
+ ip a d $ADDR2 dev eth2
+ # delete cluster multicast MAC addresses
+ ip maddr del $MC_HWADDR1 dev eth1
+ ip maddr del $MC_HWADDR2 dev eth2
+ # delete ARP replies mangling
+ arptables -D OUTPUT -o eth1 --h-length 6 \
+ -j mangle --mangle-mac-s $MC_HWADDR1
+ # delete ARP requests mangling
+ arptables -D INPUT -i eth1 --h-length 6 --destination-mac \
+ $MC_HWADDR1 -j mangle --mangle-mac-d $REAL_HWADDR1
+ arptables -D OUTPUT -o eth2 --h-length 6 \
+ -j mangle --mangle-mac-s $MC_HWADDR2
+ arptables -D INPUT -i eth2 --h-length 6 --destination-mac \
+ $MC_HWADDR2 -j mangle --mangle-mac-d $REAL_HWADDR2
+}
+
+start_nat()
+{
+ iptables -A POSTROUTING -t nat -s 192.168.0.11 \
+ -j SNAT --to-source 192.168.1.5
+ iptables -A POSTROUTING -t nat -s 192.168.0.2 \
+ -j SNAT --to-source 192.168.1.5
+}
+
+stop_nat()
+{
+ iptables -D POSTROUTING -t nat -s 192.168.0.11 \
+ -j SNAT --to-source 192.168.1.5
+ iptables -D POSTROUTING -t nat -s 192.168.0.2 \
+ -j SNAT --to-source 192.168.1.5
+}
+
+iptables_start_cluster_rules()
+{
+ # mark packets that belong to this node (go direction)
+ iptables -A CLUSTER-RULES -t mangle -i eth1 -m cluster \
+ --cluster-total-nodes $TOTAL_NODES --cluster-local-node $1 \
+ --cluster-hash-seed $SEED -j MARK --set-mark 0xffff
+
+ # mark packets that belong to this node (reply direction)
+ # note: we *do* need this to change the packet type to PACKET_HOST,
+ # otherwise the stack silently drops the packet.
+ iptables -A CLUSTER-RULES -t mangle -i eth2 -m cluster \
+ --cluster-total-nodes $TOTAL_NODES --cluster-local-node $1 \
+ --cluster-hash-seed $SEED -j MARK --set-mark 0xffff
+}
+
+iptables_stop_cluster_rules()
+{
+ iptables -D CLUSTER-RULES -t mangle -i eth1 -m cluster \
+ --cluster-total-nodes $TOTAL_NODES --cluster-local-node $1 \
+ --cluster-hash-seed $SEED -j MARK --set-mark 0xffff
+
+ iptables -D CLUSTER-RULES -t mangle -i eth2 -m cluster \
+ --cluster-total-nodes $TOTAL_NODES --cluster-local-node $1 \
+ --cluster-hash-seed $SEED -j MARK --set-mark 0xffff
+}
+
+start_cluster_ruleset() {
+ iptables -N CLUSTER-RULES -t mangle
+
+ iptables_start_cluster_rules $NODE
+
+ iptables -A PREROUTING -t mangle -j CLUSTER-RULES
+
+ if [ $LOG_DEBUG -eq 1 ]
+ then
+ iptables -A PREROUTING -t mangle -i eth1 -m mark \
+ --mark 0xffff -j LOG --log-prefix "cluster-accept: "
+ iptables -A PREROUTING -t mangle -i eth1 -m mark \
+ ! --mark 0xffff -j LOG --log-prefix "cluster-drop: "
+ iptables -A PREROUTING -t mangle -i eth2 -m mark \
+ --mark 0xffff \
+ -j LOG --log-prefix "cluster-reply-accept: "
+ iptables -A PREROUTING -t mangle -i eth2 -m mark \
+ ! --mark 0xffff \
+ -j LOG --log-prefix "cluster-reply-drop: "
+ fi
+
+ # drop packets that don't belong to us (go direction)
+ iptables -A PREROUTING -t mangle -i eth1 -m mark \
+ ! --mark 0xffff -j DROP
+
+ # drop packets that don't belong to us (reply direction)
+ iptables -A PREROUTING -t mangle -i eth2 -m mark \
+ ! --mark 0xffff -j DROP
+}
+
+stop_cluster_ruleset() {
+ iptables -D PREROUTING -t mangle -j CLUSTER-RULES
+
+ if [ $LOG_DEBUG -eq 1 ]
+ then
+ iptables -D PREROUTING -t mangle -i eth1 -m mark \
+ --mark 0xffff -j LOG --log-prefix "cluster-accept: "
+ iptables -D PREROUTING -t mangle -i eth1 -m mark \
+ ! --mark 0xffff -j LOG --log-prefix "cluster-drop: "
+ iptables -D PREROUTING -t mangle -i eth2 -m mark \
+ --mark 0xffff \
+ -j LOG --log-prefix "cluster-reply-accept: "
+ iptables -D PREROUTING -t mangle -i eth2 -m mark \
+ ! --mark 0xffff \
+ -j LOG --log-prefix "cluster-reply-drop: "
+ fi
+
+ iptables -D PREROUTING -t mangle -i eth1 -m mark \
+ ! --mark 0xffff -j DROP
+
+ iptables -D PREROUTING -t mangle -i eth2 -m mark \
+ ! --mark 0xffff -j DROP
+
+ iptables_stop_cluster_rules $NODE
+
+ iptables -F CLUSTER-RULES -t mangle
+ iptables -X CLUSTER-RULES -t mangle
+}
+
+case "$1" in
+start)
+ echo "starting cluster configuration for node $NODE."
+
+ # enable IP forwarding, just in case you forgot it
+ echo 1 > /proc/sys/net/ipv4/ip_forward
+
+ # disable TCP pickup (older kernels: /proc/sys/net/ipv4/netfilter/ip_conntrack_*)
+ echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_be_liberal
+ echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
+
+ start_cluster_address
+ start_nat
+
+ # drop invalid flows from eth2 (not allowed). This is mandatory
+ # because traffic which does not belong to this node is always
+ # labeled as INVALID by TCP and ICMP state tracking. For protocols like
+ # UDP, you will have to drop NEW traffic from eth2, otherwise reply
+ # traffic may be accepted by both nodes, thus duplicating the traffic.
+ iptables -A PREROUTING -t mangle -i eth2 \
+ -m state --state INVALID -j DROP
+
+ start_cluster_ruleset
+ ;;
+stop)
+ echo "stopping cluster configuration for node $NODE."
+
+ stop_cluster_address
+ stop_nat
+
+ iptables -D PREROUTING -t mangle -i eth2 \
+ -m state --state INVALID -j DROP
+
+ stop_cluster_ruleset
+ ;;
+primary)
+ logger "cluster-match-script: entering MASTER state for node $2"
+ if [ -x "$CONNTRACKD_SCRIPT" ]
+ then
+ sh "$CONNTRACKD_SCRIPT" primary $NODE $2
+ fi
+ iptables_start_cluster_rules $2
+ ;;
+backup)
+ logger "cluster-match-script: entering BACKUP state for node $2"
+ if [ -x "$CONNTRACKD_SCRIPT" ]
+ then
+ sh "$CONNTRACKD_SCRIPT" backup $NODE $2
+ fi
+ iptables_stop_cluster_rules $2
+ ;;
+fault)
+ logger "cluster-match-script: entering FAULT state for node $2"
+ if [ -x "$CONNTRACKD_SCRIPT" ]
+ then
+ sh "$CONNTRACKD_SCRIPT" fault $NODE $2
+ fi
+ iptables_stop_cluster_rules $2
+ ;;
+*)
+ echo "$0 start|stop|primary|backup|fault [nodeid]"
+ ;;
+esac