+ ebtables/iptables interaction on a Linux-based bridge +


+ This document describes how iptables and + ebtables filtering tables interact on a Linux-based bridge.
+ Building a bridging firewall requires applying two patches to the kernel source code. The first patch is called "br-nf" and makes bridged IP frames/packets go through the iptables chains. The second patch adds ebtables support to the kernel. Ebtables filters on the Ethernet layer, while iptables only filters IP packets.
+ It is possible to use ebtables without compiling the br-nf code into the kernel. The only reason the ebtables patch has to be applied after the br-nf patch is that some files are changed by both patches. +

+ This section only considers ebtables, not + iptables. +

+ The first thing to keep in mind is that we are talking about the Ethernet layer here: OSI layer 2 (Data link layer), or layer 1 (Link layer, Network Access layer) in the TCP/IP Network Model. All examples below are explained according to the TCP/IP Network Model. +

+ A packet destined for the local computer according to the + bridge (which works on the Ethernet layer) isn't + necessarily destined for the local computer according to + the IP layer. That's how routing works (MAC destination is + the router, IP destination is the actual box you want to + communicate with). +

+ There are five hooks defined in the Linux bridging code. + The sixth hook (BROUTING) is added by the ebtables patch. + The hooks are places in the network + code where software can hook itself in to process the + packets/frames passing that hook. +

+ Ebtables has three built in tables: + filter, nat and broute, as shown in Figure 2c. +

+ The filter OUTPUT and nat OUTPUT chains are separated and have + a different usage. +

+ Figures 2b and 2c give a clear view where the + ebtables chains are hooked into the bridge code. +

+ When a NIC enslaved to a bridge receives a frame, the frame will first go through the BROUTING chain. In this special chain you can choose whether to route or bridge frames, enabling you to make a brouter. The definitions found on the Internet for what a brouter actually is differ a bit. The following definition describes the brouting ability provided by the BROUTING chain quite well: +

+ A brouter is a device which + bridges some frames/packets (i.e. forwards based on Link layer + information) and routes other frames/packets (i.e. forwards based + on Network layer information). The bridge/route decision is + based on configuration information. +

+ A brouter can be used, for example, + to act as a normal router for IP traffic between 2 + networks, while bridging specific traffic (NetBEUI, ARP, + whatever) between those networks. The IP routing + table does not use the bridge logical device and the box has + IP addresses assigned to the physical network devices that + also happen to be bridge ports (bridge enslaved NICs).
+ The default decision in the BROUTING chain is bridging. +
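
The bridge/route choice made in the BROUTING chain can be pictured with a small model. This is only a sketch, not ebtables code: the function name and the `routed_ethertypes` configuration set are hypothetical, and the only behaviour taken from the text is that bridging is the default and that selected traffic (here IPv4) is routed instead.

```python
# EtherType values for the two protocols used in this sketch.
ETH_P_IPV4 = 0x0800
ETH_P_ARP = 0x0806


def brouting_decision(ethertype, routed_ethertypes=frozenset({ETH_P_IPV4})):
    """Return "route" or "bridge" for a frame arriving on a bridge port.

    Bridging is the default decision. Only protocols listed in
    routed_ethertypes (a hypothetical configuration set) are routed.
    """
    if ethertype in routed_ethertypes:
        return "route"
    return "bridge"


if __name__ == "__main__":
    print(brouting_decision(ETH_P_IPV4))
    print(brouting_decision(ETH_P_ARP))
```

With the default set, IPv4 frames are handed to the routing code while ARP and everything else stays on the bridge, which matches the brouter example given above.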

+ Next the frame passes through the PREROUTING chain. In this chain you can alter the destination MAC address of frames (DNAT). If the frame passes this chain, the bridging code will decide where the frame should be sent. The bridge does this by looking at the destination MAC address; it doesn't care about the Network layer addresses (e.g. IP address). +

+ Incoming frames on non-forwarding ports of a bridge will + not be seen by ebtables, not even by the BROUTING + chain. +

+ If the bridge decides the frame is destined for the local + computer, the frame will go through the INPUT chain. + In this chain you can filter frames destined for the bridge box. + After traversal of the INPUT chain, the frame will be passed up + to the Network layer code (e.g. to the IP code). + So, a routed IP packet will go through + the ebtables INPUT chain, not through the + ebtables FORWARD chain. This is logical. +

+ Otherwise the frame should possibly be sent onto another side + of the bridge. If it should, the frame will go through the + FORWARD chain and the POSTROUTING chain. The bridged frames can be + filtered in the FORWARD chain. In the POSTROUTING chain you can alter the MAC + source address (SNAT). +

+ Locally originated frames will, after the bridging decision, traverse the nat OUTPUT, the filter OUTPUT and the nat POSTROUTING chains. The nat OUTPUT chain allows you to alter the destination MAC address and the filter OUTPUT chain allows you to filter frames originating from the bridge box. Note that the nat OUTPUT chain is traversed after the bridging decision, so this is actually too late for the altered address to influence that decision. We should change this. The nat POSTROUTING chain is the same one as described above. +

+ It's also possible for routed frames to go + through these three chains when the destination + device is a logical bridge device. +
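
The three ebtables traversal paths described in this section can be summarised as a small lookup. This is a minimal sketch under stated assumptions: the function and argument names are invented for illustration, the BROUTING chain is assumed to take its default decision (bridge), and no rule drops the frame along the way.

```python
def ebtables_traversal(origin, destined_for_local=False):
    """List the ebtables (table, chain) pairs a frame passes through.

    origin is "bridge_port" for a frame received on an enslaved NIC,
    or "local" for a frame originating from the bridge box itself.
    """
    if origin == "local":
        # Locally originated frames: all three chains come after the
        # bridging decision.
        return [("nat", "OUTPUT"), ("filter", "OUTPUT"), ("nat", "POSTROUTING")]
    if origin != "bridge_port":
        raise ValueError("origin must be 'bridge_port' or 'local'")

    chains = [("broute", "BROUTING"), ("nat", "PREROUTING")]
    if destined_for_local:
        # The bridge decided the frame is for the local computer.
        chains.append(("filter", "INPUT"))
    else:
        # The frame is sent onto another side of the bridge.
        chains.extend([("filter", "FORWARD"), ("nat", "POSTROUTING")])
    return chains


if __name__ == "__main__":
    for args in [("bridge_port", True), ("bridge_port", False), ("local",)]:
        print(args, ebtables_traversal(*args))
```

Note that a routed IP packet shows up here as the "destined for local" case, since according to the bridge it is addressed to the bridge box.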

+ Note that the iptables nat OUTPUT chain is situated after the + routing decision. As commented in the previous section, + this is too late for DNAT. This is solved by rerouting the + IP packet if it has been DNAT'ed, before continuing. +

+ Figures 3a and 3b give a clear view where the + iptables chains are hooked into the IP code. When the br-nf + patch is compiled into the kernel, the iptables chains are + also hooked in the hooks of the bridging code. However, + this does not mean that they are no longer hooked into their + standard IP code hooks. For IP packets that get into + contact with the bridging code, the br-nf patch will + decide in which place in the network code the iptables + chains are traversed. Obviously, it is guaranteed that no chain is + traversed twice by the same packet. All packets that do not come into + contact with the bridge code traverse the iptables chains + in the standard way as seen in Figure 3b.
+ The following sections try, among other things, + to explain what the br-nf patch does and why it does it. +

+ It's possible to see a single IP packet/frame traverse the + nat PREROUTING, filter INPUT, nat OUTPUT, filter OUTPUT and + nat POSTROUTING ebtables chains.
+ This can happen when the bridge is also used as a router. The Ethernet frame(s) containing that IP packet will have the bridge's MAC address as destination MAC address, while the destination IP address is not that of the bridge. Including the iptables chains and assuming the br-nf code is compiled into the kernel, this is how the IP packet runs through the bridge/router (actually there is more going on, see section 6): +

+ Figure 3c. Bridge/router routes packet to a + bridge interface (simplistic view)
+

+ This assumes that the routing decision sends the packet to + a bridge interface. If the routing decision sends the + packet to a physical network card, this is what happens: +

+ Figure 3d. Bridge/router routes packet to a + physical interface (simplistic view)
+

+ What is obviously "asymmetric" here is that the iptables PREROUTING chain is traversed before the ebtables INPUT chain. However, this cannot be helped without sacrificing other functionality. See the next section. +

+ Take an IP packet received by the bridge. Let's assume we + want to do some IP DNAT on it. + Changing the destination address of the packet (IP address + and MAC address) has to happen before the bridge code + decides what to do with the frame/packet. + The decision of the bridge code can be one of these: +

+ So, this IP DNAT has to happen very early in the bridge + code. Namely before the bridge code actually does anything. + This is at the same place as where the ebtables nat + PREROUTING chain will be traversed (for the same reason). + This should explain the asymmetry encountered in Figures 3c + and 3d. +
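
To see why the DNAT has to come before the bridging decision, it helps to model that decision. The sketch below assumes ordinary learning-bridge behaviour (deliver locally, forward, flood, or ignore); these four outcomes, the function name, the MAC addresses and the forwarding table are all assumptions made for illustration, not taken from the bridge code.

```python
def bridge_decision(dst_mac, in_port, local_macs, fdb):
    """Classify a frame the way a simplified learning bridge would.

    local_macs: MAC addresses belonging to the bridge box itself.
    fdb: forwarding table mapping a learned MAC address to a bridge port.
    """
    if dst_mac in local_macs:
        return "local"    # pass the frame up to the higher protocol code
    out_port = fdb.get(dst_mac)
    if out_port is None:
        return "flood"    # location unknown: send out the other forwarding ports
    if out_port == in_port:
        return "ignore"   # destination is on the side the frame came from
    return "forward:" + out_port


# Hypothetical addresses, for illustration only.
BRIDGE_MAC = "00:00:00:00:00:01"
HOST_MAC = "00:00:00:00:00:02"
local_macs = {BRIDGE_MAC}
fdb = {HOST_MAC: "eth1"}

# The same frame, before and after its destination MAC is rewritten.
before_dnat = bridge_decision(BRIDGE_MAC, "eth0", local_macs, fdb)
after_dnat = bridge_decision(HOST_MAC, "eth0", local_macs, fdb)

if __name__ == "__main__":
    print(before_dnat, after_dnat)
```

In this model the frame would be delivered locally if the decision were taken first, but is forwarded out of eth1 once the destination has been rewritten, so the rewrite only has the intended effect when it happens before the decision.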

+ A bridged packet never enters any network code above layer + 1 (Link layer). So, a bridged IP packet/frame will never enter the + IP code. + Therefore all iptables chains will be traversed + while the IP packet is in the bridge code. The chain + traversal will look like this: +

+ The wish to be able to use physical devices belonging to a + bridge (bridge ports) in iptables rules is valid. + Knowing the input bridge ports is necessary to prevent + spoofing attacks. Say br0 has ports eth0 and eth1. If + iptables rules can only use br0 there's no way of + knowing when a box on the eth0 side changes its source IP + address to that of a box on the eth1 side, except by + looking at the MAC source address (and then still...). With + the br-nf patch you can use eth0 and eth1 in your + iptables rules and therefore catch these attempts. +
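
The anti-spoofing argument can be made concrete with a tiny check. This is a hedged sketch: the function name, the address-to-port table and the addresses themselves are hypothetical. It only illustrates that the check needs the input bridge port (eth0 or eth1) and cannot be expressed when only br0 is known.

```python
def is_spoofed(src_ip, in_port, expected_port):
    """Return True if src_ip arrives on a bridge port other than the expected one.

    expected_port maps a source IP address to the bridge port behind
    which that box is known to live (a hypothetical, hand-maintained table).
    Unknown addresses are not flagged.
    """
    known = expected_port.get(src_ip)
    return known is not None and known != in_port


# Hypothetical layout: one box behind each port of br0.
expected_port = {
    "192.0.2.10": "eth0",
    "192.0.2.20": "eth1",
}

if __name__ == "__main__":
    # A box on the eth0 side claims the address of the box on the eth1 side.
    print(is_spoofed("192.0.2.20", "eth0", expected_port))
    print(is_spoofed("192.0.2.10", "eth0", expected_port))
```

If the only device visible to the rule were br0, both calls would see the same input device and the two cases could not be told apart.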

+ To make this possible the iptables chains have to + be traversed after the bridge code decided where the frame + needs to be sent (eth0, eth1, both or none). This has some + impact on the scheme presented in section 3 (so, we are looking at routed + traffic here). It actually looks like this (in the case of + Figure 3c): +

+ Figure 6a. Chain traversal when routing and br-nf + is compiled into the kernel
+

+ All chains are now traversed while in the bridge code.
+ This is the work of the br-nf patch. Obviously this does not + mean that the routed IP packets never enter the IP code. They + just don't pass any iptables chains while in the IP code. +

+ If one does not compile the br-nf code into the kernel, the + chains will be traversed as shown below. However, then one + can only use br0, not eth0/eth1 to filter. +

+ Figure 6b. Chain traversal when routing and br-nf + code is not compiled into the kernel
+

+ Notice that the iptables PREROUTING chain is now in the natural position in the chain list and too late to be able to change the bridging decision. More precisely: the iptables PREROUTING chain is now traversed when the packet is already in the IP code. +

+ 6.2. IP DNAT for locally generated packets (so in the + iptables nat OUTPUT chain): +

+ The 'normal' way locally generated packets would go through + the chains looks like this: +

+ From section 6.1 we know that this + actually looks like this (due to the br-nf code): +

+ Note that the iptables nat OUTPUT chain is traversed while the + packet is in the IP code, while the iptables filter OUTPUT chain + is traversed when the packet has entered the bridge code. + This makes it possible to do DNAT to another device in the + nat OUTPUT chain and lets us use the bridge ports in the + filter OUTPUT chain. +

+ Note that in Figures 6a and 6d the iptables POSTROUTING chain is traversed before the ebtables POSTROUTING chain, while it's the other way around for Figure 5. The rule is as follows: +

+ for bridged traffic the ebtables POSTROUTING chain is traversed before the iptables POSTROUTING chain, for all other traffic it's the other way around. +

+ 7. Two possible ways for frames/packets to pass through the + iptables PREROUTING, FORWARD and POSTROUTING + chains +

+ With the br-nf patch there are 2 ways a frame/packet can + pass through the 3 given iptables chains. The + first way is when the frame is bridged, so the + iptables chains are called by the bridge code. The + second way is when the packet is routed. So special care + has to be taken to distinguish between those two, + especially in the iptables FORWARD chain. Here's + an example of strange things to look out for: +

+ The default gateway for 172.16.1.2 and + 172.16.1.4 is 172.16.1.1. 172.16.1.1 is the bridge + interface br0 with ports eth1 and eth2. +

+ The idea is that traffic between 172.16.1.4 and 172.16.1.2 is bridged, while the rest is routed, using masquerading. +

+ The catch is in the first line. Because the iptables code gets executed for both bridged packets and routed packets, we need to make a distinction between the two. We don't really want the bridged frames/packets to be masqueraded. If we omit the first line then everything will work too, but things will happen differently. Let's say 172.16.1.2 pings 172.16.1.4. The bridge receives the ping request and will transmit it through its eth1 port after first masquerading the source IP address. So the packet's source IP address will now be 172.16.1.1 and 172.16.1.4 will respond to the bridge. Masquerading will change the IP destination of this response from 172.16.1.1 back to 172.16.1.2. Everything works fine. But it's better not to have this behaviour. Thus, we use the first line to avoid this. Note that if we wanted to filter the connections to and from the internet, we would certainly need the first line so we don't filter the local connections as well. +
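
The effect of that first line can be modelled without any firewall at all. The sketch below is not the rule set itself, which is not reproduced in this text; it assumes the local network is 172.16.1.0/24 (the prefix length is an assumption) and that the first line exempts traffic that stays inside that network. The function and parameter names are hypothetical.

```python
import ipaddress

# Assumed local network behind br0; the /24 prefix length is a guess.
LAN = ipaddress.ip_network("172.16.1.0/24")


def should_masquerade(src, dst, exempt_local=True):
    """Decide whether a packet leaving the box gets masqueraded.

    exempt_local models the presence of the first line: traffic that
    stays inside the local network (i.e. bridged traffic) is left alone.
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if exempt_local and src_ip in LAN and dst_ip in LAN:
        return False
    return src_ip in LAN


if __name__ == "__main__":
    # Bridged ping between the two local boxes.
    print(should_masquerade("172.16.1.2", "172.16.1.4"))
    # The same ping when the first line is omitted.
    print(should_masquerade("172.16.1.2", "172.16.1.4", exempt_local=False))
    # Routed traffic to an outside address (documentation range).
    print(should_masquerade("172.16.1.2", "192.0.2.1"))
```

Only the second call reproduces the unwanted behaviour described above, where bridged traffic between the two local boxes is masqueraded as well.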

+ 8. IP DNAT in the iptables PREROUTING chain on + frames/packets entering on a bridge port +

+ The code in /net/bridge/br_netfilter.c ensures that DNAT'ed packets whose output device after DNAT is the same as the input device they arrived on (the logical bridge device, which we like to call br0) go through the ebtables FORWARD chain, not through the ebtables INPUT/OUTPUT chains. All other DNAT'ed packets are purely routed: they do not go through the ebtables FORWARD chain, do go through the ebtables INPUT chain and might go through the ebtables OUTPUT chain.
+
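
The rule for DNAT'ed packets can be written down as a short function. This is a sketch of the behaviour described in the text, not of the code in br_netfilter.c. The function name and the `bridge_devs` argument are hypothetical, and "might go through the OUTPUT chain" is interpreted, following section 2, as "does so when the output device is a logical bridge device".

```python
def ebtables_chains_after_dnat(in_dev, out_dev, bridge_devs):
    """Return the ebtables filter chains a DNAT'ed packet goes through.

    in_dev: the logical bridge device the packet arrived on (e.g. "br0").
    out_dev: the output device chosen after DNAT.
    bridge_devs: set of device names that are logical bridge devices.
    """
    if out_dev == in_dev:
        # Same bridge in and out: treated as bridged traffic.
        return ["FORWARD"]
    # Otherwise the packet is purely routed.
    chains = ["INPUT"]
    if out_dev in bridge_devs:
        chains.append("OUTPUT")
    return chains


if __name__ == "__main__":
    bridges = {"br0", "br1"}
    print(ebtables_chains_after_dnat("br0", "br0", bridges))
    print(ebtables_chains_after_dnat("br0", "eth2", bridges))
    print(ebtables_chains_after_dnat("br0", "br1", bridges))
```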

+ The side effect explained here occurs when the br-nf code is compiled into the kernel, the IP packet is routed and the out device for that packet is a logical bridge device. The side effect is encountered when filtering on the MAC source in the iptables FORWARD chains. As should be clear from earlier sections, the traversal of the iptables FORWARD chains is postponed until the packet is in the bridge code. This is done so we can filter on the bridge port out device. This has a side effect on the MAC source address, because by then the IP code will have changed the MAC source address to the MAC address of the bridge device. It is therefore impossible, in the iptables FORWARD chains, to filter on the MAC source address of the computer sending the packet in question to the bridge/router. If you really need to filter on this MAC source address, you should do it in the nat PREROUTING chain. Agreed, very ugly, but making it possible to filter on the real MAC source address in the FORWARD chains would involve a very dirty hack and is probably not worth it. This of course undermines the anti-spoofing remark of section 6. If I'm [BDS] pressured enough I could hack something up to make this unpleasant side effect go away. +