From 08934e3091865e6a4165702401f032e8327e380e Mon Sep 17 00:00:00 2001
From: Bart De Schuymer
Date: Sun, 2 Jun 2002 14:02:18 +0000
Subject: describes the br-nf/ebtables/iptables interaction

---
 docs/how_it_works.html | 255 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 255 insertions(+)
 create mode 100755 docs/how_it_works.html

diff --git a/docs/how_it_works.html b/docs/how_it_works.html
new file mode 100755
index 0000000..9ec8b27
--- /dev/null
+++ b/docs/how_it_works.html
@@ -0,0 +1,255 @@
+How bridge/ebtables/iptables interaction works

How bridge/ebtables/iptables interaction works

+ +

1. How frames traverse the ebtables chains:

This section only considers ebtables, _not_ iptables.

+     Route
+       ^
+       |
+I  +--------+ Bridge  +----------+                     +-------+      +-----------+   O
+N->|BROUTING|-------->|PREROUTING|----->[BRIDGING]---->|FORWARD| ---->|POSTROUTING|-->U
+   +--------+         +----------+      [DECISION]     +-------+      +-----------+   T
+                                             |                              ^ 
+                                             v                              |
+                                          +-----+                      +----------+
+                                          |INPUT|                      |OUTPUT (2)|
+                                          +-----+                      +----------+
+                                             |                              ^
+                                             |                              |
+                                             |                         +----------+
+                                             |                         +OUTPUT (1)+
+                                             |                         +----------+
+                                             |                              ^
+                                             +------->Local Process---------+
+

+The first thing to keep in mind is that we are talking about the ethernet layer here, i.e. OSI layer 2. A packet destined for the local computer according to the bridge (which works on the ethernet layer) isn't necessarily destined for the local computer according to the ip layer. That's how routing works: the MAC destination is the router, while the ip destination is the actual box you want to communicate with.

+Ebtables currently has three tables: filter, nat and broute. The filter table has a +FORWARD, INPUT and OUTPUT chain. The nat table has a PREROUTING, OUTPUT and POSTROUTING chain. +The broute table has the BROUTING chain. In the figure the filter OUTPUT chain has (2) +appended and the nat OUTPUT chain has (1) appended. So these two OUTPUT chains are not +the same (and have a different intended use).

+When a nic enslaved to a bridge receives a frame, the frame will first go through the BROUTING chain. In this special chain one can choose whether to route or bridge frames. The default is bridging and we will assume the decision in this chain is 'bridge'. So, next the frame passes through the PREROUTING chain. This chain is intended for altering the destination MAC address of frames (DNAT). If the frame passes this chain, the bridging code will decide where the frame should be sent. The bridge does this by looking at the destination MAC address; it doesn't care about the OSI layer 3 addresses (e.g. the ip address). Note that frames coming in on non-forwarding ports of a bridge will not be seen by ebtables, not even by the BROUTING chain.
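As a rough illustration of the two chains a frame meets before the bridging decision, here is a dry-run sketch. The port name eth1, both MAC addresses and the helper names ebt and early_chain_rules are made up for this example. The option syntax is the one found in later ebtables releases, so it may differ for the version described here. In the broute table the DROP target means "route this frame instead of bridging it".

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command unless APPLY=1
# is set, in which case it really calls ebtables (needs root).
ebt() {
    if [ "${APPLY:-0}" = 1 ]; then ebtables "$@"; else echo "ebtables $*"; fi
}

# Hypothetical helper; eth1 and both MAC addresses are placeholders.
early_chain_rules() {
    # broute table: DROP in BROUTING hands the frame to the routing code.
    ebt -t broute -A BROUTING -i eth1 -p IPv4 -j DROP
    # nat PREROUTING: rewrite the destination MAC before the bridging decision.
    ebt -t nat -A PREROUTING -i eth1 -d 00:11:22:33:44:55 -j dnat --to-destination 00:aa:bb:cc:dd:ee
}

early_chain_rules
```

Run as is, the script only prints the two ebtables commands, so it can be inspected before anything is installed.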

+If the bridge decides the frame is for the bridging computer, the frame will go through the +INPUT chain. In this chain you can filter frames destined for the bridge box. After passing +the INPUT chain, the frame will be given to the code on layer 3 (i.e. it will be passed up), +e.g. to the ip code. So, a routed ip packet will go through the ebtables INPUT chain, not +through the ebtables FORWARD chain. This is logical.

+Else the frame should possibly be sent onto another side of the bridge. If it should, the +frame will go through the FORWARD chain and the POSTROUTING chain. In the FORWARD chain one +can filter frames that will be bridged, the POSTROUTING chain is intended to be able to +change the MAC source address (SNAT).
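The two chains on the bridged path could be used as in the dry-run sketch below. The port name eth2, the MAC addresses and the helper names ebt and bridged_path_rules are placeholders invented for this example, and the option syntax is taken from later ebtables releases.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command unless APPLY=1
# is set, in which case it really calls ebtables (needs root).
ebt() {
    if [ "${APPLY:-0}" = 1 ]; then ebtables "$@"; else echo "ebtables $*"; fi
}

# Hypothetical helper; eth2 and the MAC addresses are placeholders.
bridged_path_rules() {
    # filter FORWARD: drop bridged frames coming from one source MAC.
    ebt -t filter -A FORWARD -s 00:11:22:33:44:66 -j DROP
    # nat POSTROUTING: rewrite the source MAC of frames leaving through eth2.
    ebt -t nat -A POSTROUTING -o eth2 -j snat --to-source 00:aa:bb:cc:dd:ff
}

bridged_path_rules
```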

+Frames that originate from the bridge box itself will go, after the bridging decision, through the nat OUTPUT chain, the filter OUTPUT chain and the POSTROUTING chain. The nat OUTPUT chain allows you to alter the destination MAC address and the filter OUTPUT chain allows you to filter frames originating from the bridge box. Note that the nat OUTPUT chain is traversed after the bridging decision, so actually too late for a changed destination MAC address to influence that decision. We should change this. The POSTROUTING chain is the same one as described above. Note that routed frames can also go through these chains; this happens when the destination device is a logical bridge device.

+2. A machine used as a bridge and a router (not a brouter):

+It's possible to see a single ip packet pass the PREROUTING, INPUT, nat OUTPUT, filter OUTPUT +and POSTROUTING ebtables chains.

+This can happen when the bridge is also used as a router. The ethernet frame(s) containing that +ip packet will have the bridge's destination MAC address, while the destination ip address is not +that of the bridge. Including the iptables chains, this is how the ip packet runs through the +bridge/router (eb=ebtables , ip=iptables ):

ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->send packet

+This assumes that the routing decision sends the packet to a bridge interface. If the routing +decision sends the packet to a physical network card, this is what happens:

ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ipPOSTROUTING->send packet

+What is obviously "asymmetric" here is that the iptables PREROUTING chain is traversed before +the ebtables INPUT chain, however this can not be helped. See the next section.

+3. DNATing bridged packets:

+Take an ip packet received by the bridge: it enters the bridge code. Let's assume we want to do some ip DNAT on it. Changing the destination address of the packet (ip address and MAC address) has to happen before the bridge code decides what to do with the packet. The bridge code can decide to bridge it (if the destination MAC address is on another side of the bridge), flood it over all the forwarding bridge ports (if the position of the box with the destination MAC is unknown to the bridge), give it to the higher protocol code (here, the ip code) if the destination MAC address is that of the bridge, or ignore it (if the destination MAC address is located on the same side of the bridge).

+So, this ip DNAT has to happen very early in the bridge code. Namely before the bridge code +actually does anything. This is at the same place as where the ebtables PREROUTING chain will +be traversed (for the same reason).

+4. Chain traversal for bridged ip packets:

+A bridged packet never enters any network code above layer 2. So a bridged ip packet will never +enter the ip code. Therefore all iptables chains will be traversed while the ip packet is in the +bridge code. The chain traversal will look like this:

+ebPREROUTING->ipPREROUTING->ebFORWARD->ipFORWARD->ebPOSTROUTING->ipPOSTROUTING

+Once again note that there is a certain form of asymmetry here that cannot be helped.
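One way to check this order on a live bridge is to put a logging rule in each of the six chains, send some bridged traffic and read the kernel log. The sketch below is a dry run: it only prints the commands unless APPLY=1 is set. The helper names run and log_bridged_path and the log prefixes are made up for this example, and the ebtables log options follow later ebtables releases. Keep in mind that the iptables nat table only sees the first packet of each connection, so the ipPREROUTING and ipPOSTROUTING lines show up once per connection.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command unless APPLY=1
# is set, in which case the command is executed (needs root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Hypothetical helper: one log rule per chain on the bridged ip path,
# listed in the traversal order given in the text.
log_bridged_path() {
    run ebtables -t nat -A PREROUTING -p IPv4 --log-prefix ebPREROUTING
    run iptables -t nat -A PREROUTING -j LOG --log-prefix ipPREROUTING
    run ebtables -t filter -A FORWARD -p IPv4 --log-prefix ebFORWARD
    run iptables -t filter -A FORWARD -j LOG --log-prefix ipFORWARD
    run ebtables -t nat -A POSTROUTING -p IPv4 --log-prefix ebPOSTROUTING
    run iptables -t nat -A POSTROUTING -j LOG --log-prefix ipPOSTROUTING
}

log_bridged_path
```

The ebtables rules carry no target, so frames continue to the next rule after being logged.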

+5. Using a bridge port in iptables rules:

+The wish to be able to use physical devices belonging to a bridge (bridge ports) in iptables rules is valid. It's necessary to prevent spoofing attacks. Say br0 has ports eth0 and eth1. If iptables rules can only use br0, there's no way of knowing when a box on the eth0 side changes its source ip address to that of a box on the eth1 side, except by looking at the MAC source address (and then still...). With the current bridge/iptables patch (0.0.6 or later) you can use eth0 and eth1 in your iptables rules and therefore catch these attempts.
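A minimal sketch of such an anti-spoofing rule follows, assuming the patch lets -i match the bridge port as described above. The address 192.0.2.20 and the helper names run and antispoof_rule are invented for this example. Kernels that later merged the bridge-netfilter code expose the bridge port through the iptables physdev match (--physdev-in) instead, so the exact option depends on the kernel in use.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command unless APPLY=1
# is set, in which case the command is executed (needs root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Hypothetical helper.
# $1 = bridge port the packet arrives on
# $2 = ip address that belongs to a box behind the other port
antispoof_rule() {
    run iptables -A FORWARD -i "$1" -s "$2" -j DROP
}

# A box on the eth1 side owns 192.0.2.20 (placeholder address), so a packet
# with that source address arriving on eth0 must be spoofed.
antispoof_rule eth0 192.0.2.20
```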

+1. iptables wants to use bridge ports:

+To make this possible the iptables chains have to be traversed after the bridge code decided where +the frame needs to be sent (eth0, eth1, both or none). This has some impact on the scheme presented +in section 2 (so, we are looking at routed traffic here). It actually looks like this:

+ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ebOUTPUT(1)->ebOUTPUT(2)->ipPOSTROUTING->ebPOSTROUTING->send packet

+Note that this is the work of the br-nf patch. If one does not compile the br-nf code into the kernel, +the chains will be traversed as shown below. However, then one can only use br0, not eth0/eth1 to +filter.

ebPREROUTING->ebINPUT->ipPREROUTING->ipFORWARD->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->send packet

+Notice that ipPREROUTING is now in its natural position in the chain list, which is too late to be able to change the bridging decision. More precisely: ipPREROUTING is now traversed while the packet is in the ip code.

+2. IP DNAT for locally generated packets (so in the iptables nat OUTPUT chain):

+The 'normal' way locally generated packets would go through the chains looks like this:

+ipOUTPUT(1)->ipOUTPUT(2)->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING

+From section 5.1 we know that this actually looks like this:

+ipOUTPUT(1)->ipOUTPUT(2)->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->ipPOSTROUTING

+Here we denote by ipOUTPUT(1) (resp. ipOUTPUT(2)) the iptables nat (resp. filter) OUTPUT chain. Note that +the ipOUTPUT(1) chain is traversed while the packet is in the ip code, while the ipOUTPUT(2) chain is traversed when +the packet has entered the bridge code. This makes it possible to do DNAT to another device in ipOUTPUT(1) and lets +one use the bridge ports in the ipOUTPUT(2) chain.

+4. Two possible ways for frames/packets to pass through the iptables PREROUTING, FORWARD and POSTROUTING +chains:

+With the br-nf patch there are 2 ways a frame/packet can pass through the 3 given iptables +chains. The first way is when the frame is bridged, so the iptables chains are called by the bridge code. +The second way is when the packet is routed. So special care has to be taken to distinguish between those +two, especially in the iptables FORWARD chain. Here's an example of strange things to look out for:

+Consider the following situation (my personal setup)

+         +-----------------+
+         |   cable modem   |
+         +-------+---------+
+                 |
+                 |
+             eth0|IP via DHCP from ISP
+         +-------+---------+
+         |bridge/router/fw |
+         +--+-----------+--+
+        eth1| 172.16.1.1|eth2
+            |   (br0)   |
+            |           |
+  172.16.1.4|           |172.16.1.2
+ +----------+---+    +--+------------+
+ |test computer/|    |    desktop    |
+ |backup server |    +---------------+
+ +--------------+

+With this setup I can test the bridge+ebtables+iptables code while having access to the internet from all +three computers. The default gateway for 172.16.1.2 and 172.16.1.4 is 172.16.1.1. 172.16.1.1 is the bridge +interface br0 with ports eth1 and eth2.

More details:

+The idea is that traffic between 172.16.1.4 and 172.16.1.2 is bridged, while the rest is routed, using masquerading. Here's the "script" I use at bootup for the bridge/router:

+iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -d 172.16.1.0/24 -j ACCEPT
+iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -j MASQUERADE
+insmod ebtables
+insmod ebtable_filter
+insmod ebtable_nat
+insmod ebt_nat
+insmod ebt_log
+insmod ebt_arp
+insmod ebt_ip
+insmod br_db
+brctl addbr br0
+brctl stp br0 off
+brctl addif br0 eth1
+brctl addif br0 eth2
+ifconfig eth1 0 0.0.0.0
+ifconfig eth2 0 0.0.0.0
+ifconfig br0 172.16.1.1 netmask 255.255.255.0 up
+echo '1' > /proc/sys/net/ipv4/ip_forward

+The catch is in the first line. Because the iptables code gets executed for both bridged packets and routed packets, we need to make a distinction between the two. We don't really want the bridged packets to be masqueraded. If we omit the first line, everything will still work, but things will happen differently. Let's say 172.16.1.2 pings 172.16.1.4. The bridge receives the ping request and will transmit it through its eth1 port after first masquerading the ip address. So the packet's source ip address will now be 172.16.1.1 and 172.16.1.4 will respond to the bridge. Masquerading will change the ip destination of this response from 172.16.1.1 back to 172.16.1.2. Everything works fine. But it's better not to have this behaviour. Thus, we use the first line of the script to avoid this. Note that if I wanted to filter the connections to and from the internet, I would certainly need the first line so I don't filter the local connections as well.
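The effect of the first two lines of the script can be modelled with a small shell function. This is only a toy model: nat_postrouting_verdict is a hypothetical helper, the glob 172.16.1.* stands in for the 172.16.1.0/24 match, the address 192.0.2.7 is a placeholder for a host on the internet, and the fallback assumes the chain policy is left at ACCEPT.

```shell
#!/bin/sh
# Toy model of the two nat POSTROUTING rules from the script: the first
# matching rule decides. Hypothetical helper.
# $1 = source ip address, $2 = destination ip address
nat_postrouting_verdict() {
    case "$1" in
        172.16.1.*)
            case "$2" in
                172.16.1.*) echo ACCEPT ;;      # line 1: local to local, left untouched
                *)          echo MASQUERADE ;;  # line 2: local to anywhere else
            esac ;;
        *) echo ACCEPT ;;                       # no rule matches: assumed policy, no NAT
    esac
}

nat_postrouting_verdict 172.16.1.2 172.16.1.4   # bridged ping between the two boxes
nat_postrouting_verdict 172.16.1.2 192.0.2.7    # routed traffic towards the internet
```

The first call prints ACCEPT and the second prints MASQUERADE. Without line 1 of the script, the first call would fall through to the masquerading rule, which is the behaviour described above.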

+5. ip DNAT in the iptables PREROUTING chain on frames/packets entering on a bridge port:

The code in /net/bridge/br_netfilter.c makes sure that DNAT'ed packets which, after DNAT'ing, have the same output device as the input device they came in on (the logical bridge device, which we like to call br0) will be bridged, not routed. So they will go through the ebtables FORWARD chain. All other DNAT'ed packets will be routed: they won't go through the ebtables FORWARD chain, will go through the ebtables INPUT chain and might go through the ebtables OUTPUT chain.

+Released under the GPL.

+Bart De Schuymer.

+Last updated the 19th May 2002.

\ No newline at end of file