diff options
author | Bart De Schuymer <bdschuym@pandora.be> | 2002-06-02 14:02:18 +0000 |
---|---|---|
committer | Bart De Schuymer <bdschuym@pandora.be> | 2002-06-02 14:02:18 +0000 |
commit | 08934e3091865e6a4165702401f032e8327e380e (patch) | |
tree | 09d072080007dc6ec5ded07133c65bde65728c45 /docs | |
parent | 8ecde2955fcc739657e95f87e53e74092873c892 (diff) |
describes the br-nf/ebtables/iptables interaction
Diffstat (limited to 'docs')
-rwxr-xr-x | docs/how_it_works.html | 255 |
1 files changed, 255 insertions, 0 deletions
diff --git a/docs/how_it_works.html b/docs/how_it_works.html new file mode 100755 index 0000000..9ec8b27 --- /dev/null +++ b/docs/how_it_works.html @@ -0,0 +1,255 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd"> +<HTML><HEAD><TITLE>How bridge/ebtables/iptables interaction works</TITLE> +<META http-equiv=Content-Type content="text/html; charset=iso-8859-15"> +<STYLE type=text/css>H1 { + FONT: bold 25pt Times, serif; TEXT-ALIGN: center; TEXT-DECORATION: underline +} +P { + FONT: 20pt Times, serif +} +LI { + MARGIN-BOTTOM: 2em; FONT: 22pt 'Times New Roman', serif +} +PRE { + FONT: 18pt Courier, monospace +} +.statement { + TEXT-DECORATION: underline +} +.section { + FONT: bold 22pt Times +} +.case { + FONT-STYLE: italic +} +</STYLE> + +<META content="MSHTML 6.00.2505.0" name=GENERATOR></HEAD> +<BODY> +<H1>How bridge/ebtables/iptables interaction works</H1> + +<P class=section>1. How frames traverse the <EM>ebtables</EM> chains:</P> +<P>This section only considers <EM>ebtables</EM>, _not_ <EM>iptables</EM>.</P> +<PRE> + Route + ^ + | +I +--------+ Bridge +----------+ +-------+ +-----------+ O +N->|BROUTING|-------->|PREROUTING|----->[BRIDGING]---->|FORWARD| ---->|POSTROUTING|-->U + +--------+ +----------+ [DECISION] +-------+ +-----------+ T + | ^ + v | + +-----+ +----------+ + |INPUT| |OUTPUT (2)| + +-----+ +----------+ + | ^ + | | + | +----------+ + | +OUTPUT (1)+ + | +----------+ + | ^ + +------->Local Process---------+ +</PRE> +<P> +First thing to keep in mind is that we are talking about the ethernet layer here, +so the OSI layer 2. A packet destined for the local computer according to the bridge +(which works on the ethernet layer) isn't necessarily destined for the local computer +according to the ip layer. That's how routing works (MAC destination is the router, ip +destination is the actual box you want to communicate with).</P> +<P> +<EM>Ebtables</EM> currently has three tables: filter, nat and broute. The filter table has a +FORWARD, INPUT and OUTPUT chain. The nat table has a PREROUTING, OUTPUT and POSTROUTING chain. +The broute table has the BROUTING chain. In the figure the filter OUTPUT chain has (2) +appended and the nat OUTPUT chain has (1) appended. So these two OUTPUT chains are not +the same (and have a different intended use).</P> +<P> +When a nic enslaved to a bridge receives a frame, the frame will first go through the BROUTING +chain. In this special chain one can choose whether to route or bridge frames. The default +is bridging and we will assume the decision in this chain is 'bridge'. So, next the frame +passes through the PREROUTING chain. This chain is intended for you to be able to alter the +destination MAC address of +frames (DNAT). If the frame passes this chain, the bridging code will decide where the +frame should be sent. The bridge does this by looking at the destination MAC address, it +doesn't care about the OSI layer 3 addresses (e.g. ip address). Note that frames coming in +on non-forwarding ports of a bridge will not be seen by <EM>ebtables</EM>, not even by the BROUTING +chain.</P> +<P> +If the bridge decides the frame is for the bridging computer, the frame will go through the +INPUT chain. In this chain you can filter frames destined for the bridge box. After passing +the INPUT chain, the frame will be given to the code on layer 3 (i.e. it will be passed up), +e.g. to the ip code. So, a routed ip packet will go through the <EM>ebtables</EM> INPUT chain, not +through the <EM>ebtables</EM> FORWARD chain. This is logical.</P> +<P> +Else the frame should possibly be sent onto another side of the bridge. If it should, the +frame will go through the FORWARD chain and the POSTROUTING chain. In the FORWARD chain one +can filter frames that will be bridged, the POSTROUTING chain is intended to be able to +change the MAC source address (SNAT).</P> +<P> +Frames that originate from the bridge box itself will go, after the bridging decision, through the +nat OUTPUT chain, through the filter OUTPUT chain and the POSTROUTING chain. The +nat OUTPUT chain allows you to alter the destination MAC address and the filter OUTPUT chain +allows you to filter frames originating from the bridge box. Note that the nat OUTPUT chain is +traversed after the bridging decision, so actually too late. We should change this. The POSTROUTING +chain is the same one as described above. Note that it is also possible for routed frames to go +through these chains, this is when the destination device is a logical bridge device.</P> +<P class=section> +2. A machine used as a bridge and a router (not a brouter):</P> +<P> +It's possible to see a single ip packet pass the PREROUTING, INPUT, nat OUTPUT, filter OUTPUT +and POSTROUTING <EM>ebtables</EM> chains.</P> +<P> +This can happen when the bridge is also used as a router. The ethernet frame(s) containing that +ip packet will have the bridge's destination MAC address, while the destination ip address is not +that of the bridge. Including the <EM>iptables</EM> chains, this is how the ip packet runs through the +bridge/router (eb=ebtables , ip=iptables ):</P> +<PRE>ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->send packet</PRE> +<P> +This assumes that the routing decision sends the packet to a bridge interface. If the routing +decision sends the packet to a physical network card, this is what happens:</P> +<PRE>ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ipPOSTROUTING->send packet</PRE> +<P> +What is obviously "asymmetric" here is that the <EM>iptables</EM> PREROUTING chain is traversed before +the <EM>ebtables</EM> INPUT chain, however this can not be helped. See the next section.</P> +<P class=section> +3. DNATing bridged packets:</P> +<P> +Take an ip packet received by the bridge, it enters the bridge code. Lets assume we want to do +some ip DNAT on it. Changing the destination address of the packet (ip address and MAC address) +has to happen before the bridge code decides what to do with the packet. The bridge code can decide +to bridge it (if the destination MAC address is on another side of the bridge), flood it over all +the forwarding bridge ports (the position of the box with the destination MAC is unknown to the bridge), +give it to the higher protocol code (here, the ip code) if the destination MAC address is that of the +bridge, or ignore it (the destination MAC address is located on the same side of the bridge).</P> +<P> +So, this ip DNAT has to happen very early in the bridge code. Namely before the bridge code +actually does anything. This is at the same place as where the <EM>ebtables</EM> PREROUTING chain will +be traversed (for the same reason).</P> +<P class=section> +4. Chain traversal for bridged ip packets:</P> +<P> +A bridged packet never enters any network code above layer 2. So a bridged ip packet will never +enter the ip code. Therefore all <EM>iptables</EM> chains will be traversed while the ip packet is in the +bridge code. The chain traversal will look like this:</P> +<PRE> +ebPREROUTING->ipPREROUTING->ebFORWARD->ipFORWARD->ebPOSTROUTING->ipPOSTROUTING</PRE> +<P> +Once again note that there is a certain form of asymmetry here that cannot be helped.</P> +<P class=section> +5. Using a bridge port in <EM>iptables</EM> rules:</P> +<P> +The wish to be able to use physical devices belonging to a bridge (bridge ports) in <EM>iptables</EM> rules +is valid. It's necessary to prevent spoofing attacks. Say br0 has ports eth0 and eth1. If <EM>iptables</EM> +rules can only use br0 there's no way of knowing when a box on the eth0 side changes it's source ip +address to that of a box on the eth1 side, except by looking at the MAC source address (and then +still...). With the current bridge/iptables patch (0.0.6 or later) you can use eth0 and eth1 in your +<EM>iptables</EM> rules and therefore catch these attempts.</P> +<P class=case> +1. <EM>iptables</EM> wants to use bridge ports:<P> +<P> +To make this possible the <EM>iptables</EM> chains have to be traversed after the bridge code decided where +the frame needs to be sent (eth0, eth1, both or none). This has some impact on the scheme presented +in section 2 (so, we are looking at routed traffic here). It actually looks like this:</P> +<PRE> +ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ebOUTPUT(1)->ebOUTPUT(2)->ipPOSTROUTING->ebPOSTROUTING->send packet</PRE> +<P> +Note that this is the work of the br-nf patch. If one does not compile the br-nf code into the kernel, +the chains will be traversed as shown below. However, then one can only use br0, not eth0/eth1 to +filter.</P> +<PRE>ebPREROUTING->ebINPUT->ipPREROUTING->ipFORWARD->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->send packet</PRE> +<P> +Notice that ipPREROUTING is now in the natural position in the chain list and too far to be able to change +the bridging decision. More precise: ipPREROUTING is now traversed while the packet is in the ip code.</P> +<P class=case> +2. IP DNAT for locally generated packets (so in the <EM>iptables</EM> nat OUTPUT chain):</P> +<P> +The 'normal' way locally generated packets would go through the chains looks like this:</P> +<PRE> +ipOUTPUT(1)->ipOUTPUT(2)->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING</PRE> +<P> +From the section 5.1 we know that this actually looks like this:</P> +<PRE> +ipOUTPUT(1)->ipOUTPUT(2)->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->ipPOSTROUTING</PRE> +<P> +Here we denote by ipOUTPUT(1) (resp. ipOUTPUT(2)) the <EM>iptables</EM> nat (resp. filter) OUTPUT chain. Note that +the ipOUTPUT(1) chain is traversed while the packet is in the ip code, while the ipOUTPUT(2) chain is traversed when +the packet has entered the bridge code. This makes it possible to do DNAT to another device in ipOUTPUT(1) and lets +one use the bridge ports in the ipOUTPUT(2) chain.</P> +<P class=section> +4. Two possible ways for frames/packets to pass through the <EM>iptables</EM> PREROUTING, FORWARD and POSTROUTING +chains:</P> +<P> +With the br-nf patch there are 2 ways a frame/packet can pass through the 3 given <EM>iptables</EM> +chains. The first way is when the frame is bridged, so the <EM>iptables</EM> chains are called by the bridge code. +The second way is when the packet is routed. So special care has to be taken to distinguish between those +two, especially in the <EM>iptables</EM> FORWARD chain. Here's an example of strange things to look out for:</P> +<P> +Consider the following situation (my personal setup)</P> +<PRE> + +-----------------+ + | cable modem | + +-------+---------+ + | + | + eth0|IP via DHCP from ISP + +-------+---------+ + |bridge/router/fw | + +--+-----------+--+ + eth1| 172.16.1.1|eth2 + | (br0) | + | | + 172.16.1.4| |172.16.1.2 + +----------+---+ +--+------------+ + |test computer/| | desktop | + |backup server | +---------------+ + +--------------+</PRE> +<P> +With this setup I can test the bridge+ebtables+iptables code while having access to the internet from all +three computers. The default gateway for 172.16.1.2 and 172.16.1.4 is 172.16.1.1. 172.16.1.1 is the bridge +interface br0 with ports eth1 and eth2.</P> +<P class=case>More details:</P> +<P> +The idea is that traffic between 172.16.1.4 and 172.16.2 is bridged, while the rest is routed, using +masquerading. Here's the "script" I use at bootup for the bridge/router:</P> +<PRE> +iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -d 172.16.1.0/24 -j ACCEPT +iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -j MASQUERADE +insmod ebtables +insmod ebtable_filter +insmod ebtable_nat +insmod ebt_nat +insmod ebt_log +insmod ebt_arp +insmod ebt_ip +insmod br_db +brctl addbr br0 +brctl stp br0 off +brctl addif br0 eth1 +brctl addif br0 eth2 +ifconfig eth1 0 0.0.0.0 +ifconfig eth2 0 0.0.0.0 +ifconfig br0 172.16.1.1 netmask 255.255.255.0 up +echo '1' > /proc/sys/net/ipv4/ip_forward</PRE> +<P> +The catch is in the first line. Because the <EM>iptables</EM> code gets executed for both bridged packets and routed +packets we need to make a distinction between the two. We don't really want the bridged packets to be +masqueraded. If we omit the first line then everything will work too, but things will happen differently. +Let's say 172.16.1.2 pings 172.16.1.4. The bridge receives the ping request and will transmit it through its eth1 +port after first masquerading the ip address. So the packet's source ip address will now be 172.16.1.1 and +172.16.1.4 will respond to the bridge. Masquerading will change the ip destination of this response from +172.16.1.1 to 172.16.1.4. Everything works fine. But it's better not to have this behaviour. Thus, we use the +first line of the script to avoid this. Note that if I wanted to filter the connections to and from the +internet, I would certainly need the first line so I don't filter the local connections as well.</P> +<P class=section> +5. ip DNAT in the <EM>iptables</EM> PREROUTING chain on frames/packets entering on a bridge port:</P> +<P>Through some groovy play it is assured that (see /net/bridge/br_netfilter.c) DNAT'ed packets that after DNAT'ing +have the same output device as the input device they came on (the logical bridge device which we like to call br0) +will be bridged, not routed. So they will go through the <EM>ebtables</EM> FORWARD chain. All other DNAT'ed packets will be +routed, so won't go through the <EM>ebtables</EM> FORWARD chain, will go through the <EM>ebtables</EM> INPUT chain and might go +through the <EM>ebtables</EM> OUTPUT chain.</P> +<P> +Released under the GPL.</P> +<P> +Bart De Schuymer.</P> +<P> +Last updated the 19th May 2002.</P> +</BODY></HTML>
\ No newline at end of file |