Diffstat (limited to 'doc/statements.txt')
-rw-r--r-- | doc/statements.txt | 221 |
1 files changed, 185 insertions, 36 deletions
diff --git a/doc/statements.txt b/doc/statements.txt
index 7c7240c8..39b31fd2 100644
--- a/doc/statements.txt
+++ b/doc/statements.txt
@@ -11,7 +11,7 @@ The verdict statement alters control flow in the ruleset and issues policy decis
 [horizontal]
 *accept*:: Terminate ruleset evaluation and accept the packet.
 The packet can still be dropped later by another hook, for instance accept
-in the forward hook still allows to drop the packet later in the postrouting hook,
+in the forward hook still allows one to drop the packet later in the postrouting hook,
 or another forward base chain that has a higher priority number and is evaluated
 afterwards in the processing pipeline.
 *drop*:: Terminate ruleset evaluation and drop the packet.
@@ -71,7 +71,7 @@ EXTENSION HEADER STATEMENT
 
 The extension header statement alters packet content in variable-sized headers.
 This can currently be used to alter the TCP Maximum segment size of packets,
-similar to TCPMSS.
+similar to the TCPMSS target in iptables.
 
 .change tcp mss
 ---------------
@@ -80,6 +80,13 @@ tcp flags syn tcp option maxseg size set 1360
 tcp flags syn tcp option maxseg size set rt mtu
 ---------------
 
+You can also remove tcp options via reset keyword.
+
+.remove tcp option
+---------------
+tcp flags syn reset tcp option sack-perm
+---------------
+
 LOG STATEMENT
 ~~~~~~~~~~~~~
 [verse]
@@ -93,10 +100,11 @@ packets, such as header fields, via the kernel log (where it can be read with
 dmesg(1) or read in the syslog).
 
 In the second form of invocation (if 'nflog_group' is specified), the Linux
-kernel will pass the packet to nfnetlink_log which will multicast the packet
-through a netlink socket to the specified multicast group. One or more userspace
-processes may subscribe to the group to receive the packets, see
-libnetfilter_queue documentation for details.
+kernel will pass the packet to nfnetlink_log which will send the log through a
+netlink socket to the specified group. One userspace process may subscribe to
+the group to receive the logs, see man(8) ulogd for the Netfilter userspace log
+daemon and libnetfilter_log documentation for details in case you would like to
+develop a custom program to digest your logs.
 
 In the third form of invocation (if level audit is specified), the Linux kernel
 writes a message into the audit buffer suitably formatted for reading
@@ -163,37 +171,77 @@ REJECT STATEMENT
 ____
 *reject* [ *with* 'REJECT_WITH' ]
 
-'REJECT_WITH' := *icmp type* 'icmp_code' |
-                 *icmpv6 type* 'icmpv6_code' |
-                 *icmpx type* 'icmpx_code' |
+'REJECT_WITH' := *icmp* 'icmp_reject_code' |
+                 *icmpv6* 'icmpv6_reject_code' |
+                 *icmpx* 'icmpx_reject_code' |
                  *tcp reset*
 ____
 
 A reject statement is used to send back an error packet in response to the
 matched packet otherwise it is equivalent to drop so it is a terminating
 statement, ending rule traversal. This statement is only valid in base chains
-using the *input*,
+using the *prerouting*, *input*,
 *forward* or *output* hooks, and user-defined chains which are only called from
 those chains.
 
-.different ICMP reject variants are meant for use in different table families
+.Keywords may be used to reject when specifying the ICMP code
 [options="header"]
 |==================
-|Variant |Family | Type
-|icmp|
-ip|
-icmp_code
-|icmpv6|
-ip6|
-icmpv6_code
-|icmpx|
-inet|
-icmpx_code
+|Keyword | Value
+|net-unreachable |
+0
+|host-unreachable |
+1
+|prot-unreachable|
+2
+|port-unreachable|
+3
+|frag-needed|
+4
+|net-prohibited|
+9
+|host-prohibited|
+10
+|admin-prohibited|
+13
+|===================
+
+.keywords may be used to reject when specifying the ICMPv6 code
+[options="header"]
+|==================
+|Keyword |Value
+|no-route|
+0
+|admin-prohibited|
+1
+|addr-unreachable|
+3
+|port-unreachable|
+4
+|policy-fail|
+5
+|reject-route|
+6
 |==================
 
-For a description of the different types and a list of supported keywords refer
-to DATA TYPES section above. The common default reject value is
-*port-unreachable*. +
+The ICMPvX Code type abstraction is a set of values which overlap between ICMP
+and ICMPv6 Code types to be used from the inet family.
+
+.keywords may be used when specifying the ICMPvX code
+[options="header"]
+|==================
+|Keyword |Value
+|no-route|
+0
+|port-unreachable|
+1
+|host-unreachable|
+2
+|admin-prohibited|
+3
+|=================
+
+The common default ICMP code to reject is *port-unreachable*.
 Note that in bridge family, reject statement is only allowed in base chains
 which hook into input or prerouting.
 
@@ -270,7 +318,7 @@ ct event set new,related,destroy
 
 NOTRACK STATEMENT
 ~~~~~~~~~~~~~~~~~
-The notrack statement allows to disable connection tracking for certain
+The notrack statement allows one to disable connection tracking for certain
 packets.
 
 [verse]
@@ -288,7 +336,7 @@ A meta statement sets the value of a meta expression.
 The existing meta fields are: priority, mark, pkttype, nftrace. +
 
 [verse]
-*meta* {*mark* | *priority* | *pkttype* | *nftrace*} *set* 'value'
+*meta* {*mark* | *priority* | *pkttype* | *nftrace* | *broute*} *set* 'value'
 
 A meta statement sets meta data associated with a packet. +
 
@@ -308,6 +356,9 @@ pkt_type
 |nftrace |
 ruleset packet tracing on/off. Use *monitor trace* command to watch traces|
 0, 1
+|broute |
+broute on/off. packets are routed instead of being bridged|
+0, 1
 |==========================
 
 LIMIT STATEMENT
@@ -324,8 +375,13 @@ ____
 A limit statement matches at a limited rate using a token bucket filter. A rule
 using this statement will match until this limit is reached. It can be used in
 combination with the log statement to give limited logging. The optional
-*over* keyword makes it match over the specified rate. Default *burst* is 5.
-if you specify *burst*, it must be non-zero value.
+*over* keyword makes it match over the specified rate.
+
+The *burst* value influences the bucket size, i.e. jitter tolerance. With
+packet-based *limit*, the bucket holds exactly *burst* packets, by default
+five. If you specify packet *burst*, it must be a non-zero value. With
+byte-based *limit*, the bucket's minimum size is the given rate's byte value
+and the *burst* value adds to that, by default zero bytes.
 
 .limit statement values
 [options="header"]
@@ -343,8 +399,8 @@ NAT STATEMENTS
 ~~~~~~~~~~~~~~
 [verse]
 ____
-*snat* [[*ip* | *ip6*] *to*] 'ADDR_SPEC' [*:*'PORT_SPEC'] ['FLAGS']
-*dnat* [[*ip* | *ip6*] *to*] 'ADDR_SPEC' [*:*'PORT_SPEC'] ['FLAGS']
+*snat* [[*ip* | *ip6*] [ *prefix* ] *to*] 'ADDR_SPEC' [*:*'PORT_SPEC'] ['FLAGS']
+*dnat* [[*ip* | *ip6*] [ *prefix* ] *to*] 'ADDR_SPEC' [*:*'PORT_SPEC'] ['FLAGS']
 *masquerade* [*to :*'PORT_SPEC'] ['FLAGS']
 *redirect* [*to :*'PORT_SPEC'] ['FLAGS']
 
@@ -382,6 +438,9 @@ Before kernel 4.18 nat statements require both prerouting and postrouting base c
 to be present since otherwise packets on the return path won't be seen by
 netfilter and therefore no reverse translation will take place.
 
+The optional *prefix* keyword allows to map to map *n* source addresses to *n*
+destination addresses. See 'Advanced NAT examples' below.
+
 .NAT statement values
 [options="header"]
 |==================
@@ -392,7 +451,7 @@ You may specify a mapping to relate a list of tuples composed of arbitrary
 expression key with address value. |
 ipv4_addr, ipv6_addr, e.g. abcd::1234, or you can use a mapping, e.g. meta mark map { 10 : 192.168.1.2, 20 : 192.168.1.3 }
 |port|
-Specifies that the source/destination address of the packet should be modified. |
+Specifies that the source/destination port of the packet should be modified. |
 port number (16 bit)
 |===============================
 
@@ -441,6 +500,52 @@ add rule inet nat postrouting meta oif ppp0 masquerade
 
 ------------------------
 
+.Advanced NAT examples
+----------------------
+
+# map prefixes in one network to that of another, e.g. 10.141.11.4 is mangled to 192.168.2.4,
+# 10.141.11.5 is mangled to 192.168.2.5 and so on.
+add rule nat postrouting snat ip prefix to ip saddr map { 10.141.11.0/24 : 192.168.2.0/24 }
+
+# map a source address, source port combination to a pool of destination addresses and ports:
+add rule nat postrouting dnat to ip saddr . tcp dport map { 192.168.1.2 . 80 : 10.141.10.2-10.141.10.5 . 8888-8999 }
+
+# The above example generates the following NAT expression:
+#
+# [ nat dnat ip addr_min reg 1 addr_max reg 10 proto_min reg 9 proto_max reg 11 ]
+#
+# which expects to obtain the following tuple:
+# IP address (min), source port (min), IP address (max), source port (max)
+# to be obtained from the map. The given addresses and ports are inclusive.
+
+# This also works with named maps and in combination with both concatenations and ranges:
+table ip nat {
+	map ipportmap {
+		typeof ip saddr : interval ip daddr . tcp dport
+		flags interval
+		elements = { 192.168.1.2 : 10.141.10.1-10.141.10.3 . 8888-8999, 192.168.2.0/24 : 10.141.11.5-10.141.11.20 . 8888-8999 }
+	}
+
+	chain prerouting {
+		type nat hook prerouting priority dstnat; policy accept;
+		ip protocol tcp dnat ip to ip saddr map @ipportmap
+	}
+}
+
+@ipportmap maps network prefixes to a range of hosts and ports.
+The new destination is taken from the range provided by the map element.
+Same for the destination port.
+
+Note the use of the "interval" keyword in the typeof description.
+This is required so nftables knows that it has to ask for twice the
+amount of storage for each key-value pair in the map.
+
+": ipv4_addr . inet_service" would allow associating one address and one port
+with each key. But for this case, for each key, two addresses and two ports
+(The minimum and maximum values for both) have to be stored.
+
+------------------------
+
 TPROXY STATEMENT
 ~~~~~~~~~~~~~~~~
 Tproxy redirects the packet to a local socket without changing the packet header
@@ -589,13 +694,19 @@ for details.
 
 [verse]
 ____
-*queue* [*num* 'queue_number'] [*bypass*]
-*queue* [*num* 'queue_number_from' - 'queue_number_to'] ['QUEUE_FLAGS']
+*queue* [*flags* 'QUEUE_FLAGS'] [*to* 'queue_number']
+*queue* [*flags* 'QUEUE_FLAGS'] [*to* 'queue_number_from' - 'queue_number_to']
+*queue* [*flags* 'QUEUE_FLAGS'] [*to* 'QUEUE_EXPRESSION' ]
 
 'QUEUE_FLAGS' := 'QUEUE_FLAG' [*,* 'QUEUE_FLAGS']
 'QUEUE_FLAG' := *bypass* | *fanout*
+'QUEUE_EXPRESSION' := *numgen* | *hash* | *symhash* | *MAP STATEMENT*
 ____
 
+QUEUE_EXPRESSION can be used to compute a queue number
+at run-time with the hash or numgen expressions. It also
+allows one to use the map statement to assign fixed queue numbers
+based on external inputs such as the source ip address or interface names.
 
 .queue statement values
 [options="header"]
@@ -651,7 +762,7 @@ string
 ip filter forward dup to 10.2.3.4 device "eth0"
 
 # copy raw frame to another interface
-netdetv ingress dup to "eth0"
+netdev ingress dup to "eth0"
 dup to "eth0"
 
 # combine with map dst addr to gateways
@@ -661,10 +772,27 @@ dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" }
 FWD STATEMENT
 ~~~~~~~~~~~~~
 The fwd statement is used to redirect a raw packet to another interface. It is
-only available in the netdev family ingress hook. It is similar to the dup
-statement except that no copy is made.
+only available in the netdev family ingress and egress hooks. It is similar to
+the dup statement except that no copy is made.
 
+You can also specify the address of the next hop and the device to forward the
+packet to. This updates the source and destination MAC address of the packet by
+transmitting it through the neighboring layer. This also decrements the ttl
+field of the IP packet. This provides a way to effectively bypass the classical
+forwarding path, thus skipping the fib (forwarding information base) lookup.
+
+[verse]
 *fwd to* 'device'
+*fwd* [*ip* | *ip6*] *to* 'address' *device* 'device'
+
+.Using the fwd statement
+------------------------
+# redirect raw packet to device
+netdev ingress fwd to "eth0"
+
+# forward packet to next hop 192.168.200.1 via eth0 device
+netdev ingress ether saddr set fwd ip to 192.168.200.1 device "eth0"
+-----------------------------------
 
 SET STATEMENT
 ~~~~~~~~~~~~~
@@ -680,6 +808,10 @@ will not grow indefinitely) either from the set definition or from the statement
 that adds or updates them. The set statement can be used to e.g. create dynamic
 blacklists.
 
+Dynamic updates are also supported with maps. In this case, the *add* or
+*update* rule needs to provide both the key and the data element (value),
+separated via ':'.
+
 [verse]
 {*add* | *update*} *@*'setname' *{* 'expression' [*timeout* 'timeout'] [*comment* 'string'] *}*
 
@@ -764,3 +896,20 @@ ____
 # jump to different chains depending on layer 4 protocol type:
 nft add rule ip filter input ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain , icmp : jump icmp-chain }
 ------------------------
+
+XT STATEMENT
+~~~~~~~~~~~~
+This represents an xt statement from xtables compat interface. It is a
+fallback if translation is not available or not complete.
+
+[verse]
+____
+*xt* 'TYPE' 'NAME'
+
+'TYPE' := *match* | *target* | *watcher*
+____
+
+Seeing this means the ruleset (or parts of it) were created by *iptables-nft*
+and one should use that to manage it.
+
+*BEWARE:* nftables won't restore these statements.