VERDICT STATEMENT
~~~~~~~~~~~~~~~~~
The verdict statement alters control flow in the ruleset and issues policy decisions for packets.

[verse]
{accept | drop | queue | continue | return}
{jump | goto} 'chain'

[horizontal]
*accept*:: Terminate ruleset evaluation and accept the packet.
*drop*:: Terminate ruleset evaluation and drop the packet.
*queue*:: Terminate ruleset evaluation and queue the packet to userspace.
*continue*:: Continue ruleset evaluation with the next rule.
*return*:: Return from the current chain and continue evaluation at the next rule in the last chain. If issued in a base chain, it is equivalent to the base chain policy.
*jump* 'chain':: Continue evaluation at the first rule in 'chain'. The current position in the ruleset is pushed to a call stack and evaluation will continue there when the new chain is entirely evaluated or a *return* verdict is issued.
*goto* 'chain':: Similar to *jump*, but the current position is not pushed to the call stack, meaning that after the new chain is evaluated, evaluation continues at the last chain instead of the one containing the goto statement.

.Verdict statements
-------------------
# process packets from eth0 and the internal network in from_lan
# chain, drop all packets from eth0 with different source addresses.

filter input iif eth0 ip saddr 192.168.0.0/24 jump from_lan
filter input iif eth0 drop
-------------------

PAYLOAD STATEMENT
~~~~~~~~~~~~~~~~~
The payload statement alters packet content. It can be used, for example, to set the IP DSCP (diffserv) header field or IPv6 flow labels.

.route some packets instead of bridging
---------------------------------------
# redirect tcp:http from 192.168.0.0/16 to local machine for routing instead of bridging
# assumes 00:11:22:33:44:55 is local MAC address.
bridge input meta iif eth0 ip saddr 192.168.0.0/16 tcp dport 80 meta pkttype set unicast ether daddr set 00:11:22:33:44:55
-------------------------------------------

.Set IPv4 DSCP header field
---------------------------
ip forward ip dscp set 42
---------------------------

EXTENSION HEADER STATEMENT
~~~~~~~~~~~~~~~~~~~~~~~~~~
The extension header statement alters packet content in variable-sized headers. This can currently be used to alter the TCP Maximum Segment Size of packets, similar to the iptables TCPMSS target.

.change tcp mss
---------------
tcp flags syn tcp option maxseg size set 1360
# set a size based on route information:
tcp flags syn tcp option maxseg size set rt mtu
---------------

LOG STATEMENT
~~~~~~~~~~~~~
[verse]
log [prefix quoted_string] [level syslog-level] [flags log-flags]
log group nflog_group [prefix quoted_string] [queue-threshold value] [snaplen size]

The log statement enables logging of matching packets. When this statement is used from a rule, the Linux kernel will print some information on all matching packets, such as header fields, via the kernel log (where it can be read with dmesg(1) or in the syslog).

If the group number is specified, the Linux kernel will pass the packet to nfnetlink_log, which will multicast the packet through a netlink socket to the specified multicast group. One or more userspace processes may subscribe to the group to receive the packets; see the libnetfilter_log documentation for details.

This is a non-terminating statement, so the rule evaluation continues after the packet is logged.

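As a minimal sketch of the two forms in the synopsis, the rules below log once through the kernel log and once through nfnetlink_log. The table and chain names, the ports, the prefixes and the group number 2 are assumptions chosen for illustration only. Because log is non-terminating, a verdict placed after it in the same rule is still applied.

.Logging before a verdict (illustrative)
----------------------------------------
# write a line to the kernel log, then drop the packet
ip filter input tcp dport 23 log prefix "telnet: " level info drop

# pass up to 128 bytes of each matching packet to NFLOG group 2
ip filter input tcp dport 22 log group 2 prefix "ssh: " snaplen 128
----------------------------------------
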
.log statement options
[options="header"]
|==================
|Keyword | Description | Type
|prefix| Log message prefix| quoted string
|level| Syslog level of logging | string: emerg, alert, crit, err, warn [default], notice, info, debug
|group| NFLOG group to send messages to| unsigned integer (16 bit)
|snaplen| Length of packet payload to include in netlink messages | unsigned integer (32 bit)
|queue-threshold| Number of packets to queue inside the kernel before sending them to userspace | unsigned integer (32 bit)
|==================

.log statement flags
[options="header"]
|==================
| Flag | Description
|tcp sequence| Log TCP sequence numbers.
|tcp options| Log options from the TCP packet header.
|ip options| Log options from the IP/IPv6 packet header.
|skuid| Log the userid of the process which generated the packet.
|ether| Decode MAC addresses and protocol.
|all| Enable all log flags listed above.
|==================

.Using log statement
--------------------
# log the UID which generated the packet and ip options
ip filter output log flags skuid flags ip options

# log the tcp sequence numbers and tcp options from the TCP packet
ip filter output log flags tcp sequence,options

# enable all supported log flags
ip6 filter output log flags all
--------------------

REJECT STATEMENT
~~~~~~~~~~~~~~~~
[verse]
*reject* [ with {icmp | icmpv6 | icmpx} type {icmp_code | icmpv6_code | icmpx_code} ]
*reject* [ with tcp reset ]

A reject statement sends back an error packet in response to the matched packet. Otherwise it is equivalent to drop, so it is a terminating statement that ends rule traversal. This statement is only valid in the input, forward and output chains, and user-defined chains which are only called from those chains.

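The variants in the synopsis can be sketched as follows. The table and chain names and the matched ports are assumptions for illustration, not part of the reference. A TCP reset can only answer a TCP packet, so that rule matches on a TCP port first.

.Using the reject statement (illustrative)
------------------------------------------
# answer telnet connection attempts with a TCP RST
ip filter input tcp dport 23 reject with tcp reset

# reject with an explicit ICMP code in an ip table
ip filter input udp dport 161 reject with icmp type host-unreachable

# in an inet table, icmpx covers both IPv4 and IPv6 packets
inet filter input tcp dport 25 reject with icmpx type admin-prohibited
------------------------------------------
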
.different ICMP reject variants are meant for use in different table families
[options="header"]
|==================
|Variant |Family | Type
|icmp| ip| icmp_code
|icmpv6| ip6| icmpv6_code
|icmpx| inet| icmpx_code
|==================

For a description of the different types and a list of supported keywords refer to the DATA TYPES section above. The common default reject value is *port-unreachable*. +
Note that in the bridge family, the reject statement is only allowed in base chains which hook into input or prerouting.

COUNTER STATEMENT
~~~~~~~~~~~~~~~~~
A counter statement sets the hit count of packets along with the number of bytes.

[verse]
counter [ packets 'number' bytes 'number' ]

CONNTRACK STATEMENT
~~~~~~~~~~~~~~~~~~~
The conntrack statement can be used to set the conntrack mark and conntrack labels.

[verse]
*ct* {mark | event | label | zone} set 'value'

The ct statement sets metadata associated with a connection. The zone id has to be assigned before a conntrack lookup takes place, i.e. this has to be done in prerouting and possibly output (if locally generated packets need to be placed in a distinct zone), with a hook priority of -300.

.Conntrack statement types
[options="header"]
|==================
|Keyword| Description| Value
|event| conntrack event bits | bitmask, integer (32 bit)
|helper| name of ct helper object to assign to the connection | quoted string
|mark| Connection tracking mark | mark
|label| Connection tracking label| label
|zone| conntrack zone| integer (16 bit)
|==================

.save packet nfmark in conntrack
--------------------------------
ct mark set meta mark
--------------------------------

.set zone mapped via interface
------------------------------
table inet raw {
  chain prerouting {
      type filter hook prerouting priority -300;
      ct zone set iif map { "eth1" : 1, "veth1" : 2 }
  }
  chain output {
      type filter hook output priority -300;
      ct zone set oif map { "eth1" : 1, "veth1" : 2 }
  }
}
------------------------------

.restrict events reported by ctnetlink
--------------------------------------
ct event set new,related,destroy
--------------------------------------

META STATEMENT
~~~~~~~~~~~~~~
A meta statement sets the value of a meta expression. The existing meta fields are: priority, mark, pkttype, nftrace.

[verse]
meta {mark | priority | pkttype | nftrace} set 'value'

A meta statement sets metadata associated with a packet.

.Meta statement types
[options="header"]
|==================
|Keyword| Description| Value
|priority | TC packet priority| tc_handle
|mark| Packet mark | mark
|pkttype | packet type | pkt_type
|nftrace | ruleset packet tracing on/off. Use monitor trace command to watch traces| 0, 1
|==================

LIMIT STATEMENT
~~~~~~~~~~~~~~~
[verse]
*limit* rate [over] 'packet_number' / {second | minute | hour | day} [burst 'packet_number' packets]
*limit* rate [over] 'byte_number' {bytes | kbytes | mbytes} / {second | minute | hour | day | week} [burst 'byte_number' bytes]

A limit statement matches at a limited rate using a token bucket filter. A rule using this statement will match until this limit is reached.
It can be used in combination with the log statement to give limited logging. The optional *over* keyword inverts this, so that the rule matches only traffic exceeding the specified rate.

.limit statement values
[options="header"]
|==================
|Value | Description | Type
|packet_number | Number of packets | unsigned integer (32 bit)
|byte_number | Number of bytes | unsigned integer (32 bit)
|==================

NAT STATEMENTS
~~~~~~~~~~~~~~
[verse]
*snat* to address [:port] [persistent, random, fully-random]
*snat* to address - address [:port - port] [persistent, random, fully-random]
*dnat* to address [:port] [persistent, random, fully-random]
*dnat* to address [:port - port] [persistent, random, fully-random]
*masquerade* to [:port] [persistent, random, fully-random]
*masquerade* to [:port - port] [persistent, random, fully-random]
*redirect* to [:port] [persistent, random, fully-random]
*redirect* to [:port - port] [persistent, random, fully-random]

The nat statements are only valid from nat chain types.

The *snat* and *masquerade* statements specify that the source address of the packet should be modified. While *snat* is only valid in the postrouting and input chains, *masquerade* makes sense only in postrouting. The *dnat* and *redirect* statements are only valid in the prerouting and output chains; they specify that the destination address of the packet should be modified. You can also use non-base chains which are called from base chains of nat chain type. All future packets in this connection will also be mangled, and rules should cease being examined.

The *masquerade* statement is a special form of snat which always uses the outgoing interface's IP address to translate to. It is particularly useful on gateways with dynamic (public) IP addresses.

The *redirect* statement is a special form of dnat which always translates the destination address to the local host's one. It comes in handy if one only wants to alter the destination port of incoming traffic on different interfaces.

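The address range, port and flag forms from the synopsis can be sketched as a ruleset file. This is an illustration only: the table name, the interface eth0 and all addresses and ports are placeholders rather than values from a real setup.

.NAT with a port and an address range (illustrative)
----------------------------------------------------
table ip nat {
    chain prerouting {
        type nat hook prerouting priority 0;
        # send incoming HTTP on eth0 to port 8080 of an internal host
        iif eth0 tcp dport 80 dnat to 192.168.1.120:8080
    }
    chain postrouting {
        type nat hook postrouting priority 100;
        # translate the source to an address from a range, kept stable per client
        oif eth0 snat to 1.2.3.4-1.2.3.8 persistent
    }
}
----------------------------------------------------
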
Note that all nat statements require both prerouting and postrouting base chains to be present since otherwise packets on the return path won't be seen by netfilter and therefore no reverse translation will take place.

.NAT statement values
[options="header"]
|==================
|Expression| Description| Type
|address| Specifies that the source/destination address of the packet should be modified. You may specify a mapping that relates tuples of arbitrary expression keys to address values. | ipv4_addr, ipv6_addr, e.g. abcd::1234, or you can use a mapping, e.g. meta mark map { 10 : 192.168.1.2, 20 : 192.168.1.3 }
|port| Specifies that the source/destination port of the packet should be modified. | port number (16 bits)
|==================

.NAT statement flags
[options="header"]
|==================
|Flag| Description
|persistent | Gives a client the same source-/destination-address for each connection.
|random| If used then port mapping will be randomized using a randomly seeded MD5 hash mix of source and destination address and destination port.
|fully-random| If used then port mapping is generated based on a 32-bit pseudo-random algorithm.
|=============================

.Using NAT statements
---------------------
# create a suitable table/chain setup for all further examples
add table nat
add chain nat prerouting { type nat hook prerouting priority 0; }
add chain nat postrouting { type nat hook postrouting priority 100; }

# translate source addresses of all packets leaving via eth0 to address 1.2.3.4
add rule nat postrouting oif eth0 snat to 1.2.3.4

# redirect all traffic entering via eth0 to destination address 192.168.1.120
add rule nat prerouting iif eth0 dnat to 192.168.1.120

# translate source addresses of all packets leaving via eth0 to whatever
# locally generated packets would use as source to reach the same destination
add rule nat postrouting oif eth0 masquerade

# redirect incoming TCP traffic for port 22 to port 2222
add rule nat prerouting tcp dport 22 redirect to :2222
---------------------

TPROXY STATEMENT
~~~~~~~~~~~~~~~~
Tproxy redirects the packet to a local socket without changing the packet header in any way. If an argument is omitted, the corresponding value of the incoming packet is used instead. Tproxy requires an additional match in the same rule that ensures a transport protocol header is present.

[verse]
tproxy to 'address' : 'port'
tproxy to {'address' | : 'port'}

This syntax can be used in *ip/ip6* tables where the network layer protocol is obvious. Either the IP address or the port can be omitted, but at least one of them is necessary.

[verse]
tproxy {ip | ip6} to 'address' [: 'port']
tproxy to : 'port'

This syntax can be used in *inet* tables. The *ip/ip6* parameter defines the family the rule will match. The *address* parameter must be of this family. When only *port* is defined, the address family should not be specified. In this case the rule will match for both families.

.tproxy attributes
[options="header"]
|=================
| Name | Description
| address | IP address the listening socket with IP_TRANSPARENT option is bound to.
| port | Port the listening socket with IP_TRANSPARENT option is bound to.
|=================

.Example ruleset for tproxy statement
-------------------------------------
table ip x {
    chain y {
        type filter hook prerouting priority -150; policy accept;
        tcp dport ntp tproxy to 1.1.1.1
        udp dport ssh tproxy to :2222
    }
}
table ip6 x {
    chain y {
        type filter hook prerouting priority -150; policy accept;
        tcp dport ntp tproxy to [dead::beef]
        udp dport ssh tproxy to :2222
    }
}
table inet x {
    chain y {
        type filter hook prerouting priority -150; policy accept;
        tcp dport 321 tproxy to :ssh
        tcp dport 99 tproxy ip to 1.1.1.1:999
        udp dport 155 tproxy ip6 to [dead::beef]:smux
    }
}
-------------------------------------

FLOW OFFLOAD STATEMENT
~~~~~~~~~~~~~~~~~~~~~~
A flow offload statement selects the flows whose forwarding should be accelerated by bypassing the layer 3 network stack. You have to specify the name of the flowtable to which the flow is offloaded.

[verse]
*flow offload* @flowtable

QUEUE STATEMENT
~~~~~~~~~~~~~~~
This statement passes the packet to userspace using the nfnetlink_queue handler. The packet is put into the queue identified by its 16-bit queue number. Userspace can inspect and modify the packet if desired. Userspace must then drop or re-inject the packet into the kernel. See libnetfilter_queue documentation for details.

[verse]
*queue* [num 'queue_number'] [bypass]
*queue* [num 'queue_number_from' - 'queue_number_to'] [bypass,fanout]

.queue statement values
[options="header"]
|==================
|Value | Description | Type
|queue_number | Sets queue number, default is 0. | unsigned integer (16 bit)
|queue_number_from | Sets initial queue in the range, if fanout is used. | unsigned integer (16 bit)
|queue_number_to | Sets closing queue in the range, if fanout is used.
| unsigned integer (16 bit)
|=====================

.queue statement flags
[options="header"]
|==================
|Flag | Description
|bypass | Let packets go through, instead of dropping them, if no userspace application is bound to the queue. Before using this flag, read libnetfilter_queue documentation for performance tuning recommendations.
|fanout | Distribute packets between several queues.
|==================

DUP STATEMENT
~~~~~~~~~~~~~
The dup statement is used to duplicate a packet and send the copy to a different destination.

[verse]
*dup* to 'device'
*dup* to 'address' device 'device'

.Dup statement values
[options="header"]
|==================
|Expression | Description | Type
|address | Specifies that the copy of the packet should be sent to a new gateway.| ipv4_addr, ipv6_addr, e.g. abcd::1234, or you can use a mapping, e.g. ip saddr map { 192.168.1.2 : 10.1.1.1 }
|device | Specifies that the copy should be transmitted via device. | string
|==================

.Using the dup statement
------------------------
# send to machine with ip address 10.2.3.4 on eth0
ip filter forward dup to 10.2.3.4 device "eth0"

# copy raw frame to another interface
netdev ingress dup to "eth0"
dup to "eth0"

# combine with a map from destination address to device
dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" }
------------------------

FWD STATEMENT
~~~~~~~~~~~~~
The fwd statement is used to redirect a raw packet to another interface. It is only available in the netdev family ingress hook. It is similar to the dup statement except that no copy is made.

[verse]
*fwd* to 'device'

SET STATEMENT
~~~~~~~~~~~~~
The set statement is used to dynamically add or update elements in a set from the packet path. The set 'setname' must already exist in the given table and must have been created with the dynamic flag. Furthermore, these sets must specify both a maximum set size (to prevent memory exhaustion) and a timeout (so that the number of entries in the set will not grow indefinitely).
The set statement can be used to e.g. create dynamic blacklists.

[verse]
{add | update} '@setname' { 'expression' [timeout 'timeout'] [comment 'string'] }

.Example for simple blacklist
-----------------------------
# declare a set, bound to table "filter", in family "ip".
# Timeout and size are mandatory because we will add elements from packet path.
nft add set ip filter blackhole "{ type ipv4_addr; flags dynamic,timeout; size 65536; }"

# whitelist internal interface.
nft add rule ip filter input meta iifname "internal" accept

# drop packets coming from blacklisted ip addresses.
nft add rule ip filter input ip saddr @blackhole counter drop

# add source ip addresses to the blacklist if more than 10 tcp connection requests occurred per second and ip address.
# entries will timeout after one minute, after which they might be re-added if limit condition persists.
nft add rule ip filter input tcp flags syn tcp dport ssh meter flood size 128000 { ip saddr timeout 10s limit rate over 10/second } add @blackhole { ip saddr timeout 1m } drop

# inspect state of the rate limit meter:
nft list meter ip filter flood

# inspect content of blackhole:
nft list set ip filter blackhole

# manually add two addresses to the set:
nft add element filter blackhole { 10.2.3.4, 10.23.1.42 }
-----------------------------
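
The blacklist example only uses *add*. As a hedged sketch of *update*, which also restarts the timeout of an element that is already present, the ruleset below remembers recent SSH clients. The set name recent_ssh, the chain layout and the 5 minute timeout are hypothetical and chosen for illustration.

.Refreshing set elements with update (illustrative)
---------------------------------------------------
table ip filter {
    set recent_ssh {
        type ipv4_addr
        flags dynamic,timeout
        size 65536
    }

    chain input {
        type filter hook input priority 0; policy accept;
        # remember each new SSH client; seeing it again restarts its timeout
        ct state new tcp dport ssh update @recent_ssh { ip saddr timeout 5m }
    }
}
---------------------------------------------------

The set can then be inspected with the same list set command shown in the blacklist example.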