nftables - nft command line tool

	Commit message	Author	Age	Files	Lines
*	src: add gretap support	Pablo Neira Ayuso	2023-01-02	1	-2/+11
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add geneve matching support	Pablo Neira Ayuso	2023-01-02	1	-3/+29
	Add support for the GENEVE vni and (ether) type header fields.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add gre support	Pablo Neira Ayuso	2023-01-02	1	-3/+30
	GRE has a number of fields that are conditional on flags, which requires custom dependency code similar to icmp and icmpv6. Matching on optional fields is not supported at this stage.
	Since this is a layer 3 tunnel protocol, an implicit dependency on NFT_META_L4PROTO for IPPROTO_GRE is generated. To achieve this, this patch adds new infrastructure to remove an outer dependency based on the inner protocol from the delinearize path.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add vxlan matching support	Pablo Neira Ayuso	2023-01-02	1	-0/+53
	This patch adds the initial infrastructure to support inner header tunnel matching and its first user: vxlan.
	Payload and meta expressions gain a new struct proto_desc field that marks the expression as referring to inner header matching. The existing codebase to generate bytecode is fully reused, which allows reusing the already supported layer 2, 3 and 4 protocols.
	The syntax requires vxlan to be specified before the inner protocol field:
		... vxlan ip protocol udp
		... vxlan ip saddr 1.2.3.0/24
	This also works with concatenations and anonymous sets, e.g.
		... vxlan ip saddr . vxlan ip daddr { 1.2.3.4 . 4.3.2.1 }
	You have to restrict vxlan matching to udp traffic, otherwise nft complains about a missing transport protocol dependency, e.g.
		... udp dport 4789 vxlan ip daddr 1.2.3.4
	The bytecode that is generated uses the new inner expression:
		# nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
		netdev x y
		  [ meta load l4proto => reg 1 ]
		  [ cmp eq reg 1 0x00000011 ]
		  [ payload load 2b @ transport header + 2 => reg 1 ]
		  [ cmp eq reg 1 0x0000b512 ]
		  [ inner type 1 hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
		  [ cmp eq reg 1 0x00000008 ]
		  [ inner type 1 hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
		  [ cmp eq reg 1 0x04030201 ]
	JSON support is not included in this patch.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
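The constants in the `cmp eq` lines of the vxlan bytecode can be reproduced by hand. The following is a minimal Python sketch, assuming the debug output was produced on a little-endian host, so that network-order bytes loaded into a register are printed as a little-endian 32-bit word. The helper name `reg_from_wire` is made up for this illustration and is not part of nft.

```python
import socket
import struct


def reg_from_wire(wire: bytes) -> int:
    """Read up to 4 wire bytes as a little-endian 32-bit register value.

    Hypothetical helper: it only mimics how the debug output appears to
    print register contents on a little-endian machine.
    """
    return int.from_bytes(wire.ljust(4, b"\x00"), "little")


l4proto_udp = bytes([17])                # IPPROTO_UDP, a single byte
udp_dport = struct.pack("!H", 4789)      # VXLAN port in network byte order
eth_p_ip = struct.pack("!H", 0x0800)     # inner ether type for IPv4
ip_saddr = socket.inet_aton("1.2.3.4")   # inner IPv4 source address

for label, wire in [
    ("meta l4proto", l4proto_udp),
    ("udp dport", udp_dport),
    ("inner meta protocol", eth_p_ip),
    ("inner ip saddr", ip_saddr),
]:
    print(f"{label:20s} 0x{reg_from_wire(wire):08x}")
```

Under that assumption the four printed values line up with the four `cmp eq reg 1` operands in the listing.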
*	xt: Rewrite unsupported compat expression dumping	Phil Sutter	2022-12-13	1	-0/+18
	Choose a format which provides more information and is easily parseable. Then teach parsers about it and make them explicitly reject the ruleset with a meaningful explanation. Also update the man pages with some more details.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	parser_bison: display too many levels of nesting error	Pablo Neira Ayuso	2022-10-07	1	-4/+23
	Instead of hitting this assertion:
		nft: parser_bison.y:70: open_scope: Assertion `state->scope < array_size(state->scopes) - 1' failed.
		Aborted
	This is easier to trigger with implicit chains, where one level of nesting from the existing chain scope is supported.
	Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1615
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: allow burst 0 for byte ratelimit and use it as default	Pablo Neira Ayuso	2022-08-31	1	-7/+2
	Packet-based limit burst is set to 5, as in iptables. However, byte-based limit burst adds to the rate to calculate the bucket size, and this is also set to 5 (... bytes in this case). Update it to use zero byte burst by default instead.
	This patch also updates the manpage to describe how the burst value influences the kernel module's token bucket in each of the two modes. This documentation update is based on original text by Phil Sutter.
	Adjust tests/py to silence warnings due to mismatching byte burst.
	Fixes: 285baccfea46 ("src: disallow burst 0 in ratelimits")
	Acked-by: Phil Sutter <phil@nwl.cc>
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
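To illustrate the statement that the byte burst adds to the rate when sizing the bucket, here is a simplified Python model of a byte-based token bucket. It is a sketch only and not the kernel's limit implementation: the class name, the per-second refill and the float arithmetic are all assumptions chosen for readability.

```python
class ByteTokenBucket:
    """Toy byte-based token bucket (illustrative, not the kernel code)."""

    def __init__(self, rate: int, burst: int = 0):
        self.rate = rate                  # bytes added per second
        self.capacity = rate + burst      # burst is added on top of the rate
        self.tokens = float(self.capacity)
        self.last = 0.0                   # timestamp of the previous packet

    def allow(self, size: int, now: float) -> bool:
        """Return True if a packet of `size` bytes fits at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill proportionally to elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


default = ByteTokenBucket(rate=1000)            # burst 0: bucket holds 1000 bytes
bursty = ByteTokenBucket(rate=1000, burst=500)  # bucket holds 1500 bytes
print(default.capacity, bursty.capacity)
```

In this model a burst of zero still lets through one full second's worth of bytes at once, which is why a zero default is usable for byte-based limits.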
*	parser: add missing synproxy scope closure	Florian Westphal	2022-06-27	1	-1/+1
	Fixes: 232f2c3287fc ("scanner: synproxy: Move to own scope")
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	parser_bison: fix error location for set elements	Pablo Neira Ayuso	2022-06-27	1	-2/+2
	opt_newline interferes since it points to the previous line. Refer to the set element key for error reporting.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	Revert "scanner: flags: move to own scope"	Florian Westphal	2022-06-10	1	-15/+14
	Excess nesting of scanner scopes is very fragile and error prone:
	rule `iif != lo ip daddr 127.0.0.1/8 counter limit rate 1/second log flags all prefix "nft_lo4 " drop` fails with `Error: No symbol type information` hinting at `prefix`.
	Problem is that we nest via:
		counter
			limit
				log
					flags
	By the time 'prefix' is scanned, state is still stuck in 'counter' due to this nesting. Working around "prefix" isn't enough: any other keyword, e.g. "level" in 'flags all level debug', will be parsed as 'string' too.
	So, revert this.
	Fixes: a16697097e2b ("scanner: flags: move to own scope")
	Reported-by: Christian Göttsche <cgzones@googlemail.com>
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	scanner: dup, fwd, tproxy: Move to own scopes	Phil Sutter	2022-03-01	1	-3/+6
	With these three scopes in place, keyword 'to' may be isolated.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: meta: Move to own scope	Phil Sutter	2022-03-01	1	-4/+5
	This allows isolating the 'length' and 'protocol' keywords, which are shared by other scopes as well.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: at: Move to own scope	Phil Sutter	2022-03-01	1	-7/+8
	Modification of raw TCP option rule is a bit more complicated to avoid pushing tcp_hdr_option_type into the introduced scope by accident.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: nat: Move to own scope	Phil Sutter	2022-03-01	1	-6/+7
	Unify nat, masquerade and redirect statements, since they largely share their syntax.
	Note the workaround of adding "prefix" to SCANSTATE_IP. This is required to fix 'snat ip prefix ...' style expressions.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: policy: move to own scope	Phil Sutter	2022-03-01	1	-3/+4
	Isolate 'performance' and 'memory' keywords.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: flags: move to own scope	Phil Sutter	2022-03-01	1	-14/+15
	This isolates at least 'constant', 'dynamic' and 'all' keywords.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: reject: Move to own scope	Phil Sutter	2022-03-01	1	-1/+2
	Two more keywords isolated.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: import, export: Move to own scopes	Phil Sutter	2022-03-01	1	-2/+4
	In theory, one could use a common scope for both import and export commands, since their parameters are identical.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: reset: move to own Scope	Phil Sutter	2022-03-01	1	-3/+4
	Isolate two more keywords shared with list command.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: monitor: Move to own Scope	Phil Sutter	2022-03-01	1	-1/+2
	Some keywords are shared with list command.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: rt: Extend scope over rt0, rt2 and srh	Phil Sutter	2022-03-01	1	-3/+3
	These are technically all just routing headers with different types, so unify them under the same scope.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: type: Move to own scope	Phil Sutter	2022-03-01	1	-32/+33
	As a side-effect, this fixes use of 'classid' as set data type.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: dst, frag, hbh, mh: Move to own scopes	Phil Sutter	2022-03-01	1	-8/+12
	These are the remaining IPv6 extension header expressions; only the rt expression was scoped already.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: ah, esp: Move to own scopes	Phil Sutter	2022-03-01	1	-4/+6
	They share 'sequence' keyword with icmp and tcp expressions.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: osf: Move to own scope	Phil Sutter	2022-03-01	1	-2/+3
	It shares two keywords with PARSER_SC_IP.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: dccp, th: Move to own scopes	Phil Sutter	2022-03-01	1	-4/+6
	With them in place, heavily shared keywords 'sport' and 'dport' may be isolated.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: udp{,lite}: Move to own scope	Phil Sutter	2022-03-01	1	-5/+7
	All used keywords are shared with others, so there is no separation for now apart from 'csumcov', which was actually missing from scanner.l.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: comp: Move to own scope.	Phil Sutter	2022-03-01	1	-2/+3
	Isolates only 'cpi' keyword for now.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: synproxy: Move to own scope	Phil Sutter	2022-03-01	1	-7/+8
	Quite a few keywords are shared with PARSER_SC_TCP.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: tcp: Move to own scope	Phil Sutter	2022-03-01	1	-1/+1
	Apart from header fields, this isolates TCP option types and fields, too.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: igmp: Move to own scope	Phil Sutter	2022-03-01	1	-1/+2
	At least isolates 'mrt' and 'group' keywords; the latter is shared with the log statement.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: icmp{,v6}: Move to own scope	Phil Sutter	2022-03-01	1	-6/+7
	Unify the two, since their header fields are almost identical.
	Signed-off-by: Phil Sutter <phil@nwl.cc>
*	src: add tcp option reset support	Florian Westphal	2022-02-28	1	-0/+11
	This allows replacing a tcp option with nops, similar to the TCPOPTSTRIP feature of iptables.
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	parser_bison: missing synproxy support in map declarations	Pablo Neira Ayuso	2022-01-19	1	-0/+1
	Update parser to allow for maps with synproxy.
	Fixes: f44ab88b1088 ("src: add synproxy stateful object support")
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser: allow quoted string in flowtable_expr_member	Stijn Tintel	2021-12-23	1	-1/+8
	Devices with interface names starting with a digit cannot be configured in flowtables. Trying to do so throws the following error:
		Error: syntax error, unexpected number, expecting comma or '}'
		devices = { eth0, 6in4-wan6 };
	This is however a perfectly valid interface name. Solve the issue by allowing the use of quoted strings.
	Suggested-by: Jo-Philipp Wich <jo@mein.io>
	Signed-off-by: Stijn Tintel <stijn@linux-ipv6.be>
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	iptopt: fix crash with invalid field/type combo	Florian Westphal	2021-12-07	1	-0/+4
		% nft describe ip option rr value
		segmentation fault
	After this fix, this exits with 'Error: unknown ip option type/field'.
	Problem is that 'rr' doesn't have a value template, so the template struct is all-zeroes, so we crash when trying to use tmpl->dtype (it's NULL).
	Furthermore, expr_describe tries to print expr->identifier but expr is exthdr, not symbol: ->identifier contains garbage.
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	ipopt: drop unused 'ptr' argument	Florian Westphal	2021-12-07	1	-2/+2
	It's always 0, so remove it. Looks like this was intended to support variable options that have array-like members, but so far this isn't implemented. Better to remove dead code and implement it properly when such support is needed.
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	mptcp: add subtype matching	Florian Westphal	2021-12-01	1	-1/+10
	MPTCP multiplexes the various mptcp signalling data using the first 4 bits of the mptcp option.
	This allows matching on the mptcp subtype via:
		tcp option mptcp subtype 1
	This misses delinearization support. mptcp subtype is the first tcp option field that has a length of less than one byte. Serialization processing will add a binop for this, but netlink delinearization can't remove them, yet.
	Also misses a new datatype/symbol table to allow use of mnemonics like 'mp_join' instead of raw numbers.
	For this reason, no tests are added yet.
	Signed-off-by: Florian Westphal <fw@strlen.de>
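A sub-byte field such as the mptcp subtype has to be isolated with a shift or a mask before it can be compared, which is the job of the binop mentioned above. The Python sketch below assumes the layout from RFC 8684 as I read it (option kind 30, a length byte, then the subtype in the upper 4 bits of the third byte). The function names are invented for this example.

```python
MPTCP_KIND = 30  # TCP option kind assigned to MPTCP (RFC 8684)


def mptcp_subtype(option: bytes) -> int:
    """Return the 4-bit subtype of a raw MPTCP TCP option."""
    if len(option) < 3 or option[0] != MPTCP_KIND:
        raise ValueError("not an MPTCP option")
    # The subtype occupies the upper nibble of the byte after kind and length.
    return option[2] >> 4


def matches_subtype(option: bytes, subtype: int) -> bool:
    """Mask-and-compare variant, closer to a binop followed by a cmp."""
    if len(option) < 3 or option[0] != MPTCP_KIND:
        return False
    return (option[2] & 0xF0) == (subtype << 4)


# Hypothetical 12-byte option carrying subtype 1 (MP_JOIN in RFC 8684).
mp_join = bytes([MPTCP_KIND, 12, 0x10]) + bytes(9)
print(mptcp_subtype(mp_join), matches_subtype(mp_join, 1))
```

The second function shows why a reverse mapping is needed on listing: the comparison is made against the masked byte, not against the subtype number the user typed.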
*	tcpopt: add md5sig, fastopen and mptcp options	Florian Westphal	2021-12-01	1	-2/+8
	Allow using the "fastopen", "md5sig" and "mptcp" mnemonics rather than the raw option numbers.
	These new keywords are only recognized while the scanner is in tcp state.
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	parser: split tcp option rules	Florian Westphal	2021-12-01	1	-19/+61
	At this time the parser will accept nonsensical input like
		tcp option mss left 2
	which will be treated as 'tcp option maxseg size 2'. This is because the enum space overlaps.
	Split the rules so that 'tcp option mss' will only accept field names specific to the mss/maxseg option kind.
	Signed-off-by: Florian Westphal <fw@strlen.de>
	(cherry picked from commit 46168852c03d73c29b557c93029dc512ca6e233a)
*	scanner: add tcp flex scope	Florian Westphal	2021-12-01	1	-5/+6
	This moves tcp options not used anywhere else (e.g. in synproxy) to a distinct scope. This also makes it possible to avoid exposing new option keywords in the ruleset context.
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	tcpopt: remove KIND keyword	Florian Westphal	2021-12-01	1	-3/+1
	tcp option <foo> kind ... never makes any sense, as "tcp option <foo>" already tells the kernel to look for the foo <kind>.
	"tcp option sack kind 5" matches if the sack option is present; it's a more complicated form of the simpler "tcp option sack exists".
	"tcp option sack kind 1" (or any other value than 5) will never match.
	So remove this. Test cases are converted to "exists".
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	parser: allow for string raw payload base	Pablo Neira Ayuso	2021-11-16	1	-2/+11
	Remove the new 'ih' token and allow representing the raw payload base with a string instead.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: raw payload match and mangle on inner header / payload data	Pablo Neira Ayuso	2021-11-08	1	-0/+2
	This patch adds support to match on inner header / payload data:
		# nft add rule x y @ih,32,32 0x14000000 counter
	You can also mangle payload data:
		# nft add rule x y @ih,32,32 set 0x14000000 counter
	This update triggers a checksum update at the layer 4 header via csum_flags; mangling odd bytes is also aligned to 16 bits.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
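As a rough illustration of what `@ih,32,32` selects, the Python sketch below extracts a bit field from a byte string, assuming the raw payload syntax is `@base,offset,length` with offset and length given in bits and counted from the start of the chosen base. The function name and the sample payload are hypothetical.

```python
def raw_payload(data: bytes, bit_offset: int, bit_len: int) -> int:
    """Return `bit_len` bits starting `bit_offset` bits into `data`."""
    total_bits = len(data) * 8
    shift = total_bits - bit_offset - bit_len
    if bit_offset < 0 or bit_len <= 0 or shift < 0:
        raise ValueError("field lies outside the supplied data")
    # Treat the data as one big-endian integer, then shift and mask.
    value = int.from_bytes(data, "big")
    return (value >> shift) & ((1 << bit_len) - 1)


# Made-up inner payload: 4 bytes of zeros followed by 14 00 00 00.
inner = bytes.fromhex("0000000014000000")
print(hex(raw_payload(inner, 32, 32)))
```

With this sample data the extracted field equals the `0x14000000` operand used in the rule.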
*	parser: extend limit syntax	Jeremy Sowden	2021-11-03	1	-0/+5
	The documentation describes the syntax of limit statements thus:
		limit rate [over] packet_number / TIME_UNIT [burst packet_number packets]
		limit rate [over] byte_number BYTE_UNIT / TIME_UNIT [burst byte_number BYTE_UNIT]
		TIME_UNIT := second | minute | hour | day
		BYTE_UNIT := bytes | kbytes | mbytes
	From this one might infer that a limit may be specified by any of the following:
		limit rate 1048576/second
		limit rate 1048576 mbytes/second
		limit rate 1048576 / second
		limit rate 1048576 mbytes / second
	However, the last does not currently parse:
		$ sudo /usr/sbin/nft add filter input limit rate 1048576 mbytes / second
		Error: wrong rate format
		add filter input limit rate 1048576 mbytes / second
		                 ^^^^^^^^^^^^^^^^^^^^^^^^^
	Extend the `limit_rate_bytes` parser rule to support it, and add some new Python test-cases.
	Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
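The four spellings listed in the limit syntax message differ only in the optional byte unit and in the whitespace around the slash. The following Python sketch recognises all four with a regular expression. It is a toy stand-in for the documented grammar, not the bison `limit_rate_bytes` rule, and it leaves out the optional burst clause for brevity.

```python
import re

# Illustrative pattern built from the documented TIME_UNIT and BYTE_UNIT lists.
LIMIT_RATE = re.compile(
    r"limit rate (?P<over>over )?"
    r"(?P<number>\d+)"
    r"(?: (?P<byte_unit>bytes|kbytes|mbytes))?"
    r"\s*/\s*"
    r"(?P<time_unit>second|minute|hour|day)"
)


def parse_limit_rate(text: str) -> dict:
    """Parse a 'limit rate ...' string into its components."""
    m = LIMIT_RATE.fullmatch(text)
    if m is None:
        raise ValueError(f"wrong rate format: {text!r}")
    return {
        "over": m["over"] is not None,
        "number": int(m["number"]),
        "byte_unit": m["byte_unit"],   # None for packet-based limits
        "time_unit": m["time_unit"],
    }


for rule in (
    "limit rate 1048576/second",
    "limit rate 1048576 mbytes/second",
    "limit rate 1048576 / second",
    "limit rate 1048576 mbytes / second",
):
    print(parse_limit_rate(rule))
```

Allowing optional whitespace on both sides of the slash is the part that corresponds to the extension described in the commit.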
*	parser: add `limit_rate_pkts` and `limit_rate_bytes` rules	Jeremy Sowden	2021-11-03	1	-62/+59
	Factor the `N / time-unit` and `N byte-unit / time-unit` expressions from limit expressions out into separate `limit_rate_pkts` and `limit_rate_bytes` rules respectively.
	Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser: add new `limit_bytes` rule	Jeremy Sowden	2021-11-03	1	-6/+9
	Refactor the `N byte-unit` expression out of the `limit_bytes_burst` rule into a separate `limit_bytes` rule.
	Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: queue: consolidate queue statement syntax	Pablo Neira Ayuso	2021-08-20	1	-3/+11
	Print queue statement using the 'queue ... to' syntax to consolidate the syntax around Florian's proposal introduced in 6cf0f2c17bfb ("src: queue: allow use of arbitrary queue expressions").
	Retain backward compatibility, 'queue num' syntax is still allowed.
	Update and add new tests.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser: permit symbolic define for 'queue num' again	Florian Westphal	2021-08-20	1	-0/+1
	When I simplified the parser to restrict 'queue num', I forgot that instead of a range or an immediate value it's also allowed to pass in a variable expression, e.g.
		define myq = 0
		add rule ... 'queue num $myq bypass'
	Allow those as well and add a test case for this.
	Fixes: 767f0af82a389 ("parser: restrict queue num expressiveness")
	Reported-by: Amish <anon.amish@gmail.com>
	Signed-off-by: Florian Westphal <fw@strlen.de>
*	mnl: revisit hook listing	Pablo Neira Ayuso	2021-08-06	1	-20/+5
	Update this command to display the hook datapath for a packet depending on its family.
	This patch also includes:
	- Group existing hooks based on the hook location.
	- Order hooks by priority, from INT_MIN to INT_MAX.
	- Do not add sign to priority zero.
	- Refresh include/linux/netfilter/nfnetlink_hook.h cache copy.
	- Use NFNLA_CHAIN_* attributes to print the chain family, table and name. If NFNLA_CHAIN_* attributes are not available, display the hookfn name.
	- Update syntax: remove optional hook parameter, promote the 'device' argument.
	The following example shows the hook datapath for IPv4 packets coming in from netdevice 'eth0':
		# nft list hooks ip device eth0
		family ip {
			hook ingress {
				+0000000010 chain netdev x y [nf_tables]
				+0000000300 chain inet m w [nf_tables]
			}
			hook input {
				-0000000100 chain ip a b [nf_tables]
				+0000000300 chain inet m z [nf_tables]
			}
			hook forward {
				-0000000225 selinux_ipv4_forward
				0000000000 chain ip a c [nf_tables]
			}
			hook output {
				-0000000225 selinux_ipv4_output
			}
			hook postrouting {
				+0000000225 selinux_ipv4_postroute
			}
		}
	Note that the listing above includes the existing netdev and inet hooks/chains which might interfere in the travel of an incoming IPv4 packet. This allows users to debug the pipeline, basically, to understand in what order the hooks/chains are evaluated for the IPv4 packets.
	If the netdevice is not specified, then the ingress hooks are not shown.
	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
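The ordering and sign rules from the hook listing message can be sketched in a few lines of Python. This only mirrors the output format visible in the example (an explicit sign, ten zero-padded digits, no sign for zero). The function name and the sample data are made up, and this is not the code nft uses.

```python
def format_priority(prio: int) -> str:
    """Render a hook priority as in the listing: sign plus 10 digits."""
    if prio == 0:
        return f"{0:010d}"            # priority zero gets no sign
    sign = "+" if prio > 0 else "-"
    return f"{sign}{abs(prio):010d}"


# Hypothetical hooks registered at one hook location, in arbitrary order.
forward_hooks = [
    (0, "chain ip a c [nf_tables]"),
    (-225, "selinux_ipv4_forward"),
]

# Sorting the tuples orders them by priority, lowest first.
for prio, name in sorted(forward_hooks):
    print(format_priority(prio), name)
```

Sorting ascending places negative priorities first, which matches the order in which the entries appear within each hook block of the example.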