nftables - nft command line tool

	Commit message (Collapse)	Author	Age	Files	Lines
*	netlink: add size description for constant sets	Pablo Neira Ayuso	2017-05-26	1	-0/+2
\| \| \| \| \| \| \| \|	The kernel side can make better decisions with this information when selecting the right backend, so add this information to the set netlink message. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	segtree: reset element size counter before adding intervals to set	Pablo Neira Ayuso	2017-05-26	1	-0/+2
\| \| \| \| \| \| \|	Otherwise we get double the real size in terms of set elements during the interval expansion to individual elements. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	rule: adjust set expression size accordingly with intervals	Pablo Neira Ayuso	2017-05-26	1	-6/+11
\| \| \| \| \| \| \|	For implicit sets, we have to call set_to_intervals() before we add the set so we have the net size in elements. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: ip: switch implicit dependencies to meta l4proto too	Florian Westphal	2017-05-19	2	-7/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	after ip6 nexthdr also switch ip to meta l4proto instead of ip protocol. While its needed for ipv6 (due to extension headers) this isn't needed for ip but it has the advantage that tcp dport 22 produces same expressions for ip/ip6/inet families. Signed-off-by: Florian Westphal <fw@strlen.de>
*	payload: enforce ip/ip6 protocol depending on icmp or icmpv6	Florian Westphal	2017-05-19	1	-4/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After some discussion with Pablo we agreed to treat icmp/icmpv6 specially. in the case of a rule like 'tcp dport 22' the inet, bridge and netdev families only care about the lower layer protocol. In the icmpv6 case however we'd like to also enforce an ipv6 protocol check (and ipv4 check in icmp case). This extends payload_gen_special_dependency() to consider this. With this patch: add rule $pf filter input meta l4proto icmpv6 add rule $pf filter input meta l4proto icmpv6 icmpv6 type echo-request add rule $pf filter input icmpv6 type echo-request will work in all tables and all families. For inet/bridge/netdev, an ipv6 protocol dependency is added; this will not match ipv4 packets with ip->protocol == icmpv6, EXCEPT in the case of the ip family. Its still possible to match icmpv6-in-ipv4 in inet/bridge/netdev with an explicit dependency: add rule inet f i ip protocol ipv6-icmp meta l4proto ipv6-icmp icmpv6 type ... Implicit dependencies won't get removed at the moment, so bridge ... icmp type echo-request will be shown as ether type ip meta l4proto 1 icmp type echo-request Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: ipv6: switch implicit dependencies to meta l4proto	Florian Westphal	2017-05-19	2	-2/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	when using rule like ip6 filter input tcp dport 22 nft generates: [ payload load 1b @ network header + 6 => reg 1 ] [ cmp eq reg 1 0x00000006 ] [ payload load 2b @ transport header + 2 => reg 1 ] [ cmp eq reg 1 0x00001600 ] which is: ip6 filter input ip6 nexthdr tcp dport 22 IOW, such a rule won't match if e.g. a fragment header is in place. This changes ip6_proto to use 'meta l4proto' which is the protocol header found by exthdr walk. A side effect is that for bridge we get a shorter dependency chain as it no longer needs to prepend 'ether proto ipv6' for old 'ip6 nexthdr' dep. Only problem: ip6 nexthdr tcp tcp dport 22 will now inject a (useless) meta l4 dependency as ip6 nexthdr is no longer flagged as EXPR_F_PROTOCOL, to avoid this add a small helper that skips the unneded meta dependency in that case. Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: allow update of net base w. meta l4proto icmpv6	Florian Westphal	2017-05-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	nft add rule ip6 f i meta l4proto ipv6-icmp icmpv6 type nd-router-advert <cmdline>:1:50-60: Error: conflicting protocols specified: unknown vs. icmpv6 add icmpv6 to nexthdr list so base gets updated correctly. Reported-by: Thomas Woerner <twoerner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
*	payload: split ll proto dependency into helper	Florian Westphal	2017-05-18	1	-11/+18
\| \| \| \| \| \| \|	will be re-used in folloup patch for icmp/icmpv6 depenency handling. Signed-off-by: Florian Westphal <fw@strlen.de>
*	netlink_delinearize: reject: remove dependency for tcp-resets	Florian Westphal	2017-05-18	1	-0/+6
\| \| \| \| \| \|	We can remove a l4 dependency in ip/ipv6 families. Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: add a comment wrt. reject dependency insertion	Florian Westphal	2017-05-18	1	-0/+8
\| \| \| \| \| \| \| \|	at first I thought this was a bug but this in fact seems the right thing, add a comment/example why adding dependency as first statement makes sense. Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: delete the old cache when dumping is interrupted	Liping Zhang	2017-05-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the dumping operation is interrupted, we will restart the cache_init(), but unfortunatly, we forget to delete the old cache. So in extreme case, we will leak a huge amount of memory. Running the following commands can simulate the extreme case: # nft add table t # nft add set t s {type inet_service \;} # for i in $(seq 65000); do nft add element t s {$i} done & # while : ; do time nft list ruleset -nn done After a while, oom killer will be triggered: [ 2808.243537] Out of memory: Kill process 16975 (nft) score 649 or sacrifice child [ 2808.255372] Killed process 16975 (nft) total-vm:1955348kB, anon-rss:1952120kB, file-rss:0kB, shmem-rss:0kB [ 2858.353729] nft invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_ MOVABLE\|__GFP_COLD), nodemask=(null), order=0, oom_score_adj=0 [ 2858.374521] nft cpuset=/ mems_allowed=0 ... Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	expression: print sets and maps in pretty format	Arturo Borrero Gonzalez	2017-05-15	2	-1/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Print elements per line instead of all in a single line. The elements which can be 'short' are printed 5 per line, and others, like IPv4 addresses are printed 2 per line. Example: % nft list ruleset -nnn table ip t { set s { type inet_service elements = { 1, 2, 3, 4, 10, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 12345 } } map m { type inet_service . iface_index : verdict elements = { 123 . "lo" : accept, 1234 . "lo" : accept, 12345 . "lo" : accept, 12346 . "lo" : accept, 12347 . "lo" : accept } } set s3 { type ipv4_addr elements = { 1.1.1.1, 2.2.2.2, 3.3.3.3 } } } Signed-off-by: Arturo Borrero Gonzalez <arturo@debian.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netlink_delink_delinearize: don't store dependency unless relop checks is eq ↵	Florian Westphal	2017-05-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	check 'ip protocol ne 6' is not a dependency for nexthdr protocol, and must not be stored as such. Fixes: 0b858391781ba308 ("src: annotate follow up dependency just after killing another") Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netlink_delinearize: don't kill dependencies accross statements	Florian Westphal	2017-05-08	2	-1/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nft currently translates ip protocol tcp meta mark set 1 tcp dport 22 to mark set 0x00000001 tcp dport 22 This is wrong, the latter form is same as mark set 0x00000001 ip protocol tcp tcp dport 22 and thats not correct (original rule sets mark for tcp packets only). We need to clear the dependency stack whenever we see a statement other than stmt_expr, as these will have side effects (counter, payload mangling, logging and the like). Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser: allow listing sets in one table	Florian Westphal	2017-05-04	2	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	currently nft can lists sets: nft list sets but unlike e.g. 'quotas' or 'counters' we didn't support restricting it to a table. Now its possible to restrict set definition listing to one table: nft list sets table inet filter Reported-by: Thomas Woerner <twoerner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	ct: add conntrack event mask support	Florian Westphal	2017-04-24	1	-0/+30
\| \| \| \| \|	Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	hash: generate a random seed if seed option is empty	Liping Zhang	2017-04-15	4	-13/+22
\| \| \| \| \| \| \| \| \| \| \|	Typing the "nft add rule x y ct mark set jhash ip saddr mod 2" will not generate a random seed, instead, the seed will always be zero. So if seed option is empty, we shoulde not set the NFTA_HASH_SEED attribute, then a random seed will be generated in the kernel. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: fix build warning on i686	Florian Westphal	2017-04-08	1	-1/+1
\| \| \| \| \| \| \|	datatype.c:182:13: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=] printf("%lu", val); Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	exthdr: avoid crash with older kernels	Florian Westphal	2017-04-08	1	-16/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	if kernel is older it won't understand the EXTHDR_OP attribute, i.e. the rule gets accepted as a check for ipv6 exthdr. On dump nft is then presented with a invalid ipv6 exthdr. So we need to get rid of the assert and output an "invalid" message on list. Longterm we need a proper vm description or kernel-side check to reject such messages in first place. After patch, test suite yields erros of type ip6/tcpopt.t: WARNING: 'src/nft add rule --debug=netlink ip6 test-ip6 \ input tcp option sack right 1': 'tcp option sack right 1' mismatches 'ip6 nexthdr 6 unknown-exthdr unknown 0x1 [invalid type]' Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: hash: fix seed attribute not listed	Laura Garcia Liebana	2017-03-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The tests warned about a problem with the seed listing. /tests/py# ./nft-test.py ip/hash.t ip/hash.t: WARNING: line: 4: 'src/nft add rule --debug=netlink \ ip test-ip4 pre ct mark set jhash ip saddr . ip daddr mod 2 \ seed 0xdeadbeef': 'ct mark set jhash ip saddr . ip daddr mod 2 \ seed 0xdeadbeef' mismatches 'ct mark set jhash ip saddr . ip \ daddr mod 2' ip/hash.t: WARNING: line: 6: 'src/nft add rule --debug=netlink \ ip test-ip4 pre ct mark set jhash ip saddr . ip daddr mod 2 seed \ 0xdeadbeef offset 100': 'ct mark set jhash ip saddr . ip daddr \ mod 2 seed 0xdeadbeef offset 100' mismatches 'ct mark set jhash \ ip saddr . ip daddr mod 2 offset 100' ip/hash.t: 6 unit tests, 0 error, 2 warning The expression type is now treated as an unsigned int in the hash_expr_print() function. Fixes 3a86406 ("src: hash: support of symmetric hash") Signed-off-by: Laura Garcia Liebana <laura.garcia@zevenet.com> Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: Make flush command selective of the set structure type	Elise Lennion	2017-03-24	3	-5/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The internal set infrastructure is used for sets, maps and flow tables. The flush command requires the set type but currently it works for all of them. E.g. if there is a set named 's' in a table 't' the following command shouldn't be valid but still executes: $ nft flush flow table t s This patch makes the flush command selective so 'flush flow table' only works in flow tables and so on. Fixes: 6d37dae ("parser_bison: Allow flushing maps") Fixes: 2daa0ee ("parser_bison: Allow flushing flow tables") Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	sets: Fix for missing space after last element	Phil Sutter	2017-03-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Not having a space between the last element in a set and the closing curly brace looks ugly, so add it here. This also adjusts all shell testcases as they match whitespace in nft output and therefore fail otherwise. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: set: Fix nested set merge size adjustment	Phil Sutter	2017-03-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When merging a nested set into the parent one, we are actually replacing one item with the items of the nested set. Therefore we have to remove the replaced item from size. The respective bug isn't as easy to trigger, since the size field seems to be relevant only when set elements are ranges which are checked for overlaps. Here's an example of how to trigger it: \| add rule ip saddr { { 1.1.1.0/24, 3.3.3.0/24 }, 2.2.2.0/24 } Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: set: Allow for set elems to be sets	Phil Sutter	2017-03-21	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recursive use of sets is handled in parts by parser_bison.y, which has a rule for inline unnamed sets in set_list_member_expr, e.g. like this: \| add rule ip saddr { { 1.1.1.0, 2.2.2.0 }, 3.3.3.0 } Yet there is another way to have an unnamed set inline, which is via define: \| define myset = { \| 1.1.1.0, \| 2.2.2.0, \| } \| add rule ip saddr { $myset, 3.3.3.0 } This didn't work because the inline set comes in as EXPR_SET_ELEM with EXPR_SET as key. This patch handles that case by replacing the former by a copy of the latter, so the following set list merging can take place. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser_bison: Allow flushing maps	Elise Lennion	2017-03-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables the command flush on maps, which removes all entries in it: $ nft flush map filter map1 Command above flushes map 'map1' in table 'filter'. The documentation was updated accordingly. Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser_bison: Allow flushing flow tables	Elise Lennion	2017-03-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This patch enables the command flush on flow tables, which removes all entries in it: $ nft flush flow table filter ft-https Command above flushes flow table 'ft-https' in table 'filter'. Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	proto: Add some exotic ICMPv6 types	Phil Sutter	2017-03-20	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for matching on inverse ND messages as defined by RFC3122 (not implemented in Linux) and MLDv2 as defined by RFC3810. Note that ICMPV6_MLD2_REPORT macro is defined in linux/icmpv6.h but including that header leads to conflicts with symbols defined in netinet/icmp6.h. In addition to the above, "mld-listener-done" is introduced as an alias for "mld-listener-reduction". Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: implement add/create/delete for ct helper objects	Florian Westphal	2017-03-16	3	-2/+87
\| \| \| \| \|	Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: allow listing all ct helpers	Florian Westphal	2017-03-16	3	-0/+22
\| \| \| \| \| \| \| \| \| \| \|	this implements nft list ct helpers table filter table ip filter { ct helper ftp-standard { .. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: refactor CMD_OBJ_QUOTA/COUNTER handling	Florian Westphal	2017-03-16	1	-12/+20
\| \| \| \| \| \| \|	... to make adding CMD_OBJ_CT_HELPER support easier. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add initial ct helper support	Florian Westphal	2017-03-16	5	-4/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds initial support for defining conntrack helper objects which can then be assigned to connections using the objref infrastructure: table ip filter { ct helper ftp-standard { type "ftp" protocol tcp } chain y { tcp dport 21 ct helper set "ftp-standard" } } Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	fib: Support existence check	Phil Sutter	2017-03-13	3	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows to check whether a FIB entry exists for a given packet by comparing the expression with a boolean keyword like so: \| fib daddr oif exists The implementation requires introduction of a generic expression flag EXPR_F_BOOLEAN which allows relational expression to signal it's LHS that a boolean comparison is being done (indicated by boolean type on RHS). In contrast to exthdr existence checks, fib expression can't know this in beforehand because the LHS syntax is absolutely identical to a non-boolean comparison. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: fix crash when inputting an incomplete set add command	Liping Zhang	2017-03-13	2	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After inputting the following nft command, set->keytype is not initialized but we try to destroy it, so NULL pointer dereference will happen: # nft add set t s Segmentation fault (core dumped) #0 dtype_free (dtype=0x0) at datatype.c:1049 #1 set_datatype_destroy (dtype=0x0) at datatype.c:1051 #2 0x0000000000407f1a in set_free (set=0x838790) at rule.c:213 #3 0x000000000042ff70 in nft_parse (scanner=scanner@entry=0x8386a0, state=state@entry=0x7ffc313ea670) at parser_bison.c:9355 #4 0x000000000040727d in nft_run (scanner=scanner@entry=0x8386a0, state=state@entry=0x7ffc313ea670, msgs=msgs@entry=0x7ffc313ea660) at main.c:237 #5 0x0000000000406e4a in main (argc=<optimized out>, argv=<optimized out>) at main.c:376 Fixes: b9b6092304ae ("evaluate: store byteorder for set keys") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	exthdr: Implement existence check	Phil Sutter	2017-03-10	5	-5/+68
\| \| \| \| \| \| \| \| \| \| \|	This allows to check for existence of an IPv6 extension or TCP option header by using the following syntax: \| exthdr frag exists \| tcpopt window exists Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	exthdr: Add support for exthdr specific flags	Phil Sutter	2017-03-10	4	-8/+14
\| \| \| \| \| \| \| \| \|	This allows to have custom flags in exthdr expression, which is necessary for upcoming existence checks (of both IPv6 extension headers as well as TCP options). Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	Introduce boolean datatype and boolean expression	Phil Sutter	2017-03-10	3	-0/+42
\| \| \| \| \|	Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netlink: BUG when object type is unknown	Florian Westphal	2017-03-08	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	This will only trigger during development when adding new object types to the parser. The BUG() gives a clear hint where the serialization code needs to go. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netlink: use nftnl_udata_put_u32()/nftnl_udata_get_u32()	Pablo Neira Ayuso	2017-03-06	1	-6/+8
\| \| \| \| \| \| \|	Use these new type-specific helper functions instead available in libnftnl. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: hash: support of symmetric hash	Laura Garcia Liebana	2017-03-06	6	-35/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch provides symmetric hash support according to source ip address and port, and destination ip address and port. The new attribute NFTA_HASH_TYPE has been included to support different types of hashing functions. Currently supported NFT_HASH_JENKINS through jhash and NFT_HASH_SYM through symhash. The main difference between both types are: - jhash requires an expression with sreg, symhash doesn't. - symhash supports modulus and offset, but not seed. Examples: nft add rule ip nat prerouting ct mark set jhash ip saddr mod 2 nft add rule ip nat prerouting ct mark set symhash mod 2 Signed-off-by: Laura Garcia Liebana <laura.garcia@zevenet.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: revisit tcp options support	Pablo Neira Ayuso	2017-02-28	4	-123/+126
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rework syntax, add tokens so we can extend the grammar more easily. This has triggered several syntax changes with regards to the original patch, specifically: tcp option sack0 left 1 There is no space between sack and the block number anymore, no more offset field, now they are a single field. Just like we do with rt, rt0 and rt2. This simplifies our grammar and that is good since it makes our life easier when extending it later on to accomodate new features. I have also renamed sack_permitted to sack-permitted. I couldn't find any option using underscore so far, so let's keep it consistent with what we have. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: support zone set statement with optional direction	Florian Westphal	2017-02-28	4	-5/+29
\| \| \| \| \| \| \| \| \| \| \| \|	nft automatically understands 'ct zone set 1' but when a direction is specified too we get a parser error since they are currently only allowed for plain ct expressions. This permits the existing syntax ('ct original zone') for all tokens with an optional direction also for set statements. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	ct: refactor print function so it can be re-used for ct statement	Florian Westphal	2017-02-28	1	-4/+9
\| \| \| \| \| \| \| \| \|	Once directional zone support is added we also need to print the direction of the statement, so factor the common code to re-use this helper from the statement print function. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add conntrack zone support	Florian Westphal	2017-02-28	3	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This enables zone get/set support. As the zone can be optionally tied to a direction as well we need a new token for this (unless we turn reply/original into tokens in which case we could handle zone via STRING). There was some discussion on how zone set support should be handled, especially 'zone set 1'. There are several issues to consider: 1. its not possible to change a zone 'later on', any given conntrack flow has exactly one zone for its entire lifetime. 2. to create conntracks in a given zone, the zone therefore has to be assigned before the packet gets picked up by conntrack (so that lookup finds the correct existing flow or the flow is created with the desired zone id). In iptables, this is enforced because zones are assigned with CT target and this is restricted to the 'raw' table in iptables, which runs after defragmentation but before connection tracking. 3. Thus, in nftables the 'ct zone set' rule needs to hook before conntrack too, e.g. via table raw { chain pre { type filter hook prerouting priority -300; iif eth3 ct zone set 23 } chain out { type filter hook output priority -300; oif eth3 ct zone set 23 } } ... but this is not enforced. There were two alternatives to better document this. One was to use an explicit 'template' keyword: nft ... template zone set 23 ... but 'connection tracking templates' are a kernel detail that users should not and need not know about. The other one was to use the meta keyword instead since we're (from a practical point of view) assigning the zone to the packet, not the conntrack: nft ... meta zone set 23 However, next patch also supports 'directional' zones, and nft ... meta original zone 23 makes no sense because 'direction' refers to a direction as understood by the connection tracker. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: store byteorder for set data	Pablo Neira Ayuso	2017-02-28	3	-2/+20
\| \| \| \| \| \| \| \| \|	Add new UDATA_SET_DATABYTEORDER attribute for NFTA_SET_UDATA to store the datatype byteorder. This is required if integer_type is used on the rhs of the mapping given that this datatype comes with no specific byteorder. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netlink: rework NFTNL_SET_USERDATA to accomodate new attributes	Pablo Neira Ayuso	2017-02-28	1	-32/+18
\| \| \| \| \| \| \|	Rework the NFTNL_SET_USERDATA in netlink_delinearize_set() to accomodate rhs datatype byteorder in mappings. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: rename set_keytype_alloc() to set_datatype_alloc()	Pablo Neira Ayuso	2017-02-28	4	-6/+6
\| \| \| \| \| \| \|	This function can be used either side of the map, so rename it to something generic. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: set byteorder as lhs expression context in stmt_evaluate_arg()	Pablo Neira Ayuso	2017-02-28	1	-9/+15
\| \| \| \| \| \| \|	stmt_evaluate_arg() needs to take the lhs map expression byteorder in order to evaluate the lhs of mappings accordingly. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	datatype: add DTYPE_F_CLONE flag	Pablo Neira Ayuso	2017-02-25	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This flag allows us to identify datatypes that are instances from original datatypes. This fixes a possible double free when attaching a concatenation datatype to set->keytype while being also referenced from concatenation expressions. ip6/flowtable.t: ERROR: line 5: src/nft add rule --debug=netlink ip6 test-ip6 input flow table acct_out { meta iif . ip6 saddr timeout 600s counter }: This rule should not have failed. * Error in `src/nft': double free or corruption (fasttop): 0x000000000117ce70 * Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netlink_delinearize: remove integer_type_postprocess()	Pablo Neira Ayuso	2017-02-25	1	-29/+0
\| \| \| \| \| \| \| \| \| \|	Not required anymore since the set definition now comes with the right byteorder for integer types via NFTA_SET_USERDATA area. So we don't need to look at the lhs anymore. Note that this was a workaround that does not work with named sets, where we cannot assume we have a lhs, since it is valid to have a named set that is not referenced from any rule. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netlink: store set byteorder in NFTA_SET_USERDATA	Pablo Neira Ayuso	2017-02-25	1	-1/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The integer datatype has neither specific byteorder nor length. This results in the following broken output: # nft list ruleset table ip x { chain y { mark set cpu map { 0 : 0x00000001, 16777216 : 0x00000002} } } Currently, with BYTEORDER_INVALID, nft defaults on network byteorder, hence the output above. This patch stores the key byteorder in the userdata using a TLV structure in the NFTA_SET_USERDATA area, so nft can interpret key accordingly when dumping the set back to userspace. Thus, after this patch the listing is correct: # nft list ruleset table ip x { chain y { mark set cpu map { 0 : 0x00000001, 1 : 0x00000002} } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>