nftables - nft command line tool

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	evaluate: stmt_evaluate_nat_map() only if stmt->nat.ipportmap == true	Pablo Neira Ayuso	2020-02-25	1	-17/+11
\| \| \| \| \| \| \|	stmt_evaluate_nat_map() is only called when the parser sets on stmt->nat.ipportmap. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: nat concatenation support with anonymous maps	Pablo Neira Ayuso	2020-02-24	1	-3/+20
\| \| \| \| \| \| \| \| \|	This patch extends the parser to define the mapping datatypes, eg. ... dnat ip addr . port to ip saddr map { 1.1.1.1 : 2.2.2.2 . 30 } ... dnat ip addr . port to ip saddr map @y Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: allow nat maps containing both ip(6) address and port	Florian Westphal	2020-02-24	1	-0/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nft will now be able to handle map destinations { type ipv4_addr . inet_service : ipv4_addr . inet_service } chain f { dnat to ip daddr . tcp dport map @destinations } Something like this won't work though: meta l4proto tcp dnat ip6 to numgen inc mod 4 map { 0 : dead::f001 . 8080, .. as we lack the type info to properly dissect "dead::f001" as an ipv6 address. For the named map case, this info is available in the map definition, but for the anon case we'd need to resort to guesswork. Support is added by peeking into the map definition when evaluating a nat statement with a map. Right now, when a map is provided as address, we will only check that the mapped-to data type matches the expected size (of an ipv4 or ipv6 address). After this patch, if the mapped-to type is a concatenation, it will take a peek at the individual concat expressions. If its a combination of address and service, nft will translate this so that the kernel nat expression looks at the returned register that would store the inet_service part of the octet soup returned from the lookup expression. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: add two new helpers	Florian Westphal	2020-02-24	1	-29/+32
\| \| \| \| \| \| \| \| \| \| \|	In order to support 'dnat to ip saddr map @foo', where @foo returns both an address and a inet_service, we will need to peek into the map and process the concatenations sub-expressions. Add two helpers for this, will be used in followup patches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: process concat expressions when used as mapped-to expr	Florian Westphal	2020-02-24	1	-0/+4
\| \| \| \| \| \| \| \| \|	Needed to avoid triggering the 'dtype->size == 0' tests. Evaluation will build a new concatenated type that holds the size of the aggregate. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: print correct statement name on family mismatch	Florian Westphal	2020-02-22	1	-2/+3
\| \| \| \| \| \| \| \| \| \|	nft add rule inet filter c ip daddr 1.2.3.4 dnat ip6 to f00::1 Error: conflicting protocols specified: ip vs. unknown. You must specify ip or ip6 family in tproxy statement Should be: ... "in nat statement". Fixes: fbe27464dee4588d90 ("src: add nat support for the inet family") Signed-off-by: Florian Westphal <fw@strlen.de>
*	evaluate: change shift byte-order to host-endian.	Jeremy Sowden	2020-02-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The byte-order of the righthand operands of the right-shifts generated for payload and exthdr expressions is big-endian. However, all right operands should be host-endian. Since evaluation of the shift binop will insert a byte-order conversion to enforce this, change the endianness in order to avoid the extra operation. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: correct variable name.	Jeremy Sowden	2020-02-07	1	-6/+6
\| \| \| \| \| \| \| \|	Rename the `lshift` variable used to store an right-shift expression to `rshift`. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: Add support for concatenated set ranges	Stefano Brivio	2020-02-07	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After exporting field lengths via NFTNL_SET_DESC_CONCAT attributes, we now need to adjust parsing of user input and generation of netlink key data to complete support for concatenation of set ranges. Instead of using separate elements for start and end of a range, denoting the end element by the NFT_SET_ELEM_INTERVAL_END flag, as it's currently done for ranges without concatenation, we'll use the new attribute NFTNL_SET_ELEM_KEY_END as suggested by Pablo. It behaves in the same way as NFTNL_SET_ELEM_KEY, but it indicates that the included key represents the upper bound of a range. For example, "packets with an IPv4 address between 192.0.2.0 and 192.0.2.42, with destination port between 22 and 25", needs to be expressed as a single element with two keys: NFTA_SET_ELEM_KEY: 192.0.2.0 . 22 NFTA_SET_ELEM_KEY_END: 192.0.2.42 . 25 To achieve this, we need to: - adjust the lexer rules to allow multiton expressions as elements of a concatenation. As wildcards are not allowed (semantics would be ambiguous), exclude wildcards expressions from the set of possible multiton expressions, and allow them directly where needed. Concatenations now admit prefixes and ranges - generate, for each element in a range concatenation, a second key attribute, that includes the upper bound for the range - also expand prefixes and non-ranged values in the concatenation to ranges: given a set with interval and concatenation support, the kernel has no way to tell which elements are ranged, so they all need to be. For example, 192.0.2.0 . 192.0.2.9 : 1024 is sent as: NFTA_SET_ELEM_KEY: 192.0.2.0 . 1024 NFTA_SET_ELEM_KEY_END: 192.0.2.9 . 1024 - aggregate ranges when elements received by the kernel represent concatenated ranges, see concat_range_aggregate() - perform a few minor adjustments where interval expressions are already handled: we have intervals in these sets, but the set specification isn't just an interval, so we can't just aggregate and deaggregate interval ranges linearly v4: No changes v3: - rework to use a separate key for closing element of range instead of a separate element with EXPR_F_INTERVAL_END set (Pablo Neira Ayuso) v2: - reworked netlink_gen_concat_data(), moved loop body to a new function, netlink_gen_concat_data_expr() (Phil Sutter) - dropped repeated pattern in bison file, replaced by a new helper, compound_expr_alloc_or_add() (Phil Sutter) - added set_is_nonconcat_range() helper (Phil Sutter) - in expr_evaluate_set(), we need to set NFT_SET_SUBKEY also on empty sets where the set in the context already has the flag - dropped additional 'end' parameter from netlink_gen_data(), temporarily set EXPR_F_INTERVAL_END on expressions and use that from netlink_gen_concat_data() to figure out we need to add the 'end' element (Phil Sutter) - replace range_mask_len() by a simplified version, as we don't need to actually store the composing masks of a range (Phil Sutter) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: Add support for NFTNL_SET_DESC_CONCAT	Stefano Brivio	2020-02-07	1	-3/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To support arbitrary range concatenations, the kernel needs to know how long each field in the concatenation is. The new libnftnl NFTNL_SET_DESC_CONCAT set attribute describes this as an array of lengths, in bytes, of concatenated fields. While evaluating concatenated expressions, export the datatype size into the new field_len array, and hand the data over via libnftnl. Similarly, when data is passed back from libnftnl, parse it into the set description. When set data is cloned, we now need to copy the additional fields in set_clone(), too. This change depends on the libnftnl patch with title: set: Add support for NFTA_SET_DESC_CONCAT attributes v4: No changes v3: Rework to use set description data instead of a stand-alone attribute v2: No changes Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: white-space fixes.	Jeremy Sowden	2020-01-28	1	-6/+5
\| \| \| \| \| \| \|	Remove some trailing white-space and fix some indentation. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: better error notice when interval flag is not set on	Pablo Neira Ayuso	2020-01-16	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Users get confused with the existing error notice, let's try a different one: # nft add element x y { 1.1.1.0/24 } Error: You must add 'flags interval' to your set declaration if you want to add prefix elements add element x y { 1.1.1.0/24 } ^^^^^^^^^^ Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1380 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Phil Sutter <phil@nwl.cc>
*	evaluate: fix expr_set_context call for shift binops.	Jeremy Sowden	2020-01-08	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	expr_evaluate_binop calls expr_set_context for shift expressions to set the context data-type to `integer`. This clobbers the byte-order of the context, resulting in unexpected conversions to NBO. For example: $ sudo nft flush ruleset $ sudo nft add table t $ sudo nft add chain t c '{ type filter hook output priority mangle; }' $ sudo nft add rule t c oif lo tcp dport ssh ct mark set '0x10 \| 0xe' $ sudo nft add rule t c oif lo tcp dport ssh ct mark set '0xf << 1' $ sudo nft list table t table ip t { chain c { type filter hook output priority mangle; policy accept; oif "lo" tcp dport 22 ct mark set 0x0000001e oif "lo" tcp dport 22 ct mark set 0x1e000000 } } Replace it with a call to __expr_set_context and set the byteorder to that of the left operand since this is the value being shifted. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: print a hint about 'typeof' syntax on 0 keylen	Florian Westphal	2019-12-17	1	-5/+18
\| \| \| \| \| \| \| \| \| \|	If user says 'type integer; ...' in a set definition, don't just throw an error -- provide a hint that the typeof keyword can be used to provide the needed size information. Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: store expr, not dtype to track data in sets	Florian Westphal	2019-12-16	1	-18/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will be needed once we add support for the 'typeof' keyword to handle maps that could e.g. store 'ct helper' "type" values. Instead of: set foo { type ipv4_addr . mark; this would allow set foo { typeof(ip saddr) . typeof(ct mark); (exact syntax TBD). This would be needed to allow sets that store variable-sized data types (string, integer and the like) that can't be used at at the moment. Adding special data types for everything is problematic due to the large amount of different types needed. For anonymous sets, e.g. "string" can be used because the needed size can be inferred from the statement, e.g. 'osf name { "Windows", "Linux }', but in case of named sets that won't work because 'type string' lacks the context needed to derive the size information. With 'typeof(osf name)' the context is there, but at the moment it won't help because the expression is discarded instantly and only the data type is retained. Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: add ability to set/get secmarks to/from connection	Christian Göttsche	2019-11-25	1	-2/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Labeling established and related packets requires the secmark to be stored in the connection. Add the ability to store and retrieve secmarks like: ... chain input { ... # label new incoming packets ct state new meta secmark set tcp dport map @secmapping_in # add label to connection ct state new ct secmark set meta secmark # set label for est/rel packets from connection ct state established,related meta secmark set ct secmark ... } ... chain output { ... # label new outgoing packets ct state new meta secmark set tcp dport map @secmapping_out # add label to connection ct state new ct secmark set meta secmark # set label for est/rel packets from connection ct state established,related meta secmark set ct secmark ... } ... This patch also disallow constant value on the right hand side. # nft add rule x y meta secmark 12 Error: Cannot be used with right hand side constant value add rule x y meta secmark 12 ~~~~~~~~~~~~ ^^ # nft add rule x y ct secmark 12 Error: Cannot be used with right hand side constant value add rule x y ct secmark 12 ~~~~~~~~~~ ^^ # nft add rule x y ct secmark set 12 Error: ct secmark must not be set to constant value add rule x y ct secmark set 12 ^^^^^^^^^^^^^^^^^ This patch improves 3bc84e5c1fdd ("src: add support for setting secmark"). Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add and use `set_is_meter` helper	Jeremy Sowden	2019-11-06	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sets constructed for meters are flagged as anonymous and dynamic. However, in some places there are only checks that they are dynamic, which can lead to normal sets being classified as meters. For example: # nft add table t # nft add set t s { type ipv4_addr; size 256; flags dynamic,timeout; } # nft add chain t c # nft add rule t c tcp dport 80 meter m size 128 { ip saddr limit rate 10/second } # nft list meters table ip t { set s { type ipv4_addr size 256 flags dynamic,timeout } meter m { type ipv4_addr size 128 flags dynamic } } # nft list meter t m table ip t { meter m { type ipv4_addr size 128 flags dynamic } } # nft list meter t s Error: No such file or directory list meter t s ^ Add a new helper `set_is_meter` and use it wherever there are checks for meters. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: flowtable: add support for named flowtable listing	Eric Jallot	2019-10-31	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows you to dump a named flowtable. # nft list flowtable inet t f table inet t { flowtable f { hook ingress priority filter + 10 devices = { eth0, eth1 } } } Also: libnftables-json.adoc: fix missing quotes. Fixes: db0697ce7f60 ("src: support for flowtable listing") Fixes: 872f373dc50f ("doc: Add JSON schema documentation") Signed-off-by: Eric Jallot <ejallot@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add synproxy stateful object support	Fernando Fernandez Mancera	2019-09-13	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for "synproxy" stateful object. For example (for TCP port 80 and using maps with saddr): table ip foo { synproxy https-synproxy { mss 1460 wscale 7 timestamp sack-perm } synproxy other-synproxy { mss 1460 wscale 5 } chain bar { tcp dport 80 synproxy name "https-synproxy" synproxy name ip saddr map { 192.168.1.0/24 : "https-synproxy", 192.168.2.0/24 : "other-synproxy" } } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: flag fwd and queue statements as terminal	Florian Westphal	2019-09-07	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both queue and fwd statement end evaluation of a rule: in ... fwd to "eth0" accept ... queue accept "accept" is redundant and never evaluated in the kernel. Add the missing "TERMINAL" flag so the evaluation step will catch any trailing expressions: nft add rule filter input queue counter Error: Statement after terminal statement has no effect Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: evaluate: catch invalid 'meta day' values in eval step	Florian Westphal	2019-09-06	1	-4/+13
\| \| \| \|	Signed-off-by: Florian Westphal <fw@strlen.de>
*	meta: Introduce new conditions 'time', 'day' and 'hour'	Ander Juaristi	2019-09-06	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These keywords introduce new checks for a timestamp, an absolute date (which is converted to a timestamp), an hour in the day (which is converted to the number of seconds since midnight) and a day of week. When converting an ISO date (eg. 2019-06-06 17:00) to a timestamp, we need to substract it the GMT difference in seconds, that is, the value of the 'tm_gmtoff' field in the tm structure. This is because the kernel doesn't know about time zones. And hence the kernel manages different timestamps than those that are advertised in userspace when running, for instance, date +%s. The same conversion needs to be done when converting hours (e.g 17:00) to seconds since midnight as well. The result needs to be computed modulo 86400 in case GMT offset (difference in seconds from UTC) is negative. We also introduce a new command line option (-t, --seconds) to show the actual timestamps when printing the values, rather than the ISO dates, or the hour. Some usage examples: time < "2019-06-06 17:00" drop; time < "2019-06-06 17:20:20" drop; time < 12341234 drop; day "Saturday" drop; day 6 drop; hour >= 17:00 drop; hour >= "17:00:01" drop; hour >= 63000 drop; We need to convert an ISO date to a timestamp without taking into account the time zone offset, since comparison will be done in kernel space and there is no time zone information there. Overwriting TZ is portable, but will cause problems when parsing a ruleset that has 'time' and 'hour' rules. Parsing an 'hour' type must not do time zone conversion, but that will be automatically done if TZ has been overwritten to UTC. Hence, we use timegm() to parse the 'time' type, even though it's not portable. Overwriting TZ seems to be a much worse solution. Finally, be aware that timestamps are converted to nanoseconds when transferring to the kernel (as comparison is done with nanosecond precision), and back to seconds when retrieving them for printing. We swap left and right values in a range to properly handle cross-day hour ranges (e.g. 23:15-03:22). Signed-off-by: Ander Juaristi <a@juaristi.eus> Reviewed-by: Florian Westphal <fw@strlen.de>
*	evaluate: New internal helper __expr_evaluate_range	Ander Juaristi	2019-09-06	1	-4/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is used by the followup patch to evaluate a range without emitting an error when the left value is larger than the right one. This is done to handle time-matching such as 23:00-01:00 -- expr_evaluate_range() will reject this, but we want to be able to evaluate and then handle this as a request to match from 23:00 to 1am. Signed-off-by: Ander Juaristi <a@juaristi.eus> Signed-off-by: Florian Westphal <fw@strlen.de>
*	src: allow variable in chain policy	Fernando Fernandez Mancera	2019-08-08	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \|	This patch allows you to use variables in chain policy definition, e.g. define default_policy = "accept" add table ip foo add chain ip foo bar {type filter hook input priority filter; policy $default_policy} Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: allow variables in the chain priority specification	Fernando Fernandez Mancera	2019-08-08	1	-14/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows you to use variables in chain priority definitions, e.g. define prio = filter define prionum = 10 define prioffset = "filter - 150" add table ip foo add chain ip foo bar { type filter hook input priority $prio; } add chain ip foo ber { type filter hook input priority $prionum; } add chain ip foo bor { type filter hook input priority $prioffset; } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add parse_ctx object	Pablo Neira Ayuso	2019-08-08	1	-2/+4
\| \| \| \| \| \| \| \|	This object stores the dynamic symbol tables that are loaded from files. Pass this object to datatype parse functions, although this new parameter is not used yet, this is just a preparation patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	cache: add NFT_CACHE_UPDATE and NFT_CACHE_FLUSHED flags	Pablo Neira Ayuso	2019-07-23	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NFT_CACHE_FLUSHED tells cache_update() to skip the netlink dump to populate the cache, since the existing ruleset is going to flushed by this batch. NFT_CACHE_UPDATE tells rule_evaluate() to perform incremental updates to the cache based on the existing batch, this is required by the rule commands that use the index and the position selectors. This patch removes cache_flush() which is not required anymore. This cache removal is coming too late, in the evaluation phase, after the initial cache_update() invocation. Be careful with NFT_CACHE_UPDATE, this flag needs to be left in place if NFT_CACHE_FLUSHED is set on. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: missing location for chain nested in table definition	Pablo Neira Ayuso	2019-07-22	1	-0/+1
\| \| \| \| \| \|	error reporting may crash because location is unset. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: evaluate: support prefix expression in statements	Florian Westphal	2019-07-22	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently nft dumps core when it encounters a prefix expression as part of a statement, e.g. iifname ens3 snat to 10.0.0.0/28 yields: BUG: unknown expression type prefix nft: netlink_linearize.c:688: netlink_gen_expr: Assertion `0' failed. This assertion is correct -- we can't linearize a prefix because kernel doesn't know what that is. For LHS prefixes, they get converted to a binary 'and' such as '10.0.0.0 & 255.255.255.240'. For RHS, we can do something similar and convert them into a range. snat to 10.0.0.0/28 will be converted into: iifname "ens3" snat to 10.0.0.0-10.0.0.15 Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1187 Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: bogus error when refering to existing non-base chain	Pablo Neira Ayuso	2019-07-18	1	-6/+3
\| \| \| \| \| \| \| \|	add rule ip testNEW test6 jump test8 ^^^^^ Error: invalid verdict chain expression value Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: introduce SYNPROXY matching	Fernando Fernandez Mancera	2019-07-17	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for "synproxy" statement. For example (for TCP port 8888): table ip x { chain y { type filter hook prerouting priority raw; policy accept; tcp dport 8888 tcp flags syn notrack } chain z { type filter hook input priority filter; policy accept; tcp dport 8888 ct state invalid,untracked synproxy mss 1460 wscale 7 timestamp sack-perm ct state invalid drop } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: missing basic evaluation of expectations	Pablo Neira Ayuso	2019-07-16	1	-4/+30
\| \| \| \| \| \| \| \| \| \| \|	Basic ct expectation object evaluation. This fixes tests/py errors. Error reporting is very sparse at this stage. I'm intentionally leaving this as future work to store location objects for each field, so user gets better indication on what is missing when configuring expectations. Fixes: 1dd08fcfa07a ("src: add ct expectations support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add ct expectations support	Stéphane Veyret	2019-07-16	1	-0/+4
\| \| \| \| \| \| \|	This modification allow to directly add/list/delete expectations. Signed-off-by: Stéphane Veyret <sveyret@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: honor NFT_SET_OBJECT flag	Pablo Neira Ayuso	2019-07-16	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is noticeable when displaying mispelling errors, however, there are also few spots not checking for the object map flag. Before: # nft flush set inet filter countermxx Error: No such file or directory; did you mean set ‘countermap’ in table inet ‘filter’? flush set inet filter countermxx ^^^^^^^^^^ After: # nft flush set inet filter countermxx Error: No such file or directory; did you mean map ‘countermap’ in table inet ‘filter’? flush set inet filter countermxx ^^^^^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: use set_is_anonymous()	Pablo Neira Ayuso	2019-07-16	1	-1/+1
\| \| \| \|	Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: missing object maps handling in list and flush commands	Pablo Neira Ayuso	2019-07-16	1	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NFT_SET_OBJECT tells there is an object map. # nft list ruleset table inet filter { map countermap { type ipv4_addr : counter } } The following command fails: # nft flush set inet filter countermap This patch checks for NFT_SET_OBJECT from new set_is_literal() and map_is_literal() functions. This patch also adds tests for this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add set_is_datamap(), set_is_objmap() and set_is_map() helpers	Pablo Neira Ayuso	2019-07-16	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Two map types are currently possible: * data maps, ie. set_is_datamap(). * object maps, ie. set_is_objmap(). This patch adds helper functions to check for the map type. set_is_map() allows you to check for either map type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	exthdr: add support for matching IPv4 options	Stephen Suryaputra	2019-07-04	1	-0/+17
\| \| \| \| \| \| \| \| \|	Add capability to have rules matching IPv4 options. This is developed mainly to support dropping of IP packets with loose and/or strict source route route options. Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	ct: support for NFT_CT_{SRC,DST}_{IP,IP6}	Pablo Neira Ayuso	2019-06-21	1	-1/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These keys are available since kernel >= 4.17. You can still use NFT_CT_{SRC,DST}, however, you need to specify 'meta protocol' in first place to provide layer 3 context. Note that NFT_CT_{SRC,DST} are broken with set, maps and concatenations. This patch is implicitly fixing these cases. If your kernel is < 4.17, you can still use address matching via explicit meta nfproto: meta nfproto ipv4 ct original saddr 1.2.3.4 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: do not allow to list/flush anonymous sets via list command	Pablo Neira Ayuso	2019-06-19	1	-6/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't allow this: # nft list set x __set0 table ip x { set __set0 { type ipv4_addr flags constant elements = { 1.1.1.1 } } } Constant sets never change and they are attached to a rule (anonymous flag is set on), do not list their content through this command. Do not allow flush operation either. After this patch: # nft list set x __set0 Error: No such file or directory list set x __set0 ^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: allow get/list/flush dynamic sets and maps via list command	Pablo Neira Ayuso	2019-06-19	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before: # nft list set ip filter untracked_unknown Error: No such file or directory; did you mean set ‘untracked_unknown’ in table ip ‘filter’? list set ip filter untracked_unknown ^^^^^^^^^^^^^^^^^ After: # nft list set ip filter untracked_unknown table ip filter { set untracked_unknown { type ipv4_addr . inet_service . ipv4_addr . inet_service . inet_proto size 100000 flags dynamic,timeout } } Add a testcase for this too. Reported-by: Václav Zindulka <vaclav.zindulka@tlapnet.cz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add cache level flags	Pablo Neira Ayuso	2019-06-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The score approach based on command type is confusing. This patch introduces cache level flags, each flag specifies what kind of object type is needed. These flags are set on/off depending on the list of commands coming in this batch. cache_is_complete() now checks if the cache contains the objects that are needed through these new flags. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: remove useless parameter from cache_flush()	Pablo Neira Ayuso	2019-06-17	1	-1/+1
\| \| \| \| \| \|	Command type is never used in cache_flush(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: double datatype_free() with dynamic integer datatypes	Pablo Neira Ayuso	2019-06-14	1	-2/+0
\| \| \| \| \| \|	datatype_set() already deals with this case, remove this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: update byteorder only for implicit maps	Pablo Neira Ayuso	2019-06-14	1	-1/+2
\| \| \| \| \| \| \| \|	The byteorder adjustment for the integer datatype is only required by implicit maps. Fixes: b9b6092304ae ("evaluate: store byteorder for set keys") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: use-after-free in meter	Pablo Neira Ayuso	2019-06-13	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to bbe139fdf5a5 ("evaluate: use-after-free in implicit set"). ==12727== Invalid read of size 4 ==12727== at 0x72DB515: expr_free (expression.c:86) ==12727== by 0x72D3092: set_free (rule.c:367) ==12727== by 0x72DB555: expr_destroy (expression.c:79) ==12727== by 0x72DB555: expr_free (expression.c:95) ==12727== by 0x72D7A35: meter_stmt_destroy (statement.c:137) ==12727== by 0x72D7A07: stmt_free (statement.c:50) ==12727== by 0x72D7AD7: stmt_list_free (statement.c:60) ==12727== by 0x72D32EF: rule_free (rule.c:610) ==12727== by 0x72D3834: chain_free (rule.c:827) ==12727== by 0x72D45D4: table_free (rule.c:1184) ==12727== by 0x72D46A7: __cache_flush (rule.c:293) ==12727== by 0x72D472C: cache_release (rule.c:313) ==12727== by 0x72D4A79: cache_update (rule.c:264) ==12727== Address 0x64f14c8 is 56 bytes inside a block of size 128 free'd ==12727== at 0x4C2CDDB: free (vg_replace_malloc.c:530) ==12727== by 0x72D7A2C: meter_stmt_destroy (statement.c:136) ==12727== by 0x72D7A07: stmt_free (statement.c:50) ==12727== by 0x72D7AD7: stmt_list_free (statement.c:60) ==12727== by 0x72D32EF: rule_free (rule.c:610) ==12727== by 0x72D3834: chain_free (rule.c:827) ==12727== by 0x72D45D4: table_free (rule.c:1184) ==12727== by 0x72D46A7: __cache_flush (rule.c:293) ==12727== by 0x72D472C: cache_release (rule.c:313) ==12727== by 0x72D4A79: cache_update (rule.c:264) ==12727== by 0x72F82CE: nft_evaluate (libnftables.c:388) ==12727== by 0x72F8A8B: nft_run_cmd_from_buffer (libnftables.c:428) Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add reference counter for dynamic datatypes	Pablo Neira Ayuso	2019-06-13	1	-18/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two datatypes are using runtime datatype allocation: * Concatenations. * Integer, that require byteorder adjustment. From the evaluation / postprocess step, transformations are common, hence expressions may end up fetching (infering) datatypes from an existing one. This patch adds a reference counter to release the dynamic datatype object when it is shared. The API includes the following helper functions: * datatype_set(expr, datatype), to assign a datatype to an expression. This helper already deals with reference counting for dynamic datatypes. This also drops the reference counter of any previous datatype (to deal with the datatype replacement case). * datatype_get(datatype) bumps the reference counter. This function also deals with nul-pointers, that occurs when the datatype is unset. * datatype_free() drops the reference counter, and it also releases the datatype if there are not more clients of it. Rule of thumb is: The reference counter of any newly allocated datatype is set to zero. This patch also updates every spot to use datatype_set() for non-dynamic datatypes, for consistency. In this case, the helper just makes an simple assignment. Note that expr_alloc() has been updated to call datatype_get() on the datatype that is assigned to this new expression. Moreover, expr_free() calls datatype_free(). This fixes valgrind reports like this one: ==28352== 1,350 (440 direct, 910 indirect) bytes in 5 blocks are definitely lost in loss recor 3 of 3 ==28352== at 0x4C2BBAF: malloc (vg_replace_malloc.c:299) ==28352== by 0x4E79558: xmalloc (utils.c:36) ==28352== by 0x4E7963D: xzalloc (utils.c:65) ==28352== by 0x4E6029B: dtype_alloc (datatype.c:1073) ==28352== by 0x4E6029B: concat_type_alloc (datatype.c:1127) ==28352== by 0x4E6D3B3: netlink_delinearize_set (netlink.c:578) ==28352== by 0x4E6D68E: list_set_cb (netlink.c:648) ==28352== by 0x5D74023: nftnl_set_list_foreach (set.c:780) ==28352== by 0x4E6D6F3: netlink_list_sets (netlink.c:669) ==28352== by 0x4E5A7A3: cache_init_objects (rule.c:159) ==28352== by 0x4E5A7A3: cache_init (rule.c:216) ==28352== by 0x4E5A7A3: cache_update (rule.c:266) ==28352== by 0x4E7E0EE: nft_evaluate (libnftables.c:388) ==28352== by 0x4E7EADD: nft_run_cmd_from_filename (libnftables.c:479) ==28352== by 0x109A53: main (main.c:310) This patch also removes the DTYPE_F_CLONE flag which is broken and not needed anymore since proper reference counting is in place. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: Support intra-transaction rule references	Phil Sutter	2019-06-07	1	-20/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A rule may be added before or after another one using index keyword. To support for the other rule being added within the same batch, one has to make use of NFTNL_RULE_ID and NFTNL_RULE_POSITION_ID attributes. This patch does just that among a few more crucial things: * If cache is complete enough to contain rules, update cache when evaluating rule commands so later index references resolve correctly. * Reduce rule_translate_index() to its core code which is the actual linking of rules and consequently rename the function. The removed bits are pulled into the calling rule_evaluate() to reduce code duplication in between cache updates with and without rule reference. * Pass the current command op to rule_evaluate() as indicator whether to insert before or after a referenced rule or at beginning or end of chain in cache. Exploit this from chain_evaluate() to avoid adding the chain's rules a second time. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	evaluate: use-after-free in implicit set	Pablo Neira Ayuso	2019-06-07	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	# cat example.nft table inet test { chain test { ip daddr { 2.2.2.2, 4.4.4.4} counter accept } } # valgrind nft -f example.nft valgrind reports: ==2272== Invalid read of size 4 ==2272== at 0x4E612A5: expr_free (expression.c:86) ==2272== by 0x4E58EA2: set_free (rule.c:367) ==2272== by 0x4E612DA: expr_destroy (expression.c:79) ==2272== by 0x4E612DA: expr_free (expression.c:93) ==2272== by 0x4E612DA: expr_destroy (expression.c:79) ==2272== by 0x4E612DA: expr_free (expression.c:93) ==2272== by 0x4E5D7E7: stmt_free (statement.c:50) ==2272== by 0x4E5D8B7: stmt_list_free (statement.c:60) ==2272== by 0x4E590FF: rule_free (rule.c:610) ==2272== by 0x4E5C094: cmd_free (rule.c:1420) ==2272== by 0x4E7E7EF: nft_run_cmd_from_filename (libnftables.c:490) ==2272== by 0x109A53: main (main.c:310) ==2272== Address 0x65d94c8 is 56 bytes inside a block of size 128 free'd ==2272== at 0x4C2CDDB: free (vg_replace_malloc.c:530) ==2272== by 0x4E6143C: mapping_expr_destroy (expression.c:966) ==2272== by 0x4E612DA: expr_destroy (expression.c:79) ==2272== by 0x4E612DA: expr_free (expression.c:93) ==2272== by 0x4E5D7E7: stmt_free (statement.c:50) ==2272== by 0x4E5D8B7: stmt_list_free (statement.c:60) ==2272== by 0x4E590FF: rule_free (rule.c:610) ==2272== by 0x4E5C094: cmd_free (rule.c:1420) ==2272== by 0x4E7E7EF: nft_run_cmd_from_filename (libnftables.c:490) ==2272== by 0x109A53: main (main.c:310) ==2272== Block was alloc'd at ==2272== at 0x4C2BBAF: malloc (vg_replace_malloc.c:299) ==2272== by 0x4E79248: xmalloc (utils.c:36) ==2272== by 0x4E7932D: xzalloc (utils.c:65) ==2272== by 0x4E60690: expr_alloc (expression.c:45) ==2272== by 0x4E68B1D: payload_expr_alloc (payload.c:159) ==2272== by 0x4E91013: nft_parse (parser_bison.y:4242) ==2272== by 0x4E7E722: nft_parse_bison_filename (libnftables.c:374) ==2272== by 0x4E7E722: nft_run_cmd_from_filename (libnftables.c:471) ==2272== by 0x109A53: main (main.c:310) Fixes: cc7b37d18a68 ("src: Interpret OP_NEQ against a set as OP_LOOKUP") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: single cache_update() call to build cache before evaluation	Pablo Neira Ayuso	2019-06-06	1	-75/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows us to make one single cache_update() call. Thus, there is not need to rebuild an incomplete cache from the middle of the batch processing. Note that nft_run_cmd_from_filename() does not need a full netlink dump to build the cache anymore, this should speed nft -f with incremental updates and very large rulesets. cache_evaluate() calculates the netlink dump to populate the cache that this batch needs. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>