summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* src: add <nft.h> header and include it as firstThomas Haller2023-08-2549-8/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | <config.h> is generated by the configure script. As it contains our feature detection, it want to use it everywhere. Likewise, in some of our sources, we define _GNU_SOURCE. This defines the C variant we want to use. Such a define need to come before anything else, and it would be confusing if different source files adhere to a different C variant. It would be good to use autoconf's AC_USE_SYSTEM_EXTENSIONS, in which case we would also need to ensure that <config.h> is always included as first. Instead of going through all source files and include <config.h> as first, add a new header "include/nft.h", which is supposed to be included in all our sources (and as first). This will also allow us later to prepare some common base, like include <stdbool.h> everywhere. We aim that headers are self-contained, so that they can be included in any order. Which, by the way, already didn't work because some headers define _GNU_SOURCE, which would only work if the header gets included as first. <nft.h> is however an exception to the rule: everything we compile shall rely on having <nft.h> header included as first. This applies to source files (which explicitly include <nft.h>) and to internal header files (which are only compiled indirectly, by being included from a source file). Note that <config.h> has no include guards, which is at least ugly to include multiple times. It doesn't cause problems in practice, because it only contains defines and the compiler doesn't warn about redefining a macro with the same value. Still, <nft.h> also ensures to include <config.h> exactly once. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: define _GNU_SOURCE to get strptime() from <time.h>Thomas Haller2023-08-251-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To use `strptime()`, the documentation indicates #define _XOPEN_SOURCE #include <time.h> However, previously this was done wrongly. For example, when building with musl we got a warning: CC meta.lo meta.c:40: warning: "_XOPEN_SOURCE" redefined 40 | #define _XOPEN_SOURCE | In file included from /usr/include/errno.h:8, from meta.c:13: /usr/include/features.h:16: note: this is the location of the previous definition 16 | #define _XOPEN_SOURCE 700 | Defining "__USE_XOPEN" is wrong. This is a glibc internal define not for the user. Note that if we just set _XOPEN_SOURCE (or _XOPEN_SOURCE=700), we won't get other things like "struct tm.tm_gmtoff". Instead, we already define _GNU_SOURCE at other places. Do that here too, it will give us strptime() and all is good. Also, those directives should be defined as first thing (or via "-D" command line). See [1]. This is also important, because to use "time_t" in a header, we would need to include <time.h>. That only works, if we get the feature test macros right. That is, define the _?_SOURCE macro as first thing. [1] https://www.gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add input flag NFT_CTX_INPUT_JSON to enable JSON parsingThomas Haller2023-08-241-2/+2
| | | | | | | | | | | | | | By default, the input is parsed using the nftables grammar. When setting NFT_CTX_OUTPUT_JSON flag, nftables will first try to parse the input as JSON before falling back to the nftables grammar. But NFT_CTX_OUTPUT_JSON flag also turns on JSON for the output. Add a flag NFT_CTX_INPUT_JSON which allows to treat only the input as JSON, but keep the output mode unchanged. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add input flag NFT_CTX_INPUT_NO_DNS to avoid blockingThomas Haller2023-08-242-30/+48
| | | | | | | | | | | | | | | | | | | getaddrinfo() blocks while trying to resolve the name. Blocking the caller of the library is in many cases undesirable. Also, while reconfiguring the firewall, it's not clear that resolving names via the network will work or makes sense. Add a new input flag NFT_CTX_INPUT_NO_DNS to opt-out from getaddrinfo() and only accept plain IP addresses. We could also use AI_NUMERICHOST with getaddrinfo() instead of inet_pton(). By parsing via inet_pton(), we are better aware of what we expect and can generate a better error message in case of failure. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add input flags for nft_ctxThomas Haller2023-08-242-0/+21
| | | | | | | | | | | | Similar to the existing output flags, add input flags. No flags are yet implemented, that will follow. One difference to nft_ctx_output_set_flags(), is that the setter for input flags returns the previously set flags. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: error out on meter overlap with an existing set/map declarationPablo Neira Ayuso2023-08-231-0/+18
| | | | | | | | | | | | | | | | | | | One of the problems with meters is that they use the set/map infrastructure behind the scenes which might be confusing to users. This patch errors out in case user declares a meter whose name overlaps with an existing set/map: meter.nft:15:18-91: Error: File exists; meter ‘syn4-meter’ overlaps an existing set ‘syn4-meter’ in family inet tcp dport 22 meter syn4-meter { ip saddr . tcp dport timeout 5m limit rate 20/minute } counter accept ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ An old 5.10 kernel bails out simply with EEXIST, with this patch a better hint is provided. Dynamic sets are preferred over meters these days. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: chain listing implicitly sets on terse optionPablo Neira Ayuso2023-08-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If user specifies a chain to be listed (which is internally handled via filtering options), then toggle NFT_CACHE_TERSE to skip fetching set content from kernel for non-anonymous sets. With a large IPv6 set with bogons, before this patch: # time nft list chain inet raw x table inet raw { chain x { ip6 saddr @bogons6 ip6 saddr { aaaa::, bbbb:: } } } real 0m2,913s user 0m1,345s sys 0m1,568s After this patch: # time nft list chain inet raw prerouting table inet raw { chain x { ip6 saddr @bogons6 ip6 saddr { aaaa::, bbbb:: } } } real 0m0,056s user 0m0,018s sys 0m0,039s This speeds up chain listing in the presence of a large set. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: use reentrant localtime_r()/gmtime_r() functionsThomas Haller2023-08-221-19/+22
| | | | | | | | | | These functions are POSIX.1-2001. We should have them in all environments we care about. Use them as they are thread-safe. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: don't assume time_t is 64 bit in date_type_print()Thomas Haller2023-08-221-5/+8
| | | | | | | | | time_t on 32bit arch is not uint64_t. Even if it always were, it would be ugly to make such an assumption (without a static assert). Copy the value to a time_t variable first. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nftutils: add and use wrappers for getprotoby{name,number}_r(), ↵Thomas Haller2023-08-206-30/+157
| | | | | | | | | | | | | | | getservbyport_r() We should aim to use the thread-safe variants of getprotoby{name,number} and getservbyport(). However, they may not be available with other libc, so it requires a configure check. As that is cumbersome, add wrappers that do that at one place. These wrappers are thread-safe, if libc provides the reentrant versions. Use them. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: use strtok_r() instead of strtok()Thomas Haller2023-08-181-2/+3
| | | | | | | | | strtok_r() is probably(?) everywhere available where we care. Use it. It is thread-safe, and libnftables shouldn't make assumptions about what other threads of the process are doing. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: deduplicate map with data intervalFlorian Westphal2023-08-031-13/+7
| | | | | | | | | | Its copypasted, the copy is same as original except that it specifies a map key that maps to an interval. Add an exra rule that returns 0 or EXPR_F_INTERVAL, then use that in a single rule. Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: allow ct timeouts to use time_spec valuesFlorian Westphal2023-08-032-6/+13
| | | | | | | | | | | | | | | For some reason the parser only allows raw numbers (seconds) for ct timeouts, e.g. ct timeout ttcp { protocol tcp; policy = { syn_sent : 3, ... Also permit time_spec, e.g. "established : 5d". Print the nicer time formats on output, but retain raw numbers support on input for compatibility. Signed-off-by: Florian Westphal <fw@strlen.de>
* libnftables: Drop cache in -c/--check modePablo Neira Ayuso2023-08-011-2/+5
| | | | | | | | | | | | | | | | | | | | | Extend e0aace943412 ("libnftables: Drop cache in error case") to also drop the cache with -c/--check, this is a dry run mode and kernel does not get any update. This fixes a bug with -o/--optimize, which first runs in an implicit -c/--check mode to validate that the ruleset is correct, then it provides the proposed optimization. In this case, if the cache is not emptied, old objects in the cache refer to scanner data that was already released, which triggers BUG like this: BUG: invalid input descriptor type 151665524 nft: erec.c:161: erec_print: Assertion `0' failed. Aborted This bug was triggered in a ruleset that contains a set for geoip filtering. This patch also extends tests/shell to cover this case. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ct expectation: fix 'list object x' vs. 'list objects in table' confusionFlorian Westphal2023-07-314-1/+4
| | | | | | | | | | Just like "ct timeout", "ct expectation" is in need of the same fix, we get segfault on "nft list ct expectation table t", if table t exists. This is the exact same pattern as resolved for "ct timeout" in commit 1d2e22fc0521 ("ct timeout: fix 'list object x' vs. 'list objects in table' confusion"). Signed-off-by: Florian Westphal <fw@strlen.de>
* rule: allow src/dstnat prios in input and outputFlorian Westphal2023-07-311-2/+4
| | | | | | | | | | | | | | Dan Winship says: The "dnat" command is usable from either "prerouting" or "output", but the "dstnat" priority is only usable from "prerouting". (Likewise, "snat" is usable from either "postrouting" or "input", but "srcnat" is only usable from "postrouting".) No need to restrict those priorities to pre/postrouting. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1694 Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink: delinearize: copy set keytype if neededFlorian Westphal2023-07-271-0/+2
| | | | | | | | | | | | | Output before: add @dynmark { 0xa020304 [invalid type] timeout 1s : 0x00000002 } comment "also check timeout-gc" after: add @dynmark { 10.2.3.4 timeout 1s : 0x00000002 } comment "also check timeout-gc" This is a followup to 76c358ccfea0 ("src: maps: update data expression dtype based on set"), which did fix the map expression, but not the key. Signed-off-by: Florian Westphal <fw@strlen.de>
* meta: stash context statement length when generating payload/meta dependencyPablo Neira Ayuso2023-07-191-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | ... meta mark set ip dscp generates an implicit dependency from the inet family to match on meta nfproto ip. The length of this implicit expression is incorrectly adjusted to the statement length, ie. relational to compare meta nfproto takes 4 bytes instead of 1 byte. The evaluation of 'ip dscp' under the meta mark statement triggers this implicit dependency which should not consider the context statement length since it is added before the statement itself. This problem shows when listing the ruleset, since netlink_parse_cmp() where left->len < right->len, hence handling the implicit dependency as a concatenation, but it is actually a bug in the evaluation step that leads to incorrect bytecode. Fixes: 3c64ea7995cb ("evaluate: honor statement length in integer evaluation") Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand") Tested-by: Brian Davidson <davidson.brian@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* exthdr: prefer raw_type instead of desc->typeFlorian Westphal2023-07-171-1/+1
| | | | | | | | | | | | | | | | | | On ancient kernels desc can be NULL, because such kernels do not understand NFTA_EXTHDR_TYPE. Thus they don't include it in the reverse dump, so the tcp/ip option gets treated like an ipv6 exthdr, but no matching description will be found. This then gives a crash due to the null deref. Just use the raw value here, this avoid a crash and at least print *something*, e.g.: unknown-exthdr unknown & 0xf0 [invalid type] == 0x0 [invalid type] Signed-off-by: Florian Westphal <fw@strlen.de>
* Implement 'reset {set,map,element}' commandsPhil Sutter2023-07-137-13/+62
| | | | | | | | | | | All these are used to reset state in set/map elements, i.e. reset the timeout or zero quota and counter values. While 'reset element' expects a (list of) elements to be specified which should be reset, 'reset set/map' will reset all elements in the given set/map. Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Cache looked up set for list commandsPhil Sutter2023-07-133-7/+15
| | | | | | | | | | Evaluation phase checks the given table and set exist in cache. Relieve execution phase from having to perform the lookup again by storing the set reference in cmd->set. Just have to increase the ref counter so cmd_free() does the right thing (which lacked handling of MAP and METER objects for some reason). Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Merge some cases in cmd_evaluate_list()Phil Sutter2023-07-131-32/+4
| | | | | | | | The code for set, map and meter were almost identical apart from the specific last check. Fold them together and make the distinction in that spot only. Signed-off-by: Phil Sutter <phil@nwl.cc>
* libnftables: drop check for nf_sock in nft_ctx_free()Thomas Haller2023-07-101-2/+1
| | | | | | | | | | | | | | The "nft_ctx" API does not provide a way to change or reconnect the netlink socket. And none of the users would rely on that. Also note that nft_ctx_new() initializes nf_sock via nft_mnl_socket_open(), which panics of the socket could not be initialized. This means, the check is unnecessary and needlessly confusing. Drop it. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: inline creation of nf_sock in nft_ctx_new()Thomas Haller2023-07-101-6/+1
| | | | | | | | | | | | The function only has one caller. It's not clear how to extend this in a useful way, so that it makes sense to keep the initialization in a separate function. Simplify the code, by inlining and dropping the static function nft_ctx_netlink_init(). There was only one caller. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: drop unused argument nf_sock from nft_netlink()Thomas Haller2023-07-101-4/+3
| | | | | Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: always initialize netlink socket in nft_ctx_new()Thomas Haller2023-07-101-2/+1
| | | | | | | | | | | | | | | | | nft_ctx_new() has a flags argument, but currently no flags are supported. The documentation suggests to pass 0 (NFT_CTX_DEFAULT). Initializing the netlink socket happens by default already, we should do it for all flags. Also because nft_ctx_netlink_init() is not public API so it's not clear how the user gets a functioning context instance otherwise. If we ever want to not initialize the netlink socket for a context instance, then there should be a dedicated flag for doing that (and additional API for making that mode of operation usable). Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: place byteorder conversion before rshift in payload statementPablo Neira Ayuso2023-07-081-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For bitfield that spans more than one byte, such as ip6 dscp, byteorder conversion needs to be done before rshift. Add unary expression for this conversion only in the case of meta and ct statements. Before this patch: # nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp' ip6 x y [ payload load 2b @ network header + 0 => reg 1 ] [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ] [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ] [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <--------- incorrect [ meta set mark with reg 1 ] After this patch: # nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp' ip6 x y [ payload load 2b @ network header + 0 => reg 1 ] [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ] [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <-------- correct [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ] [ meta set mark with reg 1 ] For the matching case, binary transfer already deals with the rshift to adjust left and right hand side of the expression, the unary conversion is not needed in such case. Fixes: 8221d86e616b ("tests: py: add test-cases for ct and packet mark payload expressions") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: use div_round_up in byteorder lengthPablo Neira Ayuso2023-07-061-1/+1
| | | | | | | | | Use div_round_up() to calculate the byteorder length, otherwise fields that take % BITS_PER_BYTE != 0 are not considered by the byteorder expression. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cli: Make cli_init() return to callerPhil Sutter2023-07-041-21/+42
| | | | | | | | | | | | | | | Avoid direct exit() calls as that leaves the caller-allocated nft_ctx object in place. Making sure it is freed helps with valgrind-analyses at least. To signal desired exit from CLI, introduce global cli_quit boolean and make all cli_exit() implementations also set cli_rc variable to the appropriate return code. The logic is to finish CLI only if cli_quit is true which asserts proper cleanup as it is set only by the respective cli_exit() function. Signed-off-by: Phil Sutter <phil@nwl.cc>
* main: Call nft_ctx_free() before exitingPhil Sutter2023-07-041-17/+19
| | | | | | | | | | Introduce labels for failure and regular exit so all direct exit() calls after nft_ctx allocation may be replaced by a single goto statement. Simply drop that return call in interactive branch, code will continue at 'out' label naturally. Signed-off-by: Phil Sutter <phil@nwl.cc>
* main: Make 'buf' variable branch-localPhil Sutter2023-07-041-2/+4
| | | | | | | | It is used only to linearize non-option argv for passing to nft_run_cmd_from_buffer(), reduce its scope. Allows to safely move the free() call there, too. Signed-off-by: Phil Sutter <phil@nwl.cc>
* expression: define .clone for catchall set elementPablo Neira Ayuso2023-06-301-2/+13
| | | | | | | Otherwise reuse of catchall set element expression in variable triggers a null-pointer dereference. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: avoid IPPROTO_MAX for array definitionsFlorian Westphal2023-06-212-4/+3
| | | | | | | | | | | | | | | ip header can only accomodate 8but value, but IPPROTO_MAX has been bumped due to uapi reasons to support MPTCP (262, which is used to toggle on multipath support in tcp). This results in: exthdr.c:349:11: warning: result of comparison of constant 263 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare] if (type < array_size(exthdr_protocols)) ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ redude array sizes back to what can be used on-wire. Signed-off-by: Florian Westphal <fw@strlen.de>
* ct timeout: fix 'list object x' vs. 'list objects in table' confusionFlorian Westphal2023-06-204-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | <empty ruleset> $ nft list ct timeout table t Error: No such file or directory list ct timeout table t ^ This is expected to list all 'ct timeout' objects. The failure is correct, the table 't' does not exist. But now lets add one: $ nft add table t $ nft list ct timeout table t Segmentation fault (core dumped) ... and thats not expected, nothing should be shown and nft should exit normally. Because of missing TIMEOUTS command enum, the backend thinks it should do an object lookup, but as frontend asked for 'list of objects' rather than 'show this object', handle.obj.name is NULL, which then results in this crash. Update the command enums so that backend knows what the frontend asked for. Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: reject zero-length interface names in flowtablesFlorian Westphal2023-06-201-8/+12
| | | | | | Previous patch wasn't enough, also disable this for flowtable device lists. Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: reject zero-length interface namesFlorian Westphal2023-06-201-5/+31
| | | | | | | | | | | device "" results in an assertion during evaluation. Before: nft: expression.c:426: constant_expr_alloc: Assertion `(((len) + (8) - 1) / (8)) > 0' failed. After: zero_length_devicename_assert:3:42-49: Error: you cannot set an empty interface name type filter hook ingress device""lo" priority -1 ^^^^^^^^ Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: don't assert on scope underflowsFlorian Westphal2023-06-201-2/+1
| | | | | | | | | | | | | | | close_scope() gets called from the object destructors; imbalance can cause us to hit assert(). Before: nft: parser_bison.y:88: close_scope: Assertion `state->scope > 0' failed. After: assertion3:4:7-7: Error: too many levels of nesting jump { assertion3:5:8-8: Error: too many levels of nesting jump assertion3:5:9-9: Error: syntax error, unexpected newline, expecting '{' assertion3:7:1-1: Error: syntax error, unexpected end of file Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: do not abort when prefix map has non-map elementFlorian Westphal2023-06-201-4/+13
| | | | | | | | | | | Before: nft: evaluate.c:1849: __mapping_expr_expand: Assertion `i->etype == EXPR_MAPPING' failed. after: Error: expected mapping, not set element snat ip prefix to ip saddr map { 10.141.11.0/24 : 192.168.2.0/24, 10.141.12.1 } Signed-off-by: Florian Westphal <fw@strlen.de>
* json: dccp: remove erroneous const qualifierFlorian Westphal2023-06-201-1/+1
| | | | | | | | | | | This causes a clang warning: parser_json.c:767:6: warning: variable 'opt_type' is uninitialized when used here [-Wuninitialized] if (opt_type < DCCPOPT_TYPE_MIN || opt_type > DCCPOPT_TYPE_MAX) { ^~~~~~~~ ... because it deduces the object is readonly. Signed-off-by: Florian Westphal <fw@strlen.de>
* json: add inner payload supportPablo Neira Ayuso2023-06-202-6/+62
| | | | | | Add support for vxlan, geneve, gre and gretap. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add json support for last statementPablo Neira Ayuso2023-06-203-0/+31
| | | | | | | | | | This patch adds json support for the last statement, it works for me here. However, tests/py still displays a warning: any/last.t: WARNING: line 12: '{"nftables": [{"add": {"rule": {"family": "ip", "table": "test-ip4", "chain": "input", "expr": [{"last": {"used": 300000}}]}}}]}': '[{"last": {"used": 300000}}]' mismatches '[{"last": null}]' Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: include set elements in "nft set list"Florian Westphal2023-06-192-7/+3
| | | | | | | | | | | | | | | | | Make "nft list sets" include set elements in listing by default. In nftables 1.0.0, "nft list sets" did not include the set elements, but with "--json" they were included. 1.0.1 and newer never include them. This causes a problem for people updating from 1.0.0 and relying on the presence of the set elements. Change nftables to always include the set elements. The "--terse" option is honored to get the "no elements" behaviour. Fixes: a1a6b0a5c3c4 ("cache: finer grain cache population for list commands") Link: https://marc.info/?l=netfilter&m=168704941828372&w=2 Signed-off-by: Florian Westphal <fw@strlen.de>
* exthdr: add boolean DCCP option matchingJeremy Sowden2023-06-018-0/+320
| | | | | | | | | | Iptables supports the matching of DCCP packets based on the presence or absence of DCCP options. Extend exthdr expressions to add this functionality to nftables. Link: https://bugzilla.netfilter.org/show_bug.cgi?id=930 Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: permit use of constant values in set lookup keysFlorian Westphal2023-05-242-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Something like: Given: set s { type ipv4_addr . ipv4_addr . inet_service .. } something like add rule ip saddr . 1.2.3.4 . 80 @s goto c1 fails with: "Error: Can't parse symbolic invalid expressions". This fails because the relational expression first evaluates the left hand side, so when concat evaluation sees '1.2.3.4' no key context is available. Check if the RHS is a set reference, and, if so, evaluate the right hand side. This sets a pointer to the set key in the evaluation context structure which then makes the concat evaluation step parse 1.2.3.4 and 80 as ipv4 address and 16bit port number. On delinearization, extend relop postprocessing to copy the datatype from the rhs (set reference, has proper datatype according to set->key) to the lhs (concat expression). Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: support bpf id decode in nft list hooksFlorian Westphal2023-05-221-0/+40
| | | | | | | | | | | This allows 'nft list hooks' to also display the bpf program id attached. Example: hook input { -0000000128 nf_hook_run_bpf id 6 .. Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: set NFT_SET_EVAL flag if dynamic set already existsPablo Neira Ayuso2023-05-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nft reports EEXIST when reading an existing set whose NFT_SET_EVAL has been previously inferred from the ruleset. # cat test.nft table ip test { set dlist { type ipv4_addr size 65535 } chain output { type filter hook output priority filter; policy accept; udp dport 1234 update @dlist { ip daddr } counter packets 0 bytes 0 } } # nft -f test.nft # nft -f test.nft test.nft:2:6-10: Error: Could not process rule: File exists set dlist { ^^^^^ Phil Sutter says: In the first call, the set lacking 'dynamic' flag does not exist and is therefore added to the cache. Consequently, both the 'add set' command and the set statement point at the same set object. In the second call, a set with same name exists already, so the object created for 'add set' command is not added to cache and consequently not updated with the missing flag. The kernel thus rejects the NEWSET request as the existing set differs from the new one. Set on the NFT_SET_EVAL flag if the existing set sets it on. Fixes: 8d443adfcc8c1 ("evaluate: attempt to set_eval flag if dynamic updates requested") Tested-by: Eric Garver <eric@garver.life> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: add hint error handlerPablo Neira Ayuso2023-05-111-2/+39
| | | | | | | | | | | | | | | | | | | | | | | If user provides a symbol that cannot be parsed and the datatype provides an error handler, provide a hint through the misspell infrastructure. For instance: # cat test.nft table ip x { map y { typeof ip saddr : verdict elements = { 1.2.3.4 : filter_server1 } } } # nft -f test.nft test.nft:4:26-39: Error: Could not parse netfilter verdict; did you mean `jump filter_server1'? elements = { 1.2.3.4 : filter_server1 } ^^^^^^^^^^^^^^ While at it, normalize error to "Could not parse symbolic %s expression". Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: misspell support with symbol table parser for error reportingPablo Neira Ayuso2023-05-111-2/+48
| | | | | | | | | | | | | | | | | | Some datatypes provide a symbol table that is parsed as an integer. Improve error reporting by using the misspell infrastructure, to provide a hint to the user, whenever possible. If base datatype, usually the integer datatype, fails to parse the symbol, then try a fuzzy match on the symbol table to provide a hint in case the user has mistype it. For instance: test.nft:3:11-14: Error: Could not parse Differentiated Services Code Point expression; did you you mean `cs0`? ip dscp ccs0 ^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* optimize: do not remove counter in verdict mapsPablo Neira Ayuso2023-05-101-7/+43
| | | | | | | | | | | | | Add counter to set element instead of dropping it: # nft -c -o -f test.nft Merging: test.nft:6:3-50: ip saddr 1.1.1.1 ip daddr 2.2.2.2 counter accept test.nft:7:3-48: ip saddr 1.1.1.2 ip daddr 3.3.3.3 counter drop into: ip daddr . ip saddr vmap { 2.2.2.2 . 1.1.1.1 counter : accept, 3.3.3.3 . 1.1.1.2 counter : drop } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: skip optimization if anonymous set uses stateful statementPablo Neira Ayuso2023-05-101-1/+1
| | | | | | | | fee6bda06403 ("evaluate: remove anon sets with exactly one element") introduces an optimization to remove use of sets with single element. Skip this optimization if set element contains stateful statements. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>