Commit message  Author  Age  Files  Lines
* build: Bump version to 1.0.8  (tag: v1.0.8)  Pablo Neira Ayuso  2023-07-14  1  -3/+3
| | | | | | | Update dependency on libnftnl >= 1.2.6 which contains support for meta broute. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: missing dccpopt.h breaks make distcheck  Pablo Neira Ayuso  2023-07-14  1  -0/+1
| | | | | | Add it to Makefile.am. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Implement 'reset {set,map,element}' commands  Phil Sutter  2023-07-13  13  -23/+162
| | | | | | | | | | | All these are used to reset state in set/map elements, i.e. reset the timeout or zero quota and counter values. While 'reset element' expects a (list of) elements to be specified which should be reset, 'reset set/map' will reset all elements in the given set/map. Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Cache looked up set for list commands  Phil Sutter  2023-07-13  3  -7/+15
| | | | | | | | | | Evaluation phase checks the given table and set exist in cache. Relieve execution phase from having to perform the lookup again by storing the set reference in cmd->set. Just have to increase the ref counter so cmd_free() does the right thing (which lacked handling of MAP and METER objects for some reason). Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Merge some cases in cmd_evaluate_list()  Phil Sutter  2023-07-13  1  -32/+4
|
| The code for set, map and meter was almost identical apart from the
| specific last check. Fold them together and make the distinction in that
| spot only.
|
| Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: shell: cover old scanner bug  Pablo Neira Ayuso  2023-07-11  2  -0/+1132
| | | | | | | | Add a test to cover 423abaa40ec4 ("scanner: don't rely on fseek for input stream repositioning") that fixes the bug described in https://bugs.gentoo.org/675188. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: drop check for nf_sock in nft_ctx_free()  Thomas Haller  2023-07-10  1  -2/+1
|
| The "nft_ctx" API does not provide a way to change or reconnect the
| netlink socket. And none of the users would rely on that.
|
| Also note that nft_ctx_new() initializes nf_sock via
| nft_mnl_socket_open(), which panics if the socket could not be
| initialized.
|
| This means the check is unnecessary and needlessly confusing. Drop it.
|
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: inline creation of nf_sock in nft_ctx_new()  Thomas Haller  2023-07-10  1  -6/+1
| | | | | | | | | | | | The function only has one caller. It's not clear how to extend this in a useful way, so that it makes sense to keep the initialization in a separate function. Simplify the code, by inlining and dropping the static function nft_ctx_netlink_init(). There was only one caller. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: drop unused argument nf_sock from nft_netlink()  Thomas Haller  2023-07-10  1  -4/+3
| | | | | Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: always initialize netlink socket in nft_ctx_new()  Thomas Haller  2023-07-10  1  -2/+1
| | | | | | | | | | | | | | | | | nft_ctx_new() has a flags argument, but currently no flags are supported. The documentation suggests to pass 0 (NFT_CTX_DEFAULT). Initializing the netlink socket happens by default already, we should do it for all flags. Also because nft_ctx_netlink_init() is not public API so it's not clear how the user gets a functioning context instance otherwise. If we ever want to not initialize the netlink socket for a context instance, then there should be a dedicated flag for doing that (and additional API for making that mode of operation usable). Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: place byteorder conversion before rshift in payload statement  Pablo Neira Ayuso  2023-07-08  3  -8/+18
|
| For bitfield that spans more than one byte, such as ip6 dscp, byteorder
| conversion needs to be done before rshift. Add unary expression for this
| conversion only in the case of meta and ct statements.
|
| Before this patch:
|
|  # nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
|  ip6 x y
|    [ payload load 2b @ network header + 0 => reg 1 ]
|    [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
|    [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
|    [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <--------- incorrect
|    [ meta set mark with reg 1 ]
|
| After this patch:
|
|  # nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
|  ip6 x y
|    [ payload load 2b @ network header + 0 => reg 1 ]
|    [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
|    [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <-------- correct
|    [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
|    [ meta set mark with reg 1 ]
|
| For the matching case, binary transfer already deals with the rshift to
| adjust left and right hand side of the expression, the unary conversion
| is not needed in such case.
|
| Fixes: 8221d86e616b ("tests: py: add test-cases for ct and packet mark payload expressions")
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
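The ordering issue can be reproduced outside nft. The following is a minimal Python sketch, assuming a little-endian host where the register holds the two raw header bytes; the helper names are hypothetical and only the mask 0xc00f and the shift by 6 are taken from the netlink dump in this commit message.

```python
import struct

DSCP_MASK = 0xC00F  # mask from the netlink dump (register view on a little-endian host)
DSCP_SHIFT = 6      # rshift amount from the netlink dump


def ntoh16(value):
    """Byte-swap a 16-bit value, which is what ntoh does on a little-endian host."""
    return struct.unpack(">H", struct.pack("<H", value & 0xFFFF))[0]


def load_reg(header):
    """Load the first two header bytes the way a little-endian host sees them."""
    return struct.unpack("<H", header[:2])[0]


def dscp_shift_then_swap(header):
    """Old order: mask, rshift, then byteorder conversion."""
    reg = load_reg(header) & DSCP_MASK
    reg >>= DSCP_SHIFT
    return ntoh16(reg)


def dscp_swap_then_shift(header):
    """New order: mask, byteorder conversion, then rshift."""
    reg = load_reg(header) & DSCP_MASK
    reg = ntoh16(reg)
    return reg >> DSCP_SHIFT


# First two bytes of an IPv6 header: version 6, DSCP 46 (EF), ECN 0, flow label 0.
header = struct.pack(">H", (6 << 12) | (46 << 6))

print(dscp_swap_then_shift(header))  # recovers the DSCP value
print(dscp_shift_then_swap(header))  # shifts bits across the byte boundary first
```

Shifting before the swap moves bits between the two bytes while they are still in wire order, so the later swap cannot restore the field.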
* netlink_linearize: use div_round_up in byteorder length  Pablo Neira Ayuso  2023-07-06  3  -8/+8
| | | | | | | | | Use div_round_up() to calculate the byteorder length, otherwise fields that take % BITS_PER_BYTE != 0 are not considered by the byteorder expression. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
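To illustrate why rounding up matters, here is a small Python sketch. It assumes the byteorder length is derived from a field length in bits; the two wrapper function names are hypothetical, and only div_round_up and BITS_PER_BYTE are names taken from the commit message.

```python
BITS_PER_BYTE = 8


def div_round_up(n, d):
    """Integer division that rounds towards positive infinity."""
    return (n + d - 1) // d


def byteorder_len_truncating(bits):
    """Plain integer division drops a trailing partial byte."""
    return bits // BITS_PER_BYTE


def byteorder_len(bits):
    """Rounding up keeps the partial byte in the byteorder length."""
    return div_round_up(bits, BITS_PER_BYTE)


for bits in (6, 12, 16):
    print(bits, byteorder_len_truncating(bits), byteorder_len(bits))
```

A 12-bit field needs two bytes, but truncating division yields one, so the second byte would be left out of the conversion.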
* tests: shell: Introduce valgrind mode  Phil Sutter  2023-07-04  1  -0/+47
| | | | | | | | Pass flag '-V' to run-tests.sh to run all 'nft' invocations in valgrind leak checking environment. Code copied from iptables' shell-testsuite where it proved to be useful already. Signed-off-by: Phil Sutter <phil@nwl.cc>
* cli: Make cli_init() return to caller  Phil Sutter  2023-07-04  2  -22/+43
| | | | | | | | | | | | | | | Avoid direct exit() calls as that leaves the caller-allocated nft_ctx object in place. Making sure it is freed helps with valgrind-analyses at least. To signal desired exit from CLI, introduce global cli_quit boolean and make all cli_exit() implementations also set cli_rc variable to the appropriate return code. The logic is to finish CLI only if cli_quit is true which asserts proper cleanup as it is set only by the respective cli_exit() function. Signed-off-by: Phil Sutter <phil@nwl.cc>
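The control flow described in this message can be sketched in Python. This is an illustration of the pattern only, not the C implementation: the Cli and Context classes and the "quit" command handling are hypothetical, and the quit and rc attributes merely play the roles of cli_quit and cli_rc.

```python
class Context:
    """Hypothetical stand-in for the caller-allocated nft_ctx object."""

    def __init__(self):
        self.freed = False

    def free(self):
        self.freed = True


class Cli:
    """Hypothetical stand-in for the interactive CLI state."""

    def __init__(self):
        self.quit = False  # plays the role of cli_quit
        self.rc = 0        # plays the role of cli_rc

    def exit(self, rc):
        # Record the wish to leave instead of terminating the process.
        self.rc = rc
        self.quit = True

    def run(self, lines):
        for line in lines:
            if line.strip() == "quit":
                self.exit(0)
            if self.quit:
                break
        # Control returns to the caller, which still owns the context.
        return self.rc


def main(lines):
    ctx = Context()
    try:
        return Cli().run(lines), ctx
    finally:
        ctx.free()  # always reached because the CLI no longer exits directly


rc, ctx = main(["list ruleset", "quit"])
print(rc, ctx.freed)
```

Because the loop ends only through the flag set in exit(), the cleanup in the caller runs on every path.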
* main: Call nft_ctx_free() before exiting  Phil Sutter  2023-07-04  1  -17/+19
| | | | | | | | | | Introduce labels for failure and regular exit so all direct exit() calls after nft_ctx allocation may be replaced by a single goto statement. Simply drop that return call in interactive branch, code will continue at 'out' label naturally. Signed-off-by: Phil Sutter <phil@nwl.cc>
* main: Make 'buf' variable branch-local  Phil Sutter  2023-07-04  1  -2/+4
|
| It is used only to linearize non-option argv for passing to
| nft_run_cmd_from_buffer(), so reduce its scope. This also allows safely
| moving the free() call there.
|
| Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: shell: refcount memleak in map rhs with timeouts  Pablo Neira Ayuso  2023-07-04  1  -0/+48
| | | | | | Extend coverage for refcount leaks on map element expiration. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expression: define .clone for catchall set element  Pablo Neira Ayuso  2023-06-30  2  -2/+34
| | | | | | | Otherwise reuse of catchall set element expression in variable triggers a null-pointer dereference. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: py: Document JSON mode in README  Phil Sutter  2023-06-27  1  -0/+31
| | | | | | | Mostly identify the various files that (may) appear or exist already and how to deal with them. Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: shell: cover refcount leak of mapping rhs  Pablo Neira Ayuso  2023-06-27  1  -0/+38
|
| Add a test to cover a reference count leak in maps by adding the same
| element twice, then flushing.
|
| Reported-by: Florian Westphal <fw@strlen.de>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: coverage for simple port knocking ruleset  Pablo Neira Ayuso  2023-06-26  2  -0/+59
| | | | | | Add a test to cover port knocking simple ruleset. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: json: add missing/expected json output  Florian Westphal  2023-06-24  1  -0/+14
|
| nft-test.py generates the following warning:
|
| any/last.t: WARNING: line 12: '{"nftables": [{"add": {"rule": {"family": "ip", "table": "test-ip4", "chain": "input", "expr": [{"last": {"used": 300000}}]}}}]}': '[{"last": {"used": 300000}}]' mismatches '[{"last": null}]'
|
| This is because the "last" expression is stateful, but nft-test.py
| explicitly asks for stateless output.
|
| Thus we need to provide a json.output file. Without it, nft-test.py uses
| last.json as the expected output file.
|
| Fixes: ae8786756b0c ("src: add json support for last statement")
| Signed-off-by: Florian Westphal <fw@strlen.de>
* src: avoid IPPROTO_MAX for array definitions  Florian Westphal  2023-06-21  3  -5/+4
|
| The ip header can only accommodate an 8-bit value, but IPPROTO_MAX has
| been bumped due to uapi reasons to support MPTCP (262, which is used to
| toggle on multipath support in tcp).
|
| This results in:
| exthdr.c:349:11: warning: result of comparison of constant 263 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
|         if (type < array_size(exthdr_protocols))
|             ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| Reduce array sizes back to what can be used on-wire.
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* ct timeout: fix 'list object x' vs. 'list objects in table' confusion  Florian Westphal  2023-06-20  5  -1/+5
|
| <empty ruleset>
|   $ nft list ct timeout table t
|   Error: No such file or directory
|   list ct timeout table t
|                         ^
| This is expected to list all 'ct timeout' objects.
| The failure is correct, the table 't' does not exist.
|
| But now let's add one:
|   $ nft add table t
|   $ nft list ct timeout table t
|   Segmentation fault (core dumped)
|
| ... and that's not expected, nothing should be shown
| and nft should exit normally.
|
| Because of the missing TIMEOUTS command enum, the backend thinks
| it should do an object lookup, but as the frontend asked for
| 'list of objects' rather than 'show this object',
| handle.obj.name is NULL, which then results in this crash.
|
| Update the command enums so that the backend knows what the
| frontend asked for.
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: reject zero-length interface names in flowtables  Florian Westphal  2023-06-20  2  -8/+17
| | | | | | Previous patch wasn't enough, also disable this for flowtable device lists. Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: reject zero-length interface names  Florian Westphal  2023-06-20  2  -5/+36
|
| device "" results in an assertion during evaluation.
| Before:
| nft: expression.c:426: constant_expr_alloc: Assertion `(((len) + (8) - 1) / (8)) > 0' failed.
| After:
| zero_length_devicename_assert:3:42-49: Error: you cannot set an empty interface name
| type filter hook ingress device""lo" priority -1
| ^^^^^^^^
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: don't assert on scope underflows  Florian Westphal  2023-06-20  2  -2/+7
|
| close_scope() gets called from the object destructors;
| imbalance can cause us to hit assert().
|
| Before:
| nft: parser_bison.y:88: close_scope: Assertion `state->scope > 0' failed.
| After:
| assertion3:4:7-7: Error: too many levels of nesting jump {
| assertion3:5:8-8: Error: too many levels of nesting jump
| assertion3:5:9-9: Error: syntax error, unexpected newline, expecting '{'
| assertion3:7:1-1: Error: syntax error, unexpected end of file
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: do not abort when prefix map has non-map element  Florian Westphal  2023-06-20  3  -4/+32
|
| Before:
| nft: evaluate.c:1849: __mapping_expr_expand: Assertion `i->etype == EXPR_MAPPING' failed.
|
| After:
| Error: expected mapping, not set element
| snat ip prefix to ip saddr map { 10.141.11.0/24 : 192.168.2.0/24, 10.141.12.1 }
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* json: dccp: remove erroneous const qualifier  Florian Westphal  2023-06-20  1  -1/+1
|
| This causes a clang warning:
| parser_json.c:767:6: warning: variable 'opt_type' is uninitialized when used here [-Wuninitialized]
|         if (opt_type < DCCPOPT_TYPE_MIN || opt_type > DCCPOPT_TYPE_MAX) {
|             ^~~~~~~~
| ... because it deduces the object is readonly.
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* json: add inner payload support  Pablo Neira Ayuso  2023-06-20  6  -6/+1122
| | | | | | Add support for vxlan, geneve, gre and gretap. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add json support for last statement  Pablo Neira Ayuso  2023-06-20  5  -0/+49
|
| This patch adds json support for the last statement, it works for me here.
|
| However, tests/py still displays a warning:
|
| any/last.t: WARNING: line 12: '{"nftables": [{"add": {"rule": {"family": "ip", "table": "test-ip4", "chain": "input", "expr": [{"last": {"used": 300000}}]}}}]}': '[{"last": {"used": 300000}}]' mismatches '[{"last": null}]'
|
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: include set elements in "nft set list"  Florian Westphal  2023-06-19  2  -7/+3
| | | | | | | | | | | | | | | | | Make "nft list sets" include set elements in listing by default. In nftables 1.0.0, "nft list sets" did not include the set elements, but with "--json" they were included. 1.0.1 and newer never include them. This causes a problem for people updating from 1.0.0 and relying on the presence of the set elements. Change nftables to always include the set elements. The "--terse" option is honored to get the "no elements" behaviour. Fixes: a1a6b0a5c3c4 ("cache: finer grain cache population for list commands") Link: https://marc.info/?l=netfilter&m=168704941828372&w=2 Signed-off-by: Florian Westphal <fw@strlen.de>
* tests: shell: bogus EBUSY errors in transactions  Pablo Neira Ayuso  2023-06-19  1  -0/+121
|
| Make sure reference tracking during transaction update is correct by
| checking for bogus EBUSY error. For example, when deleting map with
| chain reference X, followed by a delete chain X command.
|
| This test is covering the following paths:
|
| - prepare + abort (via -c/--check option)
| - prepare + commit
| - release (when netns is destroyed)
|
| Reported-by: Florian Westphal <fw@strlen.de>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: add test case for chain-in-use-splat  Florian Westphal  2023-06-16  1  -0/+19
|
| WARNING [.]: at net/netfilter/nf_tables_api.c:1885
| 6.3.4-201.fc38.x86_64 #1
| nft_immediate_destroy+0xc1/0xd0 [nf_tables]
| __nf_tables_abort+0x4b9/0xb20 [nf_tables]
| nf_tables_abort+0x39/0x50 [nf_tables]
| nfnetlink_rcv_batch+0x47c/0x8e0 [nfnetlink]
| nfnetlink_rcv+0x179/0x1a0 [nfnetlink]
| netlink_unicast+0x19e/0x290
|
| This is because of a chain->use underflow: at the time the destroy
| function is called, ->use has wrapped back to -1.
|
| Fixed via
| "netfilter: nf_tables: fix chain binding transaction logic".
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* tests: shell: fix spurious errors in terse listing in json  Pablo Neira Ayuso  2023-06-11  1  -1/+1
|
| Sometimes the table handle becomes 192, which makes this test fail. Check
| for 192.168 instead to make sure terse listing works fine.
|
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* exthdr: add boolean DCCP option matching  Jeremy Sowden  2023-06-01  16  -0/+445
| | | | | | | | | | Iptables supports the matching of DCCP packets based on the presence or absence of DCCP options. Extend exthdr expressions to add this functionality to nftables. Link: https://bugzilla.netfilter.org/show_bug.cgi?id=930 Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: extend tests for destroy command  Fernando Fernandez Mancera  2023-06-01  18  -0/+74
| | | | | | | | | Extend tests to cover destroy command for chains, flowtables, sets, maps. In addition rename a destroy command test for rules with a duplicated number. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: permit use of constant values in set lookup keys  Florian Westphal  2023-05-24  4  -0/+37
|
| Given:
|    set s { type ipv4_addr . ipv4_addr . inet_service .. }
|
| something like
|    add rule ip saddr . 1.2.3.4 . 80 @s goto c1
|
| fails with: "Error: Can't parse symbolic invalid expressions".
|
| This fails because the relational expression first evaluates
| the left hand side, so when concat evaluation sees '1.2.3.4'
| no key context is available.
|
| Check if the RHS is a set reference, and, if so, evaluate
| the right hand side.
|
| This sets a pointer to the set key in the evaluation context
| structure which then makes the concat evaluation step parse
| 1.2.3.4 and 80 as ipv4 address and 16bit port number.
|
| On delinearization, extend relop postprocessing to
| copy the datatype from the rhs (set reference, has
| proper datatype according to set->key) to the lhs (concat
| expression).
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
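The idea of evaluating the set reference first, so that its key types give context to the constants on the left hand side, can be sketched in Python. This is only an illustration under stated assumptions: the Expr class, the PARSERS table, and both function names are hypothetical and do not mirror nft's internal structures.

```python
import ipaddress

# Hypothetical parsers, keyed by nft datatype name.
PARSERS = {
    "ipv4_addr": lambda text: int(ipaddress.IPv4Address(text)),
    "inet_service": lambda text: int(text),
}


class Expr:
    """A non-constant expression such as 'ip saddr', kept symbolic here."""

    def __init__(self, name):
        self.name = name


def evaluate_concat(components, key_types=None):
    """Evaluate a concatenation; constants need key_types to be parsed."""
    if key_types is None:
        for comp in components:
            if not isinstance(comp, Expr):
                raise ValueError("cannot parse %r without key context" % comp)
        return list(components)
    if len(components) != len(key_types):
        raise ValueError("concatenation does not match the set key")
    result = []
    for comp, dtype in zip(components, key_types):
        result.append(comp if isinstance(comp, Expr) else PARSERS[dtype](comp))
    return result


def evaluate_lookup(lhs, set_key_types):
    """Evaluate the set reference first, then the concat with its key types."""
    return evaluate_concat(lhs, key_types=set_key_types)


key = ["ipv4_addr", "ipv4_addr", "inet_service"]
lhs = [Expr("ip saddr"), "1.2.3.4", "80"]
print(evaluate_lookup(lhs, key)[1:])
```

Without the key types, the constant "1.2.3.4" has no datatype to be parsed against, which corresponds to the error quoted in the commit message.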
* mnl: support bpf id decode in nft list hooks  Florian Westphal  2023-05-22  2  -3/+61
|
| This allows 'nft list hooks' to also display the bpf program id
| attached. Example:
|
| hook input {
|     -0000000128 nf_hook_run_bpf id 6
| ..
|
| Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: set NFT_SET_EVAL flag if dynamic set already exists  Pablo Neira Ayuso  2023-05-18  1  -0/+8
|
| nft reports EEXIST when reading an existing set whose NFT_SET_EVAL has
| been previously inferred from the ruleset.
|
|  # cat test.nft
|  table ip test {
|          set dlist {
|                  type ipv4_addr
|                  size 65535
|          }
|
|          chain output {
|                  type filter hook output priority filter; policy accept;
|                  udp dport 1234 update @dlist { ip daddr } counter packets 0 bytes 0
|          }
|  }
|  # nft -f test.nft
|  # nft -f test.nft
|  test.nft:2:6-10: Error: Could not process rule: File exists
|   set dlist {
|       ^^^^^
|
| Phil Sutter says:
|
| In the first call, the set lacking 'dynamic' flag does not exist
| and is therefore added to the cache. Consequently, both the 'add
| set' command and the set statement point at the same set object.
| In the second call, a set with same name exists already, so the
| object created for 'add set' command is not added to cache and
| consequently not updated with the missing flag. The kernel thus
| rejects the NEWSET request as the existing set differs from the
| new one.
|
| Set the NFT_SET_EVAL flag if the existing set has it set.
|
| Fixes: 8d443adfcc8c1 ("evaluate: attempt to set_eval flag if dynamic updates requested")
| Tested-by: Eric Garver <eric@garver.life>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
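The cache behaviour Phil Sutter describes, and the fix, can be modelled with a short Python sketch. The SetObj class, the dict-based cache and evaluate_add_set() are hypothetical; the numeric value 0x20 for NFT_SET_EVAL is stated here as an assumption about the kernel uapi header, not something given in this commit message.

```python
NFT_SET_EVAL = 0x20  # assumed value of the uapi flag, used only for illustration


class SetObj:
    """Hypothetical set object as created for an 'add set' command."""

    def __init__(self, name, flags=0):
        self.name = name
        self.flags = flags


def evaluate_add_set(cache, new_set):
    """Evaluate 'add set': reuse the flag of an already existing set."""
    existing = cache.get(new_set.name)
    if existing is None:
        # First load: the new object enters the cache, so a later set
        # statement can update its flags directly.
        cache[new_set.name] = new_set
        return new_set
    # Second load: the new object is not cached, so copy the inferred flag
    # to keep the NEWSET request identical to the existing set.
    if existing.flags & NFT_SET_EVAL:
        new_set.flags |= NFT_SET_EVAL
    return new_set


cache = {"dlist": SetObj("dlist", NFT_SET_EVAL)}
reloaded = evaluate_add_set(cache, SetObj("dlist"))
print(bool(reloaded.flags & NFT_SET_EVAL))
```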
* datatype: add hint error handler  Pablo Neira Ayuso  2023-05-11  2  -2/+40
|
| If user provides a symbol that cannot be parsed and the datatype provides
| an error handler, provide a hint through the misspell infrastructure.
|
| For instance:
|
|  # cat test.nft
|  table ip x {
|          map y {
|                  typeof ip saddr : verdict
|                  elements = { 1.2.3.4 : filter_server1 }
|          }
|  }
|  # nft -f test.nft
|  test.nft:4:26-39: Error: Could not parse netfilter verdict; did you mean `jump filter_server1'?
|                  elements = { 1.2.3.4 : filter_server1 }
|                                         ^^^^^^^^^^^^^^
|
| While at it, normalize error to "Could not parse symbolic %s expression".
|
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: misspell support with symbol table parser for error reporting  Pablo Neira Ayuso  2023-05-11  1  -2/+48
|
| Some datatypes provide a symbol table that is parsed as an integer.
| Improve error reporting by using the misspell infrastructure, to provide
| a hint to the user, whenever possible.
|
| If base datatype, usually the integer datatype, fails to parse the
| symbol, then try a fuzzy match on the symbol table to provide a hint
| in case the user has mistyped it.
|
| For instance:
|
| test.nft:3:11-14: Error: Could not parse Differentiated Services Code Point expression; did you you mean `cs0`?
|                 ip dscp ccs0
|                         ^^^^
|
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
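The fallback from integer parsing to a fuzzy symbol match can be sketched in Python. Note that this swaps in the standard library's difflib similarity ratio as a stand-in for nft's own misspell infrastructure, and the symbol list is an abbreviated, illustrative subset of DSCP names; the hint() helper is hypothetical.

```python
import difflib

# Illustrative subset of a DSCP symbol table.
DSCP_SYMBOLS = ["cs0", "cs1", "cs2", "af11", "af12", "ef"]


def hint(symbol, table):
    """Return the closest known symbol, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(symbol, table, n=1)
    return matches[0] if matches else None


def parse_symbol(symbol, table):
    """Try the base integer parser first, then fall back to a fuzzy hint."""
    try:
        return int(symbol, 0)
    except ValueError:
        guess = hint(symbol, table)
        if guess is None:
            raise ValueError("Could not parse %r" % symbol)
        raise ValueError("Could not parse %r; did you mean `%s'?" % (symbol, guess))


print(hint("ccs0", DSCP_SYMBOLS))
```

Calling parse_symbol("ccs0", DSCP_SYMBOLS) raises an error that carries the suggestion, while a numeric string such as "0x2e" is parsed by the integer path without consulting the table.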
* optimize: do not remove counter in verdict maps  Pablo Neira Ayuso  2023-05-10  3  -7/+51
|
| Add counter to set element instead of dropping it:
|
|  # nft -c -o -f test.nft
|  Merging:
|  test.nft:6:3-50:                ip saddr 1.1.1.1 ip daddr 2.2.2.2 counter accept
|  test.nft:7:3-48:                ip saddr 1.1.1.2 ip daddr 3.3.3.3 counter drop
|  into:
|         ip daddr . ip saddr vmap { 2.2.2.2 . 1.1.1.1 counter : accept, 3.3.3.3 . 1.1.1.2 counter : drop }
|
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: skip optimization if anonymous set uses stateful statement  Pablo Neira Ayuso  2023-05-10  3  -1/+5
| | | | | | | | fee6bda06403 ("evaluate: remove anon sets with exactly one element") introduces an optimization to remove use of sets with single element. Skip this optimization if set element contains stateful statements. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: allow stateful statements with anonymous verdict maps  Pablo Neira Ayuso  2023-05-10  3  -3/+4
|
| Evaluation fails to accept stateful statements in verdict maps, relax the
| following check for anonymous sets:
|
| test.nft:4:29-35: Error: missing statement in map declaration
|                 ip saddr vmap { 127.0.0.1 counter : drop, * counter : accept }
|                                           ^^^^^^^
|
| The existing code correctly generates the counter in the anonymous
| verdict map.
|
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: restore typeof interval map data type  Florian Westphal  2023-05-02  2  -3/+8
| | | | | | | | | | | | | When "typeof ... : interval ..." gets used, existing logic failed to validate the expressions. "interval" means that kernel reserves twice the size, so consider this when validating and restoring. Also fix up the dump file of the existing test case to be symmetrical. Signed-off-by: Florian Westphal <fw@strlen.de>
* doc: add nat examples  Florian Westphal  2023-05-02  1  -2/+51
| | | | | | | | | | | | | | nftables nat is much more capable than what the existing documentation describes. In particular, nftables can fully emulate iptables NETMAP target and can perform n:m address mapping. Add a new example section extracted from commit log messages when those features got added. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
* doc: list set/map flag keywords in a table  Florian Westphal  2023-05-02  1  -3/+18
| | | | | | add descriptions of the set/map flags. Signed-off-by: Florian Westphal <fw@strlen.de>
* meta: introduce meta broute support  Sriram Yagnaraman  2023-04-29  7  -1/+31
| | | | | | | | | | | Can be used in bridge prerouting hook to divert a packet to the ip stack for routing. This is a replacement for "ebtables -t broute" functionality. Link: https://patchwork.ozlabs.org/project/netfilter-devel/patch/20230224095251.11249-1-sriram.yagnaraman@est.tech/ Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Florian Westphal <fw@strlen.de>
* doc: correct NAT statement description  Jeremy Sowden  2023-04-29  1  -1/+1
| | | | | | | | Specifying a port specifies that a port, not an address, should be modified. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>