path: root/src
Commit message | Author | Age | Files | Lines
* Implement 'reset {set,map,element}' commands | Phil Sutter | 2023-07-13 | 7 | -13/+62
  All these are used to reset state in set/map elements, i.e. reset the timeout or zero quota and counter values.
  While 'reset element' expects a (list of) elements to be specified which should be reset, 'reset set/map' will reset all elements in the given set/map.
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Cache looked up set for list commands | Phil Sutter | 2023-07-13 | 3 | -7/+15
  Evaluation phase checks the given table and set exist in cache. Relieve execution phase from having to perform the lookup again by storing the set reference in cmd->set. Just have to increase the ref counter so cmd_free() does the right thing (which lacked handling of MAP and METER objects for some reason).
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Merge some cases in cmd_evaluate_list() | Phil Sutter | 2023-07-13 | 1 | -32/+4
  The code for set, map and meter was almost identical apart from the specific last check. Fold them together and make the distinction in that spot only.
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* libnftables: drop check for nf_sock in nft_ctx_free() | Thomas Haller | 2023-07-10 | 1 | -2/+1
  The "nft_ctx" API does not provide a way to change or reconnect the netlink socket, and none of the users would rely on that. Also note that nft_ctx_new() initializes nf_sock via nft_mnl_socket_open(), which panics if the socket could not be initialized. This means the check is unnecessary and needlessly confusing. Drop it.
  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: inline creation of nf_sock in nft_ctx_new() | Thomas Haller | 2023-07-10 | 1 | -6/+1
  The function only has one caller. It's not clear how this could be extended in a useful way such that keeping the initialization in a separate function would make sense. Simplify the code by inlining and dropping the static function nft_ctx_netlink_init().
  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: drop unused argument nf_sock from nft_netlink() | Thomas Haller | 2023-07-10 | 1 | -4/+3
  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: always initialize netlink socket in nft_ctx_new() | Thomas Haller | 2023-07-10 | 1 | -2/+1
  nft_ctx_new() has a flags argument, but currently no flags are supported. The documentation suggests to pass 0 (NFT_CTX_DEFAULT).
  Initializing the netlink socket happens by default already, we should do it for all flags. Also because nft_ctx_netlink_init() is not public API so it's not clear how the user gets a functioning context instance otherwise.
  If we ever want to not initialize the netlink socket for a context instance, then there should be a dedicated flag for doing that (and additional API for making that mode of operation usable).
  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: place byteorder conversion before rshift in payload statement | Pablo Neira Ayuso | 2023-07-08 | 1 | -1/+11
  For a bitfield that spans more than one byte, such as ip6 dscp, byteorder conversion needs to be done before rshift. Add unary expression for this conversion only in the case of meta and ct statements.
  Before this patch:
      # nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
      ip6 x y
        [ payload load 2b @ network header + 0 => reg 1 ]
        [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
        [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
        [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <--------- incorrect
        [ meta set mark with reg 1 ]
  After this patch:
      # nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
      ip6 x y
        [ payload load 2b @ network header + 0 => reg 1 ]
        [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
        [ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <-------- correct
        [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
        [ meta set mark with reg 1 ]
  For the matching case, binary transfer already deals with the rshift to adjust left and right hand side of the expression, the unary conversion is not needed in such case.
  Fixes: 8221d86e616b ("tests: py: add test-cases for ct and packet mark payload expressions")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
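The ordering problem in this commit can be checked with a small worked example. The sketch below is Python, not nftables code, and it models a little-endian host. The helper names and the sample header bytes are assumptions made for illustration; only the mask 0x0000c00f and the shift by 6 come from the debug output in the commit message.

```python
# Model of the two register sequences on a little-endian host.
# Helper names and the sample bytes are illustrative, not nftables code.

MASK = 0x0000C00F  # ip6 dscp mask as printed by --debug=netlink
SHIFT = 6          # rshift amount from the same output


def load_reg(payload: bytes) -> int:
    """Load 2 payload bytes the way a little-endian host sees them."""
    return int.from_bytes(payload[:2], "little")


def ntoh16(value: int) -> int:
    """Swap a 16-bit value from network order to little-endian host order."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")


def dscp_shift_then_swap(payload: bytes) -> int:
    """Old order: shift the network-ordered value, then convert."""
    reg = load_reg(payload) & MASK
    reg >>= SHIFT
    return ntoh16(reg)


def dscp_swap_then_shift(payload: bytes) -> int:
    """New order: convert to host order first, then shift."""
    reg = load_reg(payload) & MASK
    reg = ntoh16(reg)
    return reg >> SHIFT


# Version 6, traffic class 0xb8 (DSCP 46, ECN 0), flow label starting with 0.
header = bytes([0x6B, 0x80])
print(dscp_shift_then_swap(header))  # 2, not the DSCP value
print(dscp_swap_then_shift(header))  # 46
```

Shifting before the conversion moves bits across the byte boundary while the two bytes are still in network order, which is why the old sequence yields 2 instead of 46 here.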
* netlink_linearize: use div_round_up in byteorder length | Pablo Neira Ayuso | 2023-07-06 | 1 | -1/+1
  Use div_round_up() to calculate the byteorder length, otherwise fields whose length in bits is not a multiple of BITS_PER_BYTE (len % BITS_PER_BYTE != 0) are not considered by the byteorder expression.
  Reported-by: Florian Westphal <fw@strlen.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
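The difference between truncating and rounding-up division is easy to see with a few bit lengths. This is a minimal Python sketch of the arithmetic only; the function mirrors the name used in the commit, but the body is an assumption based on the usual definition of a round-up division, not a copy of the nftables helper.

```python
BITS_PER_BYTE = 8


def div_round_up(n: int, d: int) -> int:
    """Integer division that rounds up instead of truncating."""
    return (n + d - 1) // d


# Compare truncating division with the round-up variant.
for bits in (8, 12, 16, 20):
    print(bits, bits // BITS_PER_BYTE, div_round_up(bits, BITS_PER_BYTE))
```

A 12-bit field gives 1 byte with plain division but needs 2 bytes, which is what the round-up variant returns.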
* cli: Make cli_init() return to caller | Phil Sutter | 2023-07-04 | 1 | -21/+42
  Avoid direct exit() calls as that leaves the caller-allocated nft_ctx object in place. Making sure it is freed helps with valgrind-analyses at least.
  To signal desired exit from CLI, introduce global cli_quit boolean and make all cli_exit() implementations also set cli_rc variable to the appropriate return code.
  The logic is to finish CLI only if cli_quit is true which asserts proper cleanup as it is set only by the respective cli_exit() function.
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* main: Call nft_ctx_free() before exiting | Phil Sutter | 2023-07-04 | 1 | -17/+19
  Introduce labels for failure and regular exit so all direct exit() calls after nft_ctx allocation may be replaced by a single goto statement.
  Simply drop that return call in interactive branch, code will continue at 'out' label naturally.
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* main: Make 'buf' variable branch-local | Phil Sutter | 2023-07-04 | 1 | -2/+4
  It is used only to linearize non-option argv for passing to nft_run_cmd_from_buffer(), reduce its scope. Allows to safely move the free() call there, too.
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* expression: define .clone for catchall set element | Pablo Neira Ayuso | 2023-06-30 | 1 | -2/+13
  Otherwise reuse of catchall set element expression in variable triggers a null-pointer dereference.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: avoid IPPROTO_MAX for array definitions | Florian Westphal | 2023-06-21 | 2 | -4/+3
  The IP header can only accommodate an 8-bit value, but IPPROTO_MAX has been bumped due to uapi reasons to support MPTCP (262, which is used to toggle on multipath support in tcp).
  This results in:
      exthdr.c:349:11: warning: result of comparison of constant 263 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
          if (type < array_size(exthdr_protocols))
              ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Reduce array sizes back to what can be used on-wire.
  Signed-off-by: Florian Westphal <fw@strlen.de>
* ct timeout: fix 'list object x' vs. 'list objects in table' confusion | Florian Westphal | 2023-06-20 | 4 | -1/+4
  <empty ruleset>
      $ nft list ct timeout table t
      Error: No such file or directory
      list ct timeout table t
                            ^
  This is expected to list all 'ct timeout' objects. The failure is correct, the table 't' does not exist.
  But now let's add one:
      $ nft add table t
      $ nft list ct timeout table t
      Segmentation fault (core dumped)
  ... and that's not expected, nothing should be shown and nft should exit normally.
  Because of missing TIMEOUTS command enum, the backend thinks it should do an object lookup, but as frontend asked for 'list of objects' rather than 'show this object', handle.obj.name is NULL, which then results in this crash.
  Update the command enums so that backend knows what the frontend asked for.
  Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: reject zero-length interface names in flowtables | Florian Westphal | 2023-06-20 | 1 | -8/+12
  Previous patch wasn't enough, also disable this for flowtable device lists.
  Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: reject zero-length interface names | Florian Westphal | 2023-06-20 | 1 | -5/+31
  device "" results in an assertion during evaluation.
  Before:
      nft: expression.c:426: constant_expr_alloc: Assertion `(((len) + (8) - 1) / (8)) > 0' failed.
  After:
      zero_length_devicename_assert:3:42-49: Error: you cannot set an empty interface name
      type filter hook ingress device""lo" priority -1
                               ^^^^^^^^
  Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: don't assert on scope underflows | Florian Westphal | 2023-06-20 | 1 | -2/+1
  close_scope() gets called from the object destructors; imbalance can cause us to hit assert().
  Before:
      nft: parser_bison.y:88: close_scope: Assertion `state->scope > 0' failed.
  After:
      assertion3:4:7-7: Error: too many levels of nesting
      jump {
      assertion3:5:8-8: Error: too many levels of nesting
      jump
      assertion3:5:9-9: Error: syntax error, unexpected newline, expecting '{'
      assertion3:7:1-1: Error: syntax error, unexpected end of file
  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: do not abort when prefix map has non-map element | Florian Westphal | 2023-06-20 | 1 | -4/+13
  Before:
      nft: evaluate.c:1849: __mapping_expr_expand: Assertion `i->etype == EXPR_MAPPING' failed.
  After:
      Error: expected mapping, not set element
      snat ip prefix to ip saddr map { 10.141.11.0/24 : 192.168.2.0/24, 10.141.12.1 }
  Signed-off-by: Florian Westphal <fw@strlen.de>
* json: dccp: remove erroneous const qualifier | Florian Westphal | 2023-06-20 | 1 | -1/+1
  This causes a clang warning:
      parser_json.c:767:6: warning: variable 'opt_type' is uninitialized when used here [-Wuninitialized]
          if (opt_type < DCCPOPT_TYPE_MIN || opt_type > DCCPOPT_TYPE_MAX) {
              ^~~~~~~~
  ... because it deduces the object is readonly.
  Signed-off-by: Florian Westphal <fw@strlen.de>
* json: add inner payload support | Pablo Neira Ayuso | 2023-06-20 | 2 | -6/+62
  Add support for vxlan, geneve, gre and gretap.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add json support for last statement | Pablo Neira Ayuso | 2023-06-20 | 3 | -0/+31
  This patch adds json support for the last statement, it works for me here.
  However, tests/py still displays a warning:
      any/last.t: WARNING: line 12: '{"nftables": [{"add": {"rule": {"family": "ip", "table": "test-ip4", "chain": "input", "expr": [{"last": {"used": 300000}}]}}}]}': '[{"last": {"used": 300000}}]' mismatches '[{"last": null}]'
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: include set elements in "nft set list" | Florian Westphal | 2023-06-19 | 2 | -7/+3
  Make "nft list sets" include set elements in listing by default. In nftables 1.0.0, "nft list sets" did not include the set elements, but with "--json" they were included.
  1.0.1 and newer never include them. This causes a problem for people updating from 1.0.0 and relying on the presence of the set elements.
  Change nftables to always include the set elements. The "--terse" option is honored to get the "no elements" behaviour.
  Fixes: a1a6b0a5c3c4 ("cache: finer grain cache population for list commands")
  Link: https://marc.info/?l=netfilter&m=168704941828372&w=2
  Signed-off-by: Florian Westphal <fw@strlen.de>
* exthdr: add boolean DCCP option matching | Jeremy Sowden | 2023-06-01 | 8 | -0/+320
  Iptables supports the matching of DCCP packets based on the presence or absence of DCCP options. Extend exthdr expressions to add this functionality to nftables.
  Link: https://bugzilla.netfilter.org/show_bug.cgi?id=930
  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: permit use of constant values in set lookup keys | Florian Westphal | 2023-05-24 | 2 | -0/+15
  Given:
      set s { type ipv4_addr . ipv4_addr . inet_service .. }
  something like
      add rule ip saddr . 1.2.3.4 . 80 @s goto c1
  fails with: "Error: Can't parse symbolic invalid expressions".
  This fails because the relational expression first evaluates the left hand side, so when concat evaluation sees '1.2.3.4' no key context is available.
  Check if the RHS is a set reference, and, if so, evaluate the right hand side. This sets a pointer to the set key in the evaluation context structure which then makes the concat evaluation step parse 1.2.3.4 and 80 as ipv4 address and 16bit port number.
  On delinearization, extend relop postprocessing to copy the datatype from the rhs (set reference, has proper datatype according to set->key) to the lhs (concat expression).
  Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: support bpf id decode in nft list hooks | Florian Westphal | 2023-05-22 | 1 | -0/+40
  This allows 'nft list hooks' to also display the bpf program id attached. Example:
      hook input {
          -0000000128 nf_hook_run_bpf id 6
          ..
  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: set NFT_SET_EVAL flag if dynamic set already exists | Pablo Neira Ayuso | 2023-05-18 | 1 | -0/+8
  nft reports EEXIST when reading an existing set whose NFT_SET_EVAL has been previously inferred from the ruleset.
      # cat test.nft
      table ip test {
              set dlist {
                      type ipv4_addr
                      size 65535
              }
              chain output {
                      type filter hook output priority filter; policy accept;
                      udp dport 1234 update @dlist { ip daddr } counter packets 0 bytes 0
              }
      }
      # nft -f test.nft
      # nft -f test.nft
      test.nft:2:6-10: Error: Could not process rule: File exists
      set dlist {
          ^^^^^
  Phil Sutter says:
      In the first call, the set lacking 'dynamic' flag does not exist and is therefore added to the cache. Consequently, both the 'add set' command and the set statement point at the same set object. In the second call, a set with same name exists already, so the object created for 'add set' command is not added to cache and consequently not updated with the missing flag. The kernel thus rejects the NEWSET request as the existing set differs from the new one.
  Set the NFT_SET_EVAL flag if the existing set has it set.
  Fixes: 8d443adfcc8c1 ("evaluate: attempt to set_eval flag if dynamic updates requested")
  Tested-by: Eric Garver <eric@garver.life>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: add hint error handler | Pablo Neira Ayuso | 2023-05-11 | 1 | -2/+39
  If the user provides a symbol that cannot be parsed and the datatype provides an error handler, provide a hint through the misspell infrastructure.
  For instance:
      # cat test.nft
      table ip x {
              map y {
                      typeof ip saddr : verdict
                      elements = { 1.2.3.4 : filter_server1 }
              }
      }
      # nft -f test.nft
      test.nft:4:26-39: Error: Could not parse netfilter verdict; did you mean `jump filter_server1'?
      elements = { 1.2.3.4 : filter_server1 }
                             ^^^^^^^^^^^^^^
  While at it, normalize error to "Could not parse symbolic %s expression".
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: misspell support with symbol table parser for error reporting | Pablo Neira Ayuso | 2023-05-11 | 1 | -2/+48
  Some datatypes provide a symbol table that is parsed as an integer. Improve error reporting by using the misspell infrastructure, to provide a hint to the user, whenever possible.
  If the base datatype, usually the integer datatype, fails to parse the symbol, then try a fuzzy match on the symbol table to provide a hint in case the user has mistyped it.
  For instance:
      test.nft:3:11-14: Error: Could not parse Differentiated Services Code Point expression; did you you mean `cs0`?
      ip dscp ccs0
              ^^^^
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
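The fallback order described here (integer parse, then symbol lookup, then a fuzzy hint) can be sketched as follows. This is a Python illustration only: it uses the standard library's difflib as a stand-in for the nftables misspell infrastructure, and the function name and the reduced symbol table are assumptions.

```python
import difflib

# Small subset of DSCP symbols, for illustration only.
DSCP_SYMBOLS = {"cs0": 0, "cs1": 8, "af11": 10, "af12": 12, "cs2": 16, "ef": 46}


def parse_dscp(text: str) -> int:
    """Parse a DSCP value given as an integer or as a symbol.

    On failure, raise ValueError with a "did you mean" hint when a
    close symbol exists in the table.
    """
    # Base datatype first: try to parse the input as an integer.
    try:
        return int(text, 0)
    except ValueError:
        pass
    # Exact symbol lookup.
    if text in DSCP_SYMBOLS:
        return DSCP_SYMBOLS[text]
    # Fuzzy match on the symbol table to build a hint (difflib is a
    # stand-in here, not the algorithm nftables uses).
    close = difflib.get_close_matches(text, list(DSCP_SYMBOLS), n=1)
    hint = f"; did you mean `{close[0]}`?" if close else ""
    raise ValueError(
        f"Could not parse Differentiated Services Code Point expression{hint}"
    )


try:
    parse_dscp("ccs0")
except ValueError as err:
    print(err)
```

The hint is only appended when a sufficiently close symbol is found, so input that resembles nothing in the table still gets the plain error.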
* optimize: do not remove counter in verdict maps | Pablo Neira Ayuso | 2023-05-10 | 1 | -7/+43
  Add counter to set element instead of dropping it:
      # nft -c -o -f test.nft
      Merging:
      test.nft:6:3-50: ip saddr 1.1.1.1 ip daddr 2.2.2.2 counter accept
      test.nft:7:3-48: ip saddr 1.1.1.2 ip daddr 3.3.3.3 counter drop
      into:
          ip daddr . ip saddr vmap { 2.2.2.2 . 1.1.1.1 counter : accept, 3.3.3.3 . 1.1.1.2 counter : drop }
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: skip optimization if anonymous set uses stateful statement | Pablo Neira Ayuso | 2023-05-10 | 1 | -1/+1
  fee6bda06403 ("evaluate: remove anon sets with exactly one element") introduces an optimization to remove use of sets with single element. Skip this optimization if set element contains stateful statements.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: allow stateful statements with anonymous verdict maps | Pablo Neira Ayuso | 2023-05-10 | 1 | -1/+2
  Evaluation fails to accept stateful statements in verdict maps, relax the following check for anonymous sets:
      test.nft:4:29-35: Error: missing statement in map declaration
      ip saddr vmap { 127.0.0.1 counter : drop, * counter : accept }
                                ^^^^^^^
  The existing code generates correctly the counter in the anonymous verdict map.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: restore typeof interval map data type | Florian Westphal | 2023-05-02 | 1 | -1/+6
  When "typeof ... : interval ..." gets used, existing logic failed to validate the expressions.
  "interval" means that kernel reserves twice the size, so consider this when validating and restoring.
  Also fix up the dump file of the existing test case to be symmetrical.
  Signed-off-by: Florian Westphal <fw@strlen.de>
* meta: introduce meta broute support | Sriram Yagnaraman | 2023-04-29 | 1 | -0/+2
  Can be used in bridge prerouting hook to divert a packet to the ip stack for routing.
  This is a replacement for "ebtables -t broute" functionality.
  Link: https://patchwork.ozlabs.org/project/netfilter-devel/patch/20230224095251.11249-1-sriram.yagnaraman@est.tech/
  Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* json: formatting fixes | Jeremy Sowden | 2023-04-29 | 1 | -21/+20
  A few indentation tweaks for the JSON parser.
  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: fix enum/integer mismatches | Florian Westphal | 2023-04-29 | 2 | -3/+3
  gcc 13 complains about type confusion:
      cache.c:1178:5: warning: conflicting types for 'nft_cache_update' due to enum/integer mismatch; have 'int(struct nft_ctx *, unsigned int, struct list_head *, const struct nft_cache_filter *)' [-Wenum-int-mismatch]
      cache.h:74:5: note: previous declaration of 'nft_cache_update' with type 'int(struct nft_ctx *, enum cmd_ops, struct list_head *, const struct nft_cache_filter *)'
  Same for:
      rule.c:1915:13: warning: conflicting types for 'obj_type_name' due to enum/integer mismatch; have 'const char *(enum stmt_types)' [-Wenum-int-mismatch]
       1915 | const char *obj_type_name(enum stmt_types type)
            |             ^~~~~~~~~~~~~
      expression.c:1543:24: warning: conflicting types for 'expr_ops_by_type' due to enum/integer mismatch; have 'const struct expr_ops *(uint32_t)' {aka 'const struct expr_ops *(unsigned int)'} [-Wenum-int-mismatch]
       1543 | const struct expr_ops *expr_ops_by_type(uint32_t value)
            |                        ^~~~~~~~~~~~~~~~
  Convert to the stricter type (enum) where possible.
  Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: incomplete extended error reporting for singleton device in chain | Pablo Neira Ayuso | 2023-04-25 | 1 | -0/+1
  Fix error reporting when a single device is specified in a chain:
      # nft add chain netdev filter ingress '{ devices = { x }; }'
      add chain netdev filter ingress { devices = { x }; }
                                                    ^
  Fixes: a66b5ad9540d ("src: allow for updating devices on existing netdev chain")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: handle singleton element in netdevice set | Pablo Neira Ayuso | 2023-04-25 | 1 | -14/+32
  expr_evaluate_set() turns sets with a singleton element into a value, nft_dev_add() expects a list of expressions, so it crashes.
  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1676
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: allow to specify comment on chain | Pablo Neira Ayuso | 2023-04-25 | 2 | -7/+20
  Allow users to add a comment when declaring a chain.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: allow to specify comment on table | Pablo Neira Ayuso | 2023-04-24 | 2 | -5/+18
  Allow users to add a comment when declaring a table:
      # sudo nft add table inet test3 '{comment "this is a comment";}'
      # nft list ruleset
      table inet test3 {
              comment "this is a comment"
      }
      # nft -j list ruleset
      {"nftables": [{"metainfo": {"version": "1.0.7", "release_name": "Old Doc Yak", "json_schema_version": 1}}, {"table": {"family": "inet", "name": "test3", "handle": 3, "comment": "this is a comment"}}]}
      # nft -j list ruleset > test.json
      # nft flush ruleset
      # nft -j -f test.json
      # nft -j list ruleset
      {"nftables": [{"metainfo": {"version": "1.0.7", "release_name": "Old Doc Yak", "json_schema_version": 1}}, {"table": {"family": "inet", "name": "test3", "handle": 4, "comment": "this is a comment"}}]}
  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1670
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: skip protocol context update for nfproto with same table family | Pablo Neira Ayuso | 2023-04-24 | 1 | -0/+5
  Inefficient bytecode crashes ruleset listing:
      [ meta load nfproto => reg 1 ]
      [ cmp eq reg 1 0x00000002 ] <-- this specifies NFPROTO_IPV4 but table family is IPv4!
      [ payload load 4b @ network header + 12 => reg 1 ]
      [ cmp gte reg 1 0x1000000a ]
      [ cmp lte reg 1 0x1f00000a ]
      [ masq ]
  This IPv4 table obviously only sees IPv4 traffic, but bytecode specifies a redundant match on NFPROTO_IPV4.
  After this patch, listing works:
      # nft list ruleset
      table ip crash {
              chain crash {
                      type nat hook postrouting priority srcnat; policy accept;
                      ip saddr 10.0.0.16-10.0.0.31 masquerade
              }
      }
  Skip protocol context update in case that this information is redundant.
  Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=1562
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
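The hex constants in the bytecode and the address range in the listing are the same data in different notations. A short Python sketch makes the link explicit; it assumes the debug dump was taken on a little-endian host, and the function name is made up for this example.

```python
import ipaddress


def reg_to_ipv4(value: int) -> ipaddress.IPv4Address:
    """Read a 32-bit constant from the dump as a network-order IPv4 address.

    Assumes the dump comes from a little-endian host, so the lowest byte
    of the printed constant is the first byte on the wire.
    """
    return ipaddress.IPv4Address(value.to_bytes(4, "little"))


print(reg_to_ipv4(0x1000000A))  # 10.0.0.16
print(reg_to_ipv4(0x1F00000A))  # 10.0.0.31
```

The two comparisons (gte and lte) therefore bound the source address to 10.0.0.16-10.0.0.31, matching the rule shown by the listing.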
* evaluate: bail out if new flowtable does not specify hook and priority | Pablo Neira Ayuso | 2023-04-24 | 1 | -1/+5
  If user forgets to specify the hook and priority and the flowtable does not exist, then bail out:
      # cat flowtable-incomplete.nft
      table t {
       flowtable f {
        devices = { lo }
       }
      }
      # nft -f /tmp/k
      flowtable-incomplete.nft:2:12-12: Error: missing hook and priority in flowtable declaration
      flowtable f {
                ^
  Update one existing tests/shell to specify a hook and priority.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow for updating devices on existing netdev chain | Pablo Neira Ayuso | 2023-04-24 | 5 | -63/+102
  This patch allows you to add/remove devices to an existing chain:
      # cat ruleset.nft
      table netdev x {
              chain y {
                      type filter hook ingress devices = { eth0 } priority 0; policy accept;
              }
      }
      # nft -f ruleset.nft
      # nft add chain netdev x y '{ devices = { eth1 }; }'
      # nft list ruleset
      table netdev x {
              chain y {
                      type filter hook ingress devices = { eth0, eth1 } priority 0; policy accept;
              }
      }
      # nft delete chain netdev x y '{ devices = { eth0 }; }'
      # nft list ruleset
      table netdev x {
              chain y {
                      type filter hook ingress devices = { eth1 } priority 0; policy accept;
              }
      }
  This feature allows for creating an empty netdev chain, with no devices. In such case, no packets are seen until a device is registered.
  This patch includes extended netlink error reporting:
      # nft add chain netdev x y '{ devices = { x } ; }'
      Error: Could not process rule: No such file or directory
      add chain netdev x y { devices = { x } ; }
                                         ^
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: flowtable support for extended netlink error reporting | Pablo Neira Ayuso | 2023-04-24 | 1 | -60/+82
  This patch extends existing flowtable support to improve error reporting:
      # nft add flowtable inet x y '{ devices = { x } ; }'
      Error: Could not process rule: No such file or directory
      add flowtable inet x y { devices = { x } ; }
                                           ^
      # nft delete flowtable inet x y '{ devices = { x } ; }'
      Error: Could not process rule: No such file or directory
      delete flowtable inet x y { devices = { x } ; }
                                              ^
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: set SO_SNDBUF before SO_SNDBUFFORCE | Pablo Neira Ayuso | 2023-04-24 | 3 | -5/+24
  Set SO_SNDBUF before SO_SNDBUFFORCE: Unprivileged user namespace does not have CAP_NET_ADMIN on the host (user_init_ns) namespace. SO_SNDBUF always succeeds in Linux, always try SO_SNDBUFFORCE after it.
  Moreover, suggest that the user bump socket limits if EMSGSIZE is hit after EPERM was previously seen when calling SO_SNDBUFFORCE.
  Provide a hint to the user too:
      # nft -f test.nft
      netlink: Error: Could not process rule: Message too long
      Please, rise /proc/sys/net/core/wmem_max on the host namespace. Hint: 4194304 bytes
  Dave Pifke says:
      Prior to this patch, nft inside a systemd-nspawn container was failing to install my ruleset (which includes a large-ish map), with the error
          netlink: Error: Could not process rule: Message too long
      strace reveals:
          setsockopt(3, SOL_SOCKET, SO_SNDBUFFORCE, [524288], 4) = -1 EPERM (Operation not permitted)
      This is despite the nspawn process supposedly having CAP_NET_ADMIN. A web search reveals at least one other user having the same issue:
      https://old.reddit.com/r/Proxmox/comments/scnoav/lxc_container_debian_11_nftables_geoblocking/
  Reported-by: Dave Pifke <dave@pifke.org>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
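The "unprivileged request first, forced request second" order can be sketched in a few lines. This is a Python illustration under stated assumptions, not the nftables implementation: the function name is hypothetical, a UDP socket stands in for the netlink socket, and the fallback value 32 for SO_SNDBUFFORCE is the number used on common Linux architectures (Python's socket module may not export the constant).

```python
import socket
import sys

# Assumption: 32 is SO_SNDBUFFORCE on common Linux architectures; the
# attribute is used instead if this Python build exports it.
SO_SNDBUFFORCE = getattr(socket, "SO_SNDBUFFORCE", 32)


def grow_sndbuf(sock: socket.socket, size: int) -> bool:
    """Ask for a send buffer of `size` bytes.

    Returns True if the forced request succeeded, False if it was denied
    (for example EPERM without CAP_NET_ADMIN) or not attempted.
    """
    # Unprivileged request first; Linux clamps it to wmem_max.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    if not sys.platform.startswith("linux"):
        return False  # the numeric option above is Linux-specific
    try:
        # Privileged request; may exceed wmem_max.
        sock.setsockopt(socket.SOL_SOCKET, SO_SNDBUFFORCE, size)
    except OSError:
        return False
    return True


with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    forced = grow_sndbuf(s, 4 * 1024 * 1024)
    print(forced, s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
```

When the forced call is denied, the socket still keeps whatever SO_SNDBUF granted, and the caller can remember the denial to print a wmem_max hint if a later send fails with EMSGSIZE.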
* main: Error out when combining -i/--interactive and -f/--file | Pablo Neira Ayuso | 2023-04-18 | 1 | -0/+10
  These two options are mutually exclusive, display error in that case:
      # nft -i -f test.nft
      Error: -i/--interactive and -f/--file options cannot be combined
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* optimize: support for redirect and masquerade | Pablo Neira Ayuso | 2023-04-05 | 1 | -32/+119
  The redirect and masquerade statements can be handled as verdicts:
  - if redirect statement specifies no ports.
  - masquerade statement, in any case.
  Exceptions to the rule: If redirect statement specifies ports, then nat map transformation can be used if and only if both statements specify ports.
  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1668
  Fixes: 0a6dbfce6dc3 ("optimize: merge nat rules with same selectors into map")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: do not reset protocol context for nat protocol expression | Pablo Neira Ayuso | 2023-04-05 | 1 | -3/+1
  This patch reverts 403b46ada490 ("netlink_delinearize: kill dependency before eval of 'redirect' stmt").
  Since ("evaluate: bogus missing transport protocol"), this workaround is not required anymore.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: bogus missing transport protocol | Pablo Neira Ayuso | 2023-04-05 | 1 | -3/+8
  Users have to specify a transport protocol match such as
      meta l4proto tcp
  before the redirect statement, even if the redirect statement already implicitly refers to the transport protocol, for instance:
      test.nft:3:16-53: Error: transport protocol mapping is only valid after transport protocol match
      redirect to :tcp dport map { 83 : 8083, 84 : 8084 }
      ~~~~~~~~     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  Evaluate the redirect expression before the mandatory check for the transport protocol match, so protocol context already provides a transport protocol.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* optimize: assert nat type on nat statement helper | Pablo Neira Ayuso | 2023-04-05 | 1 | -0/+4
  Add assert() to the helper function that derives an expression from a NAT statement.
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>