path: root/src/rule.c
Commit message | Author | Age | Files | Lines
* Add support for table's persist flag | Phil Sutter | 6 days | 1 | -0/+12
| Bison parser lacked support for passing multiple flags, JSON parser did not support table flags at all. Also document the 'owner' flag and describe their relationship in nft.8.
| Signed-off-by: Phil Sutter <phil@nwl.cc>
* nftables: do not merge payloads on negation | Sriram Rajagopalan | 2024-03-13 | 1 | -1/+0
| Otherwise, a rule like
|     tcp sport != 22 tcp dport != 23
| will match even if the destination port is 23, as long as sport is != 22 (or vice versa).
| Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>
| Signed-off-by: Florian Westphal <fw@strlen.de>
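Why merging breaks negation can be sketched in C. This is only an illustration of the logic, not nftables code: `match_separate()` and `match_merged()` are hypothetical names, and the merged form assumes the two adjacent 16-bit port fields are compared as one 32-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two independent comparisons, as written in the rule:
 *   tcp sport != 22 tcp dport != 23
 */
static bool match_separate(uint16_t sport, uint16_t dport)
{
	return sport != 22 && dport != 23;
}

/* Hypothetical merged form: both adjacent 16-bit fields are
 * compared as a single 32-bit value with one "!=" operation.
 */
static bool match_merged(uint16_t sport, uint16_t dport)
{
	uint32_t key = ((uint32_t)sport << 16) | dport;
	uint32_t ref = ((uint32_t)22 << 16) | 23;

	return key != ref;
}
```

With sport 1000 and dport 23, the separate form rejects the packet while the merged form accepts it, because the merged "!=" only fails when both fields are equal at once.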
* evaluate: translate meter into dynamic set | Pablo Neira Ayuso | 2024-03-12 | 1 | -1/+1
| 129f9d153279 ("nft: migrate man page examples with `meter` directive to sets") already replaced meters by dynamic sets.
| This patch removes NFT_SET_ANONYMOUS flag from the implicit set that is instantiated via meter, so the listing shows a dynamic set instead which is the recommended approach these days.
| Therefore, a batch like this:
|     add table t
|     add chain t c
|     add rule t c tcp dport 80 meter m size 128 { ip saddr timeout 1s limit rate 10/second }
| gets translated to a dynamic set:
|     table ip t {
|         set m {
|             type ipv4_addr
|             size 128
|             flags dynamic,timeout
|         }
|         chain c {
|             tcp dport 80 update @m { ip saddr timeout 1s limit rate 10/second burst 5 packets }
|         }
|     }
| Check for NFT_SET_ANONYMOUS flag is also relaxed for list and flush meter commands:
|     # nft list meter ip t m
|     table ip t {
|         set m {
|             type ipv4_addr
|             size 128
|             flags dynamic,timeout
|         }
|     }
|     # nft flush meter ip t m
| As a side effect the legacy 'list meter' and 'flush meter' commands allow to flush a dynamic set to retain backward compatibility.
| This patch updates testcases/sets/0022type_selective_flush_0 and testcases/sets/0038meter_list_0 as well as the json output which now uses the dynamic set representation.
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: fix ASAN errors in chain priority to textual names | Pablo Neira Ayuso | 2024-03-05 | 1 | -6/+9
| ASAN reports several errors when listing this ruleset:
|     table ip x {
|         chain y {
|             type filter hook input priority -2147483648; policy accept;
|         }
|     }
| src/rule.c:1002:8: runtime error: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
| src/rule.c:1001:11: runtime error: signed integer overflow: -2147483648 - 50 cannot be represented in type 'int'
| Use int64_t for the offset to avoid an underflow when calculating closest existing priority definition. Use llabs() because abs() is undefined with INT32_MIN.
| Fixes: c8a0e8c90e2d ("src: Set/print standard chain prios with textual names")
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
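The overflow-safe distance computation described above can be sketched as follows. `prio_distance()` is a hypothetical helper, not the function in src/rule.c; it only shows the widening to int64_t before subtracting and the use of llabs().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: distance between a chain priority and a
 * named standard priority. Widening to int64_t before subtracting
 * avoids signed overflow for values such as INT32_MIN, and llabs()
 * is well defined for every value the 64-bit difference can take.
 */
static int64_t prio_distance(int32_t prio, int32_t std_prio)
{
	int64_t offset = (int64_t)prio - std_prio;

	return llabs(offset);
}
```

For the ruleset in the report, `prio_distance(INT32_MIN, 50)` evaluates to 2147483698, which does not fit in a 32-bit int.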
* rule: fix sym refcount assertion | Florian Westphal | 2024-01-15 | 1 | -1/+5
| | | | | | | | | | | | Scope release must happen last. afl provided a reproducer where policy is a define, because scope is released too early we get: nft: src/rule.c:559: scope_release: Assertion `sym->refcnt == 1' failed. ... because chain->policy is EXPR_SYMBOL. Fixes: 627c451b2351 ("src: allow variables in the chain priority specification") Signed-off-by: Florian Westphal <fw@strlen.de>
* src: remove xfree() and use plain free() | Thomas Haller | 2023-11-09 | 1 | -16/+16
| xmalloc() (and similar x-functions) are used for allocation. They wrap malloc()/realloc() but will abort the program on ENOMEM.
| The meaning of xmalloc() is that it wraps malloc() but aborts on failure. I don't think x-functions should have the notion that this were potentially a different memory allocator that must be paired with a particular xfree().
| Even if the original intent was that the allocator is abstracted (and possibly not backed by standard malloc()/free()), then that doesn't seem a good idea. Nowadays libc allocators are pretty good, and we would need a very special use case to switch to something else. In other words, it will never happen that xmalloc() is not backed by malloc().
| Also there were a few places where an xmalloc() was already "wrongly" paired with free() (for example, iface_cache_release(), exit_cookie(), nft_run_cmd_from_buffer()).
| Or note how pid2name() returns an allocated string from fscanf(), which needs to be freed with free() (and not xfree()). This requirement bubbles up to the callers portid2name() and name_by_portid(). This case was actually handled correctly and the buffer was freed with free(). But it shows that mixing different allocators is cumbersome to get right. Of course, we don't actually have different allocators and whether to use free() or xfree() makes no difference. The point is that xfree() serves no actual purpose except raising irrelevant questions about whether x-functions are correctly paired with xfree().
| Note that xfree() also used to accept const pointers. It is bad to do that unconditionally for all deallocations. Instead prefer to use plain free(). To free a const pointer use free_const(), which obviously wraps free(), as indicated by the name.
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add free_const() and use it instead of xfree() | Thomas Haller | 2023-11-09 | 1 | -18/+18
| Almost everywhere xmalloc() and friends are used instead of malloc(). This is almost everywhere paired with xfree().
| xfree() has two problems. First, it brings the wrong notion that xmalloc() should be paired with xfree(), as if xmalloc() would not use the plain malloc() allocator. In practice, xfree() just wraps free(), and it wouldn't make sense any other way. xfree() should go away. This will be addressed in the next commit.
| The problem addressed by this commit is that xfree() accepts a const pointer. Paired with the practice of almost always using xfree() instead of free(), all our calls to xfree() cast away constness of the pointer, regardless of whether that is necessary. Declaring a pointer as const should help us to catch wrong uses. If the xfree() function always casts away const, the compiler doesn't help.
| There are many places that rightly cast away const during free. But not all of them. Add a free_const() macro, which is like free(), but accepts const pointers. We should always make an intentional choice whether to use free() or free_const(). Having a free_const() macro makes this very common choice clearer, instead of adding a (void*) cast at many places.
| Note that we now pair xmalloc() allocations with a free() call (instead of xfree()). That inconsistency will be resolved in the next commit.
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
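A minimal sketch of the free_const() idea, assuming it is a thin macro around free() with a (void *) cast as the message describes. `struct named_obj` and `named_obj_demo()` are hypothetical and exist only to show a const member being released intentionally.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: free() for pointers that are declared const. The cast is
 * spelled out once here instead of at every call site.
 */
#define free_const(ptr) free((void *)(ptr))

/* Hypothetical object that owns a string through a const pointer. */
struct named_obj {
	const char *name;
};

/* Copies name into a hypothetical object, reads it back and
 * releases it again. Returns the length of the stored name.
 */
static size_t named_obj_demo(const char *name)
{
	struct named_obj obj;
	size_t len;
	char *copy;

	copy = malloc(strlen(name) + 1);
	if (!copy)
		return 0;
	strcpy(copy, name);
	obj.name = copy;

	len = strlen(obj.name);
	free_const(obj.name);	/* plain free(obj.name) would discard const */

	return len;
}
```

Pointers that were never const keep using plain free(), so the choice between the two stays visible at the call site.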
* rule: never merge across non-expression statements | Florian Westphal | 2023-09-29 | 1 | -4/+2
| The existing logic can merge across non-expression statements, if there is only one payload expression.
| Example:
|     ether saddr 00:11:22:33:44:55 counter ether type 8021q
| is turned into
|     counter ether saddr 00:11:22:33:44:55 ether type 8021q
| which isn't the same thing.
| Fix this up and add test cases for adjacent vlan and ip header fields. 'Counter' serves as a non-merge fence.
| Signed-off-by: Florian Westphal <fw@strlen.de>
* include: include <string.h> in <nft.h> | Thomas Haller | 2023-09-28 | 1 | -1/+0
| | | | | | | | <string.h> provides strcmp(), as such it's very basic and used everywhere. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: expand sets and maps before evaluation | Pablo Neira Ayuso | 2023-09-19 | 1 | -1/+7
| 3975430b12d9 ("src: expand table command before evaluation") moved ruleset expansion before evaluation, except for sets and maps. For sets and maps there is still a post_expand() phase.
| This patch moves sets and map expansion to allocate an independent CMD_OBJ_SETELEMS command to add elements to named set and maps which is evaluated, this consolidates the ruleset expansion to happen always before the evaluation step for all objects, except for anonymous sets and maps.
| This approach avoids an interference with the set interval code which detects overlaps and merges of adjacent ranges. This set interval routine uses set->init to maintain a cache of existing elements. Then, the post_expand() phase incorrectly expands set->init cache and it triggers bogus ENOENT errors due to incorrect bytecode (placing element addition before set creation) in combination with user declared sets using the flat syntax notation.
| Since the evaluation step (coming after the expansion) creates implicit/anonymous sets and maps, those are not expanded anymore. These anonymous sets still need to be evaluated from set_evaluate() path and the netlink bytecode generation path, ie. do_add_set(), needs to deal with anonymous sets.
| Note that, for named sets, do_add_set() does not use set->init. Such content is part of the existing cache, and the CMD_OBJ_SETELEMS command is responsible for adding elements to named sets.
| Fixes: 3975430b12d9 ("src: expand table command before evaluation")
| Reported-by: Jann Haber <jannh@selfnet.de>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: include <stdlib.h> in <nft.h> | Thomas Haller | 2023-09-11 | 1 | -1/+0
| It provides malloc()/free(), which is so basic that we need it everywhere. Include via <nft.h>.
| The ultimate purpose is to define more things in <nft.h>. While it has no corresponding C sources, <nft.h> can contain macros and static inline functions, and is a good place for things that we shall have everywhere. Since <stdlib.h> provides malloc()/free() and size_t, that is a very basic dependency that will be needed for that.
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: set internal_location for table and chain | Pablo Neira Ayuso | 2023-08-31 | 1 | -0/+2
| The JSON parser does not seem to set this, so better provide a default location.
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: simplify chain_alloc() | Pablo Neira Ayuso | 2023-08-31 | 1 | -3/+1
| | | | | | | Remove parameter to set the chain name which is only used from netlink path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove check for NULL before calling expr_free() | Pablo Neira Ayuso | 2023-08-31 | 1 | -2/+2
| | | | | | expr_free() already handles NULL pointer, remove redundant check. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use internal_location for unspecified location at allocation time | Pablo Neira Ayuso | 2023-08-31 | 1 | -7/+14
| | | | | | | Set location to internal_location instead of NULL to ensure this is always set. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: fix "const static" declaration | Thomas Haller | 2023-08-30 | 1 | -2/+2
| Gcc warns against this with "-Wextra":
|     src/rule.c:869:1: error: static is not at beginning of declaration [-Werror=old-style-declaration]
|       869 | const static struct prio_tag std_prios[] = {
|           | ^~~~~
|     src/rule.c:878:1: error: static is not at beginning of declaration [-Werror=old-style-declaration]
|       878 | const static struct prio_tag bridge_std_prios[] = {
|           | ^~~~~
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
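The fix amounts to putting the storage class specifier first. The sketch below shows the accepted ordering on a small lookup table; `struct prio_tag_sketch`, its fields, the table contents and `prio_name()` are assumptions for illustration and are not copied from src/rule.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for the priority tag structure. */
struct prio_tag_sketch {
	int val;
	const char *str;
};

/* "static const", not "const static": the storage class specifier
 * comes first, which keeps -Wold-style-declaration quiet.
 * The entries are illustrative only.
 */
static const struct prio_tag_sketch std_prios_sketch[] = {
	{ -300, "raw" },
	{ -150, "mangle" },
	{    0, "filter" },
};

/* Returns the name for an exact priority value, or NULL. */
static const char *prio_name(int val)
{
	size_t i;

	for (i = 0; i < sizeof(std_prios_sketch) / sizeof(std_prios_sketch[0]); i++) {
		if (std_prios_sketch[i].val == val)
			return std_prios_sketch[i].str;
	}
	return NULL;
}
```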
* include: include <std{bool,int}.h> via <nft.h> | Thomas Haller | 2023-08-25 | 1 | -1/+0
| There is a minimum base that all our sources will end up needing. This is what <nft.h> provides.
| Add <stdbool.h> and <stdint.h> there. It's unlikely that we want to implement anything without having "bool" and "uint32_t" types available.
| Yes, this means the internal headers are not self-contained with respect to what <nft.h> provides. This is the exception to the rule, and our internal headers should rely on having <nft.h> included for them. They should not include <nft.h> themselves, because <nft.h> always needs to be included first. So when an internal header would include <nft.h> it would be unnecessary, because the header is *always* included already.
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add <nft.h> header and include it as first | Thomas Haller | 2023-08-25 | 1 | -0/+2
| <config.h> is generated by the configure script. As it contains our feature detection, we want to use it everywhere. Likewise, in some of our sources, we define _GNU_SOURCE. This defines the C variant we want to use. Such a define needs to come before anything else, and it would be confusing if different source files adhere to a different C variant. It would be good to use autoconf's AC_USE_SYSTEM_EXTENSIONS, in which case we would also need to ensure that <config.h> is always included first.
| Instead of going through all source files and including <config.h> first, add a new header "include/nft.h", which is supposed to be included in all our sources (and first).
| This will also allow us later to prepare some common base, like including <stdbool.h> everywhere.
| We aim that headers are self-contained, so that they can be included in any order. Which, by the way, already didn't work because some headers define _GNU_SOURCE, which would only work if the header gets included first. <nft.h> is however an exception to the rule: everything we compile shall rely on having the <nft.h> header included first. This applies to source files (which explicitly include <nft.h>) and to internal header files (which are only compiled indirectly, by being included from a source file).
| Note that <config.h> has no include guards, which is at least ugly to include multiple times. It doesn't cause problems in practice, because it only contains defines and the compiler doesn't warn about redefining a macro with the same value. Still, <nft.h> also ensures that <config.h> is included exactly once.
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nftutils: add and use wrappers for getprotoby{name,number}_r(), getservbyport_r() | Thomas Haller | 2023-08-20 | 1 | -3/+4
| We should aim to use the thread-safe variants of getprotoby{name,number} and getservbyport(). However, they may not be available with other libc, so it requires a configure check. As that is cumbersome, add wrappers that do that at one place.
| These wrappers are thread-safe, if libc provides the reentrant versions.
| Use them.
| Signed-off-by: Thomas Haller <thaller@redhat.com>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: allow ct timeouts to use time_spec values | Florian Westphal | 2023-08-03 | 1 | -3/+6
| For some reason the parser only allows raw numbers (seconds) for ct timeouts, e.g.
|     ct timeout ttcp {
|         protocol tcp;
|         policy = { syn_sent : 3, ...
| Also permit time_spec, e.g. "established : 5d". Print the nicer time formats on output, but retain raw numbers support on input for compatibility.
| Signed-off-by: Florian Westphal <fw@strlen.de>
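To illustrate what accepting both "3" and "5d" means, here is a small stand-alone sketch. `parse_time_spec()` is hypothetical and unrelated to the actual bison grammar; it only handles the d/h/m/s suffixes, treats a bare number as seconds, and does not check for overflow.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser: converts strings such as "3", "5d" or
 * "1h30m" into seconds. A bare number keeps its old meaning.
 */
static bool parse_time_spec(const char *s, uint64_t *seconds)
{
	uint64_t total = 0;

	if (*s == '\0')
		return false;

	while (*s != '\0') {
		unsigned long long n;
		char *end;

		if (!isdigit((unsigned char)*s))
			return false;

		n = strtoull(s, &end, 10);
		switch (*end) {
		case 'd': n *= 86400; end++; break;
		case 'h': n *= 3600;  end++; break;
		case 'm': n *= 60;    end++; break;
		case 's': end++; break;
		case '\0': break;	/* raw number: seconds */
		default:
			return false;
		}
		total += n;
		s = end;
	}

	*seconds = total;
	return true;
}
```

Under these assumptions "5d" yields 432000 seconds and "3" still yields 3, which mirrors the compatibility goal stated in the message.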
* ct expectation: fix 'list object x' vs. 'list objects in table' confusion | Florian Westphal | 2023-07-31 | 1 | -0/+1
| | | | | | | | | | Just like "ct timeout", "ct expectation" is in need of the same fix, we get segfault on "nft list ct expectation table t", if table t exists. This is the exact same pattern as resolved for "ct timeout" in commit 1d2e22fc0521 ("ct timeout: fix 'list object x' vs. 'list objects in table' confusion"). Signed-off-by: Florian Westphal <fw@strlen.de>
* rule: allow src/dstnat prios in input and output | Florian Westphal | 2023-07-31 | 1 | -2/+4
| | | | | | | | | | | | | | Dan Winship says: The "dnat" command is usable from either "prerouting" or "output", but the "dstnat" priority is only usable from "prerouting". (Likewise, "snat" is usable from either "postrouting" or "input", but "srcnat" is only usable from "postrouting".) No need to restrict those priorities to pre/postrouting. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1694 Signed-off-by: Florian Westphal <fw@strlen.de>
* Implement 'reset {set,map,element}' commands | Phil Sutter | 2023-07-13 | 1 | -3/+12
| | | | | | | | | | | All these are used to reset state in set/map elements, i.e. reset the timeout or zero quota and counter values. While 'reset element' expects a (list of) elements to be specified which should be reset, 'reset set/map' will reset all elements in the given set/map. Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Cache looked up set for list commands | Phil Sutter | 2023-07-13 | 1 | -4/+8
| | | | | | | | | | Evaluation phase checks the given table and set exist in cache. Relieve execution phase from having to perform the lookup again by storing the set reference in cmd->set. Just have to increase the ref counter so cmd_free() does the right thing (which lacked handling of MAP and METER objects for some reason). Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: avoid IPPROTO_MAX for array definitions | Florian Westphal | 2023-06-21 | 1 | -1/+1
| The IP header can only accommodate an 8-bit value, but IPPROTO_MAX has been bumped due to uapi reasons to support MPTCP (262, which is used to toggle on multipath support in tcp).
| This results in:
|     exthdr.c:349:11: warning: result of comparison of constant 263 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
|         if (type < array_size(exthdr_protocols))
|             ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Reduce array sizes back to what can be used on-wire.
| Signed-off-by: Florian Westphal <fw@strlen.de>
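A sketch of sizing such a table by what fits on the wire rather than by IPPROTO_MAX. The names `proto_names_sketch` and `proto_name()` are hypothetical; the array has exactly UINT8_MAX + 1 slots, so a uint8_t index is always in range and no bounds comparison is needed.

```c
#include <stddef.h>
#include <stdint.h>

#define array_size(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical lookup table indexed by the 8-bit protocol field.
 * Unlisted slots are NULL.
 */
static const char *const proto_names_sketch[UINT8_MAX + 1] = {
	[6]  = "tcp",
	[17] = "udp",
};

/* No range check: every uint8_t value is a valid index. */
static const char *proto_name(uint8_t type)
{
	return proto_names_sketch[type];
}
```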
* ct timeout: fix 'list object x' vs. 'list objects in table' confusion | Florian Westphal | 2023-06-20 | 1 | -0/+1
| <empty ruleset>
|     $ nft list ct timeout table t
|     Error: No such file or directory
|     list ct timeout table t
|                           ^
| This is expected to list all 'ct timeout' objects. The failure is correct, the table 't' does not exist.
| But now let's add one:
|     $ nft add table t
|     $ nft list ct timeout table t
|     Segmentation fault (core dumped)
| ... and that's not expected, nothing should be shown and nft should exit normally.
| Because of the missing TIMEOUTS command enum, the backend thinks it should do an object lookup, but as the frontend asked for 'list of objects' rather than 'show this object', handle.obj.name is NULL, which then results in this crash.
| Update the command enums so that the backend knows what the frontend asked for.
| Signed-off-by: Florian Westphal <fw@strlen.de>
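The distinction between "show this object" and "list all objects" can be sketched with two command values. Everything here (`enum cmd_obj_sketch`, `struct obj_sketch`, `list_ct_timeouts()`) is hypothetical; the point is only that the plural form never touches the name, so a NULL name is harmless.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical command values: singular looks up one object by
 * name, plural lists every object in the table.
 */
enum cmd_obj_sketch {
	OBJ_CT_TIMEOUT_SKETCH,
	OBJ_CT_TIMEOUTS_SKETCH,
};

struct obj_sketch {
	const char *name;
};

/* Returns how many objects would be listed. For the plural form
 * the name is never dereferenced, so it may be NULL.
 */
static int list_ct_timeouts(enum cmd_obj_sketch what, const char *name,
			    const struct obj_sketch *objs, size_t n)
{
	int listed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (what == OBJ_CT_TIMEOUT_SKETCH &&
		    strcmp(objs[i].name, name) != 0)
			continue;
		listed++;
	}
	return listed;
}
```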
* cache: include set elements in "nft set list" | Florian Westphal | 2023-06-19 | 1 | -7/+1
| | | | | | | | | | | | | | | | | Make "nft list sets" include set elements in listing by default. In nftables 1.0.0, "nft list sets" did not include the set elements, but with "--json" they were included. 1.0.1 and newer never include them. This causes a problem for people updating from 1.0.0 and relying on the presence of the set elements. Change nftables to always include the set elements. The "--terse" option is honored to get the "no elements" behaviour. Fixes: a1a6b0a5c3c4 ("cache: finer grain cache population for list commands") Link: https://marc.info/?l=netfilter&m=168704941828372&w=2 Signed-off-by: Florian Westphal <fw@strlen.de>
* src: fix enum/integer mismatches | Florian Westphal | 2023-04-29 | 1 | -2/+2
| gcc 13 complains about type confusion:
|     cache.c:1178:5: warning: conflicting types for 'nft_cache_update' due to enum/integer mismatch; have 'int(struct nft_ctx *, unsigned int, struct list_head *, const struct nft_cache_filter *)' [-Wenum-int-mismatch]
|     cache.h:74:5: note: previous declaration of 'nft_cache_update' with type 'int(struct nft_ctx *, enum cmd_ops, struct list_head *, const struct nft_cache_filter *)'
| Same for:
|     rule.c:1915:13: warning: conflicting types for 'obj_type_name' due to enum/integer mismatch; have 'const char *(enum stmt_types)' [-Wenum-int-mismatch]
|      1915 | const char *obj_type_name(enum stmt_types type)
|           |             ^~~~~~~~~~~~~
|     expression.c:1543:24: warning: conflicting types for 'expr_ops_by_type' due to enum/integer mismatch; have 'const struct expr_ops *(uint32_t)' {aka 'const struct expr_ops *(unsigned int)'} [-Wenum-int-mismatch]
|      1543 | const struct expr_ops *expr_ops_by_type(uint32_t value)
|           |                        ^~~~~~~~~~~~~~~~
| Convert to the stricter type (enum) where possible.
| Signed-off-by: Florian Westphal <fw@strlen.de>
* src: allow for updating devices on existing netdev chain | Pablo Neira Ayuso | 2023-04-24 | 1 | -2/+15
| This patch allows you to add/remove devices to an existing chain:
|     # cat ruleset.nft
|     table netdev x {
|         chain y {
|             type filter hook ingress devices = { eth0 } priority 0; policy accept;
|         }
|     }
|     # nft -f ruleset.nft
|     # nft add chain netdev x y '{ devices = { eth1 }; }'
|     # nft list ruleset
|     table netdev x {
|         chain y {
|             type filter hook ingress devices = { eth0, eth1 } priority 0; policy accept;
|         }
|     }
|     # nft delete chain netdev x y '{ devices = { eth0 }; }'
|     # nft list ruleset
|     table netdev x {
|         chain y {
|             type filter hook ingress devices = { eth1 } priority 0; policy accept;
|         }
|     }
| This feature allows for creating an empty netdev chain, with no devices. In such case, no packets are seen until a device is registered.
| This patch includes extended netlink error reporting:
|     # nft add chain netdev x y '{ devices = { x } ; }'
|     Error: Could not process rule: No such file or directory
|     add chain netdev x y { devices = { x } ; }
|                                        ^
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Avoid a memleak with 'reset rules' command | Phil Sutter | 2023-03-20 | 1 | -5/+0
| Like other 'reset' commands, 'reset rules' also lists the (part of the) ruleset which was affected to give users a chance to store the zeroed values. Therefore do_command_reset() calls do_command_list(). This in turn calls do_list_ruleset() for CMD_OBJ_RULES which wasn't prepared for values stored in cmd->handle other than a possible family value and thus freely reused the pointers as scratch area for the do_list_table() call which in the past fetched each table's data directly from kernel.
| Meanwhile ruleset listing code has been integrated into the common caching logic, the 'cmd' pointer became unused by do_list_table(). The temporary cmd->handle manipulation is not needed anymore, dropping it prevents a memleak caused by overwriting of allocated table name pointer.
| Fixes: 1694df2de79f3 ("Implement 'reset rule' and 'reset rules' commands")
| Signed-off-by: Phil Sutter <phil@nwl.cc>
* Reduce signature of do_list_table() | Phil Sutter | 2023-03-20 | 1 | -4/+3
| | | | | | | | | Since commit 16fac7d11bdf5 ("src: use cache infrastructure for rule objects"), the function does not use the passed 'cmd' object anymore. Remove it to affirm correctness of a follow-up fix and simplification in do_list_ruleset(). Signed-off-by: Phil Sutter <phil@nwl.cc>
* cmd: move command functions to src/cmd.c | Pablo Neira Ayuso | 2023-03-11 | 1 | -206/+0
| | | | | | Move several command functions to src/cmd.c to debloat src/rule.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: expand table command before evaluation | Pablo Neira Ayuso | 2023-02-24 | 1 | -1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nested syntax notation results in one single table command which includes all other objects. This differs from the flat notation where there is usually one command per object. This patch adds a previous step to the evaluation phase to expand the objects that are contained in the table into independent commands, so both notations have similar representations. Remove the code to evaluate the nested representation in the evaluation phase since commands are independently evaluated after the expansion. The commands are expanded after the set element collapse step, in case that there is a long list of singleton element commands to be added to the set, to shorten the command list iteration. This approach also avoids interference with the object cache that is populated in the evaluation, which might refer to objects coming in the existing command list that is being processed. There is still a post_expand phase to detach the elements from the set which could be consolidated by updating the evaluation step to handle the CMD_OBJ_SETELEMS command type. This patch fixes 27c753e4a8d4 ("rule: expand standalone chain that contains rules") which broke rule addition/insertion by index because the expansion code after the evaluation messes up the cache. Fixes: 27c753e4a8d4 ("rule: expand standalone chain that contains rules") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: expand standalone chain that contains rules | Pablo Neira Ayuso | 2023-02-07 | 1 | -3/+12
| Otherwise rules that this chain contains are ignored when expressed using the following syntax:
|     chain inet filter input2 {
|         type filter hook input priority filter; policy accept;
|         ip saddr 1.2.3.4 tcp dport { 22, 443, 123 } drop
|     }
| When expanding the chain, remove the rule so the new CMD_OBJ_CHAIN case does not expand it again.
| Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1655
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: add helper function to expand chain rules into commands | Pablo Neira Ayuso | 2023-02-07 | 1 | -17/+22
| | | | | | | This patch adds a helper function to expand chain rules into commands. This comes in preparation for the follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support to command "destroy" | Fernando F. Mancera | 2023-02-06 | 1 | -0/+1
| "destroy" command performs a deletion as "delete" command but does not fail if the object does not exist. As there is no NLM_F_* flag for ignoring such error, it needs to be ignored directly on error handling.
| Example of use:
|     # nft list ruleset
|     table ip filter {
|         chain output {
|         }
|     }
|     # nft destroy table ip missingtable
|     # echo $?
|     0
|     # nft list ruleset
|     table ip filter {
|         chain output {
|         }
|     }
| Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
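Ignoring the "does not exist" error on the error handling path could look roughly like this. The enum and `cmd_failed()` are hypothetical names for illustration; only ENOENT is a real errno value.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical command operations. */
enum cmd_op_sketch {
	OP_DELETE_SKETCH,
	OP_DESTROY_SKETCH,
};

/* Decides whether an error reported for a command counts as a
 * failure. "destroy" tolerates a missing object, "delete" does not.
 */
static bool cmd_failed(enum cmd_op_sketch op, int err)
{
	if (err == 0)
		return false;
	if (op == OP_DESTROY_SKETCH && err == ENOENT)
		return false;
	return true;
}
```

Any other error, for example EPERM, is still reported for both operations.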
* Implement 'reset rule' and 'reset rules' commands | Phil Sutter | 2023-01-18 | 1 | -0/+10
| | | | | | | | Reset rule counters and quotas in kernel, i.e. without having to reload them. Requires respective kernel patch to support NFT_MSG_GETRULE_RESET message type. Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: add vxlan matching support | Pablo Neira Ayuso | 2023-01-02 | 1 | -1/+2
| This patch adds the initial infrastructure to support inner header tunnel matching and its first user: vxlan.
| A new struct proto_desc field for payload and meta expression to specify that the expression refers to inner header matching is used.
| The existing codebase to generate bytecode is fully reused, allowing for reusing existing supported layer 2, 3 and 4 protocols.
| Syntax requires to specify vxlan before the inner protocol field:
|     ... vxlan ip protocol udp
|     ... vxlan ip saddr 1.2.3.0/24
| This also works with concatenations and anonymous sets, eg.
|     ... vxlan ip saddr . vxlan ip daddr { 1.2.3.4 . 4.3.2.1 }
| You have to restrict vxlan matching to udp traffic, otherwise it complains on missing transport protocol dependency, e.g.
|     ... udp dport 4789 vxlan ip daddr 1.2.3.4
| The bytecode that is generated uses the new inner expression:
|     # nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
|     netdev x y
|       [ meta load l4proto => reg 1 ]
|       [ cmp eq reg 1 0x00000011 ]
|       [ payload load 2b @ transport header + 2 => reg 1 ]
|       [ cmp eq reg 1 0x0000b512 ]
|       [ inner type 1 hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
|       [ cmp eq reg 1 0x00000008 ]
|       [ inner type 1 hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
|       [ cmp eq reg 1 0x04030201 ]
| JSON support is not included in this patch.
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Warn for tables with compat expressions in rules | Phil Sutter | 2022-11-18 | 1 | -3/+13
| | | | | | | | | | | | | | | | | | While being able to "look inside" compat expressions using nft is a nice feature, it is also (yet another) pitfall for unaware users, deceiving them into assuming interchangeability (or at least compatibility) between iptables-nft and nft. In reality, which involves 'nft list ruleset | nft -f -', any correctly translated compat expressions will turn into native nftables ones not understood by (the version of) iptables-nft which created them in the first place. Other compat expressions will vanish, potentially compromising the firewall ruleset. Emit a warning (as comment) to give users a chance to stop and reconsider before shooting their own foot. Signed-off-by: Phil Sutter <phil@nwl.cc>
* rule: do not display handle for implicit chain | Pablo Neira Ayuso | 2022-10-07 | 1 | -0/+6
| | | | | | | | | Implicit chains do not allow for incremental updates, do not display rule handle since kernel refuses to update an implicit chain which is already bound. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1615 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: check address family in set collapse | Derek Hageman | 2022-09-01 | 1 | -1/+2
| 498a5f0c219d added collapsing of set operations in different commands.
| However, the logic is currently too relaxed. It is valid to have a table and set with identical names on different address families. For example:
|     table ip a {
|         set x {
|             type inet_service;
|         }
|     }
|     table ip6 a {
|         set x {
|             type inet_service;
|         }
|     }
|     add element ip a x { 1 }
|     add element ip a x { 2 }
|     add element ip6 a x { 2 }
| The above currently results in nothing being added to the ip6 family table due to being collapsed into the ip table add. Prior to 498a5f0c219d the set add would work. The fix is simply to check the family in addition to the table and set names before allowing a collapse.
| [ Add testcase to tests/shell --pablo ]
| Fixes: 498a5f0c219d ("rule: collapse set element commands")
| Signed-off-by: Derek Hageman <hageman@inthat.cloud>
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
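The stricter collapse condition can be sketched as a three-part comparison. `struct handle_sketch`, the family constants and `can_collapse()` are hypothetical stand-ins for the real handle structure; the point is that the family is compared in addition to the table and set names.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical address family values. */
enum family_sketch {
	FAMILY_IP_SKETCH,
	FAMILY_IP6_SKETCH,
};

/* Hypothetical subset of a command handle. */
struct handle_sketch {
	enum family_sketch family;
	const char *table;
	const char *set;
};

/* Two "add element" commands may only be collapsed when family,
 * table name and set name all match.
 */
static bool can_collapse(const struct handle_sketch *a,
			 const struct handle_sketch *b)
{
	return a->family == b->family &&
	       strcmp(a->table, b->table) == 0 &&
	       strcmp(a->set, b->set) == 0;
}
```

With this check, the `ip a x` and `ip6 a x` commands from the example stay separate even though their table and set names are identical.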
* rule: crash when uncollapsing command with unexisting table or set | Pablo Neira Ayuso | 2022-07-07 | 1 | -1/+3
| | | | | | | | If ruleset update refers to an unexisting table or set, then cmd->elem.set is NULL. Fixes: 498a5f0c219d ("rule: collapse set element commands") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: store netlink error location for set elements | Pablo Neira Ayuso | 2022-06-27 | 1 | -6/+6
| Store set element location in the per-command netlink error location array. This allows for fine grain error reporting when adding and deleting elements.
|     # nft -f test.nft
|     test.nft:5:4-20: Error: Could not process rule: File exists
|        00:01:45:09:0b:26 : drop,
|        ^^^^^^^^^^^^^^^^^
| test.nft contains a large map with one redundant entry.
| Thus, users do not have to find the needle in the haystack.
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove NFT_NLATTR_LOC_MAX limit for netlink location error reporting | Pablo Neira Ayuso | 2022-06-27 | 1 | -2/+8
| | | | | | | Set might have more than 16 elements, use a runtime array to store netlink error location. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: collapse set element commands | Pablo Neira Ayuso | 2022-06-19 | 1 | -0/+75
| Robots might generate a long list of singleton element commands such as:
|     add element t s { 1.0.1.0/24 }
|     ...
|     add element t s { 1.0.2.0/23 }
| collapse them into one single command before the evaluation step, ie.
|     add element t s { 1.0.1.0/24, ..., 1.0.2.0/23 }
| this speeds up overlap detection and set element automerge operations in this worst case scenario.
| Since 3da9643fb9ff9 ("intervals: add support to automerge with kernel elements"), the new interval tracking relies on mergesort. The pattern above triggers the set sorting for each element.
| This patch adds a list to cmd objects that store collapsed commands. Moreover, expressions also contain a reference to the original command, to uncollapse the commands after the evaluation step.
| These commands are uncollapsed after the evaluation step to ensure error reporting works as expected (command and netlink message are mapped 1:1).
| For the record:
| - nftables versions <= 1.0.2 did not perform any kind of overlap check for the described scenario above (because set cache only contained elements in the kernel in this case). This is a problem for kernels < 5.7 which rely on userspace to detect overlaps.
| - the overlap detection could be skipped for kernels >= 5.7.
| - The extended netlink error reporting available for set elements since 5.19-rc might allow to remove the uncollapse step, in this case, error reporting does not rely on the netlink sequence to refer to the command triggering the problem.
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: simplify chain lookup in do_list_chain | Chander Govindarajan | 2022-05-31 | 1 | -6/+2
| | | | | | Use the chain_cache_find() function for faster chain lookup instead of iterating over all chains in the table. Signed-off-by: ChanderG <mail@chandergovind.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
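To illustrate why a cache lookup can beat walking every chain in a table, here is a small sketch of a name-keyed hash cache. It assumes a bucketed hash table; the names `chain_cache`, `cache_add()` and `cache_find()` and the djb2 hash are illustrative and do not describe the internals of chain_cache_find().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUCKETS 64	/* illustrative table size */

/* Hypothetical chain object, linked into one hash bucket. */
struct chain {
	const char *name;
	struct chain *hnext;	/* next chain in the same bucket */
};

struct chain_cache {
	struct chain *bucket[BUCKETS];
};

/* djb2 string hash, used here only as an example hash function. */
static uint32_t name_hash(const char *s)
{
	uint32_t h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/* Insert a chain at the head of its bucket. */
static void cache_add(struct chain_cache *c, struct chain *ch)
{
	uint32_t b = name_hash(ch->name) % BUCKETS;

	ch->hnext = c->bucket[b];
	c->bucket[b] = ch;
}

/* Look up by name: only the chains sharing one bucket are compared. */
static struct chain *cache_find(const struct chain_cache *c, const char *name)
{
	struct chain *ch;

	for (ch = c->bucket[name_hash(name) % BUCKETS]; ch; ch = ch->hnext) {
		if (!strcmp(ch->name, name))
			return ch;
	}
	return NULL;
}
```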
* intervals: add support to automerge with kernel elements | Pablo Neira Ayuso | 2022-04-13 | 1 | -10/+0
| | | | | | | | | | | | | | | | | | Extend the interval codebase to support merging elements in the kernel with userspace element updates. Add a list of elements to be purged to cmd and set objects. These elements, representing outdated intervals, are deleted before adding the updated ranges. This routine splices the list of userspace and kernel elements, then it mergesorts to identify overlapping and contiguous ranges. This splice operation is undone so the set userspace cache remains consistent. Incrementally update the elements in the cache; this allows removing dd44081d91ce ("segtree: Fix add and delete of element in same batch"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: update mnl_nft_setelem_del() to allow for more reuse | Pablo Neira Ayuso | 2022-04-13 | 1 | -1/+1
| | | | | | Pass handle and element list as parameters to allow for code reuse. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: replace interval segment tree overlap and automerge | Pablo Neira Ayuso | 2022-04-13 | 1 | -9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a rewrite of the segtree interval codebase. This patch now splits the original set_to_interval() function into three routines: - add set_automerge() to merge overlapping and contiguous ranges. The elements, expressed either as single values, prefixes or ranges, are all first normalized to ranges. These elements expressed as ranges are mergesorted. Then, there is a linear list inspection to check for merge candidates. This code only merges elements in the same batch, i.e. it does not merge elements in the kernel and the userspace batch. - add set_overlap() to check for overlapping set elements. Linux kernel >= 5.7 already checks for overlaps, older kernels still need this code. This code checks for two conflict types: 1) between elements in this batch. 2) between elements in this batch and kernelspace. The elements in the kernel are temporarily merged into the list of elements in the batch to check for these overlaps. The EXPR_F_KERNEL flag allows us to restore the set cache after the overlap check has been performed. - set_to_interval() now only transforms set elements, expressed as range e.g. [a,b], to individual set elements using the EXPR_F_INTERVAL_END flag notation to represent e.g. [a,b+1), where b+1 has the EXPR_F_INTERVAL_END flag set on. More relevant updates: - The overlap and automerge routines are now performed in the evaluation phase. - The userspace set object representation now stores a reference to the existing kernel set object (in case there is already a set with this same name in the kernel). This is required by the new overlap and automerge approach. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
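The automerge idea above (normalize to ranges, sort, then one linear pass looking for merge candidates) can be sketched as follows. This is only an illustration under stated assumptions: ranges are inclusive 32-bit pairs, `struct range` and `ranges_automerge()` are hypothetical names, and the C library's qsort() stands in for the mergesort the commit mentions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Inclusive range [lo, hi]; a single value v is [v, v]. */
struct range {
	uint32_t lo, hi;
};

/* Order by start, then by end. */
static int range_cmp(const void *a, const void *b)
{
	const struct range *ra = a, *rb = b;

	if (ra->lo != rb->lo)
		return ra->lo < rb->lo ? -1 : 1;
	if (ra->hi != rb->hi)
		return ra->hi < rb->hi ? -1 : 1;
	return 0;
}

/*
 * Sort the ranges, then merge overlapping and contiguous neighbours in
 * a single linear pass. The array is compacted in place; the return
 * value is the number of ranges left.
 */
static size_t ranges_automerge(struct range *r, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), range_cmp);

	for (size_t i = 1; i < n; i++) {
		/* Merge if r[i] starts at or before r[out].hi + 1 (no overflow). */
		if (r[out].hi == UINT32_MAX || r[i].lo <= r[out].hi + 1) {
			if (r[i].hi > r[out].hi)
				r[out].hi = r[i].hi;
		} else {
			r[++out] = r[i];
		}
	}
	return out + 1;
}
```

For example, the ranges 10-20, 21-30, 5-12 and 40-50 collapse to 5-30 and 40-50: the first three overlap or touch, while the last one is separated by a gap.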
* rule: Avoid segfault with anonymous chains | Pablo Neira Ayuso | 2022-03-17 | 1 | -0/+3
| | | | | | | | | | | | | Phil Sutter says: "When trying to add a rule which contains an anonymous chain to a non-existent chain, string_misspell_update() is called with a NULL string because the anonymous chain has no name. Avoid this by making the function NULL-pointer tolerant." Fixes: c330152b7f777 ("src: support for implicit chain bindings") Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
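A NULL-tolerant "did you mean" helper could look like the sketch below. It is not the actual string_misspell_update() implementation or signature: `struct best_match`, `misspell_update()`, the `MAX_NAME` bound and the plain Levenshtein distance are assumptions chosen to show where the NULL check sits.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME 255	/* illustrative bound on candidate length */

/* Closest candidate seen so far; start with { NULL, (size_t)-1 }. */
struct best_match {
	const char *name;
	size_t distance;
};

/* Levenshtein distance using a single DP row; needs strlen(b) <= MAX_NAME. */
static size_t edit_distance(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b);
	size_t row[MAX_NAME + 1];

	for (size_t j = 0; j <= lb; j++)
		row[j] = j;

	for (size_t i = 1; i <= la; i++) {
		size_t diag = row[0];	/* D[i-1][j-1] */

		row[0] = i;
		for (size_t j = 1; j <= lb; j++) {
			size_t up = row[j];	/* D[i-1][j] */
			size_t cost = diag + (a[i - 1] != b[j - 1]);

			if (up + 1 < cost)
				cost = up + 1;
			if (row[j - 1] + 1 < cost)
				cost = row[j - 1] + 1;
			row[j] = cost;
			diag = up;
		}
	}
	return row[lb];
}

/* Record the candidate if it is closer; unnamed objects are skipped. */
static void misspell_update(const char *wanted, const char *candidate,
			    struct best_match *st)
{
	size_t d;

	if (!wanted || !candidate)	/* e.g. an anonymous chain has no name */
		return;
	if (strlen(candidate) > MAX_NAME)
		return;

	d = edit_distance(wanted, candidate);
	if (d < st->distance) {
		st->name = candidate;
		st->distance = d;
	}
}
```

Placing the check inside the helper means every caller is protected, which matches the commit's approach of making the function itself NULL-pointer tolerant.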