summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* datatype: fix leak and cleanup reference counting for struct datatypeThomas Haller2023-09-142-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Test `./tests/shell/run-tests.sh -V tests/shell/testcases/maps/nat_addr_port` fails: ==118== 195 (112 direct, 83 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3 ==118== at 0x484682C: calloc (vg_replace_malloc.c:1554) ==118== by 0x48A39DD: xmalloc (utils.c:37) ==118== by 0x48A39DD: xzalloc (utils.c:76) ==118== by 0x487BDFD: datatype_alloc (datatype.c:1205) ==118== by 0x487BDFD: concat_type_alloc (datatype.c:1288) ==118== by 0x488229D: stmt_evaluate_nat_map (evaluate.c:3786) ==118== by 0x488229D: stmt_evaluate_nat (evaluate.c:3892) ==118== by 0x488229D: stmt_evaluate (evaluate.c:4450) ==118== by 0x488328E: rule_evaluate (evaluate.c:4956) ==118== by 0x48ADC71: nft_evaluate (libnftables.c:552) ==118== by 0x48AEC29: nft_run_cmd_from_buffer (libnftables.c:595) ==118== by 0x402983: main (main.c:534) I think the reference handling for datatype is wrong. It was introduced by commit 01a13882bb59 ('src: add reference counter for dynamic datatypes'). We don't notice it most of the time, because instances are statically allocated, where datatype_get()/datatype_free() is a NOP. Fix and rework. - Commit 01a13882bb59 comments "The reference counter of any newly allocated datatype is set to zero". That seems not workable. Previously, functions like datatype_clone() would have returned the refcnt set to zero. Some callers would then then set the refcnt to one, but some wouldn't (set_datatype_alloc()). Calling datatype_free() with a refcnt of zero will overflow to UINT_MAX and leak: if (--dtype->refcnt > 0) return; While there could be schemes with such asymmetric counting that juggle the appropriate number of datatype_get() and datatype_free() calls, this is confusing and error prone. The common pattern is that every alloc/clone/get/ref is paired with exactly one unref/free. Let datatype_clone() return references with refcnt set 1 and in general be always clear about where we transfer ownership (take a reference) and where we need to release it. - set_datatype_alloc() needs to consistently return ownership to the reference. Previously, some code paths would and others wouldn't. - Replace datatype_set(key, set_datatype_alloc(dtype, key->byteorder)) with a __datatype_set() with takes ownership. Fixes: 01a13882bb59 ('src: add reference counter for dynamic datatypes') Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: include <stdlib.h> in <nft.h>Thomas Haller2023-09-112-1/+1
| | | | | | | | | | | | | | It provides malloc()/free(), which is so basic that we need it everywhere. Include via <nft.h>. The ultimate purpose is to define more things in <nft.h>. While it has not corresponding C sources, <nft.h> can contain macros and static inline functions, and is a good place for things that we shall have everywhere. Since <stdlib.h> provides malloc()/free() and size_t, that is a very basic dependency, that will be needed for that. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: rename "dtype_clone()" to datatype_clone()Thomas Haller2023-09-081-1/+1
| | | | | | | | | | | | | The struct is called "datatype" and related functions have the fitting "datatype_" prefix. Rename. Also rename the internal "dtype_alloc()" to "datatype_alloc()". This is a follow up to commit 01a13882bb59 ('src: add reference counter for dynamic datatypes'), which started adding "datatype_*()" functions. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: simplify chain_alloc()Pablo Neira Ayuso2023-08-311-1/+1
| | | | | | | Remove parameter to set the chain name which is only used from netlink path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* utils: call abort() after BUG() macroThomas Haller2023-08-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | | Otherwise, we get spurious warnings. The compiler should be aware that there is no return from BUG(). Call abort() there, which is marked as __attribute__ ((__noreturn__)). In file included from ./include/nftables.h:6, from ./include/rule.h:4, from src/payload.c:26: src/payload.c: In function 'icmp_dep_to_type': ./include/utils.h:39:34: error: this statement may fall through [-Werror=implicit-fallthrough=] 39 | #define BUG(fmt, arg...) ({ fprintf(stderr, "BUG: " fmt, ##arg); assert(0); }) | ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/payload.c:791:17: note: in expansion of macro 'BUG' 791 | BUG("Invalid map for simple dependency"); | ^~~ src/payload.c:792:9: note: here 792 | case PROTO_ICMP_ECHO: return ICMP_ECHO; | ^~~~ Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: drop "format" attribute from nft_gmp_print()Thomas Haller2023-08-291-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | nft_gmp_print() passes the format string and arguments to gmp_vfprintf(). Note that the format string is then interpreted by gmp, which also understand special specifiers like "%Zx". Note that with clang we get various compiler warnings: datatype.c:299:26: error: invalid conversion specifier 'Z' [-Werror,-Wformat-invalid-specifier] nft_gmp_print(octx, "0x%Zx [invalid type]", expr->value); ~^ gcc doesn't warn, because to gcc 'Z' is a deprecated alias for 'z' and because the 3rd argument of the attribute((format())) is zero (so gcc doesn't validate the arguments). But Z specifier in gmp expects a "mpz_t" value and not a size_t. It's really not the same thing. The correct solution is not to mark the function to accept a printf format string. Fixes: 2535ba7006f2 ('src: get rid of printf') Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: rework SNPRINTF_BUFFER_SIZE() and handle truncationThomas Haller2023-08-291-9/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before, the macro asserts against truncation. This is despite the callers still checked for truncation and tried to handle it. Probably for good reason. With stmt_evaluate_log_prefix() it's not clear that the code ensures that truncation cannot happen, so we must not assert against it, but handle it. Also, - wrap the macro in "do { ... } while(0)" to make it more function-like. - evaluate macro arguments exactly once, to make it more function-like. - take pointers to the arguments that are being modified. - use assert() instead of abort(). - use size_t type for arguments related to the buffer size. - drop "size". It was mostly redundant to "offset". We can know everything we want based on "len" and "offset" alone. - "offset" previously was incremented before checking for truncation. So it would point somewhere past the buffer. This behavior does not seem useful. Instead, on truncation "len" will be zero (as before) and "offset" will point one past the buffer (one past the terminating NUL). Thereby, also fix a warning from clang: evaluate.c:4134:9: error: variable 'size' set but not used [-Werror,-Wunused-but-set-variable] size_t size = 0; ^ meta.c:1006:9: error: variable 'size' set but not used [-Werror,-Wunused-but-set-variable] size_t size; ^ Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: include <std{bool,int}.h> via <nft.h>Thomas Haller2023-08-257-7/+3
| | | | | | | | | | | | | | | | | | | | There is a minimum base that all our sources will end up needing. This is what <nft.h> provides. Add <stdbool.h> and <stdint.h> there. It's unlikely that we want to implement anything, without having "bool" and "uint32_t" types available. Yes, this means the internal headers are not self-contained, with respect to what <nft.h> provides. This is the exception to the rule, and our internal headers should rely to have <nft.h> included for them. They should not include <nft.h> themselves, because <nft.h> needs always be included as first. So when an internal header would include <nft.h> it would be unnecessary, because the header is *always* included already. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* configure: use AC_USE_SYSTEM_EXTENSIONS to get _GNU_SOURCEThomas Haller2023-08-251-2/+0
| | | | | | | | | | | | | | | | | Let "configure" detect which features are available. Also, nftables is a Linux project, so portability beyond gcc/clang and glibc/musl is less relevant. And even if it were, then feature detection by "configure" would still be preferable. Use AC_USE_SYSTEM_EXTENSIONS ([1]). Available since autoconf 2.60, from 2006 ([2]). [1] https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/Posix-Variants.html#index-AC_005fUSE_005fSYSTEM_005fEXTENSIONS-1046 [2] https://lists.gnu.org/archive/html/autoconf/2006-06/msg00111.html Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: don't define _GNU_SOURCE in public headerThomas Haller2023-08-251-1/+0
| | | | | | | | | | | | | | _GNU_SOURCE is supposed to be defined as first thing, before including any libc headers. Defining it in the public header of nftables is wrong, because it would only (somewhat) work if the user includes the nftables header as first thing too. But that is not what users commonly would do, in particular with autotools projects, where users would include <config.h> first. It's also unnecessary. Nothing in "nftables/libnftables.h" itself requires _GNU_SOURCE. Drop it. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add <nft.h> header and include it as firstThomas Haller2023-08-255-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | <config.h> is generated by the configure script. As it contains our feature detection, it want to use it everywhere. Likewise, in some of our sources, we define _GNU_SOURCE. This defines the C variant we want to use. Such a define need to come before anything else, and it would be confusing if different source files adhere to a different C variant. It would be good to use autoconf's AC_USE_SYSTEM_EXTENSIONS, in which case we would also need to ensure that <config.h> is always included as first. Instead of going through all source files and include <config.h> as first, add a new header "include/nft.h", which is supposed to be included in all our sources (and as first). This will also allow us later to prepare some common base, like include <stdbool.h> everywhere. We aim that headers are self-contained, so that they can be included in any order. Which, by the way, already didn't work because some headers define _GNU_SOURCE, which would only work if the header gets included as first. <nft.h> is however an exception to the rule: everything we compile shall rely on having <nft.h> header included as first. This applies to source files (which explicitly include <nft.h>) and to internal header files (which are only compiled indirectly, by being included from a source file). Note that <config.h> has no include guards, which is at least ugly to include multiple times. It doesn't cause problems in practice, because it only contains defines and the compiler doesn't warn about redefining a macro with the same value. Still, <nft.h> also ensures to include <config.h> exactly once. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add input flag NFT_CTX_INPUT_JSON to enable JSON parsingThomas Haller2023-08-242-0/+6
| | | | | | | | | | | | | | By default, the input is parsed using the nftables grammar. When setting NFT_CTX_OUTPUT_JSON flag, nftables will first try to parse the input as JSON before falling back to the nftables grammar. But NFT_CTX_OUTPUT_JSON flag also turns on JSON for the output. Add a flag NFT_CTX_INPUT_JSON which allows to treat only the input as JSON, but keep the output mode unchanged. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add input flag NFT_CTX_INPUT_NO_DNS to avoid blockingThomas Haller2023-08-243-0/+10
| | | | | | | | | | | | | | | | | | | getaddrinfo() blocks while trying to resolve the name. Blocking the caller of the library is in many cases undesirable. Also, while reconfiguring the firewall, it's not clear that resolving names via the network will work or makes sense. Add a new input flag NFT_CTX_INPUT_NO_DNS to opt-out from getaddrinfo() and only accept plain IP addresses. We could also use AI_NUMERICHOST with getaddrinfo() instead of inet_pton(). By parsing via inet_pton(), we are better aware of what we expect and can generate a better error message in case of failure. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add input flags for nft_ctxThomas Haller2023-08-242-0/+8
| | | | | | | | | | | | Similar to the existing output flags, add input flags. No flags are yet implemented, that will follow. One difference to nft_ctx_output_set_flags(), is that the setter for input flags returns the previously set flags. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ct expectation: fix 'list object x' vs. 'list objects in table' confusionFlorian Westphal2023-07-311-0/+1
| | | | | | | | | | Just like "ct timeout", "ct expectation" is in need of the same fix, we get segfault on "nft list ct expectation table t", if table t exists. This is the exact same pattern as resolved for "ct timeout" in commit 1d2e22fc0521 ("ct timeout: fix 'list object x' vs. 'list objects in table' confusion"). Signed-off-by: Florian Westphal <fw@strlen.de>
* include: missing dccpopt.h breaks make distcheckPablo Neira Ayuso2023-07-141-0/+1
| | | | | | Add it to Makefile.am. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Implement 'reset {set,map,element}' commandsPhil Sutter2023-07-133-4/+9
| | | | | | | | | | | All these are used to reset state in set/map elements, i.e. reset the timeout or zero quota and counter values. While 'reset element' expects a (list of) elements to be specified which should be reset, 'reset set/map' will reset all elements in the given set/map. Signed-off-by: Phil Sutter <phil@nwl.cc>
* cli: Make cli_init() return to callerPhil Sutter2023-07-041-1/+1
| | | | | | | | | | | | | | | Avoid direct exit() calls as that leaves the caller-allocated nft_ctx object in place. Making sure it is freed helps with valgrind-analyses at least. To signal desired exit from CLI, introduce global cli_quit boolean and make all cli_exit() implementations also set cli_rc variable to the appropriate return code. The logic is to finish CLI only if cli_quit is true which asserts proper cleanup as it is set only by the respective cli_exit() function. Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: avoid IPPROTO_MAX for array definitionsFlorian Westphal2023-06-211-1/+1
| | | | | | | | | | | | | | | ip header can only accomodate 8but value, but IPPROTO_MAX has been bumped due to uapi reasons to support MPTCP (262, which is used to toggle on multipath support in tcp). This results in: exthdr.c:349:11: warning: result of comparison of constant 263 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare] if (type < array_size(exthdr_protocols)) ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ redude array sizes back to what can be used on-wire. Signed-off-by: Florian Westphal <fw@strlen.de>
* ct timeout: fix 'list object x' vs. 'list objects in table' confusionFlorian Westphal2023-06-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | <empty ruleset> $ nft list ct timeout table t Error: No such file or directory list ct timeout table t ^ This is expected to list all 'ct timeout' objects. The failure is correct, the table 't' does not exist. But now lets add one: $ nft add table t $ nft list ct timeout table t Segmentation fault (core dumped) ... and thats not expected, nothing should be shown and nft should exit normally. Because of missing TIMEOUTS command enum, the backend thinks it should do an object lookup, but as frontend asked for 'list of objects' rather than 'show this object', handle.obj.name is NULL, which then results in this crash. Update the command enums so that backend knows what the frontend asked for. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add json support for last statementPablo Neira Ayuso2023-06-201-0/+2
| | | | | | | | | | This patch adds json support for the last statement, it works for me here. However, tests/py still displays a warning: any/last.t: WARNING: line 12: '{"nftables": [{"add": {"rule": {"family": "ip", "table": "test-ip4", "chain": "input", "expr": [{"last": {"used": 300000}}]}}}]}': '[{"last": {"used": 300000}}]' mismatches '[{"last": null}]' Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* exthdr: add boolean DCCP option matchingJeremy Sowden2023-06-013-0/+45
| | | | | | | | | | Iptables supports the matching of DCCP packets based on the presence or absence of DCCP options. Extend exthdr expressions to add this functionality to nftables. Link: https://bugzilla.netfilter.org/show_bug.cgi?id=930 Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: support bpf id decode in nft list hooksFlorian Westphal2023-05-221-3/+21
| | | | | | | | | | | This allows 'nft list hooks' to also display the bpf program id attached. Example: hook input { -0000000128 nf_hook_run_bpf id 6 .. Signed-off-by: Florian Westphal <fw@strlen.de>
* datatype: add hint error handlerPablo Neira Ayuso2023-05-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | If user provides a symbol that cannot be parsed and the datatype provides an error handler, provide a hint through the misspell infrastructure. For instance: # cat test.nft table ip x { map y { typeof ip saddr : verdict elements = { 1.2.3.4 : filter_server1 } } } # nft -f test.nft test.nft:4:26-39: Error: Could not parse netfilter verdict; did you mean `jump filter_server1'? elements = { 1.2.3.4 : filter_server1 } ^^^^^^^^^^^^^^ While at it, normalize error to "Could not parse symbolic %s expression". Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: introduce meta broute supportSriram Yagnaraman2023-04-291-0/+2
| | | | | | | | | | | Can be used in bridge prerouting hook to divert a packet to the ip stack for routing. This is a replacement for "ebtables -t broute" functionality. Link: https://patchwork.ozlabs.org/project/netfilter-devel/patch/20230224095251.11249-1-sriram.yagnaraman@est.tech/ Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: fix enum/integer mismatchesFlorian Westphal2023-04-292-2/+2
| | | | | | | | | | | | | | | | | | | gcc 13 complains about type confusion: cache.c:1178:5: warning: conflicting types for 'nft_cache_update' due to enum/integer mismatch; have 'int(struct nft_ctx *, unsigned int, struct list_head *, const struct nft_cache_filter *)' [-Wenum-int-mismatch] cache.h:74:5: note: previous declaration of 'nft_cache_update' with type 'int(struct nft_ctx *, enum cmd_ops, struct list_head *, const struct nft_cache_filter *)' Same for: rule.c:1915:13: warning: conflicting types for 'obj_type_name' due to enum/integer mismatch; have 'const char *(enum stmt_types)' [-Wenum-int-mismatch] 1915 | const char *obj_type_name(enum stmt_types type) | ^~~~~~~~~~~~~ expression.c:1543:24: warning: conflicting types for 'expr_ops_by_type' due to enum/integer mismatch; have 'const struct expr_ops *(uint32_t)' {aka 'const struct expr_ops *(unsigned int)'} [-Wenum-int-mismatch] 1543 | const struct expr_ops *expr_ops_by_type(uint32_t value) | ^~~~~~~~~~~~~~~~ Convert to the stricter type (enum) where possible. Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: set SO_SNDBUF before SO_SNDBUFFORCEPablo Neira Ayuso2023-04-242-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set SO_SNDBUF before SO_SNDBUFFORCE: Unpriviledged user namespace does not have CAP_NET_ADMIN on the host (user_init_ns) namespace. SO_SNDBUF always succeeds in Linux, always try SO_SNDBUFFORCE after it. Moreover, suggest the user to bump socket limits if EMSGSIZE after having see EPERM previously, when calling SO_SNDBUFFORCE. Provide a hint to the user too: # nft -f test.nft netlink: Error: Could not process rule: Message too long Please, rise /proc/sys/net/core/wmem_max on the host namespace. Hint: 4194304 bytes Dave Pfike says: Prior to this patch, nft inside a systemd-nspawn container was failing to install my ruleset (which includes a large-ish map), with the error netlink: Error: Could not process rule: Message too long strace reveals: setsockopt(3, SOL_SOCKET, SO_SNDBUFFORCE, [524288], 4) = -1 EPERM (Operation not permitted) This is despite the nspawn process supposedly having CAP_NET_ADMIN. A web search reveals at least one other user having the same issue: https://old.reddit.com/r/Proxmox/comments/scnoav/lxc_container_debian_11_nftables_geoblocking/ Reported-by: Dave Pifke <dave@pifke.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: support shifts larger than the width of the left operandPablo Neira Ayuso2023-03-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | If we want to left-shift a value of narrower type and assign the result to a variable of a wider type, we are constrained to only shifting up to the width of the narrower type. Thus: add rule t c meta mark set ip dscp << 2 works, but: add rule t c meta mark set ip dscp << 8 does not, even though the lvalue is large enough to accommodate the result. Upgrade the maximum length based on the statement datatype length, which is provided via context, if it is larger than expression lvalue. Update netlink_delinearize.c to handle the case where the length of a shift expression does not match that of its left-hand operand. Based on patch from Jeremy Sowden. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix a couple of typo's in commentsJeremy Sowden2023-03-121-1/+1
| | | | | Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* cmd: move command functions to src/cmd.cPablo Neira Ayuso2023-03-112-6/+6
| | | | | | Move several command functions to src/cmd.c to debloat src/rule.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add last statementPablo Neira Ayuso2023-02-282-0/+11
| | | | | | | | | | | | | | | | | | | | | This new statement allows you to know how long ago there was a matching packet. # nft list ruleset table ip x { chain y { [...] ip protocol icmp last used 49m54s884ms counter packets 1 bytes 64 } } if this statement never sees a packet, then the listing says: ip protocol icmp last used never counter packets 0 bytes 0 Add tests/py in this patch too. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: expand table command before evaluationPablo Neira Ayuso2023-02-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nested syntax notation results in one single table command which includes all other objects. This differs from the flat notation where there is usually one command per object. This patch adds a previous step to the evaluation phase to expand the objects that are contained in the table into independent commands, so both notations have similar representations. Remove the code to evaluate the nested representation in the evaluation phase since commands are independently evaluated after the expansion. The commands are expanded after the set element collapse step, in case that there is a long list of singleton element commands to be added to the set, to shorten the command list iteration. This approach also avoids interference with the object cache that is populated in the evaluation, which might refer to objects coming in the existing command list that is being processed. There is still a post_expand phase to detach the elements from the set which could be consolidated by updating the evaluation step to handle the CMD_OBJ_SETELEMS command type. This patch fixes 27c753e4a8d4 ("rule: expand standalone chain that contains rules") which broke rule addition/insertion by index because the expansion code after the evaluation messes up the cache. Fixes: 27c753e4a8d4 ("rule: expand standalone chain that contains rules") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use start condition with new destroy commandPablo Neira Ayuso2023-02-211-0/+1
| | | | | | | | | | | | | tests/py reports the following problem: any/ct.t: ERROR: line 116: add rule ip test-ip4 output ct event set new | related | destroy | label: This rule should not have failed. any/ct.t: ERROR: line 117: add rule ip test-ip4 output ct event set new,related,destroy,label: This rule should not have failed. any/ct.t: ERROR: line 118: add rule ip test-ip4 output ct event set new,destroy: This rule should not have failed. Use start condition and update parser to handle 'destroy' keyword. Fixes: e1dfd5cc4c46 ("src: add support to command "destroy") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support to command "destroy"Fernando F. Mancera2023-02-062-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | "destroy" command performs a deletion as "delete" command but does not fail if the object does not exist. As there is no NLM_F_* flag for ignoring such error, it needs to be ignored directly on error handling. Example of use: # nft list ruleset table ip filter { chain output { } } # nft destroy table ip missingtable # echo $? 0 # nft list ruleset table ip filter { chain output { } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Implement 'reset rule' and 'reset rules' commandsPhil Sutter2023-01-185-1/+15
| | | | | | | | Reset rule counters and quotas in kernel, i.e. without having to reload them. Requires respective kernel patch to support NFT_MSG_GETRULE_RESET message type. Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: add gretap supportPablo Neira Ayuso2023-01-021-0/+2
| | | | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add geneve matching supportPablo Neira Ayuso2023-01-021-0/+13
| | | | | | Add support for GENEVE vni and (ether) type header field. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add gre supportPablo Neira Ayuso2023-01-023-0/+17
| | | | | | | | | | | | | GRE has a number of fields that are conditional based on flags, which requires custom dependency code similar to icmp and icmpv6. Matching on optional fields is not supported at this stage. Since this is a layer 3 tunnel protocol, an implicit dependency on NFT_META_L4PROTO for IPPROTO_GRE is generated. To achieve this, this patch adds new infrastructure to remove an outer dependency based on the inner protocol from delinearize path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: display (inner) tag in --debug=proto-ctxPablo Neira Ayuso2023-01-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For easier debugging, add decoration on protocol context: # nft --debug=proto-ctx add rule netdev x y udp dport 4789 vxlan ip protocol icmp counter update link layer protocol context (inner): link layer : netdev <- network layer : none transport layer : none payload data : none update network layer protocol context (inner): link layer : netdev network layer : ip <- transport layer : none payload data : none update network layer protocol context (inner): link layer : netdev network layer : ip <- transport layer : none payload data : none update transport layer protocol context (inner): link layer : netdev network layer : ip transport layer : icmp <- payload data : none Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add vxlan matching supportPablo Neira Ayuso2023-01-026-3/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the initial infrastructure to support for inner header tunnel matching and its first user: vxlan. A new struct proto_desc field for payload and meta expression to specify that the expression refers to inner header matching is used. The existing codebase to generate bytecode is fully reused, allowing for reusing existing supported layer 2, 3 and 4 protocols. Syntax requires to specify vxlan before the inner protocol field: ... vxlan ip protocol udp ... vxlan ip saddr 1.2.3.0/24 This also works with concatenations and anonymous sets, eg. ... vxlan ip saddr . vxlan ip daddr { 1.2.3.4 . 4.3.2.1 } You have to restrict vxlan matching to udp traffic, otherwise it complains on missing transport protocol dependency, e.g. ... udp dport 4789 vxlan ip daddr 1.2.3.4 The bytecode that is generated uses the new inner expression: # nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4 netdev x y [ meta load l4proto => reg 1 ] [ cmp eq reg 1 0x00000011 ] [ payload load 2b @ transport header + 2 => reg 1 ] [ cmp eq reg 1 0x0000b512 ] [ inner type 1 hdrsize 8 flags f [ meta load protocol => reg 1 ] ] [ cmp eq reg 1 0x00000008 ] [ inner type 1 hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ] [ cmp eq reg 1 0x04030201 ] JSON support is not included in this patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add dl_proto_ctx()Pablo Neira Ayuso2023-01-021-1/+7
| | | | | | | | | | Add dl_proto_ctx() to access protocol context (struct proto_ctx and struct payload_dep_ctx) from the delinearize path. This patch comes in preparation for supporting outer and inner protocol context. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add eval_proto_ctx()Pablo Neira Ayuso2023-01-022-1/+4
| | | | | | | | | | | Add eval_proto_ctx() to access protocol context (struct proto_ctx). Rename struct proto_ctx field to _pctx to highlight that this field is internal and the helper function should be used. This patch comes in preparation for supporting outer and inner protocol context. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* xt: Rewrite unsupported compat expression dumpingPhil Sutter2022-12-132-0/+3
| | | | | | | | | Choose a format which provides more information and is easily parseable. Then teach parsers about it and make it explicitly reject the ruleset giving a meaningful explanation. Also update the man pages with some more details. Signed-off-by: Phil Sutter <phil@nwl.cc>
* xt: Purify enum nft_xt_typePhil Sutter2022-12-131-1/+1
| | | | | | Remove NFT_XT_MAX from the enum, it is not a valid xt type. Signed-off-by: Phil Sutter <phil@nwl.cc>
* xt: Delay libxtables access until translationPhil Sutter2022-12-131-5/+4
| | | | | | | | | | | | | | | | | | There is no point in spending efforts setting up the xt match/target when it is not printed afterwards. So just store the statement data from libnftnl in struct xt_stmt and perform the extension lookup from xt_stmt_xlate() instead. This means some data structures are only temporarily allocated for the sake of passing to libxtables callbacks, no need to drag them around. Also no need to clone the looked up extension, it is needed only to call the functions it provides. While being at it, select numeric output in xt_xlate_*_params - otherwise there will be reverse DNS lookups which should not happen by default. Signed-off-by: Phil Sutter <phil@nwl.cc>
* Warn for tables with compat expressions in rulesPhil Sutter2022-11-181-0/+1
| | | | | | | | | | | | | | | | | | While being able to "look inside" compat expressions using nft is a nice feature, it is also (yet another) pitfall for unaware users, deceiving them into assuming interchangeability (or at least compatibility) between iptables-nft and nft. In reality, which involves 'nft list ruleset | nft -f -', any correctly translated compat expressions will turn into native nftables ones not understood by (the version of) iptables-nft which created them in the first place. Other compat expressions will vanish, potentially compromising the firewall ruleset. Emit a warning (as comment) to give users a chance to stop and reconsider before shooting their own foot. Signed-off-by: Phil Sutter <phil@nwl.cc>
* parser_bison: display too many levels of nesting errorPablo Neira Ayuso2022-10-071-0/+1
| | | | | | | | | | | | | Instead of hitting this assertion: nft: parser_bison.y:70: open_scope: Assertion `state->scope < array_size(state->scopes) - 1' failed. Aborted this is easier to trigger with implicit chains where one level of nesting from the existing chain scope is supported. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1615 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: resync nf_tables.h cache copyPablo Neira Ayuso2022-09-161-1/+27
| | | | | | Get this header in sync with nf-next as of 6.0-rc. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expr: update EXPR_MAX and add missing commentsFlorian Westphal2022-08-301-1/+6
| | | | | | | | | | | | | | | WHen flagcmp and catchall expressions got added the EXPR_MAX definition wasn't changed. Should have no impact in practice however, this value is only checked to prevent crash when old nft release is used to list a ruleset generated by a newer nft release and a unknown 'typeof' expression. v2: Pablo suggested to add EXPR_MAX into enum so its easier to spot. Adding __EXPR_MAX + define EXPR_MAX (__EXPR_MAX - 1) causes '__EXPR_MAX not handled in switch' warnings, hence the 'EXPR_MAX =' solution. Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink_delinearize: also postprocess OP_AND in set element contextFlorian Westphal2022-08-051-1/+3
| | | | | | | | | | | | | Pablo reports: add rule netdev nt y update @macset { vlan id timeout 5s } listing still shows the raw expression: update @macset { @ll,112,16 & 0xfff timeout 5s } so also cover the 'set element' case. Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>