path: root/src/evaluate.c
Commit message (Author, Age, Files, Lines -removed/+added)
* evaluate: prevent merge of sets with incompatible keys (Florian Westphal, 5 days ago, 1 file, -0/+12)

  It's not enough to check for the interval flag; this would assert in the
  interval code due to a concatenation being passed to the interval code:

    BUG: unhandled key type 13

  After the fix:

    same_set_name_but_different_keys_assert:8:6-7: Error: set already exists with different datatype (concatenation of (IPv4 address, network interface index) vs network interface index)
    set s4 {
        ^^

  This also improves error verbosity when mixing datamap and objref maps:

    invalid_transcation_merge_map_and_objref_map:9:13-13: Error: map already exists with different datatype (IPv4 address vs string)

  ... instead of 'Cannot merge map with incompatible existing map of same name'.

  The 'Cannot merge map with incompatible existing map of same name' check is
  kept in place to catch when the ruleset contains a set and a map with the
  same name and the same key definition.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
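  For illustration, a ruleset of roughly this shape (a hypothetical minimal
  example, not the bogon file referenced above) declares the same set name
  twice with different key types; it is now rejected with the "set already
  exists with different datatype" error instead of asserting:

    table ip t {
        set s4 {
            typeof ip saddr . iif
        }

        set s4 {
            typeof iif
        }
    }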
* evaluate: check that set type is identical before merging (Florian Westphal, 6 days ago, 1 file, -2/+21)

  Reject maps and sets of the same name:

    BUG: invalid range expression type catch-all set element
    nft: src/expression.c:1704: range_expr_value_low: Assertion `0' failed.

  After:

    Error: Cannot merge set with existing datamap of same name
    set z {
    ^

  v2: Pablo points out that we shouldn't merge datamaps (plain value) and
  objref maps either; catch this too and add another test:

    nft --check -f invalid_transcation_merge_map_and_objref_map
    invalid_transcation_merge_map_and_objref_map:9:13-13: Error: Cannot merge map with incompatible existing map of same name

  We should also make sure that both data (for the map case) and set keys are
  identical; this is added in a followup patch.

  Signed-off-by: Florian Westphal <fw@strlen.de>
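  A hypothetical minimal reproducer shape for this check is a map and a set
  sharing one name, which now fails with the "Cannot merge set with existing
  datamap of same name" error instead of hitting the assertion:

    table ip t {
        map z {
            type ipv4_addr : ipv4_addr
        }

        set z {
            type ipv4_addr
        }
    }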
* evaluate: avoid double-free on error handling of bogus objref maps (Florian Westphal, 6 days ago, 1 file, -1/+1)

  commit 98c51aaac42b ("evaluate: bail out if anonymous concat set defines a
  non concat expression") clears set->init to avoid a double-free. Extend this
  to also handle object maps.

  The included bogon triggers a double-free of set->init expression:

    Error: unqualified type invalid specified in map definition. Try "typeof expression" instead of "type datatype".
    ct helper set ct saddr map { 1c3:: : "p", dead::beef : "myftp" }
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  This might not crash, depending on libc/malloc, but ASAN reports this:

    ==17728==ERROR: AddressSanitizer: heap-use-after-free on address 0x50b0000005e8 at ..
    READ of size 4 at 0x50b0000005e8 thread T0
        #0 0x7f1be3cb7526 in expr_free src/expression.c:87
        #1 0x7f1be3cbdf29 in map_expr_destroy src/expression.c:1488
        #2 0x7f1be3cb74d5 in expr_destroy src/expression.c:80
        #3 0x7f1be3cb75c6 in expr_free src/expression.c:96
        #4 0x7f1be3d5925e in objref_stmt_destroy src/statement.c:331
        #5 0x7f1be3d5831f in stmt_free src/statement.c:56
        #6 0x7f1be3d583c2 in stmt_list_free src/statement.c:66
        #7 0x7f1be3d42805 in rule_free src/rule.c:495
        #8 0x7f1be3d48329 in cmd_free src/rule.c:1417
        #9 0x7f1be3cd2c7c in __nft_run_cmd_from_filename src/libnftables.c:759
        #10 0x7f1be3cd340c in nft_run_cmd_from_filename src/libnftables.c:847
        #11 0x55dcde0440be in main src/main.c:535

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: make sure chain jump name comes with a null byte (Florian Westphal, 6 days ago, 1 file, -5/+21)

  There is a stack out-of-bounds read access in netlink_gen_chain():

    mpz_export_data(chain, expr->chain->value, BYTEORDER_HOST_ENDIAN, len);
    snprintf(data->chain, NFT_CHAIN_MAXNAMELEN, "%s", chain);

  There is no guarantee that chain[] is null terminated, so snprintf can read
  past the chain[] array. The ASAN report is:

    AddressSanitizer: stack-buffer-overflow on address 0x7ffff5f00520 at ..
    READ of size 257 at 0x7ffff5f00520 thread T0
        #0 0x00000032ffb6 in printf_common(void*, char const*, __va_list_tag*) (src/nft+0x32ffb6)
        #1 0x00000033055d in vsnprintf (src/nft+0x33055d)
        #2 0x000000332071 in snprintf (src/nft+0x332071)
        #3 0x0000004eef03 in netlink_gen_chain src/netlink.c:454:2
        #4 0x0000004eef03 in netlink_gen_verdict src/netlink.c:467:4

  Reject chain jumps that exceed 255 characters, which matches the netlink
  policy on the kernel side.

  The included reproducer fails without ASAN too, because the kernel will
  reject the too-long chain name, but that happens only after the bogus read
  that ASAN detects.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use EXPR_RANGE_VALUE in interval maps (Pablo Neira Ayuso, 8 days ago, 1 file, -2/+3)

  Remove the restriction that kept maps from using EXPR_RANGE_VALUE, to reduce
  memory consumption.

  With a 100k-element map with concatenation:

    table inet x {
        map y {
            typeof ip saddr . tcp dport : ip saddr
            flags interval
            elements = { 1.0.2.0-1.0.2.240 . 0-2 : 1.0.2.10, ... }
        }
    }

  Before: 153.6 Mbytes
  After:  108.9 Mbytes (-29.11%)

  With a 100k-element map without concatenation:

    table inet x {
        map y {
            typeof ip saddr : ip saddr
            flags interval
            elements = { 1.0.2.0-1.0.2.240 : 1.0.2.10, ... }
        }
    }

  Before: 74.36 Mbytes
  After:  62.39 Mbytes (-16.10%)

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use constant range expression for interval+concatenation sets (Pablo Neira Ayuso, 8 days ago, 1 file, -3/+4)

  Expand 347039f64509 ("src: add symbol range expression to further compact
  intervals") to use constant range expression for elements with concatenation
  of intervals.

  Ruleset with 100k elements of this type:

    table inet x {
        set y {
            typeof ip saddr . tcp dport
            flags interval
            elements = { 0.1.2.0-0.1.2.240 . 0-1, ... }
        }
    }

  Memory consumption for this set:

  Before: 123.80 Mbytes
  After:   80.19 Mbytes (-35.23%)

  This patch keeps the workaround 2fbade3cd990 ("netlink: bogus concatenated
  set ranges with netlink message overrun") in place.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: restrict allowed subtypes of concatenations (Florian Westphal, 9 days ago, 1 file, -1/+60)

  We need to restrict this; the included bogon asserts with:

    BUG: unknown expression type prefix
    nft: src/netlink_linearize.c:940: netlink_gen_expr: Assertion `0' failed.

  Prefix expressions are only allowed if the concatenation is used within a
  set element, not when specifying the lookup key. For the former, anything
  that represents a value is allowed. For the latter, only what will generate
  data (fill a register) is permitted.

  At this time we do not have an annotation that tells if the expression is on
  the left hand side (lookup key) or right hand side (set element). Add a new
  list recursion counter for this: if it is 0 we're building the lookup key;
  otherwise the concatenation is the RHS part of a relational expression and
  prefixes, ranges and so on are allowed.

  IOW, we don't really need a recursion counter; another type of annotation
  that tells whether the expression is placed on the left or right hand side
  of another expression would work too.

  v2: explicitly list all 'illegal' expression types instead of using a
  default label for them. This will raise a compiler warning to remind us to
  adjust the case labels in case a new expression type gets added in the
  future.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: rename recursion counter to recursion.binop (Florian Westphal, 9 days ago, 1 file, -5/+5)

  The existing recursion counter is used by the binop expression to detect if
  we've completely followed all the binops.

  We can only chain up to NFT_MAX_EXPR_RECURSION binops, but the evaluation
  step can perform constant-folding, so we must recurse until we find the
  rightmost (last) binop in the chain. Then we can check the post-eval chain
  to see if it is something that can be serialized later (i.e., if we are
  within the NFT_MAX_EXPR_RECURSION limit after constant folding) or not.

  Thus we can't reuse the existing ctx->recursion counter for other
  expressions; entering the initial expr_evaluate_binop with
  ctx->recursion > 0 would break things.

  Therefore rename this to an embedded structure. This allows us to add a new
  recursion counter in a followup patch.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: don't BUG on unexpected base datatype (Florian Westphal, 13 days ago, 1 file, -1/+2)

  Included bogon will cause a crash, but this is the evaluation stage where we
  can just emit an error instead.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: make vlan pcp updates work (Florian Westphal, 2025-04-22, 1 file, -4/+38)

  On the kernel side, nft_payload_set_vlan() requires a 2 or 4 byte write to
  the vlan header. As-is, nft emits a 1 byte write:

    [ payload load 1b @ link header + 14 => reg 1 ]
    [ bitwise reg 1 = ( reg 1 & 0x0000001f ) ^ 0x00000020 ]
    ...

  which the kernel doesn't support. Expand all vlan header updates to a 2 or
  4 byte write and update the existing vlan id test case.

  Reported-by: Kevin Vigouroux <ke.vigouroux@laposte.net>
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
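  For illustration, a statement of this kind now works (a hypothetical rule,
  assuming a bridge-family table "filter" with a "forward" chain already
  exists); vlan pcp is a 3-bit field, so mangling it gets expanded to a
  2-byte write covering the surrounding TCI:

    nft add rule bridge filter forward vlan id 100 vlan pcp set 3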
* evaluate: bail out if ct saddr/daddr dependency cannot be inserted (Florian Westphal, 2025-04-07, 1 file, -1/+2)

  If we have an incomplete rule like "ct original saddr" in the inet family,
  this function generates an error because it can't determine the required
  protocol dependency, hinting at the missing ip/ip6 keyword.

  We should not go on in this case, to avoid a redundant followup error:

    nft add rule inet f c ct original saddr 1.2.3.4
    Error: cannot determine ip protocol version, use "ip saddr" or "ip6 saddr" instead
    add rule inet f c ct original saddr 1.2.3.4
                      ^^^^^^^^^^^^^^^^^
    Error: Could not parse symbolic invalid expression
    add rule inet f c ct original saddr 1.2.3.4

  After this change only the first error is shown.

  Fixes: 2b29ea5f3c3e ("src: ct: add eval part to inject dependencies for ct saddr/daddr")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: fix crash when generating reject statement error (Florian Westphal, 2025-04-02, 1 file, -2/+14)

  After patch, this gets rejected with:

    internal:0:0-0: Error: conflicting protocols specified: ip vs ip6

  Without patch, we crash with a NULL dereference: we cannot use
  reject.expr->location unconditionally.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: reject: remove unused expr function argument (Florian Westphal, 2025-04-02, 1 file, -16/+10)

  stmt_evaluate_reject passes the cmd->expr argument, but it's never used.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: only allow stateful statements in set and map definitions (Florian Westphal, 2025-03-31, 1 file, -1/+4)

  The bison parser doesn't allow this to happen due to grammar restrictions,
  but the json input has no such issues. The bogon input assigns 'notrack',
  which triggers:

    BUG: unknown stateful statement type 19
    nft: src/netlink_linearize.c:1061: netlink_gen_stmt_stateful: Assertion `0' failed.

  After the patch, we get:

    Error: map statement must be stateful

  Fixes: 07958ec53830 ("json: add set statement list support")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
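  For context, the statements that legitimately belong in a set or map
  definition are stateful ones such as counter or limit; a minimal sketch of
  a valid definition (hypothetical names) looks like:

    table ip t {
        set s {
            typeof ip saddr
            flags dynamic
            counter
        }
    }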
* evaluate: compact STMT_F_STATEFUL checks (Florian Westphal, 2025-03-31, 1 file, -12/+14)

  We'll gain another F_STATEFUL check in a followup patch, so let's condense
  the pattern into a helper to reduce copy-paste.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: don't BUG when asked to list synproxies (Florian Westphal, 2025-03-27, 1 file, -2/+4)

  "-j list synproxys" triggers a BUG(). Rewrite this so that all enum values
  are handled, so the compiler can alert us to a missing value in case there
  are more commands in the future.

  While at it, implement a few low-hanging fruits as well. Not-yet-supported
  cases are simply ignored.

  v2: return EOPNOTSUPP for unsupported commands (Pablo Neira Ayuso)

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove flagcmp expression (Pablo Neira Ayuso, 2025-03-27, 1 file, -21/+0)

  This expression is not used anymore since ("src: transform flag match
  expression to binop expression from parser"); remove it.

  This completes the revert of c3d57114f119 ("parser_bison: add shortcut
  syntax for matching flags without binary operations"), except for the
  parser chunk kept for backwards compatibility.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't update cache for anonymous chains (Florian Westphal, 2025-03-22, 1 file, -0/+4)

  Chain lookup needs a name, not a numerical id.

  After the patch, loading the bogon gives the following errors:

    Error: No symbol type information
    a b index 1 10.1.26.a

  v2: Don't return an error, just make it a no-op (Pablo Neira Ayuso)

  Fixes: c330152b7f77 ("src: support for implicit chain bindings")
  Signed-off-by: Florian Westphal <fw@strlen.de>
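  For context, anonymous chains here are the implicit chain bindings added by
  the commit in the Fixes tag; a rough, hypothetical sketch of the syntax:

    table ip t {
        chain c {
            type filter hook input priority filter; policy accept;
            tcp dport 22 jump {
                counter accept
            }
        }
    }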
* evaluate: fix assertion failure with malformed map definitions (Florian Westphal, 2025-03-20, 1 file, -1/+4)

  Included bogon triggers:

    nft: src/evaluate.c:2267: expr_evaluate_mapping: Assertion `set->data != NULL' failed.

  After this fix, the following errors will be shown:

    Error: unqualified type invalid specified in map definition. Try "typeof expression" instead of "type datatype".
    map m {
    ^
    map m {
    ^
    Error: map has no mapping data

  Fixes: 343a51702656 ("src: store expr, not dtype to track data in sets")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
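  As the error hint suggests, a well-formed map needs a concrete datatype or
  a typeof expression on both sides of the ':'; a minimal valid sketch
  (hypothetical names) would be:

    table ip t {
        map m {
            typeof ip saddr : meta mark
        }
    }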
* evaluate: don't allow nat map with specified protocol (Florian Westphal, 2025-03-20, 1 file, -0/+4)

  Included bogon asserts:

    src/netlink_linearize.c:1305: netlink_gen_nat_stmt: Assertion `stmt->nat.proto == NULL' failed.

  The comment right above the assertion says:

    nat_stmt evaluation step doesn't allow STMT_NAT_F_CONCAT && stmt->nat.proto.

  ... except it does allow it. Disable this.

  Fixes: c68314dd4263 ("src: infer NAT mapping with concatenation from set")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: replace struct stmt_ops by type field in struct stmt (Pablo Neira Ayuso, 2025-03-18, 1 file, -12/+12)

  Shrink struct stmt by 8 bytes. __stmt_ops_by_type() provides an operation
  for STMT_INVALID since this is required by -o/--optimize.

  There are many checks for stmt->ops->type, which is the most accessed field;
  these can be trivially replaced. BUG() uses the statement type enum instead
  of the name.

  Similar to:

    68e76238749f ("src: expr: add and use expr_name helper")
    72931553828a ("src: expr: add expression etype")
    2cc91e6198e7 ("src: expr: add and use internal expr_ops helper")

  Acked-by: Florian Westphal <fw@strlen.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: move interval flag compat check after set key evaluation (Florian Westphal, 2025-03-18, 1 file, -3/+3)

  Without this, the included bogon asserts with:

    BUG: unhandled key type 13
    nft: src/intervals.c:73: setelem_expr_to_range: Assertion `0' failed.

  ... because we no longer evaluate set->key/data.

  Move the check to the tail of the function, right before assigning
  set->existing_set, so that set->key has been evaluated.

  Fixes: ceab53cee499 ("evaluate: don't allow merging interval set/map with non-interval one")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't allow merging interval set/map with non-interval one (Florian Westphal, 2025-03-13, 1 file, -7/+11)

  Included bogon asserts with:

    BUG: invalid data expression type range_value

  Pablo says: "Reject because flags interval is lacking". Make it so.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
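  The rejected situation is, roughly, the same set name declared once with
  "flags interval" and once without (a hypothetical sketch, not the bogon
  itself):

    table ip t {
        set s {
            type ipv4_addr
            flags interval
            elements = { 10.0.0.0-10.0.0.9 }
        }

        set s {
            type ipv4_addr
            elements = { 192.168.0.1 }
        }
    }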
* evaluate: fix expression data corruption (Florian Westphal, 2025-03-12, 1 file, -11/+18)

  Sometimes nftables will segfault when doing error-unwind of the included
  afl-generated bogon.

  The problem is the unconditional write access to expr->set_flags in
  expr_evaluate_map():

    mappings->set_flags |= NFT_SET_MAP;

  ... but mappings can point to EXPR_VARIABLE (legal), where this will flip a
  bit in unused but allocated memory (i.e., has no effect).

  In case of the bogon, the mapping is EXPR_RANGE_SYMBOL, and the store can
  flip a bit in identifier_range[1], which causes a crash when the pointer is
  freed.

  We can't use expr->set_flags unconditionally, so rework this to pass
  set_flags as an argument and place all read and write accesses in places
  where we've made sure we are dealing with EXPR_SET.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't crash if range has same start and end interval (Florian Westphal, 2025-03-10, 1 file, -0/+5)

  In this case, the evaluation step replaces the range expression with a
  single value, and we'd crash as range->left/right contain garbage values.

  Simply replace the input expression with the evaluation result. Also add a
  test case modeled on the afl reproducer.

  Fixes: fe6cc0ad29cd ("evaluate: consolidate evaluation of symbol range expression")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
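  The degenerate range in question has identical start and end values, e.g.
  (a hypothetical element, not the afl reproducer); evaluation collapses it
  to the single value 10.0.0.1:

    table ip t {
        set s {
            type ipv4_addr
            flags interval
            elements = { 10.0.0.1-10.0.0.1 }
        }
    }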
* src: fix reset element support for interval set type (Florian Westphal, 2025-03-07, 1 file, -0/+1)

  Running the reset command on an interval (rbtree) set yields:

    nft reset element inet filter rbtreeset {1.2.3.4}
    BUG: unhandled op 8

  This is easy to fix: CMD_RESET doesn't add or remove, so it should be
  treated like CMD_GET.

  Unfortunately, this still doesn't work properly:

    nft get element inet filter rbset {1.2.3.4}

  returns:

    ... elements = { 1.2.3.4 }

  but it is expected that "get" and "reset" also return stateful objects
  associated with the element. This works for other set types, but for
  rbtree, the list of statements gets lost during segtree processing.

  After the fix, get/reset returns:

    elements = { 1.2.3.4 counter packets 10 ...

  A follow up patch will add a test case.

  Fixes: 83e0f4402fb7 ("Implement 'reset {set,map,element}' commands")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: support for bitfield payload statement with binary operation (Pablo Neira Ayuso, 2025-03-07, 1 file, -2/+63)

  Update bitfield payload statement support to allow for bitwise and/or/xor
  updates. Adjust the payload expression to fetch 16 bits for mangling while
  leaving unmodified bits intact.

    # nft --debug=netlink add rule x y ip dscp set ip dscp or 0x1
    ip x y
      [ payload load 2b @ network header + 0 => reg 1 ]
      [ bitwise reg 1 = ( reg 1 & 0x0000fbff ) ^ 0x00000400 ]
      [ payload write reg 1 => 2b @ network header + 0 csum_type 1 csum_off 10 csum_flags 0x0 ]

  Skip the expr_evaluate_bits() transformation since it is only useful for
  payload matching and set lookups.

  Listing still shows a raw expression:

    # nft list ruleset
    ...
    @nh,8,5 set 0x0

  The follow up patch completes it: ("netlink_delinearize: support for
  bitfield payload statement with binary operation")

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1698
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: reject unsupported expressions in payload statement for bitfields (Pablo Neira Ayuso, 2025-03-07, 1 file, -1/+2)

  The payload statement evaluation pretends that it can handle any expression
  for bitfields, but the existing evaluation code only knows how to handle
  value expressions.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: simplify payload statement evaluation for bitfields (Pablo Neira Ayuso, 2025-03-07, 1 file, -14/+7)

  Instead of allocating an lshift expression and relying on the binary
  operation transfer to propagate this to the mask value, left-shift the mask
  value immediately.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: release existing datatype when evaluating unary expression (Pablo Neira Ayuso, 2025-03-07, 1 file, -1/+1)

  Use __datatype_set() to release the existing datatype before assigning the
  new one, otherwise ASAN reports the following memleak:

    Direct leak of 104 byte(s) in 1 object(s) allocated from:
        #0 0x7fbc8a2b89cf in __interceptor_malloc ../../../../src/libsa
        #1 0x7fbc898c96c2 in xmalloc src/utils.c:31
        #2 0x7fbc8971a182 in datatype_clone src/datatype.c:1406
        #3 0x7fbc89737c35 in expr_evaluate_unary src/evaluate.c:1366
        #4 0x7fbc89758ae9 in expr_evaluate src/evaluate.c:3057
        #5 0x7fbc89726bd9 in byteorder_conversion src/evaluate.c:243
        #6 0x7fbc89739ff0 in expr_evaluate_bitwise src/evaluate.c:1491
        #7 0x7fbc8973b4f8 in expr_evaluate_binop src/evaluate.c:1600
        #8 0x7fbc89758b01 in expr_evaluate src/evaluate.c:3059
        #9 0x7fbc8975ae0e in stmt_evaluate_arg src/evaluate.c:3198
        #10 0x7fbc8975c51d in stmt_evaluate_payload src/evaluate.c:330

  Fixes: faa6908fad60 ("evaluate: clone unary expression datatype to deal with dynamic datatype")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: consolidate evaluation of symbol range expression (Pablo Neira Ayuso, 2025-02-26, 1 file, -22/+21)

  Expand symbol_range to range expression to consolidate evaluation.

  I found a bug when testing for negative ranges:

    test.nft:5:16-30: Error: Could not process rule: File exists
    elements = { 1.1.1.1-1.1.1.0 }
                 ^^^^^^^^^^^^^^^

  after this patch, error reporting has been restored:

    test.nft:5:16-30: Error: Range negative size
    elements = { 1.1.1.1-1.1.1.0 }
                 ^^^^^^^^^^^^^^^

  Fixes: 347039f64509 ("src: add symbol range expression to further compact intervals")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: optimize zero length range (Pablo Neira Ayuso, 2025-02-26, 1 file, -3/+9)

  A rule like the following:

    ... tcp dport 22-22 ...

  results in a range expression to match from 22 to 22. Simplify this to a
  singleton value so a cmp is used instead.

  This optimization already exists for set elements, which might explain this
  oversight.

  Fixes: 7a6e16040d65 ("evaluate: allow for zero length ranges")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: auto-merge is only available for singleton interval sets (Pablo Neira Ayuso, 2025-02-21, 1 file, -0/+3)

  auto-merge is only available for interval sets whose key is a single value;
  untoggle this flag for concatenations with intervals. Later, this can be
  hardened to reject it.

  Fixes: 30f667920601 ("src: add 'auto-merge' option to sets")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
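  In other words, a set of this shape (a hypothetical sketch) now has
  auto-merge cleared, since its key is a concatenation rather than a single
  interval value:

    table ip t {
        set s {
            typeof ip saddr . tcp dport
            flags interval
            auto-merge
        }
    }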
* src: add symbol range expression to further compact intervals (Pablo Neira Ayuso, 2025-02-21, 1 file, -0/+47)

  Update the parser to use a new symbol range expression with a smaller memory
  footprint than a range expression plus two symbol expressions. The
  evaluation step translates this into EXPR_RANGE_VALUE for interval sets.

  Note that maps and concatenations still use the less compact range
  expression representation; those require more work to use this new symbol
  range expression. The parser also uses the classic range expression if
  variables are used.

  Testing with 100k intervals, worst case scenario: no prefix or singleton
  elements. This shows a reduction from 49.58 Mbytes to 35.47 Mbytes (-29.56%
  memory footprint for this case).

  This follows up on previous commits:

    91dc281a82ea ("src: rework singleton interval transformation to reduce memory consumption")
    c9ee9032b0ee ("src: add EXPR_RANGE_VALUE expression and use it")

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: allow to re-use existing metered set (Florian Westphal, 2025-01-29, 1 file, -9/+34)

  Blamed commit translates the old meter syntax (which used to allocate an
  anonymous set) to dynamic sets.

  A side effect of this is that re-adding a meter rule after the chain was
  flushed results in an error; unlike anonymous sets, named sets are not
  impacted by the flush.

  Refine this: if a set of the same name exists and is compatible, then re-use
  it instead of returning an error.

  Also pick up the reproducer kindly provided by the reporter and place it in
  the shell test directory.

  Fixes: b8f8ddfff733 ("evaluate: translate meter into dynamic set")
  Reported-by: Yi Chen <yiche@redhat.com>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: remove variable shadowing (Pablo Neira Ayuso, 2025-01-19, 1 file, -2/+0)

  unsigned int i is already declared in resolve_ll_protocol_conflict(),
  remove it.

  Fixes: 3a734d608131 ("evaluate: don't assert on net/transport header conflict")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow binop expressions with variable right-hand operands (Jeremy Sowden, 2024-12-04, 1 file, -16/+36)

  Hitherto, the kernel has required constant values for the `xor` and `mask`
  attributes of boolean bitwise expressions. This has meant that the
  right-hand operand of a boolean binop must be constant.

  Now that the kernel has support for AND, OR and XOR operations with
  right-hand operands passed via registers, we can relax this restriction.
  Allow non-constant right-hand operands if the left-hand operand is not
  constant, e.g.:

    ct mark & 0xffff0000 | meta mark & 0xffff

  The kernel now supports performing AND, OR and XOR operations directly, on
  one register and an immediate value or on two registers, so we need to be
  able to generate and parse bitwise boolean expressions of this form. If a
  boolean operation has a constant RHS, we continue to send a mask-and-xor
  expression to the kernel.

  Add tests for {ct,meta} mark with variable RHS operands. JSON support is
  also included.

  This requires Linux kernel >= 6.13-rc.

  [ Originally posted as patch 1/8 and 6/8, which have been collapsed and
    simplified to focus on initial {ct,meta} mark support. Tests have been
    extracted from 8/8, including a tests/py fix to payload output due to
    incorrect output in the original patchset. JSON support has been
    extracted from patch 7/8 --pablo ]

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
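  For illustration, a rule of this shape exercises the variable right-hand
  operand path (a hypothetical rule, assuming an "inet t" table with chain
  "c" exists and a >= 6.13-rc kernel; quoted because of the shell
  metacharacters):

    nft add rule inet t c 'ct mark set ct mark & 0xffff0000 | meta mark & 0xffff'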
* cache: consolidate reset command (Pablo Neira Ayuso, 2024-08-26, 1 file, -0/+2)

  The reset command does not utilize the cache infrastructure. This implicitly
  fixes a crash with anonymous sets because elements are not fetched. I
  initially tried to fix it by toggling the missing cache flags, but then ASAN
  reports memleaks.

  To address these issues, rely on Phil's list filtering infrastructure, which
  is expanded to accommodate the filtering requirements of the reset commands,
  such as 'reset table ip' where only the family is sent to the kernel.

  After this update, tests/shell reports a few inconsistencies between reset
  and list commands:

  - reset rules chain t c2
    displays sets, but it should only list the given chain.

  - reset rules table t
    reset rules ip
    do not list elements in the set.

  In both cases, these are fully listing a given table and family; elements
  should be included. The consolidation also ensures list and reset will not
  differ.

  A few more notes:

  - CMD_OBJ_TABLE is used for 'rules family table' from the parser, due to
    the lack of a better enum; the same applies to CMD_OBJ_CHAIN.

  - CMD_OBJ_ELEMENTS still does not use the cache, but the same occurs in the
    CMD_GET command case, which needs to be consolidated.

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1763
  Fixes: 83e0f4402fb7 ("Implement 'reset {set,map,element}' commands")
  Fixes: 1694df2de79f ("Implement 'reset rule' and 'reset rules' commands")
  Tested-by: Eric Garver <eric@garver.life>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: set on expr->len for catchall set elements (Pablo Neira Ayuso, 2024-07-05, 1 file, -1/+11)

  Catchall elements coming from the parser provide expr->len == 0. However,
  the existing mergesort implementation requires expr->len to be set up to
  the length of the set key to properly sort elements.

  In particular, set element deletion leverages such list sorting to find if
  elements exist in the set.

  Fixes: 419d19688688 ("src: add set element catch-all support")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
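  For context, a catch-all element is the '*' key introduced by the commit in
  the Fixes tag; a hypothetical set using one, plus the deletion that relies
  on the sorted element list, might look like:

    table ip t {
        set s {
            type ipv4_addr
            elements = { 10.1.1.1, * }
        }
    }

    nft delete element ip t s '{ * }'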
* src: add string preprocessor and use it for log prefix string (Pablo Neira Ayuso, 2024-06-25, 1 file, -41/+4)

  Add a string preprocessor to identify and replace variables in a string.
  Rework the existing support for variables in log prefix strings to use it.

  Fixes: e76bb3794018 ("src: allow for variables in the log prefix string")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
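  A rough sketch of the kind of ruleset this code path handles, assuming the
  $variable form inside the quoted prefix string (hypothetical names):

    define site = "dmz1"

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            tcp dport 22 log prefix "$site input: "
        }
    }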
* evaluate: Fix incorrect checking of the `base` variable in case of IPV6 (Maks Mishin, 2024-06-03, 1 file, -1/+1)

  Found by RASU JSC.

  Fixes: 2b29ea5f3c3e ("src: ct: add eval part to inject dependencies for ct saddr/daddr")
  Signed-off-by: Maks Mishin <maks.mishinFZ@gmail.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: bogus protocol conflicts in vlan with implicit dependencies (Pablo Neira Ayuso, 2024-06-03, 1 file, -12/+57)

  The following command:

    # nft add rule netdev x y ip saddr 10.1.1.1 icmp type echo-request vlan id set 321

  fails with:

    Error: conflicting link layer protocols specified: ether vs. vlan
    netdev x y ip saddr 10.1.1.1 icmp type echo-request vlan id set 321
                                                        ^^^^^^^

  Users can work around this issue by prepending an explicit match for the
  vlan ethertype field, that is:

    ether type vlan ip saddr 10.1.1.1 ...
    ^-------------^

  but nft should really handle this itself.

  The error above is triggered by the following check in
  resolve_ll_protocol_conflict():

    /* This payload and the existing context don't match, conflict. */
    if (pctx->protocol[base + 1].desc != NULL)
            return 1;

  This check was added by 39f15c243912 ("nft: support listing expressions
  that use non-byte header fields") and f7d5590688a6 ("tests: vlan tests") to
  deal with conflicting link layer protocols, for instance:

    ether type ip vlan id 1

  This matches ethertype ip at offset 12, but then it matches for vlan id at
  offset 14, which is not present given the previous check.

  One possibility is to remove such a check, but then nft does not bail out
  for the example above and it results in bytecode that never matches:

    # nft --debug=netlink netdev x y ether type ip vlan id 10
    netdev x y
      [ meta load iiftype => reg 1 ]
      [ cmp eq reg 1 0x00000001 ]
      [ payload load 2b @ link header + 12 => reg 1 ]   <---- ether type
      [ cmp eq reg 1 0x00000008 ]                       <---- ip
      [ payload load 2b @ link header + 12 => reg 1 ]   <---- ether type
      [ cmp eq reg 1 0x00000081 ]                       <---- vlan
      [ payload load 2b @ link header + 14 => reg 1 ]
      [ bitwise reg 1 = ( reg 1 & 0x0000ff0f ) ^ 0x00000000 ]
      [ cmp eq reg 1 0x00000a00 ]

  This is due to resolve_ll_protocol_conflict(), which deals with the
  conflict by updating the protocol context and emitting an implicit
  dependency, but there is already an explicit match coming from the user.

  This patch adds a new helper function to check if an implicit dependency
  clashes with an existing statement, which results in:

    # nft add rule netdev x y ether type ip vlan id 1
    Error: conflicting statements
    add rule netdev x y ether type ip vlan id 1
                        ^^^^^^^^^^^^^ ~~~~~~~

  Theoretically, no duplicated implicit dependency should ever be emitted if
  the protocol context is correctly handled. Only implicit payload
  expressions are considered at this stage for this conflict check; this
  patch can be extended to deal with other dependency types.

  Fixes: 39f15c243912 ("nft: support listing expressions that use non-byte header fields")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: add support for variables in map expressions (Jeremy Sowden, 2024-05-20, 1 file, -0/+12)

  It is possible to use a variable to initialize a map, which is then used in
  a map statement:

    define dst_map = { ::1234 : 5678 }

    table ip6 nat {
        map dst_map {
            typeof ip6 daddr : tcp dport;
            elements = $dst_map
        }

        chain prerouting {
            ip6 nexthdr tcp redirect to ip6 daddr map @dst_map
        }
    }

  However, if one tries to use the variable directly in the statement:

    define dst_map = { ::1234 : 5678 }

    table ip6 nat {
        chain prerouting {
            ip6 nexthdr tcp redirect to ip6 daddr map $dst_map
        }
    }

  nft rejects it:

    /space/azazel/tmp/ruleset.1067161.nft:5:47-54: Error: invalid mapping expression variable
            ip6 nexthdr tcp redirect to ip6 daddr map $dst_map
                                                  ~~~~~~~~~ ^^^^^^^^

  It also rejects variables in stateful object statements:

    define quota_map = { 192.168.10.123 : "user123", 192.168.10.124 : "user124" }

    table ip nat {
        quota user123 { over 20 mbytes }
        quota user124 { over 20 mbytes }

        chain prerouting {
            quota name ip saddr map $quota_map
        }
    }

  thus:

    /space/azazel/tmp/ruleset.1067161.nft:15:29-38: Error: invalid mapping expression variable
            quota name ip saddr map $quota_map
                                ~~~~~~~~ ^^^^^^^^^^

  Add support for these uses together with some test-cases.

  Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1067161
  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: handle invalid mapping expressions in stateful object statements gracefully (Jeremy Sowden, 2024-05-20, 1 file, -2/+3)

  Currently, they are reported as assertion failures:

    BUG: invalid mapping expression variable
    nft: src/evaluate.c:4618: stmt_evaluate_objref_map: Assertion `0' failed.
    Aborted

  Instead, report them more informatively as errors:

    /space/azazel/tmp/ruleset.1067161.nft:15:29-38: Error: invalid mapping expression variable
            quota name ip saddr map $quota_map
                                ~~~~~~~~ ^^^^^^^^^^

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: display "Range negative size" error (Pablo Neira Ayuso, 2024-03-20, 1 file, -2/+2)

  Zero-length ranges are now allowed; therefore, update the error message to
  refer to negative ranges, which are not possible.

  Fixes: 7a6e16040d65 ("evaluate: allow for zero length ranges")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: reverse cross-day meta hour range (Pablo Neira Ayuso, 2024-03-20, 1 file, -4/+4)

  f8f32deda31d ("meta: Introduce new conditions 'time', 'day' and 'hour'")
  reverses the hour range in case a cross-day range is used, e.g.

    meta hour "03:00"-"14:00" counter accept

  which results in (Sydney, Australia AEDT time):

    meta hour != "14:00"-"03:00" counter accept

  The kernel handles time in UTC, therefore a cross-day range may not be
  obvious according to local time.

  The ruleset listing above is not very intuitive to the reader depending on
  their timezone, therefore complete the netlink delinearize path to reverse
  the cross-day meta range.

  Update the manpage to recommend using a range expression when matching a
  meta hour range. Recommend range expressions for meta time and meta day
  too.

  Extend testcases/listing/meta_time to cover this scenario.

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1737
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: do not merge a set with an erroneous one (Florian Westphal, 2024-03-20, 1 file, -1/+3)

  The included sample causes a crash because we attempt to range-merge a
  prefix expression with a symbolic expression.

  The first set is evaluated, the symbol expression evaluation fails and nft
  queues an error message ("Could not resolve hostname"). However, nft
  continues evaluation.

  nft then encounters the same set definition again and merges the new
  content with the preceding one. But the first set structure is dodgy: it
  still contains the unresolved symbolic expression.

  That then makes nft crash (assert) in the set internals. There are various
  different incarnations of this issue, but the low level set processing code
  does not allow for any partially transformed expressions to still remain.

  Before:

    nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
    BUG: invalid range expression type binop
    nft: src/expression.c:1479: range_expr_value_low: Assertion `0' failed.

  After:

    nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
    invalid_range_expr_type_binop:4:18-25: Error: Could not resolve hostname: Name or service not known
    elements = { 1&.141.0.1 - 192.168.0.2}
                 ^^^^^^^^

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove utf-8 character in printf lines (Florian Westphal, 2024-03-13, 1 file, -6/+6)

  Replace "‘" (UTF-8, 0xe280 0x98) with "'" (ASCII 0x27).

  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: translate meter into dynamic set (Pablo Neira Ayuso, 2024-03-12, 1 file, -9/+14)

  129f9d153279 ("nft: migrate man page examples with `meter` directive to
  sets") already replaced meters by dynamic sets.

  This patch removes the NFT_SET_ANONYMOUS flag from the implicit set that is
  instantiated via meter, so the listing shows a dynamic set instead, which
  is the recommended approach these days.

  Therefore, a batch like this:

    add table t
    add chain t c
    add rule t c tcp dport 80 meter m size 128 { ip saddr timeout 1s limit rate 10/second }

  gets translated to a dynamic set:

    table ip t {
        set m {
            type ipv4_addr
            size 128
            flags dynamic,timeout
        }

        chain c {
            tcp dport 80 update @m { ip saddr timeout 1s limit rate 10/second burst 5 packets }
        }
    }

  The check for the NFT_SET_ANONYMOUS flag is also relaxed for the list and
  flush meter commands:

    # nft list meter ip t m
    table ip t {
        set m {
            type ipv4_addr
            size 128
            flags dynamic,timeout
        }
    }
    # nft flush meter ip t m

  As a side effect, the legacy 'list meter' and 'flush meter' commands allow
  flushing a dynamic set, to retain backward compatibility.

  This patch updates testcases/sets/0022type_selective_flush_0 and
  testcases/sets/0038meter_list_0, as well as the json output, which now uses
  the dynamic set representation.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: permit use of host-endian constant values in set lookup keys (Pablo Neira Ayuso, 2024-02-13, 1 file, -6/+13)

  AFL found the following crash:

    table ip filter {
        map ipsec_in {
            typeof ipsec in reqid . iif : verdict
            flags interval
        }

        chain INPUT {
            type filter hook input priority filter; policy drop;
            ipsec in reqid . 100 @ipsec_in
        }
    }

  Which yields:

    nft: evaluate.c:1213: expr_evaluate_unary: Assertion `!expr_is_constant(arg)' failed.

  All existing test cases with constant values use big endian values, but
  "iif" expects host endian values. As raw values were not supported before,
  concat byteorder conversion doesn't handle constants.

  Fix this:

  1. Add constant handling so that the number is converted in-place, without
     a unary expression.
  2. Add the inverse handling on delinearization for non-interval set types.
     When dissecting the concat data soup, watch for integer constants where
     the datatype indicates a host endian integer.

  Last, extend an existing test case with the afl input to cover in/output.
  A new test case is added to test linearization, delinearization and
  matching.

  Based on an original patch from Florian Westphal; patch subject and
  description written by him.

  Fixes: b422b07ab2f9 ("src: permit use of constant values in set lookup keys")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>