Commit message  Author  Age  Files  Lines
...
* evaluate: don't update cache for anonymous chainsFlorian Westphal2025-03-221-0/+4
| | | | | | | | | | | | Chain lookup needs a name, not a numerical id. After patch, loading bogon gives following errors: Error: No symbol type information a b index 1 10.1.26.a v2: Don't return an error, just make it a no-op (Pablo Neira Ayuso) Fixes: c330152b7f77 ("src: support for implicit chain bindings") Signed-off-by: Florian Westphal <fw@strlen.de>
* json: make sure timeout list is initialisedFlorian Westphal2025-03-211-1/+1
| | | | | | | | | | On parser error, obj_free will iterate this list. Included json bogon crashes due to null deref because list head initialisation did not yet happen. Fixes: c82a26ebf7e9 ("json: Add ct timeout support") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: consolidate connlimit grammar rule for set elementsPablo Neira Ayuso2025-03-211-20/+21
| | | | | | | Define ct_limit_stmt_alloc and ct_limit_args to follow similar idiom that is used for counters. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: consolidate last grammar rule for set elementsPablo Neira Ayuso2025-03-211-21/+18
| | | | | | | Define last_stmt_alloc and last_args to follow similar idiom that is used for counters. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: consolidate quota grammar rule for set elementsPablo Neira Ayuso2025-03-211-26/+23
| | | | | | | Define quota_stmt_alloc and quota_args to follow similar idiom that is used for counters. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: consolidate limit grammar rule for set elementsPablo Neira Ayuso2025-03-211-40/+37
| | | | | | | Define limit_stmt_alloc and limit_args to follow similar idiom that is used for counters. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: consolidate counter grammar rule for set elementsPablo Neira Ayuso2025-03-211-10/+1
| | | | | | Use existing grammar rules to parse counters to simplify parser. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: fix assertion failure with malformed map definitionsFlorian Westphal2025-03-201-1/+4
| | | | | | | | | | | | | | | | | Included bogon triggers: nft: src/evaluate.c:2267: expr_evaluate_mapping: Assertion `set->data != NULL' failed. After this fix, following errors will be shown: Error: unqualified type invalid specified in map definition. Try "typeof expression" instead of "type datatype". map m { ^ map m { ^ Error: map has no mapping data Fixes: 343a51702656 ("src: store expr, not dtype to track data in sets") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: return error if table does not existFlorian Westphal2025-03-201-1/+7
| | | | | | | | | | | | | The bogon triggers segfault due to NULL dereference. Error out and set errno to ENOENT; caller uses strerror() in the errmsg. After fix, loading reproducer results in: /tmp/A:2:1-18: Error: Could not process rule: No such file or directory list table inet p ^^^^^^^^^^^^^^^^^^ Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't allow nat map with specified protocolFlorian Westphal2025-03-201-0/+4
| | | | | | | | | | | | | | | Included bogon asserts: src/netlink_linearize.c:1305: netlink_gen_nat_stmt: Assertion `stmt->nat.proto == NULL' failed. The comment right above the assertion says: nat_stmt evaluation step doesn't allow STMT_NAT_F_CONCAT && stmt->nat.proto. ... except it does allow it. Disable this. Fixes: c68314dd4263 ("src: infer NAT mapping with concatenation from set") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expression: tolerate named set protocol dependencyFlorian Westphal2025-03-201-0/+11
  Included test will fail with:

      /dev/stdin:8:38-52: Error: Transparent proxy support requires transport protocol match
      meta l4proto @protos tproxy to :1088
                           ^^^^^^^^^^^^^^^

  Tolerate a set reference too. Because the set can be empty (or there can be
  removals later), add a fake 0-rhs value.

  This will make pctx_update assign proto_unknown as the transport protocol in
  use. That's enough to avoid the 'requires transport protocol' error.

  v2: restrict it to meta lhs for now (Pablo Neira Ayuso)

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1686
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinerize: add more restrictions on meta nfproto removalFlorian Westphal2025-03-202-18/+68
  We can't remove 'meta nfproto' dependencies in all cases. It's removed for
  the ip/ip6 families, and this works fine.

  But for others, e.g. inet, removal is not as simple. For example

      meta nfproto ipv4 ct protocol tcp

  is listed as 'ct protocol tcp', even when this is used in the inet table.

  Meta L4PROTO removal checks were correct, but refactor this into a helper
  function to split meta/ct checks from the common calling function.

  The ct check was lacking: we need to examine ct keys more closely to figure
  out if they need to retain the network protocol dependency or not.

  Elide for NFT_CT_SRC/DST and its variants, as those imply the network
  protocol to use; all others must keep it as-is.

  Also extend test coverage for this.

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1783
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: reject non-serializeable typeof expressionsFlorian Westphal2025-03-201-4/+10
| | | | | | | | | | | | | | | Included bogon asserts with: BUG: unhandled key type 13 nft: src/intervals.c:73: setelem_expr_to_range: Assertion `0' failed. This should be rejected at parser stage, but the check for udata support was only done on the first item in a concatenation. After fix, parser rejects this with: Error: primary expression type 'symbol' lacks typeof serialization Fixes: 6e48df5329ea ("src: add "typeof" build/parse/print support") Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink: fix stack buffer overrun when emitting ranged expressionsFlorian Westphal2025-03-181-15/+35
  Included bogon input generates the following Sanitizer splat:

      AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7...
      WRITE of size 2 at 0x7fffffffcbe4 thread T0
      #0 0x0000003a68b8 in __asan_memset (src/nft+0x3a68b8) (BuildId: 3678ff51a5405c77e3e0492b9a985910efee73b8)
      #1 0x0000004eb603 in __mpz_export_data src/gmputil.c:108:2
      #2 0x0000004eb603 in netlink_export_pad src/netlink.c:256:2
      #3 0x0000004eb603 in netlink_gen_range src/netlink.c:471:2
      #4 0x0000004ea250 in __netlink_gen_data src/netlink.c:523:10
      #5 0x0000004e8ee3 in alloc_nftnl_setelem src/netlink.c:205:3
      #6 0x0000004d4541 in mnl_nft_setelem_batch src/mnl.c:1816:11

  The problem is that the range end is emitted to the buffer at the *padded*
  location (rounded up to the next register size), but buffer sizing is based
  on the expression length, not the padded length.

  Also extend the test script: capture stderr and, if we see an
  AddressSanitizer warning, make it fail.

  Same bug as the one fixed in 600b84631410 ("netlink: fix stack buffer
  overflow with sub-reg sized prefixes"), just in a different function.

  Apply the same fix: no dynamic array + add a range check.

  Joint work with Pablo Neira Ayuso.

  Signed-off-by: Florian Westphal <fw@strlen.de>
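  The sizing mismatch described above can be illustrated with a small sketch. This is not the code from src/netlink.c: `REG32_SIZE`, `padded_len()` and `emit_range()` are hypothetical names, the 4-byte register width is an assumption, and the buffer layout is simplified. It only shows why a range check has to use the padded length rather than the expression length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed register width in bytes (illustrative, not taken from the patch). */
#define REG32_SIZE 4u

/* Round len up to the next multiple of the register size. */
static size_t padded_len(size_t len)
{
	return (len + REG32_SIZE - 1) / REG32_SIZE * REG32_SIZE;
}

/*
 * Hypothetical helper: store a range [from, to] in a fixed-size buffer,
 * with the range end placed at the padded offset.
 * Returns the number of bytes used, or 0 if the padded layout does not fit.
 */
static size_t emit_range(uint8_t *buf, size_t buflen,
			 const uint8_t *from, const uint8_t *to, size_t len)
{
	size_t pad = padded_len(len);

	/* The check uses the padded length, not the expression length. */
	if (pad > buflen / 2)
		return 0;

	memset(buf, 0, 2 * pad);
	memcpy(buf, from, len);
	memcpy(buf + pad, to, len);	/* range end lands at the padded offset */
	return 2 * pad;
}
```

  With a 2-byte expression, a buffer sized as 2 * len (4 bytes) is rejected by this check, while a buffer sized as 2 * padded_len(len) (8 bytes) is accepted.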
* src: replace struct stmt_ops by type field in struct stmtPablo Neira Ayuso2025-03-1812-69/+130
  Shrink struct stmt by 8 bytes.

  __stmt_ops_by_type() provides an operation for STMT_INVALID since this is
  required by -o/--optimize.

  There are many checks for stmt->ops->type, which is the most accessed field,
  that can be trivially replaced.

  BUG() uses statement type enum instead of name.

  Similar to:

      68e76238749f ("src: expr: add and use expr_name helper").
      72931553828a ("src: expr: add expression etype")
      2cc91e6198e7 ("src: expr: add and use internal expr_ops helper")

  Acked-by: Florian Westphal <fw@strlen.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: print set element with multi-word description in single one linePablo Neira Ayuso2025-03-183-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the set element: - represents a mapping - has a timeout - has a comment - has counter/quota/limit - concatenation (already printed in a single line before this patch) ie. if the set element requires several words, then print it in one single line. Before this patch: table ip x { set y { typeof ip saddr counter elements = { 192.168.10.35 counter packets 0 bytes 0, 192.168.10.101 counter packets 0 bytes 0, 192.168.10.135 counter packets 0 bytes 0 } } } After this patch: table ip x { set y { typeof ip saddr counter elements = { 192.168.10.35 counter packets 0 bytes 0, 192.168.10.101 counter packets 0 bytes 0, 192.168.10.135 counter packets 0 bytes 0 } } } Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: move interval flag compat check after set key evaluationFlorian Westphal2025-03-181-3/+3
  Without this, included bogon asserts with:

      BUG: unhandled key type 13
      nft: src/intervals.c:73: setelem_expr_to_range: Assertion `0' failed.

  ... because we no longer evaluate set->key/data.

  Move the check to the tail of the function, right before assigning
  set->existing_set, so that set->key has been evaluated.

  Fixes: ceab53cee499 ("evaluate: don't allow merging interval set/map with non-interval one")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't allow merging interval set/map with non-interval oneFlorian Westphal2025-03-131-7/+11
| | | | | | | | | | | Included bogon asserts with: BUG: invalid data expression type range_value Pablo says: "Reject because flags interval is lacking". Make it so. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: fix expression data corruptionFlorian Westphal2025-03-121-11/+18
| | | | | | | | | | | | | | | | | | | | | | | Sometimes nftables will segfault when doing error-unwind of the included afl-generated bogon. The problem is the unconditional write access to expr->set_flags in expr_evaluate_map(): mappings->set_flags |= NFT_SET_MAP; ... but mappings can point to EXPR_VARIABLE (legal), where this will flip a bit in unused, but allocated memory (i.e., has no effect). In case of the bogon, mapping is EXPR_RANGE_SYMBOL, and the store can flip a bit in identifier_range[1], this causes crash when the pointer is freed. We can't use expr->set_flags unconditionally, so rework this to pass set_flags as argument and place all read and write accesses in places where we've made sure we are dealing with EXPR_SET. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
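  The failure mode above (writing a member of a union that is not the active one) can be sketched as follows. The types here are deliberately simplified stand-ins, not the nftables `struct expr`; `EX_SET`, `EX_RANGE_SYMBOL`, `SET_F_MAP` and `expr_mark_map()` are hypothetical names chosen for illustration. The idea is only that the flag store happens after the expression type has been checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for an expression type tag (hypothetical names). */
enum expr_type { EX_SET, EX_VARIABLE, EX_RANGE_SYMBOL };

struct expr {
	enum expr_type etype;
	union {
		struct { uint32_t set_flags; } set;		/* valid for EX_SET only */
		struct { const char *ident[2]; } range_symbol;	/* valid for EX_RANGE_SYMBOL only */
	};
};

/* Illustrative flag value, not the kernel's definition. */
#define SET_F_MAP 0x8u

/*
 * Set the map flag only when the expression really is a set.
 * An unconditional store would overwrite whatever other union
 * member shares the same bytes, e.g. a pointer that is freed later.
 */
static bool expr_mark_map(struct expr *e)
{
	if (e->etype != EX_SET)
		return false;

	e->set.set_flags |= SET_F_MAP;
	return true;
}
```

  For a non-set expression the helper leaves the union untouched, which is the property the fix restores by confining set_flags accesses to code paths that deal with EXPR_SET.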
* netlink_linearize: reduce register waste with non-constant binop expressionsPablo Neira Ayuso2025-03-101-0/+1
  Register use is not good with bitwise operations that involve three or more
  selectors, e.g.

      mark set ip dscp and 0x3 or ct mark or meta mark
      [ payload load 1b @ network header + 1 => reg 1 ]
      [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
      [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
      [ bitwise reg 1 = ( reg 1 & 0x00000003 ) ^ 0x00000000 ]
      [ ct load mark => reg 2 ]
      [ bitwise reg 1 = ( reg 1 | reg 2 ) ]
      [ meta load mark => reg 3 ] <--- this could use register 2 instead!
      [ bitwise reg 1 = ( reg 1 | reg 3 ) ]
      [ meta set mark with reg 1 ]

  Register 3 is used to store meta mark; however, register 2 can already be
  reused, since register 1 already stores the partial result of the bitwise
  operation for this expression.

  After this fix:

      [ payload load 1b @ network header + 1 => reg 1 ]
      [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
      [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
      [ bitwise reg 1 = ( reg 1 & 0x00000003 ) ^ 0x00000000 ]
      [ ct load mark => reg 2 ]
      [ bitwise reg 1 = ( reg 1 | reg 2 ) ]
      [ meta load mark => reg 2 ] <--- recycle register 2
      [ bitwise reg 1 = ( reg 1 | reg 2 ) ]
      [ meta set mark with reg 1 ]

  Release the source register in the bitwise operation, given that the
  destination register already stores the partial result of the expression.

  Extend tests/py to cover this.

  Fixes: 54bfc38c522b ("src: allow binop expressions with variable right-hand operands")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't crash if range has same start and end intervalFlorian Westphal2025-03-101-0/+5
| | | | | | | | | | | | | | In this case, evaluation step replaces the range expression with a single value and we'd crash as range->left/right contain garbage values. Simply replace the input expression with the evaluation result. Also add a test case modeled on the afl reproducer. Fixes: fe6cc0ad29cd ("evaluate: consolidate evaluation of symbol range expression") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: incomplete output in get element command with mapsPablo Neira Ayuso2025-03-071-19/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get element command displays an incomplete range. Using this simple test ruleset: table ip x { map y { typeof ip saddr : meta mark counter flags interval,timeout elements = { 1.1.1.1-1.1.1.10 timeout 10m : 20, 2.2.2.2-2.2.2.5 timeout 10m : 30} } then, invoking the get element command: # nft get element x y { 1.1.1.2 } results in, before (incomplete output): table ip x { map y { type ipv4_addr : mark flags interval,timeout elements = { 1.1.1.1 counter packets 0 bytes 0 timeout 10m expires 1m24s160ms : 0x00000014 } } } Note that it displays 1.1.1.1, instead of 1.1.1.1-1.1.1.10. After this fix: table ip x { map y { type ipv4_addr : mark flags interval,timeout elements = { 1.1.1.1-1.1.1.10 counter packets 0 bytes 0 timeout 10m expires 1m24s160ms : 0x00000014 } } } Fixes: a43cc8d53096 ("src: support for get element command") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix reset element support for interval set typeFlorian Westphal2025-03-072-6/+31
  Running the reset command on an interval (rbtree) set yields:

      nft reset element inet filter rbtreeset {1.2.3.4}
      BUG: unhandled op 8

  This is easy to fix: CMD_RESET doesn't add or remove, so it should be
  treated like CMD_GET.

  Unfortunately, this still doesn't work properly:

      nft get element inet filter rbset {1.2.3.4}

  returns:

      ...
      elements = { 1.2.3.4 }

  but it's expected that "get" and "reset" also return stateful objects
  associated with the element. This works for other set types, but for
  rbtree, the list of statements gets lost during segtree processing.

  After fix, get/reset returns:

      elements = { 1.2.3.4 counter packets 10 ...

  A follow up patch will add a test case.

  Fixes: 83e0f4402fb7 ("Implement 'reset {set,map,element}' commands")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: support for bitfield payload statement with binary operationPablo Neira Ayuso2025-03-072-2/+184
  Add a new function to deal with payload statement delinearization with binop
  expression.

  Infer the payload offset from the mask, then walk the template list to
  determine if the estimated offset falls within a matching header field. If
  so, then validate that this is not a raw expression but an actual bitfield
  matching. Finally, trim the payload expression length accordingly and adjust
  the payload offset.

  Instead of:

      @nh,8,5 set 0x0

  it displays:

      ip dscp and 0x1

  Update tests/py to cover this enhancement.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: support for bitfield payload statement with binary operationPablo Neira Ayuso2025-03-071-2/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update bitfield payload statement support to allow for bitwise and/or/xor updates. Adjust payload expression to fetch 16-bits for mangling while leaving unmodified bits intact. # nft --debug=netlink add rule x y ip dscp set ip dscp or 0x1 ip x y [ payload load 2b @ network header + 0 => reg 1 ] [ bitwise reg 1 = ( reg 1 & 0x0000fbff ) ^ 0x00000400 ] [ payload write reg 1 => 2b @ network header + 0 csum_type 1 csum_off 10 csum_flags 0x0 ] Skip expr_evaluate_bits() transformation since these are only useful for payload matching and set lookups. Listing still shows a raw expression: # nft list ruleset ... @nh,8,5 set 0x0 The follow up patch completes it: ("netlink_delinearize: support for bitfield payload statement with binary operation") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1698 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
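  The bytecode in this message expresses the "or" through the generic bitwise form `( reg & mask ) ^ xor`. A minimal sketch of that arithmetic follows; `bitwise_mask_xor()` and `or_via_mask_xor()` are hypothetical helper names, and the sketch ignores byte order and register handling. It relies on the identity (x & ~m) ^ m == x | m.

```c
#include <stdint.h>

/* Generic bitwise form as printed in the netlink debug output. */
static uint32_t bitwise_mask_xor(uint32_t reg, uint32_t mask, uint32_t xor_val)
{
	return (reg & mask) ^ xor_val;
}

/*
 * OR with `bits`, written as mask/xor: first clear the bits,
 * then flip them, which leaves them set.
 */
static uint32_t or_via_mask_xor(uint32_t reg, uint32_t bits)
{
	return bitwise_mask_xor(reg, ~bits, bits);
}
```

  With mask 0x0000fbff and xor 0x00000400, as in the listing above, bit 0x0400 of the loaded 16 bits ends up set while the remaining bits are preserved.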
* evaluate: reject unsupported expressions in payload statement for bitfieldsPablo Neira Ayuso2025-03-071-1/+2
| | | | | | | | The payload statement evaluation pretends that it can handle any expression for bitfields, but the existing evaluation code only knows how to handle value expression. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: simplify payload statement evaluation for bitfieldsPablo Neira Ayuso2025-03-071-14/+7
  Instead of allocating a lshift expression and relying on the binary
  operation transfer to propagate it to the mask value, lshift the mask value
  immediately.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: release existing datatype when evaluating unary expressionPablo Neira Ayuso2025-03-071-1/+1
| | | | | | | | | | | | | | | | | | | | | Use __datatype_set() to release the existing datatype before assigning the new one, otherwise ASAN reports the following memleak: Direct leak of 104 byte(s) in 1 object(s) allocated from: #0 0x7fbc8a2b89cf in __interceptor_malloc ../../../../src/libsa #1 0x7fbc898c96c2 in xmalloc src/utils.c:31 #2 0x7fbc8971a182 in datatype_clone src/datatype.c:1406 #3 0x7fbc89737c35 in expr_evaluate_unary src/evaluate.c:1366 #4 0x7fbc89758ae9 in expr_evaluate src/evaluate.c:3057 #5 0x7fbc89726bd9 in byteorder_conversion src/evaluate.c:243 #6 0x7fbc89739ff0 in expr_evaluate_bitwise src/evaluate.c:1491 #7 0x7fbc8973b4f8 in expr_evaluate_binop src/evaluate.c:1600 #8 0x7fbc89758b01 in expr_evaluate src/evaluate.c:3059 #9 0x7fbc8975ae0e in stmt_evaluate_arg src/evaluate.c:3198 #10 0x7fbc8975c51d in stmt_evaluate_payload src/evaluate.c:330 Fixes: faa6908fad60 ("evaluate: clone unary expression datatype to deal with dynamic datatype") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: fix string data initialisationFlorian Westphal2025-03-071-1/+1
  This uses the wrong length. It must re-use the length of the datatype, not
  the string length.

  The added test cases will fail without the fix due to erroneous overlap
  detection, which in itself is due to incorrect sorting of the elements.

  Example error:

      netlink: Error: interval overlaps with an existing one
      add element inet testifsets simple_wild { "2-1" } failed.
      table inet testifsets {
      ...
      elements = { "1-1", "abcdef*", "othername", "ppp0" }

  ... but clearly "2-1" doesn't overlap with any existing members. The false
  detection is because of the "abcdef*" wildcard getting sorted at the
  beginning of the list, which is because it's erroneously initialised as a
  64-bit number instead of 128 bits (16 bytes / IFNAMSIZ).

  Fixes: 5e393ea1fc0a ("segtree: add string "range" reversal support")
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expression: expr_build_udata_recurse should recurseFlorian Westphal2025-03-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | If we see EXPR_BINOP, recurse: ->left can be another EXPR_BINOP. This is irrelevant for 'typeof' named sets, but for anonymous sets, the key is derived from the concat expression that builds the lookup key for the anonymous set. tcp option mptcp subtype . ip daddr { mp-join. 10.0.0.1, .. needs two binops back-to-back: [ exthdr load tcpopt 1b @ 30 + 2 => reg 1 ] [ bitwise reg 1 = ( reg 1 & 0x000000f0 ) ^ 0x00000000 ] [ bitwise reg 1 = ( reg 1 >> 0x00000004 ) ] This bug prevents concat_expr_build_udata() from creating the userdata key at load time. When listing the rules, we get an assertion: nft: src/mergesort.c:23: concat_expr_msort_value: Assertion `ilen > 0' failed. because the set has a key with 0-length integers. Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink_delinearize: also consider exthdr type when trimming binopsFlorian Westphal2025-03-061-1/+9
  This allows trimming the binop for exthdrs. This will make nft render

      (tcp option mptcp unknown & 240) >> 4 . ip saddr @s1

  as

      tcp option mptcp subtype . ip saddr @s1

  Also extend the typeof set tests with a set concatenating a sub-byte-sized
  exthdr expression with a payload one.

  The additional call to expr_postprocess() is needed; without it,
  typeof_sets_0.nft fails because

      frag frag-off @s4 accept

  is shown as

      meta nfproto ipv6 frag frag-off @s4 accept

  Previously, EXPR_EXTHDR would cause payload_binop_postprocess() to return
  false, which will then make the caller invoke expr_postprocess(), but after
  handling EXPR_EXTHDR this doesn't happen anymore.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* expression: propagate key datatype for anonymous setsFlorian Westphal2025-03-061-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | set s { typeof tcp option mptcp subtype elements = { mp-join, dss } } is listed correctly. The set key provides the 'mptcpopt_subtype' information and listing can print all elements with symbolic names. In anon set case this doesn't work: tcp option mptcp subtype { mp-join, dss } is printed as "... subtype { 1, 2}" because the anon set only provides plain integer type. This change propagates the datatype to the individual members of the anon set. After this change, multiple existing data types such as TYPE_ICMP_TYPE could theoretically be replaced by integer-type aliases. However, those datatypes are already exposed to userspace via the 'set type' keyword. Thus removing them will break set definitions that use them. Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopt: add symbol table for mptcp suboptionsFlorian Westphal2025-03-061-1/+29
  nft can be used to match on specific multipath tcp subtypes:

      tcp option mptcp subtype 0

  However, depending on which subtype to match, users need to look up the
  type/value to use in rfc8684. Add support for mnemonics and
  "nft describe tcp option mptcp subtype" to get the subtype list.

  Because the number of unique 'enum datatypes' is limited by ABI constraints,
  this adds a new mptcp suboption type as integer alias.

  After this patch, nft supports all of the following:

      add element t s { mp-capable }
      add rule t c tcp option mptcp subtype mp-capable
      add rule t c tcp option mptcp subtype { mp-capable, mp-fail }

  For the 3rd case, listing will break because unlike for named sets, nft
  lacks the type information needed to pretty-print the integer values, i.e.
  nft will print the 3rd rule as 'subtype { 0, 6 }'. This is resolved in a
  followup patch.

  Other problematic constructs are:

      set s1 {
          typeof tcp option mptcp subtype . ip saddr
          elements = { mp-fail . 1.2.3.4 }
      }

  Followed by:

      tcp option mptcp subtype . ip saddr @s1

  nft will print this as:

      (tcp option mptcp unknown & 240) >> 4 . ip saddr @s1

  None of these issues are related to this patch; they also occur with other
  bit-sized extheader fields.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: don't kill dependency for proto_thFlorian Westphal2025-03-062-2/+15
| | | | | | | | | | | | | | proto_th carries no information about the proto number, we need to preserve the L4 protocol expression unless we can be sure that For example, if "meta l4proto 91 @th,0,16 0" is simplified to "th sport 0", the information of protocol number is lost. Based on initial patch from Xiao Liang. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: honor inner payload description in payload_expr_cmp()Pablo Neira Ayuso2025-02-261-1/+2
| | | | | | | | | | | payload comparison must consider inner_desc. No test update because I could not find any specific bug related to this. I found it through source code inspection. Fixes: 772892a018b4 ("src: add vxlan matching support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: return early if dependency is not a payload expressionFlorian Westphal2025-02-261-1/+2
| | | | | | | | | | | | | | | if (dep->left->payload.base != PROTO_BASE_TRANSPORT_HDR) is legal only after checking that ->left points to an EXPR_PAYLOAD expression. The dependency store can also contain EXPR_META, in this case we access a bogus part of the union. The payload_may_dependency_kill_icmp helper can't handle a META dep either, so return early. Fixes: 533565244d88 ("payload: check icmp dependency before removing previous icmp expression") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: remove double-storeFlorian Westphal2025-02-261-1/+0
| | | | | | | This assignment was duplicated. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: consolidate evaluation of symbol range expressionPablo Neira Ayuso2025-02-261-22/+21
  Expand symbol_range to range expression to consolidate evaluation.

  I found a bug when testing for negative ranges:

      test.nft:5:16-30: Error: Could not process rule: File exists
      elements = { 1.1.1.1-1.1.1.0 }
                   ^^^^^^^^^^^^^^^

  After this patch, error reporting has been restored:

      test.nft:5:16-30: Error: Range negative size
      elements = { 1.1.1.1-1.1.1.0 }
                   ^^^^^^^^^^^^^^^

  Fixes: 347039f64509 ("src: add symbol range expression to further compact intervals")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: optimize zero length rangePablo Neira Ayuso2025-02-261-3/+9
  A rule like the following:

      ... tcp dport 22-22 ...

  results in a range expression to match from 22 to 22.

  Simplify to a singleton value so a cmp is used instead.

  This optimization already exists for set elements, which might explain this
  oversight.

  Fixes: 7a6e16040d65 ("evaluate: allow for zero length ranges")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
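  The simplification described above can be sketched as a small classification step. The names `struct range`, `enum match_kind` and `classify_range()` are hypothetical and do not come from src/evaluate.c; the sketch only shows the decision of emitting a plain comparison when both ends of a range are equal.

```c
#include <stdint.h>

/* Hypothetical, simplified range with inclusive bounds. */
struct range {
	uint32_t low;
	uint32_t high;
};

enum match_kind {
	MATCH_CMP,	/* single value: a plain comparison is enough */
	MATCH_RANGE,	/* real interval: needs a range match */
	MATCH_INVALID,	/* end below start ("Range negative size") */
};

/*
 * Decide how a range should be matched. For a zero-length range
 * the single value is stored in *value and MATCH_CMP is returned.
 */
static enum match_kind classify_range(const struct range *r, uint32_t *value)
{
	if (r->low > r->high)
		return MATCH_INVALID;

	if (r->low == r->high) {
		*value = r->low;
		return MATCH_CMP;
	}

	return MATCH_RANGE;
}
```

  Under these assumptions, `tcp dport 22-22` would be classified as MATCH_CMP with value 22, while `22-80` stays a range.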
* fib: Change data type of fib oifname to "ifname"Xiao Liang2025-02-251-1/+1
| | | | | | | | | | | | | | | Change data type of fib oifname from "string" to "ifname", so that it can be matched against a set of ifnames: set x { type ifname } chain y { fib saddr oifname @x drop } Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* parser_bison: get rid of unneeded statementFlorian Westphal2025-02-251-6/+3
| | | | | | | | Was used for the legacy flow statement, but that was removed in 2ee93ca27ddc ("parser_bison: remove deprecated flow statement") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: auto-merge is only available for singleton interval setsPablo Neira Ayuso2025-02-211-0/+3
| | | | | | | | | | auto-merge is only available to interval sets with one value only, untoggle this flag for concatenation with intervals. Later, this can be hardened to reject it. Fixes: 30f667920601 ("src: add 'auto-merge' option to sets") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: use range expression for OP_EQ and OP_IMPLICITPablo Neira Ayuso2025-02-211-19/+3
| | | | | | | | | | | | | | | | | | range expression is available since v4.9-rc1~127^2~67^2~3, replace the two cmp expression when generating netlink bytecode. Code to delinearize the two cmp expressions to represent the range remains in place for backwards compatibility. The delinearize path to parse range expressions with NFT_OP_EQ is already present since: 3ed932917cc7 ("src: use new range expression for != [a,b] intervals") Update tests/py payload accordingly, json tests need no update since they already use the range to represent them. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add symbol range expression to further compact intervalsPablo Neira Ayuso2025-02-214-1/+105
| | | | | | | | | | | | | | | | | | | | | | | | Update parser to use a new symbol range expression with smaller memory footprint than range expression + two symbol expressions. The evaluation step translates this into EXPR_RANGE_VALUE for interval sets. Note that maps or concatenations still use the less compact range expressions representation, those require more work to use this new symbol range expression. The parser also uses the classic range expression if variables are used. Testing with a 100k intervals, worst case scenario: no prefix or singleton elements. This shows a reduction from 49.58 Mbytes to 35.47 Mbytes (-29.56% memory footprint for this case). This follow up work to previous commits: 91dc281a82ea ("src: rework singleton interval transformation to reduce memory consumption") c9ee9032b0ee ("src: add EXPR_RANGE_VALUE expression and use it") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: compact and simplify list and reset syntaxFlorian Westphal2025-02-211-58/+21
  Works:

      list sets
      list sets inet
      list sets table inet foo

  Doesn't work:

      list sets inet foo

  Same for "list counters", "list quotas", etc.

  The "reset" keyword however supports this:

      reset counters inet foo

  and aliases this to

      reset counters table inet foo

  This is inconsistent and not intuitive.

  Moreover, unlike "list sets", "list maps" only supported "list maps" and
  "list maps inet", without the ability to only list maps of a given table.

  Compact this to unify the syntax so it becomes possible to omit the "table"
  keyword for either reset or list mode.

  flowtables, secmarks and synproxys keywords are updated too.

  "flow table" and "meters" are NOT changed since both of these are deprecated
  in favor of standard nft sets.

  Reported-by: Slavko <linux@slavino.sk>
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: turn redundant ip option type field match into booleanPablo Neira Ayuso2025-02-071-0/+3
  The ip option expression allows for nonsensical matching like:

      ip option lsrr type 1

  Because 'lsrr' already provides the type field, this never results in a
  match. Turn this expression into:

      ip option lsrr exists

  and update documentation to hide this redundant type field.

  Fixes: 226a0e072d5c ("exthdr: add support for matching IPv4 options")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ipopt: use ipv4 address datatype for address field in ip optionsPablo Neira Ayuso2025-02-071-3/+3
  So the user does not have to do integer arithmetic to match on an IPv4
  address.

  Before:

      # nft describe ip option lsrr addr
      exthdr expression, datatype integer (integer), 32 bits

  After:

      # nft describe ip option lsrr addr
      exthdr expression, datatype ipv4_addr (IPv4 address) (basetype integer), 32 bits

  Fixes: 226a0e072d5c ("exthdr: add support for matching IPv4 options")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: clamp boolean value to 0 and 1Pablo Neira Ayuso2025-02-071-0/+24
  If the user provides a numeric value other than 0 or 1, the match never
  happens:

      # nft --debug=netlink add rule x y tcp option sack-perm 4
      ip x y
        [ exthdr load tcpopt 1b @ 4 + 0 present => reg 1 ]
        [ cmp eq reg 1 0x00000004 ]

  After this update:

      # nft --debug=netlink add rule x y tcp option sack-perm 4
      ip x y
        [ exthdr load tcpopt 1b @ 4 + 0 present => reg 1 ]
        [ cmp eq reg 1 0x00000001 ]

  This is to address a rare corner case, in case the user specifies the
  boolean value through the integer base type.

  Fixes: 9fd9baba43c8 ("Introduce boolean datatype and boolean expression")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
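  A minimal sketch of the clamping idea follows. `boolean_clamp()` is a hypothetical name, and mapping every non-zero input to 1 is an assumption drawn from the example above (4 becomes 1), not a statement about how src/datatype.c implements it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: reduce an integer supplied for a boolean
 * field to 0 or 1, so that the generated comparison can match the
 * 0/1 value loaded by a "present" check.
 */
static uint8_t boolean_clamp(uint64_t value)
{
	return value != 0;
}
```

  With this, a user-supplied 4 compares as 1, matching the second bytecode listing above.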
* exthdr: incomplete type 2 routing header definitionPablo Neira Ayuso2025-02-071-2/+12
  Add missing type 2 routing header definition. Listing is not correct because
  these IPv6 extension headers still lack the context to properly delinearize
  the listing, but at least this does not crash anymore.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add and use payload_expr_trim_forceFlorian Westphal2025-02-072-6/+79
  Previous commit fixed erroneous handling of raw expressions when RHS sets a
  zero value.

  Input:

      @ih,58,6 set 0 @ih,86,6 set 0 @ih,170,22 set 0

  Output:

      @ih,48,16 set @ih,48,16 & 0xffc0 @ih,80,16 set \
      @ih,80,16 & 0xfc0f @ih,160,32 set @ih,160,32 & 0xffc00000

  After this patch, this will instead display:

      @ih,58,6 set 0x0 @ih,86,6 set 0x0 @ih,170,22 set 0x0

  payload_expr_trim_force() only works when the payload has no known protocol
  (template) attached, i.e. will be printed as raw payload syntax. It performs
  sanity checks on @mask and then adjusts the payload expression length and
  offset according to the mask.

  Also add this check in __binop_postprocess() so we can also discard masks
  when matching, e.g. '@ih,7,5 2' becomes '@ih,7,5 0x2', not
  '@ih,0,16 & 0xffc0 == 0x20'.

  binop_postprocess now returns whether it performed an action or not; if this
  returns true then arguments might have been freed, so callers must no longer
  refer to any of the expressions attached to the binop.

  Next patch adds test cases for this.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
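  The offset and length adjustment for the "set 0" examples above can be sketched as plain bit arithmetic. This is not payload_expr_trim_force() itself: `mask_to_field()` is a hypothetical helper that assumes the written field corresponds to one contiguous run of zero bits in the mask, counted from the most significant bit of the loaded payload.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch: given a payload of `len` bits at bit offset
 * `base_off` and a mask whose zero bits mark the field being written,
 * compute the offset and length of that field.
 * Returns false if len is unsupported or the zero bits are not one
 * contiguous run.
 */
static bool mask_to_field(unsigned int base_off, unsigned int len,
			  uint32_t mask,
			  unsigned int *off, unsigned int *flen)
{
	unsigned int lead = 0, run = 0;
	uint32_t all, field;

	if (len == 0 || len > 32)
		return false;

	all = len == 32 ? UINT32_MAX : (UINT32_C(1) << len) - 1;
	field = ~mask & all;		/* bits that belong to the field */
	if (field == 0)
		return false;

	/* Count bits in front of the field, starting at the MSB. */
	while (!(field & (UINT32_C(1) << (len - 1 - lead))))
		lead++;

	/* Count the length of the field itself. */
	while (lead + run < len &&
	       (field & (UINT32_C(1) << (len - 1 - lead - run))))
		run++;

	/* Anything left below the run means the field is not contiguous. */
	if (lead + run < len &&
	    (field & ((UINT32_C(1) << (len - lead - run)) - 1)))
		return false;

	*off = base_off + lead;
	*flen = run;
	return true;
}
```

  Applied to the three outputs above, (48, 16, 0xffc0), (80, 16, 0xfc0f) and (160, 32, 0xffc00000) map back to @ih,58,6, @ih,86,6 and @ih,170,22 respectively.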