path: root/src
Commit message | Author | Age | Files | Lines
* src: permit use of constant values in set lookup keys | Florian Westphal | 2023-05-24 | 2 | -0/+15
    Given:

        set s { type ipv4_addr . ipv4_addr . inet_service .. }

    something like

        add rule ip saddr . 1.2.3.4 . 80 @s goto c1

    fails with: "Error: Can't parse symbolic invalid expressions".

    This fails because the relational expression first evaluates the left-hand side, so when concat evaluation sees '1.2.3.4' no key context is available.

    Check if the RHS is a set reference and, if so, evaluate the right-hand side first. This sets a pointer to the set key in the evaluation context structure, which then makes the concat evaluation step parse 1.2.3.4 and 80 as an IPv4 address and a 16-bit port number.

    On delinearization, extend relop postprocessing to copy the datatype from the rhs (set reference, has proper datatype according to set->key) to the lhs (concat expression).

    Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: support bpf id decode in nft list hooks | Florian Westphal | 2023-05-22 | 1 | -0/+40
    This allows 'nft list hooks' to also display the id of the attached bpf program.

    Example:

        hook input {
                -0000000128 nf_hook_run_bpf id 6
        ..

    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: set NFT_SET_EVAL flag if dynamic set already exists | Pablo Neira Ayuso | 2023-05-18 | 1 | -0/+8
    nft reports EEXIST when reading an existing set whose NFT_SET_EVAL has been previously inferred from the ruleset.

        # cat test.nft
        table ip test {
                set dlist {
                        type ipv4_addr
                        size 65535
                }

                chain output {
                        type filter hook output priority filter; policy accept;
                        udp dport 1234 update @dlist { ip daddr } counter packets 0 bytes 0
                }
        }
        # nft -f test.nft
        # nft -f test.nft
        test.nft:2:6-10: Error: Could not process rule: File exists
                set dlist {
                    ^^^^^

    Phil Sutter says:

    In the first call, the set lacking 'dynamic' flag does not exist and is therefore added to the cache. Consequently, both the 'add set' command and the set statement point at the same set object. In the second call, a set with same name exists already, so the object created for 'add set' command is not added to cache and consequently not updated with the missing flag. The kernel thus rejects the NEWSET request as the existing set differs from the new one.

    Set the NFT_SET_EVAL flag on the new set if the existing set has it set.

    Fixes: 8d443adfcc8c1 ("evaluate: attempt to set_eval flag if dynamic updates requested")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: add hint error handler | Pablo Neira Ayuso | 2023-05-11 | 1 | -2/+39
    If the user provides a symbol that cannot be parsed and the datatype provides an error handler, provide a hint through the misspell infrastructure.

    For instance:

        # cat test.nft
        table ip x {
                map y {
                        typeof ip saddr : verdict
                        elements = { 1.2.3.4 : filter_server1 }
                }
        }
        # nft -f test.nft
        test.nft:4:26-39: Error: Could not parse netfilter verdict; did you mean `jump filter_server1'?
                        elements = { 1.2.3.4 : filter_server1 }
                                               ^^^^^^^^^^^^^^

    While at it, normalize error to "Could not parse symbolic %s expression".

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: misspell support with symbol table parser for error reporting | Pablo Neira Ayuso | 2023-05-11 | 1 | -2/+48
    Some datatypes provide a symbol table that is parsed as an integer. Improve error reporting by using the misspell infrastructure to provide a hint to the user whenever possible.

    If the base datatype, usually the integer datatype, fails to parse the symbol, then try a fuzzy match on the symbol table to provide a hint in case the user has mistyped it.

    For instance:

        test.nft:3:11-14: Error: Could not parse Differentiated Services Code Point expression; did you you mean `cs0`?
                ip dscp ccs0
                        ^^^^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
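The fuzzy-match step in this commit message can be illustrated with a small sketch. It is not nft's misspell code: `edit_distance`, `suggest_symbol`, the sample symbol table and the distance threshold are all hypothetical, and plain Levenshtein distance stands in for whatever metric the real infrastructure uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Plain Levenshtein distance between two short strings (hypothetical
 * helper; inputs of 64 bytes or more are rejected by the assert). */
unsigned int edit_distance(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b);
	unsigned int row[64];

	assert(la < 64 && lb < 64);

	for (size_t j = 0; j <= lb; j++)
		row[j] = (unsigned int)j;

	for (size_t i = 1; i <= la; i++) {
		unsigned int diag = row[0];	/* D[i-1][j-1] */

		row[0] = (unsigned int)i;
		for (size_t j = 1; j <= lb; j++) {
			unsigned int up = row[j];	/* D[i-1][j] */
			unsigned int best = diag + (a[i - 1] != b[j - 1]);

			if (up + 1 < best)
				best = up + 1;
			if (row[j - 1] + 1 < best)
				best = row[j - 1] + 1;
			row[j] = best;
			diag = up;
		}
	}
	return row[lb];
}

/* Return the closest symbol within max_dist edits, or NULL if none is
 * close enough to be worth suggesting. */
const char *suggest_symbol(const char *const *table, size_t n,
			   const char *input, unsigned int max_dist)
{
	const char *best = NULL;
	unsigned int best_dist = max_dist + 1;

	for (size_t i = 0; i < n; i++) {
		unsigned int d = edit_distance(input, table[i]);

		if (d < best_dist) {
			best_dist = d;
			best = table[i];
		}
	}
	return best;
}
```

With a table containing `cs0` and `cs1`, the input `ccs0` is one edit away from `cs0` and two from `cs1`, so `cs0` would be the hint. Returning NULL for distant inputs avoids printing a misleading suggestion.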
* optimize: do not remove counter in verdict maps | Pablo Neira Ayuso | 2023-05-10 | 1 | -7/+43
    Add counter to set element instead of dropping it:

        # nft -c -o -f test.nft
        Merging:
        test.nft:6:3-50: ip saddr 1.1.1.1 ip daddr 2.2.2.2 counter accept
        test.nft:7:3-48: ip saddr 1.1.1.2 ip daddr 3.3.3.3 counter drop
        into:
                ip daddr . ip saddr vmap { 2.2.2.2 . 1.1.1.1 counter : accept, 3.3.3.3 . 1.1.1.2 counter : drop }

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: skip optimization if anonymous set uses stateful statement | Pablo Neira Ayuso | 2023-05-10 | 1 | -1/+1
    fee6bda06403 ("evaluate: remove anon sets with exactly one element") introduces an optimization to remove use of sets with a single element. Skip this optimization if the set element contains stateful statements.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: allow stateful statements with anonymous verdict maps | Pablo Neira Ayuso | 2023-05-10 | 1 | -1/+2
    Evaluation fails to accept stateful statements in verdict maps. Relax the following check for anonymous sets:

        test.nft:4:29-35: Error: missing statement in map declaration
                ip saddr vmap { 127.0.0.1 counter : drop, * counter : accept }
                                          ^^^^^^^

    The existing code correctly generates the counter in the anonymous verdict map.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: restore typeof interval map data type | Florian Westphal | 2023-05-02 | 1 | -1/+6
    When "typeof ... : interval ..." gets used, existing logic failed to validate the expressions.

    "interval" means that the kernel reserves twice the size, so consider this when validating and restoring.

    Also fix up the dump file of the existing test case to be symmetrical.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* meta: introduce meta broute support | Sriram Yagnaraman | 2023-04-29 | 1 | -0/+2
    Can be used in the bridge prerouting hook to divert a packet to the IP stack for routing.

    This is a replacement for "ebtables -t broute" functionality.

    Link: https://patchwork.ozlabs.org/project/netfilter-devel/patch/20230224095251.11249-1-sriram.yagnaraman@est.tech/
    Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* json: formatting fixes | Jeremy Sowden | 2023-04-29 | 1 | -21/+20
    A few indentation tweaks for the JSON parser.

    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* src: fix enum/integer mismatches | Florian Westphal | 2023-04-29 | 2 | -3/+3
    gcc 13 complains about type confusion:

        cache.c:1178:5: warning: conflicting types for 'nft_cache_update' due to enum/integer mismatch; have 'int(struct nft_ctx *, unsigned int, struct list_head *, const struct nft_cache_filter *)' [-Wenum-int-mismatch]
        cache.h:74:5: note: previous declaration of 'nft_cache_update' with type 'int(struct nft_ctx *, enum cmd_ops, struct list_head *, const struct nft_cache_filter *)'

    Same for:

        rule.c:1915:13: warning: conflicting types for 'obj_type_name' due to enum/integer mismatch; have 'const char *(enum stmt_types)' [-Wenum-int-mismatch]
         1915 | const char *obj_type_name(enum stmt_types type)
              |             ^~~~~~~~~~~~~
        expression.c:1543:24: warning: conflicting types for 'expr_ops_by_type' due to enum/integer mismatch; have 'const struct expr_ops *(uint32_t)' {aka 'const struct expr_ops *(unsigned int)'} [-Wenum-int-mismatch]
         1543 | const struct expr_ops *expr_ops_by_type(uint32_t value)
              |                        ^~~~~~~~~~~~~~~~

    Convert to the stricter type (enum) where possible.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: incomplete extended error reporting for singleton device in chain | Pablo Neira Ayuso | 2023-04-25 | 1 | -0/+1
    Fix error reporting when a single device is specified in a chain:

        # nft add chain netdev filter ingress '{ devices = { x }; }'
        add chain netdev filter ingress { devices = { x }; }
                                                      ^

    Fixes: a66b5ad9540d ("src: allow for updating devices on existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: handle singleton element in netdevice set | Pablo Neira Ayuso | 2023-04-25 | 1 | -14/+32
    expr_evaluate_set() turns sets with a singleton element into a value, but nft_dev_add() expects a list of expressions, so it crashes.

    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1676
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: allow to specify comment on chain | Pablo Neira Ayuso | 2023-04-25 | 2 | -7/+20
    Allow users to add a comment when declaring a chain.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: allow to specify comment on table | Pablo Neira Ayuso | 2023-04-24 | 2 | -5/+18
    Allow users to add a comment when declaring a table:

        # sudo nft add table inet test3 '{comment "this is a comment";}'
        # nft list ruleset
        table inet test3 {
                comment "this is a comment"
        }
        # nft -j list ruleset
        {"nftables": [{"metainfo": {"version": "1.0.7", "release_name": "Old Doc Yak", "json_schema_version": 1}}, {"table": {"family": "inet", "name": "test3", "handle": 3, "comment": "this is a comment"}}]}
        # nft -j list ruleset > test.json
        # nft flush ruleset
        # nft -j -f test.json
        # nft -j list ruleset
        {"nftables": [{"metainfo": {"version": "1.0.7", "release_name": "Old Doc Yak", "json_schema_version": 1}}, {"table": {"family": "inet", "name": "test3", "handle": 4, "comment": "this is a comment"}}]}

    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1670
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: skip protocol context update for nfproto with same table family | Pablo Neira Ayuso | 2023-04-24 | 1 | -0/+5
    Inefficient bytecode crashes ruleset listing:

        [ meta load nfproto => reg 1 ]
        [ cmp eq reg 1 0x00000002 ] <-- this specifies NFPROTO_IPV4 but table family is IPv4!
        [ payload load 4b @ network header + 12 => reg 1 ]
        [ cmp gte reg 1 0x1000000a ]
        [ cmp lte reg 1 0x1f00000a ]
        [ masq ]

    This IPv4 table obviously only sees IPv4 traffic, but the bytecode specifies a redundant match on NFPROTO_IPV4.

    After this patch, listing works:

        # nft list ruleset
        table ip crash {
                chain crash {
                        type nat hook postrouting priority srcnat; policy accept;
                        ip saddr 10.0.0.16-10.0.0.31 masquerade
                }
        }

    Skip the protocol context update when this information is redundant.

    Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=1562
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: bail out if new flowtable does not specify hook and priority | Pablo Neira Ayuso | 2023-04-24 | 1 | -1/+5
    If the user forgets to specify the hook and priority and the flowtable does not exist, then bail out:

        # cat flowtable-incomplete.nft
        table t {
                flowtable f {
                        devices = { lo }
                }
        }
        # nft -f /tmp/k
        flowtable-incomplete.nft:2:12-12: Error: missing hook and priority in flowtable declaration
                flowtable f {
                          ^

    Update one existing tests/shell test to specify a hook and priority.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow for updating devices on existing netdev chain | Pablo Neira Ayuso | 2023-04-24 | 5 | -63/+102
    This patch allows you to add devices to, and remove devices from, an existing chain:

        # cat ruleset.nft
        table netdev x {
                chain y {
                        type filter hook ingress devices = { eth0 } priority 0; policy accept;
                }
        }
        # nft -f ruleset.nft
        # nft add chain netdev x y '{ devices = { eth1 }; }'
        # nft list ruleset
        table netdev x {
                chain y {
                        type filter hook ingress devices = { eth0, eth1 } priority 0; policy accept;
                }
        }
        # nft delete chain netdev x y '{ devices = { eth0 }; }'
        # nft list ruleset
        table netdev x {
                chain y {
                        type filter hook ingress devices = { eth1 } priority 0; policy accept;
                }
        }

    This feature allows for creating an empty netdev chain, with no devices. In that case, no packets are seen until a device is registered.

    This patch includes extended netlink error reporting:

        # nft add chain netdev x y '{ devices = { x } ; }'
        Error: Could not process rule: No such file or directory
        add chain netdev x y { devices = { x } ; }
                                           ^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: flowtable support for extended netlink error reporting | Pablo Neira Ayuso | 2023-04-24 | 1 | -60/+82
    This patch extends existing flowtable support to improve error reporting:

        # nft add flowtable inet x y '{ devices = { x } ; }'
        Error: Could not process rule: No such file or directory
        add flowtable inet x y { devices = { x } ; }
                                             ^
        # nft delete flowtable inet x y '{ devices = { x } ; }'
        Error: Could not process rule: No such file or directory
        delete flowtable inet x y { devices = { x } ; }
                                                ^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: set SO_SNDBUF before SO_SNDBUFFORCE | Pablo Neira Ayuso | 2023-04-24 | 3 | -5/+24
    Set SO_SNDBUF before SO_SNDBUFFORCE: an unprivileged user namespace does not have CAP_NET_ADMIN on the host (user_init_ns) namespace.

    SO_SNDBUF always succeeds in Linux, so always try SO_SNDBUFFORCE after it.

    Moreover, suggest that the user bump socket limits if EMSGSIZE is seen after EPERM was previously returned when calling SO_SNDBUFFORCE.

    Provide a hint to the user too:

        # nft -f test.nft
        netlink: Error: Could not process rule: Message too long
        Please, rise /proc/sys/net/core/wmem_max on the host namespace. Hint: 4194304 bytes

    Dave Pifke says:

    Prior to this patch, nft inside a systemd-nspawn container was failing to install my ruleset (which includes a large-ish map), with the error

        netlink: Error: Could not process rule: Message too long

    strace reveals:

        setsockopt(3, SOL_SOCKET, SO_SNDBUFFORCE, [524288], 4) = -1 EPERM (Operation not permitted)

    This is despite the nspawn process supposedly having CAP_NET_ADMIN. A web search reveals at least one other user having the same issue:

    https://old.reddit.com/r/Proxmox/comments/scnoav/lxc_container_debian_11_nftables_geoblocking/

    Reported-by: Dave Pifke <dave@pifke.org>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* main: Error out when combining -i/--interactive and -f/--file | Pablo Neira Ayuso | 2023-04-18 | 1 | -0/+10
    These two options are mutually exclusive; display an error in that case:

        # nft -i -f test.nft
        Error: -i/--interactive and -f/--file options cannot be combined

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* optimize: support for redirect and masquerade | Pablo Neira Ayuso | 2023-04-05 | 1 | -32/+119
    The redirect and masquerade statements can be handled as verdicts:

    - if the redirect statement specifies no ports.
    - masquerade statement, in any case.

    Exceptions to the rule: if the redirect statement specifies ports, then the nat map transformation can be used if and only if both statements specify ports.

    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1668
    Fixes: 0a6dbfce6dc3 ("optimize: merge nat rules with same selectors into map")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: do not reset protocol context for nat protocol expression | Pablo Neira Ayuso | 2023-04-05 | 1 | -3/+1
    This patch reverts 403b46ada490 ("netlink_delinearize: kill dependency before eval of 'redirect' stmt").

    Since ("evaluate: bogus missing transport protocol"), this workaround is not required anymore.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: bogus missing transport protocol | Pablo Neira Ayuso | 2023-04-05 | 1 | -3/+8
    Users have to specify a transport protocol match such as

        meta l4proto tcp

    before the redirect statement, even if the redirect statement already implicitly refers to the transport protocol, for instance:

        test.nft:3:16-53: Error: transport protocol mapping is only valid after transport protocol match
                redirect to :tcp dport map { 83 : 8083, 84 : 8084 }
                ~~~~~~~~     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Evaluate the redirect expression before the mandatory check for the transport protocol match, so the protocol context already provides a transport protocol.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* optimize: assert nat type on nat statement helper | Pablo Neira Ayuso | 2023-04-05 | 1 | -0/+4
    Add assert() to the helper function that obtains the expression from the NAT statement.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* xt: Fix translation error path | Phil Sutter | 2023-03-29 | 1 | -4/+6
    If xtables support was compiled in but the required libxtables DSO is not found, nft prints an error message and leaks memory:

    | counter packets 0 bytes 0 XT target MASQUERADE not found

    This is not as bad as it seems, the output combines stdout and stderr. Dropping stderr produces an incomplete ruleset listing, though. While this seemingly inline output can't easily be avoided, fix a few things:

    * Respect octx->error_fp, libnftables might have been configured to redirect stderr somewhere else.
    * Align error message formatting with others.
    * Don't return immediately, but free allocated memory and fall back to printing the expression in "untranslated" form.

    Fixes: 5c30feeee5cfe ("xt: Delay libxtables access until translation")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
* netlink_delinearize: correct type and byte-order of shifts | Jeremy Sowden | 2023-03-28 | 1 | -2/+11
    Downgrade to base type integer instead of the specific type from the expression that is used in the shift operation.

    Without this, listing a rule like:

        ct mark set ip dscp lshift 2 or 0x10

    will return:

        ct mark set ip dscp << 2 | cs2

    because the type of the OR's right operand will be transitively derived from `ip dscp`. However, this is not valid syntax:

        # nft add rule t c ct mark set ip dscp '<<' 2 '|' cs2
        Error: Could not parse integer
        add rule t c ct mark set ip dscp << 2 | cs2
                                                ^^^

    Use xinteger_type to print the output in hexadecimal.

    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* intervals: use expression location when translating to intervals | Pablo Neira Ayuso | 2023-03-28 | 1 | -2/+2
    Otherwise, the error is reported with an internal location:

        # nft -f ruleset.nft
        internal:0:0-0: Error: Could not process rule: File exists

    After this patch:

        # nft -f ruleset.nft
        ruleset.nft:402:1-16: Error: Could not process rule: File exists
        1.2.3.0/30,
        ^^^^^^^^^^^

    Fixes: 81e36530fcac ("src: replace interval segment tree overlap and automerge")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: set byteorder when completing expression | Pablo Neira Ayuso | 2023-03-28 | 1 | -0/+1
    Otherwise the payload expression remains in invalid byteorder, which is handled as network byteorder for historical reasons. No functional change is intended.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinerize: incorrect byteorder in mark statement listing | Pablo Neira Ayuso | 2023-03-28 | 1 | -4/+14
    When using ip dscp in combination with a bitwise operation:

        # nft --debug=netlink add rule ip x y 'ct mark set ip dscp | 0x4'
        ip x y
          [ payload load 1b @ network header + 1 => reg 1 ]
          [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
          [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
          [ bitwise reg 1 = ( reg 1 & 0xfffffffb ) ^ 0x00000004 ]
          [ ct set mark with reg 1 ]

    the listing shows the constant in the incorrect byteorder:

        # nft list ruleset
        table ip x {
                chain y {
                        ct mark set ip dscp | 0x4000000
                }
        }

    Handle AND and OR operations in host byteorder.

    The following command:

        # nft --debug=netlink add rule ip6 x y 'ct mark set ip6 dscp | 0x4'
        ip6 x y
          [ payload load 2b @ network header + 0 => reg 1 ]
          [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
          [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
          [ byteorder reg 1 = ntoh(reg 1, 2, 1) ]
          [ bitwise reg 1 = ( reg 1 & 0xfffffffb ) ^ 0x00000004 ]
          [ ct set mark with reg 1 ]

    works fine (without requiring this patch) because there is an explicit byteorder expression.

    However, ip dscp takes only 1 byte, so it does not require the byteorder expression. Use host byteorder if the rhs of a bitwise AND or OR is larger than the lhs payload expression and that expression is 1 byte or smaller.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
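The wrong constant in the listing, 0x4000000, is what a 32-bit byte swap of 0x4 produces, which is consistent with the constant being interpreted in the wrong byteorder. The sketch below shows that arithmetic together with the selection rule stated in the commit message. Both function names are hypothetical and this is not the delinearization code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Swap the byte order of a 32-bit value (portable, no compiler builtins). */
uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Hypothetical decision helper following the rule in the commit message:
 * use host byteorder when the right-hand side is wider than the payload
 * expression on the left and that payload fits in a single byte.
 */
bool use_host_byteorder(unsigned int lhs_len_bits, unsigned int rhs_len_bits)
{
	assert(lhs_len_bits > 0 && rhs_len_bits > 0);
	return rhs_len_bits > lhs_len_bits && lhs_len_bits <= 8;
}
```

A single-byte field such as ip dscp against a 32-bit mark constant satisfies the rule, while the 2-byte ip6 dscp load does not and relies on its explicit byteorder expression instead.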
* evaluate: honor statement length in bitwise evaluation | Pablo Neira Ayuso | 2023-03-28 | 1 | -4/+23
    Get the length from the statement instead of inferring it from the expression that is used to set the value. In the particular case of {ct|meta} mark, this is 32 bits.

    Otherwise, bytecode generation is not correct:

        # nft -c --debug=netlink 'add rule ip6 x y ct mark set ip6 dscp << 2 | 0x10'
          [ payload load 2b @ network header + 0 => reg 1 ]
          [ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
          [ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
          [ byteorder reg 1 = ntoh(reg 1, 2, 1) ]
          [ bitwise reg 1 = ( reg 1 << 0x00000002 ) ]
          [ bitwise reg 1 = ( reg 1 & 0x00000fef ) ^ 0x00000010 ] <--- incorrect!
          [ ct set mark with reg 1 ]

    The previous bitwise shift already upgraded to 32 bits (not visible from the netlink debug output above).

    After this patch, the last | 0x10 uses 32 bits:

          [ bitwise reg 1 = ( reg 1 & 0xffffffef ) ^ 0x00000010 ]

    Note that mask 0xffffffef is used instead of 0x00000fef.

    Patch ("evaluate: support shifts larger than the width of the left operand") provides the statement length through the eval context. Use it to evaluate the bitwise expression accordingly, otherwise bytecode is incorrect:

        # nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000'
        ip x y
          [ payload load 1b @ network header + 1 => reg 1 ]
          [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
          [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
          [ bitwise reg 1 = ( reg 1 & 0x1e000000 ) ^ 0x000000ff ] <-- incorrect byteorder for OR
          [ byteorder reg 1 = ntoh(reg 1, 4, 4) ] <-- not needed for single ip dscp byte
          [ ct set mark with reg 1 ]

    Correct bytecode:

        # nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000'
        ip x y
          [ payload load 1b @ network header + 1 => reg 1 ]
          [ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
          [ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
          [ bitwise reg 1 = ( reg 1 & 0x0000001e ) ^ 0xff000000 ]
          [ ct set mark with reg 1 ]

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
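The effect of the operand width on the OR mask can be reproduced with a few lines of arithmetic. This is only a sketch: the helper names are hypothetical, and the 12-bit width is an assumption chosen because it reproduces the 0x00000fef mask in the incorrect output; the commit message does not state the width that was actually used.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask covering the low len_bits bits (1..32). */
uint32_t width_mask(unsigned int len_bits)
{
	assert(len_bits >= 1 && len_bits <= 32);
	return len_bits == 32 ? 0xffffffffu : (1u << len_bits) - 1u;
}

/*
 * Mask operand for "reg = (reg & mask) ^ xor" when expressing reg | value
 * at a given operand width: every bit of that width except those in value.
 */
uint32_t or_mask(uint32_t value, unsigned int len_bits)
{
	return ~value & width_mask(len_bits);
}

/* Apply the mask/xor form that appears in the netlink debug output. */
uint32_t apply_bitwise(uint32_t reg, uint32_t mask, uint32_t xor_val)
{
	return (reg & mask) ^ xor_val;
}
```

With a 32-bit width the mask/xor pair behaves exactly like `reg | 0x10`; with the narrower mask, any bits above the assumed width are cleared as a side effect, which is why the narrower bytecode is wrong for a 32-bit mark.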
* evaluate: honor statement length in integer evaluation | Pablo Neira Ayuso | 2023-03-28 | 1 | -2/+8
    Otherwise, a bogus error is reported:

        # nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000'
        Error: Value 4278190080 exceeds valid range 0-63
        add rule ip x y ct mark set ip dscp & 0x0f << 1 | 0xff000000
                                                          ^^^^^^^^^^

    Use the statement length as the maximum value in the mark statement expression.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: set up integer type to shift expression | Pablo Neira Ayuso | 2023-03-28 | 1 | -0/+1
    Otherwise expr_evaluate_value() fails with invalid datatype:

        # nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1'
        BUG: invalid basetype invalid
        nft: evaluate.c:440: expr_evaluate_value: Assertion `0' failed.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: relax type-checking for integer arguments in mark statements | Pablo Neira Ayuso | 2023-03-28 | 1 | -2/+11
    In order to be able to set ct and meta marks to values derived from payload expressions, we need to relax the requirement that the type of the statement argument must match that of the statement key. Instead, we require that the base-type of the argument is integer and that the argument is small enough to fit.

    Moreover, swap the expression byteorder beforehand to make it compatible with the statement byteorder, to ensure rulesets are portable.

        # nft --debug=netlink add rule ip t c 'meta mark set ip saddr'
        ip t c
          [ payload load 4b @ network header + 12 => reg 1 ]
          [ byteorder reg 1 = ntoh(reg 1, 4, 4) ] <----------- byteorder swap
          [ meta set mark with reg 1 ]

    Based on original work from Jeremy Sowden.

    The following patches are required for this to work:

        evaluate: get length from statement instead of lhs expression
        evaluate: don't eval unary arguments
        evaluate: support shifts larger than the width of the left operand
        netlink_delinearize: correct type and byte-order of shifts
        evaluate: insert byte-order conversions for expressions between 9 and 15 bits

    Add one testcase for tests/py.

    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't eval unary arguments | Jeremy Sowden | 2023-03-28 | 1 | -4/+2
    When a unary expression is inserted to implement a byte-order conversion, the expression being converted has already been evaluated and so `expr_evaluate_unary` doesn't need to do so.

    This is required by {ct|meta} statements with bitwise operations, which might result in byteorder conversion of the expression.

    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: support shifts larger than the width of the left operand | Pablo Neira Ayuso | 2023-03-28 | 2 | -20/+46
    If we want to left-shift a value of narrower type and assign the result to a variable of a wider type, we are constrained to only shifting up to the width of the narrower type. Thus:

        add rule t c meta mark set ip dscp << 2

    works, but:

        add rule t c meta mark set ip dscp << 8

    does not, even though the lvalue is large enough to accommodate the result.

    Upgrade the maximum length based on the statement datatype length, which is provided via context, if it is larger than the expression lvalue.

    Update netlink_delinearize.c to handle the case where the length of a shift expression does not match that of its left-hand operand.

    Based on patch from Jeremy Sowden.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
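The length upgrade described in this commit message amounts to taking the larger of two widths before validating the shift count. The following sketch assumes a 6-bit ip dscp field and a 32-bit mark, as quoted elsewhere in this log; `shift_fits` and the exact comparison are hypothetical and do not mirror the evaluate.c code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical check: a left shift is accepted when the shift count is
 * smaller than the widest length available. stmt_len_bits is 0 when the
 * evaluation context provides no statement length.
 */
bool shift_fits(unsigned int lhs_len_bits, unsigned int stmt_len_bits,
		unsigned int shift)
{
	unsigned int max_len = lhs_len_bits;

	assert(lhs_len_bits > 0);
	if (stmt_len_bits > max_len)
		max_len = stmt_len_bits;

	return shift < max_len;
}
```

Under these assumptions, `<< 2` fits the 6-bit operand on its own, while `<< 8` is only accepted once the 32-bit statement length is taken into account.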
* meta: don't crash if meta key isn't known | Florian Westphal | 2023-03-27 | 2 | -11/+24
    If an older nft version is used for dumping, 'key' might be outside of the range of known templates.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: insert byte-order conversions for expressions between 9 and 15 bits | Jeremy Sowden | 2023-03-22 | 1 | -1/+1
    Round up expression lengths when determining whether to insert a byte-order conversion. For example, if one is masking a network header which spans a byte boundary, the mask will span two bytes and so it will need to be in NBO.

    Fixes: bb03cbcd18a1 ("evaluate: no need to swap byte-order for values of fewer than 16 bits.")
    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
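The difference between truncating and rounding up a bit length can be shown with a short sketch. The function names are hypothetical and the two-byte threshold is inferred from the commit subject (lengths between 9 and 15 bits); this is an illustration of the arithmetic rather than the evaluate.c change itself.

```c
#include <assert.h>
#include <stdbool.h>

#define BITS_PER_BYTE 8u

/* Number of whole bytes needed to hold len_bits bits. */
unsigned int bytes_round_up(unsigned int len_bits)
{
	return (len_bits + BITS_PER_BYTE - 1) / BITS_PER_BYTE;
}

/* Truncating variant, shown only to illustrate the problem case. */
bool needs_conversion_truncating(unsigned int len_bits)
{
	return len_bits / BITS_PER_BYTE >= 2;
}

/* Rounded-up variant: anything wider than one byte needs a conversion. */
bool needs_conversion(unsigned int len_bits)
{
	assert(len_bits > 0);
	return bytes_round_up(len_bits) >= 2;
}
```

Lengths from 9 to 15 bits occupy two bytes but truncate to one, so only the rounded-up variant flags them for a byte-order conversion.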
* Avoid a memleak with 'reset rules' command | Phil Sutter | 2023-03-20 | 1 | -5/+0
    Like other 'reset' commands, 'reset rules' also lists the (part of the) ruleset which was affected to give users a chance to store the zeroed values. Therefore do_command_reset() calls do_command_list(). This in turn calls do_list_ruleset() for CMD_OBJ_RULES which wasn't prepared for values stored in cmd->handle other than a possible family value and thus freely reused the pointers as scratch area for the do_list_table() call which in the past fetched each table's data directly from kernel.

    Meanwhile ruleset listing code has been integrated into the common caching logic, the 'cmd' pointer became unused by do_list_table(). The temporary cmd->handle manipulation is not needed anymore, dropping it prevents a memleak caused by overwriting of allocated table name pointer.

    Fixes: 1694df2de79f3 ("Implement 'reset rule' and 'reset rules' commands")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
* Reduce signature of do_list_table() | Phil Sutter | 2023-03-20 | 1 | -4/+3
    Since commit 16fac7d11bdf5 ("src: use cache infrastructure for rule objects"), the function does not use the passed 'cmd' object anymore. Remove it to affirm correctness of a follow-up fix and simplification in do_list_ruleset().

    Signed-off-by: Phil Sutter <phil@nwl.cc>
* parser_bison: simplify reset syntax | Pablo Neira Ayuso | 2023-03-15 | 1 | -0/+20
    Simplify:

        *reset rules* *chain* ['family'] 'table' ['chain']

    to

        *reset rules* ['family'] 'table' 'chain'

        *reset rules* *table* ['family'] 'table'

    to

        *reset rules* ['family'] 'table'

        *reset counters* ['family'] *table* 'table'

    to

        *reset counters* ['family'] 'table'

        *reset quotas* ['family'] *table* 'table'

    to

        *reset quotas* ['family'] 'table'

    Previous syntax remains in place for backward compatibility.

    Acked-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Revert "evaluate: relax type-checking for integer arguments in mark statements" | Pablo Neira Ayuso | 2023-03-14 | 1 | -6/+2
    This patch reverts eab3eb7f146c ("evaluate: relax type-checking for integer arguments in mark statements") since it might cause ruleset portability issues when moving a ruleset from a little-endian to a big-endian host (and vice versa).

    Let's revert this until we agree on what to do in this case.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix a couple of typo's in comments | Jeremy Sowden | 2023-03-12 | 1 | -1/+1
    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* cmd: move command functions to src/cmd.c | Pablo Neira Ayuso | 2023-03-11 | 3 | -206/+207
    Move several command functions to src/cmd.c to debloat src/rule.c

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: improve error reporting for unsupported chain type | Pablo Neira Ayuso | 2023-03-11 | 2 | -9/+36
    8c75d3a16960 ("Reject invalid chain priority values in user space") provides error reporting from the evaluation phase. Instead, this patch infers the error after the kernel reports EOPNOTSUPP.

        test.nft:3:28-40: Error: Chains of type "nat" must have a priority value above -200
                        type nat hook prerouting priority -300;
                                                 ^^^^^^^^^^^^^

    This patch also covers another common issue: users compiling their own kernels who forget to enable CONFIG_NFT_NAT in their .config file.

    Acked-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Reject invalid chain priority values in user space | Phil Sutter | 2023-03-10 | 1 | -0/+9
    The kernel doesn't accept nat type chains with a priority of -200 or below. Catch this and provide a better error message than the kernel's EOPNOTSUPP.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
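The user-space check described in this commit message can be pictured as a small predicate. This is a hypothetical sketch: `chain_priority_ok` and its string-based chain type are illustration only, and only the rule quoted in the commit message (nat chains need a priority above -200) is encoded.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical user-space check: nat chains need a priority above -200;
 * other chain types are not restricted by this sketch. */
bool chain_priority_ok(const char *chain_type, int priority)
{
	assert(chain_type != NULL);
	if (strcmp(chain_type, "nat") == 0)
		return priority > -200;
	return true;
}
```

Rejecting the value before the request reaches the kernel lets the tool name the offending priority instead of surfacing a bare EOPNOTSUPP.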
* xt: Fix fallback printing for extensions matching keywords | Phil Sutter | 2023-03-10 | 2 | -2/+2
    Yet another Bison workaround: Instead of the fancy error message, an incomprehensible syntax error is emitted:

    | # iptables-nft -A FORWARD -p tcp -m osf --genre linux
    | # nft list ruleset | nft -f -
    | # Warning: table ip filter is managed by iptables-nft, do not touch!
    | /dev/stdin:4:29-31: Error: syntax error, unexpected osf, expecting string
    | meta l4proto tcp xt match osf counter packets 0 bytes 0
    |                           ^^^

    Avoid this by quoting the extension name when printing:

    | # nft list ruleset | sudo ./src/nft -f -
    | # Warning: table ip filter is managed by iptables-nft, do not touch!
    | /dev/stdin:4:20-33: Error: unsupported xtables compat expression, use iptables-nft with this ruleset
    | meta l4proto tcp xt match "osf" counter packets 0 bytes 0
    |                  ^^^^^^^^^^^^^^

    Fixes: 79195a8cc9e9d ("xt: Rewrite unsupported compat expression dumping")
    Fixes: e41c53ca5b043 ("xt: Fall back to generic printing from translation")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
* cache: fetch more objects when resetting rule | Pablo Neira Ayuso | 2023-03-01 | 1 | -0/+1
    If the ruleset contains a reference to an object, listing fails. The existing test for the new reset command displays the following error:

        # ./run-tests.sh testcases/rule_management/0011reset_0
        I: using nft command: ./../../src/nft

        W: [FAILED] testcases/rule_management/0011reset_0: got 2
        loading ruleset
        resetting specific rule
        netlink: Error: Unknown set 's' in dynset statement

    Fixes: 1694df2de79f ("Implement 'reset rule' and 'reset rules' commands")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: allow to use quota in sets | Pablo Neira Ayuso | 2023-03-01 | 1 | -0/+16
    src: support for restoring element quota

    This patch allows you to restore quota in dynamic sets.

        table ip x {
                set y {
                        type ipv4_addr
                        size 65535
                        flags dynamic,timeout
                        counter quota 500 bytes
                        timeout 1h
                        elements = { 8.8.8.8 counter packets 9 bytes 756 quota 500 bytes used 500 bytes timeout 1h expires 56m57s47ms }
                }

                chain z {
                        type filter hook output priority filter; policy accept;
                        update @y { ip daddr } counter packets 6 bytes 507
                }
        }

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>