path: root/include/rule.h
Commit message | Author | Age | Files | Lines
* Add support for table's persist flag [HEAD, master] | Phil Sutter | 6 days | 1 | -1/+3

    Bison parser lacked support for passing multiple flags, JSON parser did
    not support table flags at all. Also document the 'owner' flag (and
    describe their relationship) in nft.8.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: do not merge a set with an erroneous one | Florian Westphal | 2024-03-20 | 1 | -0/+2

    The included sample causes a crash because we attempt to range-merge a
    prefix expression with a symbolic expression.

    The first set is evaluated, the symbol expression evaluation fails and
    nft queues an error message ("Could not resolve hostname"). However, nft
    continues evaluation.

    nft then encounters the same set definition again and merges the new
    content with the preceding one. But the first set structure is dodgy, it
    still contains the unresolved symbolic expression. That then makes nft
    crash (assert) in the set internals.

    There are various different incarnations of this issue, but the low
    level set processing code does not allow for any partially transformed
    expressions to still remain.

    Before:
        nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
        BUG: invalid range expression type binop
        nft: src/expression.c:1479: range_expr_value_low: Assertion `0' failed.

    After:
        nft --check -f tests/shell/testcases/bogons/nft-f/invalid_range_expr_type_binop
        invalid_range_expr_type_binop:4:18-25: Error: Could not resolve hostname: Name or service not known
        elements = { 1&.141.0.1 - 192.168.0.2}
                     ^^^^^^^^

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: translate meter into dynamic set | Pablo Neira Ayuso | 2024-03-12 | 1 | -0/+5

    129f9d153279 ("nft: migrate man page examples with `meter` directive to
    sets") already replaced meters by dynamic sets.

    This patch removes the NFT_SET_ANONYMOUS flag from the implicit set that
    is instantiated via meter, so the listing shows a dynamic set instead,
    which is the recommended approach these days.

    Therefore, a batch like this:

        add table t
        add chain t c
        add rule t c tcp dport 80 meter m size 128 { ip saddr timeout 1s limit rate 10/second }

    gets translated to a dynamic set:

        table ip t {
                set m {
                        type ipv4_addr
                        size 128
                        flags dynamic,timeout
                }

                chain c {
                        tcp dport 80 update @m { ip saddr timeout 1s limit rate 10/second burst 5 packets }
                }
        }

    The check for the NFT_SET_ANONYMOUS flag is also relaxed for the list
    and flush meter commands:

        # nft list meter ip t m
        table ip t {
                set m {
                        type ipv4_addr
                        size 128
                        flags dynamic,timeout
                }
        }
        # nft flush meter ip t m

    As a side effect, the legacy 'list meter' and 'flush meter' commands can
    also list and flush a dynamic set, which retains backward compatibility.

    This patch updates testcases/sets/0022type_selective_flush_0 and
    testcases/sets/0038meter_list_0 as well as the json output, which now
    uses the dynamic set representation.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: do not allow to chain more than 16 binops | Florian Westphal | 2023-12-22 | 1 | -1/+5

    netlink_linearize.c has never supported more than 16 chained binops.
    Adding more is possible but overwrites the stack in
    netlink_gen_bitwise().

    Add a recursion counter to catch this at eval stage.

    It's not enough to just abort once the counter hits
    NFT_MAX_EXPR_RECURSION. This is because there are valid test cases that
    exceed this. For example, evaluation of 1 | 2 will merge the constants,
    so even if there are a dozen recursive eval calls this will not end up
    with a large binop chain post-evaluation.

    v2: allow more than 16 binops iff the evaluation function did
    constant-merging.

    Signed-off-by: Florian Westphal <fw@strlen.de>
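The rule described above (reject long binop chains, but only if they survive constant merging) can be sketched on a toy expression tree. This is an illustration under stated assumptions, not the nftables evaluator: `toy_expr`, `toy_eval()` and `MAX_BINOP_CHAIN` are hypothetical names, and only a bitwise OR is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BINOP_CHAIN 16	/* assumed limit, mirrors the 16 mentioned above */

enum toy_type { TOY_CONST, TOY_FIELD, TOY_OR };

struct toy_expr {
	enum toy_type type;
	unsigned int value;		/* valid for TOY_CONST */
	struct toy_expr *left, *right;	/* valid for TOY_OR */
};

/*
 * Evaluate in place. Returns the length of the binop chain that remains
 * after constant folding, or -1 if that chain is longer than the limit.
 */
static int toy_eval(struct toy_expr *e)
{
	int l, r, depth;

	if (e->type != TOY_OR)
		return 0;

	l = toy_eval(e->left);
	r = toy_eval(e->right);
	if (l < 0 || r < 0)
		return -1;

	if (e->left->type == TOY_CONST && e->right->type == TOY_CONST) {
		/* Constant merging: the binop itself disappears. */
		e->value = e->left->value | e->right->value;
		e->type = TOY_CONST;
		return 0;
	}

	depth = (l > r ? l : r) + 1;
	return depth > MAX_BINOP_CHAIN ? -1 : depth;
}
```

With this shape, twenty ORs over constants fold down to a single constant and pass, while seventeen ORs that involve a non-constant field are rejected.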
* include: include <string.h> in <nft.h> | Thomas Haller | 2023-09-28 | 1 | -1/+0

    <string.h> provides strcmp(), as such it's very basic and used
    everywhere.

    Signed-off-by: Thomas Haller <thaller@redhat.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: simplify chain_alloc() | Pablo Neira Ayuso | 2023-08-31 | 1 | -1/+1

    Remove the parameter to set the chain name, which is only used from the
    netlink path.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: include <std{bool,int}.h> via <nft.h> | Thomas Haller | 2023-08-25 | 1 | -1/+0

    There is a minimum base that all our sources will end up needing. This
    is what <nft.h> provides.

    Add <stdbool.h> and <stdint.h> there. It's unlikely that we want to
    implement anything without having "bool" and "uint32_t" types available.

    Yes, this means the internal headers are not self-contained with respect
    to what <nft.h> provides. This is the exception to the rule, and our
    internal headers should rely on having <nft.h> included for them. They
    should not include <nft.h> themselves, because <nft.h> must always be
    included first. So if an internal header included <nft.h>, it would be
    unnecessary, because the header is *always* included already.

    Signed-off-by: Thomas Haller <thaller@redhat.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* ct expectation: fix 'list object x' vs. 'list objects in table' confusion | Florian Westphal | 2023-07-31 | 1 | -0/+1

    Just like "ct timeout", "ct expectation" is in need of the same fix: we
    get a segfault on "nft list ct expectation table t" if table t exists.

    This is the exact same pattern as resolved for "ct timeout" in commit
    1d2e22fc0521 ("ct timeout: fix 'list object x' vs. 'list objects in
    table' confusion").

    Signed-off-by: Florian Westphal <fw@strlen.de>
* src: avoid IPPROTO_MAX for array definitions | Florian Westphal | 2023-06-21 | 1 | -1/+1

    The IP header can only accommodate an 8-bit value, but IPPROTO_MAX has
    been bumped due to uapi reasons to support MPTCP (262, which is used to
    toggle on multipath support in tcp).

    This results in:

        exthdr.c:349:11: warning: result of comparison of constant 263 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
                if (type < array_size(exthdr_protocols))
                    ~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Reduce array sizes back to what can be used on-wire.

    Signed-off-by: Florian Westphal <fw@strlen.de>
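As a minimal sketch of the sizing argument (hypothetical names, not the nftables tables): when the index type is `uint8_t`, an array of 256 entries covers every value that can appear on the wire, and no range check against the array size is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* The protocol / next-header field is 8 bits wide on the wire. */
#define ONWIRE_PROTO_MAX (UINT8_MAX + 1)

/* Hypothetical lookup table, indexed by on-wire protocol number. */
static const char *const proto_names[ONWIRE_PROTO_MAX] = {
	[6]  = "tcp",
	[17] = "udp",
};

static_assert(ARRAY_SIZE(proto_names) == 256,
	      "table must cover exactly the uint8_t range");

static const char *proto_name(uint8_t proto)
{
	/* Every uint8_t value is a valid index; unknown entries are NULL. */
	return proto_names[proto];
}
```

Sizing the table by a constant larger than 256 would only add entries that a `uint8_t` index can never reach, which is what the compiler warning above points out.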
* ct timeout: fix 'list object x' vs. 'list objects in table' confusion | Florian Westphal | 2023-06-20 | 1 | -0/+1

    <empty ruleset>
        $ nft list ct timeout table t
        Error: No such file or directory
        list ct timeout table t
                              ^

    This is expected to list all 'ct timeout' objects. The failure is
    correct, the table 't' does not exist.

    But now let's add one:

        $ nft add table t
        $ nft list ct timeout table t
        Segmentation fault (core dumped)

    ... and that's not expected, nothing should be shown and nft should exit
    normally.

    Because of the missing TIMEOUTS command enum, the backend thinks it
    should do an object lookup, but as the frontend asked for 'list of
    objects' rather than 'show this object', handle.obj.name is NULL, which
    then results in this crash.

    Update the command enums so that the backend knows what the frontend
    asked for.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* src: fix enum/integer mismatches | Florian Westphal | 2023-04-29 | 1 | -1/+1

    gcc 13 complains about type confusion:

        cache.c:1178:5: warning: conflicting types for 'nft_cache_update' due to enum/integer mismatch; have 'int(struct nft_ctx *, unsigned int, struct list_head *, const struct nft_cache_filter *)' [-Wenum-int-mismatch]
        cache.h:74:5: note: previous declaration of 'nft_cache_update' with type 'int(struct nft_ctx *, enum cmd_ops, struct list_head *, const struct nft_cache_filter *)'

    Same for:

        rule.c:1915:13: warning: conflicting types for 'obj_type_name' due to enum/integer mismatch; have 'const char *(enum stmt_types)' [-Wenum-int-mismatch]
         1915 | const char *obj_type_name(enum stmt_types type)
              |             ^~~~~~~~~~~~~

        expression.c:1543:24: warning: conflicting types for 'expr_ops_by_type' due to enum/integer mismatch; have 'const struct expr_ops *(uint32_t)' {aka 'const struct expr_ops *(unsigned int)'} [-Wenum-int-mismatch]
         1543 | const struct expr_ops *expr_ops_by_type(uint32_t value)
              |                        ^~~~~~~~~~~~~~~~

    Convert to the stricter type (enum) where possible.

    Signed-off-by: Florian Westphal <fw@strlen.de>
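The fix pattern is to make the declaration and the definition agree on the stricter enum type. A small sketch with hypothetical names (`toy_obj_type`, `toy_obj_type_name()`), not the nftables declarations themselves:

```c
#include <assert.h>
#include <stddef.h>

enum toy_obj_type { TOY_OBJ_COUNTER, TOY_OBJ_QUOTA, TOY_OBJ_MAX };

/* Declaration, as it would appear in a header. */
const char *toy_obj_type_name(enum toy_obj_type type);

/*
 * Definition: the parameter uses the same enum type as the declaration.
 * Using 'unsigned int' or a different enum here is what triggers
 * -Wenum-int-mismatch.
 */
const char *toy_obj_type_name(enum toy_obj_type type)
{
	static const char *const names[TOY_OBJ_MAX] = {
		[TOY_OBJ_COUNTER] = "counter",
		[TOY_OBJ_QUOTA]   = "quota",
	};

	assert(type < TOY_OBJ_MAX);
	return names[type];
}
```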
* evaluate: support shifts larger than the width of the left operand | Pablo Neira Ayuso | 2023-03-28 | 1 | -0/+1

    If we want to left-shift a value of narrower type and assign the result
    to a variable of a wider type, we are constrained to only shifting up to
    the width of the narrower type. Thus:

        add rule t c meta mark set ip dscp << 2

    works, but:

        add rule t c meta mark set ip dscp << 8

    does not, even though the lvalue is large enough to accommodate the
    result.

    Upgrade the maximum length based on the statement datatype length, which
    is provided via context, if it is larger than the expression lvalue.

    Update netlink_delinearize.c to handle the case where the length of a
    shift expression does not match that of its left-hand operand.

    Based on patch from Jeremy Sowden.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
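The length check can be pictured as follows. This is a sketch under assumptions: `shift_fits()` is a hypothetical helper, `lhs_len` is the bit width of the left operand (6 for `ip dscp`), and `ctx_len` is the bit width of the statement datatype taken from context (32 for `meta mark`, 0 when no context is available).

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * A shift is accepted if it is smaller than the wider of the two lengths:
 * the left operand itself, or the statement datatype provided by context.
 */
static bool shift_fits(unsigned int lhs_len, unsigned int ctx_len,
		       unsigned int shift)
{
	return shift < max_u(lhs_len, ctx_len);
}
```

With only the operand width considered, `shift_fits(6, 0, 8)` is false, which matches the rejected rule above; with the 32-bit mark as context, `shift_fits(6, 32, 8)` is true.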
* cmd: move command functions to src/cmd.c | Pablo Neira Ayuso | 2023-03-11 | 1 | -6/+0

    Move several command functions to src/cmd.c to debloat src/rule.c

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: expand table command before evaluation | Pablo Neira Ayuso | 2023-02-24 | 1 | -0/+1

    The nested syntax notation results in one single table command which
    includes all other objects. This differs from the flat notation where
    there is usually one command per object.

    This patch adds a step before the evaluation phase to expand the objects
    that are contained in the table into independent commands, so both
    notations have similar representations.

    Remove the code to evaluate the nested representation in the evaluation
    phase since commands are independently evaluated after the expansion.

    The commands are expanded after the set element collapse step, in case
    there is a long list of singleton element commands to be added to the
    set, to shorten the command list iteration.

    This approach also avoids interference with the object cache that is
    populated in the evaluation, which might refer to objects coming in the
    existing command list that is being processed.

    There is still a post_expand phase to detach the elements from the set,
    which could be consolidated by updating the evaluation step to handle
    the CMD_OBJ_SETELEMS command type.

    This patch fixes 27c753e4a8d4 ("rule: expand standalone chain that
    contains rules") which broke rule addition/insertion by index because
    the expansion code after the evaluation messes up the cache.

    Fixes: 27c753e4a8d4 ("rule: expand standalone chain that contains rules")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support to command "destroy" | Fernando F. Mancera | 2023-02-06 | 1 | -0/+2

    The "destroy" command performs a deletion like the "delete" command but
    does not fail if the object does not exist. As there is no NLM_F_* flag
    for ignoring such an error, it needs to be ignored directly in the error
    handling.

    Example of use:

        # nft list ruleset
        table ip filter {
                chain output {
                }
        }
        # nft destroy table ip missingtable
        # echo $?
        0
        # nft list ruleset
        table ip filter {
                chain output {
                }
        }

    Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
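The error-handling side of this can be sketched as a small predicate. The names below (`toy_cmd_op`, `toy_cmd_failed()`) are hypothetical; the only assumption taken from the message is that a missing object is ignored for "destroy" and for nothing else. ENOENT matches the "No such file or directory" text that nft prints for missing objects.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_cmd_op { TOY_CMD_ADD, TOY_CMD_DELETE, TOY_CMD_DESTROY };

/* Decide whether an errno-style error should be reported for a command. */
static bool toy_cmd_failed(enum toy_cmd_op op, int err)
{
	if (err == 0)
		return false;

	/* "destroy" tolerates a missing object, "delete" does not. */
	if (op == TOY_CMD_DESTROY && err == ENOENT)
		return false;

	return true;
}
```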
* Implement 'reset rule' and 'reset rules' commands | Phil Sutter | 2023-01-18 | 1 | -0/+1

    Reset rule counters and quotas in kernel, i.e. without having to reload
    them. Requires respective kernel patch to support
    NFT_MSG_GETRULE_RESET message type.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: add vxlan matching support | Pablo Neira Ayuso | 2023-01-02 | 1 | -1/+2

    This patch adds the initial infrastructure to support inner header
    tunnel matching and its first user: vxlan.

    A new struct proto_desc field for payload and meta expression is used to
    specify that the expression refers to inner header matching.

    The existing codebase to generate bytecode is fully reused, which allows
    the already supported layer 2, 3 and 4 protocols to be reused.

    Syntax requires to specify vxlan before the inner protocol field:

        ... vxlan ip protocol udp
        ... vxlan ip saddr 1.2.3.0/24

    This also works with concatenations and anonymous sets, e.g.

        ... vxlan ip saddr . vxlan ip daddr { 1.2.3.4 . 4.3.2.1 }

    You have to restrict vxlan matching to udp traffic, otherwise it
    complains about a missing transport protocol dependency, e.g.

        ... udp dport 4789 vxlan ip daddr 1.2.3.4

    The bytecode that is generated uses the new inner expression:

        # nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
        netdev x y
          [ meta load l4proto => reg 1 ]
          [ cmp eq reg 1 0x00000011 ]
          [ payload load 2b @ transport header + 2 => reg 1 ]
          [ cmp eq reg 1 0x0000b512 ]
          [ inner type 1 hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
          [ cmp eq reg 1 0x00000008 ]
          [ inner type 1 hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
          [ cmp eq reg 1 0x04030201 ]

    JSON support is not included in this patch.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add eval_proto_ctx() | Pablo Neira Ayuso | 2023-01-02 | 1 | -1/+1

    Add eval_proto_ctx() to access protocol context (struct proto_ctx).
    Rename struct proto_ctx field to _pctx to highlight that this field is
    internal and the helper function should be used.

    This patch comes in preparation for supporting outer and inner protocol
    context.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Warn for tables with compat expressions in rules | Phil Sutter | 2022-11-18 | 1 | -0/+1

    While being able to "look inside" compat expressions using nft is a nice
    feature, it is also (yet another) pitfall for unaware users, deceiving
    them into assuming interchangeability (or at least compatibility)
    between iptables-nft and nft.

    In reality, which involves 'nft list ruleset | nft -f -', any correctly
    translated compat expressions will turn into native nftables ones not
    understood by (the version of) iptables-nft which created them in the
    first place. Other compat expressions will vanish, potentially
    compromising the firewall ruleset.

    Emit a warning (as comment) to give users a chance to stop and
    reconsider before shooting their own foot.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: remove NFT_NLATTR_LOC_MAX limit for netlink location error reporting | Pablo Neira Ayuso | 2022-06-27 | 1 | -5/+8

    A set might have more than 16 elements, so use a runtime array to store
    the netlink error locations.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: collapse set element commands | Pablo Neira Ayuso | 2022-06-19 | 1 | -0/+3

    Robots might generate a long list of singleton element commands such as:

        add element t s { 1.0.1.0/24 }
        ...
        add element t s { 1.0.2.0/23 }

    collapse them into one single command before the evaluation step, i.e.

        add element t s { 1.0.1.0/24, ..., 1.0.2.0/23 }

    this speeds up overlap detection and set element automerge operations in
    this worst case scenario.

    Since 3da9643fb9ff9 ("intervals: add support to automerge with kernel
    elements"), the new interval tracking relies on mergesort. The pattern
    above triggers the set sorting for each element.

    This patch adds a list to cmd objects that stores collapsed commands.
    Moreover, expressions also contain a reference to the original command,
    to uncollapse the commands after the evaluation step.

    These commands are uncollapsed after the evaluation step to ensure error
    reporting works as expected (command and netlink message are mapped
    1:1).

    For the record:

    - nftables versions <= 1.0.2 did not perform any kind of overlap check
      for the scenario described above (because the set cache only contained
      elements in the kernel in this case). This is a problem for kernels
      < 5.7 which rely on userspace to detect overlaps.

    - the overlap detection could be skipped for kernels >= 5.7.

    - The extended netlink error reporting available for set elements since
      5.19-rc might allow removing the uncollapse step; in this case, error
      reporting does not rely on the netlink sequence to refer to the
      command triggering the problem.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: replace interval segment tree overlap and automerge | Pablo Neira Ayuso | 2022-04-13 | 1 | -0/+2

    This is a rewrite of the segtree interval codebase.

    This patch now splits the original set_to_interval() function in three
    routines:

    - add set_automerge() to merge overlapping and contiguous ranges. The
      elements, expressed either as single value, prefix or range, are all
      first normalized to ranges. These elements expressed as ranges are
      mergesorted. Then, there is a linear list inspection to check for
      merge candidates. This code only merges elements in the same batch,
      i.e. it does not merge elements in the kernel and the userspace batch.

    - add set_overlap() to check for overlapping set elements. Linux kernel
      >= 5.7 already checks for overlaps, older kernels still need this
      code. This code checks for two conflict types:

      1) between elements in this batch.
      2) between elements in this batch and kernelspace.

      The elements in the kernel are temporarily merged into the list of
      elements in the batch to check for these overlaps. The EXPR_F_KERNEL
      flag allows us to restore the set cache after the overlap check has
      been performed.

    - set_to_interval() now only transforms set elements, expressed as range
      e.g. [a,b], to individual set elements using the EXPR_F_INTERVAL_END
      flag notation to represent e.g. [a,b+1), where b+1 has the
      EXPR_F_INTERVAL_END flag set on.

    More relevant updates:

    - The overlap and automerge routines are now performed in the evaluation
      phase.

    - The userspace set object representation now stores a reference to the
      existing kernel set object (in case there is already a set with this
      same name in the kernel). This is required by the new overlap and
      automerge approach.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
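The automerge step described in the first bullet (normalize to ranges, sort, then one linear pass over neighbours) can be sketched on plain integer ranges. This is an illustration, not set_automerge(): `toy_range` and `toy_automerge()` are hypothetical, bounds are inclusive 32-bit values, and the C library's `qsort()` stands in for the mergesort mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_range { uint32_t low, high; };	/* inclusive bounds */

static int toy_range_cmp(const void *a, const void *b)
{
	const struct toy_range *x = a, *y = b;

	if (x->low != y->low)
		return x->low < y->low ? -1 : 1;
	if (x->high != y->high)
		return x->high < y->high ? -1 : 1;
	return 0;
}

/*
 * Sort r[] and merge overlapping or contiguous ranges in place.
 * Returns the number of ranges left at the front of r[].
 */
static size_t toy_automerge(struct toy_range *r, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), toy_range_cmp);

	for (size_t i = 1; i < n; i++) {
		/* Contiguous means low == high + 1; guard the +1 overflow. */
		if (r[out].high == UINT32_MAX || r[i].low <= r[out].high + 1) {
			if (r[i].high > r[out].high)
				r[out].high = r[i].high;
		} else {
			r[++out] = r[i];
		}
	}
	return out + 1;
}
```

For example, the ranges 10-20, 1-5, 21-30, 6-6 and 100-200 end up as 1-6, 10-30 and 100-200. A prefix such as 1.0.1.0/24 would first be normalized to its first and last address before entering this pass.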
* erec: expose print_location() and line_location() | Pablo Neira Ayuso | 2022-01-15 | 1 | -1/+0

    Add a few helper functions to reuse code in the new rule optimization
    infrastructure.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: revisit hook listing | Pablo Neira Ayuso | 2021-08-06 | 1 | -0/+1

    Update this command to display the hook datapath for a packet depending
    on its family.

    This patch also includes:

    - Group of existing hooks based on the hook location.
    - Order hooks by priority, from INT_MIN to INT_MAX.
    - Do not add sign to priority zero.
    - Refresh include/linux/netfilter/nfnetlink_hook.h cache copy.
    - Use NFNLA_CHAIN_* attributes to print the chain family, table and
      name. If NFNLA_CHAIN_* attributes are not available, display the
      hookfn name.
    - Update syntax: remove optional hook parameter, promote the 'device'
      argument.

    The following example shows the hook datapath for IPv4 packets coming in
    from netdevice 'eth0':

        # nft list hooks ip device eth0
        family ip {
                hook ingress {
                        +0000000010 chain netdev x y [nf_tables]
                        +0000000300 chain inet m w [nf_tables]
                }
                hook input {
                        -0000000100 chain ip a b [nf_tables]
                        +0000000300 chain inet m z [nf_tables]
                }
                hook forward {
                        -0000000225 selinux_ipv4_forward
                         0000000000 chain ip a c [nf_tables]
                }
                hook output {
                        -0000000225 selinux_ipv4_output
                }
                hook postrouting {
                        +0000000225 selinux_ipv4_postroute
                }
        }

    Note that the listing above includes the existing netdev and inet
    hooks/chains which *might* interfere in the travel of an incoming IPv4
    packet. This allows users to debug the pipeline, basically, to
    understand in what order the hooks/chains are evaluated for the IPv4
    packets.

    If the netdevice is not specified, then the ingress hooks are not shown.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support for base hook dumping | Florian Westphal | 2021-06-09 | 1 | -0/+1

    Example output:

        $ nft list hook ip input
        family ip hook input {
                +0000000000 nft_do_chain_inet [nf_tables] # nft table ip filter chain input
                +0000000010 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain filter_INPUT
                +0000000100 nf_nat_ipv4_local_in [nf_nat]
                +2147483647 ipv4_confirm [nf_conntrack]
        }

        $ nft list hooks netdev type ingress device lo
        family netdev hook ingress device lo {
                +0000000000 nft_do_chain_netdev [nf_tables]
        }

        $ nft list hooks inet
        family ip hook prerouting {
                -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4]
                -0000000300 iptable_raw_hook [iptable_raw]
                -0000000290 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain raw_PREROUTING
                -0000000200 ipv4_conntrack_in [nf_conntrack]
                -0000000140 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain mangle_PREROUTING
                -0000000100 nf_nat_ipv4_pre_routing [nf_nat]
        }
        ...

    'nft list hooks' will display everything except the netdev family via
    successive dump requests for all family:hook combinations.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* libnftables: location-based error reporting for chain type | Pablo Neira Ayuso | 2021-05-20 | 1 | -1/+6

    Store the location of the chain type for better error reporting.

    Several users that compile custom kernels reported that error reporting
    is misleading when accidentally selecting CONFIG_NFT_NAT=n.

    After this patch, a better hint is provided:

        # nft 'add chain x y { type nat hook prerouting priority dstnat; }'
        Error: Could not process rule: No such file or directory
        add chain x y { type nat hook prerouting priority dstnat; }
                             ^^^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for table | Pablo Neira Ayuso | 2021-05-02 | 1 | -3/+1

    Add a hashtable for fast table lookups.

    Tables that reside in the cache use the table->cache_hlist and
    table->cache_list heads.

    Tables that are created from command line / ruleset are also added to
    the cache.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for flowtable | Pablo Neira Ayuso | 2021-05-02 | 1 | -2/+2

    Add flowtable hashtable cache.

    I do not actually expect enough flowtables to be created to benefit from
    the hashtable, but this streamlines the code with tables, chains, sets
    and policy objects.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for object | Pablo Neira Ayuso | 2021-05-02 | 1 | -3/+2

    This patch adds a hashtable for object lookups.

    This patch also splits table->objs in two:

    - Objects that reside in the cache are stored in the new
      tables->cache_obj and tables->cache_obj_ht.

    - Objects that are defined via command line / ruleset file reside in
      tables->obj.

    Objects in the cache (already in the kernel) are not placed in the
    table->objs list.

    By keeping separated lists, objs defined via command line / ruleset file
    can be added to cache.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: consolidate object cache infrastructure | Pablo Neira Ayuso | 2021-05-02 | 1 | -8/+4

    This patch consolidates the object cache infrastructure. Update set and
    chains to use it.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: consolidate nft_cache infrastructure | Pablo Neira Ayuso | 2021-05-02 | 1 | -6/+0

    - prepend nft_ prefix to nft_cache API and internal functions
    - move declarations to cache.h (and remove redundant declarations)
    - move struct nft_cache definition to cache.h

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for sets | Pablo Neira Ayuso | 2021-04-03 | 1 | -2/+4

    This patch adds a hashtable for set lookups.

    This patch also splits table->sets in two:

    - Sets that reside in the cache are stored in the new tables->cache_set
      and tables->cache_set_ht.

    - Sets that are defined via command line / ruleset file reside in
      tables->set.

    Sets in the cache (already in the kernel) are not placed in the
    table->sets list.

    By keeping separated lists, sets defined via command line / ruleset file
    can be added to cache.

    Adding 10000 sets, before:

        # time nft -f x
        real    0m6,415s
        user    0m3,126s
        sys     0m3,284s

    After:

        # time nft -f x
        real    0m3,949s
        user    0m0,743s
        sys     0m3,205s

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
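The lookup side of such a cache can be sketched as a name-keyed hashtable with chained buckets. This is an illustration with hypothetical names (`toy_set`, `toy_cache`, `TOY_HSIZE`), not the nftables cache code; the hash function and bucket count are assumptions chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HSIZE 64	/* hypothetical bucket count, must be a power of two */

struct toy_set {
	const char *name;
	struct toy_set *hnext;	/* next entry in the same bucket */
};

struct toy_cache {
	struct toy_set *ht[TOY_HSIZE];
};

/* Simple string hash, reduced to a bucket index. */
static uint32_t toy_hash(const char *s)
{
	uint32_t h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h & (TOY_HSIZE - 1);
}

static void toy_cache_add(struct toy_cache *c, struct toy_set *set)
{
	uint32_t h = toy_hash(set->name);

	set->hnext = c->ht[h];
	c->ht[h] = set;
}

/* Only the entries of one bucket are compared, not the whole list. */
static struct toy_set *toy_cache_find(const struct toy_cache *c,
				      const char *name)
{
	for (struct toy_set *s = c->ht[toy_hash(name)]; s; s = s->hnext)
		if (strcmp(s->name, name) == 0)
			return s;
	return NULL;
}
```

Compared with walking one linear list per lookup, each lookup only touches the entries that share a bucket, which is the effect the timings above are measuring.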
* evaluate: use chain hashtable for lookups | Pablo Neira Ayuso | 2021-04-03 | 1 | -2/+0

    Instead of the linear list lookup.

    Before this patch:

        real    0m21,735s
        user    0m20,329s
        sys     0m1,384s

    After:

        real    0m10,910s
        user    0m9,448s
        sys     0m1,434s

    chain_lookup() is removed since linear list lookups are only used by the
    fuzzy chain name matching for error reporting.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: split chain list in table | Pablo Neira Ayuso | 2021-04-03 | 1 | -0/+2

    This patch splits table->lists in two:

    - Chains that reside in the cache are stored in the new
      tables->cache_chain and tables->cache_chain_ht. The hashtable chain
      cache allows for fast chain lookups.

    - Chains that are defined via command line / ruleset file reside in
      tables->chains.

    Note that chains in the cache (already in the kernel) are not placed in
    the table->chains.

    By keeping separated lists, chains defined via command line / ruleset
    file can be added to cache.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: rename chain_htable to cache_chain_ht | Pablo Neira Ayuso | 2021-04-03 | 1 | -2/+2

    Rename the hashtable chain that is used for fast cache lookups.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nftables: add flags offload to flowtable | Frank Wunderlich | 2021-03-25 | 1 | -0/+8

    Allow flags (currently only offload) in flowtables, as described here:
    https://lwn.net/Articles/804384/

    Tested on mt7622/Bananapi-R64:

        table ip filter {
                flowtable f {
                        hook ingress priority filter + 1
                        devices = { lan3, lan0, wan }
                        flags offload;
                }

                chain forward {
                        type filter hook forward priority filter; policy accept;
                        ip protocol { tcp, udp } flow add @f
                }
        }

        table ip nat {
                chain post {
                        type nat hook postrouting priority filter; policy accept;
                        oifname "wan" masquerade
                }
        }

    Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* table: support for the table owner flag | Pablo Neira Ayuso | 2021-03-02 | 1 | -1/+3

    Add a new flag to allow a userspace process to own tables: tables that
    have an owner can only be updated/destroyed by the owner. The table is
    destroyed either if the owner process calls nft_ctx_free() or if the
    owner process is terminated (implicit table release).

    The ruleset listing includes the program name that owns the table:

        nft> list ruleset
        table ip x { # progname nft
                flags owner

                chain y {
                        type filter hook input priority filter; policy accept;
                        counter packets 1 bytes 309
                }
        }

    Original code to pretty print the netlink portID to program name has
    been extracted from the conntrack userspace utility.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* table: rework flags printing | Pablo Neira Ayuso | 2021-03-02 | 1 | -1/+1

    Simplify routine to print the table flags. Add table_flag_name() and use
    it from json too.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set element multi-statement support | Pablo Neira Ayuso | 2020-12-18 | 1 | -1/+1

    Extend the set element infrastructure to support several statements.

    This patch places the statements right after the key when printing it.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: constify location parameter in cmd_add_loc() | Pablo Neira Ayuso | 2020-10-19 | 1 | -3/+3

    Constify the pointer to the location object so that the compiler catches
    unintentional updates.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: larger number of error locations | Pablo Neira Ayuso | 2020-10-19 | 1 | -1/+1

    Statically store up to 32 locations per command. If the number of
    locations is larger than 32, skip the extra ones rather than hit an
    assertion.

    Revisit this later to dynamically store locations per command using a
    hashtable.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
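A bounded, statically sized location store with "skip instead of assert" behaviour might look like the sketch below. The names (`toy_cmd`, `toy_loc`, `toy_cmd_add_loc()`, `TOY_LOC_MAX`) are hypothetical and only mirror the limit of 32 mentioned above; the const pointer parameter follows the constification described in the previous entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LOC_MAX 32	/* assumed limit, mirrors the 32 mentioned above */

struct toy_loc {
	unsigned int line, column;
};

struct toy_cmd {
	struct toy_loc locs[TOY_LOC_MAX];
	unsigned int num_locs;
};

/*
 * Record one more location for error reporting. Returns false, and
 * records nothing, once the static array is full.
 */
static bool toy_cmd_add_loc(struct toy_cmd *cmd, const struct toy_loc *loc)
{
	assert(cmd != NULL && loc != NULL);

	if (cmd->num_locs >= TOY_LOC_MAX)
		return false;	/* skip rather than abort */

	cmd->locs[cmd->num_locs++] = *loc;
	return true;
}
```

The trade-off is that errors for entries beyond the limit lose their precise location, which is why the message suggests a dynamic store as a later step.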
* src: add comment support for chains | Jose M. Guisado Gomez | 2020-09-30 | 1 | -0/+1

    This patch enables the user to specify a comment when adding a chain.

    Relies on kernel space supporting userdata for chains.

        > nft add table ip filter
        > nft add chain ip filter input { comment "test"\; type filter hook input priority 0\; policy accept\; }
        > list ruleset
        table ip filter {
                chain input {
                        comment "test"
                        type filter hook input priority filter; policy accept;
                }
        }

    Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for objects | Jose M. Guisado Gomez | 2020-09-08 | 1 | -0/+1

    Enables specifying an optional comment when declaring named objects. The
    comment is to be specified inside the object's block ({} block).

    Relies on libnftnl exporting nftnl_obj_get_data and kernel space support
    to store the comments.

    For consistency, this patch makes the comment be printed first when
    listing objects.

    Adds a testcase importing all commented named objects except for
    secmark, although it's supported.

    Example: adding a quota with a comment

        > add table inet filter
        > nft add quota inet filter q { over 1200 bytes \; comment "test_comment"\; }
        > list ruleset
        table inet filter {
                quota q {
                        comment "test_comment"
                        over 1200 bytes
                }
        }

    Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support when adding tables | Jose M. Guisado Gomez | 2020-08-28 | 1 | -0/+1

    Adds userdata building logic if a comment is specified when creating a
    new table. Adds netlink userdata parsing callback function.

    Relies on kernel supporting userdata for nft_table.

    Example:

        > nft add table ip x { comment "test"\; }
        > nft list ruleset
        table ip x {
                comment "test"
        }

    Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add chain hashtable cache | Pablo Neira Ayuso | 2020-08-26 | 1 | -1/+3

    This significantly improves ruleset listing time with large rulesets
    (~50k rules) with _lots_ of non-base chains.

        # time nft list ruleset &> /dev/null

    Before this patch:

        real    0m11,172s
        user    0m6,810s
        sys     0m4,220s

    After this patch:

        real    0m4,747s
        user    0m0,802s
        sys     0m3,912s

    This patch also removes list_bindings from netlink_ctx since there is no
    need to keep a temporary list of chains anymore.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add expression handler hashtable | Pablo Neira Ayuso | 2020-08-26 | 1 | -0/+1

    netlink_parsers is actually small, but update this code to use a
    hashtable instead since more expressions may come in the future.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for set declarations | Jose M. Guisado Gomez | 2020-08-12 | 1 | -0/+2

    Allow users to add a comment when declaring a named set.

    Adds set output handling the comment in both nftables and json format.

        $ nft add table ip x
        $ nft add set ip x s {type ipv4_addr\; comment "some_addrs"\; elements = {1.1.1.1, 1.2.3.4}}

        $ nft list ruleset
        table ip x {
                set s {
                        type ipv4_addr;
                        comment "some_addrs"
                        elements = { 1.1.1.1, 1.2.3.4 }
                }
        }

        $ nft --json list ruleset
        {
            "nftables": [
                {
                    "metainfo": {
                        "json_schema_version": 1,
                        "release_name": "Capital Idea #2",
                        "version": "0.9.6"
                    }
                },
                {
                    "table": {
                        "family": "ip",
                        "handle": 4857,
                        "name": "x"
                    }
                },
                {
                    "set": {
                        "comment": "some_addrs",
                        "elem": [
                            "1.1.1.1",
                            "1.2.3.4"
                        ],
                        "family": "ip",
                        "handle": 1,
                        "name": "s",
                        "table": "x",
                        "type": "ipv4_addr"
                    }
                }
            ]
        }

    Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove cache lookups after the evaluation phase | Pablo Neira Ayuso | 2020-07-29 | 1 | -0/+4

    This patch adds a new field to the cmd structure for elements to store a
    reference to the set. This saves an extra lookup in the netlink bytecode
    generation step.

    This patch also allows the cache to be updated incrementally during the
    evaluation phase according to the command actions, which is required by
    the follow up ("evaluate: remove table from cache on delete table")
    bugfix patch.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for implicit chain bindings | Pablo Neira Ayuso | 2020-07-15 | 1 | -0/+7

    This patch allows you to group rules in a subchain, e.g.

        table inet x {
                chain y {
                        type filter hook input priority 0;
                        tcp dport 22 jump {
                                ip saddr { 127.0.0.0/8, 172.23.0.0/16, 192.168.13.0/24 } accept
                                ip6 saddr ::1/128 accept;
                        }
                }
        }

    This also supports the `goto' chain verdict.

    This patch adds a new chain binding list to avoid a chain list lookup
    from the delinearize path for the usual chains. This can be simplified
    later on with a single hashtable per table for all chains.

    From the shell, you have to use the explicit separator ';', in bash you
    have to escape this:

        # nft add rule inet x y tcp dport 80 jump { ip saddr 127.0.0.1 accept\; ip6 saddr ::1 accept \; }

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add CMD_OBJ_SETELEMS | Pablo Neira Ayuso | 2020-05-14 | 1 | -0/+2

    This new command type results from expanding the set definition into two
    commands: one to add the set and another to add the elements. This
    results in a 1:1 mapping between the command object and the netlink API.
    The command is then translated into a netlink message which gets a
    unique sequence number. This sequence number allows correlating the
    netlink extended error reporting with the corresponding command.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>