path: root/src
Commit message | Author | Age | Files | Lines
* json: fix base chain output | Florian Westphal | 2021-06-02 | 1 | -1/+1
    nft-test.py -j fails with

        python: json.c:243: chain_print_json: Assertion `__out' failed.

    The member was changed from char * to a struct, pass the name again.

    Fixes: 5008798157e2114f ("libnftables: location-based error reporting for chain type")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    (cherry picked from commit cabe8992b3ee4eb0001a07075b317d966df6bcbd)
* expression: display an error on unknown datatype | Pablo Neira Ayuso | 2021-05-24 | 1 | -1/+4
        # nft describe foo
        datatype foo is invalid

    Fixes: 21cbab5b6ffe ("expression: extend 'nft describe' to allow listing data types")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: allow == and != in the new shortcut syntax to match for flags | Pablo Neira Ayuso | 2021-05-24 | 1 | -0/+4
    The flags / mask syntax only allows for ==, != and the implicit operation (which is == in this case).

        # nft add rule x y tcp flags ! syn / syn,ack
        Error: either == or != is allowed
        add rule x y tcp flags ! syn / syn,ack
                     ^^^^^^^^^^^^^^^^^^^^^^^^^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expr_postprocess: Avoid an unintended fall through | Phil Sutter | 2021-05-20 | 1 | -0/+1
    When parsing a range expression, the switch case fell through to the prefix expression case, thereby recursing once more for expr->left. This seems not to have caused harm, but is certainly not intended.

    Fixes: ee4391d0ac1e7 ("nat: transform range to prefix expression when possible")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
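The effect of such a missing `break` can be reproduced in isolation. The following is a minimal sketch and not the nftables code: the `enum kind` values, the `struct expr` layout and the `visit()`/`postprocess()` helpers are hypothetical stand-ins. It counts how often the left operand is visited with and without the `break`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression kinds; the real nftables enum differs. */
enum kind { KIND_VALUE, KIND_RANGE, KIND_PREFIX };

struct expr {
	enum kind kind;
	struct expr *left;
	int visits;	/* how many times postprocessing reached this node */
};

static void visit(struct expr *e, int with_break);

static void postprocess(struct expr *e, int with_break)
{
	switch (e->kind) {
	case KIND_RANGE:
		visit(e->left, with_break);
		if (with_break)
			break;
		/* fall through: without the break, the prefix case runs too */
	case KIND_PREFIX:
		visit(e->left, with_break);
		break;
	case KIND_VALUE:
		break;
	}
}

static void visit(struct expr *e, int with_break)
{
	if (e == NULL)
		return;
	e->visits++;
	postprocess(e, with_break);
}

/* Build a one-level tree of the given kind and report visits of its left child. */
static int left_visits(enum kind k, int with_break)
{
	struct expr leaf = { KIND_VALUE, NULL, 0 };
	struct expr root = { k, &leaf, 0 };

	postprocess(&root, with_break);
	return leaf.visits;
}
```

In this sketch, `left_visits(KIND_RANGE, 0)` visits the left child twice, while `left_visits(KIND_RANGE, 1)` visits it once, which mirrors the extra recursion the commit message describes.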
* rule: skip exact matches on fuzzy lookup | Pablo Neira Ayuso | 2021-05-20 | 1 | -19/+0
    The fuzzy lookup is exercised from the error path, when no object is found. Remove the branch that checks for exact matching since that should never happen.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cmd: typo in chain fuzzy lookup | Pablo Neira Ayuso | 2021-05-20 | 1 | -1/+1
    Refer to chain, not table.

        Error: No such file or directory; did you mean table ‘z’ in family ip?
        add chain x y { type filter nat prerouting priority dstnat; }
                    ^

    It should say instead:

        Error: No such file or directory; did you mean chain ‘z’ in table ip ‘x’?

    [ Florian added args check for fmt to the netlink_io_error() prototype. ]

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: location-based error reporting for chain type | Pablo Neira Ayuso | 2021-05-20 | 5 | -7/+14
    Store the location of the chain type for better error reporting.

    Several users that compile custom kernels reported that error reporting is misleading when accidentally selecting CONFIG_NFT_NAT=n.

    After this patch, a better hint is provided:

        # nft 'add chain x y { type nat hook prerouting priority dstnat; }'
        Error: Could not process rule: No such file or directory
        add chain x y { type nat hook prerouting priority dstnat; }
                             ^^^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* exthdr: Implement SCTP Chunk matching | Phil Sutter | 2021-05-19 | 8 | -2/+506
    Extend exthdr expression to support scanning through SCTP packet chunks and matching on fixed fields' values.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Acked-by: Florian Westphal <fw@strlen.de>
* json: Simplify non-tcpopt exthdr printing a bit | Phil Sutter | 2021-05-19 | 1 | -11/+7
    This was just duplicate code apart from the object's name.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
* scanner: sctp: Move to own scope | Phil Sutter | 2021-05-19 | 2 | -4/+9
    This isolates only the "vtag" token for now.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Reviewed-by: Florian Westphal <fw@strlen.de>
* datatype: skip cgroupv2 rootfs in listing | Pablo Neira Ayuso | 2021-05-18 | 1 | -1/+2
    The cgroupv2 path is expressed from the /sys/fs/cgroup folder; update the listing to skip it.

        # nft add rule x y socket cgroupv2 level 1 "user.slice" counter
        # nft list ruleset
        table ip x {
                chain y {
                        type filter hook input priority filter; policy accept;
                        socket cgroupv2 level 1 "user.slice" counter
                }
        }

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use PRIu64 format | Pablo Neira Ayuso | 2021-05-18 | 2 | -2/+2
    Fix the following compilation warnings on x86_32.

        datatype.c: In function ‘cgroupv2_type_print’:
        datatype.c:1387:22: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
        nft_print(octx, "%lu", id);
                         ~~^   ~~
                         %llu
        meta.c: In function ‘date_type_print’:
        meta.c:411:21: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
        nft_print(octx, "%lu", tstamp);
                         ~~^   ~~~~~~
                         %llu

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
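The warnings above come from printing a `uint64_t` with `%lu`, which only matches on platforms where `unsigned long` is 64 bits wide. A minimal sketch of the portable form follows; the `u64_to_str()` helper is hypothetical and uses `snprintf()` rather than nftables' `nft_print()`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a 64-bit value into a static buffer.
 * PRIu64 expands to the conversion specifier that matches uint64_t on
 * the target platform ("lu" on typical 64-bit Linux, "llu" on x86_32).
 */
static const char *u64_to_str(uint64_t value)
{
	static char buf[32];	/* 20 digits plus the terminating NUL fit easily */

	snprintf(buf, sizeof(buf), "%" PRIu64, value);
	return buf;
}
```

Because the macro is a string literal, it is concatenated with the surrounding format string at compile time, so the same source builds without `-Wformat` warnings on both 32-bit and 64-bit targets.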
* parser_bison: add shortcut syntax for matching flags without binary operations | Pablo Neira Ayuso | 2021-05-16 | 5 | -22/+140
    This patch adds the following shortcut syntax:

        expression flags / flags

    instead of:

        expression and flags == flags

    For example:

        tcp flags syn,ack / syn,ack,fin,rst
                  ^^^^^^^   ^^^^^^^^^^^^^^^
                   value         mask

    instead of:

        tcp flags and (syn|ack|fin|rst) == syn|ack

    The second list of comma-separated flags is the mask of flags that are examined; the first list of comma-separated flags must be set.

    You can also use the != operator with this syntax:

        tcp flags != fin,rst / syn,ack,fin,rst

    This shortcut is based on the prefix notation, but it is also similar to the iptables tcp matching syntax.

    This patch introduces the flagcmp expression to print the tcp flags in this new notation. The delinearize path transforms the binary expression to this new flagcmp expression whenever possible.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
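The value / mask semantics can be written out as a plain bitwise test. This is a minimal sketch of the comparison the syntax expresses, not the nftables implementation; the `TCP_*` constants follow the TCP header bit layout and `flags_match()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TCP flag bits as they appear in the TCP header. */
#define TCP_FIN 0x01
#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_PSH 0x08
#define TCP_ACK 0x10

/*
 * "flags value / mask": only the bits selected by mask are examined,
 * and among those exactly the bits in value must be set.
 */
static bool flags_match(uint8_t flags, uint8_t value, uint8_t mask)
{
	return (flags & mask) == value;
}
```

With `value = TCP_SYN | TCP_ACK` and `mask = TCP_SYN | TCP_ACK | TCP_FIN | TCP_RST`, a SYN/ACK packet matches even if PSH is also set, because PSH is outside the mask; a packet that additionally carries FIN does not match. The `!=` form corresponds to negating the result.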
* cache: check errno before invoking cache_release() | Marco Oliverio | 2021-05-14 | 1 | -2/+4
    If genid changes during cache_init(), check_genid() sets errno to EINTR to force a re-init of the cache.

    cache_release() may inadvertently change errno by calling free(). Indeed free() may invoke madvise() that changes errno to ENOSYS on systems where the kernel is configured without support for this syscall.

    Signed-off-by: Marco Oliverio <marco.oliverio@tanaza.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
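The ordering problem can be shown with a small self-contained sketch. Nothing here is nftables code: `release_resources()` is a hypothetical stand-in for `cache_release()` that overwrites `errno` on purpose, and `init_should_retry()` simulates the generation check by setting `errno` to `EINTR` itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical cleanup step: frees memory and then clobbers errno, the
 * way a failing syscall made internally by free() might.
 */
static void release_resources(void *p)
{
	free(p);
	errno = ENOSYS;	/* simulated side effect of the cleanup */
}

/* Returns 1 if the caller would retry initialization, 0 otherwise. */
static int init_should_retry(int check_before_release)
{
	void *p = malloc(16);
	int retry;

	errno = EINTR;	/* simulated "generation changed" signal */

	if (check_before_release) {
		retry = (errno == EINTR);	/* inspect errno while it is still valid */
		release_resources(p);
	} else {
		release_resources(p);
		retry = (errno == EINTR);	/* too late: errno was overwritten */
	}
	return retry;
}
```

In this sketch only the variant that inspects `errno` before the cleanup sees `EINTR`; the other one silently loses the retry signal.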
* netlink_delinearize: fix binary operation postprocessing with sets | Pablo Neira Ayuso | 2021-05-13 | 1 | -0/+1
    If the right-hand side expression of the binary expression is a set, then skip the postprocessing step; otherwise tests/py reports the following warnings:

        # ./nft-test.py inet/tcp.t
        inet/tcp.t: WARNING: line 80: 'add rule ip test-ip4 input tcp flags & (syn|fin) == (syn|fin)': 'tcp flags & (fin | syn) == fin | syn' mismatches 'tcp flags ! fin,syn'
        inet/tcp.t: WARNING: line 83: 'add rule ip test-ip4 input tcp flags & (fin | syn | rst | psh | ack | urg) == { fin, ack, psh | ack, fin | psh | ack }': 'tcp flags & (fin | syn | rst | psh | ack | urg) == { fin, ack, psh | ack, fin | psh | ack }' mismatches 'tcp flags ! fin,syn,rst,psh,ack,urg'

    This listing is not correct.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: don't crash on set definition with incorrect datatype | Pablo Neira Ayuso | 2021-05-11 | 1 | -1/+1
    Cache updates have resurrected the bug described in 5afa5a164ff1 ("evaluate: check for NULL datatype in rhs in lookup expr").

    This is triggered by testcases/cache/0008_delete_by_handle_0.

    Fixes: df48e56e987f ("cache: add hashtable cache for sets")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set element catch-all support | Pablo Neira Ayuso | 2021-05-11 | 6 | -43/+118
    Add a catchall expression (EXPR_SET_ELEM_CATCHALL).

    Use the asterisk (*) to represent the catch-all set element, e.g.

        table x {
                set y {
                        type ipv4_addr
                        counter
                        elements = { 1.2.3.4 counter packets 0 bytes 0, * counter packets 0 bytes 0 }
                }
        }

    Special handling for segtree: zap the catch-all element from the set element list and re-add it after processing.

    Remove wildcard_expr dead code in src/parser_bison.y

    This patch also adds several tests for the tests/py and tests/shell infrastructures.

    Acked-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: add set_elem_key_expr rule | Pablo Neira Ayuso | 2021-05-11 | 1 | -2/+8
    Add a rule to specify the set key expression in preparation for the catch-all element support.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: Fix range_mask_len() for subnet ranges exceeding unsigned int | Stefano Brivio | 2021-05-08 | 1 | -2/+2
    As concatenated ranges are fetched from kernel sets and displayed to the user, range_mask_len() evaluates whether the range is suitable for display as netmask, and in that case it calculates the mask length by right-shifting the endpoints until no set bits are left. In the existing version, however, the temporary copies of the endpoints are derived by copying their unsigned int representation, which doesn't suffice for IPv6 netmask lengths, in general.

    PetrB reports that, after inserting a /56 subnet in a concatenated set element, it's listed as a /64 range. In fact, this happens for any IPv6 mask shorter than 64 bits.

    Fix this issue by simply sourcing the range endpoints provided by the caller and setting the temporary copies with mpz_init_set(), instead of fetching the unsigned int representation. The issue only affects displaying of the masks; setting elements already works as expected.

    Reported-by: PetrB <petr.boltik@gmail.com>
    Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1520
    Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
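The reported /56 versus /64 behaviour can be reproduced with a small model of the shift loop. This is a sketch under stated assumptions, not the nftables function: it uses a hypothetical `struct u128` instead of GMP's `mpz_t`, the example prefix is made up, and it assumes the truncated copy keeps the low 64 bits of each endpoint.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit value, standing in for an mpz_t range endpoint. */
struct u128 { uint64_t hi, lo; };

static struct u128 shr1(struct u128 v)
{
	v.lo = (v.lo >> 1) | (v.hi << 63);
	v.hi >>= 1;
	return v;
}

static int cmp128(struct u128 a, struct u128 b)
{
	if (a.hi != b.hi)
		return a.hi < b.hi ? -1 : 1;
	if (a.lo != b.lo)
		return a.lo < b.lo ? -1 : 1;
	return 0;
}

/* Simulated truncation: keep only the low 64 bits of an endpoint. */
static struct u128 low64(struct u128 v)
{
	v.hi = 0;
	return v;
}

/*
 * Shift both endpoints right while start ends in 0 and end ends in 1.
 * If they meet, the remaining bit count is the prefix length; else -1.
 */
static int mask_len(struct u128 start, struct u128 end, int len)
{
	while (len > 0 && cmp128(start, end) < 0 &&
	       !(start.lo & 1) && (end.lo & 1)) {
		start = shr1(start);
		end = shr1(end);
		len--;
	}
	return cmp128(start, end) == 0 ? len : -1;
}

/* Prefix length computed for a made-up /56 range, with or without truncation. */
static int prefix56_len(int truncate)
{
	struct u128 start = { UINT64_C(0x20010db8aabbcc00), 0 };
	struct u128 end = { UINT64_C(0x20010db8aabbccff), UINT64_MAX };

	if (truncate) {
		start = low64(start);
		end = low64(end);
	}
	return mask_len(start, end, 128);
}
```

With the full endpoints the loop stops after 72 shifts and yields 56. With truncated endpoints both values reach zero after 64 shifts, so the sketch reports 64, matching the symptom described in the bug report.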
* src: add cgroupsv2 support | Pablo Neira Ayuso | 2021-05-03 | 7 | -7/+119
    Add support for matching on cgroups version 2.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: remove object from cache on delete object command | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+37
    Update the cache to remove this object from the evaluation phase.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: remove flowtable from cache on delete flowtable command | Pablo Neira Ayuso | 2021-05-02 | 2 | -0/+29
    Update the cache to remove this flowtable from the evaluation phase. Add flowtable_cache_del() function for this purpose.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: remove set from cache on delete set command | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+24
    Update the cache to remove this set from the evaluation phase.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: remove chain from cache on delete chain command | Pablo Neira Ayuso | 2021-05-02 | 2 | -0/+29
    Update the cache to remove this chain from the evaluation phase. Add chain_cache_del() function for this purpose.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for table | Pablo Neira Ayuso | 2021-05-02 | 8 | -111/+168
    Add a hashtable for fast table lookups.

    Tables that reside in the cache use the table->cache_hlist and table->cache_list heads.

    Tables that are created from command line / ruleset are also added to the cache.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: add object to the cache | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+10
    If the cache does not contain this object that is defined in this batch, add it to the cache. This allows for references to this new object in the same batch.

    This patch also adds the missing handle_merge() to set the object name; otherwise the object name is NULL and obj_cache_find() crashes.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: missing table cache for several policy objects | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+4
    Populate the cache with tables for several policy object types.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: add flowtable to the cache | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+3
    If the cache does not contain this flowtable that is defined in this batch, then add it to the cache. This allows for references to this new flowtable in the same batch.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: add set to the cache | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+4
    If the cache does not contain the set that is defined in this batch, add it to the cache. This allows for references to this new set in the same batch.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add set_cache_del() and use it | Pablo Neira Ayuso | 2021-05-02 | 2 | -1/+6
    Update set_cache_del() from the monitor path to remove sets in the cache.

    Fixes: df48e56e987f ("cache: add hashtable cache for sets")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for flowtable | Pablo Neira Ayuso | 2021-05-02 | 5 | -31/+104
    Add flowtable hashtable cache.

    Actually I am not expecting that many flowtables to benefit from the hashtable to be created, but this streamlines the code with tables, chains, sets and policy objects.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for object | Pablo Neira Ayuso | 2021-05-02 | 6 | -52/+117
    This patch adds a hashtable for object lookups.

    This patch also splits table->objs in two:

    - Objects that reside in the cache are stored in the new tables->cache_obj and tables->cache_obj_ht.

    - Objects that are defined via command line / ruleset file reside in tables->obj.

    Objects in the cache (already in the kernel) are not placed in the table->objs list.

    By keeping separated lists, objs defined via command line / ruleset file can be added to cache.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: consolidate object cache infrastructure | Pablo Neira Ayuso | 2021-05-02 | 3 | -42/+56
    This patch consolidates the object cache infrastructure. Update sets and chains to use it.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: consolidate nft_cache infrastructure | Pablo Neira Ayuso | 2021-05-02 | 4 | -24/+25
    - prepend nft_ prefix to nft_cache API and internal functions
    - move declarations to cache.h (and remove redundant declarations)
    - move struct nft_cache definition to cache.h

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: pass chain name to chain_cache_find() | Pablo Neira Ayuso | 2021-05-02 | 4 | -12/+11
    Chains can be identified through their unique handle in deletions. Update this interface to take a string instead of the handle to prepare for the introduction of 64-bit handle chain lookups.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: skip fuzzy lookup for unexisting 64-bit handle | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+15
    Deletion by handle, if incorrect, should not exercise the misspell lookup functions.

    Fixes: 3a0e07106f66 ("src: combine extended netlink error reporting with mispelling support")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: unbreak deletion by table handle | Pablo Neira Ayuso | 2021-05-02 | 2 | -1/+4
    Use NFTA_TABLE_HANDLE instead of NFTA_TABLE_NAME to refer to the table 64-bit unique handle.

    Fixes: 7840b9224d5b ("evaluate: remove table from cache on delete table")
    Fixes: f8aec603aa7e ("src: initial extended netlink error reporting")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: missing relational operation on flag list | Pablo Neira Ayuso | 2021-05-02 | 1 | -0/+4
    Complete e6c32b2fa0b8 ("src: add negation match on singleton bitmask value"), which was missing support for a comma-separated list of flags.

    This patch provides a shortcut for:

        tcp flags and fin,rst == 0

    which allows checking for packets whose fin and rst bits are unset:

        # nft add rule x y tcp flags not fin,rst counter

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: allow to load stateful ct connlimit elements in sets | Laura Garcia Liebana | 2021-05-02 | 1 | -0/+11
    This patch fixes a syntax error after loading a nft dump with a set including stateful ct connlimit elements.

    Given a nft dump such as:

        table ip nftlb {
                set connlimit-set {
                        type ipv4_addr
                        size 65535
                        flags dynamic
                        elements = { 84.245.120.167 ct count over 20 , 86.111.207.45 ct count over 20 , 173.212.220.26 ct count over 20 , 200.153.13.235 ct count over 20 }
                }
        }

    The syntax error is shown when loading the ruleset.

        root# nft -f connlimit.nft
        connlimit.nft:15997:31-32: Error: syntax error, unexpected ct, expecting comma or '}'
        elements = { 84.245.120.167 ct count over 20 , 86.111.207.45 ct count over 20 ,
                                    ^^
        connlimit.nft:16000:9-22: Error: syntax error, unexpected string
        173.212.220.26 ct count over 20 , 200.153.13.235 ct count over 20 }
        ^^^^^^^^^^^^^^

    After applying this patch a kernel panic is raised running nft_rhash_gc() although no packet reaches the set.

    The following patch [0] should be used as well:

        4d8f9065830e5 ("netfilter: nftables: clone set element expression template")

    Note that the kernel patch will empty the connection tracking table, so restoring the conntrack states should be considered.

    [0]: https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git/commit/?id=4d8f9065830e526c83199186c5f56a6514f457d2

    Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: check if nat statement map specifies a transport header expr | Florian Westphal | 2021-04-29 | 1 | -1/+37
    Importing the systemd nat table fails:

        table ip io.systemd.nat {
                map map_port_ipport {
                        type inet_proto . inet_service : ipv4_addr . inet_service
                        elements = { tcp . 8088 : 192.168.162.117 . 80 }
                }
                chain prerouting {
                        type nat hook prerouting priority dstnat + 1; policy accept;
                        fib daddr type local dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
                }
        }

        ruleset:9:48-59: Error: transport protocol mapping is only valid after transport protocol match

    To resolve this (no transport header base specified), check if the map itself contains a network base protocol expression.

    This allows nft to import the ruleset. Import still fails with same error if 'inet_service' is removed from the map, as it should.

    Reported-by: Henning Reich <henning.reich@gmail.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: Increase BATCH_PAGE_SIZE to support huge rulesets | Phil Sutter | 2021-04-21 | 1 | -4/+4
    Apply the same change from iptables-nft to nftables to keep them in sync with regard to max supported transaction sizes.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
* cache: bail out if chain list cannot be fetched from kernel | Pablo Neira Ayuso | 2021-04-03 | 1 | -1/+1
    Do not report success if the chain cache list cannot be built.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add hashtable cache for sets | Pablo Neira Ayuso | 2021-04-03 | 7 | -72/+122
    This patch adds a hashtable for set lookups.

    This patch also splits table->sets in two:

    - Sets that reside in the cache are stored in the new tables->cache_set and tables->cache_set_ht.

    - Sets that are defined via command line / ruleset file reside in tables->set.

    Sets in the cache (already in the kernel) are not placed in the table->sets list.

    By keeping separated lists, sets defined via command line / ruleset file can be added to cache.

    Adding 10000 sets, before:

        # time nft -f x
        real    0m6,415s
        user    0m3,126s
        sys     0m3,284s

    After:

        # time nft -f x
        real    0m3,949s
        user    0m0,743s
        sys     0m3,205s

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: check for NULL chain in cache_init() | Pablo Neira Ayuso | 2021-04-03 | 1 | -0/+5
    Another process might race to add chains after chain_cache_init(). The generation check does not help since it comes after cache_init(). NLM_F_DUMP_INTR only guarantees consistency within one single netlink dump operation, so it does not help either (cache population requires several netlink dump commands).

    Let's be safe and not assume the chain exists in the cache when populating the rule cache.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: statify chain_cache_dump() | Pablo Neira Ayuso | 2021-04-03 | 1 | -1/+2
    Only used internally in cache.c

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: use chain hashtable for lookups | Pablo Neira Ayuso | 2021-04-03 | 3 | -17/+6
    Instead of the linear list lookup.

    Before this patch:

        real    0m21,735s
        user    0m20,329s
        sys     0m1,384s

    After:

        real    0m10,910s
        user    0m9,448s
        sys     0m1,434s

    chain_lookup() is removed since linear list lookups are only used by the fuzzy chain name matching for error reporting.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
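The speedup comes from replacing a walk over every chain with a walk over one hash bucket. The following minimal sketch shows a name-keyed chained hashtable; the `struct chain` layout, the bucket count and the djb2 hash are all assumptions made for illustration and are not taken from the nftables sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HT_SIZE 64	/* hypothetical bucket count */

struct chain {
	const char *name;
	struct chain *next;	/* next entry in the same bucket */
};

struct chain_ht {
	struct chain *buckets[HT_SIZE];
};

/* djb2 string hash, chosen for the sketch only. */
static uint32_t hash_name(const char *s)
{
	uint32_t h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h % HT_SIZE;
}

static void ht_add(struct chain_ht *ht, struct chain *c)
{
	uint32_t b = hash_name(c->name);

	c->next = ht->buckets[b];
	ht->buckets[b] = c;
}

/* Only the entries of one bucket are compared, not the whole list. */
static struct chain *ht_find(const struct chain_ht *ht, const char *name)
{
	struct chain *c;

	for (c = ht->buckets[hash_name(name)]; c != NULL; c = c->next)
		if (strcmp(c->name, name) == 0)
			return c;
	return NULL;
}

/* Build a small table and report whether the given name is found. */
static int demo_lookup(const char *name)
{
	struct chain_ht ht = { { NULL } };
	struct chain chains[] = {
		{ "input", NULL }, { "forward", NULL }, { "output", NULL },
	};
	size_t i;

	for (i = 0; i < sizeof(chains) / sizeof(chains[0]); i++)
		ht_add(&ht, &chains[i]);
	return ht_find(&ht, name) != NULL;
}
```

A linear list remains useful where every entry has to be inspected anyway, which is consistent with the message's note that fuzzy name matching for error reporting still walks the list.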
* src: split chain list in table | Pablo Neira Ayuso | 2021-04-03 | 3 | -13/+17
    This patch splits table->lists in two:

    - Chains that reside in the cache are stored in the new tables->cache_chain and tables->cache_chain_ht. The hashtable chain cache allows for fast chain lookups.

    - Chains that are defined via command line / ruleset file reside in tables->chains.

    Note that chains in the cache (already in the kernel) are not placed in the table->chains.

    By keeping separated lists, chains defined via command line / ruleset file can be added to cache.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: rename chain_htable to cache_chain_ht | Pablo Neira Ayuso | 2021-04-03 | 2 | -6/+6
    Rename the hashtable chain that is used for fast cache lookups.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* proto: replace vlan ether type with 8021q | Florian Westphal | 2021-04-03 | 2 | -1/+5
    Previous patches added "8021ad" mnemonic for IEEE 802.1AD frame type. This adds the 8021q shorthand for the existing 'vlan' frame type.

    nft will continue to recognize 'ether type vlan', but listing will now print 8021q.

    Adjust all test cases accordingly.

    Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: be careful on vlan dependency removal | Florian Westphal | 2021-04-03 | 1 | -3/+26
    'vlan ...' implies an 8021Q frame. In case the expression tests something else (802.1AD for example), it's not an implicitly added one, so keep it.

    Signed-off-by: Florian Westphal <fw@strlen.de>