summaryrefslogtreecommitdiffstats
path: root/src/segtree.c
Commit message (Collapse)AuthorAgeFilesLines
* segtree: fix decomposition of unclosed intervals containing address prefixesJeremy Sowden2022-09-211-16/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code which decomposes unclosed intervals doesn't check for prefixes. This leads to incorrect output for sets which contain these. For example, # nft -f - <<END table ip t { chain c { ip saddr 192.0.0.0/2 drop ip saddr 10.0.0.0/8 drop ip saddr { 192.0.0.0/2, 10.0.0.0/8 } drop } } table ip6 t { chain c { ip6 saddr ff00::/8 drop ip6 saddr fe80::/10 drop ip6 saddr { ff00::/8, fe80::/10 } drop } } END # nft list table ip6 t table ip6 t { chain c { ip6 saddr ff00::/8 drop ip6 saddr fe80::/10 drop ip6 saddr { fe80::/10, ff00::-ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff } drop } } # nft list table ip t table ip t { chain c { ip saddr 192.0.0.0/2 drop ip saddr 10.0.0.0/8 drop ip saddr { 10.0.0.0/8, 192.0.0.0-255.255.255.255 } drop } } Instead of treating the final unclosed interval as a special case, reuse the code which correctly handles closed intervals. Add a shell test-case. Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1018156 Fixes: 86b965bdab8d ("segtree: fix decomposition of unclosed intervals") Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* segtree: refactor decomposition of closed intervalsJeremy Sowden2022-09-211-33/+38
| | | | | | | | | | | Move the code in `interval_map_decompose` which adds a new closed interval to the set into a separate function. In addition to the moving of the code, there is one other change: `compound_expr_add` is called once, after the main conditional, instead of being called in each branch. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* segtree: fix map listing with interface wildcardPablo Neira Ayuso2022-06-271-1/+1
| | | | | | | | | | | | | | | | | | | | | # nft -f - <<'EOF' table inet filter { chain INPUT { iifname vmap { "eth0" : jump input_lan, "wg*" : jump input_vpn } } chain input_lan {} chain input_vpn {} } EOF # nft list ruleset nft: segtree.c:578: interval_map_decompose: Assertion `low->len / 8 > 0' failed. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1617 Fixes: 5e393ea1fc0a ("segtree: add string "range" reversal support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: add pretty-print support for wildcard strings in concatenated setsFlorian Westphal2022-05-091-2/+36
| | | | | | | | | | For concat ranges, something like 'ppp*' is translated as a range from 'ppp\0\0\0...' to 'ppp\ff\ff\ff...'. In order to display this properly, check for presence of string base type and convert to symbolic expression, with appended '*' character. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: replace interval segment tree overlap and automergePablo Neira Ayuso2022-04-131-651/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a rewrite of the segtree interval codebase. This patch now splits the original set_to_interval() function in three routines: - add set_automerge() to merge overlapping and contiguous ranges. The elements, expressed either as single value, prefix and ranges are all first normalized to ranges. This elements expressed as ranges are mergesorted. Then, there is a linear list inspection to check for merge candidates. This code only merges elements in the same batch, ie. it does not merge elements in the kernela and the userspace batch. - add set_overlap() to check for overlapping set elements. Linux kernel >= 5.7 already checks for overlaps, older kernels still needs this code. This code checks for two conflict types: 1) between elements in this batch. 2) between elements in this batch and kernelspace. The elements in the kernel are temporarily merged into the list of elements in the batch to check for this overlaps. The EXPR_F_KERNEL flag allows us to restore the set cache after the overlap check has been performed. - set_to_interval() now only transforms set elements, expressed as range e.g. [a,b], to individual set elements using the EXPR_F_INTERVAL_END flag notation to represent e.g. [a,b+1), where b+1 has the EXPR_F_INTERVAL_END flag set on. More relevant updates: - The overlap and automerge routines are now performed in the evaluation phase. - The userspace set object representation now stores a reference to the existing kernel set object (in case there is already a set with this same name in the kernel). This is required by the new overlap and automerge approach. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add EXPR_F_KERNEL to identify expression in the kernelPablo Neira Ayuso2022-04-131-1/+4
| | | | | | This allows to identify the set elements that reside in the kernel. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: add support for get element with sets that contain ifnamesFlorian Westphal2022-04-131-14/+45
| | | | | | | | | | | | | | | | nft get element inet filter s { bla, prefixfoo } table inet filter { set s { type ifname flags interval elements = { "prefixfoo*", "bla" } } Also add test cases for this. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: use correct byte order for 'element get'Florian Westphal2022-04-131-1/+2
| | | | | | | | Fails when the argument / set contains strings: we need to use host byte order if element has string base type. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: add string "range" reversal supportFlorian Westphal2022-04-131-6/+41
| | | | | | | | | | | | | | | | | | | | Previous commits allows to use set key as a range, i.e. key ifname flags interval elements = { eth* } and then have it match on any interface starting with 'eth'. Listing is broken however, we need to reverse-translate the (128bit) number back to a string. 'eth*' is stored as interval 00687465 0000000 .. 00697465 0000000, i.e. "eth-eti", this adds the needed endianess fixups. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: make interval sets work with string datatypesFlorian Westphal2022-04-131-4/+26
| | | | | | | | | | | | | | | | | | | Allows to interface names in interval sets: table inet filter { set s { type ifname flags interval elements = { eth*, foo } } Concatenations are not yet supported, also, listing is broken, those strings will not be printed back because the values will remain in big-endian order. Followup patch will extend segtree to translate this back to host byte order. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: split prefix and range creation to a helper functionFlorian Westphal2022-04-131-43/+52
| | | | | | | No functional change intended. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: memleak get element commandPablo Neira Ayuso2022-02-171-0/+2
| | | | | | | | | Release removed interval expressions before get_set_interval_find() fails. The memleak can be triggered through: testcases/sets/0034get_element_0 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: memleak in error path of the set to segtree conversionPablo Neira Ayuso2021-06-181-2/+14
| | | | | | | | | | | Release the array of intervals and the segtree in case of error, otherwise these structures and objects are never released: SUMMARY: AddressSanitizer: 2864 byte(s) leaked in 37 allocation(s). Moreover, improve existing a test coverage of this error path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: replace opencoded NFT_SET_ANONYMOUS set flag check by set_is_anonymous()Pablo Neira Ayuso2021-06-141-1/+1
| | | | | | | | Use set_is_anonymous() to check for the NFT_SET_ANONYMOUS set flag instead. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: Fix segfault when restoring a huge interval setPhil Sutter2021-06-091-4/+6
| | | | | | | | | | | | Restoring a set of IPv4 prefixes with about 1.1M elements crashes nft as set_to_segtree() exhausts the stack. Prevent this by allocating the pointer array on heap and make sure it is freed before returning to caller. With this patch in place, restoring said set succeeds with allocation of about 3GB of memory, according to valgrind. Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: add set element catch-all supportPablo Neira Ayuso2021-05-111-1/+40
| | | | | | | | | | | | | | | | | | | | | | | | | Add a catchall expression (EXPR_SET_ELEM_CATCHALL). Use the asterisk (*) to represent the catch-all set element, e.g. table x { set y { type ipv4_addr counter elements = { 1.2.3.4 counter packets 0 bytes 0, * counter packets 0 bytes 0 } } } Special handling for segtree: zap the catch-all element from the set element list and re-add it after processing. Remove wildcard_expr deadcode in src/parser_bison.y This patch also adds several tests for the tests/py and tests/shell infrastructures. Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: Fix range_mask_len() for subnet ranges exceeding unsigned intStefano Brivio2021-05-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | As concatenated ranges are fetched from kernel sets and displayed to the user, range_mask_len() evaluates whether the range is suitable for display as netmask, and in that case it calculates the mask length by right-shifting the endpoints until no set bits are left, but in the existing version the temporary copies of the endpoints are derived by copying their unsigned int representation, which doesn't suffice for IPv6 netmask lengths, in general. PetrB reports that, after inserting a /56 subnet in a concatenated set element, it's listed as a /64 range. In fact, this happens for any IPv6 mask shorter than 64 bits. Fix this issue by simply sourcing the range endpoints provided by the caller and setting the temporary copies with mpz_init_set(), instead of fetching the unsigned int representation. The issue only affects displaying of the masks, setting elements already works as expected. Reported-by: PetrB <petr.boltik@gmail.com> Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1520 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* segtree: release single element already contained in an intervalPablo Neira Ayuso2021-03-241-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch: table ip x { chain y { ip saddr { 1.1.1.1-1.1.1.2, 1.1.1.1 } } } results in: table ip x { chain y { ip saddr { 1.1.1.1 } } } due to incorrect interval merge logic. If the element 1.1.1.1 is already contained in an existing interval 1.1.1.1-1.1.1.2, release it. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1512 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: honor set element expirationPablo Neira Ayuso2021-01-061-20/+14
| | | | | | | | Extend c1f0476fd590 ("segtree: copy expr data to closing element") to use interval_expr_copy() from the linearization path. Reported-by: Mike Dillinger <miked@softtalker.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set element multi-statement supportPablo Neira Ayuso2020-12-181-4/+2
| | | | | | | | Extend the set element infrastructure to support for several statements. This patch places the statements right after the key when printing it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: UAF in interval_map_decompose()Pablo Neira Ayuso2020-10-201-3/+5
| | | | | | | | | | | reported by tests/monitor# bash run-tests.sh ... SUMMARY: AddressSanitizer: heap-use-after-free /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:1385 in expr_ops Due to incorrect structure layout when calling interval_expr_copy(). Fixes: c1f0476fd590 ("segtree: copy expr data to closing element") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: copy expr data to closing elementFlorian Westphal2020-10-151-40/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When last expr has no closing element we did not propagate expr properties such as comment or expire date to the newly allocated set elem. Before: nft create table t nft 'add set t s { type ipv4_addr; flags interval; timeout 60s; }' nft add element t s { 224.0.0.0/3 } nft list set t s | grep -o 'elements.*' elements = { 224.0.0.0-255.255.255.255 } nft flush set t s nft add element t s { 224.0.0.0/4, 240.0.0.0/4 } nft list set t s | grep -o 'elements.*' elements = { 224.0.0.0/4 expires 55s152ms, 240.0.0.0-255.255.255.255 } nft delete set t s nft 'add set t s { type ipv4_addr; flags interval; auto-merge; timeout 60s; }' nft add element t s { 224.0.0.0/4, 240.0.0.0/4 } nft list set t s | grep -o 'elements.*' elements = { 224.0.0.0-255.255.255.255 } After: elements = { 224.0.0.0-255.255.255.255 expires 58s515ms } elements = { 224.0.0.0/4 expires 54s622ms, 240.0.0.0-255.255.255.255 expires 54s622ms } elements = { 224.0.0.0-255.255.255.255 expires 57s92ms } Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1454 Signed-off-by: Florian Westphal <fw@strlen.de>
* segtree: memleaks in interval_map_decompose()Pablo Neira Ayuso2020-08-051-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mpz_init_bitmask() overrides the existing memory area: ==19179== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1 ==19179== at 0x483577F: malloc (vg_replace_malloc.c:299) ==19179== by 0x489C718: xmalloc (utils.c:36) ==19179== by 0x4B825C5: __gmpz_init2 (in /usr/lib/x86_64-linux-g nu/libgmp.so.10.3.2) f ==19179== by 0x4880239: constant_expr_alloc (expression.c:400) ==19179== by 0x489B8A1: interval_map_decompose (segtree.c:1098) ==19179== by 0x489017D: netlink_list_setelems (netlink.c:1220) ==19179== by 0x48779AC: cache_init_objects (rule.c:170) 5 ==19179== by 0x48779AC: cache_init (rule.c:228) ==19179== by 0x48779AC: cache_update (rule.c:279) ==19179== by 0x48A21AE: nft_evaluate (libnftables.c:406) left-hand side of the interval is leaked when building the range: ==25835== 368 (128 direct, 240 indirect) bytes in 1 blocks are definitely lost in loss record 5 of 5 ==25835== at 0x483577F: malloc (vg_replace_malloc.c:299) ==25835== by 0x489B628: xmalloc (utils.c:36) ==25835== by 0x489B6F8: xzalloc (utils.c:65) ==25835== by 0x487E176: expr_alloc (expression.c:45) ==25835== by 0x487F960: mapping_expr_alloc (expression.c:1149) ==25835== by 0x488EC84: netlink_delinearize_setelem (netlink.c:1166) ==25835== by 0x4DC6928: nftnl_set_elem_foreach (set_elem.c:725) ==25835== by 0x488F0D5: netlink_list_setelems (netlink.c:1215) ==25835== by 0x487695C: cache_init_objects (rule.c:170) ==25835== by 0x487695C: cache_init (rule.c:228) ==25835== by 0x487695C: cache_update (rule.c:279) ==25835== by 0x48A10BE: nft_evaluate (libnftables.c:406) ==25835== by 0x48A19B6: nft_run_cmd_from_buffer (libnftables.c:451) ==25835== by 0x10A8E1: main (main.c:487) Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove cache lookups after the evaluation phasePablo Neira Ayuso2020-07-291-11/+6
| | | | | | | | | | | | This patch adds a new field to the cmd structure for elements to store a reference to the set. This saves an extra lookup in the netlink bytecode generation step. This patch also allows to incrementally update during the evaluation phase according to the command actions, which is required by the follow up ("evaluate: remove table from cache on delete table") bugfix patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: zap element statement when decomposing intervalPablo Neira Ayuso2020-07-061-0/+16
| | | | | | | | Otherwise, interval sets do not display element statement such as counters. Fixes: 6d80e0f15492 ("src: support for counter in set definition") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: fix asan runtime errorPablo Neira Ayuso2020-06-081-2/+2
| | | | | | | | | | ASAN reports: segtree.c:387:30: runtime error: variable length array bound evaluates to non-positive value 0 Update array definition to be the set size plus 1. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix netlink_get_setelem() memleaksPablo Neira Ayuso2020-05-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ==26693==ERROR: LeakSanitizer: detected memory leaks Direct leak of 256 byte(s) in 2 object(s) allocated from: #0 0x7f6ce2189330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330) #1 0x7f6ce1b1767a in xmalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:36 #2 0x7f6ce1b177d3 in xzalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:65 #3 0x7f6ce1a41760 in expr_alloc /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:45 #4 0x7f6ce1a4dea7 in set_elem_expr_alloc /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:1278 #5 0x7f6ce1ac2215 in netlink_delinearize_setelem /home/pablo/devel/scm/git-netfilter/nftables/src/netlink.c:1094 #6 0x7f6ce1ac3c16 in list_setelem_cb /home/pablo/devel/scm/git-netfilter/nftables/src/netlink.c:1172 #7 0x7f6ce0198808 in nftnl_set_elem_foreach /home/pablo/devel/scm/git-netfilter/libnftnl/src/set_elem.c:725 Indirect leak of 256 byte(s) in 2 object(s) allocated from: #0 0x7f6ce2189330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330) #1 0x7f6ce1b1767a in xmalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:36 #2 0x7f6ce1b177d3 in xzalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:65 #3 0x7f6ce1a41760 in expr_alloc /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:45 #4 0x7f6ce1a4515d in constant_expr_alloc /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:388 #5 0x7f6ce1abaf12 in netlink_alloc_value /home/pablo/devel/scm/git-netfilter/nftables/src/netlink.c:354 #6 0x7f6ce1ac17f5 in netlink_delinearize_setelem /home/pablo/devel/scm/git-netfilter/nftables/src/netlink.c:1080 #7 0x7f6ce1ac3c16 in list_setelem_cb /home/pablo/devel/scm/git-netfilter/nftables/src/netlink.c:1172 #8 0x7f6ce0198808 in nftnl_set_elem_foreach /home/pablo/devel/scm/git-netfilter/libnftnl/src/set_elem.c:725 Indirect leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7f6ce2189720 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9720) #1 0x7f6ce1b1778d in xrealloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:55 #2 0x7f6ce1b1756d in gmp_xrealloc /home/pablo/devel/scm/git-netfilter/nftables/src/gmputil.c:202 #3 0x7f6ce1417059 in __gmpz_realloc (/usr/lib/x86_64-linux-gnu/libgmp.so.10+0x23059) Indirect leak of 8 byte(s) in 1 object(s) allocated from: #0 0x7f6ce2189330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330) #1 0x7f6ce1b1767a in xmalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:36 #2 0x7f6ce14105c5 in __gmpz_init2 (/usr/lib/x86_64-linux-gnu/libgmp.so.10+0x1c5c5) SUMMARY: AddressSanitizer: 536 byte(s) leaked in 6 allocation(s). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: Fix get element command with prefixesPhil Sutter2020-05-041-0/+1
| | | | | | | | | | | | Code wasn't aware of prefix elements in interval sets. With previous changes in place, they merely need to be accepted in get_set_interval_find() - value comparison and expression duplication is identical to ranges. Extend sets/0034get_element_0 test to cover prefixes as well. While being at it, also cover concatenated ranges. Signed-off-by: Phil Sutter <phil@nwl.cc>
* segtree: Merge get_set_interval_find() and get_set_interval_end()Phil Sutter2020-05-041-47/+16
| | | | | | | | | | Both functions were very similar already. Under the assumption that they will always either see a range (or start of) that matches exactly or not at all, reduce complexity and make get_set_interval_find() accept NULL (left or) right values. This way it becomes a full replacement for get_set_interval_end(). Signed-off-by: Phil Sutter <phil@nwl.cc>
* segtree: Use expr_clone in get_set_interval_*()Phil Sutter2020-05-041-6/+2
| | | | | | | | | | | Both functions perform interval set lookups with either start and end or only start values as input. Interestingly, in practice they either see values which are not contained or which match an existing range exactly. Make use of the above and just return a clone of the matching entry instead of creating a new one based on input data. Signed-off-by: Phil Sutter <phil@nwl.cc>
* segtree: Fix missing expires value in prefixesPhil Sutter2020-05-041-1/+1
| | | | | | | | | | | This probable copy'n'paste bug prevented 'expiration' field from being populated when turning a range into a prefix in interval_map_decompose(). Consequently, interval sets with timeout did print expiry value for ranges (such as 10.0.0.1-10.0.0.5) but not prefixes (10.0.0.0/8, for instance). Fixes: bb0e6d8a2851b ("segtree: incorrect handling of comments and timeouts with mapping") Signed-off-by: Phil Sutter <phil@nwl.cc>
* segtree: broken error reporting with mappingsPablo Neira Ayuso2020-04-111-5/+20
| | | | | | | | | | | | | Segfault on error reporting when intervals overlap. ip saddr vmap { 10.0.1.0-10.0.1.255 : accept, 10.0.1.1-10.0.2.255 : drop } Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1415 Fixes: 4d6ad0f310d6 ("segtree: check for overlapping elements at insertion") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Add support for concatenated set rangesStefano Brivio2020-02-071-0/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After exporting field lengths via NFTNL_SET_DESC_CONCAT attributes, we now need to adjust parsing of user input and generation of netlink key data to complete support for concatenation of set ranges. Instead of using separate elements for start and end of a range, denoting the end element by the NFT_SET_ELEM_INTERVAL_END flag, as it's currently done for ranges without concatenation, we'll use the new attribute NFTNL_SET_ELEM_KEY_END as suggested by Pablo. It behaves in the same way as NFTNL_SET_ELEM_KEY, but it indicates that the included key represents the upper bound of a range. For example, "packets with an IPv4 address between 192.0.2.0 and 192.0.2.42, with destination port between 22 and 25", needs to be expressed as a single element with two keys: NFTA_SET_ELEM_KEY: 192.0.2.0 . 22 NFTA_SET_ELEM_KEY_END: 192.0.2.42 . 25 To achieve this, we need to: - adjust the lexer rules to allow multiton expressions as elements of a concatenation. As wildcards are not allowed (semantics would be ambiguous), exclude wildcards expressions from the set of possible multiton expressions, and allow them directly where needed. Concatenations now admit prefixes and ranges - generate, for each element in a range concatenation, a second key attribute, that includes the upper bound for the range - also expand prefixes and non-ranged values in the concatenation to ranges: given a set with interval and concatenation support, the kernel has no way to tell which elements are ranged, so they all need to be. For example, 192.0.2.0 . 192.0.2.9 : 1024 is sent as: NFTA_SET_ELEM_KEY: 192.0.2.0 . 1024 NFTA_SET_ELEM_KEY_END: 192.0.2.9 . 1024 - aggregate ranges when elements received by the kernel represent concatenated ranges, see concat_range_aggregate() - perform a few minor adjustments where interval expressions are already handled: we have intervals in these sets, but the set specification isn't just an interval, so we can't just aggregate and deaggregate interval ranges linearly v4: No changes v3: - rework to use a separate key for closing element of range instead of a separate element with EXPR_F_INTERVAL_END set (Pablo Neira Ayuso) v2: - reworked netlink_gen_concat_data(), moved loop body to a new function, netlink_gen_concat_data_expr() (Phil Sutter) - dropped repeated pattern in bison file, replaced by a new helper, compound_expr_alloc_or_add() (Phil Sutter) - added set_is_nonconcat_range() helper (Phil Sutter) - in expr_evaluate_set(), we need to set NFT_SET_SUBKEY also on empty sets where the set in the context already has the flag - dropped additional 'end' parameter from netlink_gen_data(), temporarily set EXPR_F_INTERVAL_END on expressions and use that from netlink_gen_concat_data() to figure out we need to add the 'end' element (Phil Sutter) - replace range_mask_len() by a simplified version, as we don't need to actually store the composing masks of a range (Phil Sutter) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: store expr, not dtype to track data in setsFlorian Westphal2019-12-161-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will be needed once we add support for the 'typeof' keyword to handle maps that could e.g. store 'ct helper' "type" values. Instead of: set foo { type ipv4_addr . mark; this would allow set foo { typeof(ip saddr) . typeof(ct mark); (exact syntax TBD). This would be needed to allow sets that store variable-sized data types (string, integer and the like) that can't be used at at the moment. Adding special data types for everything is problematic due to the large amount of different types needed. For anonymous sets, e.g. "string" can be used because the needed size can be inferred from the statement, e.g. 'osf name { "Windows", "Linux }', but in case of named sets that won't work because 'type string' lacks the context needed to derive the size information. With 'typeof(osf name)' the context is there, but at the moment it won't help because the expression is discarded instantly and only the data type is retained. Signed-off-by: Florian Westphal <fw@strlen.de>
* segtree: don't remove nul-root element from interval setPablo Neira Ayuso2019-12-091-7/+1
| | | | | | | | | | | | | Check from the delinearize set element path if the nul-root element already exists in the interval set. Hence, the element insertion path skips the implicit nul-root interval insertion. Under some circunstances, nft bogusly fails to delete the last element of the interval set and to create an element in an existing empty internal set. This patch includes a test that reproduces the issue. Fixes: 4935a0d561b5 ("segtree: special handling for the first non-matching segment") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: restore automergePablo Neira Ayuso2019-12-021-1/+1
| | | | | | | | Always close interval in non-anonymous sets unless the auto-merge feature is set on. Fixes: a4ec05381261 ("segtree: always close interval in non-anonymous sets") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: Fix get element for little endian rangesPhil Sutter2019-11-151-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes get element command for interval sets with host byte order data type, like e.g. mark. During serializing of the range (or element) to query, data was exported in wrong byteorder and consequently not found in kernel. The mystery part is that code seemed correct: When calling constant_expr_alloc() from set_elem_add(), the set key's byteorder was passed with correct value of BYTEORDER_HOST_ENDIAN. Comparison with delete/add element code paths though turned out that in those use-cases, constant_expr_alloc() is called with BYTEORDER_INVALID: - seg_tree_init() takes byteorder field value of first element in init->expressions (i.e., the elements requested on command line) and assigns that to tree->byteorder - tree->byteorder is passed to constant_expr_alloc() in set_insert_interval() - the elements' byteorder happens to be the default value This patch may not fix the right side, but at least it aligns get with add/delete element codes. Fixes: a43cc8d53096d ("src: support for get element command") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: Check ranges when deleting elementsPhil Sutter2019-11-141-11/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure any intervals to delete actually exist, otherwise reject the command. Without this, it is possible to mess up rbtree contents: | # nft list ruleset | table ip t { | set s { | type ipv4_addr | flags interval | auto-merge | elements = { 192.168.1.0-192.168.1.254, 192.168.1.255 } | } | } | # nft delete element t s '{ 192.168.1.0/24 }' | # nft list ruleset | table ip t { | set s { | type ipv4_addr | flags interval | auto-merge | elements = { 192.168.1.255-255.255.255.255 } | } | } Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: always close interval in non-anonymous setsPablo Neira Ayuso2019-10-091-1/+2
| | | | | | | Skip this optimization for non-anonymous sets, otherwise, element deletion breaks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use set_is_anonymous()Pablo Neira Ayuso2019-07-161-1/+1
| | | | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use UDATA defines from libnftnlPhil Sutter2019-05-031-1/+3
| | | | | | | | | | | | | Userdata attribute names have been added to libnftnl, use them instead of the local copy. While being at it, rename udata_get_comment() in netlink_delinearize.c and the callback it uses since the function is specific to rules. Also integrate the existence check for NFTNL_RULE_USERDATA into it along with the call to nftnl_rule_get_data(). Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: fix memleak in interval_map_decompose()Pablo Neira Ayuso2019-04-101-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not inconditionally hold reference to start interval. The handling depends on what kind of range expression we need to build, either no range at all, a prefix or a plain range. Depending on the case, we need to partially clone what we need from the expression to avoid use-after-free. This fixes valgrind reports that look like this, when listing rulesets: ==30018== 2,057,984 (1,028,992 direct, 1,028,992 indirect) bytes in 8,039 blocks are definitely lost in loss record 76 of 83 ==30018== at 0x4C2BBAF: malloc (vg_replace_malloc.c:299) ==30018== by 0x4E75978: xmalloc (utils.c:36) ==30018== by 0x4E75A5D: xzalloc (utils.c:65) ==30018== by 0x4E5CEC0: expr_alloc (expression.c:45) ==30018== by 0x4E5D610: mapping_expr_alloc (expression.c:985) ==30018== by 0x4E6A068: netlink_delinearize_setelem (netlink.c:810) ==30018== by 0x5B51320: nftnl_set_elem_foreach (set_elem.c:673) ==30018== by 0x4E6A2D5: netlink_list_setelems (netlink.c:864) ==30018== by 0x4E56C76: cache_init_objects (rule.c:166) ==30018== by 0x4E56C76: cache_init (rule.c:216) ==30018== by 0x4E56C76: cache_update (rule.c:243) ==30018== by 0x4E64530: cmd_evaluate_list (evaluate.c:3503) ==30018== by 0x4E64530: cmd_evaluate (evaluate.c:3880) ==30018== by 0x4E7D12F: nft_parse (parser_bison.y:798) ==30018== by 0x4E7AB56: nft_parse_bison_buffer (libnftables.c:349) ==30018== by 0x4E7AB56: nft_run_cmd_from_buffer (libnftables.c:394) Reported-by: Václav Zindulka <vaclav.zindulka@tlapnet.cz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: add missing non-matching segment to set in flat representationPablo Neira Ayuso2019-03-061-3/+6
| | | | | | | | | | | | | | | | | # cat test.nft add set x y { type ipv4_addr; } add element x y { 10.0.24.0/24 } # nft -f test.nft # nft delete element x y { 10.0.24.0/24 } bogusly returns -ENOENT. The non-matching segment (0.0.0.0 with end-flag set on) is not added to the set in the example above. This patch also adds a test to cover this case. Fixes: 4935a0d561b5 ("segtree: special handling for the first non-matching segment") Reported-by: Václav Zindulka <vaclav.zindulka@tlapnet.cz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: remove dummy debug_octxPablo Neira Ayuso2019-03-061-6/+2
| | | | | | | | Breaks custom-defined configuration in library mode, ie. user may want to store output in a file, instead of stderr. Fixes: 35f6cd327c2e ("src: Pass stateless, numeric, ip2name and handle variables as structure members.") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: fix crash when debug mode is activeFlorian Westphal2019-03-041-2/+5
| | | | | | | We must set output_fp to sensible filep, else crash. Reported-by: Václav Zindulka <vaclav.zindulka@tlapnet.cz> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: expr: add expression etypeFlorian Westphal2019-02-081-9/+9
| | | | | | | | Temporary kludge to remove all the expr->ops->type == ... patterns. Followup patch will remove expr->ops, and make expr_ops() lookup the correct expr_ops struct instead to reduce struct expr size. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: expr: add and use expr_name helperFlorian Westphal2019-02-081-1/+1
| | | | | | | | Currently callers use expr->ops->name, but follouwp patch will remove the ops pointer from struct expr. So add this helper and use it everywhere. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: set proper error cause on existing elementsPablo Neira Ayuso2018-10-101-0/+2
| | | | | | | | | | | | | | | | | | Adding new elements result in a confusing "Success" error message. # nft add element x y { 0-3 } [...] Error: Could not process rule: Success add element x y { 0-3 } ^^^^^^^^^^^^^^^^^^^^^^^^ after this patch, this reports: Error: Could not process rule: File exists add element x y { 0-3 } ^^^^^^^^^^^^^^^^^^^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: incorrect handling of last element in get_set_decompose()Pablo Neira Ayuso2018-10-101-1/+1
| | | | | | | Add range to the list of matching elements. Fixes: 95629758a5ec ("segtree: bogus range via get set element on existing elements") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: stop iteration on existing elements in case range is foundPablo Neira Ayuso2018-10-031-4/+6
| | | | | | No need to keep iterating once the range object has been allocated. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>