path: root/src/evaluate.c
Commit message | Author | Age | Files | Lines
* src: Don't parse string as verdict in map | Xiao Liang | 2022-08-19 | 1 | -1/+2

    In verdict map, string values are accidentally treated as verdicts.

    For example:

        table t {
            map foo {
                type ipv4_addr : verdict
                elements = {
                    192.168.0.1 : bar
                }
            }

            chain output {
                type filter hook output priority mangle;
                ip daddr vmap @foo
            }
        }

    Though "bar" is not a valid verdict (should be "jump bar" or something), the string is taken as the element value. Then NFTA_DATA_VALUE is sent to the kernel instead of NFTA_DATA_VERDICT. This would be rejected by recent kernels. On older ones (e.g. v5.4.x) that don't validate the type, a warning can be seen when the rule is hit, because of the corrupted verdict value:

        [5120263.467627] WARNING: CPU: 12 PID: 303303 at net/netfilter/nf_tables_core.c:229 nft_do_chain+0x394/0x500 [nf_tables]

    Indeed, we don't parse verdicts during evaluation, but only chain names, which are of type string rather than verdict. For example, "jump $var" is a verdict while "$var" is a string.

    Fixes: c64457cff967 ("src: Allow goto and jump to a variable")
    Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>

* evaluate: search stacked header list for matching payload dep | Florian Westphal | 2022-08-05 | 1 | -6/+15

    "ether saddr 0:1:2:3:4:6 vlan id 2" works, but reverse fails:

    "vlan id 2 ether saddr 0:1:2:3:4:6" will give

        Error: conflicting protocols specified: vlan vs. ether

    After "proto: track full stack of seen l2 protocols, not just cumulative offset", we have a list of all l2 headers, so search those to see if we had this proto base in the past before rejecting this.

    Reported-by: Eric Garver <eric@garver.life>
    Signed-off-by: Florian Westphal <fw@strlen.de>

* proto: track full stack of seen l2 protocols, not just cumulative offset | Florian Westphal | 2022-08-05 | 1 | -2/+13

    For input, a cumulative size counter of all pushed l2 headers is enough, because we have the full expression tree available to us.

    For delinearization we need to track all seen l2 headers, else we lose information that we might need at a later time.

    Consider:

        rule netdev nt nc set update ether saddr . vlan id

    during delinearization, the vlan proto_desc replaces the ethernet one, and by the time we try to split the concatenation apart we will search the ether saddr offset vs. the templates for proto_vlan.

    This replaces the offset with an array that stores the protocol descriptions seen.

    Then, if the payload offset is larger than our description, search the l2 stack and adjust the offset until we're within the expected offset boundary.

    Reported-by: Eric Garver <eric@garver.life>
    Signed-off-by: Florian Westphal <fw@strlen.de>

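The stack search described in this commit can be pictured with a small, self-contained sketch. It is not the nftables implementation: `struct l2_stack`, `l2_locate()` and the fixed array size are hypothetical names chosen for illustration, and header lengths are assumed to be stored in bits, outermost header first.

```c
#include <assert.h>
#include <stddef.h>

#define L2_STACK_MAX 4 /* hypothetical bound, chosen for the sketch */

/* Lengths (in bits) of the l2 headers seen so far, outermost first. */
struct l2_stack {
    unsigned int len[L2_STACK_MAX];
    unsigned int num;
};

/*
 * Walk the stack until the absolute offset falls inside one header.
 * Returns the index of that header and stores the offset relative to
 * its start in *rel, or returns -1 if the offset lies past all headers.
 */
static int l2_locate(const struct l2_stack *s, unsigned int offset,
                     unsigned int *rel)
{
    assert(s != NULL && rel != NULL && s->num <= L2_STACK_MAX);

    for (unsigned int i = 0; i < s->num; i++) {
        if (offset < s->len[i]) {
            *rel = offset;
            return (int)i;
        }
        offset -= s->len[i]; /* skip this header, try the next one */
    }
    return -1;
}
```

With an Ethernet header (112 bits) followed by a VLAN header (32 bits), an absolute offset of 48 bits resolves to the first header, while 116 bits resolves to bit 4 of the second one.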
* evaluate: report missing interval flag when using prefix/range in concatenation | Pablo Neira Ayuso | 2022-07-07 | 1 | -5/+20

    If the set declaration is missing the interval flag, and the user specifies an element with either a prefix or a range, then bail out.

    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1592
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: fix segfault when adding elements to invalid set | Peter Tirsek | 2022-06-27 | 1 | -0/+3

    Adding elements to a set or map with an invalid definition causes nft to segfault. The following nftables.conf triggers the crash:

        flush ruleset
        create table inet filter
        set inet filter foo {}
        add element inet filter foo { foobar }

    Simply parsing and checking the config will trigger it:

        $ nft -c -f nftables.conf.crash
        Segmentation fault

    The error in the set/map definition is correctly caught and queued, but because the set is invalid and does not contain a key type, adding to it causes a NULL pointer dereference of set->key within setelem_evaluate().

    I don't think it's necessary to queue another error since the underlying problem is correctly detected and reported when parsing the definition of the set. Simply checking the validity of set->key before using it seems to fix it, causing the error in the definition of the set to be reported properly. The element type error isn't caught, but that seems reasonable since the key type is invalid or unknown anyway:

        $ ./nft -c -f ~/nftables.conf.crash
        /home/pti/nftables.conf.crash:3:21-21: Error: set definition does not specify key
        set inet filter foo {}
                            ^

    [ Add tests to cover this case --pablo ]

    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1597
    Signed-off-by: Peter Tirsek <peter@tirsek.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: reset ctx->set after set interval evaluation | Pablo Neira Ayuso | 2022-06-01 | 1 | -4/+6

    Otherwise bogus error reports on set datatype mismatch might occur, such as:

        Error: datatype mismatch, expected Internet protocol, expression has type IPv4 address
        meta l4proto { tcp, udp } th dport 443 dnat to 10.0.0.1
        ~~~~~~~~~~~~ ^^^^^^^^^^^^

    with an unrelated set declaration.

        table ip test {
            set set_with_interval {
                type ipv4_addr
                flags interval
            }

            chain prerouting {
                type nat hook prerouting priority dstnat; policy accept;
                meta l4proto { tcp, udp } th dport 443 dnat to 10.0.0.1
            }
        }

    This bug has been introduced in the evaluation step.

    Reported-by: Roman Petrov <nwhisper@gmail.com>
    Fixes: 81e36530fcac ("src: replace interval segment tree overlap and automerge")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: fix always-true assertions | Florian Westphal | 2022-04-26 | 1 | -1/+1

    assert(1) is a no-op, this should be assert(0). Use BUG() instead. Add missing CATCHALL to avoid BUG().

    Signed-off-by: Florian Westphal <fw@strlen.de>

* src: allow use of base integer types as set keys in concatenations | Florian Westphal | 2022-04-18 | 1 | -7/+17

    "typeof ip saddr . ipsec in reqid" won't work because reqid uses integer type, i.e. dtype->size is 0.

    With "typeof", the size can be derived from the expression length, via set->key.

    This computes the concat length based either on dtype->size or expression length.

    It also updates concat evaluation to permit a zero datatype size if the subkey expression has nonzero length (i.e., typeof was used).

    Signed-off-by: Florian Westphal <fw@strlen.de>

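The length rule in this commit (prefer the datatype size, fall back to the expression length) can be sketched as follows. This is an illustration under stated assumptions, not the code from evaluate.c: `struct subkey` and `concat_len_bits()` are hypothetical names, sizes are in bits, and rounding each field up to a 32-bit boundary is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* One field of a concatenated key; both sizes are in bits. */
struct subkey {
    unsigned int dtype_size; /* 0 for the generic integer type */
    unsigned int expr_len;   /* known when "typeof" supplied an expression */
};

/*
 * Sum the field lengths of a concatenation. A field with a zero
 * datatype size is accepted only if its expression length is nonzero.
 * Returns 0 and stores the total in *total, or -1 if a field has no
 * usable length.
 */
static int concat_len_bits(const struct subkey *k, size_t n,
                           unsigned int *total)
{
    unsigned int sum = 0;

    assert(k != NULL && total != NULL);

    for (size_t i = 0; i < n; i++) {
        unsigned int bits = k[i].dtype_size ? k[i].dtype_size
                                            : k[i].expr_len;
        if (bits == 0)
            return -1; /* variable-sized type without typeof */

        sum += (bits + 31) / 32 * 32; /* assumed 32-bit padding */
    }
    *total = sum;
    return 0;
}
```

For "typeof ip saddr . ipsec in reqid", the first field would contribute its 32-bit datatype size and the second its 32-bit expression length.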
* intervals: support to partial deletion with automerge | Pablo Neira Ayuso | 2022-04-13 | 1 | -1/+2

    Splice the existing set element cache with the elements to be deleted and merge sort it. The elements to be deleted are identified by the EXPR_F_REMOVE flag.

    The set elements to be deleted are automerged first if the automerge flag is set.

    There are four possible deletion scenarios:

    - Exact match, e.g. delete [a-b] and there is a [a-b] range in the kernel set.
    - Adjust left side of range, e.g. delete [a-b] from range [a-x] where x > b.
    - Adjust right side of range, e.g. delete [a-b] from range [x-b] where x < a.
    - Split range, e.g. delete [a-b] from range [x-y] where x < a and b < y.

    Update nft_evaluate() to use the safe list variant since new commands are dynamically registered to the list to update ranges.

    This patch also restores the set element existence check for Linux kernels <= 5.7.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

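The four deletion scenarios listed in this commit can be told apart with two comparisons per bound. The sketch below is illustrative only: `enum del_kind`, `struct range` and `classify_deletion()` are hypothetical names, and ranges are modelled as closed 32-bit intervals rather than nft expressions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative enumerators, not nftables identifiers. */
enum del_kind {
    DEL_NO_MATCH,     /* [a-b] is not contained in the existing range */
    DEL_EXACT,        /* delete [a-b] from [a-b]: the range goes away */
    DEL_ADJUST_LEFT,  /* delete [a-b] from [a-x], x > b: left side moves */
    DEL_ADJUST_RIGHT, /* delete [a-b] from [x-b], x < a: right side moves */
    DEL_SPLIT,        /* delete [a-b] from [x-y], x < a, b < y: two ranges */
};

struct range {
    uint32_t lo, hi; /* closed interval [lo, hi] */
};

static enum del_kind classify_deletion(struct range existing,
                                       struct range del)
{
    assert(existing.lo <= existing.hi && del.lo <= del.hi);

    if (del.lo < existing.lo || del.hi > existing.hi)
        return DEL_NO_MATCH;
    if (del.lo == existing.lo && del.hi == existing.hi)
        return DEL_EXACT;
    if (del.lo == existing.lo)
        return DEL_ADJUST_LEFT;
    if (del.hi == existing.hi)
        return DEL_ADJUST_RIGHT;
    return DEL_SPLIT;
}
```

Deleting [10-20] from [10-30] is a left adjustment that leaves [21-30]; deleting [15-20] from [10-30] is a split that leaves [10-14] and [21-30].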
* evaluate: allow for zero length ranges | Pablo Neira Ayuso | 2022-04-13 | 1 | -1/+1

    Allow for ranges such as 30-30. This is required by the new intervals.c code, which normalizes constant and prefix set elements to ranges.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* intervals: add support to automerge with kernel elements | Pablo Neira Ayuso | 2022-04-13 | 1 | -3/+5

    Extend the interval codebase to support merging elements in the kernel with userspace element updates.

    Add a list of elements to be purged to cmd and set objects. These elements, representing outdated intervals, are deleted before adding the updated ranges.

    This routine splices the list of userspace and kernel elements, then it mergesorts to identify overlapping and contiguous ranges. This splice operation is undone so the set userspace cache remains consistent.

    Incrementally update the elements in the cache; this allows removing dd44081d91ce ("segtree: Fix add and delete of element in same batch").

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: replace interval segment tree overlap and automerge | Pablo Neira Ayuso | 2022-04-13 | 1 | -3/+67

    This is a rewrite of the segtree interval codebase.

    This patch now splits the original set_to_interval() function in three routines:

    - add set_automerge() to merge overlapping and contiguous ranges. The elements, expressed either as single value, prefix and ranges, are all first normalized to ranges. These elements expressed as ranges are mergesorted. Then, there is a linear list inspection to check for merge candidates. This code only merges elements in the same batch, i.e. it does not merge elements in the kernel and the userspace batch.

    - add set_overlap() to check for overlapping set elements. Linux kernel >= 5.7 already checks for overlaps, older kernels still need this code. This code checks for two conflict types:

      1) between elements in this batch.
      2) between elements in this batch and kernelspace.

      The elements in the kernel are temporarily merged into the list of elements in the batch to check for these overlaps. The EXPR_F_KERNEL flag allows us to restore the set cache after the overlap check has been performed.

    - set_to_interval() now only transforms set elements, expressed as range e.g. [a,b], to individual set elements using the EXPR_F_INTERVAL_END flag notation to represent e.g. [a,b+1), where b+1 has the EXPR_F_INTERVAL_END flag set on.

    More relevant updates:

    - The overlap and automerge routines are now performed in the evaluation phase.

    - The userspace set object representation now stores a reference to the existing kernel set object (in case there is already a set with this same name in the kernel). This is required by the new overlap and automerge approach.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

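Two of the steps described in this commit lend themselves to a compact sketch: the linear merge pass over sorted ranges, and the conversion of a closed range [a,b] into a start element plus an end element at b+1. The code below is a simplified model, not the intervals.c implementation. `struct range`, `struct elem`, `merge_sorted_ranges()` and `range_to_elems()` are hypothetical names, and the `interval_end` field merely stands in for the EXPR_F_INTERVAL_END flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {
    uint32_t lo, hi; /* closed interval [lo, hi] */
};

/* Stand-in for a set element; interval_end models EXPR_F_INTERVAL_END. */
struct elem {
    uint64_t value;
    int interval_end;
};

/*
 * Merge overlapping or contiguous ranges in place. The input must
 * already be sorted by lo. Returns the number of ranges that remain.
 */
static size_t merge_sorted_ranges(struct range *r, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        assert(r[i].lo <= r[i].hi);

        /* The UINT32_MAX test avoids overflow in "hi + 1". */
        if (out > 0 && (r[out - 1].hi == UINT32_MAX ||
                        r[i].lo <= r[out - 1].hi + 1)) {
            if (r[i].hi > r[out - 1].hi)
                r[out - 1].hi = r[i].hi;
        } else {
            r[out++] = r[i];
        }
    }
    return out;
}

/* Express [lo, hi] as the half-open pair [lo, hi + 1). */
static void range_to_elems(struct range r, struct elem out[2])
{
    out[0] = (struct elem){ .value = r.lo, .interval_end = 0 };
    out[1] = (struct elem){ .value = (uint64_t)r.hi + 1, .interval_end = 1 };
}
```

The end value is widened to 64 bits in this sketch so that a range ending at the maximum 32-bit value still has a representable b+1; how the real code treats that boundary is not stated in the commit message.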
* evaluate: string prefix expression must retain original length | Florian Westphal | 2022-04-13 | 1 | -1/+3

    To make something like "eth*" work for interval sets (match eth0, eth1, and so on...) we must treat the string as a 128 bit integer.

    Without this, segtree will do the wrong thing when applying the prefix, because we generate the prefix based on 'eth*' as input, with a length of 3.

    The correct import needs to be done on "eth\0\0\0\0\0\0\0...", i.e., if the input buffer were an ipv6 address, it should look like "eth\0::", not "::eth".

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

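The layout this commit asks for, with the literal characters at the start of a zero-padded 128-bit buffer, can be illustrated with a short sketch. It is not the nftables code: `wildcard_to_prefix()` and `IFNAME_BYTES` are hypothetical names, and the sketch only handles a single trailing `*`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IFNAME_BYTES 16 /* 128 bits, as in the commit message */

/*
 * Copy the literal part of a pattern such as "eth*" to the start of a
 * zero-padded 16-byte buffer ("eth\0\0\0...", not "...\0\0eth").
 * Returns the prefix length in bits, or -1 if the pattern does not end
 * in '*' or does not fit.
 */
static int wildcard_to_prefix(const char *pattern,
                              unsigned char buf[IFNAME_BYTES])
{
    size_t len;

    assert(pattern != NULL && buf != NULL);

    len = strlen(pattern);
    if (len == 0 || pattern[len - 1] != '*' || len - 1 >= IFNAME_BYTES)
        return -1;

    len--; /* drop the trailing '*' */
    memset(buf, 0, IFNAME_BYTES);
    memcpy(buf, pattern, len);
    return (int)(len * 8);
}
```

For "eth*" this yields a 24-bit prefix over a 128-bit value, so the range derived from it covers every name that starts with "eth".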
* evaluate: keep prefix expression length | Florian Westphal | 2022-04-13 | 1 | -0/+1

    Else, range_expr_value_high() will see a 0 length when doing:

        mpz_init_bitmask(tmp, expr->len - expr->prefix_len);

    This wasn't a problem so far because prefix expressions generated from "string*" were never passed down to the prefix->range conversion functions.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

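Why the length matters can be seen from a fixed-width model of the same computation. The sketch below uses a 32-bit value instead of the arbitrary-precision integers nft works with, and `prefix_high()` is a hypothetical helper, not an nftables function. The upper bound of a prefix is the value with all host bits set, and the number of host bits is `len - prefix_len`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound of the range covered by value/prefix_len, where len is
 * the total width of the expression in bits (at most 32 here).
 */
static uint32_t prefix_high(uint32_t value, unsigned int len,
                            unsigned int prefix_len)
{
    /* With len left at 0, len - prefix_len would wrap around. */
    assert(len <= 32 && prefix_len <= len);

    unsigned int host_bits = len - prefix_len;
    uint32_t mask = host_bits == 32 ? UINT32_MAX
                                    : (UINT32_C(1) << host_bits) - 1;

    return value | mask;
}
```

For 192.168.0.0/16 (0xC0A80000 with a 16-bit prefix over 32 bits) the result is 0xC0A8FFFF.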
* evaluate: make byteorder conversion on string base type a no-op | Florian Westphal | 2022-04-13 | 1 | -2/+11

    Prerequisite for support of interface names in interval sets:

        table inet filter {
            set s {
                type ifname
                flags interval
                elements = { "foo" }
            }

            chain input {
                type filter hook input priority filter; policy accept;
                iifname @s counter
            }
        }

    Will yield: "Byteorder mismatch: meta expected big endian, got host endian".

    This is because of:

        /* Data for range lookups needs to be in big endian order */
        if (right->set->flags & NFT_SET_INTERVAL &&
            byteorder_conversion(ctx, &rel->left, BYTEORDER_BIG_ENDIAN) < 0)

    It doesn't make sense to me to add checks to all callers of byteorder_conversion(), so treat this similar to EXPR_CONCAT and turn TYPE_STRING byteorder change into a no-op.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: allow to use integer type header fields via typeof set declaration | Pablo Neira Ayuso | 2022-03-29 | 1 | -1/+1

    Header fields such as udp length cannot be used in concatenations because they use the generic integer_type:

        test.nft:3:10-19: Error: can not use variable sized data types (integer) in concat expressions
        typeof udp length . @th,32,32
               ^^^^^^^^^^~~~~~~~~~~~~

    This patch slightly extends ("src: allow to use typeof of raw expressions in set declaration") to set NFTNL_UDATA_SET_KEY_PAYLOAD_LEN in userdata if TYPE_INTEGER is used.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: allow to use typeof of raw expressions in set declaration | Pablo Neira Ayuso | 2022-03-29 | 1 | -12/+63

    Use the dynamic datatype to allocate an instance of TYPE_INTEGER and set length and byteorder. Add missing information to the set userdata area for raw payload expressions, which allows rebuilding the set typeof from the listing path.

    A few examples:

    - With anonymous sets:

          nft add rule x y ip saddr . @ih,32,32 { 1.1.1.1 . 0x14, 2.2.2.2 . 0x1e }

    - With named sets:

          table x {
              set y {
                  typeof ip saddr . @ih,32,32
                  elements = { 1.1.1.1 . 0x14 }
              }
          }

    Incremental updates are also supported, e.g.

        nft add element x y { 3.3.3.3 . 0x28 }

    expr_evaluate_concat() is used to evaluate both set key definitions and set key values; using two different functions might help to simplify this code in the future.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: copy field_count for anonymous object maps as well | Florian Westphal | 2022-03-21 | 1 | -11/+17

    Without this, the test fails with:

        W: [FAILED] tests/shell/testcases/maps/anon_objmap_concat: got 134
        BUG: invalid range expression type concat
        nft: expression.c:1452: range_expr_value_low: Assertion `0' failed.

    Signed-off-by: Florian Westphal <fw@strlen.de>

* evaluate: init cmd pointer for new on-stack context | Florian Westphal | 2022-03-04 | 1 | -0/+1

    Else, this will segfault when trying to print the "table 'x' doesn't exist" error message.

    Signed-off-by: Florian Westphal <fw@strlen.de>

* src: add tcp option reset support | Florian Westphal | 2022-02-28 | 1 | -0/+7

    This allows replacing a tcp option with nops, similar to the TCPOPTSTRIP feature of iptables.

    Signed-off-by: Florian Westphal <fw@strlen.de>

* evaluate: attempt to set_eval flag if dynamic updates requested | Florian Westphal | 2022-01-11 | 1 | -0/+10

    When passing no upper size limit, the dynset expression forces an internal 64k upper limit.

    In some cases, this can result in 'nft -f' failing to restore the ruleset. Avoid this by always setting the EVAL flag on a set definition when we encounter a packet-path update attempt in the batch.

    Reported-by: Yi Chen <yiche@redhat.com>
    Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Florian Westphal <fw@strlen.de>

* evaluate: reject: support ethernet as L2 protocol for inet table | Jeremy Sowden | 2021-12-15 | 1 | -1/+6

    When we are evaluating a `reject` statement in the `inet` family, we may have `ether` and `ip` or `ip6` as the L2 and L3 protocols in the evaluation context:

        table inet filter {
            chain input {
                type filter hook input priority filter;
                ether saddr aa:bb:cc:dd:ee:ff ip daddr 192.168.0.1 reject
            }
        }

    Since no `reject` option is given, nft attempts to infer one and fails:

        BUG: unsupported familynft: evaluate.c:2766:stmt_evaluate_reject_inet_family: Assertion `0' failed.
        Aborted

    The reason it fails is that the ethernet protocol numbers for IPv4 and IPv6 (`ETH_P_IP` and `ETH_P_IPV6`) do not match `NFPROTO_IPV4` and `NFPROTO_IPV6`. Add support for the ethernet protocol numbers.

    Replace the current `BUG("unsupported family")` error message with something more informative that tells the user to provide an explicit reject option.

    Add a Python test case.

    Fixes: 5fdd0b6a0600 ("nft: complete reject support")
    Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1001360
    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

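The mismatch between the two numbering schemes is easy to see once the values are written side by side. The sketch below is not the code added to evaluate.c: `l3_family_from_proto()` is a hypothetical helper, and the constants are restated locally with an `SK_` prefix (values as found in the Linux UAPI headers) so that the example builds without kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Restated locally for the sketch; values from the Linux UAPI headers. */
#define SK_ETH_P_IP     0x0800 /* EtherType for IPv4 */
#define SK_ETH_P_IPV6   0x86DD /* EtherType for IPv6 */
#define SK_NFPROTO_IPV4 2
#define SK_NFPROTO_IPV6 10

/*
 * Accept either numbering scheme and return the netfilter family,
 * or -1 when the protocol is neither IPv4 nor IPv6.
 */
static int l3_family_from_proto(uint16_t proto)
{
    switch (proto) {
    case SK_NFPROTO_IPV4:
    case SK_ETH_P_IP:
        return SK_NFPROTO_IPV4;
    case SK_NFPROTO_IPV6:
    case SK_ETH_P_IPV6:
        return SK_NFPROTO_IPV6;
    default:
        return -1; /* caller should ask for an explicit reject option */
    }
}
```

A switch that lists only the NFPROTO values falls through to the default case when the context carries an EtherType, which is the situation the commit message describes.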
* evaluate: correct typo's | Jeremy Sowden | 2021-12-15 | 1 | -2/+2

    There are a couple of mistakes in comments. Fix them.

    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: Fix payload statement mask on Big Endian | Phil Sutter | 2021-11-30 | 1 | -2/+2

    The mask used to select bits to keep must be exported in the same byteorder as the payload statement itself. Also, the length of the exported data must match the number of bytes extracted earlier.

    Signed-off-by: Phil Sutter <phil@nwl.cc>

* evaluate: grab reference in set expression evaluation | Pablo Neira Ayuso | 2021-11-08 | 1 | -2/+2

    Do not clone the expression when evaluating a set expression; grabbing the reference counter to reuse the object is sufficient.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: clone variable expression if there is more than one reference | Pablo Neira Ayuso | 2021-11-08 | 1 | -1/+10

    Clone the expression that defines the variable value if there are multiple references to it in the ruleset. This saves heap memory consumption in case the variable defines a set with a huge number of elements.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: raw payload match and mangle on inner header / payload data | Pablo Neira Ayuso | 2021-11-08 | 1 | -0/+3

    This patch adds support to match on inner header / payload data:

        # nft add rule x y @ih,32,32 0x14000000 counter

    You can also mangle payload data:

        # nft add rule x y @ih,32,32 set 0x14000000 counter

    This update triggers a checksum update at the layer 4 header via csum_flags; mangling odd bytes is also aligned to 16 bits.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: postpone transport protocol match check after nat expression evaluation | Pablo Neira Ayuso | 2021-11-03 | 1 | -6/+7

    Fix bogus error report when using transport protocol as map key.

    Fixes: 50780456a01a ("evaluate: check for missing transport protocol match in nat map with concatenations")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: Support netdev egress hook | Lukas Wunner | 2021-10-28 | 1 | -0/+2

    Add userspace support for the netdev egress hook which is queued up for v5.16-rc1, complete with documentation and tests. Usage is identical to the ingress hook.

    Signed-off-by: Lukas Wunner <lukas@wunner.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: check for missing transport protocol match in nat map with concatenations | Pablo Neira Ayuso | 2021-09-29 | 1 | -0/+12

    Restore this error with NAT maps:

        # nft add rule 'ip ipfoo c dnat to ip daddr map @y'
        Error: transport protocol mapping is only valid after transport protocol match
        add rule ip ipfoo c dnat to ip daddr map @y
                            ~~~~    ^^^^^^^^^^^^^^^

    Allow for transport protocol match in the map too, which is implicitly pulling in a transport protocol dependency.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: check for concatenation in set data datatype | Pablo Neira Ayuso | 2021-09-29 | 1 | -1/+2

    When adding this rule with an existing map:

        add rule nat x y meta l4proto { tcp, udp } dnat ip to ip daddr . th dport map @fwdtoip_th

    nft reports a bogus error:

        Error: datatype mismatch: expected IPv4 address, expression has type concatenation of (IPv4 address, internet network service)

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* payload: don't adjust offsets of autogenerated dependency expressions | Florian Westphal | 2021-09-29 | 1 | -0/+1

    Pablo says:

    user reports that this is broken:

        nft --debug=netlink add rule bridge filter forward vlan id 100 vlan id set 200
        [..]
        [ payload load 2b @ link header + 14 => reg 1 ]
        [..]
        [ payload load 2b @ link header + 28 => reg 1 ]
        [ bitwise reg 1 = ( reg 1 & 0x000000f0 ) ^ 0x0000c800 ]
        [ payload write reg 1 => 2b @ link header + 14 csum_type 0 csum_off 0 csum_flags 0x0 ]

    offset says 28, it is assuming q-in-q, in this case it is mangling the existing header.

    The problem here is that 'vlan id set 200' needs a read-modify-write cycle because 'vlan id set' has to preserve bits located in the same byte area as the vlan id.

    The first 'payload load' at offset 14 is generated via 'vlan id 100', this part is ok.

    The second 'payload load' at offset 28 is the bogus one.

    It's added as a dependency, but then adjusted because nft evaluation considers this identical to 'vlan id 1 vlan id 2', where nft assumes q-in-q.

    To fix this, skip offset adjustments for raw expressions and mark the dependency-generated payload instruction as such.

    This is fine because raw payload operations assume that the user specifies base/offset/length manually.

    Also add a test case for this.

    Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Florian Westphal <fw@strlen.de>

* evaluate: expand variable containing set into multiple mappings | Pablo Neira Ayuso | 2021-08-12 | 1 | -0/+17

        # cat x.nft
        define interfaces = { eth0, eth1 }

        table ip x {
            chain y {
                type filter hook input priority 0; policy accept;
                iifname vmap { lo : accept, $interfaces : drop }
            }
        }
        # nft -f x.nft
        # nft list ruleset
        table ip x {
            chain y {
                type filter hook input priority 0; policy accept;
                iifname vmap { "lo" : accept, "eth0" : drop, "eth1" : drop }
            }
        }

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: disallow negation with binary operation | Pablo Neira Ayuso | 2021-07-27 | 1 | -6/+10

    The negation was introduced to provide a simple shortcut. Extend e6c32b2fa0b8 ("src: add negation match on singleton bitmask value") to disallow negation with binary operations too.

        # nft add rule meh tcp_flags 'tcp flags & (fin | syn | rst | ack) ! syn'
        Error: cannot combine negation with binary expression
        add rule meh tcp_flags tcp flags & (fin | syn | rst | ack) ! syn
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^   ~~~

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: error reporting for missing statements in set/map declaration | Pablo Neira Ayuso | 2021-07-26 | 1 | -3/+5

    Assuming this map:

        map y {
            type ipv4_addr : verdict
        }

    This patch slightly improves error reporting to refer to the missing 'counter' statement in the map declaration.

        # nft 'add element x y { 1.2.3.4 counter packets 1 bytes 1 : accept, * counter : drop }'
        Error: missing statement in map declaration
        add element x y { 1.2.3.4 counter packets 10 bytes 640 : accept, * counter : drop }
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: fix inet nat with no layer 3 info | Pablo Neira Ayuso | 2021-07-20 | 1 | -2/+3

    nft currently reports:

        Error: Could not process rule: Protocol error
        add rule inet x y meta l4proto tcp dnat to :80
                                           ^^^^

    Default to the NFPROTO_INET family, otherwise the kernel bails out with EPROTO when trying to load the conntrack helper.

    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1428
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: support for nat with interval concatenation | Pablo Neira Ayuso | 2021-07-13 | 1 | -5/+27

    This patch allows you to combine concatenation and interval in NAT mappings, e.g.

        add rule x y dnat to ip saddr . tcp dport map { 192.168.1.2 . 80 : 10.141.10.2-10.141.10.5 . 8888-8999 }

    This generates the following NAT expression:

        [ nat dnat ip addr_min reg 1 addr_max reg 10 proto_min reg 9 proto_max reg 11 ]

    which expects to obtain the following tuple:

        IP address (min), source port (min), IP address (max), source port (max)

    to be obtained from the map. This representation simplifies the delinearize path, since the datatype is specified as:

        ipv4_addr . inet_service.

    A few more notes on this update:

    - alloc_nftnl_setelem() needs a variant netlink_gen_data() to deal with the representation of the range on the rhs of the mapping. In contrast to interval concatenation in the key side, where the range is expressed as two netlink attributes, the data side of the set element mapping stores the interval concatenation in a contiguous memory area, see __netlink_gen_concat_expand() for reference.

    - add range_expr_postprocess() to postprocess the data mapping range. If either one single IP address or port is used, then the minimum and maximum value in the range is the same value, e.g. to avoid listing 80-80, this round simplifies the range. This also invokes the range to prefix conversion routine.

    - add concat_elem_expr() helper function to consolidate code to build the concatenation expression on the rhs element data side.

    This patch also adds tests/py and tests/shell.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: infer NAT mapping with concatenation from set | Pablo Neira Ayuso | 2021-07-13 | 1 | -1/+40

    If the map is anonymous, infer it from the set elements. Otherwise, the set definition already has an explicit concatenation definition in the data side of the mapping.

    This update simplifies the NAT mapping syntax with concatenations, e.g.

        snat ip to ip saddr map { 10.141.11.4 : 192.168.2.3 . 80 }

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: remove STMT_NAT_F_INTERVAL flags and interval keyword | Pablo Neira Ayuso | 2021-07-13 | 1 | -20/+0

    STMT_NAT_F_INTERVAL is not useful, the keyword interval can be removed to simplify the syntax, e.g.

        snat to ip saddr map { 10.141.11.4 : 192.168.2.2-192.168.2.4 }

    This patch reworks 9599d9d25a6b ("src: NAT support for intervals in maps").

    Do not remove STMT_NAT_F_INTERVAL yet since this flag is needed for interval concatenations coming in a follow up patch.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: fix maps with key and data concatenations | Pablo Neira Ayuso | 2021-06-23 | 1 | -6/+44

    expr_evaluate_concat() is overloaded, it deals with two cases:

    #1 set key and data definitions, this case uses the special dynamically created concatenation datatype which is taken from the context.
    #2 set elements, this case iterates over the set key and data expressions that are components of the concatenation tuple, to fetch the corresponding datatype.

    Add a new function to deal with case #1 specifically.

    This patch implicitly fixes maps that include arbitrary concatenations. These currently fail with a spurious error report such as:

        # cat bug.nft
        table x {
            map test {
                type ipv4_addr . inet_proto . inet_service : ipv4_addr . inet_service
            }
        }

        # nft -f bug.nft
        bug.nft:3:48-71: Error: datatype mismatch, expected concatenation of (IPv4 address, Internet protocol, internet network service), expression has type concatenation of (IPv4 address, internet network service)
        type ipv4_addr . inet_proto . inet_service : ipv4_addr . inet_service
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: queue: allow use of arbitrary queue expressions | Florian Westphal | 2021-06-21 | 1 | -6/+7

    Back in 2016 Liping Zhang added support to kernel and libnftnl to specify a source register containing the queue number to use.

    This was never added to nft itself, so allow this.

    On the linearization side, check if the attached expression is a range. If it's not, allocate a new register and set the NFTNL_EXPR_QUEUE_SREG_QNUM attribute after generating the lowlevel expressions for the kernel.

    On delinearization we need to check for presence of NFTNL_EXPR_QUEUE_SREG_QNUM and decode the expression(s) when present.

    We also need to do postprocessing for STMT_QUEUE so that the protocol context is set correctly; without this, only raw payload expressions will be shown (@nh,32,...) instead of 'ip ...'.

    Next patch adds test cases.

    Signed-off-by: Florian Westphal <fw@strlen.de>

* evaluate: fix hash expression maxval | Florian Westphal | 2021-06-18 | 1 | -2/+6

    It needs to account for the offset too.

    Fixes: 9bee0c86f179 ("src: add offset attribute for hash expression")
    Fixes: d4f9a8fb9e9a ("src: add offset attribute for numgen expression")
    Signed-off-by: Florian Westphal <fw@strlen.de>

* evaluate: memleak in binary operation transfer to RHS | Pablo Neira Ayuso | 2021-06-18 | 1 | -2/+0

    Remove useless reference count grabbing on constant expression that results in a memleak.

        Direct leak of 136 byte(s) in 1 object(s) allocated from:
            #0 0x7f4cd54af330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
            #1 0x7f4cd4d9e489 in xmalloc /home/.../devel/nftables/src/utils.c:36
            #2 0x7f4cd4d9e648 in xzalloc /home/.../devel/nftables/src/utils.c:75
            #3 0x7f4cd4caf8c6 in expr_alloc /home/.../devel/nftables/src/expression.c:45
            #4 0x7f4cd4cb36e9 in constant_expr_alloc /home/.../devel/nftables/src/expression.c:419
            #5 0x7f4cd4ca714c in integer_type_parse /home/.../devel/nftables/src/datatype.c:397
            #6 0x7f4cd4ca4bee in symbolic_constant_parse /home/.../devel/nftables/src/datatype.c:165
            #7 0x7f4cd4ca4572 in symbol_parse /home/.../devel/nftables/src/datatype.c:135
            #8 0x7f4cd4cc333f in expr_evaluate_symbol /home/.../devel/nftables/src/evaluate.c:251
            [...]

        Indirect leak of 8 byte(s) in 1 object(s) allocated from:
            #0 0x7f4cd54af330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
            #1 0x7f4cd4d9e489 in xmalloc /home/.../devel/nftables/src/utils.c:36
            #2 0x7f4cd46185c5 in __gmpz_init2 (/usr/lib/x86_64-linux-gnu/libgmp.so.10+0x1c5c5)

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: unbreak verdict maps with implicit map with interval concatenations | Pablo Neira Ayuso | 2021-06-18 | 1 | -0/+8

    Verdict maps in combination with interval concatenations are broken, e.g.

        # nft add rule x y tcp dport . ip saddr vmap { 1025-65535 . 192.168.10.2 : accept }

    Retrieve the concatenation field length and count from the map->map expression that represents the key of the implicit map.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: do not skip mapping elements | Pablo Neira Ayuso | 2021-06-18 | 1 | -7/+19

    Set element keys are of EXPR_SET_ELEM expression type, however, mappings use the EXPR_MAPPING expression to wrap the EXPR_SET_ELEM key (mapping->left) and the corresponding data (mapping->right).

    This patch adds a wrapper function to fetch the EXPR_SET_ELEM expression from the key in case of mappings and use it.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: replace opencoded NFT_SET_ANONYMOUS set flag check by set_is_anonymous() | Pablo Neira Ayuso | 2021-06-14 | 1 | -1/+1

    Use set_is_anonymous() to check for the NFT_SET_ANONYMOUS set flag instead.

    Reported-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: add set to cache once | Pablo Neira Ayuso | 2021-06-14 | 1 | -3/+0

    67d3969a7244 ("evaluate: add set to the cache") re-adds the set into the cache again.

    This bug was hidden behind 5ec5c706d993 ("cache: add hashtable cache for table") which broke set_evaluate() for anonymous sets.

    Phil reported a gcc compilation warning which uncovered this problem.

    Reported-by: Phil Sutter <phil@nwl.cc>
    Fixes: 67d3969a7244 ("evaluate: add set to the cache")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* evaluate: Mark fall through case in str2hooknum() | Phil Sutter | 2021-06-14 | 1 | -0/+1

    It is certainly intentional, so just mark it as such.

    Fixes: b4775dec9f80b ("src: ingress inet support")
    Signed-off-by: Phil Sutter <phil@nwl.cc>

* evaluate: restore interval + concatenation in anonymous set | Pablo Neira Ayuso | 2021-06-11 | 1 | -8/+9

    Perform the table and set lookup only for non-anonymous sets, where the incremental cache update is required.

    The problem fixed by 7aa08d45031e ("evaluate: Perform set evaluation on implicitly declared (anonymous) sets") resurrected after the cache rework.

        # nft add rule x y tcp sport . tcp dport vmap { ssh . 0-65535 : accept, 0-65535 . ssh : accept }
        BUG: invalid range expression type concat
        nft: expression.c:1422: range_expr_value_low: Assertion `0' failed.
        Abort

    Add a test case to make sure this does not happen again.

    Fixes: 5ec5c706d993 ("cache: add hashtable cache for table")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

* src: add support for base hook dumping | Florian Westphal | 2021-06-09 | 1 | -0/+10

    Example output:

        $ nft list hook ip input
        family ip hook input {
            +0000000000 nft_do_chain_inet [nf_tables] # nft table ip filter chain input
            +0000000010 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain filter_INPUT
            +0000000100 nf_nat_ipv4_local_in [nf_nat]
            +2147483647 ipv4_confirm [nf_conntrack]
        }

        $ nft list hooks netdev type ingress device lo
        family netdev hook ingress device lo {
            +0000000000 nft_do_chain_netdev [nf_tables]
        }

        $ nft list hooks inet
        family ip hook prerouting {
            -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4]
            -0000000300 iptable_raw_hook [iptable_raw]
            -0000000290 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain raw_PREROUTING
            -0000000200 ipv4_conntrack_in [nf_conntrack]
            -0000000140 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain mangle_PREROUTING
            -0000000100 nf_nat_ipv4_pre_routing [nf_nat]
        }
        ...

    'nft list hooks' will display everything except the netdev family via successive dump requests for all family:hook combinations.

    Signed-off-by: Florian Westphal <fw@strlen.de>