summaryrefslogtreecommitdiffstats
path: root/src/evaluate.c
Commit message (Collapse)AuthorAgeFilesLines
* evaluate: missing datatype definition in implicit_set_declaration()Pablo Neira Ayuso2020-06-071-10/+12
| | | | | | | | | | | | | | | | | | | | | | set->data from implicit_set_declaration(), otherwise, set_evaluation() bails out with: # nft -f /etc/nftables/inet-filter.nft /etc/nftables/inet-filter.nft:8:32-54: Error: map definition does not specify mapping data type tcp dport vmap { 22 : jump ssh_input } ^^^^^^^^^^^^^^^^^^^^^^^ /etc/nftables/inet-filter.nft:13:26-52: Error: map definition does not specify mapping data type iif vmap { "eth0" : jump wan_input } ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Add a test to cover this case. Fixes: 7aa08d45031e ("evaluate: Perform set evaluation on implicitly declared (anonymous) sets") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=208093 Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add devices to an existing flowtablePablo Neira Ayuso2020-06-021-11/+10
| | | | | | | | This patch allows you to add new devices to an existing flowtables. # nft add flowtable x y { devices = { eth0 } \; } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: Perform set evaluation on implicitly declared (anonymous) setsStefano Brivio2020-05-281-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a set is implicitly declared, set_evaluate() is not called as a result of cmd_evaluate_add(), because we're adding in fact something else (e.g. a rule). Expression-wise, evaluation still happens as the implicit set expression is eventually found in the tree and handled by expr_evaluate_set(), but context-wise evaluation (set_evaluate()) is skipped, and this might be relevant instead. This is visible in the reported case of an anonymous set including concatenated ranges: # nft add rule t c ip saddr . tcp dport { 192.0.2.1 . 20-30 } accept BUG: invalid range expression type concat nft: expression.c:1160: range_expr_value_low: Assertion `0' failed. Aborted because we reach do_add_set() without properly evaluated flags and set description, and eventually end up in expr_to_intervals(), which can't handle that expression. Explicitly call set_evaluate() as we add anonymous sets into the context, and instruct the same function to: - skip expression-wise set evaluation if the set is anonymous, as that happens later anyway as part of the general tree evaluation - skip the insertion in the set cache, as it makes no sense to have sets that shouldn't be referenced there For object maps, the allocation of the expression for set->data is already handled by set_evaluate(), so we can now drop that from stmt_evaluate_objref_map(). v2: - skip insertion of set in cache (Pablo Neira Ayuso) - drop double allocation of expression (and leak of the first one) for object maps (Pablo Neira Ayuso) Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: enable reject with 802.1qMichael Braun2020-05-281-1/+1
| | | | | | | | | | | | This enables the use nft bridge reject with bridge vlan filtering. It depends on a kernel patch to make the kernel preserve the vlan id in nft bridge reject generation. [ pablo: update tests/py ] Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: rename CMD_OBJ_SETELEM to CMD_OBJ_ELEMENTSPablo Neira Ayuso2020-05-141-3/+3
| | | | | | | | The CMD_OBJ_ELEMENTS provides an expression that contains the list of set elements. This leaves room to introduce CMD_OBJ_SETELEMS in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: fix error rule reporting with missing table/chain and anonymous setsPablo Neira Ayuso2020-05-141-0/+1
| | | | | | | | | | | | | | | | | handle_merge() skips handle location initialization because set name != NULL. Program received signal SIGSEGV, Segmentation fault. 0x00007ffff7f64f1e in erec_print (octx=0x55555555d2c0, erec=0x55555555fcf0, debug_mask=0) at erec.c:95 95 switch (indesc->type) { (gdb) bt buf=0x55555555db20 "add rule inet traffic-filter input tcp dport { 22, 80, 443 } accept") at libnftables.c:459 (gdb) p indesc $1 = (const struct input_descriptor *) 0x0 Closes: http://bugzilla.opensuse.org/show_bug.cgi?id=1171321 Fixes: 086ec6f30c96 ("mnl: extended error support for create command") Reported-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: fix memleak in stmt_evaluate_reject_icmp()Pablo Neira Ayuso2020-05-061-0/+2
| | | | | | | | | | | | | | | | | | ==26297==ERROR: LeakSanitizer: detected memory leaks c Direct leak of 512 byte(s) in 4 object(s) allocated from: #0 0x7f46f8167330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330) #1 0x7f46f7b3cf1c in xmalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:36 #2 0x7f46f7b3d075 in xzalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:65 #3 0x7f46f7a85760 in expr_alloc /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:45 #4 0x7f46f7a8915d in constant_expr_alloc /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:388 #5 0x7f46f7a7bad4 in symbolic_constant_parse /home/pablo/devel/scm/git-netfilter/nftables/src/datatype.c:173 #6 0x7f46f7a7af5f in symbol_parse /home/pablo/devel/scm/git-netfilter/nftables/src/datatype.c:132 #7 0x7f46f7abf2bd in stmt_evaluate_reject_icmp /home/pablo/devel/scm/git-netfilter/nftables./src/evaluate.c:2739 [...] SUMMARY: AddressSanitizer: 544 byte(s) leaked in 8 allocation(s). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: ct_timeout: release policy string and state listPablo Neira Ayuso2020-05-051-0/+1
| | | | | | | | | | | | | | | | | | | ================================================================= ==19037==ERROR: LeakSanitizer: detected memory leaks Direct leak of 18 byte(s) in 2 object(s) allocated from: #0 0x7ff6ee6f9810 in strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3a810) #1 0x7ff6ee22666d in xstrdup /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:75 #2 0x7ff6ee28cce9 in nft_parse /home/pablo/devel/scm/git-netfilter/nftables/src/parser_bison.c:5792 #3 0x4b903f302c8010a (<unknown module>) Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7ff6ee7a8330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330) #1 0x7ff6ee226578 in xmalloc /home/pablo/devel/scm/git-netfilter/nftables/src/utils.c:36 SUMMARY: AddressSanitizer: 34 byte(s) leaked in 3 allocation(s). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add rule_stmt_insert_at() and use itPablo Neira Ayuso2020-05-051-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | This helper function adds a statement at a given position and it updates the rule statement counter. This patch fixes this: flush table bridge test-bridge add rule bridge test-bridge input vlan id 1 ip saddr 10.0.0.1 rule.c:2870:5: runtime error: index 2 out of bounds for type 'stmt *[*]' ================================================================= ==1043==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffdd69c1350 at pc 0x7f1036f53330 bp 0x7ffdd69c1300 sp 0x7ffdd69c12f8 WRITE of size 8 at 0x7ffdd69c1350 thread T0 #0 0x7f1036f5332f in payload_try_merge /home/mbr/nftables/src/rule.c:2870 #1 0x7f1036f534b7 in rule_postprocess /home/mbr/nftables/src/rule.c:2885 #2 0x7f1036fb2785 in rule_evaluate /home/mbr/nftables/src/evaluate.c:3744 #3 0x7f1036fb627b in cmd_evaluate_add /home/mbr/nftables/src/evaluate.c:3982 #4 0x7f1036fbb9e9 in cmd_evaluate /home/mbr/nftables/src/evaluate.c:4462 #5 0x7f10370652d2 in nft_evaluate /home/mbr/nftables/src/libnftables.c:414 #6 0x7f1037065ba1 in nft_run_cmd_from_buffer /home/mbr/nftables/src/libnftables.c:447 Reported-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: incorrect byteorder with typeof and integer_datatypePablo Neira Ayuso2020-04-291-1/+2
| | | | | | | | | | | | | | | | | | | | table bridge t { set s3 { typeof meta ibrpvid elements = { 2, 3, 103 } } } # nft --debug=netlink -f test.nft s3 t 0 s3 t 0 element 00000100 : 0 [end] element 00000200 : 0 [end] element 00000300 : 0 [end] ^^^^^^^^ The integer_type uses BYTEORDER_INVALID byteorder (which is implicitly handled as BYTEORDER_BIG_ENDIAN). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: fix crash when handling concatenation without mapPablo Neira Ayuso2020-04-281-0/+3
| | | | | | | | Fix a crash when map is not specified, e.g. nft add rule x y snat ip addr . port to 1.1.1.1 . 22 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add STMT_NAT_F_CONCAT flag and use itPablo Neira Ayuso2020-04-281-1/+1
| | | | | | Replace ipportmap boolean field by flags. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: NAT support for intervals in mapsPablo Neira Ayuso2020-04-281-2/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to specify an interval of IP address in maps. table ip x { chain y { type nat hook postrouting priority srcnat; policy accept; snat ip interval to ip saddr map { 10.141.11.4 : 192.168.2.2-192.168.2.4 } } } The example above performs SNAT to packets that comes from 10.141.11.4 to an interval of IP addresses from 192.168.2.2 to 192.168.2.4 (both included). You can also combine this with dynamic maps: table ip x { map y { type ipv4_addr : interval ipv4_addr flags interval elements = { 10.141.10.0/24 : 192.168.2.2-192.168.2.4 } } chain y { type nat hook postrouting priority srcnat; policy accept; snat ip interval to ip saddr map @y } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Set NFT_SET_CONCAT flag for sets with concatenated rangesStefano Brivio2020-04-141-1/+8
| | | | | | | | | | | | | | | | | | | | | | | Pablo reports that nft, after commit 8ac2f3b2fca3 ("src: Add support for concatenated set ranges"), crashes with older kernels (< 5.6) without support for concatenated set ranges: those sets will be sent to the kernel, which adds them without notion of the fact that different concatenated fields are actually included, and nft crashes while trying to list this kind of malformed concatenation. Use the NFT_SET_CONCAT flag introduced by kernel commit ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") when sets including concatenated ranges are sent to the kernel, so that older kernels (with no knowledge of this flag itself) will refuse set creation. Note that, in expr_evaluate_set(), we have to check for the presence of the flag, also on empty sets that might carry it in context data, and actually set it in the actual set flags. Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: check for device in non-netdev chainsPablo Neira Ayuso2020-03-311-0/+3
| | | | | | | | | # nft -f /tmp/x /tmp/x:3:26-36: Error: This chain type cannot be bound to device type filter hook input device eth0 priority 0 ^^^^^^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: improve error reporting in netdev ingress chainPablo Neira Ayuso2020-03-311-2/+9
| | | | | | | | | | | | | | # nft -f /tmp/x.nft /tmp/x.nft:3:20-24: Error: The netdev family does not support this hook type filter hook input device eth0 priority 0 ^^^^^ # nft -f /tmp/x.nft /tmp/x.nft:3:3-49: Error: Missing `device' in this chain definition type filter hook ingress device eth0 priority 0 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: add hook_specPablo Neira Ayuso2020-03-311-9/+9
| | | | | | Store location of chain hook definition. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: display error if set statement is missingPablo Neira Ayuso2020-03-271-7/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # cat /tmp/x table x { set y { type ipv4_addr elements = { 1.1.1.1 counter packets 1 bytes 67, } } } # nft -f /tmp/x /tmp/x:5:12-18: Error: missing counter statement in set definition 1.1.1.1 counter packets 1 bytes 67, ^^^^^^^^^^^^^^^^^^^^^^^^^^ Instead, this should be: table x { set y { type ipv4_addr counter <------- elements = { 1.1.1.1 counter packets 1 bytes 67, } } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for counter in set definitionPablo Neira Ayuso2020-03-201-0/+9
| | | | | | | | | | | | | | | | | | | | | This patch allows you to turn on counter for each element in the set. table ip x { set y { typeof ip saddr counter elements = { 192.168.10.35, 192.168.10.101, 192.168.10.135 } } chain z { type filter hook output priority filter; policy accept; ip daddr @y } } This example shows how to turn on counters globally in the set 'y'. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: add range specified flag setting (missing ↵Pablo Neira Ayuso2020-03-191-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NF_NAT_RANGE_PROTO_SPECIFIED) Sergey reports: With nf_tables it is not possible to use port range for masquerading. Masquerade statement has option "to [:port-port]" which give no effect to translation behavior. But it must change source port of packet to one from ":port-port" range. My network: +-----------------------------+ | ROUTER | | | | Masquerade| | 10.0.0.1 1.1.1.1 | | +------+ +------+ | | | eth1 | | eth2 | | +-+--^---+-----------+---^--+-+ | | | | +----v------+ +------v----+ | | | | | 10.0.0.2 | | 1.1.1.2 | | | | | |PC1 | |PC2 | +-----------+ +-----------+ For testing i used rule like this: rule ip nat POSTROUTING oifname eth2 masquerade to :666 Run netcat for 1.1.1.2 667(UDP) and get dump from PC2: 15:22:25.591567 a8:f9:4b:aa:08:44 > a8:f9:4b:ac:e7:8f, ethertype IPv4 (0x0800), length 60: 1.1.1.1.34466 > 1.1.1.2.667: UDP, length 1 Address translation works fine, but source port are not belongs to specified range. I see in similar source code (i.e. nft_redir.c, nft_nat.c) that there is setting NF_NAT_RANGE_PROTO_SPECIFIED flag. After adding this, repeat test for kernel with this patch, and get dump: 16:16:22.324710 a8:f9:4b:aa:08:44 > a8:f9:4b:ac:e7:8f, ethertype IPv4 (0x0800), length 60: 1.1.1.1.666 > 1.1.1.2.667: UDP, length 1 Now it is works fine. Reported-by: Sergey Marinkevich <s@marinkevich.ru> Tested-by: Sergey Marinkevich <s@marinkevich.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix leaks.Jeremy Sowden2020-03-041-0/+2
| | | | | | | Some bitmask variables are not cleared. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: no need to swap byte-order for values of fewer than 16 bits.Jeremy Sowden2020-03-041-1/+1
| | | | | | | | | Endianness is not meaningful for objects smaller than 2 bytes and the byte-order conversions are no-ops in the kernel, so just update the expression as if it were constant. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: convert the byte-order of payload statement arguments.Jeremy Sowden2020-03-041-0/+5
| | | | | | | | | | | Since shift operations require host byte-order, we need to be able to convert the result of the shift back to network byte-order, in a rule like: nft add rule ip t c tcp dport set tcp dport lshift 1 Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: don't evaluate payloads twice.Jeremy Sowden2020-03-041-0/+5
| | | | | | | | | Payload munging means that evaluation of payload expressions may not be idempotent. Add a flag to prevent them from being evaluated more than once. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: simplify calculation of payload size.Jeremy Sowden2020-03-041-2/+2
| | | | | | | Use div_round_up and one statement. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: add separate variables for lshift and xor binops.Jeremy Sowden2020-03-041-17/+17
| | | | | | | | stmt_evaluate_payload has distinct variables for some, but not all, the binop expressions it creates. Add variables for the rest. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: stmt_evaluate_nat_map() only if stmt->nat.ipportmap == truePablo Neira Ayuso2020-02-251-17/+11
| | | | | | | stmt_evaluate_nat_map() is only called when the parser sets on stmt->nat.ipportmap. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: nat concatenation support with anonymous mapsPablo Neira Ayuso2020-02-241-3/+20
| | | | | | | | | This patch extends the parser to define the mapping datatypes, eg. ... dnat ip addr . port to ip saddr map { 1.1.1.1 : 2.2.2.2 . 30 } ... dnat ip addr . port to ip saddr map @y Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow nat maps containing both ip(6) address and portFlorian Westphal2020-02-241-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nft will now be able to handle map destinations { type ipv4_addr . inet_service : ipv4_addr . inet_service } chain f { dnat to ip daddr . tcp dport map @destinations } Something like this won't work though: meta l4proto tcp dnat ip6 to numgen inc mod 4 map { 0 : dead::f001 . 8080, .. as we lack the type info to properly dissect "dead::f001" as an ipv6 address. For the named map case, this info is available in the map definition, but for the anon case we'd need to resort to guesswork. Support is added by peeking into the map definition when evaluating a nat statement with a map. Right now, when a map is provided as address, we will only check that the mapped-to data type matches the expected size (of an ipv4 or ipv6 address). After this patch, if the mapped-to type is a concatenation, it will take a peek at the individual concat expressions. If its a combination of address and service, nft will translate this so that the kernel nat expression looks at the returned register that would store the inet_service part of the octet soup returned from the lookup expression. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: add two new helpersFlorian Westphal2020-02-241-29/+32
| | | | | | | | | | | In order to support 'dnat to ip saddr map @foo', where @foo returns both an address and a inet_service, we will need to peek into the map and process the concatenations sub-expressions. Add two helpers for this, will be used in followup patches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: process concat expressions when used as mapped-to exprFlorian Westphal2020-02-241-0/+4
| | | | | | | | | Needed to avoid triggering the 'dtype->size == 0' tests. Evaluation will build a new concatenated type that holds the size of the aggregate. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: print correct statement name on family mismatchFlorian Westphal2020-02-221-2/+3
| | | | | | | | | | nft add rule inet filter c ip daddr 1.2.3.4 dnat ip6 to f00::1 Error: conflicting protocols specified: ip vs. unknown. You must specify ip or ip6 family in tproxy statement Should be: ... "in nat statement". Fixes: fbe27464dee4588d90 ("src: add nat support for the inet family") Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: change shift byte-order to host-endian.Jeremy Sowden2020-02-071-1/+1
| | | | | | | | | | | The byte-order of the righthand operands of the right-shifts generated for payload and exthdr expressions is big-endian. However, all right operands should be host-endian. Since evaluation of the shift binop will insert a byte-order conversion to enforce this, change the endianness in order to avoid the extra operation. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: correct variable name.Jeremy Sowden2020-02-071-6/+6
| | | | | | | | Rename the `lshift` variable used to store an right-shift expression to `rshift`. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Add support for concatenated set rangesStefano Brivio2020-02-071-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After exporting field lengths via NFTNL_SET_DESC_CONCAT attributes, we now need to adjust parsing of user input and generation of netlink key data to complete support for concatenation of set ranges. Instead of using separate elements for start and end of a range, denoting the end element by the NFT_SET_ELEM_INTERVAL_END flag, as it's currently done for ranges without concatenation, we'll use the new attribute NFTNL_SET_ELEM_KEY_END as suggested by Pablo. It behaves in the same way as NFTNL_SET_ELEM_KEY, but it indicates that the included key represents the upper bound of a range. For example, "packets with an IPv4 address between 192.0.2.0 and 192.0.2.42, with destination port between 22 and 25", needs to be expressed as a single element with two keys: NFTA_SET_ELEM_KEY: 192.0.2.0 . 22 NFTA_SET_ELEM_KEY_END: 192.0.2.42 . 25 To achieve this, we need to: - adjust the lexer rules to allow multiton expressions as elements of a concatenation. As wildcards are not allowed (semantics would be ambiguous), exclude wildcards expressions from the set of possible multiton expressions, and allow them directly where needed. Concatenations now admit prefixes and ranges - generate, for each element in a range concatenation, a second key attribute, that includes the upper bound for the range - also expand prefixes and non-ranged values in the concatenation to ranges: given a set with interval and concatenation support, the kernel has no way to tell which elements are ranged, so they all need to be. For example, 192.0.2.0 . 192.0.2.9 : 1024 is sent as: NFTA_SET_ELEM_KEY: 192.0.2.0 . 1024 NFTA_SET_ELEM_KEY_END: 192.0.2.9 . 1024 - aggregate ranges when elements received by the kernel represent concatenated ranges, see concat_range_aggregate() - perform a few minor adjustments where interval expressions are already handled: we have intervals in these sets, but the set specification isn't just an interval, so we can't just aggregate and deaggregate interval ranges linearly v4: No changes v3: - rework to use a separate key for closing element of range instead of a separate element with EXPR_F_INTERVAL_END set (Pablo Neira Ayuso) v2: - reworked netlink_gen_concat_data(), moved loop body to a new function, netlink_gen_concat_data_expr() (Phil Sutter) - dropped repeated pattern in bison file, replaced by a new helper, compound_expr_alloc_or_add() (Phil Sutter) - added set_is_nonconcat_range() helper (Phil Sutter) - in expr_evaluate_set(), we need to set NFT_SET_SUBKEY also on empty sets where the set in the context already has the flag - dropped additional 'end' parameter from netlink_gen_data(), temporarily set EXPR_F_INTERVAL_END on expressions and use that from netlink_gen_concat_data() to figure out we need to add the 'end' element (Phil Sutter) - replace range_mask_len() by a simplified version, as we don't need to actually store the composing masks of a range (Phil Sutter) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Add support for NFTNL_SET_DESC_CONCATStefano Brivio2020-02-071-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | To support arbitrary range concatenations, the kernel needs to know how long each field in the concatenation is. The new libnftnl NFTNL_SET_DESC_CONCAT set attribute describes this as an array of lengths, in bytes, of concatenated fields. While evaluating concatenated expressions, export the datatype size into the new field_len array, and hand the data over via libnftnl. Similarly, when data is passed back from libnftnl, parse it into the set description. When set data is cloned, we now need to copy the additional fields in set_clone(), too. This change depends on the libnftnl patch with title: set: Add support for NFTA_SET_DESC_CONCAT attributes v4: No changes v3: Rework to use set description data instead of a stand-alone attribute v2: No changes Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: white-space fixes.Jeremy Sowden2020-01-281-6/+5
| | | | | | | Remove some trailing white-space and fix some indentation. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: better error notice when interval flag is not set onPablo Neira Ayuso2020-01-161-5/+2
| | | | | | | | | | | | | Users get confused with the existing error notice, let's try a different one: # nft add element x y { 1.1.1.0/24 } Error: You must add 'flags interval' to your set declaration if you want to add prefix elements add element x y { 1.1.1.0/24 } ^^^^^^^^^^ Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1380 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Phil Sutter <phil@nwl.cc>
* evaluate: fix expr_set_context call for shift binops.Jeremy Sowden2020-01-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | expr_evaluate_binop calls expr_set_context for shift expressions to set the context data-type to `integer`. This clobbers the byte-order of the context, resulting in unexpected conversions to NBO. For example: $ sudo nft flush ruleset $ sudo nft add table t $ sudo nft add chain t c '{ type filter hook output priority mangle; }' $ sudo nft add rule t c oif lo tcp dport ssh ct mark set '0x10 | 0xe' $ sudo nft add rule t c oif lo tcp dport ssh ct mark set '0xf << 1' $ sudo nft list table t table ip t { chain c { type filter hook output priority mangle; policy accept; oif "lo" tcp dport 22 ct mark set 0x0000001e oif "lo" tcp dport 22 ct mark set 0x1e000000 } } Replace it with a call to __expr_set_context and set the byteorder to that of the left operand since this is the value being shifted. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: print a hint about 'typeof' syntax on 0 keylenFlorian Westphal2019-12-171-5/+18
| | | | | | | | | | If user says 'type integer; ...' in a set definition, don't just throw an error -- provide a hint that the typeof keyword can be used to provide the needed size information. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: store expr, not dtype to track data in setsFlorian Westphal2019-12-161-18/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will be needed once we add support for the 'typeof' keyword to handle maps that could e.g. store 'ct helper' "type" values. Instead of: set foo { type ipv4_addr . mark; this would allow set foo { typeof(ip saddr) . typeof(ct mark); (exact syntax TBD). This would be needed to allow sets that store variable-sized data types (string, integer and the like) that can't be used at at the moment. Adding special data types for everything is problematic due to the large amount of different types needed. For anonymous sets, e.g. "string" can be used because the needed size can be inferred from the statement, e.g. 'osf name { "Windows", "Linux }', but in case of named sets that won't work because 'type string' lacks the context needed to derive the size information. With 'typeof(osf name)' the context is there, but at the moment it won't help because the expression is discarded instantly and only the data type is retained. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add ability to set/get secmarks to/from connectionChristian Göttsche2019-11-251-2/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Labeling established and related packets requires the secmark to be stored in the connection. Add the ability to store and retrieve secmarks like: ... chain input { ... # label new incoming packets ct state new meta secmark set tcp dport map @secmapping_in # add label to connection ct state new ct secmark set meta secmark # set label for est/rel packets from connection ct state established,related meta secmark set ct secmark ... } ... chain output { ... # label new outgoing packets ct state new meta secmark set tcp dport map @secmapping_out # add label to connection ct state new ct secmark set meta secmark # set label for est/rel packets from connection ct state established,related meta secmark set ct secmark ... } ... This patch also disallow constant value on the right hand side. # nft add rule x y meta secmark 12 Error: Cannot be used with right hand side constant value add rule x y meta secmark 12 ~~~~~~~~~~~~ ^^ # nft add rule x y ct secmark 12 Error: Cannot be used with right hand side constant value add rule x y ct secmark 12 ~~~~~~~~~~ ^^ # nft add rule x y ct secmark set 12 Error: ct secmark must not be set to constant value add rule x y ct secmark set 12 ^^^^^^^^^^^^^^^^^ This patch improves 3bc84e5c1fdd ("src: add support for setting secmark"). Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add and use `set_is_meter` helperJeremy Sowden2019-11-061-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sets constructed for meters are flagged as anonymous and dynamic. However, in some places there are only checks that they are dynamic, which can lead to normal sets being classified as meters. For example: # nft add table t # nft add set t s { type ipv4_addr; size 256; flags dynamic,timeout; } # nft add chain t c # nft add rule t c tcp dport 80 meter m size 128 { ip saddr limit rate 10/second } # nft list meters table ip t { set s { type ipv4_addr size 256 flags dynamic,timeout } meter m { type ipv4_addr size 128 flags dynamic } } # nft list meter t m table ip t { meter m { type ipv4_addr size 128 flags dynamic } } # nft list meter t s Error: No such file or directory list meter t s ^ Add a new helper `set_is_meter` and use it wherever there are checks for meters. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: flowtable: add support for named flowtable listingEric Jallot2019-10-311-0/+29
| | | | | | | | | | | | | | | | | | | | This patch allows you to dump a named flowtable. # nft list flowtable inet t f table inet t { flowtable f { hook ingress priority filter + 10 devices = { eth0, eth1 } } } Also: libnftables-json.adoc: fix missing quotes. Fixes: db0697ce7f60 ("src: support for flowtable listing") Fixes: 872f373dc50f ("doc: Add JSON schema documentation") Signed-off-by: Eric Jallot <ejallot@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add synproxy stateful object supportFernando Fernandez Mancera2019-09-131-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Add support for "synproxy" stateful object. For example (for TCP port 80 and using maps with saddr): table ip foo { synproxy https-synproxy { mss 1460 wscale 7 timestamp sack-perm } synproxy other-synproxy { mss 1460 wscale 5 } chain bar { tcp dport 80 synproxy name "https-synproxy" synproxy name ip saddr map { 192.168.1.0/24 : "https-synproxy", 192.168.2.0/24 : "other-synproxy" } } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: flag fwd and queue statements as terminalFlorian Westphal2019-09-071-0/+2
| | | | | | | | | | | | | | | | | | Both queue and fwd statement end evaluation of a rule: in ... fwd to "eth0" accept ... queue accept "accept" is redundant and never evaluated in the kernel. Add the missing "TERMINAL" flag so the evaluation step will catch any trailing expressions: nft add rule filter input queue counter Error: Statement after terminal statement has no effect Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: evaluate: catch invalid 'meta day' values in eval stepFlorian Westphal2019-09-061-4/+13
| | | | Signed-off-by: Florian Westphal <fw@strlen.de>
* meta: Introduce new conditions 'time', 'day' and 'hour'Ander Juaristi2019-09-061-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These keywords introduce new checks for a timestamp, an absolute date (which is converted to a timestamp), an hour in the day (which is converted to the number of seconds since midnight) and a day of week. When converting an ISO date (eg. 2019-06-06 17:00) to a timestamp, we need to substract it the GMT difference in seconds, that is, the value of the 'tm_gmtoff' field in the tm structure. This is because the kernel doesn't know about time zones. And hence the kernel manages different timestamps than those that are advertised in userspace when running, for instance, date +%s. The same conversion needs to be done when converting hours (e.g 17:00) to seconds since midnight as well. The result needs to be computed modulo 86400 in case GMT offset (difference in seconds from UTC) is negative. We also introduce a new command line option (-t, --seconds) to show the actual timestamps when printing the values, rather than the ISO dates, or the hour. Some usage examples: time < "2019-06-06 17:00" drop; time < "2019-06-06 17:20:20" drop; time < 12341234 drop; day "Saturday" drop; day 6 drop; hour >= 17:00 drop; hour >= "17:00:01" drop; hour >= 63000 drop; We need to convert an ISO date to a timestamp without taking into account the time zone offset, since comparison will be done in kernel space and there is no time zone information there. Overwriting TZ is portable, but will cause problems when parsing a ruleset that has 'time' and 'hour' rules. Parsing an 'hour' type must not do time zone conversion, but that will be automatically done if TZ has been overwritten to UTC. Hence, we use timegm() to parse the 'time' type, even though it's not portable. Overwriting TZ seems to be a much worse solution. Finally, be aware that timestamps are converted to nanoseconds when transferring to the kernel (as comparison is done with nanosecond precision), and back to seconds when retrieving them for printing. We swap left and right values in a range to properly handle cross-day hour ranges (e.g. 23:15-03:22). Signed-off-by: Ander Juaristi <a@juaristi.eus> Reviewed-by: Florian Westphal <fw@strlen.de>
* evaluate: New internal helper __expr_evaluate_rangeAnder Juaristi2019-09-061-4/+16
| | | | | | | | | | | | | This is used by the followup patch to evaluate a range without emitting an error when the left value is larger than the right one. This is done to handle time-matching such as 23:00-01:00 -- expr_evaluate_range() will reject this, but we want to be able to evaluate and then handle this as a request to match from 23:00 to 1am. Signed-off-by: Ander Juaristi <a@juaristi.eus> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: allow variable in chain policyFernando Fernandez Mancera2019-08-081-0/+24
| | | | | | | | | | | | This patch allows you to use variables in chain policy definition, e.g. define default_policy = "accept" add table ip foo add chain ip foo bar {type filter hook input priority filter; policy $default_policy} Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>