path: root/src/payload.c
Commit message | Author | Age | Files | Lines
* payload: only assert if l2 header base has no length | Florian Westphal | 2024-01-12 | 1 | -2/+1

  nftables will assert in some cases because the sanity check is done even for network and transport header bases. However, stacked headers are only supported for the link layer.

  Move the assertion around and add a test case for this.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: reset statement length context before evaluating statement | Pablo Neira Ayuso | 2023-12-08 | 1 | -22/+7

  This patch consolidates ctx->stmt_len reset in stmt_evaluate() to avoid this problem. Note that stmt_evaluate_meta() and stmt_evaluate_ct() already reset it after the statement evaluation.

  Moreover, statement dependency can be generated while evaluating a meta and ct statement. Payload statement dependency already manually stashes this before calling stmt_evaluate(). Add a new stmt_dependency_evaluate() function to stash statement length context when evaluating a new statement dependency and use it for all of the existing statement dependencies.

  Florian also says:

    'meta mark set vlan id map { 1 : 0x00000001, 4095 : 0x00004095 }' will crash. Reason is that the l2 dependency generated here is erroneously expanded to a 32bit-one, so the evaluation path won't recognize this as a L2 dependency. Therefore, pctx->stacked_ll_count is 0 and __expr_evaluate_payload() crashes with a null deref when dereferencing pctx->stacked_ll[0].

    nft-test.py gains a fugly hack to tolerate '!map typeof vlan id : meta mark'. For more generic support we should find something more acceptable, e.g.

      !map typeof( everything here is a key or data ) timeout ...

  tests/py update and assert(pctx->stacked_ll_count) by Florian Westphal.

  Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* icmpv6: Allow matching target address in NS/NA, redirect and MLD | Nicolas Cavallari | 2023-10-06 | 1 | -2/+80

  It was not possible to match the target address of a neighbor solicitation or neighbor advertisement against a dynamic set, unlike in IPv4.

  Since there are many ICMPv6 messages with an address at the same offset, allow filtering on the target address for all icmp types that have one.

  While at it, also allow matching the destination address of an ICMPv6 redirect.

  Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* include: include <string.h> in <nft.h> | Thomas Haller | 2023-09-28 | 1 | -1/+0

  <string.h> provides strcmp(), as such it's very basic and used everywhere.

  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: use enum icmp_hdr_field_type in payload_may_dependency_kill_icmp() | Thomas Haller | 2023-09-20 | 1 | -6/+4

  Don't mix icmp_dep (enum icmp_hdr_field_type) and the uint8_t icmp_type.

  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: fix leak and cleanup reference counting for struct datatype | Thomas Haller | 2023-09-14 | 1 | -2/+1

  Test `./tests/shell/run-tests.sh -V tests/shell/testcases/maps/nat_addr_port` fails:

    ==118== 195 (112 direct, 83 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
    ==118==    at 0x484682C: calloc (vg_replace_malloc.c:1554)
    ==118==    by 0x48A39DD: xmalloc (utils.c:37)
    ==118==    by 0x48A39DD: xzalloc (utils.c:76)
    ==118==    by 0x487BDFD: datatype_alloc (datatype.c:1205)
    ==118==    by 0x487BDFD: concat_type_alloc (datatype.c:1288)
    ==118==    by 0x488229D: stmt_evaluate_nat_map (evaluate.c:3786)
    ==118==    by 0x488229D: stmt_evaluate_nat (evaluate.c:3892)
    ==118==    by 0x488229D: stmt_evaluate (evaluate.c:4450)
    ==118==    by 0x488328E: rule_evaluate (evaluate.c:4956)
    ==118==    by 0x48ADC71: nft_evaluate (libnftables.c:552)
    ==118==    by 0x48AEC29: nft_run_cmd_from_buffer (libnftables.c:595)
    ==118==    by 0x402983: main (main.c:534)

  I think the reference handling for datatype is wrong. It was introduced by commit 01a13882bb59 ('src: add reference counter for dynamic datatypes').

  We don't notice it most of the time, because instances are statically allocated, where datatype_get()/datatype_free() is a NOP.

  Fix and rework.

  - Commit 01a13882bb59 comments "The reference counter of any newly allocated datatype is set to zero". That seems not workable. Previously, functions like datatype_clone() would have returned the refcnt set to zero. Some callers would then set the refcnt to one, but some wouldn't (set_datatype_alloc()). Calling datatype_free() with a refcnt of zero will overflow to UINT_MAX and leak:

        if (--dtype->refcnt > 0)
                return;

    While there could be schemes with such asymmetric counting that juggle the appropriate number of datatype_get() and datatype_free() calls, this is confusing and error prone. The common pattern is that every alloc/clone/get/ref is paired with exactly one unref/free.

    Let datatype_clone() return references with refcnt set to 1 and in general always be clear about where we transfer ownership (take a reference) and where we need to release it.

  - set_datatype_alloc() needs to consistently return ownership to the reference. Previously, some code paths would and others wouldn't.

  - Replace

        datatype_set(key, set_datatype_alloc(dtype, key->byteorder))

    with a __datatype_set() which takes ownership.

  Fixes: 01a13882bb59 ('src: add reference counter for dynamic datatypes')
  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
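  The wraparound described in the datatype commit above can be illustrated with a small sketch. It only models the quoted `if (--dtype->refcnt > 0) return;` check in Python, assuming a 32-bit unsigned counter; the `Datatype` class and its `freed` flag are hypothetical stand-ins, not nftables code.

```python
UINT_MAX = 0xFFFFFFFF  # assumption: refcnt behaves like a 32-bit unsigned int


class Datatype:
    """Hypothetical stand-in for a dynamically allocated struct datatype."""

    def __init__(self, refcnt):
        self.refcnt = refcnt
        self.freed = False


def datatype_free(dtype):
    """Model of `if (--dtype->refcnt > 0) return;` with unsigned wraparound."""
    dtype.refcnt = (dtype.refcnt - 1) & UINT_MAX
    if dtype.refcnt > 0:
        return  # still referenced, or wrapped around: nothing is released
    dtype.freed = True


# Old scheme: a fresh object starts at zero, so the first free wraps and leaks.
old_style = Datatype(refcnt=0)
datatype_free(old_style)

# Reworked scheme: a fresh object starts at one, so one free releases it.
new_style = Datatype(refcnt=1)
datatype_free(new_style)
```

  With a starting count of zero the object is never released, which matches the "definitely lost" report; starting at one pairs each allocation with exactly one free.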
* include: include <stdlib.h> in <nft.h> | Thomas Haller | 2023-09-11 | 1 | -1/+0

  It provides malloc()/free(), which is so basic that we need it everywhere. Include via <nft.h>.

  The ultimate purpose is to define more things in <nft.h>. While it has no corresponding C sources, <nft.h> can contain macros and static inline functions, and is a good place for things that we shall have everywhere. Since <stdlib.h> provides malloc()/free() and size_t, that is a very basic dependency that will be needed for that.

  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: rename "dtype_clone()" to datatype_clone() | Thomas Haller | 2023-09-08 | 1 | -1/+1

  The struct is called "datatype" and related functions have the fitting "datatype_" prefix. Rename.

  Also rename the internal "dtype_alloc()" to "datatype_alloc()".

  This is a follow up to commit 01a13882bb59 ('src: add reference counter for dynamic datatypes'), which started adding "datatype_*()" functions.

  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* include: include <std{bool,int}.h> via <nft.h> | Thomas Haller | 2023-08-25 | 1 | -1/+0

  There is a minimum base that all our sources will end up needing. This is what <nft.h> provides.

  Add <stdbool.h> and <stdint.h> there. It's unlikely that we want to implement anything without having "bool" and "uint32_t" types available.

  Yes, this means the internal headers are not self-contained with respect to what <nft.h> provides. This is the exception to the rule, and our internal headers should rely on having <nft.h> included for them. They should not include <nft.h> themselves, because <nft.h> always needs to be included first. So when an internal header would include <nft.h> it would be unnecessary, because the header is *always* included already.

  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add <nft.h> header and include it as first | Thomas Haller | 2023-08-25 | 1 | -0/+2

  <config.h> is generated by the configure script. As it contains our feature detection, we want to use it everywhere. Likewise, in some of our sources, we define _GNU_SOURCE. This defines the C variant we want to use. Such a define needs to come before anything else, and it would be confusing if different source files adhere to a different C variant. It would be good to use autoconf's AC_USE_SYSTEM_EXTENSIONS, in which case we would also need to ensure that <config.h> is always included as first.

  Instead of going through all source files and including <config.h> as first, add a new header "include/nft.h", which is supposed to be included in all our sources (and as first).

  This will also allow us later to prepare some common base, like including <stdbool.h> everywhere.

  We aim that headers are self-contained, so that they can be included in any order. Which, by the way, already didn't work because some headers define _GNU_SOURCE, which would only work if the header gets included as first. <nft.h> is however an exception to the rule: everything we compile shall rely on having the <nft.h> header included as first. This applies to source files (which explicitly include <nft.h>) and to internal header files (which are only compiled indirectly, by being included from a source file).

  Note that <config.h> has no include guards, which is at least ugly to include multiple times. It doesn't cause problems in practice, because it only contains defines and the compiler doesn't warn about redefining a macro with the same value. Still, <nft.h> also ensures that <config.h> is included exactly once.

  Signed-off-by: Thomas Haller <thaller@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: stash context statement length when generating payload/meta dependency | Pablo Neira Ayuso | 2023-07-19 | 1 | -0/+13

    ... meta mark set ip dscp

  generates an implicit dependency from the inet family to match on meta nfproto ip.

  The length of this implicit expression is incorrectly adjusted to the statement length, i.e. the relational expression that compares meta nfproto takes 4 bytes instead of 1 byte. The evaluation of 'ip dscp' under the meta mark statement triggers this implicit dependency, which should not consider the context statement length since it is added before the statement itself.

  This problem shows when listing the ruleset, since netlink_parse_cmp() sees left->len < right->len and hence handles the implicit dependency as a concatenation, but it is actually a bug in the evaluation step that leads to incorrect bytecode.

  Fixes: 3c64ea7995cb ("evaluate: honor statement length in integer evaluation")
  Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand")
  Tested-by: Brian Davidson <davidson.brian@gmail.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
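  The stash-and-restore idea in the commit above can be sketched as follows. This is a simplified model, not the nftables implementation: `EvalCtx`, `stashed_stmt_len()` and `dependency_len_bits()` are hypothetical names, and the rule "an expression is widened to the statement length when one is set" is an assumption drawn from the description of the bug.

```python
from contextlib import contextmanager


class EvalCtx:
    """Hypothetical evaluation context carrying the statement length in bits."""

    def __init__(self, stmt_len=0):
        self.stmt_len = stmt_len


@contextmanager
def stashed_stmt_len(ctx):
    """Temporarily clear ctx.stmt_len while a dependency is evaluated."""
    saved = ctx.stmt_len
    ctx.stmt_len = 0
    try:
        yield ctx
    finally:
        ctx.stmt_len = saved  # restore for the statement that follows


def dependency_len_bits(ctx, natural_len_bits):
    """Assumed model: a set statement length widens the expression."""
    return max(natural_len_bits, ctx.stmt_len)


ctx = EvalCtx(stmt_len=32)               # 'meta mark set ...' is 32 bits wide
buggy_len = dependency_len_bits(ctx, 8)  # meta nfproto is 1 byte, widened to 4
with stashed_stmt_len(ctx):
    fixed_len = dependency_len_bits(ctx, 8)
```

  In this model the dependency keeps its natural 1-byte length while the statement length is stashed, and the original value is back in place once the block exits.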
* payload: set byteorder when completing expression | Pablo Neira Ayuso | 2023-03-28 | 1 | -0/+1

  Otherwise the payload expression remains in invalid byteorder, which is handled as network byteorder for historical reasons.

  No functional change is intended.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add gre support | Pablo Neira Ayuso | 2023-01-02 | 1 | -0/+47

  GRE has a number of fields that are conditional based on flags, which requires custom dependency code similar to icmp and icmpv6. Matching on optional fields is not supported at this stage.

  Since this is a layer 3 tunnel protocol, an implicit dependency on NFT_META_L4PROTO for IPPROTO_GRE is generated. To achieve this, this patch adds new infrastructure to remove an outer dependency based on the inner protocol from the delinearize path.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add vxlan matching support | Pablo Neira Ayuso | 2023-01-02 | 1 | -2/+35

  This patch adds the initial infrastructure to support inner header tunnel matching and its first user: vxlan.

  A new struct proto_desc field for payload and meta expression to specify that the expression refers to inner header matching is used.

  The existing codebase to generate bytecode is fully reused, allowing for reusing existing supported layer 2, 3 and 4 protocols.

  Syntax requires to specify vxlan before the inner protocol field:

    ... vxlan ip protocol udp
    ... vxlan ip saddr 1.2.3.0/24

  This also works with concatenations and anonymous sets, e.g.

    ... vxlan ip saddr . vxlan ip daddr { 1.2.3.4 . 4.3.2.1 }

  You have to restrict vxlan matching to udp traffic, otherwise it complains about a missing transport protocol dependency, e.g.

    ... udp dport 4789 vxlan ip daddr 1.2.3.4

  The bytecode that is generated uses the new inner expression:

    # nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
    netdev x y
      [ meta load l4proto => reg 1 ]
      [ cmp eq reg 1 0x00000011 ]
      [ payload load 2b @ transport header + 2 => reg 1 ]
      [ cmp eq reg 1 0x0000b512 ]
      [ inner type 1 hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
      [ cmp eq reg 1 0x00000008 ]
      [ inner type 1 hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
      [ cmp eq reg 1 0x04030201 ]

  JSON support is not included in this patch.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
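  The comparison constants in the bytecode above can be checked by hand. The sketch below assumes the debug output shows network-order packet bytes read back as a little-endian word (as on an x86 host); `reg_from_wire()` is a hypothetical helper for illustration only.

```python
import ipaddress
import struct


def reg_from_wire(wire: bytes) -> int:
    """Value seen when network-order bytes are read as a little-endian word."""
    return int.from_bytes(wire, "little")


# udp dport 4789 is 0x12b5 on the wire, i.e. bytes 12 b5.
udp_dport = reg_from_wire(struct.pack("!H", 4789))

# Inner ethertype for IPv4 is 0x0800 on the wire, i.e. bytes 08 00.
inner_proto = reg_from_wire(struct.pack("!H", 0x0800))

# Inner source address 1.2.3.4 is bytes 01 02 03 04.
inner_saddr = reg_from_wire(ipaddress.IPv4Address("1.2.3.4").packed)

print(f"{udp_dport:#010x} {inner_proto:#010x} {inner_saddr:#010x}")
```

  Under that assumption the three values line up with the `cmp eq` operands 0x0000b512, 0x00000008 and 0x04030201 in the dump.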
* src: add eval_proto_ctx() | Pablo Neira Ayuso | 2023-01-02 | 1 | -23/+35

  Add eval_proto_ctx() to access the protocol context (struct proto_ctx). Rename the struct proto_ctx field to _pctx to highlight that this field is internal and the helper function should be used.

  This patch comes in preparation for supporting outer and inner protocol context.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: do not kill dependency for proto_unknown | Pablo Neira Ayuso | 2022-10-31 | 1 | -2/+4

  An unsupported meta match on the layer 4 protocol sets the protocol context to proto_unknown; handle anything coming after it as a raw expression in payload_expr_expand().

  Moreover, payload_dependency_kill() skips dependency removal if the protocol is unknown, so with a raw payload expression the meta layer 4 protocol match remains in place.

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1641
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* proto: track full stack of seen l2 protocols, not just cumulative offset | Florian Westphal | 2022-08-05 | 1 | -11/+56

  For input, a cumulative size counter of all pushed l2 headers is enough, because we have the full expression tree available to us.

  For delinearization we need to track all seen l2 headers, else we lose information that we might need at a later time.

  Consider:

    rule netdev nt nc set update ether saddr . vlan id

  During delinearization, the vlan proto_desc replaces the ethernet one, and by the time we try to split the concatenation apart we will search the ether saddr offset vs. the templates for proto_vlan.

  This replaces the offset with an array that stores the protocol descriptions seen.

  Then, if the payload offset is larger than our description, search the l2 stack and adjust the offset until we're within the expected offset boundary.

  Reported-by: Eric Garver <eric@garver.life>
  Signed-off-by: Florian Westphal <fw@strlen.de>
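  A rough model of the "search the l2 stack and adjust the offset" step follows. It is a sketch under stated assumptions rather than the code from this commit: the stack is a plain list of (name, length) pairs, the header sizes (14-byte Ethernet, 4-byte VLAN tag) are chosen for illustration, and `locate_in_l2_stack()` is a hypothetical name.

```python
def locate_in_l2_stack(stack, offset_bits):
    """Map an offset from the start of the link layer to (header, local offset).

    stack is a list of (name, length_bits) in the order the headers were seen.
    """
    for name, length_bits in stack:
        if offset_bits < length_bits:
            return name, offset_bits
        offset_bits -= length_bits  # skip this header and keep searching
    raise ValueError("offset lies beyond all tracked l2 headers")


# Assumed sizes: Ethernet header 14 bytes, one VLAN tag 4 bytes.
l2_stack = [("ether", 14 * 8), ("vlan", 4 * 8)]

ether_saddr = locate_in_l2_stack(l2_stack, 48)    # source MAC follows the 6-byte dst
vlan_id = locate_in_l2_stack(l2_stack, 14 * 8 + 4)  # 12-bit id after pcp/dei bits
```

  Keeping every header seen, instead of one cumulative offset, is what lets a lookup such as `ether saddr` still resolve against the Ethernet description after a VLAN header has been pushed.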
* src: allow to use integer type header fields via typeof set declaration | Pablo Neira Ayuso | 2022-03-29 | 1 | -6/+9

  Header fields such as udp length cannot be used in concatenations because they use the generic integer_type:

    test.nft:3:10-19: Error: can not use variable sized data types (integer) in concat expressions
        typeof udp length . @th,32,32
               ^^^^^^^^^^~~~~~~~~~~~~

  This patch slightly extends ("src: allow to use typeof of raw expressions in set declaration") to set NFTNL_UDATA_SET_KEY_PAYLOAD_LEN in userdata if TYPE_INTEGER is used.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow to use typeof of raw expressions in set declaration | Pablo Neira Ayuso | 2022-03-29 | 1 | -5/+50

  Use the dynamic datatype to allocate an instance of TYPE_INTEGER and set length and byteorder. Add missing information to the set userdata area for raw payload expressions, which allows to rebuild the set typeof from the listing path.

  A few examples:

  - With anonymous sets:

      nft add rule x y ip saddr . @ih,32,32 { 1.1.1.1 . 0x14, 2.2.2.2 . 0x1e }

  - With named sets:

      table x {
        set y {
          typeof ip saddr . @ih,32,32
          elements = { 1.1.1.1 . 0x14 }
        }
      }

  Incremental updates are also supported, e.g.

    nft add element x y { 3.3.3.3 . 0x28 }

  expr_evaluate_concat() is used to evaluate both set key definitions and set key values; using two different functions might help to simplify this code in the future.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: store more than one payload dependency | Jeremy Sowden | 2022-01-15 | 1 | -19/+30

  Change the payload-dependency context to store a dependency for every protocol layer. This allows us to eliminate more redundant protocol expressions.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add a helper that returns a payload dependency for a particular base | Jeremy Sowden | 2022-01-15 | 1 | -4/+27

  Currently, with only one base and dependency stored this is superfluous, but it will become more useful when the next commit adds support for storing a payload for every base.

  Remove redundant `ctx->pbase` check.

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: reduce indentation | Jeremy Sowden | 2022-01-15 | 1 | -7/+11

  Re-arrange some switch-cases and conditionals to reduce levels of indentation.

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: skip templates with meta key set | Florian Westphal | 2021-12-09 | 1 | -0/+3

  meta templates are only there for ease of use (input/parsing).

  When listing, they should be ignored:

    set s4 {
      typeof ip version
      elements = { 1, }
    }
    chain c4 {
      ip version @s4 accept
    }

  gets listed as 'ip l4proto ...' which is nonsensical.

  After this patch we get:

    in:  ip version @s4
    out: (@nh,0,8 & 0xf0) >> 4 == @s4

  .. which is (marginally) better.

  Next patch adds support for payload decoding.

  Signed-off-by: Florian Westphal <fw@strlen.de>
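  The listed expression `(@nh,0,8 & 0xf0) >> 4` is the plain bit arithmetic for reading the IP version from the first byte of the network header. A minimal illustration, with `ip_version()` as a hypothetical helper name:

```python
def ip_version(first_byte: int) -> int:
    """Apply (@nh,0,8 & 0xf0) >> 4 to the first network-header byte."""
    return (first_byte & 0xF0) >> 4


# 0x45 is a typical IPv4 first byte (version 4, header length 5 words);
# 0x60 is a typical IPv6 first byte (version 6, traffic class high nibble 0).
v4 = ip_version(0x45)
v6 = ip_version(0x60)
```

  The mask keeps the upper nibble and the shift moves it into the low bits, so the low nibble (the IPv4 header length) does not affect the result.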
* datatype: add xinteger_type alias to print in hexadecimal | Pablo Neira Ayuso | 2021-11-03 | 1 | -1/+1

  Add an alias of the integer type to print raw payload expressions in hexadecimal.

  Update tests/py.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: don't adjust offsets of autogenerated dependency expressions | Florian Westphal | 2021-09-29 | 1 | -1/+3

  Pablo says:

    user reports that this is broken:

      nft --debug=netlink add rule bridge filter forward vlan id 100 vlan id set 200
      [..]
      [ payload load 2b @ link header + 14 => reg 1 ]
      [..]
      [ payload load 2b @ link header + 28 => reg 1 ]
      [ bitwise reg 1 = ( reg 1 & 0x000000f0 ) ^ 0x0000c800 ]
      [ payload write reg 1 => 2b @ link header + 14 csum_type 0 csum_off 0 csum_flags 0x0 ]

    offset says 28, it is assuming q-in-q, in this case it is mangling the existing header.

  The problem here is that 'vlan id set 200' needs a read-modify-write cycle because 'vlan id set' has to preserve bits located in the same byte area as the vlan id.

  The first 'payload load' at offset 14 is generated via 'vlan id 100', this part is ok.

  The second 'payload load' at offset 28 is the bogus one.

  It's added as a dependency, but then adjusted because nft evaluation considers this identical to 'vlan id 1 vlan id 2', where nft assumes q-in-q.

  To fix this, skip offset adjustments for raw expressions and mark the dependency-generated payload instruction as such.

  This is fine because raw payload operations assume that the user specifies base/offset/length manually.

  Also add a test case for this.

  Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Florian Westphal <fw@strlen.de>
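  The read-modify-write cycle mentioned above can be worked through on a single 16-bit tag. The sketch assumes the standard 802.1Q layout (3 priority bits, 1 DEI bit, 12 id bits) and, for the second function, that the register holds the two wire bytes as a little-endian word, which is how the mask and xor constants in the dump are read here. Both function names are hypothetical.

```python
def set_vlan_id(tci: int, vlan_id: int) -> int:
    """Replace the 12-bit id in a 16-bit TCI, keeping the pcp/dei bits."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("vlan id must fit in 12 bits")
    return (tci & 0xF000) | vlan_id


def set_vlan_id_200_register_style(wire: bytes) -> bytes:
    """Apply ( reg & 0x000000f0 ) ^ 0x0000c800 to the two TCI bytes."""
    reg = int.from_bytes(wire, "little")
    reg = (reg & 0x000000F0) ^ 0x0000C800
    return reg.to_bytes(2, "little")


# Priority 5, id 100 on the wire: bytes a0 64.
before = bytes([0xA0, 0x64])
after = set_vlan_id_200_register_style(before)
expected = set_vlan_id(0xA064, 200).to_bytes(2, "big")
```

  Both routes give bytes a0 c8: the priority nibble survives and the id becomes 200, which is why the statement has to load the old value before writing.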
* payload: do not remove icmp echo dependency | Florian Westphal | 2021-06-17 | 1 | -24/+37

  "icmp type echo-request icmp id 2" and "icmp id 2" are not the same; the latter gains an implicit dependency on both echo-request and echo-reply.

  Change payload dependency tracking to not store the dependency in case the value type is ICMP(6)_ECHO(REPLY).

  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: be careful on vlan dependency removal | Florian Westphal | 2021-04-03 | 1 | -3/+26

  'vlan ...' implies an 8021Q frame. In case the expression tests something else (802.1AD for example) it's not an implicitly added one, so keep it.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: check icmp dependency before removing previous icmp expression | Florian Westphal | 2021-02-02 | 1 | -21/+42

  nft is too greedy when removing icmp dependencies. 'icmp code 1 type 2' did remove the type when printing.

  Be more careful and check that the icmp type dependency of the candidate expression (earlier icmp payload expression) has the same type dependency as the new expression.

  Reported-by: Eric Garver <eric@garver.life>
  Reported-by: Michael Biebl <biebl@debian.org>
  Tested-by: Eric Garver <eric@garver.life>
  Fixes: d0f3b9eaab8d77e ("payload: auto-remove simple icmp/icmpv6 dependency expressions")
  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: auto-remove simple icmp/icmpv6 dependency expressions | Florian Westphal | 2020-12-09 | 1 | -3/+47

  Instead of:

    icmpv6 type packet-too-big icmpv6 mtu 1280

  display just

    icmpv6 mtu 1280

  The dependency added for id/sequence is still kept; it's handled by an anon set instead to cover both the echo 'request' and 'reply' cases.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add auto-dependencies for ipv6 icmp6 | Florian Westphal | 2020-12-09 | 1 | -0/+33

  Extend the earlier commit to also cover icmpv6.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add auto-dependencies for ipv4 icmp | Florian Westphal | 2020-12-09 | 1 | -1/+128

  The ICMP header has field values that only exist for certain types.

  Mark the icmp proto 'type' field as a nextheader field and add a new th description to store the icmp type dependency. This can later be re-used for other protocol dependent definitions such as mptcp options -- which all share the same tcp option number and have a special 4 bit marker inside the mptcp option space that tells how the remaining option looks.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: Support odd-sized payload matches | Phil Sutter | 2020-11-04 | 1 | -0/+5

  When expanding a payload match, don't disregard oversized templates at the right offset. A more flexible user may extract fewer bytes from the packet if only parts of a field are interesting, e.g. only the prefix of a source/destination address. Support that by using the template, but fix the length. Later, when creating a relational expression for it, detect the unusually small payload expression length and turn the RHS value into a prefix expression.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
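  The "turn the RHS value into a prefix expression" step can be pictured with an IPv4 address field. This is only an illustration of the arithmetic, assuming a 32-bit field matched from its first bit; `prefix_from_partial()` is a hypothetical helper, not an nftables function.

```python
import ipaddress


def prefix_from_partial(value: int, payload_len_bits: int,
                        field_len_bits: int = 32) -> ipaddress.IPv4Network:
    """Render a short match on the start of an address field as a prefix."""
    if not 0 < payload_len_bits <= field_len_bits:
        raise ValueError("payload must be no longer than the field")
    # Left-align the matched bits inside the full-width field.
    padded = value << (field_len_bits - payload_len_bits)
    return ipaddress.IPv4Network((padded, payload_len_bits))


# Only the first two bytes (c0 a8) of the source address are loaded.
net = prefix_from_partial(0xC0A8, 16)
```

  Here a 2-byte load at the offset of `ip saddr` compared against 0xc0a8 reads naturally as a match on 192.168.0.0/16.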
* src: context tracking for multiple transport protocols | Pablo Neira Ayuso | 2020-09-15 | 1 | -3/+4

  This patch extends the protocol context infrastructure to track multiple transport protocols when they are specified from sets.

  This removes errors like:

    "transport protocol mapping is only valid after transport protocol match"

  when invoking:

    # nft add rule x z meta l4proto { tcp, udp } dnat to 1.1.1.1:80

  This patch also catches conflicts like:

    # nft add rule x z ip protocol { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80
    Error: conflicting protocols specified: udp vs. tcp
    add rule x z ip protocol { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80
                                          ^^^^^^^^^

  and:

    # nft add rule x z meta l4proto { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80
    Error: conflicting protocols specified: udp vs. tcp
    add rule x z meta l4proto { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80
                                           ^^^^^^^^^

  Note that:

  - the singleton protocol context tracker is left in place until the existing users are updated to use this new multiprotocol tracker. Moving forward, it would be good to consolidate things around this new multiprotocol context tracker infrastructure.

  - link and network layers are not updated to use this infrastructure yet. The code that deals with vlan conflicts relies on forcing protocol context updates to the singleton protocol base.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add "typeof" build/parse/print support | Florian Westphal | 2019-12-17 | 1 | -0/+75

  This patch adds two new expression operations to build and to parse the userdata area that describes the set key and data typeof definitions.

  For maps, the grammar enforces either "type data_type : data_type" or "typeof expression : expression".

  Check both key and data for valid user typeof info first. If they check out, flag set->key_typeof_valid as true and use it for printing the key info.

  This patch comes with initial support for using payload expressions with the 'typeof' keyword; followup patches will add support for other expressions as well.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: Fix dumping vlan rules | M. Braun | 2019-07-31 | 1 | -0/+12

  Given the following bridge rules:

    1. ip protocol icmp accept
    2. ether type vlan vlan type ip ip protocol icmp accept

  They are currently both dumped by "nft list ruleset" as

    1. ip protocol icmp accept
    2. ip protocol icmp accept

  Though, the netlink code actually is different:

    bridge filter FORWARD 4
      [ payload load 2b @ link header + 12 => reg 1 ]
      [ cmp eq reg 1 0x00000008 ]
      [ payload load 1b @ network header + 9 => reg 1 ]
      [ cmp eq reg 1 0x00000001 ]
      [ immediate reg 0 accept ]

    bridge filter FORWARD 5 4
      [ payload load 2b @ link header + 12 => reg 1 ]
      [ cmp eq reg 1 0x00000081 ]
      [ payload load 2b @ link header + 16 => reg 1 ]
      [ cmp eq reg 1 0x00000008 ]
      [ payload load 1b @ network header + 9 => reg 1 ]
      [ cmp eq reg 1 0x00000001 ]
      [ immediate reg 0 accept ]

  What happens here is that:

    1. vlan type ip kills ether type vlan
    2. ip protocol icmp kills vlan type ip

  Fix this by avoiding the removal of all vlan statements in the given example.

  Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* proto: add pseudo th protocol to match d/sport in generic way | Florian Westphal | 2019-07-15 | 1 | -0/+23

  Problem: It's not possible to easily match both udp and tcp in a single rule.

    ... input ip protocol { tcp,udp } dport 53

  will not work, as bison expects "tcp dport" or "sctp dport", or any other transport protocol name.

  It's possible to match the sport and dport via raw payload expressions, e.g.:

    ... input ip protocol { tcp,udp } @th,16,16 53

  but it's not very readable. Furthermore, it's not possible to use this for set definitions:

    table inet filter {
      set myset {
        type ipv4_addr . inet_proto . inet_service
      }

      chain forward {
        type filter hook forward priority filter; policy accept;
        ip daddr . ip protocol . @th,0,16 @myset
      }
    }

    # nft -f test
    test:7:26-35: Error: can not use variable sized data types (integer) in concat expressions

  During the netfilter workshop Pablo suggested adding an alias to make raw sport/dport matching more readable, and to make it use the inet_service type automatically.

  So, this change makes @th,0,16 work for the set definition case by setting the data type to inet_service.

  A new "th s|dport" syntax is provided as a readable alternative:

    ip protocol { tcp, udp } th dport 53

  As "th" is an alias for the raw expression, no dependency is generated -- it's the user's responsibility to add a suitable test to select the l4 header types that should be matched.

  Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
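  The raw expressions `@th,0,16` and `@th,16,16` work for both protocols because TCP and UDP place the source and destination ports in the first four bytes of the transport header. A small sketch of the bit extraction, with `raw_th()` as a hypothetical helper taking an offset and length in bits:

```python
import struct


def raw_th(transport: bytes, offset_bits: int, length_bits: int) -> int:
    """Extract length_bits starting at offset_bits from a transport header."""
    total_bits = len(transport) * 8
    shift = total_bits - offset_bits - length_bits
    if offset_bits < 0 or length_bits <= 0 or shift < 0:
        raise ValueError("requested bits fall outside the header")
    value = int.from_bytes(transport, "big")
    return (value >> shift) & ((1 << length_bits) - 1)


# UDP header: source port, destination port, length, checksum.
udp_header = struct.pack("!HHHH", 40000, 53, 8, 0)
# First four bytes of a TCP header: source port, destination port.
tcp_start = struct.pack("!HH", 40001, 53)

udp_sport = raw_th(udp_header, 0, 16)
udp_dport = raw_th(udp_header, 16, 16)
tcp_dport = raw_th(tcp_start, 16, 16)
```

  Since nothing in the extraction checks which protocol the header belongs to, the rule still needs its own `ip protocol { tcp, udp }` style test, as the commit message points out.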
* exthdr: add support for matching IPv4 options | Stephen Suryaputra | 2019-07-04 | 1 | -0/+4

  Add capability to have rules matching IPv4 options. This is developed mainly to support dropping of IP packets with loose and/or strict source route options.

  Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: prefer meta protocol as bridge l3 dependency | Florian Westphal | 2019-06-19 | 1 | -0/+18

  On families other than 'ip', the rule

    ip protocol icmp

  needs a dependency on the ip protocol so we do not treat e.g. an ipv6 header as ip.

  Bridge currently uses eth_hdr.type for this, but that will cause the rule above to not match in case the ip packet is within a VLAN tagged frame -- ether.type will appear as ETH_P_8021Q.

  Due to vlan tag stripping, skb->protocol will be ETH_P_IP -- so prefer to use this instead.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: missing destroy function in statement definitions | Pablo Neira Ayuso | 2019-04-05 | 1 | -0/+7

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: expr: remove expr_ops from struct expr | Florian Westphal | 2019-02-08 | 1 | -2/+2

  The size of struct expr changes from 144 to 128 bytes on x86_64. This doesn't look like much, but large rulesets can have tens of thousands of expressions (each set element is represented by an expression).

  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: expr: add expression etype | Florian Westphal | 2019-02-08 | 1 | -5/+5

  Temporary kludge to remove all the expr->ops->type == ... patterns. Followup patch will remove expr->ops, and make expr_ops() look up the correct expr_ops struct instead to reduce struct expr size.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: payload: export and use payload_expr_cmp | Florian Westphal | 2019-02-08 | 1 | -1/+1

  expr->ops is going away, so export payload cmp and use it directly.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: refine payload expr merging | Florian Westphal | 2019-01-11 | 1 | -1/+27

  nf_tables can handle payload exprs for sizes <= sizeof(u32) via a direct operation from the eval loop, rather than a call to the payload expression. Two loads for four byte quantities are thus faster than a single load for an 8 byte load.

    ip saddr 1.2.3.4 ip daddr 2.3.4.5

  is faster with this applied, even though it involves two payload and two compare expressions, just because all can be handled from the main loop without any calls to expression ops.

  Keep merging for linklayer and when at least one of the expressions already exceeded the 4 byte "limit" anyway.

  Signed-off-by: Florian Westphal <fw@strlen.de>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
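  The merging rule in the last paragraph can be written down as a small predicate. This is one reading of the commit message, not the code it touches: the base names and `should_merge()` are hypothetical, and adjacency of the two loads is taken as given.

```python
LINK, NETWORK, TRANSPORT = "ll", "nh", "th"  # illustrative header base names

FAST_PATH_BYTES = 4  # sizeof(u32): loads up to this size stay in the eval loop


def should_merge(base: str, first_len: int, second_len: int) -> bool:
    """Decide whether two adjacent payload loads are worth merging."""
    if base == LINK:
        return True  # merging is kept for the link layer
    # Merge only if one side already misses the fast path anyway.
    return first_len > FAST_PATH_BYTES or second_len > FAST_PATH_BYTES


ipv4_pair = should_merge(NETWORK, 4, 4)    # ip saddr + ip daddr
ether_pair = should_merge(LINK, 6, 6)      # ether daddr + ether saddr
ipv6_pair = should_merge(NETWORK, 16, 16)  # ip6 saddr + ip6 daddr
```

  Under this reading the `ip saddr 1.2.3.4 ip daddr 2.3.4.5` example stays as two 4-byte loads, while link-layer and already-oversized loads are still combined.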
* src: add igmp support | Pablo Neira Ayuso | 2019-01-09 | 1 | -2/+4

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: Implement JSON output support | Phil Sutter | 2018-05-11 | 1 | -0/+3

  Although technically there already is support for JSON output via the 'nft export json' command, it is hardly usable since it exports all the gory details of the nftables VM. Also, libnftables has no control over what is exported since the content comes directly from libnftnl.

  Instead, implement JSON format support for regular 'nft list' commands.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Revert "payload: don't remove icmp family dependency in special cases" | Florian Westphal | 2018-03-28 | 1 | -9/+0

  This reverts commit 126706c23c0458b07d54550dc27561b30f8a43f2.

  As it's now ok to use icmp-in-ip6 family we can remove the dependency.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: don't remove icmp family dependency in special cases | Florian Westphal | 2018-03-27 | 1 | -0/+9

  When using nftables to filter icmp-in-ipv6 or icmpv6-in-ipv4 we erroneously removed the dependency, i.e. "list ruleset" shows

    table ip6 filter {
      chain output {
        type filter hook output priority 0; policy accept;
        icmp type destination-unreachable
      }
    }

  but that won't restore because of the ip vs ipv6 conflict.

  After this patch, this lists as

    meta l4proto icmp icmp type destination-unreachable

  instead. We still remove the dependency in the "ip" family.

  Same applies to icmpv6-in-ip.

  Reported-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: make raw payloads work | Florian Westphal | 2018-02-26 | 1 | -1/+1

  Make syntax consistent between print and parse. No dependency handling -- once you use a raw expression, you need to make sure the raw expression only sees the packets that you'd want it to see.

  Based on an earlier patch from Laurent Fasnacht <l@libres.ch>. Laurent's patch added a different syntax:

    @<protocol>,<base>,<data type>,<offset>,<length>

  data_type is useful to make nftables not err when asking for "@payload,32,32 192.168.0.1"; this patch still requires manual conversion to an integer type (hex or decimal notation).

  data_type should probably be added later by adding an explicit cast expression, independent of the raw payload syntax.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: don't resolve expressions using the inet pseudoheader | Florian Westphal | 2018-02-26 | 1 | -1/+1

  Else, '@ll,0,8' will be mapped to 'inet nfproto', but that's not correct (inet is a pseudo header).

  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: use integer_type when initializing a raw expression | Florian Westphal | 2018-02-26 | 1 | -0/+1

  The invalid type prints a prominent "[invalid]", so prefer the integer type in raw expressions.

  Signed-off-by: Florian Westphal <fw@strlen.de>