| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8c75d3a16960 ("Reject invalid chain priority values in user space")
provides error reporting from the evaluation phase. Instead, this patch
infers the error after the kernel reports EOPNOTSUPP.
test.nft:3:28-40: Error: Chains of type "nat" must have a priority value above -200
type nat hook prerouting priority -300;
^^^^^^^^^^^^^
This patch also adds another common issue for users compiling their own
kernels if they forget to enable CONFIG_NFT_NAT in their .config file.
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
The kernel doesn't accept nat type chains with a priority of -200 or
below. Catch this and provide a better error message than the kernel's
EOPNOTSUPP.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new statement allows you to know how long ago there was a matching
packet.
# nft list ruleset
table ip x {
chain y {
[...]
ip protocol icmp last used 49m54s884ms counter packets 1 bytes 64
}
}
if this statement never sees a packet, then the listing says:
ip protocol icmp last used never counter packets 0 bytes 0
Add tests/py in this patch too.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the data in the mapping contains a range, then upgrade value to range.
Otherwise, the following error is displayed:
/dev/stdin:11:57-75: Error: Could not process rule: Invalid argument
dnat ip to iifname . ip saddr map { enp2s0 . 10.1.1.136 : 1.1.2.69, enp2s0 . 10.1.1.1-10.1.1.135 : 1.1.2.66-1.84.236.78 }
^^^^^^^^^^^^^^^^^^^
The kernel rejects this command because userspace sends a single value
while the kernel expects the range that represents the min and the max
IP address to be used for NAT. The upgrade is also done when concatenation
with intervals is used in the rhs of the mapping.
For anonymous sets, expansion cannot be done from expr_evaluate_mapping()
because the EXPR_F_INTERVAL flag is inferred from the elements. For
explicit sets, this can be done from expr_evaluate_mapping() because the
user already specifies the interval flag in the rhs of the map definition.
Update tests/shell and tests/py to improve testing coverage in this case.
Fixes: 9599d9d25a6b ("src: NAT support for intervals in maps")
Fixes: 66746e7dedeb ("src: support for nat with interval concatenation")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The nested syntax notation results in one single table command which
includes all other objects. This differs from the flat notation where
there is usually one command per object.
This patch adds a previous step to the evaluation phase to expand the
objects that are contained in the table into independent commands, so
both notations have similar representations.
Remove the code to evaluate the nested representation in the evaluation
phase since commands are independently evaluated after the expansion.
The commands are expanded after the set element collapse step, in case
that there is a long list of singleton element commands to be added to
the set, to shorten the command list iteration.
This approach also avoids interference with the object cache that is
populated in the evaluation, which might refer to objects coming in the
existing command list that is being processed.
There is still a post_expand phase to detach the elements from the set
which could be consolidated by updating the evaluation step to handle
the CMD_OBJ_SETELEMS command type.
This patch fixes 27c753e4a8d4 ("rule: expand standalone chain that
contains rules") which broke rule addition/insertion by index because
the expansion code after the evaluation messes up the cache.
Fixes: 27c753e4a8d4 ("rule: expand standalone chain that contains rules")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
If the key in the nat mapping is either ip or ip6, then set the nat
family accordingly, no need for explicit family in the nat statement.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Print error message in case family cannot be inferred, before this
patch, $? shows 1 after nft execution but no error message was printed.
While at it, update error reporting for consistency in similar use
cases.
Fixes: e5c9c8fe0bcc ("evaluate: stmt_evaluate_nat_map() only if stmt->nat.ipportmap == true")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to be able to set ct and meta marks to values derived from
payload expressions, we need to relax the requirement that the type of
the statement argument must match that of the statement key. Instead,
we require that the base-type of the argument is integer and that the
argument is small enough to fit.
Add one testcase for tests/py.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"destroy" command performs a deletion as "delete" command but does not fail
if the object does not exist. As there is no NLM_F_* flag for ignoring such
error, it needs to be ignored directly on error handling.
Example of use:
# nft list ruleset
table ip filter {
chain output {
}
}
# nft destroy table ip missingtable
# echo $?
0
# nft list ruleset
table ip filter {
chain output {
}
}
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eric reports that nft asserts when using integer basetype constants with
'typeof' sets. Example:
table netdev t {
set s {
typeof ether saddr . vlan id
flags dynamic,timeout
}
chain c { }
}
loads fine. But adding a rule with add/update statement fails:
nft 'add rule netdev t c set update ether saddr . 0 @s'
nft: netlink_linearize.c:867: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed.
When the 'ether saddr . 0' concat expression is processed, there is
no set definition available anymore to deduce the required size of the
integer constant.
nft eval step then derives the required length using the data types.
'0' has integer basetype, so the deduced length is 0.
The assertion triggers because serialization step finds that it
needs one more register.
2 are needed to store the ethernet address, another register is
needed for the vlan id.
Update eval step to make the expression context store the set key
information when processing the preceeding set reference, then
let stmt_evaluate_set() preserve the existing context instead of
zeroing it again via stmt_evaluate_arg().
This makes concat expression evaluation compute the total size
needed based on the sets key definition.
Reported-by: Eric Garver <eric@garver.life>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
| |
Reset rule counters and quotas in kernel, i.e. without having to reload
them. Requires respective kernel patch to support NFT_MSG_GETRULE_RESET
message type.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GRE has a number of fields that are conditional based on flags,
which requires custom dependency code similar to icmp and icmpv6.
Matching on optional fields is not supported at this stage.
Since this is a layer 3 tunnel protocol, an implicit dependency on
NFT_META_L4PROTO for IPPROTO_GRE is generated. To achieve this, this
patch adds new infrastructure to remove an outer dependency based on
the inner protocol from delinearize path.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For easier debugging, add decoration on protocol context:
# nft --debug=proto-ctx add rule netdev x y udp dport 4789 vxlan ip protocol icmp counter
update link layer protocol context (inner):
link layer : netdev <-
network layer : none
transport layer : none
payload data : none
update network layer protocol context (inner):
link layer : netdev
network layer : ip <-
transport layer : none
payload data : none
update network layer protocol context (inner):
link layer : netdev
network layer : ip <-
transport layer : none
payload data : none
update transport layer protocol context (inner):
link layer : netdev
network layer : ip
transport layer : icmp <-
payload data : none
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the initial infrastructure to support for inner header
tunnel matching and its first user: vxlan.
A new struct proto_desc field for payload and meta expression to specify
that the expression refers to inner header matching is used.
The existing codebase to generate bytecode is fully reused, allowing for
reusing existing supported layer 2, 3 and 4 protocols.
Syntax requires to specify vxlan before the inner protocol field:
... vxlan ip protocol udp
... vxlan ip saddr 1.2.3.0/24
This also works with concatenations and anonymous sets, eg.
... vxlan ip saddr . vxlan ip daddr { 1.2.3.4 . 4.3.2.1 }
You have to restrict vxlan matching to udp traffic, otherwise it
complains on missing transport protocol dependency, e.g.
... udp dport 4789 vxlan ip daddr 1.2.3.4
The bytecode that is generated uses the new inner expression:
# nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
netdev x y
[ meta load l4proto => reg 1 ]
[ cmp eq reg 1 0x00000011 ]
[ payload load 2b @ transport header + 2 => reg 1 ]
[ cmp eq reg 1 0x0000b512 ]
[ inner type 1 hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
[ cmp eq reg 1 0x00000008 ]
[ inner type 1 hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
[ cmp eq reg 1 0x04030201 ]
JSON support is not included in this patch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Add eval_proto_ctx() to access protocol context (struct proto_ctx).
Rename struct proto_ctx field to _pctx to highlight that this field
is internal and the helper function should be used.
This patch comes in preparation for supporting outer and inner
protocol context.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an underflow of the index that iterates over the concatenation:
../include/datatype.h:292:15: runtime error: shift exponent 4294967290 is too large for 32-bit type 'unsigned int'
set the datatype to invalid which is fine to evaluate a concatenation
in a set/map statement.
Update b8e1940aa190 ("tests: add a test case for map update from packet
path with concat") so it does not need a workaround to work.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Map updates can use timeouts, just like with sets, but the
linearization step did not pass this info to the kernel.
meta l4proto tcp update @pinned { ip saddr . ct original proto-src timeout 90s : ip daddr . tcp dport
Listing this won't show the "timeout 90s" because kernel never saw it to
begin with.
Also update evaluation step to reject a timeout that was set on
the data part: Timeouts are only allowed for the key-value pair
as a whole.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set pointer to list of expression to NULL and check that it is set on
before using it.
In function ‘expr_evaluate_concat’,
inlined from ‘expr_evaluate’ at evaluate.c:2488:10:
evaluate.c:1338:20: warning: ‘expressions’ may be used uninitialized [-Wmaybe-uninitialized]
1338 | if (runaway) {
| ^
evaluate.c: In function ‘expr_evaluate’:
evaluate.c:1321:33: note: ‘expressions’ was declared here
1321 | const struct list_head *expressions;
| ^~~~~~~~~~~
Reported-by: Florian Westphal <fw@strlen.de>
Fixes: 508f3a270531 ("netlink: swap byteorder of value component in concatenation of intervals")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Display error message in case user specifies more data components than
those defined by the concatenation of selectors.
# cat example.nft
table ip x {
chain y {
type filter hook prerouting priority 0; policy drop;
ip saddr . meta mark { 1.2.3.4 . 0x00000100 . 1.2.3.6-1.2.3.8 } accept
}
}
# nft -f example.nft
example.nft:4:3-22: Error: too many concatenation components
ip saddr . meta mark { 1.2.3.4 . 0x00000100 . 1.2.3.6-1.2.3.8 } accept
~~~~~~~~~~~~~~~~~~~~ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Without this patch, nft crashes:
==464771==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60d000000418 at pc 0x7fbc17513aa5 bp 0x7ffc73d33c90 sp 0x7ffc73d33c88
READ of size 8 at 0x60d000000418 thread T0
#0 0x7fbc17513aa4 in expr_evaluate_concat /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1348
#1 0x7fbc1752a9da in expr_evaluate /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:2476
#2 0x7fbc175175e2 in expr_evaluate_set_elem /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1504
#3 0x7fbc1752aa22 in expr_evaluate /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:2482
#4 0x7fbc17512cb5 in list_member_evaluate /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1310
#5 0x7fbc17518ca0 in expr_evaluate_set /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1590
[...]
Fixes: 64bb3f43bb96 ("src: allow to use typeof of raw expressions in set declaration")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Assuming the following interval set with concatenation:
set test {
typeof ip saddr . meta mark
flags interval
}
then, the following rule:
ip saddr . meta mark @test
requires bytecode that swaps the byteorder for the meta mark selector in
case the set contains intervals and concatenations.
inet x y
[ meta load nfproto => reg 1 ]
[ cmp eq reg 1 0x00000002 ]
[ payload load 4b @ network header + 12 => reg 1 ]
[ meta load mark => reg 9 ]
[ byteorder reg 9 = hton(reg 9, 4, 4) ] <----- this is required !
[ lookup reg 1 set test dreg 0 ]
This patch updates byteorder_conversion() to add the unary expression
that introduces the byteorder expression.
Moreover, store the meta mark range component of the element tuple in
the set in big endian as it is required for the range comparisons. Undo
the byteorder swap in the netlink delinearize path to listing the meta
mark values accordingly.
Update tests/py to validate that byteorder expression is emitted in the
bytecode. Update tests/shell to validate insertion and listing of a
named map declaration.
A similar commit 806ab081dc9a ("netlink: swap byteorder for
host-endian concat data") already exists in the tree to handle this for
strings with prefix (e.g. eth*).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following ruleset:
ip version vmap { 4 : jump t3, 6 : jump t4 }
results in a memleak.
expr_evaluate_shift() overrides the datatype which results in a datatype
memleak after the binop transfer that triggers a left-shift of the
constant (in the map).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
Use datatype_equal(), otherwise dynamically allocated datatype fails
to fulfill the datatype pointer check, triggering the assertion:
nft: evaluate.c:1249: expr_evaluate_binop: Assertion `expr_basetype(left) == expr_basetype(right)' failed.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1636
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
'vlan id 1'
must also add a ethernet header dep, else nft fetches the payload from
header offset 0 instead of 14.
Reported-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
| |
nft add rule inet filter input vlan id 2
Error: conflicting protocols specified: ether vs. vlan
Refresh the current dependency after superseding the dummy
dependency to make this work.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a couple of spelling mistakes:
'expresion' -> 'expression'
and correct some non-native usages:
'allows to' -> 'allows one to'
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
| |
'rule inet dscpclassify dscp_match meta l4proto { udp } th dport { 3478 } th sport { 3478-3497, 16384-16387 } goto ct_set_ef'
works with 'nft add', but not 'nft insert', the latter yields: "BUG: unhandled op 4".
Fixes: 81e36530fcac ("src: replace interval segment tree overlap and automerge")
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In verdict map, string values are accidentally treated as verdicts.
For example:
table t {
map foo {
type ipv4_addr : verdict
elements = {
192.168.0.1 : bar
}
}
chain output {
type filter hook output priority mangle;
ip daddr vmap @foo
}
}
Though "bar" is not a valid verdict (should be "jump bar" or something),
the string is taken as the element value. Then NFTA_DATA_VALUE is sent
to the kernel instead of NFTA_DATA_VERDICT. This would be rejected by
recent kernels. On older ones (e.g. v5.4.x) that don't validate the
type, a warning can be seen when the rule is hit, because of the
corrupted verdict value:
[5120263.467627] WARNING: CPU: 12 PID: 303303 at net/netfilter/nf_tables_core.c:229 nft_do_chain+0x394/0x500 [nf_tables]
Indeed, we don't parse verdicts during evaluation, but only chain names,
which is of type string rather than verdict. For example, "jump $var" is
a verdict while "$var" is a string.
Fixes: c64457cff967 ("src: Allow goto and jump to a variable")
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"ether saddr 0:1:2:3:4:6 vlan id 2" works, but reverse fails:
"vlan id 2 ether saddr 0:1:2:3:4:6" will give
Error: conflicting protocols specified: vlan vs. ether
After "proto: track full stack of seen l2 protocols, not just cumulative offset",
we have a list of all l2 headers, so search those to see if we had this
proto base in the past before rejecting this.
Reported-by: Eric Garver <eric@garver.life>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For input, a cumulative size counter of all pushed l2 headers is enough,
because we have the full expression tree available to us.
For delinearization we need to track all seen l2 headers, else we lose
information that we might need at a later time.
Consider:
rule netdev nt nc set update ether saddr . vlan id
during delinearization, the vlan proto_desc replaces the ethernet one,
and by the time we try to split the concatenation apart we will search
the ether saddr offset vs. the templates for proto_vlan.
This replaces the offset with an array that stores the protocol
descriptions seen.
Then, if the payload offset is larger than our description, search the
l2 stack and adjust the offset until we're within the expected offset
boundary.
Reported-by: Eric Garver <eric@garver.life>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
| |
If set declaration is missing the interval flag, and user specifies an
element with either prefix or range, then bail out.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1592
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding elements to a set or map with an invalid definition causes nft to
segfault. The following nftables.conf triggers the crash:
flush ruleset
create table inet filter
set inet filter foo {}
add element inet filter foo { foobar }
Simply parsing and checking the config will trigger it:
$ nft -c -f nftables.conf.crash
Segmentation fault
The error in the set/map definition is correctly caught and queued, but
because the set is invalid and does not contain a key type, adding to it
causes a NULL pointer dereference of set->key within setelem_evaluate().
I don't think it's necessary to queue another error since the underlying
problem is correctly detected and reported when parsing the definition
of the set. Simply checking the validity of set->key before using it
seems to fix it, causing the error in the definition of the set to be
reported properly. The element type error isn't caught, but that seems
reasonable since the key type is invalid or unknown anyway:
$ ./nft -c -f ~/nftables.conf.crash
/home/pti/nftables.conf.crash:3:21-21: Error: set definition does not specify key
set inet filter foo {}
^
[ Add tests to cover this case --pablo ]
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1597
Signed-off-by: Peter Tirsek <peter@tirsek.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise bogus error reports on set datatype mismatch might occur, such as:
Error: datatype mismatch, expected Internet protocol, expression has type IPv4 address
meta l4proto { tcp, udp } th dport 443 dnat to 10.0.0.1
~~~~~~~~~~~~ ^^^^^^^^^^^^
with an unrelated set declaration.
table ip test {
set set_with_interval {
type ipv4_addr
flags interval
}
chain prerouting {
type nat hook prerouting priority dstnat; policy accept;
meta l4proto { tcp, udp } th dport 443 dnat to 10.0.0.1
}
}
This bug has been introduced in the evaluation step.
Reported-by: Roman Petrov <nwhisper@gmail.com>
Fixes: 81e36530fcac ("src: replace interval segment tree overlap and automerge)"
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
assert(1) is a no-op, this should be assert(0). Use BUG() instead.
Add missing CATCHALL to avoid BUG().
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"typeof ip saddr . ipsec in reqid" won't work because reqid uses
integer type, i.e. dtype->size is 0.
With "typeof", the size can be derived from the expression length,
via set->key.
This computes the concat length based either on dtype->size or
expression length.
It also updates concat evaluation to permit a zero datatype size
if the subkey expression has nonzero length (i.e., typeof was used).
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Splice the existing set element cache with the elements to be deleted
and merge sort it. The elements to be deleted are identified by the
EXPR_F_REMOVE flag.
The set elements to be deleted is automerged in first place if the
automerge flag is set on.
There are four possible deletion scenarios:
- Exact match, eg. delete [a-b] and there is a [a-b] range in the kernel set.
- Adjust left side of range, eg. delete [a-b] from range [a-x] where x > b.
- Adjust right side of range, eg. delete [a-b] from range [x-b] where x < a.
- Split range, eg. delete [a-b] from range [x-y] where x < a and b < y.
Update nft_evaluate() to use the safe list variant since new commands
are dynamically registered to the list to update ranges.
This patch also restores the set element existence check for Linux
kernels <= 5.7.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Allow for ranges such as, eg. 30-30.
This is required by the new intervals.c code, which normalize constant,
prefix set elements to all ranges.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the interval codebase to support for merging elements in the
kernel with userspace element updates.
Add a list of elements to be purged to cmd and set objects. These
elements representing outdated intervals are deleted before adding the
updated ranges.
This routine splices the list of userspace and kernel elements, then it
mergesorts to identify overlapping and contiguous ranges. This splice
operation is undone so the set userspace cache remains consistent.
Incrementally update the elements in the cache, this allows to remove
dd44081d91ce ("segtree: Fix add and delete of element in same batch").
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a rewrite of the segtree interval codebase.
This patch now splits the original set_to_interval() function in three
routines:
- add set_automerge() to merge overlapping and contiguous ranges.
The elements, expressed either as single value, prefix and ranges are
all first normalized to ranges. This elements expressed as ranges are
mergesorted. Then, there is a linear list inspection to check for
merge candidates. This code only merges elements in the same batch,
ie. it does not merge elements in the kernela and the userspace batch.
- add set_overlap() to check for overlapping set elements. Linux
kernel >= 5.7 already checks for overlaps, older kernels still needs
this code. This code checks for two conflict types:
1) between elements in this batch.
2) between elements in this batch and kernelspace.
The elements in the kernel are temporarily merged into the list of
elements in the batch to check for this overlaps. The EXPR_F_KERNEL
flag allows us to restore the set cache after the overlap check has
been performed.
- set_to_interval() now only transforms set elements, expressed as range
e.g. [a,b], to individual set elements using the EXPR_F_INTERVAL_END
flag notation to represent e.g. [a,b+1), where b+1 has the
EXPR_F_INTERVAL_END flag set on.
More relevant updates:
- The overlap and automerge routines are now performed in the evaluation
phase.
- The userspace set object representation now stores a reference to the
existing kernel set object (in case there is already a set with this
same name in the kernel). This is required by the new overlap and
automerge approach.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To make something like "eth*" work for interval sets (match
eth0, eth1, and so on...) we must treat the string as a 128 bit
integer.
Without this, segtree will do the wrong thing when applying the prefix,
because we generate the prefix based on 'eth*' as input, with a length of 3.
The correct import needs to be done on "eth\0\0\0\0\0\0\0...", i.e., if
the input buffer were an ipv6 address, it should look like "eth\0::",
not "::eth".
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Else, range_expr_value_high() will see a 0 length when doing:
mpz_init_bitmask(tmp, expr->len - expr->prefix_len);
This wasn't a problem so far because prefix expressions generated
from "string*" were never passed down to the prefix->range conversion
functions.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prerequisite for support of interface names in interval sets:
table inet filter {
set s {
type ifname
flags interval
elements = { "foo" }
}
chain input {
type filter hook input priority filter; policy accept;
iifname @s counter
}
}
Will yield: "Byteorder mismatch: meta expected big endian, got host endian".
This is because of:
/* Data for range lookups needs to be in big endian order */
if (right->set->flags & NFT_SET_INTERVAL &&
byteorder_conversion(ctx, &rel->left, BYTEORDER_BIG_ENDIAN) < 0)
It doesn't make sense to me to add checks to all callers of
byteorder_conversion(), so treat this similar to EXPR_CONCAT and turn
TYPE_STRING byteorder change into a no-op.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Header fields such as udp length cannot be used in concatenations because
it is using the generic integer_type:
test.nft:3:10-19: Error: can not use variable sized data types (integer) in concat expressions
typeof udp length . @th,32,32
^^^^^^^^^^~~~~~~~~~~~~
This patch slightly extends ("src: allow to use typeof of raw expressions in
set declaration") to set on NFTNL_UDATA_SET_KEY_PAYLOAD_LEN in userdata if
TYPE_INTEGER is used.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the dynamic datatype to allocate an instance of TYPE_INTEGER and set
length and byteorder. Add missing information to the set userdata area
for raw payload expressions which allows to rebuild the set typeof from
the listing path.
A few examples:
- With anonymous sets:
nft add rule x y ip saddr . @ih,32,32 { 1.1.1.1 . 0x14, 2.2.2.2 . 0x1e }
- With named sets:
table x {
set y {
typeof ip saddr . @ih,32,32
elements = { 1.1.1.1 . 0x14 }
}
}
Incremental updates are also supported, eg.
nft add element x y { 3.3.3.3 . 0x28 }
expr_evaluate_concat() is used to evaluate both set key definitions
and set key values, using two different function might help to simplify
this code in the future.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
without this test fails with:
W: [FAILED] tests/shell/testcases/maps/anon_objmap_concat: got 134
BUG: invalid range expression type concat
nft: expression.c:1452: range_expr_value_low: Assertion `0' failed.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
| |
else, this will segfault when trying to print the
"table 'x' doesn't exist" error message.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
| |
This allows to replace a tcp option with nops, similar
to the TCPOPTSTRIP feature of iptables.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When passing no upper size limit, the dynset expression forces
an internal 64k upperlimit.
In some cases, this can result in 'nft -f' to restore the ruleset.
Avoid this by always setting the EVAL flag on a set definition when
we encounter packet-path update attempt in the batch.
Reported-by: Yi Chen <yiche@redhat.com>
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we are evaluating a `reject` statement in the `inet` family, we may
have `ether` and `ip` or `ip6` as the L2 and L3 protocols in the
evaluation context:
table inet filter {
chain input {
type filter hook input priority filter;
ether saddr aa:bb:cc:dd:ee:ff ip daddr 192.168.0.1 reject
}
}
Since no `reject` option is given, nft attempts to infer one and fails:
BUG: unsupported familynft: evaluate.c:2766:stmt_evaluate_reject_inet_family: Assertion `0' failed.
Aborted
The reason it fails is that the ethernet protocol numbers for IPv4 and
IPv6 (`ETH_P_IP` and `ETH_P_IPV6`) do not match `NFPROTO_IPV4` and
`NFPROTO_IPV6`. Add support for the ethernet protocol numbers.
Replace the current `BUG("unsupported family")` error message with
something more informative that tells the user to provide an explicit
reject option.
Add a Python test case.
Fixes: 5fdd0b6a0600 ("nft: complete reject support")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1001360
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
There are a couple of mistakes in comments. Fix them.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
The mask used to select bits to keep must be exported in the same
byteorder as the payload statement itself, also the length of the
exported data must match the number of bytes extracted earlier.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|