| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
stmt_evaluate_log_prefix()
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, the macro asserts against truncation. This is despite the
callers still checked for truncation and tried to handle it. Probably
for good reason. With stmt_evaluate_log_prefix() it's not clear that the
code ensures that truncation cannot happen, so we must not assert
against it, but handle it.
Also,
- wrap the macro in "do { ... } while(0)" to make it more
function-like.
- evaluate macro arguments exactly once, to make it more function-like.
- take pointers to the arguments that are being modified.
- use assert() instead of abort().
- use size_t type for arguments related to the buffer size.
- drop "size". It was mostly redundant to "offset". We can know
everything we want based on "len" and "offset" alone.
- "offset" previously was incremented before checking for truncation.
So it would point somewhere past the buffer. This behavior does not
seem useful. Instead, on truncation "len" will be zero (as before) and
"offset" will point one past the buffer (one past the terminating
NUL).
Thereby, also fix a warning from clang:
evaluate.c:4134:9: error: variable 'size' set but not used [-Werror,-Wunused-but-set-variable]
size_t size = 0;
^
meta.c:1006:9: error: variable 'size' set but not used [-Werror,-Wunused-but-set-variable]
size_t size;
^
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, nft crashes with prefix longer than 127 bytes:
# nft add rule x y log prefix \"eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee\"
==159385==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffed5bf4a10 at pc 0x7f3134839269 bp 0x7ffed5bf48b0 sp 0x7ffed5bf4060
WRITE of size 129 at 0x7ffed5bf4a10 thread T0
#0 0x7f3134839268 in __interceptor_memset ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:778
#1 0x7f3133e3074e in __mpz_export_data /tmp/nftables/src/gmputil.c:110
#2 0x7f3133d21d3c in expr_to_string /tmp/nftables/src/expression.c:192
#3 0x7f3133ded103 in netlink_gen_log_stmt /tmp/nftables/src/netlink_linearize.c:1148
#4 0x7f3133df33a1 in netlink_gen_stmt /tmp/nftables/src/netlink_linearize.c:1682
[...]
Fixes: e76bb3794018 ('src: allow for variables in the log prefix string')
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit 343a51702656a ("src: store expr, not dtype to track data in
sets"), set->data is allocated for object maps in set_evaluate(), all
other map types have set->data initialized by the parser already,
set_evaluate() also checks that.
Drop the confusing check, later in the function set->data is
dereferenced unconditionally.
Fixes: 343a51702656a ("src: store expr, not dtype to track data in sets")
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a minimum base that all our sources will end up needing. This
is what <nft.h> provides.
Add <stdbool.h> and <stdint.h> there. It's unlikely that we want to
implement anything, without having "bool" and "uint32_t" types
available.
Yes, this means the internal headers are not self-contained, with
respect to what <nft.h> provides. This is the exception to the rule, and
our internal headers should rely to have <nft.h> included for them.
They should not include <nft.h> themselves, because <nft.h> needs always
be included as first. So when an internal header would include <nft.h>
it would be unnecessary, because the header is *always* included
already.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
<config.h> is generated by the configure script. As it contains our
feature detection, it want to use it everywhere.
Likewise, in some of our sources, we define _GNU_SOURCE. This defines
the C variant we want to use. Such a define need to come before anything
else, and it would be confusing if different source files adhere to a
different C variant. It would be good to use autoconf's
AC_USE_SYSTEM_EXTENSIONS, in which case we would also need to ensure
that <config.h> is always included as first.
Instead of going through all source files and include <config.h> as
first, add a new header "include/nft.h", which is supposed to be
included in all our sources (and as first).
This will also allow us later to prepare some common base, like include
<stdbool.h> everywhere.
We aim that headers are self-contained, so that they can be included in
any order. Which, by the way, already didn't work because some headers
define _GNU_SOURCE, which would only work if the header gets included as
first. <nft.h> is however an exception to the rule: everything we compile
shall rely on having <nft.h> header included as first. This applies to
source files (which explicitly include <nft.h>) and to internal header
files (which are only compiled indirectly, by being included from a source
file).
Note that <config.h> has no include guards, which is at least ugly to
include multiple times. It doesn't cause problems in practice, because
it only contains defines and the compiler doesn't warn about redefining
a macro with the same value. Still, <nft.h> also ensures to include
<config.h> exactly once.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getaddrinfo() blocks while trying to resolve the name. Blocking the
caller of the library is in many cases undesirable. Also, while
reconfiguring the firewall, it's not clear that resolving names via
the network will work or makes sense.
Add a new input flag NFT_CTX_INPUT_NO_DNS to opt-out from getaddrinfo()
and only accept plain IP addresses.
We could also use AI_NUMERICHOST with getaddrinfo() instead of
inet_pton(). By parsing via inet_pton(), we are better aware of
what we expect and can generate a better error message in case of
failure.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Reviewed-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One of the problems with meters is that they use the set/map
infrastructure behind the scenes which might be confusing to users.
This patch errors out in case user declares a meter whose name overlaps
with an existing set/map:
meter.nft:15:18-91: Error: File exists; meter ‘syn4-meter’ overlaps an existing set ‘syn4-meter’ in family inet
tcp dport 22 meter syn4-meter { ip saddr . tcp dport timeout 5m limit rate 20/minute } counter accept
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An old 5.10 kernel bails out simply with EEXIST, with this patch a
better hint is provided.
Dynamic sets are preferred over meters these days.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
Just like "ct timeout", "ct expectation" is in need of the same fix,
we get segfault on "nft list ct expectation table t", if table t exists.
This is the exact same pattern as resolved for "ct timeout" in commit
1d2e22fc0521 ("ct timeout: fix 'list object x' vs. 'list objects in table' confusion").
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
| |
All these are used to reset state in set/map elements, i.e. reset the
timeout or zero quota and counter values.
While 'reset element' expects a (list of) elements to be specified which
should be reset, 'reset set/map' will reset all elements in the given
set/map.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
| |
Evaluation phase checks the given table and set exist in cache. Relieve
execution phase from having to perform the lookup again by storing the
set reference in cmd->set. Just have to increase the ref counter so
cmd_free() does the right thing (which lacked handling of MAP and METER
objects for some reason).
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
| |
The code for set, map and meter were almost identical apart from the
specific last check. Fold them together and make the distinction in that
spot only.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For bitfield that spans more than one byte, such as ip6 dscp, byteorder
conversion needs to be done before rshift. Add unary expression for this
conversion only in the case of meta and ct statements.
Before this patch:
# nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
ip6 x y
[ payload load 2b @ network header + 0 => reg 1 ]
[ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
[ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
[ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <--------- incorrect
[ meta set mark with reg 1 ]
After this patch:
# nft --debug=netlink add rule ip6 x y 'meta mark set ip6 dscp'
ip6 x y
[ payload load 2b @ network header + 0 => reg 1 ]
[ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
[ byteorder reg 1 = ntoh(reg 1, 2, 2) ] <-------- correct
[ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
[ meta set mark with reg 1 ]
For the matching case, binary transfer already deals with the rshift to
adjust left and right hand side of the expression, the unary conversion
is not needed in such case.
Fixes: 8221d86e616b ("tests: py: add test-cases for ct and packet mark payload expressions")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
<empty ruleset>
$ nft list ct timeout table t
Error: No such file or directory
list ct timeout table t
^
This is expected to list all 'ct timeout' objects.
The failure is correct, the table 't' does not exist.
But now lets add one:
$ nft add table t
$ nft list ct timeout table t
Segmentation fault (core dumped)
... and thats not expected, nothing should be shown
and nft should exit normally.
Because of missing TIMEOUTS command enum, the backend thinks
it should do an object lookup, but as frontend asked for
'list of objects' rather than 'show this object',
handle.obj.name is NULL, which then results in this crash.
Update the command enums so that backend knows what the
frontend asked for.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
| |
Before:
nft: evaluate.c:1849: __mapping_expr_expand: Assertion `i->etype == EXPR_MAPPING' failed.
after:
Error: expected mapping, not set element
snat ip prefix to ip saddr map { 10.141.11.0/24 : 192.168.2.0/24, 10.141.12.1 }
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
| |
Iptables supports the matching of DCCP packets based on the presence
or absence of DCCP options. Extend exthdr expressions to add this
functionality to nftables.
Link: https://bugzilla.netfilter.org/show_bug.cgi?id=930
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Something like:
Given: set s { type ipv4_addr . ipv4_addr . inet_service .. } something
like
add rule ip saddr . 1.2.3.4 . 80 @s goto c1
fails with: "Error: Can't parse symbolic invalid expressions".
This fails because the relational expression first evaluates
the left hand side, so when concat evaluation sees '1.2.3.4'
no key context is available.
Check if the RHS is a set reference, and, if so, evaluate
the right hand side.
This sets a pointer to the set key in the evaluation context
structure which then makes the concat evaluation step parse
1.2.3.4 and 80 as ipv4 address and 16bit port number.
On delinearization, extend relop postprocessing to
copy the datatype from the rhs (set reference, has
proper datatype according to set->key) to the lhs (concat
expression).
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nft reports EEXIST when reading an existing set whose NFT_SET_EVAL has
been previously inferred from the ruleset.
# cat test.nft
table ip test {
set dlist {
type ipv4_addr
size 65535
}
chain output {
type filter hook output priority filter; policy accept;
udp dport 1234 update @dlist { ip daddr } counter packets 0 bytes 0
}
}
# nft -f test.nft
# nft -f test.nft
test.nft:2:6-10: Error: Could not process rule: File exists
set dlist {
^^^^^
Phil Sutter says:
In the first call, the set lacking 'dynamic' flag does not exist
and is therefore added to the cache. Consequently, both the 'add set'
command and the set statement point at the same set object. In the
second call, a set with same name exists already, so the object created
for 'add set' command is not added to cache and consequently not updated
with the missing flag. The kernel thus rejects the NEWSET request as the
existing set differs from the new one.
Set on the NFT_SET_EVAL flag if the existing set sets it on.
Fixes: 8d443adfcc8c1 ("evaluate: attempt to set_eval flag if dynamic updates requested")
Tested-by: Eric Garver <eric@garver.life>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
fee6bda06403 ("evaluate: remove anon sets with exactly one element")
introduces an optimization to remove use of sets with single element.
Skip this optimization if set element contains stateful statements.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Evaluation fails to accept stateful statements in verdict maps, relax
the following check for anonymous sets:
test.nft:4:29-35: Error: missing statement in map declaration
ip saddr vmap { 127.0.0.1 counter : drop, * counter : accept }
^^^^^^^
The existing code generates correctly the counter in the anonymous
verdict map.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If user forgets to specify the hook and priority and the flowtable does
not exist, then bail out:
# cat flowtable-incomplete.nft
table t {
flowtable f {
devices = { lo }
}
}
# nft -f /tmp/k
flowtable-incomplete.nft:2:12-12: Error: missing hook and priority in flowtable declaration
flowtable f {
^
Update one existing tests/shell to specify a hook and priority.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows you to add/remove devices to an existing chain:
# cat ruleset.nft
table netdev x {
chain y {
type filter hook ingress devices = { eth0 } priority 0; policy accept;
}
}
# nft -f ruleset.nft
# nft add chain netdev x y '{ devices = { eth1 }; }'
# nft list ruleset
table netdev x {
chain y {
type filter hook ingress devices = { eth0, eth1 } priority 0; policy accept;
}
}
# nft delete chain netdev x y '{ devices = { eth0 }; }'
# nft list ruleset
table netdev x {
chain y {
type filter hook ingress devices = { eth1 } priority 0; policy accept;
}
}
This feature allows for creating an empty netdev chain, with no devices.
In such case, no packets are seen until a device is registered.
This patch includes extended netlink error reporting:
# nft add chain netdev x y '{ devices = { x } ; }'
Error: Could not process rule: No such file or directory
add chain netdev x y { devices = { x } ; }
^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Users have to specify a transport protocol match such as
meta l4proto tcp
before the redirect statement, even if the redirect statement already
implicitly refers to the transport protocol, for instance:
test.nft:3:16-53: Error: transport protocol mapping is only valid after transport protocol match
redirect to :tcp dport map { 83 : 8083, 84 : 8084 }
~~~~~~~~ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Evaluate the redirect expression before the mandatory check for the
transport protocol match, so protocol context already provides a
transport protocol.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Get length from statement, instead infering it from the expression that
is used to set the value. In the particular case of {ct|meta} mark, this
is 32 bits.
Otherwise, bytecode generation is not correct:
# nft -c --debug=netlink 'add rule ip6 x y ct mark set ip6 dscp << 2 | 0x10'
[ payload load 2b @ network header + 0 => reg 1 ]
[ bitwise reg 1 = ( reg 1 & 0x0000c00f ) ^ 0x00000000 ]
[ bitwise reg 1 = ( reg 1 >> 0x00000006 ) ]
[ byteorder reg 1 = ntoh(reg 1, 2, 1) ]
[ bitwise reg 1 = ( reg 1 << 0x00000002 ) ]
[ bitwise reg 1 = ( reg 1 & 0x00000fef ) ^ 0x00000010 ] <--- incorrect!
[ ct set mark with reg 1 ]
the previous bitwise shift already upgraded to 32-bits (not visible from
the netlink debug output above).
After this patch, the last | 0x10 uses 32-bits:
[ bitwise reg 1 = ( reg 1 & 0xffffffef ) ^ 0x00000010 ]
note that mask 0xffffffef is used instead of 0x00000fef.
Patch ("evaluate: support shifts larger than the width of the left operand")
provides the statement length through eval context. Use it to evaluate the
bitwise expression accordingly, otherwise bytecode is incorrect:
# nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000'
ip x y
[ payload load 1b @ network header + 1 => reg 1 ]
[ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
[ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
[ bitwise reg 1 = ( reg 1 & 0x1e000000 ) ^ 0x000000ff ] <-- incorrect byteorder for OR
[ byteorder reg 1 = ntoh(reg 1, 4, 4) ] <-- no needed for single ip dscp byte
[ ct set mark with reg 1 ]
Correct bytecode:
# nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000
ip x y
[ payload load 1b @ network header + 1 => reg 1 ]
[ bitwise reg 1 = ( reg 1 & 0x000000fc ) ^ 0x00000000 ]
[ bitwise reg 1 = ( reg 1 >> 0x00000002 ) ]
[ bitwise reg 1 = ( reg 1 & 0x0000001e ) ^ 0xff000000 ]
[ ct set mark with reg 1 ]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, bogus error is reported:
# nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1 | 0xff000000'
Error: Value 4278190080 exceeds valid range 0-63
add rule ip x y ct mark set ip dscp & 0x0f << 1 | 0xff000000
^^^^^^^^^^
Use the statement length as the maximum value in the mark statement
expression.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
Otherwise expr_evaluate_value() fails with invalid datatype:
# nft --debug=netlink add rule ip x y 'ct mark set ip dscp & 0x0f << 1'
BUG: invalid basetype invalid
nft: evaluate.c:440: expr_evaluate_value: Assertion `0' failed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to be able to set ct and meta marks to values derived from
payload expressions, we need to relax the requirement that the type of
the statement argument must match that of the statement key. Instead,
we require that the base-type of the argument is integer and that the
argument is small enough to fit.
Moreover, swap expression byteorder before to make it compatible with
the statement byteorder, to ensure rulesets are portable.
# nft --debug=netlink add rule ip t c 'meta mark set ip saddr'
ip t c
[ payload load 4b @ network header + 12 => reg 1 ]
[ byteorder reg 1 = ntoh(reg 1, 4, 4) ] <----------- byteorder swap
[ meta set mark with reg 1 ]
Based on original work from Jeremy Sowden.
The following patches are required for this to work:
evaluate: get length from statement instead of lhs expression
evaluate: don't eval unary arguments
evaluate: support shifts larger than the width of the left operand
netlink_delinearize: correct type and byte-order of shifts
evaluate: insert byte-order conversions for expressions between 9 and 15 bits
Add one testcase for tests/py.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a unary expression is inserted to implement a byte-order
conversion, the expression being converted has already been evaluated
and so `expr_evaluate_unary` doesn't need to do so.
This is required by {ct|meta} statements with bitwise operations, which
might result in byteorder conversion of the expression.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we want to left-shift a value of narrower type and assign the result
to a variable of a wider type, we are constrained to only shifting up to
the width of the narrower type. Thus:
add rule t c meta mark set ip dscp << 2
works, but:
add rule t c meta mark set ip dscp << 8
does not, even though the lvalue is large enough to accommodate the
result.
Upgrade the maximum length based on the statement datatype length, which
is provided via context, if it is larger than expression lvalue.
Update netlink_delinearize.c to handle the case where the length of a
shift expression does not match that of its left-hand operand.
Based on patch from Jeremy Sowden.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Round up expression lengths when determining whether to insert a
byte-order conversion. For example, if one is masking a network header
which spans a byte boundary, the mask will span two bytes and so it will
need to be in NBO.
Fixes: bb03cbcd18a1 ("evaluate: no need to swap byte-order for values of fewer than 16 bits.")
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reverts eab3eb7f146c ("evaluate: relax type-checking for
integer arguments in mark statements") since it might cause ruleset
portability issues when moving a ruleset from little to big endian
host (and vice-versa).
Let's revert this until we agree on what to do in this case.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8c75d3a16960 ("Reject invalid chain priority values in user space")
provides error reporting from the evaluation phase. Instead, this patch
infers the error after the kernel reports EOPNOTSUPP.
test.nft:3:28-40: Error: Chains of type "nat" must have a priority value above -200
type nat hook prerouting priority -300;
^^^^^^^^^^^^^
This patch also adds another common issue for users compiling their own
kernels if they forget to enable CONFIG_NFT_NAT in their .config file.
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
The kernel doesn't accept nat type chains with a priority of -200 or
below. Catch this and provide a better error message than the kernel's
EOPNOTSUPP.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new statement allows you to know how long ago there was a matching
packet.
# nft list ruleset
table ip x {
chain y {
[...]
ip protocol icmp last used 49m54s884ms counter packets 1 bytes 64
}
}
if this statement never sees a packet, then the listing says:
ip protocol icmp last used never counter packets 0 bytes 0
Add tests/py in this patch too.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the data in the mapping contains a range, then upgrade value to range.
Otherwise, the following error is displayed:
/dev/stdin:11:57-75: Error: Could not process rule: Invalid argument
dnat ip to iifname . ip saddr map { enp2s0 . 10.1.1.136 : 1.1.2.69, enp2s0 . 10.1.1.1-10.1.1.135 : 1.1.2.66-1.84.236.78 }
^^^^^^^^^^^^^^^^^^^
The kernel rejects this command because userspace sends a single value
while the kernel expects the range that represents the min and the max
IP address to be used for NAT. The upgrade is also done when concatenation
with intervals is used in the rhs of the mapping.
For anonymous sets, expansion cannot be done from expr_evaluate_mapping()
because the EXPR_F_INTERVAL flag is inferred from the elements. For
explicit sets, this can be done from expr_evaluate_mapping() because the
user already specifies the interval flag in the rhs of the map definition.
Update tests/shell and tests/py to improve testing coverage in this case.
Fixes: 9599d9d25a6b ("src: NAT support for intervals in maps")
Fixes: 66746e7dedeb ("src: support for nat with interval concatenation")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The nested syntax notation results in one single table command which
includes all other objects. This differs from the flat notation where
there is usually one command per object.
This patch adds a previous step to the evaluation phase to expand the
objects that are contained in the table into independent commands, so
both notations have similar representations.
Remove the code to evaluate the nested representation in the evaluation
phase since commands are independently evaluated after the expansion.
The commands are expanded after the set element collapse step, in case
that there is a long list of singleton element commands to be added to
the set, to shorten the command list iteration.
This approach also avoids interference with the object cache that is
populated in the evaluation, which might refer to objects coming in the
existing command list that is being processed.
There is still a post_expand phase to detach the elements from the set
which could be consolidated by updating the evaluation step to handle
the CMD_OBJ_SETELEMS command type.
This patch fixes 27c753e4a8d4 ("rule: expand standalone chain that
contains rules") which broke rule addition/insertion by index because
the expansion code after the evaluation messes up the cache.
Fixes: 27c753e4a8d4 ("rule: expand standalone chain that contains rules")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
If the key in the nat mapping is either ip or ip6, then set the nat
family accordingly, no need for explicit family in the nat statement.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Print error message in case family cannot be inferred, before this
patch, $? shows 1 after nft execution but no error message was printed.
While at it, update error reporting for consistency in similar use
cases.
Fixes: e5c9c8fe0bcc ("evaluate: stmt_evaluate_nat_map() only if stmt->nat.ipportmap == true")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to be able to set ct and meta marks to values derived from
payload expressions, we need to relax the requirement that the type of
the statement argument must match that of the statement key. Instead,
we require that the base-type of the argument is integer and that the
argument is small enough to fit.
Add one testcase for tests/py.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"destroy" command performs a deletion as "delete" command but does not fail
if the object does not exist. As there is no NLM_F_* flag for ignoring such
error, it needs to be ignored directly on error handling.
Example of use:
# nft list ruleset
table ip filter {
chain output {
}
}
# nft destroy table ip missingtable
# echo $?
0
# nft list ruleset
table ip filter {
chain output {
}
}
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eric reports that nft asserts when using integer basetype constants with
'typeof' sets. Example:
table netdev t {
set s {
typeof ether saddr . vlan id
flags dynamic,timeout
}
chain c { }
}
loads fine. But adding a rule with add/update statement fails:
nft 'add rule netdev t c set update ether saddr . 0 @s'
nft: netlink_linearize.c:867: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed.
When the 'ether saddr . 0' concat expression is processed, there is
no set definition available anymore to deduce the required size of the
integer constant.
nft eval step then derives the required length using the data types.
'0' has integer basetype, so the deduced length is 0.
The assertion triggers because serialization step finds that it
needs one more register.
2 are needed to store the ethernet address, another register is
needed for the vlan id.
Update eval step to make the expression context store the set key
information when processing the preceeding set reference, then
let stmt_evaluate_set() preserve the existing context instead of
zeroing it again via stmt_evaluate_arg().
This makes concat expression evaluation compute the total size
needed based on the sets key definition.
Reported-by: Eric Garver <eric@garver.life>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
| |
Reset rule counters and quotas in kernel, i.e. without having to reload
them. Requires respective kernel patch to support NFT_MSG_GETRULE_RESET
message type.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GRE has a number of fields that are conditional based on flags,
which requires custom dependency code similar to icmp and icmpv6.
Matching on optional fields is not supported at this stage.
Since this is a layer 3 tunnel protocol, an implicit dependency on
NFT_META_L4PROTO for IPPROTO_GRE is generated. To achieve this, this
patch adds new infrastructure to remove an outer dependency based on
the inner protocol from delinearize path.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For easier debugging, add decoration on protocol context:
# nft --debug=proto-ctx add rule netdev x y udp dport 4789 vxlan ip protocol icmp counter
update link layer protocol context (inner):
link layer : netdev <-
network layer : none
transport layer : none
payload data : none
update network layer protocol context (inner):
link layer : netdev
network layer : ip <-
transport layer : none
payload data : none
update network layer protocol context (inner):
link layer : netdev
network layer : ip <-
transport layer : none
payload data : none
update transport layer protocol context (inner):
link layer : netdev
network layer : ip
transport layer : icmp <-
payload data : none
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the initial infrastructure to support for inner header
tunnel matching and its first user: vxlan.
A new struct proto_desc field for payload and meta expression to specify
that the expression refers to inner header matching is used.
The existing codebase to generate bytecode is fully reused, allowing for
reusing existing supported layer 2, 3 and 4 protocols.
Syntax requires to specify vxlan before the inner protocol field:
... vxlan ip protocol udp
... vxlan ip saddr 1.2.3.0/24
This also works with concatenations and anonymous sets, eg.
... vxlan ip saddr . vxlan ip daddr { 1.2.3.4 . 4.3.2.1 }
You have to restrict vxlan matching to udp traffic, otherwise it
complains on missing transport protocol dependency, e.g.
... udp dport 4789 vxlan ip daddr 1.2.3.4
The bytecode that is generated uses the new inner expression:
# nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
netdev x y
[ meta load l4proto => reg 1 ]
[ cmp eq reg 1 0x00000011 ]
[ payload load 2b @ transport header + 2 => reg 1 ]
[ cmp eq reg 1 0x0000b512 ]
[ inner type 1 hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
[ cmp eq reg 1 0x00000008 ]
[ inner type 1 hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
[ cmp eq reg 1 0x04030201 ]
JSON support is not included in this patch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Add eval_proto_ctx() to access protocol context (struct proto_ctx).
Rename struct proto_ctx field to _pctx to highlight that this field
is internal and the helper function should be used.
This patch comes in preparation for supporting outer and inner
protocol context.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an underflow of the index that iterates over the concatenation:
../include/datatype.h:292:15: runtime error: shift exponent 4294967290 is too large for 32-bit type 'unsigned int'
set the datatype to invalid which is fine to evaluate a concatenation
in a set/map statement.
Update b8e1940aa190 ("tests: add a test case for map update from packet
path with concat") so it does not need a workaround to work.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Map updates can use timeouts, just like with sets, but the
linearization step did not pass this info to the kernel.
meta l4proto tcp update @pinned { ip saddr . ct original proto-src timeout 90s : ip daddr . tcp dport
Listing this won't show the "timeout 90s" because kernel never saw it to
begin with.
Also update evaluation step to reject a timeout that was set on
the data part: Timeouts are only allowed for the key-value pair
as a whole.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set pointer to list of expression to NULL and check that it is set on
before using it.
In function ‘expr_evaluate_concat’,
inlined from ‘expr_evaluate’ at evaluate.c:2488:10:
evaluate.c:1338:20: warning: ‘expressions’ may be used uninitialized [-Wmaybe-uninitialized]
1338 | if (runaway) {
| ^
evaluate.c: In function ‘expr_evaluate’:
evaluate.c:1321:33: note: ‘expressions’ was declared here
1321 | const struct list_head *expressions;
| ^~~~~~~~~~~
Reported-by: Florian Westphal <fw@strlen.de>
Fixes: 508f3a270531 ("netlink: swap byteorder of value component in concatenation of intervals")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Display error message in case user specifies more data components than
those defined by the concatenation of selectors.
# cat example.nft
table ip x {
chain y {
type filter hook prerouting priority 0; policy drop;
ip saddr . meta mark { 1.2.3.4 . 0x00000100 . 1.2.3.6-1.2.3.8 } accept
}
}
# nft -f example.nft
example.nft:4:3-22: Error: too many concatenation components
ip saddr . meta mark { 1.2.3.4 . 0x00000100 . 1.2.3.6-1.2.3.8 } accept
~~~~~~~~~~~~~~~~~~~~ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Without this patch, nft crashes:
==464771==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60d000000418 at pc 0x7fbc17513aa5 bp 0x7ffc73d33c90 sp 0x7ffc73d33c88
READ of size 8 at 0x60d000000418 thread T0
#0 0x7fbc17513aa4 in expr_evaluate_concat /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1348
#1 0x7fbc1752a9da in expr_evaluate /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:2476
#2 0x7fbc175175e2 in expr_evaluate_set_elem /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1504
#3 0x7fbc1752aa22 in expr_evaluate /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:2482
#4 0x7fbc17512cb5 in list_member_evaluate /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1310
#5 0x7fbc17518ca0 in expr_evaluate_set /home/pablo/devel/scm/git-netfilter/nftables/src/evaluate.c:1590
[...]
Fixes: 64bb3f43bb96 ("src: allow to use typeof of raw expressions in set declaration")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|