| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, ICMP{v4,v6,inet} code datatypes only describe those that are
supported by the reject statement, but they can also be used for icmp
code matching. Moreover, ICMP code types go hand-to-hand with ICMP
types, that is, ICMP code symbols depend on the ICMP type.
Thus, the output of:
nft describe icmp_code
look confusing because that only displays the values that are supported
by the reject statement.
Disentangle this by adding internal datatypes for the reject statement
to handle the ICMP code symbol conversion to value as well as ruleset
listing.
The existing icmp_code, icmpv6_code and icmpx_code remain in place. For
backward compatibility, a parser function is defined in case an existing
ruleset relies on these symbols.
As for the manpage, move existing ICMP code tables from the DATA TYPES
section to the REJECT STATEMENT section, where this really belongs to.
But the icmp_code and icmpv6_code table stubs remain in the DATA TYPES
section because that describe that this is an 8-bit integer field.
After this patch:
# nft describe icmp_code
datatype icmp_code (icmp code) (basetype integer), 8 bits
# nft describe icmpv6_code
datatype icmpv6_code (icmpv6 code) (basetype integer), 8 bits
# nft describe icmpx_code
datatype icmpx_code (icmpx code) (basetype integer), 8 bits
do not display the symbol table of the reject statement anymore.
icmpx_code_type is not used anymore, but keep it in place for backward
compatibility reasons.
And update tests/shell accordingly.
Fixes: 5fdd0b6a0600 ("nft: complete reject support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DTYPE_F_PREFIX flag provides a hint to the netlink delinearize path to
use prefix notation.
It seems use of prefix notation in meta mark causes confusion, users
expect to see prefix in the listing only in IP address datatypes.
Untoggle this flag so (more lengthy) binop output such as:
meta mark & 0xffffff00 == 0xffffff00
is used instead.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1739
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
| |
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inputs:
ip protocol . th dport { tcp / 22, }'
or
th dport . ip protocol { tcp / 22, }'
are not rejected at this time. 'list ruleset' yields:
ip protocol & nft: src/gmputil.c:77: mpz_get_uint8: Assertion `cnt <= 1' failed.
or
th dport & nft: src/gmputil.c:87: mpz_get_be16: Assertion `cnt <= 1' failed.
While this should be caught at input too, the print path should be more
robust, e.g. when there are direct nfnetlink users.
After this patch, the print functions fall back to
'integer_type_print' which can handle large numbers too.
Note that the output printed this way cannot be read back by nft;
it will dump something like:
tcp dport & 18446739675663040512 . ip protocol 0 . 0
but thats better than assert().
v2: same problem exists for service too.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement a symbol_table_print() wrapper for the run-time populated
rt_symbol_tables which formats output similar to expr_describe() and
includes the data source.
Since these tables reside in struct output_ctx there is no implicit
connection between data type and therefore providing callbacks for
relevant datat types which feed the data into said wrapper is a simpler
solution than extending expr_describe() itself.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
| |
It is unconditionally accessed in symbol_table_print() so make sure it
is initialized to either BASE_DECIMAL (arbitrary) for empty or
non-existent source files or a proper value depending on entry number
format.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an ongoing effort among various distributions to tidy up in
/etc. The idea is to reduce contents to just what the admin manually
inserted to customize the system, anything else shall move out to /usr
(or so). The various files in /etc/iproute2 fall in that category as
they are seldomly modified.
The crux is though that iproute2 project seems not quite sure yet where
the files should go. While v6.6.0 installs them into /usr/lib/iproute2,
current mast^Wmain branch uses /usr/share/iproute2. Assume this is going
to stay as /(usr/)lib does not seem right for such files.
Note that rt_symbol_table_init() is not just used for
iproute2-maintained configs but also for connlabel.conf - so retain the
old behaviour when passed an absolute path.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
xmalloc() (and similar x-functions) are used for allocation. They wrap
malloc()/realloc() but will abort the program on ENOMEM.
The meaning of xmalloc() is that it wraps malloc() but aborts on
failure. I don't think x-functions should have the notion, that this
were potentially a different memory allocator that must be paired
with a particular xfree().
Even if the original intent was that the allocator is abstracted (and
possibly not backed by standard malloc()/free()), then that doesn't seem
a good idea. Nowadays libc allocators are pretty good, and we would need
a very special use cases to switch to something else. In other words,
it will never happen that xmalloc() is not backed by malloc().
Also there were a few places, where a xmalloc() was already "wrongly"
paired with free() (for example, iface_cache_release(), exit_cookie(),
nft_run_cmd_from_buffer()).
Or note how pid2name() returns an allocated string from fscanf(), which
needs to be freed with free() (and not xfree()). This requirement
bubbles up the callers portid2name() and name_by_portid(). This case was
actually handled correctly and the buffer was freed with free(). But it
shows that mixing different allocators is cumbersome to get right. Of
course, we don't actually have different allocators and whether to use
free() or xfree() makes no different. The point is that xfree() serves
no actual purpose except raising irrelevant questions about whether
x-functions are correctly paired with xfree().
Note that xfree() also used to accept const pointers. It is bad to
unconditionally for all deallocations. Instead prefer to use plain
free(). To free a const pointer use free_const() which obviously wraps
free, as indicated by the name.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Almost everywhere xmalloc() and friends is used instead of malloc().
This is almost everywhere paired with xfree().
xfree() has two problems. First, it brings the wrong notion that
xmalloc() should be paired with xfree(), as if xmalloc() would not use
the plain malloc() allocator. In practices, xfree() just wraps free(),
and it wouldn't make sense any other way. xfree() should go away. This
will be addressed in the next commit.
The problem addressed by this commit is that xfree() accepts a const
pointer. Paired with the practice of almost always using xfree() instead
of free(), all our calls to xfree() cast away constness of the pointer,
regardless whether that is necessary. Declaring a pointer as const
should help us to catch wrong uses. If the xfree() function always casts
aways const, the compiler doesn't help.
There are many places that rightly cast away const during free. But not
all of them. Add a free_const() macro, which is like free(), but accepts
const pointers. We should always make an intentional choice whether to
use free() or free_const(). Having a free_const() macro makes this very
common choice clearer, instead of adding a (void*) cast at many places.
Note that we now pair xmalloc() allocations with a free() call (instead
of xfree(). That inconsistency will be resolved in the next commit.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
The caller is supposed to free the allocated string. Return a non-const
string to make that clearer.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
The returned memory will be initialized. No need to zero it first. Use
xmalloc() instead of xzalloc().
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
<string.h> provides strcmp(), as such it's very basic and used
everywhere.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
"struct datatype" is for the most part immutable, and most callers deal
with const pointers. That's why datatype_get() accepts a const pointer
to increase the reference count (mutating the refcnt field).
It should also return a const pointer. In fact, all callers are fine
with that already.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Use the enum types as we have them.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Matching on ct event makes no sense since this is mostly used as
statement to globally filter out ctnetlink events, but do not crash
if it is used from concatenations.
Add the missing slot in the datatype array so this does not crash.
Fixes: 2595b9ad6840 ("ct: add conntrack event mask support")
Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, ct label with concatenations such as:
table ip x {
chain y {
ct label . ct mark { 0x1 . 0x1 }
}
}
crashes:
../include/datatype.h:196:11: runtime error: member access within null pointer of type 'const struct datatype'
AddressSanitizer:DEADLYSIGNAL
=================================================================
==640948==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fc970d3199b bp 0x7fffd1f20560 sp 0x7fffd1f20540 T0)
==640948==The signal is caused by a READ memory access.
==640948==Hint: address points to the zero page.
sudo #0 0x7fc970d3199b in datatype_equal ../include/datatype.h:196
Fixes: 2fcce8b0677b ("ct: connlabel matching support")
Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test `./tests/shell/run-tests.sh -V tests/shell/testcases/maps/nat_addr_port`
fails:
==118== 195 (112 direct, 83 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
==118== at 0x484682C: calloc (vg_replace_malloc.c:1554)
==118== by 0x48A39DD: xmalloc (utils.c:37)
==118== by 0x48A39DD: xzalloc (utils.c:76)
==118== by 0x487BDFD: datatype_alloc (datatype.c:1205)
==118== by 0x487BDFD: concat_type_alloc (datatype.c:1288)
==118== by 0x488229D: stmt_evaluate_nat_map (evaluate.c:3786)
==118== by 0x488229D: stmt_evaluate_nat (evaluate.c:3892)
==118== by 0x488229D: stmt_evaluate (evaluate.c:4450)
==118== by 0x488328E: rule_evaluate (evaluate.c:4956)
==118== by 0x48ADC71: nft_evaluate (libnftables.c:552)
==118== by 0x48AEC29: nft_run_cmd_from_buffer (libnftables.c:595)
==118== by 0x402983: main (main.c:534)
I think the reference handling for datatype is wrong. It was introduced
by commit 01a13882bb59 ('src: add reference counter for dynamic
datatypes').
We don't notice it most of the time, because instances are statically
allocated, where datatype_get()/datatype_free() is a NOP.
Fix and rework.
- Commit 01a13882bb59 comments "The reference counter of any newly
allocated datatype is set to zero". That seems not workable.
Previously, functions like datatype_clone() would have returned the
refcnt set to zero. Some callers would then then set the refcnt to one, but
some wouldn't (set_datatype_alloc()). Calling datatype_free() with a
refcnt of zero will overflow to UINT_MAX and leak:
if (--dtype->refcnt > 0)
return;
While there could be schemes with such asymmetric counting that juggle the
appropriate number of datatype_get() and datatype_free() calls, this is
confusing and error prone. The common pattern is that every
alloc/clone/get/ref is paired with exactly one unref/free.
Let datatype_clone() return references with refcnt set 1 and in
general be always clear about where we transfer ownership (take a
reference) and where we need to release it.
- set_datatype_alloc() needs to consistently return ownership to the
reference. Previously, some code paths would and others wouldn't.
- Replace
datatype_set(key, set_datatype_alloc(dtype, key->byteorder))
with a __datatype_set() with takes ownership.
Fixes: 01a13882bb59 ('src: add reference counter for dynamic datatypes')
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It provides malloc()/free(), which is so basic that we need it
everywhere. Include via <nft.h>.
The ultimate purpose is to define more things in <nft.h>. While it has
not corresponding C sources, <nft.h> can contain macros and static
inline functions, and is a good place for things that we shall have
everywhere. Since <stdlib.h> provides malloc()/free() and size_t, that
is a very basic dependency, that will be needed for that.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The struct is called "datatype" and related functions have the fitting
"datatype_" prefix. Rename.
Also rename the internal "dtype_alloc()" to "datatype_alloc()".
This is a follow up to commit 01a13882bb59 ('src: add reference counter
for dynamic datatypes'), which started adding "datatype_*()" functions.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getaddrinfo()
With CC=clang we get
datatype.c:625:11: error: cast from 'struct sockaddr *' to 'struct sockaddr_in *' increases required alignment from 2 to 4 [-Werror,-Wcast-align]
addr = ((struct sockaddr_in *)ai->ai_addr)->sin_addr;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
datatype.c:690:11: error: cast from 'struct sockaddr *' to 'struct sockaddr_in6 *' increases required alignment from 2 to 4 [-Werror,-Wcast-align]
addr = ((struct sockaddr_in6 *)ai->ai_addr)->sin6_addr;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
datatype.c:826:11: error: cast from 'struct sockaddr *' to 'struct sockaddr_in *' increases required alignment from 2 to 4 [-Werror,-Wcast-align]
port = ((struct sockaddr_in *)ai->ai_addr)->sin_port;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix that by casting to (void*) first. Also, add an assertion that the
type is as expected.
For inet_service_type_parse(), differentiate between AF_INET and
AF_INET6. It might not have been a problem in practice, because the
struct offsets of sin_port/sin6_port are identical.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
<config.h> is generated by the configure script. As it contains our
feature detection, it want to use it everywhere.
Likewise, in some of our sources, we define _GNU_SOURCE. This defines
the C variant we want to use. Such a define need to come before anything
else, and it would be confusing if different source files adhere to a
different C variant. It would be good to use autoconf's
AC_USE_SYSTEM_EXTENSIONS, in which case we would also need to ensure
that <config.h> is always included as first.
Instead of going through all source files and include <config.h> as
first, add a new header "include/nft.h", which is supposed to be
included in all our sources (and as first).
This will also allow us later to prepare some common base, like include
<stdbool.h> everywhere.
We aim that headers are self-contained, so that they can be included in
any order. Which, by the way, already didn't work because some headers
define _GNU_SOURCE, which would only work if the header gets included as
first. <nft.h> is however an exception to the rule: everything we compile
shall rely on having <nft.h> header included as first. This applies to
source files (which explicitly include <nft.h>) and to internal header
files (which are only compiled indirectly, by being included from a source
file).
Note that <config.h> has no include guards, which is at least ugly to
include multiple times. It doesn't cause problems in practice, because
it only contains defines and the compiler doesn't warn about redefining
a macro with the same value. Still, <nft.h> also ensures to include
<config.h> exactly once.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getaddrinfo() blocks while trying to resolve the name. Blocking the
caller of the library is in many cases undesirable. Also, while
reconfiguring the firewall, it's not clear that resolving names via
the network will work or makes sense.
Add a new input flag NFT_CTX_INPUT_NO_DNS to opt-out from getaddrinfo()
and only accept plain IP addresses.
We could also use AI_NUMERICHOST with getaddrinfo() instead of
inet_pton(). By parsing via inet_pton(), we are better aware of
what we expect and can generate a better error message in case of
failure.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Reviewed-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getservbyport_r()
We should aim to use the thread-safe variants of getprotoby{name,number}
and getservbyport(). However, they may not be available with other libc,
so it requires a configure check. As that is cumbersome, add wrappers
that do that at one place.
These wrappers are thread-safe, if libc provides the reentrant versions.
Use them.
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If user provides a symbol that cannot be parsed and the datatype provides
an error handler, provide a hint through the misspell infrastructure.
For instance:
# cat test.nft
table ip x {
map y {
typeof ip saddr : verdict
elements = { 1.2.3.4 : filter_server1 }
}
}
# nft -f test.nft
test.nft:4:26-39: Error: Could not parse netfilter verdict; did you mean `jump filter_server1'?
elements = { 1.2.3.4 : filter_server1 }
^^^^^^^^^^^^^^
While at it, normalize error to "Could not parse symbolic %s expression".
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some datatypes provide a symbol table that is parsed as an integer.
Improve error reporting by using the misspell infrastructure, to provide
a hint to the user, whenever possible.
If base datatype, usually the integer datatype, fails to parse the
symbol, then try a fuzzy match on the symbol table to provide a hint
in case the user has mistype it.
For instance:
test.nft:3:11-14: Error: Could not parse Differentiated Services Code Point expression; did you you mean `cs0`?
ip dscp ccs0
^^^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In verdict map, string values are accidentally treated as verdicts.
For example:
table t {
map foo {
type ipv4_addr : verdict
elements = {
192.168.0.1 : bar
}
}
chain output {
type filter hook output priority mangle;
ip daddr vmap @foo
}
}
Though "bar" is not a valid verdict (should be "jump bar" or something),
the string is taken as the element value. Then NFTA_DATA_VALUE is sent
to the kernel instead of NFTA_DATA_VERDICT. This would be rejected by
recent kernels. On older ones (e.g. v5.4.x) that don't validate the
type, a warning can be seen when the rule is hit, because of the
corrupted verdict value:
[5120263.467627] WARNING: CPU: 12 PID: 303303 at net/netfilter/nf_tables_core.c:229 nft_do_chain+0x394/0x500 [nf_tables]
Indeed, we don't parse verdicts during evaluation, but only chain names,
which is of type string rather than verdict. For example, "jump $var" is
a verdict while "$var" is a string.
Fixes: c64457cff967 ("src: Allow goto and jump to a variable")
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the dynamic datatype to allocate an instance of TYPE_INTEGER and set
length and byteorder. Add missing information to the set userdata area
for raw payload expressions which allows to rebuild the set typeof from
the listing path.
A few examples:
- With anonymous sets:
nft add rule x y ip saddr . @ih,32,32 { 1.1.1.1 . 0x14, 2.2.2.2 . 0x1e }
- With named sets:
table x {
set y {
typeof ip saddr . @ih,32,32
elements = { 1.1.1.1 . 0x14 }
}
}
Incremental updates are also supported, eg.
nft add element x y { 3.3.3.3 . 0x28 }
expr_evaluate_concat() is used to evaluate both set key definitions
and set key values, using two different function might help to simplify
this code in the future.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Used by 'ct expiration', time_type is supposed to be 32bits. Passing a
64bits variable to constant_expr_alloc() causes the value to be always
zero on Big Endian.
Fixes: 0974fa84f162a ("datatype: seperate time parsing/printing from time_type")
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
| |
Add an alias of the integer type to print raw payload expressions in
hexadecimal.
Update tests/py.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Honor NFT_CTX_OUTPUT_NUMERIC_TIME.
# nft list ruleset
table ip x {
set y {
type ipv4_addr
flags timeout
elements = { 1.1.1.1 timeout 5m expires 1m49s40ms }
}
}
# sudo nft -T list ruleset
table ip x {
set y {
type ipv4_addr
flags timeout
elements = { 1.1.1.1 timeout 300s expires 108s }
}
}
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1561
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cgroupv2 path is expressed from the /sys/fs/cgroup folder, update
listing to skip it.
# nft add rule x y socket cgroupv2 level 1 "user.slice" counter
# nft list ruleset
table ip x {
chain y {
type filter hook input priority filter; policy accept;
socket cgroupv2 level 1 "user.slice" counter
}
}
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the following compilation warnings on x86_32.
datatype.c: In function ‘cgroupv2_type_print’:
datatype.c:1387:22: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
nft_print(octx, "%lu", id);
~~^ ~~
%llu
meta.c: In function ‘date_type_print’:
meta.c:411:21: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
nft_print(octx, "%lu", tstamp);
~~^ ~~~~~~
%llu
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
| |
Add support for matching on the cgroups version 2.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
As an alternative to print the datatype values when no symbol table is
available. Use it to print protocols available via getprotobynumber()
which actually refers to /etc/protocols.
Not very efficient, getprotobynumber() causes a series of open()/close()
calls on /etc/protocols, but this is called from a non-critical path.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1503
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Add expr_chain_export() helper function to convert the chain name that
is stored in a gmp value variable to string.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables to send icmp frag-needed messages using reject target.
I have a bridge with connects an gretap tunnel with some ethernet lan.
On the gretap device I use ignore-df to avoid packets being lost without
icmp reject to the sender of the bridged packet.
Still I want to avoid packet fragmentation with the gretap packets.
So I though about adding an nftables rule like this:
nft insert rule bridge filter FORWARD \
ip protocol tcp \
ip length > 1400 \
ip frag-off & 0x4000 != 0 \
reject with icmp type frag-needed
This would reject all tcp packets with ip dont-fragment bit set that are
bigger than some threshold (here 1400 bytes). The sender would then receive
ICMP unreachable - fragmentation needed and reduce its packet size (as
defined with PMTU).
[ pablo: update tests/py ]
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nft list table bridge t
table bridge t {
set s4 {
typeof ip saddr . ip daddr
elements = { 1.0.0.1 . 2.0.0.2 }
}
}
=================================================================
==24334==ERROR: AddressSanitizer: heap-use-after-free on address 0x6080000000a8 at pc 0x7fe0e67df0ad bp 0x7ffff83e88c0 sp 0x7ffff83e88b8
READ of size 4 at 0x6080000000a8 thread T0
#0 0x7fe0e67df0ac in datatype_free nftables/src/datatype.c:1110
#1 0x7fe0e67e2092 in expr_free nftables/src/expression.c:89
#2 0x7fe0e67a855e in set_free nftables/src/rule.c:359
#3 0x7fe0e67b2f3e in table_free nftables/src/rule.c:1263
#4 0x7fe0e67a70ce in __cache_flush nftables/src/rule.c:299
#5 0x7fe0e67a71c7 in cache_release nftables/src/rule.c:305
#6 0x7fe0e68dbfa9 in nft_ctx_free nftables/src/libnftables.c:292
#7 0x55f00fbe0051 in main nftables/src/main.c:469
#8 0x7fe0e553309a in __libc_start_main ../csu/libc-start.c:308
#9 0x55f00fbdd429 in _start (nftables/src/.libs/nft+0x9429)
0x6080000000a8 is located 8 bytes inside of 96-byte region [0x6080000000a0,0x608000000100)
freed by thread T0 here:
#0 0x7fe0e6e70fb0 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
#1 0x7fe0e68b8122 in xfree nftables/src/utils.c:29
#2 0x7fe0e67df2e5 in datatype_free nftables/src/datatype.c:1117
#3 0x7fe0e67e2092 in expr_free nftables/src/expression.c:89
#4 0x7fe0e67a83fe in set_free nftables/src/rule.c:356
#5 0x7fe0e67b2f3e in table_free nftables/src/rule.c:1263
#6 0x7fe0e67a70ce in __cache_flush nftables/src/rule.c:299
#7 0x7fe0e67a71c7 in cache_release nftables/src/rule.c:305
#8 0x7fe0e68dbfa9 in nft_ctx_free nftables/src/libnftables.c:292
#9 0x55f00fbe0051 in main nftables/src/main.c:469
#10 0x7fe0e553309a in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
#0 0x7fe0e6e71330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
#1 0x7fe0e68b813d in xmalloc nftables/src/utils.c:36
#2 0x7fe0e68b8296 in xzalloc nftables/src/utils.c:65
#3 0x7fe0e67de7d5 in dtype_alloc nftables/src/datatype.c:1065
#4 0x7fe0e67df862 in concat_type_alloc nftables/src/datatype.c:1146
#5 0x7fe0e67ea852 in concat_expr_parse_udata nftables/src/expression.c:954
#6 0x7fe0e685dc94 in set_make_key nftables/src/netlink.c:718
#7 0x7fe0e685e177 in netlink_delinearize_set nftables/src/netlink.c:770
#8 0x7fe0e685f667 in list_set_cb nftables/src/netlink.c:895
#9 0x7fe0e4f95a03 in nftnl_set_list_foreach src/set.c:904
SUMMARY: AddressSanitizer: heap-use-after-free nftables/src/datatype.c:1110 in datatype_free
Shadow bytes around the buggy address:
0x0c107fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c107fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c107fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c107fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c107fff8000: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c107fff8010: fa fa fa fa fd[fd]fd fd fd fd fd fd fd fd fd fd
0x0c107fff8020: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
0x0c107fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c107fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c107fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c107fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==24334==ABORTING
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
| |
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will be needed once we add support for the 'typeof' keyword to
handle maps that could e.g. store 'ct helper' "type" values.
Instead of:
set foo {
type ipv4_addr . mark;
this would allow
set foo {
typeof(ip saddr) . typeof(ct mark);
(exact syntax TBD).
This would be needed to allow sets that store variable-sized data types
(string, integer and the like) that can't be used at at the moment.
Adding special data types for everything is problematic due to the
large amount of different types needed.
For anonymous sets, e.g. "string" can be used because the needed size can
be inferred from the statement, e.g. 'osf name { "Windows", "Linux }',
but in case of named sets that won't work because 'type string' lacks the
context needed to derive the size information.
With 'typeof(osf name)' the context is there, but at the moment it won't
help because the expression is discarded instantly and only the data
type is retained.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
# nft describe ip dscp
payload expression, datatype dscp (Differentiated Services Code Point) (basetype integer), 6 bits
pre-defined symbolic constants (in hexadecimal):
nft: datatype.c:209: switch_byteorder: Assertion `len > 0' failed.
Aborted
Fixes: c89a0801d077 ("datatype: Display pre-defined inet_service values in host byte order")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These keywords introduce new checks for a timestamp, an absolute date (which is converted to a timestamp),
an hour in the day (which is converted to the number of seconds since midnight) and a day of week.
When converting an ISO date (eg. 2019-06-06 17:00) to a timestamp,
we need to substract it the GMT difference in seconds, that is, the value
of the 'tm_gmtoff' field in the tm structure. This is because the kernel
doesn't know about time zones. And hence the kernel manages different timestamps
than those that are advertised in userspace when running, for instance, date +%s.
The same conversion needs to be done when converting hours (e.g 17:00) to seconds since midnight
as well.
The result needs to be computed modulo 86400 in case GMT offset (difference in seconds from UTC)
is negative.
We also introduce a new command line option (-t, --seconds) to show the actual
timestamps when printing the values, rather than the ISO dates, or the hour.
Some usage examples:
time < "2019-06-06 17:00" drop;
time < "2019-06-06 17:20:20" drop;
time < 12341234 drop;
day "Saturday" drop;
day 6 drop;
hour >= 17:00 drop;
hour >= "17:00:01" drop;
hour >= 63000 drop;
We need to convert an ISO date to a timestamp
without taking into account the time zone offset, since comparison will
be done in kernel space and there is no time zone information there.
Overwriting TZ is portable, but will cause problems when parsing a
ruleset that has 'time' and 'hour' rules. Parsing an 'hour' type must
not do time zone conversion, but that will be automatically done if TZ has
been overwritten to UTC.
Hence, we use timegm() to parse the 'time' type, even though it's not portable.
Overwriting TZ seems to be a much worse solution.
Finally, be aware that timestamps are converted to nanoseconds when
transferring to the kernel (as comparison is done with nanosecond
precision), and back to seconds when retrieving them for printing.
We swap left and right values in a range to properly handle
cross-day hour ranges (e.g. 23:15-03:22).
Signed-off-by: Ander Juaristi <a@juaristi.eus>
Reviewed-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
table bla {
chain foo { }
chain bar { jump foo }
}
}
Fails to restore on big-endian platforms:
jump.nft:5:2-9: Error: Could not process rule: No such file or directory
jump foo
nft passes a 0-length name to the kernel.
This is because when we export the value (the string), we provide
the size of the destination buffer.
In earlier versions, the parser allocated the name with the same
fixed size and all was fine.
After the fix, the export places the name in the wrong location
in the destination buffer.
This makes tests/shell/testcases/chains/0001jumps_0 work on s390x.
v2: convert one error check to a BUG(), it should not happen unless
kernel abi is broken.
Fixes: 142350f154c78 ("src: invalid read when importing chain name")
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows you to use variables in chain policy definition, e.g.
define default_policy = "accept"
add table ip foo
add chain ip foo bar {type filter hook input priority filter; policy $default_policy}
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows you to use variables in chain priority definitions,
e.g.
define prio = filter
define prionum = 10
define prioffset = "filter - 150"
add table ip foo
add chain ip foo bar { type filter hook input priority $prio; }
add chain ip foo ber { type filter hook input priority $prionum; }
add chain ip foo bor { type filter hook input priority $prioffset; }
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Store symbol tables in context object instead. Use the nft_ctx object to
store the dynamic symbol table. Pass it on to the parse_ctx object so
this can be accessed from the parse routines. This dynamic symbol table
is also accesible from the output_ctx object for print routines.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
This object stores the dynamic symbol tables that are loaded from files.
Pass this object to datatype parse functions, although this new
parameter is not used yet, this is just a preparation patch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The two rules:
arp operation 1-2 accept
arp operation 256-512 accept
are both shown as 256-512:
chain in_public {
arp operation 256-512 accept
arp operation 256-512 accept
meta mark "1"
tcp flags 2,4
}
This is because range expression enforces numeric output,
yet nft_print doesn't respect byte order.
Behave as if we had no symbol in the first place and call
the base type print function instead.
This means we now respect format specifier as well:
chain in_public {
arp operation 1-2 accept
arp operation 256-512 accept
meta mark 0x00000001
tcp flags 0x2,0x4
}
Without fix, added test case will fail:
'add rule arp test-arp input arp operation 1-2': 'arp operation 1-2' mismatches 'arp operation 256-512'
v2: in case of -n, also elide quotation marks, just as if we would not
have found a symbolic name.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
| |
datatype_set() already deals with this case, remove this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
| |
Clone original flags too.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two datatypes are using runtime datatype allocation:
* Concatenations.
* Integer, that require byteorder adjustment.
From the evaluation / postprocess step, transformations are common,
hence expressions may end up fetching (infering) datatypes from an
existing one.
This patch adds a reference counter to release the dynamic datatype
object when it is shared.
The API includes the following helper functions:
* datatype_set(expr, datatype), to assign a datatype to an expression.
This helper already deals with reference counting for dynamic
datatypes. This also drops the reference counter of any previous
datatype (to deal with the datatype replacement case).
* datatype_get(datatype) bumps the reference counter. This function also
deals with nul-pointers, that occurs when the datatype is unset.
* datatype_free() drops the reference counter, and it also releases the
datatype if there are not more clients of it.
Rule of thumb is: The reference counter of any newly allocated datatype
is set to zero.
This patch also updates every spot to use datatype_set() for non-dynamic
datatypes, for consistency. In this case, the helper just makes an
simple assignment.
Note that expr_alloc() has been updated to call datatype_get() on the
datatype that is assigned to this new expression. Moreover, expr_free()
calls datatype_free().
This fixes valgrind reports like this one:
==28352== 1,350 (440 direct, 910 indirect) bytes in 5 blocks are definitely lost in loss recor 3 of 3
==28352== at 0x4C2BBAF: malloc (vg_replace_malloc.c:299)
==28352== by 0x4E79558: xmalloc (utils.c:36)
==28352== by 0x4E7963D: xzalloc (utils.c:65)
==28352== by 0x4E6029B: dtype_alloc (datatype.c:1073)
==28352== by 0x4E6029B: concat_type_alloc (datatype.c:1127)
==28352== by 0x4E6D3B3: netlink_delinearize_set (netlink.c:578)
==28352== by 0x4E6D68E: list_set_cb (netlink.c:648)
==28352== by 0x5D74023: nftnl_set_list_foreach (set.c:780)
==28352== by 0x4E6D6F3: netlink_list_sets (netlink.c:669)
==28352== by 0x4E5A7A3: cache_init_objects (rule.c:159)
==28352== by 0x4E5A7A3: cache_init (rule.c:216)
==28352== by 0x4E5A7A3: cache_update (rule.c:266)
==28352== by 0x4E7E0EE: nft_evaluate (libnftables.c:388)
==28352== by 0x4E7EADD: nft_run_cmd_from_filename (libnftables.c:479)
==28352== by 0x109A53: main (main.c:310)
This patch also removes the DTYPE_F_CLONE flag which is broken and not
needed anymore since proper reference counting is in place.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|