| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new -o/--optimize option to enable ruleset
optimization.
You can combine this option with the dry run mode (--check) to review
the proposed ruleset updates without actually loading the ruleset, e.g.
# nft -c -o -f ruleset.test
Merging:
ruleset.nft:16:3-37: ip daddr 192.168.0.1 counter accept
ruleset.nft:17:3-37: ip daddr 192.168.0.2 counter accept
ruleset.nft:18:3-37: ip daddr 192.168.0.3 counter accept
into:
ip daddr { 192.168.0.1, 192.168.0.2, 192.168.0.3 } counter packets 0 bytes 0 accept
This infrastructure collects the common statements that are used in
rules, then it builds a matrix of rules vs. statements. Then, it looks
for common statements in consecutive rules which allows to merge rules.
This ruleset optimization always performs an implicit dry run to
validate that the original ruleset is correct. Then, on a second pass,
it performs the ruleset optimization and add the rules into the kernel
(unless --check has been specified by the user).
From libnftables perspective, there is a new API to enable
this feature:
uint32_t nft_ctx_get_optimize(struct nft_ctx *ctx);
void nft_ctx_set_optimize(struct nft_ctx *ctx, uint32_t flags);
This patch adds support for the first optimization: Collapse a linear
list of rules matching on a single selector into a set as exposed in the
example above.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reading from stdin requires to store the ruleset in a buffer so error
reporting works accordingly, eg.
# cat ruleset.nft | nft -f -
/dev/stdin:3:13-13: Error: unknown identifier 'x'
ip saddr $x
^
The error reporting infrastructure performs a fseek() on the file
descriptor which does not work in this case since the data from the
descriptor has been already consumed.
This patch adds a new stdin input descriptor to perform this special
handling which consists on re-routing this request through the buffer
functions.
Fixes: 935f82e7dd49 ("Support 'nft -f -' to read from stdin")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Add a few helper functions to reuse code in the new rule optimization
infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Its always 0, so remove it.
Looks like this was intended to support variable options that have
array-like members, but so far this isn't implemented, better remove
dead code and implement it properly when such support is needed.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
| |
Extend nft_cache_filter to hold a flowtable name so 'list flowtable'
command causes fetching the requested flowtable only.
Dump flowtables just once instead of for each table, merely assign
fetched data to tables inside the loop.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
| |
Fetch either all tables' sets at once, a specific table's sets or even a
specific set if needed instead of iterating over the list of previously
fetched tables and fetching for each, then ignoring anything returned
that doesn't match the filter.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When operating on a specific chain, add payload to NFT_MSG_GETCHAIN so
kernel returns only relevant data. Since ENOENT is an expected return
code, do not treat this as error.
While being at it, improve code in chain_cache_cb() a bit:
- Check chain's family first, it is a less expensive check than
comparing table names.
- Do not extract chain name of uninteresting chains.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of fetching all existing rules in kernel's ruleset and filtering
in user space, add payload to the dump request specifying the table and
chain to filter for.
Since list_rule_cb() no longer needs the filter, pass only netlink_ctx
to the callback and drop struct rule_cache_dump_ctx.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of requesting a dump of all tables and filtering the data in
user space, construct a non-dump request if filter contains a table so
kernel returns only that single table.
This should improve nft performance in rulesets with many tables
present.
Signed-off-by: Phil Sutter <phil@nwl.cc>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unify binop handling for ipv6 extension header, ip option and tcp option
processing.
Pass the real offset and length expected, not the one used in the kernel.
This was already done for extension headers and ip options, but tcp
option parsing did not do this.
This was fine before because no existing tcp option template
had a non-byte sized member.
With mptcp addition this isn't the case anymore, subtype field is
only 4 bits wide, but tcp option delinearization passed 8bits instead.
Pass the offset and mask delta, just like ip option/ipv6 exthdr.
This makes nft show 'tcp option mptcp subtype 1' instead of
'tcp option mptcp unknown & 240 == 16'.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MPTCP multiplexes the various mptcp signalling data using the
first 4 bits of the mptcp option.
This allows to match on the mptcp subtype via:
tcp option mptcp subtype 1
This misses delinearization support. mptcp subtype is the first tcp
option field that has a length of less than one byte.
Serialization processing will add a binop for this, but netlink
delinearization can't remove them, yet.
Also misses a new datatype/symbol table to allow to use mnemonics like
'mp_join' instead of raw numbers.
For this reason, no tests are added yet.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
| |
Allow to use "fastopen", "md5sig" and "mptcp" mnemonics rather than the
raw option numbers.
These new keywords are only recognized while scanner is in tcp state.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
| |
This moves tcp options not used anywhere else (e.g. in synproxy) to a
distinct scope. This will also allow to avoid exposing new option
keywords in the ruleset context.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
--terse does not apply to anonymous set, add a NFT_CACHE_TERSE bit
to skip named sets only.
Moreover, prioritize specific listing filter over --terse to avoid a
bogus:
netlink: Error: Unknown set '__set0' in lookup expression
when invoking:
# nft -ta list set inet filter example
Extend existing test to improve coverage.
Fixes: 9628d52e46ac ("cache: disable NFT_CACHE_SETELEM_BIT on --terse listing only")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With an autogenerated ruleset with ~20k chains.
# time nft list ruleset &> /dev/null
real 0m1,712s
user 0m1,258s
sys 0m0,454s
Speed up listing of a specific chain:
# time nft list chain nat MWDG-UGR-234PNG3YBUOTS5QD &> /dev/null
real 0m0,542s
user 0m0,251s
sys 0m0,292s
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
Check family when filtering out listing of tables and sets.
Fixes: 3f1d3912c3a6 ("cache: filter out tables that are not requested")
Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Skip set element netlink dump if set is flushed, this speeds up
set flush + add element operation in a batch file for an existing set.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Wrap the table and set fields for list filtering to prepare for the
introduction element filters.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not call alloc_setelem_cache() to build the set element list in
nftnl_set. Instead, translate one single set element expression to
nftnl_set_elem object at a time and use this object to build the netlink
header.
Using a huge test set containing 1.1 million element blocklist, this
patch is reducing userspace memory consumption by 40%.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support to match on inner header / payload data:
# nft add rule x y @ih,32,32 0x14000000 counter
you can also mangle payload data:
# nft add rule x y @ih,32,32 set 0x14000000 counter
This update triggers a checksum update at the layer 4 header via
csum_flags, mangling odd bytes is also aligned to 16-bits.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Add an alias of the integer type to print raw payload expressions in
hexadecimal.
Update tests/py.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Factor the `N / time-unit` and `N byte-unit / time-unit` expressions
from limit expressions out into separate `limit_rate_pkts` and
`limit_rate_bytes` rules respectively.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Add userspace support for the netdev egress hook which is queued up for
v5.16-rc1, complete with documentation and tests. Usage is identical to
the ingress hook.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Do not fetch set content for list commands that specify a
set name.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not fetch table content for list commands that specify a
table name, e.g.
# nft list table filter
This speeds up listing of a given table by not populating the
cache with tables that are not needed.
- Full ruleset (huge with ~100k lines).
# sudo nft list ruleset &> /dev/null
real 0m3,049s
user 0m2,080s
sys 0m0,968s
- Listing per table is now faster:
# nft list table nat &> /dev/null
real 0m1,969s
user 0m1,412s
sys 0m0,556s
# nft list table filter &> /dev/null
real 0m0,697s
user 0m0,478s
sys 0m0,220s
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1326
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Partially revert 913979f882d1 ("src: add expression handler hashtable")
which is causing a crash with two instances of the nftables handler.
$ sudo python
[sudo] password for echerkashin:
Python 3.9.7 (default, Sep 3 2021, 06:18:44)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from nftables import Nftables
>>> n1=Nftables()
>>> n2=Nftables()
>>> <Ctrl-D>
double free or corruption (top)
Aborted
Reported-by: Eugene Crosser <crosser@average.org>
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Therefore, -n honors numeric time in seconds.
Fixes: f8f32deda31d ("meta: Introduce new conditions 'time', 'day' and 'hour'")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
Add NFT_CACHE_SETELEM_MAYBE to dump the set elements conditionally,
only in case that the set interval flag is set on.
Reported-by: Cristian Constantin <const.crist@googlemail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update this command to display the hook datapath for a packet depending
on its family.
This patch also includes:
- Group of existing hooks based on the hook location.
- Order hooks by priority, from INT_MIN to INT_MAX.
- Do not add sign to priority zero.
- Refresh include/linux/netfilter/nfnetlink_hook.h cache copy.
- Use NFNLA_CHAIN_* attributes to print the chain family, table and name.
If NFNLA_CHAIN_* attributes are not available, display the hookfn name.
- Update syntax: remove optional hook parameter, promote the 'device'
argument.
The following example shows the hook datapath for IPv4 packets coming in
from netdevice 'eth0':
# nft list hooks ip device eth0
family ip {
hook ingress {
+0000000010 chain netdev x y [nf_tables]
+0000000300 chain inet m w [nf_tables]
}
hook input {
-0000000100 chain ip a b [nf_tables]
+0000000300 chain inet m z [nf_tables]
}
hook forward {
-0000000225 selinux_ipv4_forward
0000000000 chain ip a c [nf_tables]
}
hook output {
-0000000225 selinux_ipv4_output
}
hook postrouting {
+0000000225 selinux_ipv4_postroute
}
}
Note that the listing above includes the existing netdev and inet
hooks/chains which *might* interfer in the travel of an incoming IPv4
packet. This allows users to debug the pipeline, basically, to
understand in what order the hooks/chains are evaluated for the IPv4
packets.
If the netdevice is not specified, then the ingress hooks are not
shown.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
This function might be useful to recycle the existing nft_ctx to use it
with different external variable definitions.
Moreover, reset ctx->num_vars to zero.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new option to define variables from the command line.
# cat test.nft
table netdev x {
chain y {
type filter hook ingress devices = $dev priority 0;
counter accept
}
}
# nft --define dev="{ eth0, eth1 }" -f test.nft
You can only combine it with -f/--filename.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
Commit 4694f7230195 introduced nfnetlink_hook.h but didn't update the
automake system to take account of the new file.
Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
| |
Preparation patch to avoid too much $<stmt>$ references in the parser.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, assertion to ensure that no colission occur is hit due to
uninitialized hashtable memory area:
nft: netlink_delinearize.c:1741: expr_handler_init: Assertion `expr_handle_ht[hash] == NULL' failed.
Fixes: c4058f96c6a5 ("netlink_delinearize: Fix suspicious calloc() call")
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
the CFI bit has been repurposed as DEI "Drop Eligible Indicator"
since 802.1Q-2011.
The vlan cfi field is still retained for compatibility.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1516
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Example output:
$ nft list hook ip input
family ip hook input {
+0000000000 nft_do_chain_inet [nf_tables] # nft table ip filter chain input
+0000000010 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain filter_INPUT
+0000000100 nf_nat_ipv4_local_in [nf_nat]
+2147483647 ipv4_confirm [nf_conntrack]
}
$ nft list hooks netdev type ingress device lo
family netdev hook ingress device lo {
+0000000000 nft_do_chain_netdev [nf_tables]
}
$ nft list hooks inet
family ip hook prerouting {
-0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4]
-0000000300 iptable_raw_hook [iptable_raw]
-0000000290 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain raw_PREROUTING
-0000000200 ipv4_conntrack_in [nf_conntrack]
-0000000140 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain mangle_PREROUTING
-0000000100 nf_nat_ipv4_pre_routing [nf_nat]
}
...
'nft list hooks' will display everyting except the netdev family
via successive dump request for all family:hook combinations.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
| |
Followup patch will add new 'hooks' keyword for
nft list hooks
Add a scope for list to avoid exposure of the new keyword in nft
rulesets.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
| |
set_elem_catchall_expr_json undeclared here (not in a function); did you mean 'set_elem_catchall_expr_ops'?
1344 | .json = set_elem_catchall_expr_json,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
| set_elem_catchall_expr_ops
https://bugzilla.netfilter.org/show_bug.cgi?id=1542
Fixes: 5c2c6b092860 json: catchall element support
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Treat '*' as catchall element, not as a symbol.
Also add missing json test cases for wildcard set support.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
| |
Fix make distcheck.
Fixes: 0e3871cfd9a1 ("exthdr: Implement SCTP Chunk matching")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refer to chain, not table.
Error: No such file or directory; did you mean table ‘z’ in family ip?
add chain x y { type filter nat prerouting priority dstnat; }
^
It should say instead:
Error: No such file or directory; did you mean chain ‘z’ in table ip ‘x’?
[ Florian added args check for fmt to the netlink_io_error() prototype. ]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Store the location of the chain type for better error reporting.
Several users that compile custom kernels reported that error
reporting is misleading when accidentally selecting
CONFIG_NFT_NAT=n.
After this patch, a better hint is provided:
# nft 'add chain x y { type nat hook prerouting priority dstnat; }'
Error: Could not process rule: No such file or directory
add chain x y { type nat hook prerouting priority dstnat; }
^^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
Extend exthdr expression to support scanning through SCTP packet chunks
and matching on fixed fields' values.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
| |
This isolates only "vtag" token for now.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the following shortcut syntax:
expression flags / flags
instead of:
expression and flags == flags
For example:
tcp flags syn,ack / syn,ack,fin,rst
^^^^^^^ ^^^^^^^^^^^^^^^
value mask
instead of:
tcp flags and (syn|ack|fin|rst) == syn|ack
The second list of comma-separated flags represents the mask which are
examined and the first list of comma-separated flags must be set.
You can also use the != operator with this syntax:
tcp flags != fin,rst / syn,ack,fin,rst
This shortcut is based on the prefix notation, but it is also similar to
the iptables tcp matching syntax.
This patch introduces the flagcmp expression to print the tcp flags in
this new notation. The delinearize path transforms the binary expression
to this new flagcmp expression whenever possible.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a catchall expression (EXPR_SET_ELEM_CATCHALL).
Use the asterisk (*) to represent the catch-all set element, e.g.
table x {
set y {
type ipv4_addr
counter
elements = { 1.2.3.4 counter packets 0 bytes 0, * counter packets 0 bytes 0 }
}
}
Special handling for segtree: zap the catch-all element from the set
element list and re-add it after processing.
Remove wildcard_expr deadcode in src/parser_bison.y
This patch also adds several tests for the tests/py and tests/shell
infrastructures.
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
| |
Add support for matching on the cgroups version 2.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Update the cache to remove this flowtable from the evaluation phase.
Add flowtable_cache_del() function for this purpose.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Update the cache to remove this chain from the evaluation phase. Add
chain_cache_del() function for this purpose.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a hashtable for fast table lookups.
Tables that reside in the cache use the table->cache_hlist and
table->cache_list heads.
Table that are created from command line / ruleset are also added
to the cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|