Commit message | Author | Age | Files | Lines
...
* mnl: do not build nftnl_set element list | Pablo Neira Ayuso | 2021-11-08 | 3 | -25/+93
  Do not call alloc_setelem_cache() to build the set element list in nftnl_set. Instead, translate one single set element expression to nftnl_set_elem object at a time and use this object to build the netlink header.

  Using a huge test set containing 1.1 million element blocklist, this patch is reducing userspace memory consumption by 40%.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: py: remove verdict from closing end interval | Pablo Neira Ayuso | 2021-11-08 | 7 | -7/+7
  Kernel does not allow for NFT_SET_ELEM_INTERVAL_END flag and NFTA_SET_ELEM_DATA. The closing end interval represents a mismatch, therefore, no verdict can be applied. The existing payload files show the drop verdict when this is unset (because NF_DROP=0).

  This update is required to fix payload warnings in tests/py after libnftnl's ("set: use NFTNL_SET_ELEM_VERDICT to print verdict").

  Fixes: 6671d9d137f6 ("mnl: Set NFTNL_SET_DATA_TYPE before dumping set elements")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: raw payload match and mangle on inner header / payload data | Pablo Neira Ayuso | 2021-11-08 | 6 | -2/+12
  This patch adds support to match on inner header / payload data:

    # nft add rule x y @ih,32,32 0x14000000 counter

  you can also mangle payload data:

    # nft add rule x y @ih,32,32 set 0x14000000 counter

  This update triggers a checksum update at the layer 4 header via csum_flags, mangling odd bytes is also aligned to 16-bits.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
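The raw payload expression `@ih,32,32` names a base (inner header), a bit offset and a bit length. As a rough illustration of what such a selection means, the sketch below extracts a bit field from a byte string in network byte order. The function name `raw_payload_bits` and the sample payload are assumptions made for this example; this is not how nft or the kernel implements the match.

```python
def raw_payload_bits(data: bytes, offset_bits: int, length_bits: int) -> int:
    """Return length_bits bits starting offset_bits into data (network byte order)."""
    total_bits = len(data) * 8
    if offset_bits < 0 or length_bits <= 0 or offset_bits + length_bits > total_bits:
        raise ValueError("selection does not fit inside the payload")
    as_int = int.from_bytes(data, "big")
    # Drop the bits that follow the selection, then mask off the bits before it.
    shift = total_bits - offset_bits - length_bits
    return (as_int >> shift) & ((1 << length_bits) - 1)


# Hypothetical 8-byte inner header: the second 32-bit word is 0x14000000.
payload = bytes.fromhex("deadbeef14000000")
selected = raw_payload_bits(payload, 32, 32)
print(hex(selected))
```

With this made-up payload, the selection at offset 32, length 32 is the value that the rule in the commit message compares against.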
* tests: shell: $NFT needs to be invoked unquoted | Štěpán Němec | 2021-11-05 | 2 | -1/+4
  The variable has to undergo word splitting, otherwise the shell tries to find the variable value as an executable, which breaks in cases that 7c8a44b25c22 ("tests: shell: Allow wrappers to be passed as nft command") intends to support.

  Mention this in the shell tests README.

  Fixes: d8ccad2a2b73 ("tests: cover baecd1cf2685 ("segtree: Fix segfault when restoring a huge interval set")")
  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
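The difference between the quoted and unquoted expansion can be approximated outside the shell. The sketch below uses Python's `shlex.split` as a stand-in for shell word splitting (the two are not identical, since real field splitting depends on IFS); the wrapper value `valgrind nft` is the example given in the README commit further down this log.

```python
import shlex

nft = "valgrind nft"  # example value of $NFT when a wrapper is used

# "$NFT" list ruleset: the whole value becomes the command name.
quoted_argv = [nft, "list", "ruleset"]

# $NFT list ruleset: the value is split into separate words first.
unquoted_argv = shlex.split(nft) + ["list", "ruleset"]

print(quoted_argv[0])    # the shell would look for an executable with this name
print(unquoted_argv[0])  # the wrapper is the command, nft is its first argument
```

In the quoted case the shell would search for a program literally named `valgrind nft`, which is the failure the commit describes.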
* tests: shell: README: clarify test file name convention | Štěpán Němec | 2021-11-05 | 1 | -2/+5
  Since commit 4d26b6dd3c4c, test file name suffix no longer reflects expected exit code in all cases.

  Move the sentence "Since they are located with `find', test files can be put in any subdirectory." to a separate paragraph.

  Fixes: 4d26b6dd3c4c ("tests: shell: change all test scripts to return 0")
  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: shell: README: $NFT does not have to be a path to a binary | Štěpán Němec | 2021-11-05 | 1 | -1/+1
  Since commit 7c8a44b25c22, $NFT can contain an arbitrary command, e.g. 'valgrind nft'.

  Fixes: 7c8a44b25c22 ("tests: shell: Allow wrappers to be passed as nft command")
  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: shell: README: copy edit | Štěpán Němec | 2021-11-05 | 1 | -11/+12
  Grammar, wording, formatting fixes (no substantial change of meaning).

  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* datatype: add xinteger_type alias to print in hexadecimal | Pablo Neira Ayuso | 2021-11-03 | 5 | -9/+26
  Add an alias of the integer type to print raw payload expressions in hexadecimal.

  Update tests/py.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: postpone transport protocol match check after nat expression evaluation | Pablo Neira Ayuso | 2021-11-03 | 4 | -6/+34
  Fix bogus error report when using transport protocol as map key.

  Fixes: 50780456a01a ("evaluate: check for missing transport protocol match in nat map with concatenations")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: extend limit syntax | Jeremy Sowden | 2021-11-03 | 4 | -0/+62
  The documentation describes the syntax of limit statements thus:

    limit rate [over] packet_number / TIME_UNIT [burst packet_number packets]
    limit rate [over] byte_number BYTE_UNIT / TIME_UNIT [burst byte_number BYTE_UNIT]

    TIME_UNIT := second | minute | hour | day
    BYTE_UNIT := bytes | kbytes | mbytes

  From this one might infer that a limit may be specified by any of the following:

    limit rate 1048576/second
    limit rate 1048576 mbytes/second
    limit rate 1048576 / second
    limit rate 1048576 mbytes / second

  However, the last does not currently parse:

    $ sudo /usr/sbin/nft add filter input limit rate 1048576 mbytes / second
    Error: wrong rate format
    add filter input limit rate 1048576 mbytes / second
                     ^^^^^^^^^^^^^^^^^^^^^^^^^

  Extend the `limit_rate_bytes` parser rule to support it, and add some new Python test-cases.

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
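To make the four spellings concrete, here is a small sketch that accepts all of them with a regular expression. It only models the rate part of the documented syntax (no burst clause) and is an illustration written for this log, not the bison grammar that nft uses; `LIMIT_RATE` and `parse_limit_rate` are hypothetical names.

```python
import re

# Rate part of the documented limit syntax, with optional spaces around "/".
LIMIT_RATE = re.compile(
    r"^limit rate (?:(?P<over>over) )?"
    r"(?P<number>\d+)"
    r"(?: (?P<byte_unit>bytes|kbytes|mbytes))?"
    r" ?/ ?"
    r"(?P<time_unit>second|minute|hour|day)$"
)


def parse_limit_rate(text: str):
    """Return the parsed fields as a dict, or None if the text does not match."""
    match = LIMIT_RATE.match(text)
    return match.groupdict() if match else None


examples = [
    "limit rate 1048576/second",
    "limit rate 1048576 mbytes/second",
    "limit rate 1048576 / second",
    "limit rate 1048576 mbytes / second",
]
for example in examples:
    print(example, "->", parse_limit_rate(example))
```

All four forms listed in the commit message parse with this sketch, including the last one that the pre-patch parser rejected.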
* parser: add `limit_rate_pkts` and `limit_rate_bytes` rules | Jeremy Sowden | 2021-11-03 | 2 | -62/+63
  Factor the `N / time-unit` and `N byte-unit / time-unit` expressions from limit expressions out into separate `limit_rate_pkts` and `limit_rate_bytes` rules respectively.

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: add new `limit_bytes` rule | Jeremy Sowden | 2021-11-03 | 1 | -6/+9
  Refactor the `N byte-unit` expression out of the `limit_bytes_burst` rule into a separate `limit_bytes` rule.

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: run-tests.sh: ensure non-zero exit when $failed != 0 | Štěpán Němec | 2021-11-02 | 2 | -2/+2
  POSIX [1] does not specify the behavior of `exit' with arguments outside the 0-255 range, but what generally (bash, dash, zsh, OpenBSD ksh, busybox) seems to happen is the shell exiting with status & 255 [2], which results in zero exit for certain non-zero arguments.

  [1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#exit
  [2] https://git.savannah.gnu.org/cgit/bash.git/tree/builtins/common.c#n579

  Fixes: 0c6592420586 ("tests: fix return codes")
  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
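The wrap-around is easy to see with plain arithmetic. The sketch below models the `status & 255` behavior described above and shows one possible way to keep a failure count from collapsing to zero; both function names are hypothetical, and the clamp is an assumption for illustration, not necessarily what the patch to run-tests.sh does.

```python
def observed_exit_status(requested: int) -> int:
    """Model of the common shell behavior: only the low 8 bits survive."""
    return requested & 255


def safe_exit_status(failed: int) -> int:
    """One possible fix (an assumption): report 1 for any non-zero failure count."""
    return 1 if failed != 0 else 0


for failed in (0, 1, 255, 256, 512):
    print(failed, observed_exit_status(failed), safe_exit_status(failed))
```

A run with exactly 256 or 512 failed tests would be reported as success under the first function, which is the case the commit guards against.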
* tests: shell: Fix bogus testsuite failure with 250Hz | Phil Sutter | 2021-11-02 | 1 | -1/+1
  Previous fix for HZ=100 was not sufficient, a kernel with HZ=250 rounds the 10ms to 8ms it seems. Do as Lukas suggests and accept the occasional input/output asymmetry instead of continuing the hide'n'seek game.

  Fixes: c9c5b5f621c37 ("tests: shell: Fix bogus testsuite failure with 100Hz")
  Suggested-by: Lukas Wunner <lukas@wunner.de>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Support netdev egress hook | Lukas Wunner | 2021-10-28 | 41 | -39/+2413
  Add userspace support for the netdev egress hook which is queued up for v5.16-rc1, complete with documentation and tests. Usage is identical to the ingress hook.

  Signed-off-by: Lukas Wunner <lukas@wunner.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: py: Move netdev-specific tests to appropriate subdirectory | Lukas Wunner | 2021-10-28 | 7 | -0/+0
  The fwd and dup statements are specific to netdev hooks, so move their tests to the appropriate subdirectory.

  Signed-off-by: Lukas Wunner <lukas@wunner.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: add testcase for --terse | Pablo Neira Ayuso | 2021-10-28 | 1 | -0/+69
  Compare listing with and without --terse for:

    nft list ruleset
    nft list set x y

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: disable NFT_CACHE_SETELEM_BIT on --terse listing only | Pablo Neira Ayuso | 2021-10-28 | 1 | -2/+2
  Instead of NFT_CACHE_SETELEM which also disables set dump.

  Fixes: 6bcd0d576a60 ("cache: unset NFT_CACHE_SETELEM with --terse listing")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: ensure evaluate_cache_list flags are set correctly | Chris Arges | 2021-10-27 | 1 | -0/+1
  This change ensures that the terse flag is maintained when listing rulesets with it.

  Fixes: 6bcd0d576a60 ("cache: unset NFT_CACHE_SETELEM with --terse listing")
  Signed-off-by: Chris Arges <carges@cloudflare.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: honor table in set filtering | Pablo Neira Ayuso | 2021-10-27 | 1 | -1/+2
  Check for a table mismatch, in case the same set name is used in different tables.

  Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: honor filter in set listing commands | Pablo Neira Ayuso | 2021-10-27 | 1 | -0/+2
  Fetch table, set and set elements only for set listing commands, e.g. nft list set inet filter ipv4_bogons.

  Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: always set on NFT_CACHE_REFRESH for listing | Pablo Neira Ayuso | 2021-10-27 | 1 | -6/+7
  This flag forces a refresh of the cache on list commands. Several object types are missing this flag; setting it fixes nft --interactive mode.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* configure: default to libedit for cli | Pablo Neira Ayuso | 2021-10-25 | 1 | -1/+1
  readline support only compiles for libreadline5, set libedit as default library.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: cover baecd1cf2685 ("segtree: Fix segfault when restoring a huge interval set") | Štěpán Němec | 2021-10-20 | 1 | -0/+29
  Test inspired by [1] with both the set and stack size reduced by the same power of 2, to preserve the (pre-baecd1cf2685) segfault on one hand, and make the test successfully complete (post-baecd1cf2685) in a few seconds even on weaker hardware on the other.

  (The reason I stopped at 128kB stack size is that with 64kB I was getting segfaults even with baecd1cf2685 applied.)

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1908127

  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Helped-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* main: _exit() if setuid | Florian Westphal | 2021-10-19 | 1 | -0/+4
  Apparently some people think it's a good idea to make nft setuid so unprivileged users can change settings.

  "nft -f /etc/shadow" is just one example of why this is a bad idea. Disable this. Do not print anything, fd cannot be trusted.

  This change intentionally doesn't affect libnftables, on the off-chance that somebody creates an suid program and knows what they're doing.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* tests: shell: auto-removal of chain hook on netns removal | Florian Westphal | 2021-10-19 | 1 | -0/+6
  This is the nft equivalent of the syzbot report that led to kernel commit 68a3765c659f8 ("netfilter: nf_tables: skip netdev events generated on netns removal").

  Signed-off-by: Florian Westphal <fw@strlen.de>
* rule: replace three conditionals with one | Jeremy Sowden | 2021-10-12 | 1 | -5/+3
  When outputting set definitions, merge three consecutive `if (!list_empty(&set->stmt_list))` conditionals.

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: fix stateless output after listing sets containing counters | Jeremy Sowden | 2021-10-12 | 1 | -1/+3
  Before outputting counters in set definitions the `NFT_CTX_OUTPUT_STATELESS` flag was set to suppress output of the counter state and unconditionally cleared afterwards, regardless of whether it had been originally set. Record the original set of flags and restore it.

  Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994273
  Fixes: 6d80e0f15492 ("src: support for counter in set definition")
  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
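The bug pattern and its fix can be sketched in a few lines: set a flag temporarily, then restore the saved flags rather than clearing the bit. This is a Python illustration of the idea only; the real code is C, and the `OutputCtx` class, the `print_set_counters` function and the bit value used here are assumptions for the example.

```python
NFT_CTX_OUTPUT_STATELESS = 1 << 0  # bit value chosen for this sketch only


class OutputCtx:
    """Hypothetical stand-in for the output context that carries the flags."""

    def __init__(self, flags: int = 0):
        self.flags = flags


def print_set_counters(octx: OutputCtx, emit) -> None:
    """Emit counters statelessly, then restore whatever flags the caller had."""
    saved_flags = octx.flags
    octx.flags |= NFT_CTX_OUTPUT_STATELESS
    try:
        emit(octx)
    finally:
        # Restoring the saved value keeps the flag set if the caller had set it.
        octx.flags = saved_flags


seen = []
caller_ctx = OutputCtx(NFT_CTX_OUTPUT_STATELESS)  # e.g. the user asked for stateless output
print_set_counters(caller_ctx, lambda ctx: seen.append(ctx.flags))
print(bool(caller_ctx.flags & NFT_CTX_OUTPUT_STATELESS))
```

Clearing the bit unconditionally at the end would have dropped the caller's own stateless request, which matches the symptom described in the commit message.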
* rule: remove fake stateless output of named counters | Jeremy Sowden | 2021-10-12 | 1 | -7/+6
  When `-s` is passed, no state is output for named quotas and counter and quota rules, but fake zero state is output for named counters. Remove the output of named counters to match the remaining stateful objects.

  Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* doc: libnftables-json: make the example valid libnftables JSON input | Štěpán Němec | 2021-10-11 | 1 | -2/+3
  - Add missing comma between array elements.
  - Fix chain 'name' property.
  - Match 'op' property is mandatory.

  Fixes: 2e56f533b36a ("doc: Improve example in libnftables-json(5)")
  Fixes: 90d4ee087171 ("JSON: Make match op mandatory, introduce 'in' operator")
  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* cache: unset NFT_CACHE_SETELEM with --terse listing | Pablo Neira Ayuso | 2021-10-02 | 1 | -3/+12
  Skip populating the set element cache in this case to speed up listing.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: filter out sets and maps that are not requested | Pablo Neira Ayuso | 2021-09-30 | 2 | -2/+20
  Do not fetch set content for list commands that specify a set name.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: filter out tables that are not requested | Pablo Neira Ayuso | 2021-09-30 | 4 | -15/+43
  Do not fetch table content for list commands that specify a table name, e.g.

    # nft list table filter

  This speeds up listing of a given table by not populating the cache with tables that are not needed.

  - Full ruleset (huge with ~100k lines).

    # sudo nft list ruleset &> /dev/null

    real 0m3,049s
    user 0m2,080s
    sys  0m0,968s

  - Listing per table is now faster:

    # nft list table nat &> /dev/null

    real 0m1,969s
    user 0m1,412s
    sys  0m0,556s

    # nft list table filter &> /dev/null

    real 0m0,697s
    user 0m0,478s
    sys  0m0,220s

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1326
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: finer grain cache population for list commands | Pablo Neira Ayuso | 2021-09-29 | 1 | -2/+23
  Skip full cache population for list commands to speed up listing.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: set on cache flags for nested notation | Pablo Neira Ayuso | 2021-09-29 | 2 | -0/+35
  Set on the cache flags for the nested notation too. This fixes nft -f with two files, one that contains the set declaration and another that adds a rule that refers to such set.

  Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1474
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: check for missing transport protocol match in nat map with concatenations | Pablo Neira Ayuso | 2021-09-29 | 3 | -3/+37
  Restore this error with NAT maps:

    # nft add rule 'ip ipfoo c dnat to ip daddr map @y'
    Error: transport protocol mapping is only valid after transport protocol match
    add rule ip ipfoo c dnat to ip daddr map @y
                        ~~~~    ^^^^^^^^^^^^^^^

  Allow for transport protocol match in the map too, which is implicitly pulling in a transport protocol dependency.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: check for concatenation in set data datatype | Pablo Neira Ayuso | 2021-09-29 | 3 | -1/+20
  When adding this rule with an existing map:

    add rule nat x y meta l4proto { tcp, udp } dnat ip to ip daddr . th dport map @fwdtoip_th

  nft reports a bogus error:

    Error: datatype mismatch: expected IPv4 address, expression has type concatenation of (IPv4 address, internet network service)

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* monitor: honor NLM_F_EXCL netlink flag | Pablo Neira Ayuso | 2021-09-29 | 1 | -1/+7
  This allows the monitor to report the create command.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: monitor: update insert and replace commands | Pablo Neira Ayuso | 2021-09-29 | 1 | -2/+2
  Adjust test after these two kernel fixes:

    ("netfilter: nf_tables: reverse order in rule replacement expansion")
    ("netfilter: nf_tables: add position handle in event notification")

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* monitor: honor NLM_F_APPEND flag for rules | Pablo Neira Ayuso | 2021-09-29 | 1 | -14/+26
  Print 'add' or 'insert' according to this netlink flag.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* monitor: display rule position handle | Pablo Neira Ayuso | 2021-09-29 | 1 | -1/+4
  This allows locating the incremental update in the ruleset.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: dynset: set compound expr dtype based on set key definition | Florian Westphal | 2021-09-29 | 3 | -1/+73
  "nft add rule ... add @t { ip saddr . 22 ..." will be listed as "ip saddr . 0x16 [ invalid type]".

  This is a display bug, the compound expression created during netlink deserialization lacks correct datatypes for the value expression.

  Avoid this by setting the individual expressions' datatype. The set key has the needed information, so walk over the types and set them in the dynset statement.

  Also add a test case.

  Reported-by: Paulo Ricardo Bruck <paulobruck1@gmail.com>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* payload: don't adjust offsets of autogenerated dependency expressions | Florian Westphal | 2021-09-29 | 6 | -1/+58
  Pablo says:

  user reports that this is broken:

    nft --debug=netlink add rule bridge filter forward vlan id 100 vlan id set 200

    [..]
    [ payload load 2b @ link header + 14 => reg 1 ]
    [..]
    [ payload load 2b @ link header + 28 => reg 1 ]
    [ bitwise reg 1 = ( reg 1 & 0x000000f0 ) ^ 0x0000c800 ]
    [ payload write reg 1 => 2b @ link header + 14 csum_type 0 csum_off 0 csum_flags 0x0 ]

  offset says 28, it is assuming q-in-q, in this case it is mangling the existing header.

  The problem here is that 'vlan id set 200' needs a read-modify-write cycle because 'vlan id set' has to preserve bits located in the same byte area as the vlan id.

  The first 'payload load' at offset 14 is generated via 'vlan id 100', this part is ok.

  The second 'payload load' at offset 28 is the bogus one.

  It's added as a dependency, but then adjusted because nft evaluation considers this identical to 'vlan id 1 vlan id 2', where nft assumes q-in-q.

  To fix this, skip offset adjustments for raw expressions and mark the dependency-generated payload instruction as such.

  This is fine because raw payload operations assume that user specifies base/offset/length manually.

  Also add a test case for this.

  Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Florian Westphal <fw@strlen.de>
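The read-modify-write cycle mentioned above exists because the 12-bit VLAN id shares a 16-bit field (the 802.1Q TCI) with the priority and DEI bits. The sketch below shows the cycle on a TCI held as a plain integer in network bit order; `set_vlan_id` and `VLAN_VID_MASK` are hypothetical names, and the code illustrates the principle rather than the bytecode nft generates. The mask and xor values in the debug output appear to be the same operation as displayed in a little-endian register.

```python
VLAN_VID_MASK = 0x0FFF  # low 12 bits of the 16-bit TCI carry the VLAN id


def set_vlan_id(tci: int, new_vid: int) -> int:
    """Replace the VLAN id in a TCI while preserving the priority and DEI bits."""
    if not 0 <= new_vid <= VLAN_VID_MASK:
        raise ValueError("VLAN id must fit in 12 bits")
    preserved = tci & ~VLAN_VID_MASK & 0xFFFF  # read: keep the upper 4 bits
    return preserved | new_vid                 # modify: merge in the new id


# Made-up frame: priority 5, DEI 0, VLAN id 100.
old_tci = (5 << 13) | 100
new_tci = set_vlan_id(old_tci, 200)
print(hex(old_tci), hex(new_tci))
```

The load that feeds this cycle has to read the same header the write targets (offset 14 in the example), which is why the adjusted offset of 28 was wrong.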
* netlink: reset temporary set element stmt list after list splice | Pablo Neira Ayuso | 2021-09-16 | 3 | -1/+28
  Reset temporary stmt list to deal with the key_end case which might result in a jump backward to handle the rhs of the interval.

  Reported-by: Martin Zatloukal <slezi2@pvfree.net>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* doc: fix synopsis of named counter, quota and ct {helper,timeout,expect} | Pablo Neira Ayuso | 2021-09-16 | 2 | -9/+61
  Synopsis is not complete. Add examples for counters and quotas.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: py: update ct expiration | Pablo Neira Ayuso | 2021-09-15 | 1 | -3/+3
  Since 309785674b25 ("datatype: time_print() ignores -T"), time_type honors -T option. Given tests/py run in numeric format, this patch fixes a warning since the ct expiration is now expressed in seconds.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: revert hashtable for expression handlers | Pablo Neira Ayuso | 2021-09-15 | 3 | -35/+10
  Partially revert 913979f882d1 ("src: add expression handler hashtable") which is causing a crash with two instances of the nftables handler.

    $ sudo python
    [sudo] password for echerkashin:
    Python 3.9.7 (default, Sep 3 2021, 06:18:44)
    [GCC 11.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from nftables import Nftables
    >>> n1=Nftables()
    >>> n2=Nftables()
    >>> <Ctrl-D>
    double free or corruption (top)
    Aborted

  Reported-by: Eugene Crosser <crosser@average.org>
  Suggested-by: Florian Westphal <fw@strlen.de>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* doc: nfnetlink_log allows one single process through unicast | Pablo Neira Ayuso | 2021-09-09 | 1 | -5/+5
  nfnetlink_log uses netlink unicast to send logs to one single process in userspace.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: rework range_expr_to_prefix() | Pablo Neira Ayuso | 2021-09-09 | 4 | -30/+148
  Consolidate prefix calculation in range_expr_is_prefix().

  Add tests/py for 9208fb30dc49 ("src: Check range bounds before converting to prefix").

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
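Whether a range can be printed as a prefix comes down to a bit test on its two bounds. The following sketch shows one common way to perform that check; it is not the code of range_expr_is_prefix(), and the helper name `range_to_prefix_len` is an assumption for this example.

```python
import ipaddress


def range_to_prefix_len(low: int, high: int, width: int = 32):
    """Return the prefix length if [low, high] is exactly one prefix, else None."""
    if low > high:
        return None
    diff = low ^ high
    # diff must be a run of trailing one bits (0b0..01..1), and low must have
    # all of those bits clear so that the range starts on a prefix boundary.
    if diff & (diff + 1) != 0 or low & diff != 0:
        return None
    return width - diff.bit_length()


def as_int(address: str) -> int:
    return int(ipaddress.IPv4Address(address))


print(range_to_prefix_len(as_int("192.168.0.0"), as_int("192.168.0.255")))
print(range_to_prefix_len(as_int("10.0.0.1"), as_int("10.0.0.2")))
```

The first range covers exactly a /24, while the second starts off a prefix boundary and has to stay a range.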
* meta: skip -T for hour and date format | Pablo Neira Ayuso | 2021-09-09 | 1 | -24/+9
  If -T is used:

  - meta hour displays the hours in seconds based on your timezone.
  - meta time displays the UNIX time since 1970 in nanoseconds.

  Better, skip -T for these two datatypes and use the formatted output instead, ie.

  - meta hour "00:00:20"
  - meta time "1970-01-01 01:00:01"

  Fixes: f8f32deda31d ("meta: Introduce new conditions 'time', 'day' and 'hour'")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>