path: root/src
Commit message [Author, Age, Files, Lines]
* tcpopt: add md5sig, fastopen and mptcp options [Florian Westphal, 2021-12-01, 3 files, -2/+41]
|   Allow using the "fastopen", "md5sig" and "mptcp" mnemonics rather than
|   the raw option numbers.
|
|   These new keywords are only recognized while the scanner is in tcp state.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: split tcp option rules [Florian Westphal, 2021-12-01, 1 file, -19/+61]
|   At this time the parser will accept nonsensical input like
|
|     tcp option mss left 2
|
|   which will be treated as 'tcp option maxseg size 2'.
|   This is because the enum space overlaps.
|
|   Split the rules so that 'tcp option mss' will only accept field names
|   specific to the mss/maxseg option kind.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
|   (cherry picked from commit 46168852c03d73c29b557c93029dc512ca6e233a)
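
The overlap this message describes can be sketched as follows. This is an illustration in Python, not the C parser code; the enum names and values (`MaxsegField`, `SackField`) and both helper functions are assumptions modelled on the message text.

```python
from enum import IntEnum

# Hypothetical per-option field enums. Each one starts counting at 0,
# so values from different enums collide.
class MaxsegField(IntEnum):
    KIND = 0
    LENGTH = 1
    SIZE = 2

class SackField(IntEnum):
    KIND = 0
    LENGTH = 1
    LEFT = 2
    RIGHT = 3

def resolve_shared(field_value: int) -> str:
    """One shared rule: any field number is read in the mss enum space."""
    return MaxsegField(int(field_value)).name.lower()

def resolve_split(option: str, field_name: str) -> str:
    """Split rules: each option only accepts its own field names."""
    fields = {"mss": MaxsegField, "sack": SackField}[option]
    try:
        return fields[field_name.upper()].name.lower()
    except KeyError:
        raise ValueError(
            f"'{field_name}' is not a field of tcp option {option}"
        ) from None

# With a shared rule, 'left' (value 2) is silently read as 'size'.
shared = resolve_shared(SackField.LEFT)
```

With the split lookup, `resolve_split("mss", "left")` raises an error instead of being accepted as a size match.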
* scanner: add tcp flex scope [Florian Westphal, 2021-12-01, 2 files, -11/+17]
|   This moves tcp options not used anywhere else (e.g. in synproxy)
|   to a distinct scope.
|
|   This will also make it possible to avoid exposing new option keywords
|   in the ruleset context.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopt: remove KIND keyword [Florian Westphal, 2021-12-01, 2 files, -4/+1]
|   tcp option <foo> kind ... never makes any sense, as "tcp option <foo>"
|   already tells the kernel to look for the foo <kind>.
|
|   "tcp option sack kind 5" matches if the sack option is present; it's
|   a more complicated form of the simpler "tcp option sack exists".
|
|   "tcp option sack kind 1" (or any other value than 5) will never match.
|
|   So remove this.
|
|   Test cases are converted to "exists".
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink_delinearize: binop: make accesses to expr->left/right conditional [Florian Westphal, 2021-12-01, 1 file, -19/+31]
|   This function can be called for different expression types, including
|   some (EXPR_MAP) where expr->left/right alias to different member
|   variables.
|
|   This makes accesses to those members conditional by checking the
|   expression type ahead of the access.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink_delinearize: rename misleading variable [Florian Westphal, 2021-12-01, 1 file, -12/+12]
|   relational_binop_postprocess() is called for EXPR_RELATIONAL,
|   so "expr->right" is safe to use.
|
|   But the RHS can be something other than a value.
|   This has been extended to handle other types, so rename to 'right'.
|
|   No code changes intended.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink_delinearize: use correct member type [Florian Westphal, 2021-12-01, 1 file, -1/+1]
|   expr is a map, so this should use expr->map, not expr->left.
|   These fields are aliased, so this would break if that is ever changed.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
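
A rough picture of what "aliased" means here, using a `ctypes` union in Python. The structure layout and the names `BinopFields`, `MapFields` and `ExprUnion` are hypothetical stand-ins; only the idea that `left` and `map` share storage is taken from the message.

```python
import ctypes

# Hypothetical layouts: two views of the same union storage.
class BinopFields(ctypes.Structure):
    _fields_ = [("left", ctypes.c_void_p), ("right", ctypes.c_void_p)]

class MapFields(ctypes.Structure):
    _fields_ = [("map", ctypes.c_void_p), ("mappings", ctypes.c_void_p)]

class ExprUnion(ctypes.Union):
    _fields_ = [("binop", BinopFields), ("mapexpr", MapFields)]

u = ExprUnion()
u.mapexpr.map = 0x1000       # write through the map view
aliased = u.binop.left       # read through the binop view: same bytes
```

Reading `left` only returns the map pointer because both members happen to sit at the same offset. Reordering either structure would break code that relies on this, which is why the commit switches to the member that matches the expression type.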
* cli: save history on ctrl-d with editline [Pablo Neira Ayuso, 2021-11-30, 1 file, -7/+14]
|   Add the missing call to cli_exit() to save the history when ctrl-d is
|   pressed in nft -i.
|
|   Moreover, remove the call to rl_callback_handler_remove() in cli_exit()
|   for the editline cli since it does not call
|   rl_callback_handler_install().
|
|   Fixes: bc2d5f79c2ea ("cli: use plain readline() interface with libedit")
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: Fix for escaped asterisk strings on Big Endian [Phil Sutter, 2021-11-30, 1 file, -40/+17]
|   The original nul-char detection was not functional on Big Endian.
|   Instead, go a simpler route by exporting the string and working on the
|   exported data to check for a nul-char and escape a trailing asterisk if
|   present. With the data export already happening in the caller, fold
|   escaped_string_wildcard_expr_alloc() into it as well.
|
|   Fixes: b851ba4731d9f ("src: add interface wildcard matching")
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
* ct: Fix ct label value parser [Phil Sutter, 2021-11-30, 1 file, -3/+2]
|   Size of array to export the bit value into was eight times too large, so
|   on Big Endian the data written into the data reg was always zero.
|
|   Fixes: 2fcce8b0677b3 ("ct: connlabel matching support")
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
* datatype: Fix size of time_type [Phil Sutter, 2021-11-30, 1 file, -2/+4]
|   Used by 'ct expiration', time_type is supposed to be 32bits. Passing a
|   64bits variable to constant_expr_alloc() causes the value to be always
|   zero on Big Endian.
|
|   Fixes: 0974fa84f162a ("datatype: seperate time parsing/printing from time_type")
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
* meta: Fix hour_type size [Phil Sutter, 2021-11-30, 1 file, -4/+5]
|   In kernel as well as when parsing, hour_type is assumed to be 32bits.
|   Having the struct datatype field set to 64bits breaks Big Endian and so
|   does passing a 64bit value and 32 as length to constant_expr_alloc() as
|   it makes it import the upper 32bits. Fix this by turning 'result' into a
|   uint32_t and introduce a temporary uint64_t just for the call to
|   time_parse() which expects that.
|
|   Fixes: f8f32deda31df ("meta: Introduce new conditions 'time', 'day' and 'hour'")
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
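
The "imports the upper 32bits" effect can be reproduced with plain byte strings. The sketch below is a Python model, not the nftables code: `import_first_bytes` is a hypothetical helper that stores a value as a 64-bit integer and then reads only the first four bytes of that storage, which is roughly what a length-limited import from a wider variable does.

```python
def import_first_bytes(value: int, byteorder: str, length: int = 4) -> int:
    """Store value in 8 bytes, then read back only the first `length` bytes."""
    storage = value.to_bytes(8, byteorder)
    return int.from_bytes(storage[:length], byteorder)

seconds = 3600  # made-up parsed value

# Little Endian: the low-order bytes come first, so the value survives.
little = import_first_bytes(seconds, "little")

# Big Endian: the high-order bytes come first, so only zeros are read.
big = import_first_bytes(seconds, "big")
```

Here `little` keeps the value while `big` is zero, matching the "always zero on Big Endian" symptom in this and the neighbouring fixes. Making the variable exactly as wide as the import length removes the difference.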
* meta: Fix {g,u}id_type on Big Endian [Phil Sutter, 2021-11-30, 1 file, -6/+10]
|   Using a 64bit variable to temporarily hold the parsed value works only
|   on Little Endian. uid_t and gid_t (and therefore also pw->pw_uid and
|   gr->gr_gid) are 32bit.
|   To fix this, use uid_t/gid_t for the temporary variable but keep the
|   64bit one for numeric parsing so values exceeding 32bits are still
|   detected.
|
|   Fixes: e0ed4c45d9ad2 ("meta: relax restriction on UID/GID parsing")
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
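
The "parse wide, store narrow" approach described above might look like this in outline. This is a sketch only: `parse_numeric_id` and its error text are hypothetical, and Python integers stand in for the 64-bit temporary.

```python
UINT32_MAX = 0xFFFFFFFF

def parse_numeric_id(text: str) -> int:
    """Parse in a wide integer so overflow is visible, then narrow to 32 bits."""
    wide = int(text, 10)            # stands in for the 64-bit temporary
    if wide < 0 or wide > UINT32_MAX:
        raise ValueError("id does not fit in 32 bits")
    return wide & UINT32_MAX        # value that would be stored as uid_t/gid_t
```

Parsing straight into a 32-bit variable would silently truncate `4294967296` to `0`. Keeping the wide temporary for the range check is what lets such input be rejected.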
* src: Fix payload statement mask on Big Endian [Phil Sutter, 2021-11-30, 1 file, -2/+2]
|   The mask used to select bits to keep must be exported in the same
|   byteorder as the payload statement itself. Also, the length of the
|   exported data must match the number of bytes extracted earlier.
|
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
* mnl: Fix for missing info in rule dumps [Phil Sutter, 2021-11-30, 1 file, -1/+12]
|   Commit 0e52cab1e64ab improved error reporting by adding the rule's table
|   and chain names to the netlink message directly, prefixed by their
|   location info. This in turn caused netlink dumps of the rule to not
|   contain table and chain name anymore. Fix this by inserting the missing
|   info before dumping and removing it afterwards to not cause duplicated
|   entries in the netlink message.
|
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
* exthdr: Fix for segfault with unknown exthdr [Phil Sutter, 2021-11-30, 1 file, -5/+7]
|   Unknown exthdr type with NFT_EXTHDR_F_PRESENT flag set caused
|   NULL-pointer deref. Fix this by moving the conditional exthdr.desc deref
|   atop the function and use the result in all cases.
|
|   Fixes: e02bd59c4009b ("exthdr: Implement existence check")
|   Signed-off-by: Phil Sutter <phil@nwl.cc>
* exthdr: fix type number saved in udata [Florian Westphal, 2021-11-30, 1 file, -3/+1]
|   This should store the index of the protocol template, but
|   &x[i] - &x[0] is always i, so remove the divide.
|
|   Also add test case.
|
|   Fixes: 01fbc1574b9e ("exthdr: add parse and build userdata interface")
|   Signed-off-by: Florian Westphal <fw@strlen.de>
|   Acked-by: Phil Sutter <phil@nwl.cc>
* cli: remove #include <editline/history.h> [Pablo Neira Ayuso, 2021-11-22, 1 file, -1/+0]
|   This header is not required to compile nftables with editline, so remove
|   it. This unbreaks compilation in several distros which have no symlink
|   from history.h to editline.h.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: different signedness compilation warning [Pablo Neira Ayuso, 2021-11-19, 1 file, -1/+1]
|   mnl.c: In function ‘mnl_batch_talk’:
|   mnl.c:417:17: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘long int’ [-Wsign-compare]
|      if (rcvbufsiz < NFT_MNL_ECHO_RCVBUFF_DEFAULT)
|                    ^
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: do not skip populating anonymous set with -t [Pablo Neira Ayuso, 2021-11-18, 1 file, -4/+7]
|   --terse does not apply to anonymous sets, so add a NFT_CACHE_TERSE bit
|   to skip named sets only.
|
|   Moreover, prioritize specific listing filter over --terse to avoid a
|   bogus:
|
|     netlink: Error: Unknown set '__set0' in lookup expression
|
|   when invoking:
|
|     # nft -ta list set inet filter example
|
|   Extend existing test to improve coverage.
|
|   Fixes: 9628d52e46ac ("cache: disable NFT_CACHE_SETELEM_BIT on --terse listing only")
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* monitor: do not call interval_map_decompose() for concat intervals [Florian Westphal, 2021-11-17, 1 file, -1/+6]
|   Without this, nft monitor will either print garbage or even segfault
|   when encountering a concat set because we pass expr->value to libgmp
|   helpers for concat (non-value) expressions.
|
|   Also, for concat case, we need to call concat_range_aggregate() helper.
|
|   Add a test case for this. Without this patch, it gives:
|
|     tests/monitor/run-tests.sh: line 98: 1163 Segmentation fault (core dumped) $nft -nn -e -f $command_file > $echo_output
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
* parser_json: add raw payload inner header match support [Pablo Neira Ayuso, 2021-11-17, 1 file, -0/+2]
|   Add missing "ih" base raw payload and extend tests/py to cover this new
|   usecase.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: allow for string raw payload base [Pablo Neira Ayuso, 2021-11-16, 2 files, -3/+11]
|   Remove the new 'ih' token and allow the raw payload base to be
|   represented with a string instead.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: filter out rules by chain [Pablo Neira Ayuso, 2021-11-11, 2 files, -45/+81]
|   With an autogenerated ruleset with ~20k chains.
|
|     # time nft list ruleset &> /dev/null
|     real    0m1,712s
|     user    0m1,258s
|     sys     0m0,454s
|
|   Speed up listing of a specific chain:
|
|     # time nft list chain nat MWDG-UGR-234PNG3YBUOTS5QD &> /dev/null
|     real    0m0,542s
|     user    0m0,251s
|     sys     0m0,292s
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: missing family in cache filtering [Pablo Neira Ayuso, 2021-11-11, 1 file, -4/+8]
|   Check family when filtering out listing of tables and sets.
|
|   Fixes: 3f1d3912c3a6 ("cache: filter out tables that are not requested")
|   Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: do not populate cache if it is going to be flushed [Pablo Neira Ayuso, 2021-11-11, 2 files, -5/+77]
|   Skip the set element netlink dump if the set is flushed. This speeds up
|   a set flush + add element operation in a batch file for an existing set.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: move list filter under struct [Pablo Neira Ayuso, 2021-11-11, 1 file, -11/+11]
|   Wrap the table and set fields for list filtering to prepare for the
|   introduction of element filters.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* doc: update ct timeout section with the state names [Florian Westphal, 2021-11-08, 1 file, -1/+1]
|   The docs are too terse and did not have the list of valid timeout
|   states. While at it, adjust the default stream timeout of udp to 120,
|   which is the current kernel default.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: grab reference in set expression evaluation [Pablo Neira Ayuso, 2021-11-08, 1 file, -2/+2]
|   Do not clone the expression when evaluating a set expression; grabbing
|   the reference counter to reuse the object is sufficient.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: clone variable expression if there is more than one reference [Pablo Neira Ayuso, 2021-11-08, 1 file, -1/+10]
|   Clone the expression that defines the variable value if there are
|   multiple references to it in the ruleset. This saves heap memory
|   consumption in case the variable defines a set with a huge number of
|   elements.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
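
One way to picture "clone only if there is more than one reference" is the reference-counting sketch below. The `Expr` class, its `refcnt` field and `take_variable_value` are hypothetical simplifications and do not mirror the actual evaluate.c logic.

```python
import copy

class Expr:
    """Toy expression holding a (possibly huge) element list."""
    def __init__(self, elements):
        self.elements = list(elements)
        self.refcnt = 1

def take_variable_value(expr: Expr) -> Expr:
    """Return an expression the caller may own and modify.

    A deep copy is only paid for when another user still holds a reference.
    """
    if expr.refcnt > 1:
        expr.refcnt -= 1                            # give up our shared reference
        return Expr(copy.deepcopy(expr.elements))   # private copy, refcnt 1
    return expr                                     # sole user: reuse as-is
```

For a variable that is referenced once, the large element list is never duplicated. That is the memory saving the message refers to.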
* mnl: do not build nftnl_set element list [Pablo Neira Ayuso, 2021-11-08, 2 files, -25/+91]
|   Do not call alloc_setelem_cache() to build the set element list in
|   nftnl_set. Instead, translate one single set element expression to
|   nftnl_set_elem object at a time and use this object to build the netlink
|   header.
|
|   Using a huge test set containing a 1.1 million element blocklist, this
|   patch reduces userspace memory consumption by 40%.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: raw payload match and mangle on inner header / payload data [Pablo Neira Ayuso, 2021-11-08, 5 files, -2/+11]
|   This patch adds support to match on inner header / payload data:
|
|     # nft add rule x y @ih,32,32 0x14000000 counter
|
|   You can also mangle payload data:
|
|     # nft add rule x y @ih,32,32 set 0x14000000 counter
|
|   This update triggers a checksum update at the layer 4 header via
|   csum_flags; mangling odd bytes is also aligned to 16-bits.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
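
As a reading aid for `@ih,32,32 0x14000000`: a raw payload expression names a base, an offset in bits and a length in bits. The Python sketch below models only the comparison, for byte-aligned fields; `raw_payload_match` and the sample bytes are made up for illustration.

```python
def raw_payload_match(data: bytes, offset_bits: int, length_bits: int,
                      value: int) -> bool:
    """Compare `length_bits` bits starting `offset_bits` into `data` with `value`.

    Only byte-aligned offsets and lengths are handled to keep the sketch short.
    """
    if offset_bits % 8 or length_bits % 8:
        raise ValueError("sketch only handles byte-aligned fields")
    start = offset_bits // 8
    field = data[start:start + length_bits // 8]
    return int.from_bytes(field, "big") == value  # network byte order

# Made-up bytes standing in for the data after the transport header.
inner = bytes.fromhex("0000000014000000deadbeef")
matched = raw_payload_match(inner, 32, 32, 0x14000000)
```

Offset 32 skips the first four bytes, and length 32 selects the next four (`14 00 00 00`), which equal the constant in the rule.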
* datatype: add xinteger_type alias to print in hexadecimal [Pablo Neira Ayuso, 2021-11-03, 2 files, -1/+17]
|   Add an alias of the integer type to print raw payload expressions in
|   hexadecimal.
|
|   Update tests/py.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: postpone transport protocol match check after nat expression evaluation [Pablo Neira Ayuso, 2021-11-03, 1 file, -6/+7]
|   Fix bogus error report when using transport protocol as map key.
|
|   Fixes: 50780456a01a ("evaluate: check for missing transport protocol match in nat map with concatenations")
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: extend limit syntax [Jeremy Sowden, 2021-11-03, 1 file, -0/+5]
|   The documentation describes the syntax of limit statements thus:
|
|     limit rate [over] packet_number / TIME_UNIT [burst packet_number packets]
|     limit rate [over] byte_number BYTE_UNIT / TIME_UNIT [burst byte_number BYTE_UNIT]
|
|     TIME_UNIT := second | minute | hour | day
|     BYTE_UNIT := bytes | kbytes | mbytes
|
|   From this one might infer that a limit may be specified by any of the
|   following:
|
|     limit rate 1048576/second
|     limit rate 1048576 mbytes/second
|
|     limit rate 1048576 / second
|     limit rate 1048576 mbytes / second
|
|   However, the last does not currently parse:
|
|     $ sudo /usr/sbin/nft add filter input limit rate 1048576 mbytes / second
|     Error: wrong rate format
|     add filter input limit rate 1048576 mbytes / second
|                      ^^^^^^^^^^^^^^^^^^^^^^^^^
|
|   Extend the `limit_rate_bytes` parser rule to support it, and add some
|   new Python test-cases.
|
|   Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
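
The four spellings listed in this message can be checked against a small regular expression. This is a stand-alone Python approximation of the documented grammar (without the burst clause), not the bison rule from the patch; `LIMIT_RATE` and `parse_limit_rate` are hypothetical names.

```python
import re

# rate [over] N [BYTE_UNIT] / TIME_UNIT, with optional spaces around '/'
LIMIT_RATE = re.compile(
    r"^limit rate (?:(?P<over>over) )?"
    r"(?P<count>\d+)"
    r"(?: (?P<unit>bytes|kbytes|mbytes))?"
    r" ?/ ?"
    r"(?P<time>second|minute|hour|day)$"
)

def parse_limit_rate(text: str):
    """Return (count, unit, time_unit); unit is 'packets' when none is given."""
    m = LIMIT_RATE.match(text)
    if m is None:
        raise ValueError("wrong rate format")
    return int(m["count"]), m["unit"] or "packets", m["time"]

forms = [
    "limit rate 1048576/second",
    "limit rate 1048576 mbytes/second",
    "limit rate 1048576 / second",
    "limit rate 1048576 mbytes / second",
]
parsed = [parse_limit_rate(f) for f in forms]
```

All four forms parse here, including the last one, which is the form the commit adds to the real grammar.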
* parser: add `limit_rate_pkts` and `limit_rate_bytes` rules [Jeremy Sowden, 2021-11-03, 1 file, -62/+59]
|   Factor the `N / time-unit` and `N byte-unit / time-unit` expressions
|   from limit expressions out into separate `limit_rate_pkts` and
|   `limit_rate_bytes` rules respectively.
|
|   Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: add new `limit_bytes` rule [Jeremy Sowden, 2021-11-03, 1 file, -6/+9]
|   Refactor the `N byte-unit` expression out of the `limit_bytes_burst`
|   rule into a separate `limit_bytes` rule.
|
|   Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Support netdev egress hook [Lukas Wunner, 2021-10-28, 2 files, -0/+5]
|   Add userspace support for the netdev egress hook which is queued up
|   for v5.16-rc1, complete with documentation and tests. Usage is
|   identical to the ingress hook.
|
|   Signed-off-by: Lukas Wunner <lukas@wunner.de>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: disable NFT_CACHE_SETELEM_BIT on --terse listing only [Pablo Neira Ayuso, 2021-10-28, 1 file, -2/+2]
|   Instead of NFT_CACHE_SETELEM which also disables set dump.
|
|   Fixes: 6bcd0d576a60 ("cache: unset NFT_CACHE_SETELEM with --terse listing")
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
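
The difference between clearing a single bit and clearing a composite mask is easy to see with toy flag values. The constants below are hypothetical stand-ins for the NFT_CACHE_* flags; their numeric values and the exact composition of the composite mask are assumptions.

```python
# Hypothetical bit assignments.
CACHE_TABLE_BIT = 1 << 0
CACHE_SET_BIT = 1 << 1
CACHE_SETELEM_BIT = 1 << 2

# Composite mask: everything needed to reach set elements.
CACHE_SETELEM = CACHE_TABLE_BIT | CACHE_SET_BIT | CACHE_SETELEM_BIT

flags = CACHE_SETELEM

# Clearing the composite mask also drops the table and set bits.
cleared_mask = flags & ~CACHE_SETELEM

# Clearing only the element bit keeps tables and sets in the cache.
cleared_bit = flags & ~CACHE_SETELEM_BIT
```

With `cleared_mask` nothing is left to dump, whereas `cleared_bit` still lists the sets and only omits their elements, which is what --terse is meant to do.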
* cache: ensure evaluate_cache_list flags are set correctly [Chris Arges, 2021-10-27, 1 file, -0/+1]
|   This change ensures that the terse flag is maintained when listing
|   rulesets with it.
|
|   Fixes: 6bcd0d576a60 ("cache: unset NFT_CACHE_SETELEM with --terse listing")
|   Signed-off-by: Chris Arges <carges@cloudflare.com>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: honor table in set filtering [Pablo Neira Ayuso, 2021-10-27, 1 file, -1/+2]
|   Check for a table mismatch, in case the same set name is used in
|   different tables.
|
|   Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: honor filter in set listing commands [Pablo Neira Ayuso, 2021-10-27, 1 file, -0/+2]
|   Fetch table, set and set elements only for set listing commands, e.g.
|   nft list set inet filter ipv4_bogons.
|
|   Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: always set on NFT_CACHE_REFRESH for listing [Pablo Neira Ayuso, 2021-10-27, 1 file, -6/+7]
|   This flag forces a refresh of the cache on list commands. Several
|   object types are missing this flag; setting it fixes nft --interactive
|   mode.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* main: _exit() if setuid [Florian Westphal, 2021-10-19, 1 file, -0/+4]
|   Apparently some people think it's a good idea to make nft setuid so
|   unprivileged users can change settings.
|
|   "nft -f /etc/shadow" is just one example of why this is a bad idea.
|   Disable this. Do not print anything, fd cannot be trusted.
|
|   This change intentionally doesn't affect libnftables, on the off-chance
|   that somebody creates an suid program and knows what they're doing.
|
|   Signed-off-by: Florian Westphal <fw@strlen.de>
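
A setuid check of this kind usually compares the real and effective user IDs. The sketch below shows the idea in Python; the helper names and the exit status are assumptions and are not taken from the patch.

```python
import os

def running_setuid(real_uid: int, effective_uid: int) -> bool:
    """A process is running setuid when its effective UID differs from its real UID."""
    return real_uid != effective_uid

def refuse_if_setuid() -> None:
    """Exit immediately, without writing to any file descriptor."""
    if running_setuid(os.getuid(), os.geteuid()):
        # os._exit() skips stdio flushing and cleanup handlers; the status
        # value is arbitrary for this sketch.
        os._exit(1)
```

Exiting without a message follows the reasoning in the commit text: in a setuid context the inherited file descriptors are controlled by the caller and cannot be trusted.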
* rule: replace three conditionals with one [Jeremy Sowden, 2021-10-12, 1 file, -5/+3]
|   When outputting set definitions, merge three consecutive
|   `if (!list_empty(&set->stmt_list))` conditionals.
|
|   Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: fix stateless output after listing sets containing counters [Jeremy Sowden, 2021-10-12, 1 file, -1/+3]
|   Before outputting counters in set definitions the
|   `NFT_CTX_OUTPUT_STATELESS` flag was set to suppress output of the
|   counter state and unconditionally cleared afterwards, regardless of
|   whether it had been originally set. Record the original set of flags
|   and restore it.
|
|   Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994273
|   Fixes: 6d80e0f15492 ("src: support for counter in set definition")
|   Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
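
The bug pattern and its fix can be reduced to a few lines. In this Python sketch the `Ctx` class, the flag values and both functions are hypothetical; only the "set, then unconditionally clear" versus "save, then restore" contrast comes from the message.

```python
STATELESS = 1 << 0   # hypothetical stand-in for NFT_CTX_OUTPUT_STATELESS
OTHER = 1 << 1       # hypothetical unrelated output flag

class Ctx:
    def __init__(self, flags: int):
        self.flags = flags

def print_set_buggy(ctx: Ctx) -> None:
    ctx.flags |= STATELESS
    # ... set definition would be printed here ...
    ctx.flags &= ~STATELESS   # clears the flag even if the caller had set it

def print_set_fixed(ctx: Ctx) -> None:
    saved = ctx.flags         # remember what the caller asked for
    ctx.flags |= STATELESS
    # ... set definition would be printed here ...
    ctx.flags = saved         # restore exactly the original flags

before = Ctx(STATELESS | OTHER)
print_set_buggy(before)       # STATELESS is lost after the call

after = Ctx(STATELESS | OTHER)
print_set_fixed(after)        # flags are unchanged after the call
```

With the buggy variant, anything printed after the first set with counters would include state even though the user asked for stateless output.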
* rule: remove fake stateless output of named counters [Jeremy Sowden, 2021-10-12, 1 file, -7/+6]
|   When `-s` is passed, no state is output for named quotas and counter and
|   quota rules, but fake zero state is output for named counters. Remove
|   the output of named counters to match the remaining stateful objects.
|
|   Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: unset NFT_CACHE_SETELEM with --terse listing [Pablo Neira Ayuso, 2021-10-02, 1 file, -3/+12]
|   Skip populating the set element cache in this case to speed up listing.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: filter out sets and maps that are not requested [Pablo Neira Ayuso, 2021-09-30, 1 file, -2/+19]
|   Do not fetch set content for list commands that specify a set name.
|
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: filter out tables that are not requested [Pablo Neira Ayuso, 2021-09-30, 3 files, -13/+35]
|   Do not fetch table content for list commands that specify a table name,
|   e.g.
|
|     # nft list table filter
|
|   This speeds up listing of a given table by not populating the cache with
|   tables that are not needed.
|
|   - Full ruleset (huge with ~100k lines).
|
|     # sudo nft list ruleset &> /dev/null
|     real    0m3,049s
|     user    0m2,080s
|     sys     0m0,968s
|
|   - Listing per table is now faster:
|
|     # nft list table nat &> /dev/null
|     real    0m1,969s
|     user    0m1,412s
|     sys     0m0,556s
|
|     # nft list table filter &> /dev/null
|     real    0m0,697s
|     user    0m0,478s
|     sys     0m0,220s
|
|   Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1326
|   Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>