path: root/src
Commit message | Author | Age | Files | Lines
...
* scanner: counter: move to own scope | Florian Westphal | 2021-03-24 | 2 | -18/+20
  Move bytes/packets away from initial state.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: add support for scope nesting | Florian Westphal | 2021-03-24 | 1 | -1/+19
  Adding a COUNTER scope introduces parsing errors. Example:

    add rule ... counter ip saddr 1.2.3.4

  This is supposed to be

    COUNTER IP SADDR SYMBOL

  but it will be parsed as

    COUNTER IP STRING SYMBOL

  ... and the rule fails with unknown saddr.

  This is because the IP state change gets popped right after it was pushed. The bison parser invokes the scanner_pop_start_cond() helper via the 'close_scope_counter' rule after it has processed the entire 'counter' rule. But that happens *after* flex has executed the 'IP' rule.

  IOW, the sequence of events is not the expected "COUNTER close_scope_counter IP SADDR SYMBOL close_scope_ip", it is "COUNTER IP close_scope_counter".

  close_scope_counter pops the just-pushed SCANSTATE_IP and returns the scanner to SCANSTATE_COUNTER, so the next input token (saddr) gets parsed as a string, which is then rejected by bison.

  To resolve this, defer the pop operation until the current state is done. scanner_pop_start_cond() already gets the scope that has been completed as an argument, so we can compare it to the active state. If those are not the same, just defer the pop operation until bison reports it is done with the active flex scope.

  This leads to the following sequence of events:
  1. flex switches to SCANSTATE_COUNTER
  2. flex switches to SCANSTATE_IP
  3. bison calls scanner_pop_start_cond(SCANSTATE_COUNTER)
  4. flex remains in SCANSTATE_IP, bison continues
  5. bison calls scanner_pop_start_cond(SCANSTATE_IP) once the entire ip rule has completed: this pops both IP and COUNTER.

  Signed-off-by: Florian Westphal <fw@strlen.de>
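The deferred pop described in this commit can be illustrated with a small standalone model. This is only a sketch: the `cond_stack` structure, the `STATE_*` names and the helper functions are hypothetical stand-ins, not the actual flex start-condition stack or the nftables `scanner_pop_start_cond()` implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner states; the names only mirror the commit message. */
enum scan_state { STATE_INIT, STATE_COUNTER, STATE_IP };

#define STACK_MAX 8

struct cond_stack {
	enum scan_state states[STACK_MAX]; /* states[depth - 1] is the active state */
	size_t depth;
	unsigned int deferred;             /* pops requested while a nested state was active */
};

static void stack_init(struct cond_stack *s)
{
	s->states[0] = STATE_INIT;
	s->depth = 1;
	s->deferred = 0;
}

static enum scan_state stack_active(const struct cond_stack *s)
{
	return s->states[s->depth - 1];
}

/* Scanner side: seeing a keyword such as "counter" or "ip" pushes its state. */
static void stack_push(struct cond_stack *s, enum scan_state st)
{
	assert(s->depth < STACK_MAX);
	s->states[s->depth++] = st;
}

/* Parser side: called once the rule for `done` has been completely parsed. */
static void stack_pop(struct cond_stack *s, enum scan_state done)
{
	if (stack_active(s) != done) {
		/* A nested state is still active: remember the pop for later. */
		s->deferred++;
		return;
	}

	/* Pop the finished state ... */
	if (s->depth > 1)
		s->depth--;

	/* ... and every pop that had to be deferred before it. */
	while (s->deferred > 0) {
		s->deferred--;
		if (s->depth > 1)
			s->depth--;
	}
}
```

Replaying the five steps from the commit message against this model (push COUNTER, push IP, pop COUNTER, pop IP) leaves the active state at IP after the third step and back at the initial state after the fifth.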
* scanner: avoid -fasan heap overflow warnings | Florian Westphal | 2021-03-18 | 1 | -1/+1
  Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: secmark: move to own scope | Florian Westphal | 2021-03-16 | 2 | -10/+12
  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: move until,over,used keywords away from init state | Florian Westphal | 2021-03-16 | 1 | -3/+5
  Only applicable for limit and quota. "ct count" also needs 'over'.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: quota: move to own scope | Florian Westphal | 2021-03-16 | 2 | -12/+14
  ... and move "used" keyword to it.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: limit: move to own scope | Florian Westphal | 2021-03-16 | 2 | -15/+19
  Moves rate and burst out of INITIAL.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: vlan: move to own scope | Florian Westphal | 2021-03-16 | 2 | -5/+9
  ID needs to remain exposed as it's used by ct, icmp, icmp6 and so on.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: remove saddr/daddr from initial state | Florian Westphal | 2021-03-16 | 1 | -2/+4
  This can now be reduced to expressions that can expect saddr/daddr tokens.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: arp: move to own scope | Florian Westphal | 2021-03-16 | 2 | -9/+13
  Allows moving the arp specific tokens out of the INITIAL scope.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: add ether scope | Florian Westphal | 2021-03-16 | 2 | -6/+8
  Just like previous change: useless as-is, but prepares for removal of saddr/daddr from INITIAL scope.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: add fib scope | Florian Westphal | 2021-03-16 | 2 | -2/+4
  Makes no sense as-is because all keywords need to stay in the INITIAL scope.

  This can be changed after all saddr/daddr users have been scoped.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ip6: move to own scope | Florian Westphal | 2021-03-16 | 2 | -13/+17
  Move flowlabel and hoplimit.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ip: move to own scope | Florian Westphal | 2021-03-16 | 2 | -18/+22
  Move the ip option names (rr, lsrr, ...) out of INITIAL scope.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ct: move to own scope | Florian Westphal | 2021-03-16 | 2 | -38/+42
  This allows moving multiple ct specific keywords out of INITIAL scope.

  The next few patches follow the same pattern:
  1. add a scope_close_XXX rule
  2. add a SCANSTATE_XXX & make flex switch to it when encountering XXX keyword
  3. make bison leave SCANSTATE_XXX when it has seen the complete expression.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: move remaining cache functions in rule.c to cache.c | Pablo Neira Ayuso | 2021-03-11 | 2 | -203/+205
  Move all the cache logic to src/cache.c

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* scanner: socket: move to own scope | Florian Westphal | 2021-03-11 | 2 | -5/+8
  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: rt: move to own scope | Florian Westphal | 2021-03-11 | 2 | -6/+10
  classid and nexthop can be moved out of INIT scope. The rest are still needed because they are used by other expressions as well.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ipsec: move to own scope | Florian Westphal | 2021-03-11 | 2 | -9/+13
  ... and hide the ipsec specific tokens from the INITIAL scope.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: queue: move to own scope | Florian Westphal | 2021-03-11 | 2 | -7/+10
  Allows removing 3 queue specific keywords from INITIAL scope.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: introduce start condition stack | Florian Westphal | 2021-03-11 | 2 | -11/+36
  Add a small initial chunk of flex start conditions.

  This starts with two low-hanging fruits, numgen and j/symhash.

  NUMGEN and HASH start conditions are entered from flex when the corresponding expression token is encountered.

  Flex returns to the INIT condition when the bison parser has seen a complete numgen/hash statement.

  This intentionally uses a stack rather than BEGIN() to eventually support nested states.

  The scanner_pop_start_cond() function argument is not used yet, but will need to be used later to deal with nesting.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: remove unused tokens | Florian Westphal | 2021-03-09 | 2 | -12/+0
  Signed-off-by: Florian Westphal <fw@strlen.de>
* nftables: xt: fix misprint in nft_xt_compatible_revision | Pavel Tikhomirov | 2021-03-09 | 1 | -1/+1
  The rev variable is used here instead of opt, obviously by mistake. Please see iptables:nft_compatible_revision() for an example of how it should be.

  This breaks revision compatibility checks completely when reading compat-target rules from the nft utility. That's why nftables can't work on "old" kernels which don't support new revisions. That's a problem for containers.

  E.g.: 0 and 1 are supported but not 2:
  https://git.sw.ru/projects/VZS/repos/vzkernel/browse/net/netfilter/xt_nat.c#111

  Reproduction of the problem on Virtuozzo 7 kernel 3.10.0-1160.11.1.vz7.172.18 in a centos 8 container:

    iptables-nft -t nat -N TEST
    iptables-nft -t nat -A TEST -j DNAT --to-destination 172.19.0.2
    nft list ruleset > nft.ruleset
    nft -f - < nft.ruleset
    #/dev/stdin:19:67-81: Error: Range has zero or negative size
    # meta l4proto tcp tcp dport 81 counter packets 0 bytes 0 dnat to 3.0.0.0-0.0.0.0
    # ^^^^^^^^^^^^^^^

    nft -v
    #nftables v0.9.3 (Topsy)
    iptables-nft -v
    #iptables v1.8.7 (nf_tables)

  Kernel returns ip range in rev 0 format:

    crash> p *((struct nf_nat_ipv4_multi_range_compat *) 0xffff8ca2fabb3068)
    $5 = {
      rangesize = 1,
      range = {{
        flags = 3,
        min_ip = 33559468,
        max_ip = 33559468,

  But nft reads this as rev 2 format (nf_nat_range2) which does not have rangesize, and thus flags 3 is treated as ip 3.0.0.0, which is wrong and can't be restored later.

  (Should probably be the same on Centos 7 kernel 3.10.0-1160.11.1)

  Fixes: fbc0768cb696 ("nftables: xt: don't use hard-coded AF_INET")
  Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
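The misread described in this commit can be reproduced with two simplified layouts. This is a sketch under stated assumptions: `range_rev0` and `range_rev2` are hypothetical, reduced versions of the kernel structures named in the message (the protocol fields are left out), and the result shown assumes a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified rev 0 layout (assumption): a count followed by one range. */
struct range_rev0 {
	uint32_t rangesize;
	uint32_t flags;
	uint32_t min_ip;       /* network byte order */
	uint32_t max_ip;       /* network byte order */
};

/* Simplified rev 2 layout (assumption): no rangesize, addresses follow flags. */
struct range_rev2 {
	uint32_t flags;
	uint8_t  min_addr[16];
	uint8_t  max_addr[16];
};

/* Reinterpret rev 0 bytes as rev 2, which is what a wrong revision check allows. */
static void misread_rev0_as_rev2(const struct range_rev0 *r0, struct range_rev2 *r2)
{
	assert(sizeof(*r0) <= sizeof(*r2));
	memset(r2, 0, sizeof(*r2));
	memcpy(r2, r0, sizeof(*r0));
}

/* Print the first four address bytes as a dotted quad. */
static void format_ipv4(const uint8_t addr[16], char *buf, size_t len)
{
	snprintf(buf, len, "%u.%u.%u.%u",
		 (unsigned)addr[0], (unsigned)addr[1],
		 (unsigned)addr[2], (unsigned)addr[3]);
}
```

With the values from the crash dump (rangesize 1, flags 3), the rev 2 view sees flags 1 and, on a little-endian machine, a minimum address of 3.0.0.0 and a maximum of 0.0.0.0, which matches the bogus range in the error output.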
* mnl: Set NFTNL_SET_DATA_TYPE before dumping set elements | Phil Sutter | 2021-03-09 | 1 | -0/+3
  In combination with libnftnl's commit "set_elem: Fix printing of verdict map elements", this adds the vmap target to netlink dumps. Adjust dumps in tests/py accordingly.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
* parser: compact ct obj list types | Florian Westphal | 2021-03-06 | 1 | -11/+8
  Add new ct_cmd_type and avoid copypaste of the ct cmd_list rules.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: compact map RHS type | Florian Westphal | 2021-03-06 | 1 | -29/+9
  Similar to previous patch, we can avoid duplication.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: squash duplicated spec/specid rules | Florian Westphal | 2021-03-06 | 1 | -44/+38
  No need to have duplicate CMD rules for spec and specid: add and use a common rule for those cases.

  Signed-off-by: Florian Westphal <fw@strlen.de>
* expression: memleak in verdict_expr_parse_udata() | Pablo Neira Ayuso | 2021-03-05 | 1 | -1/+1
  Remove unnecessary verdict_expr_alloc() invocation.

  Fixes: 4ab1e5e60779 ("src: allow use of 'verdict' in typeof definitions")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: memleak list of chain | Pablo Neira Ayuso | 2021-03-05 | 1 | -13/+26
  Release chain list from the error path.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove nft_mnl_socket_reopen() | Pablo Neira Ayuso | 2021-03-05 | 2 | -16/+19
  nft_mnl_socket_reopen() was introduced to deal with the EINTR case. By reopening the netlink socket, pending netlink messages that are part of a stale netlink dump are implicitly dropped.

  This patch replaces the nft_mnl_socket_reopen() strategy by pulling out all of the remaining netlink messages to restart in a clean state.

  This implicitly fixes a bug in the table ownership support, which assumes that the netlink socket remains open until nft_ctx_free() is invoked.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* table: support for the table owner flag | Pablo Neira Ayuso | 2021-03-02 | 5 | -2/+187
  Add new flag to allow a userspace process to own tables: tables that have an owner can only be updated/destroyed by the owner. The table is destroyed either if the owner process calls nft_ctx_free() or if the owner process is terminated (implicit table release).

  The ruleset listing includes the program name that owns the table:

    nft> list ruleset
    table ip x { # progname nft
      flags owner

      chain y {
        type filter hook input priority filter; policy accept;
        counter packets 1 bytes 309
      }
    }

  Original code to pretty print the netlink portID to program name has been extracted from the conntrack userspace utility.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* table: rework flags printing | Pablo Neira Ayuso | 2021-03-02 | 2 | -14/+25
  Simplify routine to print the table flags. Add table_flag_name() and use it from json too.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser: re-enable support for concatentation on map RHS | Florian Westphal | 2021-02-23 | 1 | -0/+9
  "typeof .... : ip saddr . tcp dport" is legal.

  This makes 'testcases/maps/nat_addr_port' pass again.

  Fixes: 4ab1e5e6077918 ("src: allow use of 'verdict' in typeof definitions")
  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: allow use of 'verdict' in typeof definitions | Florian Westphal | 2021-02-22 | 2 | -3/+43
  'verdict' cannot be used as part of a map typeof-based key definition, it's a datatype and not an expression, e.g.:

    typeof iifname . ip protocol . th dport : verdict

  ... will fail.

  Make the parser convert a 'verdict' symbol to a verdict expression and allow storing its presence as part of the typeof key definition.

  Reported-by: Frank Myhr <fmyhr@fhmtech.com>
  Signed-off-by: Florian Westphal <fw@strlen.de>
* main: fix nft --help output fallout from 719e4427 | Štěpán Němec | 2021-02-22 | 1 | -3/+4
  Long options were missing the double dash.

  Fixes: 719e44277f8e ("main: use one data-structure to initialize getopt_long(3) arguments and help.")
  Cc: Jeremy Sowden <jeremy@azazel.net>
  Signed-off-by: Štěpán Němec <snemec@redhat.com>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: init parser state for every new buffer/file | Eric Garver | 2021-02-19 | 1 | -0/+2
  Otherwise invalid error states cause subsequent json parsing to fail when it should not.

  Signed-off-by: Eric Garver <eric@garver.life>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* monitor: Don't print newgen message with JSON output | Phil Sutter | 2021-02-17 | 1 | -0/+3
  Iff this should be printed, it must adhere to output format settings. In its current form it breaks JSON syntax, so skip it for non-default output formats.

  Fixes: cb7e02f44d6a6 ("src: enable json echo output when reading native syntax")
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: set evaluation context for set elements | Florian Westphal | 2021-02-16 | 1 | -2/+9
  This resolves the same issue as the previous patch when such an expression is used as a set key:

    set z {
      typeof ct zone
    - elements = { 1, 512, 768, 1024, 1280, 1536 }
    + elements = { 1, 2, 3, 4, 5, 6 }
    }

  Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: pick data element byte order, not dtype one | Florian Westphal | 2021-02-16 | 1 | -1/+1
  Some expressions have integer base type, not a specific one, e.g. 'ct zone'. In that case nft used the wrong byte order.

  Without this, nft adds

    elements = { "eth0" : 256, "eth1" : 512, "veth4" : 256 }

  instead of 1, 2, 3.

  This is not a 'display bug', the added elements have wrong byte order.

  Signed-off-by: Florian Westphal <fw@strlen.de>
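The numbers in these two byte-order commits are what a 16-bit byte swap produces. The sketch below assumes a 16-bit value (as the 256/512 results suggest) and uses a hypothetical `swap16()` helper; it is not nft's evaluation code.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value, i.e. what is effectively stored when
 * a value is written in the wrong byte order on a little-endian host. */
static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Self-check using values that appear in the commit messages. */
void zone_swap_selfcheck(void)
{
	assert(swap16(1) == 256);
	assert(swap16(2) == 512);
	assert(swap16(256) == 1);   /* the swap is its own inverse */
}
```

Applying `swap16()` to 2 through 6 gives 512, 768, 1024, 1280 and 1536, the values in the set listing that the follow-up patch corrects.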
* evaluate: incorrect usage of stmt_binary_error() in reject | Pablo Neira Ayuso | 2021-02-09 | 1 | -3/+2
  Don't pass ctx->pctx.protocol[PROTO_BASE_LL_HDR] to stmt_binary_error(), it's not useful for the error reporting as location is not available.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* erec: Sanitize erec location indesc | Phil Sutter | 2021-02-09 | 1 | -1/+2
  erec_print() unconditionally dereferences erec->locations->indesc, so make sure it is valid when either creating an erec or adding a location.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
* trace: do not remove icmp type from packet dump | Florian Westphal | 2021-02-08 | 1 | -1/+3
  As of 0.9.8 the icmp type is marked as a protocol field, so it's elided in 'nft monitor trace' output:

    icmp code 0 icmp id 44380 ..

  Restore it. Unlike tcp, where 'tcp sport' et al. in the dump will make the 'ip protocol tcp' redundant, this isn't obvious in the icmp case:

    icmp type 8 code 0 id ...

  Reported-by: Martin Gignac <martin.gignac@gmail.com>
  Fixes: 98b871512c4677 ("src: add auto-dependencies for ipv4 icmp")
  Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add negation match on singleton bitmask value | Pablo Neira Ayuso | 2021-02-05 | 5 | -5/+28
  This patch provides a shortcut for:

    ct status and dnat == 0

  which allows checking for packets whose dnat bit is unset:

    # nft add rule x y ct status ! dnat counter

  This operation is only available for expressions with a bitmask basetype, e.g.

    # nft describe ct status
    ct expression, datatype ct_status (conntrack status) (basetype bitmask, integer), 32 bits

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
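The test that this shortcut expands to, "value and flag == 0", is a plain bitwise check. The sketch below uses hypothetical `STATUS_*` constants and a hypothetical `flag_unset()` helper; the bit positions are illustrative and not taken from the conntrack headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits, for illustration only. */
#define STATUS_SNAT (UINT32_C(1) << 4)
#define STATUS_DNAT (UINT32_C(1) << 5)

/* True when the single bit `flag` is not set in `status`. */
static bool flag_unset(uint32_t status, uint32_t flag)
{
	/* The shortcut applies to singleton bitmask values: exactly one bit. */
	assert(flag != 0 && (flag & (flag - 1)) == 0);
	return (status & flag) == 0;
}
```

A status with only the SNAT bit set passes `flag_unset(status, STATUS_DNAT)`, while one that also carries the DNAT bit does not.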
* evaluate: do not crash if dynamic set has no statements | Florian Westphal | 2021-02-05 | 1 | -4/+6
  list_first_entry() returns garbage when the list is empty. There is no need to run the following loop if we have no statements, so just return 0.

  Signed-off-by: Florian Westphal <fw@strlen.de>
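Why the first entry of an empty list is garbage, and how a guard avoids it, can be shown with a minimal circular list. This is a sketch only: the `list_head` type is reimplemented locally in the style of the kernel list helpers, and `struct stmt`, `first_stmt()` and the other function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel-style list_head. */
struct list_head {
	struct list_head *next, *prev;
};

struct stmt {
	int id;
	struct list_head list;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_append(struct list_head *n, struct list_head *h)
{
	assert(n != NULL && h != NULL);
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Guarded lookup: on an empty list head->next points back at the head itself,
 * so converting it to a struct stmt would yield a bogus pointer. */
static struct stmt *first_stmt(struct list_head *head)
{
	if (list_is_empty(head))
		return NULL;
	return (struct stmt *)((char *)head->next - offsetof(struct stmt, list));
}
```

A caller can return early when `first_stmt()` yields NULL instead of entering a loop over statements that do not exist.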
* json: Do not abbreviate reject statement object | Phil Sutter | 2021-02-03 | 1 | -8/+0
  No need to reduce output size, also this way output is more predictable.

  While being at it, drop some pointless chunks from tests/py/bridge/reject.t.json.output.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
* payload: check icmp dependency before removing previous icmp expression | Florian Westphal | 2021-02-02 | 1 | -21/+42
  nft is too greedy when removing icmp dependencies. 'icmp code 1 type 2' did remove the type when printing.

  Be more careful and check that the icmp type dependency of the candidate expression (earlier icmp payload expression) has the same type dependency as the new expression.

  Reported-by: Eric Garver <eric@garver.life>
  Reported-by: Michael Biebl <biebl@debian.org>
  Tested-by: Eric Garver <eric@garver.life>
  Fixes: d0f3b9eaab8d77e ("payload: auto-remove simple icmp/icmpv6 dependency expressions")
  Signed-off-by: Florian Westphal <fw@strlen.de>
* json: limit: Always include burst value | Phil Sutter | 2021-01-27 | 1 | -7/+5
  The default burst value is non-zero, so JSON output should include it.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
* reject: Unify inet, netdev and bridge delinearization | Phil Sutter | 2021-01-27 | 1 | -20/+4
  Postprocessing for inet family did not attempt to kill any existing payload dependency, although it is perfectly fine to do so. The mere culprit is to not abbreviate default code rejects as that would drop needed protocol info as a side-effect. Since postprocessing is then almost identical to that of bridge and netdev families, merge them.

  While being at it, extend tests/py/netdev/reject.t by a few more tests taken from inet/reject.t so this covers icmpx rejects as well.

  Cc: Jose M. Guisado Gomez <guigom@riseup.net>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* reject: Fix for missing dependencies in netdev family | Phil Sutter | 2021-01-27 | 2 | -1/+3
  Like with bridge family, rejecting with either icmp or icmpv6 must create a dependency match on meta protocol. Upon delinearization, treat netdev reject identical to bridge as well so no family info is lost.

  This makes reject statement in netdev family fully symmetric so fix the tests in tests/py/netdev/reject.t, adjust the related payload dumps and add JSON equivalents which were missing altogether.

  Fixes: 0c42a1f2a0cc5 ("evaluate: add netdev support for reject default")
  Fixes: a51a0bec1f698 ("tests: py: add netdev folder and reject.t icmp cases")
  Cc: Jose M. Guisado Gomez <guigom@riseup.net>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: evaluate: reset context maxlen value before prio evaluation | Florian Westphal | 2021-01-26 | 1 | -2/+2
    unshare -n tests/shell/run-tests.sh tests/shell/testcases/nft-f/0024priority_0
    W: [FAILED] tests/shell/testcases/nft-f/0024priority_0: got 1
    /dev/stdin:8:47-49: Error: Value 100 exceeds valid range 0-15
    type filter hook postrouting priority 100

  Reported-by: Andreas Schultz <andreas.schultz@travelping.com>
  Signed-off-by: Florian Westphal <fw@strlen.de>