path: root/src/evaluate.c
Commit message | Author | Age | Files | Lines
* src: tproxy: relax family restrictions | Florian Westphal | 2018-08-29 | 1 | -17/+13
    The evaluation step currently prohibits 'tproxy ip to 1.2.3.4' in the ip
    family and 'tproxy ip6 to dead::1' in the ip6 family. This seems an
    arbitrary limitation, so just accept it.

    The current restriction would make JSON output support harder than
    needed, as the tproxy expression generated from the JSON path would have
    to special-case the table it is currently in, rather than just using the
    family attribute in the JSON output.

    We still reject the family if it mismatches the table family (e.g., an
    ip address cannot be used in an ip6 table).

    Signed-off-by: Florian Westphal <fw@strlen.de>
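The relaxed rule can be pictured with a small Python sketch. It is not the code in src/evaluate.c: the function name `tproxy_family_ok` and its parameters are hypothetical, and the logic only restates what this message and the original tproxy commit message say (an explicit family must match the table family, and an address in an inet table needs an explicit family).

```python
from typing import Optional


def tproxy_family_ok(table_family: str,
                     stmt_family: Optional[str],
                     has_address: bool) -> bool:
    """Hypothetical restatement of the tproxy family rules (illustration only).

    table_family: family of the enclosing table ("ip", "ip6", "inet", ...)
    stmt_family:  family written in the statement ("ip", "ip6") or None
    has_address:  True if the statement carries an address, not only a port
    """
    if table_family not in ("ip", "ip6", "inet"):
        # tproxy is described as supported in ip, ip6 and inet tables only.
        return False
    if table_family == "inet":
        # In inet tables an address requires an explicit family.
        if has_address:
            return stmt_family in ("ip", "ip6")
        return True
    # ip / ip6 tables: an explicit family is accepted when it matches the
    # table family and rejected when it does not.
    return stmt_family is None or stmt_family == table_family
```

Under these assumptions, `tproxy ip to 1.2.3.4` in an ip table passes, while the same statement in an ip6 table is still rejected.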
* src: Make invalid chain priority error more specific | Máté Eckl | 2018-08-24 | 1 | -5/+6
    So far, if an invalid priority name was specified, the error message
    referred to the whole chain/flowtable specification:

      nft> add chain ip x h { type filter hook prerouting priority first; }
      Error: 'first' is invalid priority in this context.
      add chain ip x h { type filter hook prerouting priority first; }
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    With this patch, the reference is made specific to the priority
    specification:

      nft> add chain ip x h { type filter hook prerouting priority first; }
      Error: 'first' is invalid priority in this context.
      add chain ip x h { type filter hook prerouting priority first; }
                                                     ^^^^^^^^^^^^^^

    `prio_spec` is also reused to keep naming intuitive. The parser section
    formerly named `prio_spec` is renamed to `int_num`, as it basically
    provides the mathematical set of integer numbers.

    Signed-off-by: Máté Eckl <ecklm94@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: integrate stateful expressions into sets and maps | Pablo Neira Ayuso | 2018-08-24 | 1 | -0/+14
    The following example shows how to populate a set from the packet path
    using the destination IP address; each entry has its own counter. An
    entry expires after the 1 hour timeout if no packets matching it are
    seen.

      table ip x {
          set xyz {
              type ipv4_addr
              size 65535
              flags dynamic,timeout
              timeout 1h
          }

          chain y {
              type filter hook output priority filter; policy accept;
              update @xyz { ip daddr counter } counter
          }
      }

    A similar example creates a mapping between IP address and mark, where
    the mark is assigned using an incremental sequence generator from 0 to 1
    inclusive.

      table ip x {
          map xyz {
              type ipv4_addr : mark
              size 65535
              flags dynamic,timeout
              timeout 1h
          }

          chain y {
              type filter hook input priority filter; policy accept;
              update @xyz { ip saddr counter : numgen inc mod 2 }
          }
      }

    Supported stateful statements are: limit, quota, counter and connlimit.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: simplify map statement | Pablo Neira Ayuso | 2018-08-24 | 1 | -1/+24
    Instead of using the map expression, store the dynamic key and data
    separately, since they need different handling than constant maps.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: reject: Allow icmpx in inet/bridge families | Phil Sutter | 2018-08-14 | 1 | -6/+1
    Commit 3e6ab2b335142 added restraints on reject types for bridge and
    inet families, but apparently those were too strict: if a rule in e.g.
    the inet family contained a match which introduced a protocol
    dependency, icmpx type rejects were disallowed for no obvious reason.

    Allow icmpx type rejects in the inet family regardless of protocol
    dependency, since we either have IPv4 or IPv6 traffic in there and for
    both icmpx is fine.

    Merge restraints in the bridge family with those for TCP reset, since it
    already does what is needed, namely checking that ether proto is either
    IPv4 or IPv6.

    Fixes: 3e6ab2b335142 ("evaluate: reject: check in bridge and inet the network context in reject")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Set/print standard chain prios with textual names | Máté Eckl | 2018-08-14 | 1 | -0/+27
    This patch adds the possibility to use textual names to set the chain
    priority to standard values, so that numeric values no longer need to be
    learnt for basic usage. Basic arithmetic can also be done with them to
    ease the addition of relatively higher/lower priority chains. Addition
    and subtraction are possible. Values are also printed with their
    friendly name within the range of <basicprio> +- 10.

    Numeric printing is also supported with the -nnn option
    (numeric == NFT_NUMERIC_ALL).

    The supported name-value pairs and where they are valid are based on how
    x_tables use these values when registering their base chains. (See
    iptables/nft.c in the iptables repository.)

    Also see the compatibility matrices extracted from the man page:

    Standard priority names, family and hook compatibility matrix
    ┌──────────┬───────┬────────────────────────────┬─────────────┐
    │ Name     │ Value │ Families                   │ Hooks       │
    ├──────────┼───────┼────────────────────────────┼─────────────┤
    │ raw      │ -300  │ ip, ip6, inet              │ all         │
    │ mangle   │ -150  │ ip, ip6, inet              │ all         │
    │ dstnat   │ -100  │ ip, ip6, inet              │ prerouting  │
    │ filter   │ 0     │ ip, ip6, inet, arp, netdev │ all         │
    │ security │ 50    │ ip, ip6, inet              │ all         │
    │ srcnat   │ 100   │ ip, ip6, inet              │ postrouting │
    └──────────┴───────┴────────────────────────────┴─────────────┘

    Standard priority names and hook compatibility for the bridge family
    ┌────────┬───────┬─────────────┐
    │ Name   │ Value │ Hooks       │
    ├────────┼───────┼─────────────┤
    │ dstnat │ -300  │ prerouting  │
    │ filter │ -200  │ all         │
    │ out    │ 100   │ output      │
    │ srcnat │ 300   │ postrouting │
    └────────┴───────┴─────────────┘

    This can also be applied to flowtables, where it works as a netdev
    family chain.

    Example:

      nft> add table ip x
      nft> add chain ip x y { type filter hook prerouting priority raw; }
      nft> add chain ip x z { type filter hook prerouting priority mangle + 1; }
      nft> add chain ip x w { type filter hook prerouting priority dstnat - 5; }
      nft> add chain ip x r { type filter hook prerouting priority filter + 10; }
      nft> add chain ip x t { type filter hook prerouting priority security; }
      nft> add chain ip x q { type filter hook postrouting priority srcnat + 11; }
      nft> add chain ip x h { type filter hook prerouting priority 15; }
      nft>
      nft> add flowtable ip x y { hook ingress priority filter + 5 ; devices = {enp0s31f6}; }
      nft>
      nft> add table arp x
      nft> add chain arp x y { type filter hook input priority filter + 5; }
      nft>
      nft> add table bridge x
      nft> add chain bridge x y { type filter hook input priority filter + 9; }
      nft> add chain bridge x z { type filter hook prerouting priority dstnat; }
      nft> add chain bridge x q { type filter hook postrouting priority srcnat; }
      nft> add chain bridge x k { type filter hook output priority out; }
      nft>
      nft> list ruleset
      table ip x {
          flowtable y {
              hook ingress priority filter + 5
              devices = { enp0s31f6 }
          }

          chain y {
              type filter hook prerouting priority raw; policy accept;
          }

          chain z {
              type filter hook prerouting priority mangle + 1; policy accept;
          }

          chain w {
              type filter hook prerouting priority dstnat - 5; policy accept;
          }

          chain r {
              type filter hook prerouting priority filter + 10; policy accept;
          }

          chain t {
              type filter hook prerouting priority security; policy accept;
          }

          chain q {
              type filter hook postrouting priority 111; policy accept;
          }

          chain h {
              type filter hook prerouting priority 15; policy accept;
          }
      }
      table arp x {
          chain y {
              type filter hook input priority filter + 5; policy accept;
          }
      }
      table bridge x {
          chain y {
              type filter hook input priority filter + 9; policy accept;
          }

          chain z {
              type filter hook prerouting priority dstnat; policy accept;
          }

          chain q {
              type filter hook postrouting priority srcnat; policy accept;
          }

          chain k {
              type filter hook output priority out; policy accept;
          }
      }
      nft> # Everything should fail after this
      nft> add chain ip x h { type filter hook prerouting priority first; }
      Error: 'first' is invalid priority in this context.
      add chain ip x h { type filter hook prerouting priority first; }
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      nft> add chain ip x q { type filter hook prerouting priority srcnat + 11; }
      Error: 'srcnat' is invalid priority in this context.
      add chain ip x q { type filter hook prerouting priority srcnat + 11; }
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      nft> add chain arp x y { type filter hook input priority raw; }
      Error: 'raw' is invalid priority in this context.
      add chain arp x y { type filter hook input priority raw; }
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      nft> add flowtable ip x y { hook ingress priority magle; devices = {enp0s31f6}; }
      Error: 'magle' is invalid priority.
      add flowtable ip x y { hook ingress priority magle; devices = {enp0s31f6}; }
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      nft> add chain bridge x r { type filter hook postrouting priority dstnat; }
      Error: 'dstnat' is invalid priority in this context.
      add chain bridge x r { type filter hook postrouting priority dstnat; }
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      nft> add chain bridge x t { type filter hook prerouting priority srcnat; }
      Error: 'srcnat' is invalid priority in this context.
      add chain bridge x t { type filter hook prerouting priority srcnat; }
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Signed-off-by: Máté Eckl <ecklm94@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
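The name lookup and the "+- 10" printing rule from this message can be sketched in Python. This is only an illustration built from the two matrices in the commit message, not the implementation in nft; the names `resolve_priority`, `priority_to_text` and the table constants are hypothetical.

```python
# name -> (value, hooks where the name is valid; None means all hooks)
IP_LIKE = {
    "raw": (-300, None),
    "mangle": (-150, None),
    "dstnat": (-100, {"prerouting"}),
    "filter": (0, None),
    "security": (50, None),
    "srcnat": (100, {"postrouting"}),
}
ARP_NETDEV = {"filter": (0, None)}
BRIDGE = {
    "dstnat": (-300, {"prerouting"}),
    "filter": (-200, None),
    "out": (100, {"output"}),
    "srcnat": (300, {"postrouting"}),
}


def _names_for(family):
    """Pick the name table for a family, following the two matrices."""
    if family in ("ip", "ip6", "inet"):
        return IP_LIKE
    if family in ("arp", "netdev"):
        return ARP_NETDEV
    if family == "bridge":
        return BRIDGE
    raise ValueError(f"unknown family {family!r}")


def resolve_priority(family, hook, name, offset=0):
    """Turn 'name + offset' into a number, or raise ValueError if invalid."""
    names = _names_for(family)
    if name not in names:
        raise ValueError(f"'{name}' is invalid priority in this context.")
    value, hooks = names[name]
    if hooks is not None and hook not in hooks:
        raise ValueError(f"'{name}' is invalid priority in this context.")
    return value + offset


def priority_to_text(family, hook, value):
    """Print a friendly name when value is within 10 of a valid base name."""
    for name, (base, hooks) in _names_for(family).items():
        if hooks is not None and hook not in hooks:
            continue
        delta = value - base
        if delta == 0:
            return name
        if abs(delta) <= 10:
            sign = "+" if delta > 0 else "-"
            return f"{name} {sign} {abs(delta)}"
    return str(value)
```

With these tables, `srcnat + 11` in a postrouting chain resolves to 111 and falls outside the printing window, which is consistent with chain q being listed as `priority 111` in the session above.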
* src: introduce passive OS fingerprint matching | Fernando Fernandez Mancera | 2018-08-04 | 1 | -0/+7
    Add support for "osf" expression. Example:

      table ip foo {
          chain bar {
              type filter hook input priority 0; policy accept;
              osf name "Linux" counter packets 3 bytes 132
          }
      }

    Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Expose socket mark via socket expression | Máté Eckl | 2018-08-03 | 1 | -1/+5
    This can be used like ct mark or meta mark, except that it cannot be
    set. Documentation and tests are included.

    Signed-off-by: Máté Eckl <ecklm94@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Add tproxy support | Máté Eckl | 2018-08-03 | 1 | -0/+82
    This patch adds support for transparent proxy functionality, which is
    supported in ip, ip6 and inet tables.

    The syntax is the following:

      tproxy [{|ip|ip6}] to {<ip address>|:<port>|<ip address>:<port>}

    It looks for a socket listening on the specified address or port and
    assigns it to the matching packet. In an inet table, a packet matches
    for both families until an address is specified. The network protocol
    family has to be specified **only** in inet tables, and only if an
    address is specified.

    As transparent proxy support is implemented for sockets with layer 4
    information, a transport protocol header criterion has to be set in the
    same rule, e.g. 'meta l4proto tcp' or 'udp dport 4444'.

    Example ruleset:

      table ip x {
          chain y {
              type filter hook prerouting priority -150; policy accept;
              tcp dport ntp tproxy to 1.1.1.1
              udp dport ssh tproxy to :2222
          }
      }
      table ip6 x {
          chain y {
              type filter hook prerouting priority -150; policy accept;
              tcp dport ntp tproxy to [dead::beef]
              udp dport ssh tproxy to :2222
          }
      }
      table inet x {
          chain y {
              type filter hook prerouting priority -150; policy accept;
              tcp dport 321 tproxy to :ssh
              tcp dport 99 tproxy ip to 1.1.1.1:999
              udp dport 155 tproxy ip6 to [dead::beef]:smux
          }
      }

    Signed-off-by: Máté Eckl <ecklm94@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: skip evaluation of datatype concatenations | Pablo Neira Ayuso | 2018-07-07 | 1 | -4/+5
    These are not really expressions, so there is no value in place. The
    expr_evaluate_concat() function is called from set_evaluate() to
    calculate the total length of the tuple.

    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1265
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: do not reset generation ID on ruleset flush | Pablo Neira Ayuso | 2018-06-07 | 1 | -1/+2
    If the 'flush ruleset' command is run, release the cache but still keep
    the generation ID around. Hence, follow-up calls to cache_update() will
    assume that the cache is up to date and will not perform a netlink dump.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Introduce socket matching | Máté Eckl | 2018-06-06 | 1 | -0/+9
    For now it can only match sockets with the IP(V6)_TRANSPARENT socket
    option set. Example:

      table inet sockin {
          chain sockchain {
              type filter hook prerouting priority -150; policy accept;
              socket transparent 1 mark set 0x00000001 nftrace set 1 counter packets 9 bytes 504 accept
          }
      }

    Signed-off-by: Máté Eckl <ecklm94@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expr: extend fwd statement to support address and family | Pablo Neira Ayuso | 2018-06-06 | 1 | -3/+24
    Allow forwarding packets to an explicit destination address through a
    given interface.

      nft add rule netdev x y fwd ip to 192.168.2.200 device eth0

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: connlimit support | Pablo Neira Ayuso | 2018-06-06 | 1 | -0/+1
    This patch adds support for the new connlimit stateful expression, which
    provides a mapping with the connlimit iptables extension through meters,
    e.g.

      nft add rule filter input tcp dport 22 \
          meter test { ip saddr ct count over 2 } counter reject

    This limits incoming SSH connections to at most 2 simultaneous
    connections per source address.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nat: Eliminate misuse of AF_* | Máté Eckl | 2018-06-06 | 1 | -3/+3
    Although the value of AF_INET and NFPROTO_IPV4 is the same, the use of
    AF_INET was misleading when checking the proto family. Same with
    AF_INET6.

    Signed-off-by: Máté Eckl <ecklm94@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: explicitly deny concatenated types in interval sets | Phil Sutter | 2018-06-06 | 1 | -0/+4
    Previously, this triggered a program abort:

    | # nft add table ip t
    | # nft add set ip t my_set '{ type ipv4_addr . inet_service ; flags interval ; }'
    | # nft add element ip t my_set '{10.0.0.1 . tcp }'
    | BUG: invalid range expression type concat
    | nft: expression.c:1085: range_expr_value_low: Assertion `0' failed.

    With this patch in place, the 'add set' command above gives an error
    message:

    | # nft add set ip t my_set3 '{ type ipv4_addr . inet_service ; flags interval ; }'
    | Error: concatenated types not supported in interval sets
    | add set ip t my_set3 { type ipv4_addr . inet_service ; flags interval ; }
    |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* log: Add support for audit logging | Phil Sutter | 2018-06-03 | 1 | -0/+4
    This is implemented via a pseudo log level. The kernel ignores any other
    parameter, so reject those at evaluation stage. Audit logging is
    therefore simply a matter of:

    | log level audit

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: Return ENOENT if rule index is too large | Phil Sutter | 2018-05-11 | 1 | -1/+1
    Since EINVAL usually indicates errors from kernel, avoid using it here
    too. Instead return ENOENT to indicate there's no entry to append or
    prepend the rule to.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Support 'add/insert rule index <IDX>' | Phil Sutter | 2018-05-09 | 1 | -0/+45
    Allow specifying an absolute rule position in add/insert commands, like
    with iptables. The translation to a rule handle takes place in
    userspace, so no kernel support for this is needed.

    Possible undesired effects are pointed out in the man page to make users
    aware that this way of specifying a rule location might not be ideal.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
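The userspace translation from an index to a rule handle can be pictured as a lookup in the cached, ordered list of a chain's rule handles. The sketch below is an assumption-laden illustration, not the code added by this commit: the function name `index_to_handle` is hypothetical, the index is assumed to be zero-based, and the ENOENT error follows the later "Return ENOENT if rule index is too large" change listed in this log.

```python
import errno
import os


def index_to_handle(chain_handles, index):
    """Map an absolute rule position to the handle of the rule found there.

    chain_handles: rule handles of one chain, in ruleset order (hypothetical
                   stand-in for what the rule cache would provide)
    index:         zero-based position given by the user (assumption)
    """
    if index < 0 or index >= len(chain_handles):
        # No rule at that position, so there is nothing to append or
        # prepend the new rule to.
        raise OSError(errno.ENOENT, os.strerror(errno.ENOENT))
    return chain_handles[index]
```

Because handles are looked up once in userspace, a concurrent ruleset change between lookup and commit could shift positions, which is one reason the man page warns that index-based placement might not be ideal.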
* src: use location to display error messages | Pablo Neira Ayuso | 2018-05-06 | 1 | -62/+94
      # nft add chain foo bar
      Error: Could not process rule: No such file or directory
      add chain foo bar
                ^^^

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add obj_spec | Pablo Neira Ayuso | 2018-05-06 | 1 | -2/+2
    Store location object in handle to improve error reporting.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set_spec | Pablo Neira Ayuso | 2018-05-06 | 1 | -18/+18
    Store location object in handle to improve error reporting.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add chain_spec | Pablo Neira Ayuso | 2018-05-06 | 1 | -2/+2
    Store location object in handle to improve error reporting.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add table_spec | Pablo Neira Ayuso | 2018-05-06 | 1 | -21/+21
    Store location object in handle to improve error reporting.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meter: enforce presence of a max size | Florian Westphal | 2018-05-02 | 1 | -0/+1
    Meters are updated dynamically, so we don't know in advance how large
    this structure can be.

    Add a 'size' keyword to specify an upper limit and update the old syntax
    to assume a default max value of 65535.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: missing flowtable evaluation from nested notation | Pablo Neira Ayuso | 2018-04-26 | 1 | -0/+7
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix --debug mnl not producing output | Duncan Roe | 2018-04-26 | 1 | -15/+15
    cache_update() needs to accept the full debug mask instead of a boolean
    of NFT_DEBUG_NETLINK, because called functions may wish to check other
    bits (NFT_DEBUG_MNL in particular).

    Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: clear expression context before cmd evaluation | Florian Westphal | 2018-04-19 | 1 | -0/+2
    We also need to clear the expression context before we evaluate a
    command.

    This is a follow-up fix to 'evaluate: reset eval context when evaluating
    set definitions'. The first patch only fixed set evaluation when dealing
    with a complete table representation rather than individual commands.

    Reported-by: David Fabian <david.fabian@bosson.cz>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: reset eval context when evaluating set definitions | Florian Westphal | 2018-04-18 | 1 | -0/+1
    David reported nft chokes on this:

      nft -f /tmp/A
      /tmp/A:9:22-45: Error: datatype mismatch, expected concatenation of (IPv4 address, internet network service, IPv4 address), expression has type concatenation of (IPv4 address, internet network service)

      cat /tmp/A
      flush ruleset;
      table ip filter {
          set setA {
              type ipv4_addr . inet_service . ipv4_addr
              flags timeout
          }
          set setB {
              type ipv4_addr . inet_service
              flags timeout
          }
      }

    The problem is that we leak set definition details of setA to setB via
    the eval context, so reset it.

    Also add a test case for this.

    Reported-by: David Fabian <david.fabian@bosson.cz>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* Review raw payload allocation points | Phil Sutter | 2018-04-14 | 1 | -1/+0
    In parser_bison.y, call payload_init_raw() instead of assigning all
    fields manually. Also drop manual initialization of the flags field: it
    is not touched in the allocation path, so there is no need for that.

    In stmt_evaluate_payload(), setting the dtype field is redundant since
    payload_init_raw() does that already.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: use recursive call for SET_REF handling | Florian Westphal | 2018-04-03 | 1 | -29/+1
    We can now call the helper again, with set->init as new RHS expression.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: handle EXPR_MAPPING | Florian Westphal | 2018-04-03 | 1 | -0/+4
    Needed by followup patch. EXPR_SET_REF handling is bonkers, it "works"
    when using { key : value } because ->key and ->left are aliased in
    struct expr to the same location.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: split binop xfer to separate function | Florian Westphal | 2018-04-03 | 1 | -16/+31
    ... to reuse this in a followup patch.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: move lhs fixup to a helper | Florian Westphal | 2018-04-03 | 1 | -19/+28
    ... to reuse this in a followup patch.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* evaluate: propagate binop_transfer() adjustment to set key size | Pablo Neira Ayuso | 2018-04-03 | 1 | -1/+2
    The right shift transfer may result in adjusting the set key size, e.g.
    ip6 dscp results in fetching 6 bits that are split between two bytes,
    hence the set element ends up being 16 bits (2 bytes) long.

    Reported-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: do not inconditionally update cache from flush command | Pablo Neira Ayuso | 2018-04-01 | 1 | -5/+15
    The cache update is only required by sets, maps and meters; skip it
    otherwise.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* flowtable: Make parsing a little more robust | Phil Sutter | 2018-03-20 | 1 | -0/+6
    It was surprisingly easy to crash nft with invalid syntax in the 'add
    flowtable' command. Catch at least three possible ways (illustrated in
    the provided test case) by making the evaluation phase survive so that
    bison gets a chance to complain.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Combine redir and masq statements into nat | Phil Sutter | 2018-03-17 | 1 | -40/+0
    All these statements are very similar, so handling them with the same
    code is obvious. The only thing required here is a custom extension of
    enum nft_nat_types, which is used in nat_stmt to distinguish between
    snat and dnat already. Though since enum nft_nat_types is part of kernel
    uAPI, create a local extended version containing the additional fields.

    Note that nat statement printing got a bit more complicated to get the
    number of spaces right for every possible combination of attributes.

    Note also that there wasn't a case for STMT_MASQ in
    rule_parse_postprocess(), which seems like a bug. Since STMT_MASQ became
    just a variant of STMT_NAT, postprocessing will take place for it now
    anyway.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: evaluate: add preliminary binop transfer support for vmaps | Florian Westphal | 2018-03-17 | 1 | -1/+12
    nftables doesn't support vmap with bit-sized headers, such as flow label
    or dscp:

      nft add rule ip filter input ip dscp vmap \{ 4 : accept, 63 : continue \}
      BUG: invalid binary operation 5

    Unlike plain "ip dscp { 4, 63 }", we don't have a relational operation
    in case of vmap. Binop fixups need to be done when evaluating map
    statements.

    This patch is incomplete. 'ip dscp' works, but this won't:

      nft add rule --debug=netlink ip6 test-ip6 input ip6 dscp vmap { 0x04 : accept, 0x3f : continue }

    The generated expressions look sane; however, there is disagreement on
    the set's key size vs. the sizes of the individual elements in the set.
    This is because ip6 dscp spans a byte boundary. Key set size is still
    set to one byte (dscp type is 6 bits). However, binop expansion
    requirements result in 2 byte loads, i.e. set members will be 2 bytes in
    size as well.

    This can hopefully get addressed in an incremental patch.

    Signed-off-by: Florian Westphal <fw@strlen.de>
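The size mismatch follows from where the six DSCP bits sit in each header. A short Python sketch makes the arithmetic explicit; `bytes_to_load` is a hypothetical helper for illustration and not a function from nftables. The bit offsets used in the note below are the standard header layouts: in IPv4 the DSCP field occupies bits 8 to 13 of the header, in IPv6 it occupies bits 4 to 9 (the upper six bits of the Traffic Class field, which follows the 4-bit version).

```python
def bytes_to_load(bit_offset, bit_len):
    """Number of whole bytes that must be loaded to cover a bit field.

    bit_offset: offset of the field's first bit from the header start
    bit_len:    field length in bits
    """
    first_byte = bit_offset // 8
    last_byte = (bit_offset + bit_len - 1) // 8
    return last_byte - first_byte + 1
```

For `ip dscp` this gives `bytes_to_load(8, 6)`, a single byte, whereas `ip6 dscp` gives `bytes_to_load(4, 6)`, two bytes, which is why the element size grows while the declared key type stays at 6 bits.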
* evaluate: handle binop adjustment recursively | Florian Westphal | 2018-03-17 | 1 | -21/+32
    Currently this is fine, but a followup commit will add EXPR_SET_ELEM
    handling. And unlike RANGE, we cannot assume the key is a value.

    Therefore make binop_can_transfer and binop_transfer_one handle the
    right-hand side recursively if needed. For RANGE, call it again with
    from/to.

    For future SET_ELEM, we can then just call the function recursively
    again with right->key as new RHS.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* relational: Eliminate meta OPs | Phil Sutter | 2018-03-16 | 1 | -97/+17
    With a bit of code reorganization, relational meta OPs OP_RANGE,
    OP_FLAGCMP and OP_LOOKUP become unused and can be removed. The only meta
    OP left is OP_IMPLICIT, which is usually treated as an alias to OP_EQ.
    Though it needs to stay in place for one reason: when matching against a
    bitmask (e.g. TCP flags or conntrack states), it has a different
    meaning:

    | nft --debug=netlink add rule ip t c tcp flags syn
    | ip t c
    |   [ meta load l4proto => reg 1 ]
    |   [ cmp eq reg 1 0x00000006 ]
    |   [ payload load 1b @ transport header + 13 => reg 1 ]
    |   [ bitwise reg 1 = (reg=1 & 0x00000002 ) ^ 0x00000000 ]
    |   [ cmp neq reg 1 0x00000000 ]

    | nft --debug=netlink add rule ip t c tcp flags == syn
    | ip t c
    |   [ meta load l4proto => reg 1 ]
    |   [ cmp eq reg 1 0x00000006 ]
    |   [ payload load 1b @ transport header + 13 => reg 1 ]
    |   [ cmp eq reg 1 0x00000002 ]

    OP_IMPLICIT creates a match which just checks that the given flag is
    present, while OP_EQ creates a match which ensures the given flag and no
    other is present.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
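The difference between the two generated matches is easiest to see on concrete flag bytes. The following Python sketch mirrors the two netlink sequences quoted in the message; the function names are hypothetical, and only the SYN value 0x02 comes from the debug output (the ACK value 0x10 is the standard TCP flag bit, added here for the example).

```python
TCP_SYN = 0x02  # value seen in the netlink debug output above
TCP_ACK = 0x10  # standard TCP ACK bit, used only to build a test packet


def match_implicit(flags, mask):
    """'tcp flags syn': bitwise AND with the mask, then compare neq 0."""
    return ((flags & mask) ^ 0x00) != 0


def match_eq(flags, value):
    """'tcp flags == syn': plain equality on the whole flags byte."""
    return flags == value
```

A SYN/ACK segment therefore satisfies `match_implicit(TCP_SYN | TCP_ACK, TCP_SYN)` but not `match_eq(TCP_SYN | TCP_ACK, TCP_SYN)`, while a bare SYN satisfies both.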
* src: support of dynamic map addition and update of elements | Laura Garcia Liebana | 2018-03-15 | 1 | -0/+10
    Dynamic adds and updates are currently only available for sets and
    meters. This patch gives such abilities to maps as well.

    This patch is useful in cases where dynamic population of maps is
    required, for example, to maintain persistence during some period of
    time.

    Example:

      table ip nftlb {
          map persistencia {
              type ipv4_addr : mark
              timeout 1h
              elements = { 192.168.1.132 expires 59m55s : 0x00000064,
                           192.168.56.101 expires 59m24s : 0x00000065 }
          }

          chain pre {
              type nat hook prerouting priority 0; policy accept;
              map update \
                  { @nh,96,32 : numgen inc mod 2 offset 100 } @persistencia
          }
      }

    An example of the netlink generated sequence:

      nft --debug=netlink add rule ip nftlb pre map add \
          { ip saddr : numgen inc mod 2 offset 100 } @persistencia
      ip nftlb pre
        [ payload load 4b @ network header + 12 => reg 1 ]
        [ numgen reg 2 = inc mod 2 offset 100 ]
        [ dynset add reg_key 1 set persistencia sreg_data 2 ]

    Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
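The three netlink steps above (load the key, run the number generator, add to the map) can be mimicked with a dictionary in Python. This is a rough model under stated assumptions, not kernel behaviour: `numgen_inc` and `map_add` are hypothetical names, the generator is assumed to start at the offset value, "add" is assumed to leave an existing entry untouched, and timeouts are ignored.

```python
import itertools


def numgen_inc(mod, offset=0):
    """Incremental generator: offset, offset+1, ..., offset+mod-1, repeat."""
    for n in itertools.count():
        yield offset + n % mod


def map_add(table, gen, key):
    """Model of one packet hitting 'map add { key : numgen ... } @map'.

    The generator runs for every packet (it is evaluated before the dynset
    step in the netlink sequence); the entry is only created if missing.
    """
    value = next(gen)
    table.setdefault(key, value)
    return table[key]
```

With `numgen inc mod 2 offset 100` the generated marks alternate between 100 (0x64) and 101 (0x65), matching the two element values shown in the example map, and a source address that is already present keeps its mark until the entry expires.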
* src: support for get element command | Pablo Neira Ayuso | 2018-03-07 | 1 | -0/+31
    You need a Linux kernel >= 4.15 to use this feature.

    This patch allows us to dump the content of an existing set.

      # nft list ruleset
      table ip x {
          set x {
              type ipv4_addr
              flags interval
              elements = { 1.1.1.1-2.2.2.2, 3.3.3.3,
                           5.5.5.5-6.6.6.6 }
          }
      }

    You can check if a single element exists in the set:

      # nft get element x x { 1.1.1.5 }
      table ip x {
          set x {
              type ipv4_addr
              flags interval
              elements = { 1.1.1.1-2.2.2.2 }
          }
      }

    The output means '1.1.1.5' belongs to the '1.1.1.1-2.2.2.2' interval.

    You can also check for intervals:

      # nft get element x x { 1.1.1.1-2.2.2.2 }
      table ip x {
          set x {
              type ipv4_addr
              flags interval
              elements = { 1.1.1.1-2.2.2.2 }
          }
      }

    If you try to check for an element that doesn't exist, an error is
    displayed.

      # nft get element x x { 1.1.1.0 }
      Error: Could not receive set elements: No such file or directory
      get element x x { 1.1.1.0 }
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    You can also check for multiple elements in one go:

      # nft get element x x { 1.1.1.5, 5.5.5.10 }
      table ip x {
          set x {
              type ipv4_addr
              flags interval
              elements = { 1.1.1.1-2.2.2.2, 5.5.5.5-6.6.6.6 }
          }
      }

    You can also use this to fetch the existing timeout for specific
    elements, in case you have a set with timeouts in place:

      # nft get element w z { 2.2.2.2 }
      table ip w {
          set z {
              type ipv4_addr
              timeout 30s
              elements = { 2.2.2.2 expires 17s }
          }
      }

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
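The lookup semantics shown in these sessions (a single address returns the interval that contains it, a miss reports "No such file or directory") can be modelled in a few lines of Python. This is only a model of the observable behaviour; in nft the lookup is answered by the kernel, and the names `SET_X` and `get_element` are hypothetical.

```python
import errno
import ipaddress
import os

# The interval set from the example; a single address is a one-item range.
SET_X = [
    ("1.1.1.1", "2.2.2.2"),
    ("3.3.3.3", "3.3.3.3"),
    ("5.5.5.5", "6.6.6.6"),
]


def get_element(intervals, query):
    """Return the (low, high) interval containing query, else raise ENOENT."""
    addr = ipaddress.ip_address(query)
    for low, high in intervals:
        if ipaddress.ip_address(low) <= addr <= ipaddress.ip_address(high):
            return (low, high)
    raise OSError(errno.ENOENT, os.strerror(errno.ENOENT))
```

Here `get_element(SET_X, "1.1.1.5")` yields the 1.1.1.1-2.2.2.2 interval, while `get_element(SET_X, "1.1.1.0")` raises an ENOENT error, mirroring the two outcomes in the commit message.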
* src: flow offload support | Pablo Neira Ayuso | 2018-03-05 | 1 | -0/+1
    This patch allows us to refer to existing flowtables:

      # nft add rule x x flow offload @m

    Packets matching this rule create an entry in the flow table 'm', hence,
    follow up packets that get to the flowtable at ingress bypass the
    classic forwarding path.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: delete flowtable | Pablo Neira Ayuso | 2018-03-05 | 1 | -0/+1
    This patch allows you to delete an existing flowtable:

      # nft delete flowtable x m

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support to add flowtables | Pablo Neira Ayuso | 2018-03-05 | 1 | -0/+26
    This patch allows you to create a flowtable:

      # nft add table x
      # nft add flowtable x m { hook ingress priority 10\; devices = { eth0, wlan0 }\; }

    You have to specify hook and priority. So far, only the ingress hook is
    supported. The priority represents where this flowtable is placed in the
    ingress hook, which is registered to the devices that the user
    specifies.

    You can also use the 'create' command instead to bail out in case there
    is an existing flowtable with this name.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for flowtable listing | Pablo Neira Ayuso | 2018-03-05 | 1 | -0/+1
    This patch allows you to dump existing flowtables.

      # nft list ruleset
      table ip x {
          flowtable x {
              hook ingress priority 10
              devices = { eth0, tap0 }
          }
      }

    You can also list existing flowtables via:

      # nft list flowtables
      table ip x {
          flowtable x {
              hook ingress priority 10
              devices = { eth0, tap0 }
          }
      }

    You need a Linux kernel >= 4.16-rc to test this new feature.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add variable expression and use it to allow redefinitions | Pablo Neira Ayuso | 2018-03-04 | 1 | -9/+12
    Add a new variable expression that we can use to attach symbols at
    runtime. This allows us to redefine variables via a new keyword, e.g.

      table ip x {
          chain y {
              define address = { 1.1.1.1, 2.2.2.2 }
              ip saddr $address
              redefine address = { 3.3.3.3 }
              ip saddr $address
          }
      }

      # nft list ruleset
      table ip x {
          chain y {
              ip saddr { 1.1.1.1, 2.2.2.2 }
              ip saddr { 3.3.3.3 }
          }
      }

    Note that redefinition just places a new symbol version before the
    existing one, so symbol lookups always find the latest version. The
    undefine keyword decrements the reference counter and removes the symbol
    from the list, so it cannot be used anymore. Previous references to this
    symbol via variable expressions remain valid.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
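The "new version goes in front" scheme described here can be sketched with a plain list in Python. The class below is an illustration only: `SymbolScope` and its methods are hypothetical names, reference counting is left to Python's own object lifetime, and the choice that `undefine` removes just the newest version is an assumption the commit message does not spell out.

```python
class SymbolScope:
    """Toy symbol list in which the newest definition is found first."""

    def __init__(self):
        self._symbols = []  # (name, value) pairs, newest first

    def define(self, name, value):
        # Place the new version before any existing one.
        self._symbols.insert(0, (name, value))

    def redefine(self, name, value):
        # Same mechanism as define: prepend a new version.
        self.define(name, value)

    def undefine(self, name):
        # Assumption: only the newest version is dropped from the list.
        for i, (sym, _) in enumerate(self._symbols):
            if sym == name:
                del self._symbols[i]
                return
        raise KeyError(name)

    def lookup(self, name):
        # The first hit is the latest version.
        for sym, value in self._symbols:
            if sym == name:
                return value
        raise KeyError(name)
```

A rule that resolved `$address` before the `redefine` keeps the value object it obtained from `lookup`, which is the sense in which earlier references stay valid in this model.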
* evaluate: Fix memleak in stmt_reject_gen_dependency() | Phil Sutter | 2018-03-02 | 1 | -3/+7
    The allocated payload expression is not used after returning from that
    function, so it needs to be freed again.

    Simple test case:

    | nft add rule inet t c reject with tcp reset

    Valgrind reports definitely lost 144 bytes.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Review switch statements for unmarked fall through cases | Phil Sutter | 2018-02-28 | 1 | -0/+1
    While revisiting all of them, clear a few oddities as well:

    - There's no point in marking empty fall through cases: They are easy to
      spot and a common concept when using switch().

    - Fix indenting of break statement in one occasion.

    - Drop needless braces around one case which doesn't declare variables.

    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Florian Westphal <fw@strlen.de>