summaryrefslogtreecommitdiffstats
path: root/src/netlink_linearize.c
Commit message (Collapse)AuthorAgeFilesLines
* src: Optimize prefix match only if is big-endianXiao Liang2021-08-231-1/+2
| | | | | | | | | | | A prefix of integer type is big-endian in nature. Prefix match can be optimized to truncated 'cmp' only if it is big-endian. [ Add one tests/py for this use-case --pablo ] Fixes: 25338cdb6c77 ("src: Optimize prefix matches on byte-boundaries") Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: incorrect netlink bytecode with binary operation and flagsPablo Neira Ayuso2021-07-271-15/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | nft generates incorrect bytecode when combining flag datatype and binary operations: # nft --debug=netlink add rule meh tcp_flags 'tcp flags & (fin | syn | rst | ack) syn' ip [ meta load l4proto => reg 1 ] [ cmp eq reg 1 0x00000006 ] [ payload load 1b @ transport header + 13 => reg 1 ] [ bitwise reg 1 = ( reg 1 & 0x00000017 ) ^ 0x00000000 ] [ bitwise reg 1 = ( reg 1 & 0x00000002 ) ^ 0x00000000 ] [ cmp neq reg 1 0x00000000 ] Note the double bitwise expression. The last two expressions are not correct either since it should match on the syn flag, ie. 0x2. After this patch, netlink bytecode generation looks correct: # nft --debug=netlink add rule meh tcp_flags 'tcp flags & (fin | syn | rst | ack) syn' ip [ meta load l4proto => reg 1 ] [ cmp eq reg 1 0x00000006 ] [ payload load 1b @ transport header + 13 => reg 1 ] [ bitwise reg 1 = ( reg 1 & 0x00000017 ) ^ 0x00000000 ] [ cmp eq reg 1 0x00000002 ] Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for nat with interval concatenationPablo Neira Ayuso2021-07-131-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to combine concatenation and interval in NAT mappings, e.g. add rule x y dnat to ip saddr . tcp dport map { 192.168.1.2 . 80 : 10.141.10.2-10.141.10.5 . 8888-8999 } This generates the following NAT expression: [ nat dnat ip addr_min reg 1 addr_max reg 10 proto_min reg 9 proto_max reg 11 ] which expects to obtain the following tuple: IP address (min), source port (min), IP address (max), source port (max) to be obtained from the map. This representation simplifies the delinearize path, since the datatype is specified as: ipv4_addr . inet_service. A few more notes on this update: - alloc_nftnl_setelem() needs a variant netlink_gen_data() to deal with the representation of the range on the rhs of the mapping. In contrast to interval concatenation in the key side, where the range is expressed as two netlink attributes, the data side of the set element mapping stores the interval concatenation in a contiguos memory area, see __netlink_gen_concat_expand() for reference. - add range_expr_postprocess() to postprocess the data mapping range. If either one single IP address or port is used, then the minimum and maximum value in the range is the same value, e.g. to avoid listing 80-80, this round simplify the range. This also invokes the range to prefix conversion routine. - add concat_elem_expr() helper function to consolidate code to build the concatenation expression on the rhs element data side. This patch also adds tests/py and tests/shell. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: queue: allow use of arbitrary queue expressionsFlorian Westphal2021-06-211-5/+23
| | | | | | | | | | | | | | | | | | | | | back in 2016 Liping Zhang added support to kernel and libnftnl to specify a source register containing the queue number to use. This was never added to nft itself, so allow this. On linearization side, check if attached expression is a range. If its not, allocate a new register and set NFTNL_EXPR_QUEUE_SREG_QNUM attribute after generating the lowlevel expressions for the kernel. On delinarization we need to check for presence of NFTNL_EXPR_QUEUE_SREG_QNUM and decode the expression(s) when present. Also need to do postprocessing for STMT_QUEUE so that the protocol context is set correctly, without this only raw payload expressions will be shown (@nh,32,...) instead of 'ip ...'. Next patch adds test cases. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add cgroupsv2 supportPablo Neira Ayuso2021-05-031-0/+1
| | | | | | Add support for matching on the cgroups version 2. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add negation match on singleton bitmask valuePablo Neira Ayuso2021-02-051-2/+7
| | | | | | | | | | | | | | | | | This patch provides a shortcut for: ct status and dnat == 0 which allows to check for the packet whose dnat bit is unset: # nft add rule x y ct status ! dnat counter This operation is only available for expression with a bitmask basetype, eg. # nft describe ct status ct expression, datatype ct_status (conntrack status) (basetype bitmask, integer), 32 bits Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: set on flags to request multi-statement supportPablo Neira Ayuso2021-01-041-0/+2
| | | | | | | Old kernel reject requests for element with multiple statements because userspace sets on the flags for multi-statements. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support for multi-statement in dynamic sets and mapsPablo Neira Ayuso2020-12-171-8/+33
| | | | | | | | This patch allows for two statements for dynamic set updates, e.g. nft rule x y add @y { ip daddr limit rate 1/second counter } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tcpopt: allow to check for presence of any tcp optionFlorian Westphal2020-11-091-1/+1
| | | | | | | | | | | | | nft currently doesn't allow to check for presence of arbitrary tcp options. Only known options where nft provides a template can be tested for. This allows to test for presence of raw protocol values as well. Example: tcp option 42 exists Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopt: split tcpopt_hdr_fields into per-option enumFlorian Westphal2020-11-091-2/+2
| | | | | | | | | | | | | | | | | | Currently we're limited to ten template fields in exthdr_desc struct. Using a single enum for all tpc option fields thus won't work indefinitely (TCPOPTHDR_FIELD_TSECR is 9) when new option templates get added. Fortunately we can just use one enum per tcp option to avoid this. As a side effect this also allows to simplify the sack offset calculations. Rather than computing that on-the-fly, just add extra fields to the SACK template. expr->exthdr.offset now holds the 'raw' value, filled in from the option template. This would ease implementation of 'raw option matching' using offset and length to load from the option. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: Optimize prefix matches on byte-boundariesPhil Sutter2020-11-041-1/+3
| | | | | | | | | | | | | | | | If a prefix expression's length is on a byte-boundary, it is sufficient to just reduce the length passed to "cmp" expression. No need for explicit bitwise modification of data on LHS. The relevant code is already there, used for string prefix matches. There is one exception though, namely zero-length prefixes: Kernel doesn't accept zero-length "cmp" expressions, so keep them in the old code-path for now. This patch depends upon the previous one to correctly parse odd-sized payload matches but has to extend support for non-payload LHS as well. In practice, this is needed for "ct" expressions as they allow matching against IP address prefixes, too. Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: improve rule error reportingPablo Neira Ayuso2020-10-201-58/+113
| | | | | | | | | | | | | | | | | | | | | Kernel provides information regarding expression since 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). A common mistake is to refer a chain which does not exist, e.g. # nft add rule x y jump test Error: Could not process rule: No such file or directory add rule x y jump test ^^^^ Use the existing netlink extended error reporting infrastructure to provide better error reporting as in the example above. Requires Linux kernel patch 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* proto: add sctp crc32 checksum fixupFlorian Westphal2020-10-151-1/+1
| | | | | | | | | | Stateless SCTP header mangling doesn't work reliably. This tells the kernel to update the checksum field using the sctp crc32 algorithm. Note that this needs additional kernel support to work. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: support for implicit chain bindingsPablo Neira Ayuso2020-07-151-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to group rules in a subchain, e.g. table inet x { chain y { type filter hook input priority 0; tcp dport 22 jump { ip saddr { 127.0.0.0/8, 172.23.0.0/16, 192.168.13.0/24 } accept ip6 saddr ::1/128 accept; } } } This also supports for the `goto' chain verdict. This patch adds a new chain binding list to avoid a chain list lookup from the delinearize path for the usual chains. This can be simplified later on with a single hashtable per table for all chains. From the shell, you have to use the explicit separator ';', in bash you have to escape this: # nft add rule inet x y tcp dport 80 jump { ip saddr 127.0.0.1 accept\; ip6 saddr ::1 accept \; } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use expression to store the log prefixPablo Neira Ayuso2020-07-081-2/+5
| | | | | | Intsead of using an array of char. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add STMT_NAT_F_CONCAT flag and use itPablo Neira Ayuso2020-04-281-3/+3
| | | | | | Replace ipportmap boolean field by flags. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: NAT support for intervals in mapsPablo Neira Ayuso2020-04-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to specify an interval of IP address in maps. table ip x { chain y { type nat hook postrouting priority srcnat; policy accept; snat ip interval to ip saddr map { 10.141.11.4 : 192.168.2.2-192.168.2.4 } } } The example above performs SNAT to packets that comes from 10.141.11.4 to an interval of IP addresses from 192.168.2.2 to 192.168.2.4 (both included). You can also combine this with dynamic maps: table ip x { map y { type ipv4_addr : interval ipv4_addr flags interval elements = { 10.141.10.0/24 : 192.168.2.2-192.168.2.4 } } chain y { type nat hook postrouting priority srcnat; policy accept; snat ip interval to ip saddr map @y } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for restoring element countersPablo Neira Ayuso2020-03-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to restore counters in dynamic sets: table ip test { set test { type ipv4_addr size 65535 flags dynamic,timeout timeout 30d gc-interval 1d elements = { 192.168.10.13 expires 19d23h52m27s576ms counter packets 51 bytes 17265 } } chain output { type filter hook output priority 0; update @test { ip saddr } } } You can also add counters to elements from the control place, ie. table ip test { set test { type ipv4_addr size 65535 elements = { 192.168.2.1 counter packets 75 bytes 19043 } } chain output { type filter hook output priority filter; policy accept; ip daddr @test } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: remove unused parameter from netlink_gen_stmt_stateful()Pablo Neira Ayuso2020-03-181-23/+13
| | | | | | Remove context from netlink_gen_stmt_stateful(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow nat maps containing both ip(6) address and portFlorian Westphal2020-02-241-1/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nft will now be able to handle map destinations { type ipv4_addr . inet_service : ipv4_addr . inet_service } chain f { dnat to ip daddr . tcp dport map @destinations } Something like this won't work though: meta l4proto tcp dnat ip6 to numgen inc mod 4 map { 0 : dead::f001 . 8080, .. as we lack the type info to properly dissect "dead::f001" as an ipv6 address. For the named map case, this info is available in the map definition, but for the anon case we'd need to resort to guesswork. Support is added by peeking into the map definition when evaluating a nat statement with a map. Right now, when a map is provided as address, we will only check that the mapped-to data type matches the expected size (of an ipv4 or ipv6 address). After this patch, if the mapped-to type is a concatenation, it will take a peek at the individual concat expressions. If its a combination of address and service, nft will translate this so that the kernel nat expression looks at the returned register that would store the inet_service part of the octet soup returned from the lookup expression. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: add support for handling shift expressions.Jeremy Sowden2020-01-281-3/+47
| | | | | | | | | The kernel supports bitwise shift operations, so add support to the netlink linearization and delinearization code. The number of bits (the righthand operand) is expected to be a 32-bit value in host endianness. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: white-space fixes.Jeremy Sowden2020-01-281-1/+1
| | | | | | | Remove some trailing white-space and fix some indentation. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: Avoid potential NULL-pointer deref in netlink_gen_payload_stmt()Phil Sutter2020-01-221-1/+1
| | | | | | | | | | With payload_needs_l4csum_update_pseudohdr() unconditionally dereferencing passed 'desc' parameter and a previous check for it to be non-NULL, make sure to call the function only if input is sane. Fixes: 68de70f2b3fc6 ("netlink_linearize: fix IPv6 layer 4 checksum mangling") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: introduce SYNPROXY matchingFernando Fernandez Mancera2019-07-171-0/+17
| | | | | | | | | | | | | | | | | | | | Add support for "synproxy" statement. For example (for TCP port 8888): table ip x { chain y { type filter hook prerouting priority raw; policy accept; tcp dport 8888 tcp flags syn notrack } chain z { type filter hook input priority filter; policy accept; tcp dport 8888 ct state invalid,untracked synproxy mss 1460 wscale 7 timestamp sack-perm ct state invalid drop } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use UDATA defines from libnftnlPhil Sutter2019-05-031-1/+1
| | | | | | | | | | | | | Userdata attribute names have been added to libnftnl, use them instead of the local copy. While being at it, rename udata_get_comment() in netlink_delinearize.c and the callback it uses since the function is specific to rules. Also integrate the existence check for NFTNL_RULE_USERDATA into it along with the call to nftnl_rule_get_data(). Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add nat support for the inet familyFlorian Westphal2019-04-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | consider a simple ip6 nat table: table ip6 nat { chain output { type nat hook output priority 0; policy accept; dnat to dead:2::99 } Now consider same ruleset, but using 'table inet nat': nft now lacks context to determine address family to parse 'to $address'. This adds code to make the following work: table inet nat { [ .. ] # detect af from network protocol context: ip6 daddr dead::2::1 dnat to dead:2::99 # use new dnat ip6 keyword: dnat ip6 to dead:2::99 } On list side, the keyword is only shown in the inet family, else the short version (dnat to ...) is used as the family is redundant when the table already mandates the ip protocol version supported. Address mismatches such as table ip6 { .. dnat ip to 1.2.3.4 are detected/handled during the evaluation phase. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* osf: add version fingerprint supportFernando Fernandez Mancera2019-04-081-0/+1
| | | | | | | | | | | | | | | Add support for version fingerprint in "osf" expression. Example: table ip foo { chain bar { type filter hook input priority filter; policy accept; osf ttl skip name "Linux" osf ttl skip version "Linux:4.20" } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: expr: add expression etypeFlorian Westphal2019-02-081-13/+13
| | | | | | | | Temporary kludge to remove all the expr->ops->type == ... patterns. Followup patch will remove expr->ops, and make expr_ops() lookup the correct expr_ops struct instead to reduce struct expr size. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: expr: add and use expr_name helperFlorian Westphal2019-02-081-1/+1
| | | | | | | | Currently callers use expr->ops->name, but follouwp patch will remove the ops pointer from struct expr. So add this helper and use it everywhere. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* osf: add ttl option supportFernando Fernandez Mancera2018-10-231-0/+1
| | | | | | | | | | | | | | Add support for ttl option in "osf" expression. Example: table ip foo { chain bar { type filter hook input priority filter; policy accept; osf ttl skip name "Linux" } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add ipsec (xfrm) expressionMáté Eckl2018-09-211-0/+16
| | | | | | | | | | | | | | | | This allows matching on ipsec tunnel/beet addresses in xfrm state associated with a packet, ipsec request id and the SPI. Examples: ipsec in ip saddr 192.168.1.0/24 ipsec out ip6 daddr @endpoints ipsec in spi 1-65536 Joint work with Florian Westphal. Cc: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: integrate stateful expressions into sets and mapsPablo Neira Ayuso2018-08-241-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following example shows how to populate a set from the packet path using the destination IP address, for each entry there is a counter. The entry expires after the 1 hour timeout if no packets matching this entry are seen. table ip x { set xyz { type ipv4_addr size 65535 flags dynamic,timeout timeout 1h } chain y { type filter hook output priority filter; policy accept; update @xyz { ip daddr counter } counter } } Similar example, that creates a mapping better IP address and mark, where the mark is assigned using an incremental sequence generator from 0 to 1 inclusive. table ip x { map xyz { type ipv4_addr : mark size 65535 flags dynamic,timeout timeout 1h } chain y { type filter hook input priority filter; policy accept; update @xyz { ip saddr counter : numgen inc mod 2 } } } Supported stateful statements are: limit, quota, counter and connlimit. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: simplify map statementPablo Neira Ayuso2018-08-241-10/+11
| | | | | | | Instead of using the map expression, store dynamic key and data separately since they need special handling than constant maps. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: introduce passive OS fingerprint matchingFernando Fernandez Mancera2018-08-041-0/+13
| | | | | | | | | | | | | | Add support for "osf" expression. Example: table ip foo { chain bar { type filter hook input priority 0; policy accept; osf name "Linux" counter packets 3 bytes 132 } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Add tproxy supportMáté Eckl2018-08-031-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for transparent proxy functionality which is supported in ip, ip6 and inet tables. The syntax is the following: tproxy [{|ip|ip6}] to {<ip address>|:<port>|<ip address>:<port>} It looks for a socket listening on the specified address or port and assigns it to the matching packet. In an inet table, a packet matches for both families until address is specified. Network protocol family has to be specified **only** in inet tables if address is specified. As transparent proxy support is implemented for sockets with layer 4 information, a transport protocol header criterion has to be set in the same rule. eg. 'meta l4proto tcp' or 'udp dport 4444' Example ruleset: table ip x { chain y { type filter hook prerouting priority -150; policy accept; tcp dport ntp tproxy to 1.1.1.1 udp dport ssh tproxy to :2222 } } table ip6 x { chain y { type filter hook prerouting priority -150; policy accept; tcp dport ntp tproxy to [dead::beef] udp dport ssh tproxy to :2222 } } table inet x { chain y { type filter hook prerouting priority -150; policy accept; tcp dport 321 tproxy to :ssh tcp dport 99 tproxy ip to 1.1.1.1:999 udp dport 155 tproxy ip6 to [dead::beef]:smux } } Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Introduce socket matchingMáté Eckl2018-06-061-0/+14
| | | | | | | | | | | | | | | | For now it can only match sockets with IP(V6)_TRANSPARENT socket option set. Example: table inet sockin { chain sockchain { type filter hook prerouting priority -150; policy accept; socket transparent 1 mark set 0x00000001 nftrace set 1 counter packets 9 bytes 504 accept } } Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expr: extend fwd statement to support address and familyPablo Neira Ayuso2018-06-061-4/+15
| | | | | | | | Allow to forward packets through to explicit destination and interface. nft add rule netdev x y fwd ip to 192.168.2.200 device eth0 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: connlimit supportPablo Neira Ayuso2018-06-061-0/+18
| | | | | | | | | | | | | | This patch adds support for the new connlimit stateful expression, that provides a mapping with the connlimit iptables extension through meters. eg. nft add rule filter input tcp dport 22 \ meter test { ip saddr ct count over 2 } counter reject This limits the maximum amount incoming of SSH connections per source address up to 2 simultaneous connections. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set_specPablo Neira Ayuso2018-05-061-5/+5
| | | | | | Store location object in handle to improve error reporting. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: avoid errouneous assert with map+concatFlorian Westphal2018-03-271-0/+8
| | | | | | | | | | | | | | | | | | Phil reported following assert: add rule ip6 f o mark set ip6 saddr . ip6 daddr . tcp dport \ map { dead::beef . f00::. 22 : 1 } nft: netlink_linearize.c:655: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed. This happens because "mark set" will allocate one register (the dreg), but netlink_gen_concat_expr will populate a lot more register space if the concat expression strings a lot of expressions together. As the assert is useful pseudo-reserve the register space as per concat->len and undo after generating the expressions. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
* Combine redir and masq statements into natPhil Sutter2018-03-171-99/+36
| | | | | | | | | | | | | | | | | | | All these statements are very similar, handling them with the same code is obvious. The only thing required here is a custom extension of enum nft_nat_types which is used in nat_stmt to distinguish between snat and dnat already. Though since enum nft_nat_types is part of kernel uAPI, create a local extended version containing the additional fields. Note that nat statement printing got a bit more complicated to get the number of spaces right for every possible combination of attributes. Note also that there wasn't a case for STMT_MASQ in rule_parse_postprocess(), which seems like a bug. Since STMT_MASQ became just a variant of STMT_NAT, postprocessing will take place for it now anyway. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: Fold netlink_gen_cmp() into netlink_gen_relational()Phil Sutter2018-03-161-65/+53
| | | | | | | | | Since netlink_gen_relational() didn't do much anymore after meta OP treating had been removed, it makes sense to merge it with the only function it dispached to. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* relational: Eliminate meta OPsPhil Sutter2018-03-161-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With a bit of code reorganization, relational meta OPs OP_RANGE, OP_FLAGCMP and OP_LOOKUP become unused and can be removed. The only meta OP left is OP_IMPLICIT which is usually treated as alias to OP_EQ. Though it needs to stay in place for one reason: When matching against a bitmask (e.g. TCP flags or conntrack states), it has a different meaning: | nft --debug=netlink add rule ip t c tcp flags syn | ip t c | [ meta load l4proto => reg 1 ] | [ cmp eq reg 1 0x00000006 ] | [ payload load 1b @ transport header + 13 => reg 1 ] | [ bitwise reg 1 = (reg=1 & 0x00000002 ) ^ 0x00000000 ] | [ cmp neq reg 1 0x00000000 ] | nft --debug=netlink add rule ip t c tcp flags == syn | ip t c | [ meta load l4proto => reg 1 ] | [ cmp eq reg 1 0x00000006 ] | [ payload load 1b @ transport header + 13 => reg 1 ] | [ cmp eq reg 1 0x00000002 ] OP_IMPLICIT creates a match which just checks the given flag is present, while OP_EQ creates a match which ensures the given flag and no other is present. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support of dynamic map addition and update of elementsLaura Garcia Liebana2018-03-151-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The support of dynamic adds and updates are only available for sets and meters. This patch gives such abilities to maps as well. This patch is useful in cases where dynamic population of maps are required, for example, to maintain a persistence during some period of time. Example: table ip nftlb { map persistencia { type ipv4_addr : mark timeout 1h elements = { 192.168.1.132 expires 59m55s : 0x00000064, 192.168.56.101 expires 59m24s : 0x00000065 } } chain pre { type nat hook prerouting priority 0; policy accept; map update \ { @nh,96,32 : numgen inc mod 2 offset 100 } @persistencia } } An example of the netlink generated sequence: nft --debug=netlink add rule ip nftlb pre map add \ { ip saddr : numgen inc mod 2 offset 100 } @persistencia ip nftlb pre [ payload load 4b @ network header + 12 => reg 1 ] [ numgen reg 2 = inc mod 2 offset 100 ] [ dynset add reg_key 1 set persistencia sreg_data 2 ] Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: flow offload supportPablo Neira Ayuso2018-03-051-0/+13
| | | | | | | | | | | | This patch allows us to refer to existing flowtables: # nft add rule x x flow offload @m Packets matching this rule create an entry in the flow table 'm', hence, follow up packets that get to the flowtable at ingress bypass the classic forwarding path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: introduce datatype ifname_typeArturo Borrero Gonzalez2018-02-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This new datatype is a string subtype. It will allow us to build named maps/sets using meta keys like 'iifname', 'oifname', 'ibriport' or 'obriport'. Example: table inet t { set s { type ifname elements = { "eth0", "eth1" } } chain c { iifname @s accept oifname @s accept } } Signed-off-by: Arturo Borrero Gonzalez <arturo@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: exthdr op must be u32Florian Westphal2017-12-111-2/+2
| | | | | | | | libnftnl casts this to u32. Broke exthdr expressions on bigendian. Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: deprecate "flow table" syntax, replace it by "meter"Pablo Neira Ayuso2017-11-241-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | According to bugzilla 1137: "flow tables" should not be syntactically unique. "Flow tables are always named, but they don't conform to the way sets, maps, and dictionaries work in terms of "add" and "delete" and all that. They are also "flow tables" instead of one word like "flows" or "throttle" or something. It seems weird to just have these break the syntactic expectations." Personally, I never liked the reference to "table" since we have very specific semantics in terms of what a "table" is netfilter for long time. This patch promotes "meter" as the new keyword. The former syntax is still accepted for a while, just to reduce chances of breaking things. At some point the former syntax will just be removed. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1137 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Arturo Borrero Gonzalez <arturo@netfilter.org>
* netlink_linearize: skip set element expression in set statement keyAnders K. Pedersen2017-10-061-3/+3
| | | | | | | | | | | | | | | | | | | | Before this patch the following fails: # nft add rule ip6 filter x \ set add ip6 saddr . ip6 daddr @test nft: netlink_linearize.c:648: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed. Aborted This is was previously fixed for flow statements in fbea4a6f4449 ("netlink_linearize: skip set element expression in flow table key"), and this patch implements the same change for set statements by using the set element key in netlink_gen_set_stmt(). nft-test.py is updated to support set types with concatenated data types in order to support testing of this. Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: get rid of printfPhil Sutter2017-09-291-1/+1
| | | | | | | | | | | | | | | | | This patch introduces nft_print()/nft_gmp_print() functions which have to be used instead of printf to output information that were previously send to stdout. These functions print to a FILE pointer defined in struct output_ctx. It is set by calling: | old_fp = nft_ctx_set_output(ctx, new_fp); Having an application-defined FILE pointer is actually quite flexible: Using fmemopen() or even fopencookie(), an application gains full control over what is printed and where it should go to. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>