summaryrefslogtreecommitdiffstats
path: root/src/netlink_linearize.c
Commit message (Collapse)AuthorAgeFilesLines
* src: Introduce socket matchingMáté Eckl2018-06-061-0/+14
| | | | | | | | | | | | | | | | For now it can only match sockets with IP(V6)_TRANSPARENT socket option set. Example: table inet sockin { chain sockchain { type filter hook prerouting priority -150; policy accept; socket transparent 1 mark set 0x00000001 nftrace set 1 counter packets 9 bytes 504 accept } } Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expr: extend fwd statement to support address and familyPablo Neira Ayuso2018-06-061-4/+15
| | | | | | | | Allow to forward packets through to explicit destination and interface. nft add rule netdev x y fwd ip to 192.168.2.200 device eth0 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: connlimit supportPablo Neira Ayuso2018-06-061-0/+18
| | | | | | | | | | | | | | This patch adds support for the new connlimit stateful expression, that provides a mapping with the connlimit iptables extension through meters. eg. nft add rule filter input tcp dport 22 \ meter test { ip saddr ct count over 2 } counter reject This limits the maximum amount incoming of SSH connections per source address up to 2 simultaneous connections. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set_specPablo Neira Ayuso2018-05-061-5/+5
| | | | | | Store location object in handle to improve error reporting. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: avoid errouneous assert with map+concatFlorian Westphal2018-03-271-0/+8
| | | | | | | | | | | | | | | | | | Phil reported following assert: add rule ip6 f o mark set ip6 saddr . ip6 daddr . tcp dport \ map { dead::beef . f00::. 22 : 1 } nft: netlink_linearize.c:655: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed. This happens because "mark set" will allocate one register (the dreg), but netlink_gen_concat_expr will populate a lot more register space if the concat expression strings a lot of expressions together. As the assert is useful pseudo-reserve the register space as per concat->len and undo after generating the expressions. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
* Combine redir and masq statements into natPhil Sutter2018-03-171-99/+36
| | | | | | | | | | | | | | | | | | | All these statements are very similar, handling them with the same code is obvious. The only thing required here is a custom extension of enum nft_nat_types which is used in nat_stmt to distinguish between snat and dnat already. Though since enum nft_nat_types is part of kernel uAPI, create a local extended version containing the additional fields. Note that nat statement printing got a bit more complicated to get the number of spaces right for every possible combination of attributes. Note also that there wasn't a case for STMT_MASQ in rule_parse_postprocess(), which seems like a bug. Since STMT_MASQ became just a variant of STMT_NAT, postprocessing will take place for it now anyway. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: Fold netlink_gen_cmp() into netlink_gen_relational()Phil Sutter2018-03-161-65/+53
| | | | | | | | | Since netlink_gen_relational() didn't do much anymore after meta OP treating had been removed, it makes sense to merge it with the only function it dispached to. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* relational: Eliminate meta OPsPhil Sutter2018-03-161-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With a bit of code reorganization, relational meta OPs OP_RANGE, OP_FLAGCMP and OP_LOOKUP become unused and can be removed. The only meta OP left is OP_IMPLICIT which is usually treated as alias to OP_EQ. Though it needs to stay in place for one reason: When matching against a bitmask (e.g. TCP flags or conntrack states), it has a different meaning: | nft --debug=netlink add rule ip t c tcp flags syn | ip t c | [ meta load l4proto => reg 1 ] | [ cmp eq reg 1 0x00000006 ] | [ payload load 1b @ transport header + 13 => reg 1 ] | [ bitwise reg 1 = (reg=1 & 0x00000002 ) ^ 0x00000000 ] | [ cmp neq reg 1 0x00000000 ] | nft --debug=netlink add rule ip t c tcp flags == syn | ip t c | [ meta load l4proto => reg 1 ] | [ cmp eq reg 1 0x00000006 ] | [ payload load 1b @ transport header + 13 => reg 1 ] | [ cmp eq reg 1 0x00000002 ] OP_IMPLICIT creates a match which just checks the given flag is present, while OP_EQ creates a match which ensures the given flag and no other is present. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support of dynamic map addition and update of elementsLaura Garcia Liebana2018-03-151-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The support of dynamic adds and updates are only available for sets and meters. This patch gives such abilities to maps as well. This patch is useful in cases where dynamic population of maps are required, for example, to maintain a persistence during some period of time. Example: table ip nftlb { map persistencia { type ipv4_addr : mark timeout 1h elements = { 192.168.1.132 expires 59m55s : 0x00000064, 192.168.56.101 expires 59m24s : 0x00000065 } } chain pre { type nat hook prerouting priority 0; policy accept; map update \ { @nh,96,32 : numgen inc mod 2 offset 100 } @persistencia } } An example of the netlink generated sequence: nft --debug=netlink add rule ip nftlb pre map add \ { ip saddr : numgen inc mod 2 offset 100 } @persistencia ip nftlb pre [ payload load 4b @ network header + 12 => reg 1 ] [ numgen reg 2 = inc mod 2 offset 100 ] [ dynset add reg_key 1 set persistencia sreg_data 2 ] Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: flow offload supportPablo Neira Ayuso2018-03-051-0/+13
| | | | | | | | | | | | This patch allows us to refer to existing flowtables: # nft add rule x x flow offload @m Packets matching this rule create an entry in the flow table 'm', hence, follow up packets that get to the flowtable at ingress bypass the classic forwarding path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* meta: introduce datatype ifname_typeArturo Borrero Gonzalez2018-02-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This new datatype is a string subtype. It will allow us to build named maps/sets using meta keys like 'iifname', 'oifname', 'ibriport' or 'obriport'. Example: table inet t { set s { type ifname elements = { "eth0", "eth1" } } chain c { iifname @s accept oifname @s accept } } Signed-off-by: Arturo Borrero Gonzalez <arturo@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: exthdr op must be u32Florian Westphal2017-12-111-2/+2
| | | | | | | | libnftnl casts this to u32. Broke exthdr expressions on bigendian. Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: deprecate "flow table" syntax, replace it by "meter"Pablo Neira Ayuso2017-11-241-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | According to bugzilla 1137: "flow tables" should not be syntactically unique. "Flow tables are always named, but they don't conform to the way sets, maps, and dictionaries work in terms of "add" and "delete" and all that. They are also "flow tables" instead of one word like "flows" or "throttle" or something. It seems weird to just have these break the syntactic expectations." Personally, I never liked the reference to "table" since we have very specific semantics in terms of what a "table" is netfilter for long time. This patch promotes "meter" as the new keyword. The former syntax is still accepted for a while, just to reduce chances of breaking things. At some point the former syntax will just be removed. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1137 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Arturo Borrero Gonzalez <arturo@netfilter.org>
* netlink_linearize: skip set element expression in set statement keyAnders K. Pedersen2017-10-061-3/+3
| | | | | | | | | | | | | | | | | | | | Before this patch the following fails: # nft add rule ip6 filter x \ set add ip6 saddr . ip6 daddr @test nft: netlink_linearize.c:648: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed. Aborted This is was previously fixed for flow statements in fbea4a6f4449 ("netlink_linearize: skip set element expression in flow table key"), and this patch implements the same change for set statements by using the set element key in netlink_gen_set_stmt(). nft-test.py is updated to support set types with concatenated data types in order to support testing of this. Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: get rid of printfPhil Sutter2017-09-291-1/+1
| | | | | | | | | | | | | | | | | This patch introduces nft_print()/nft_gmp_print() functions which have to be used instead of printf to output information that were previously send to stdout. These functions print to a FILE pointer defined in struct output_ctx. It is set by calling: | old_fp = nft_ctx_set_output(ctx, new_fp); Having an application-defined FILE pointer is actually quite flexible: Using fmemopen() or even fopencookie(), an application gains full control over what is printed and where it should go to. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add debugging mask to context structurePablo Neira Ayuso2017-08-231-1/+1
| | | | | | | So this toggle is not global anymore. Update name that fits better with the semantics of this variable. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add tcp options set supportFlorian Westphal2017-08-221-0/+29
| | | | | | | | | | | | This adds support for tcp mss mangling: nft add rule filter input tcp option maxseg size 1200 Its also possible to change other tcp option fields, but maxseg is one of the more useful ones to change. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* hash: generate a random seed if seed option is emptyLiping Zhang2017-04-151-1/+2
| | | | | | | | | | | Typing the "nft add rule x y ct mark set jhash ip saddr mod 2" will not generate a random seed, instead, the seed will always be zero. So if seed option is empty, we shoulde not set the NFTA_HASH_SEED attribute, then a random seed will be generated in the kernel. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* exthdr: Add support for exthdr specific flagsPhil Sutter2017-03-101-0/+1
| | | | | | | | | This allows to have custom flags in exthdr expression, which is necessary for upcoming existence checks (of both IPv6 extension headers as well as TCP options). Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: hash: support of symmetric hashLaura Garcia Liebana2017-03-061-7/+12
| | | | | | | | | | | | | | | | | | | | | This patch provides symmetric hash support according to source ip address and port, and destination ip address and port. The new attribute NFTA_HASH_TYPE has been included to support different types of hashing functions. Currently supported NFT_HASH_JENKINS through jhash and NFT_HASH_SYM through symhash. The main difference between both types are: - jhash requires an expression with sreg, symhash doesn't. - symhash supports modulus and offset, but not seed. Examples: nft add rule ip nat prerouting ct mark set jhash ip saddr mod 2 nft add rule ip nat prerouting ct mark set symhash mod 2 Signed-off-by: Laura Garcia Liebana <laura.garcia@zevenet.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support zone set statement with optional directionFlorian Westphal2017-02-281-0/+4
| | | | | | | | | | | | nft automatically understands 'ct zone set 1' but when a direction is specified too we get a parser error since they are currently only allowed for plain ct expressions. This permits the existing syntax ('ct original zone') for all tokens with an optional direction also for set statements. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add TCP option matchingManuel Messner2017-02-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables nft to match against TCP options. Currently these TCP options are supported: * End of Option List (eol) * No-Operation (noop) * Maximum Segment Size (maxseg) * Window Scale (window) * SACK Permitted (sack_permitted) * SACK (sack) * Timestamps (timestamp) Syntax: tcp options $option_name [$offset] $field_name Example: # count all incoming packets with a specific maximum segment size `x` # nft add rule filter input tcp option maxseg size x counter # count all incoming packets with a SACK TCP option where the third # (counted from zero) left field is greater `x`. # nft add rule filter input tcp option sack 2 left \> x counter If the offset (the `2` in the example above) is zero, it can optionally be omitted. For all non-SACK TCP options it is always zero, thus can be left out. Option names and field names are parsed from templates, similar to meta and ct options rather than via keywords to prevent adding more keywords than necessary. Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
* exthdr: prepare for tcp supportManuel Messner2017-02-121-2/+2
| | | | | | | | | | | right now exthdr only deals with ipv6 extension headers, followup patch will enable tcp option matching. This adds the 'op' arg to exthdr_init. Signed-off-by: Manuel Messner <mm@skelett.io> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add support for stateful object mapsPablo Neira Ayuso2017-01-031-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | You can create these maps using explicit map declarations: # nft add table filter # nft add chain filter input { type filter hook input priority 0\; } # nft add map filter badguys { type ipv4_addr : counter \; } # nft add rule filter input counter name ip saddr map @badguys # nft add counter filter badguy1 # nft add counter filter badguy2 # nft add element filter badguys { 192.168.2.3 : "badguy1" } # nft add element filter badguys { 192.168.2.4 : "badguy2" } Or through implicit map definitions: table ip filter { counter http-traffic { packets 8 bytes 672 } chain input { type filter hook input priority 0; policy accept; counter name tcp dport map { 80 : "http-traffic", 443 : "http-traffic"} } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add stateful object reference expressionPablo Neira Ayuso2017-01-031-0/+16
| | | | | | | | | This patch adds a new objref statement to refer to existing stateful objects from rules, eg. # nft add rule filter input counter name test counter Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add used quota supportPablo Neira Ayuso2017-01-031-0/+1
| | | | | | | | | | | | | table ip x { chain y { type filter hook forward priority 0; policy accept; quota over 200 mbytes used 1143 kbytes drop } } This patch allows us to list and to restore used quota. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: fix IPv6 layer 4 checksum manglingPablo Neira Ayuso2016-12-161-5/+4
| | | | | | | In IPv6 there is no checksum field. We always have to trigger layer 4 checksum mangling if any of the layer 3 pseudoheader fields are updated. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: trigger layer 4 checksum when pseudoheader fields are modifiedPablo Neira2016-12-041-0/+17
| | | | | | | | This patch sets the NFT_PAYLOAD_L4CSUM_PSEUDOHDR when any of the pseudoheader fields are modified. This implicitly enables stateless NAT, that can be useful under some circuntances. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Interpret OP_NEQ against a set as OP_LOOKUPAnatole Denis2016-11-291-5/+9
| | | | | | | | | | Now that the support for inverted matching is in the kernel and in libnftnl, add it to nftables too. This fixes bug #888 Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add log flags syntax supportLiping Zhang2016-11-241-0/+3
| | | | | | | | | | | | | | | | | | | | | Now NF_LOG_XXX is exposed to the userspace, we can set it explicitly. Like iptables LOG target, we can log TCP sequence numbers, TCP options, IP options, UID owning local socket and decode MAC header. Note the log flags are mutually exclusive with group. Some examples are listed below: # nft add rule t c log flags tcp sequence,options # nft add rule t c log flags ip options # nft add rule t c log flags skuid # nft add rule t c log flags ether # nft add rule t c log flags all # nft add rule t c log flags all group 1 <cmdline>:1:14-16: Error: flags and group are mutually exclusive add rule t c log flags all group 1 ^^^ Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add notrack supportPablo Neira Ayuso2016-11-141-0/+11
| | | | | | | This patch adds the notrack statement, to skip connection tracking for certain packets. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add offset attribute for hash expressionLaura Garcia Liebana2016-11-091-0/+1
| | | | | | | | | | | Add support to add an offset to the hash generator, eg. ct mark set hash ip saddr mod 10 offset 100 This will generate marks with series between 100-109. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: skip set element expression in flow table keyPablo Neira Ayuso2016-10-311-3/+3
| | | | | | | | | | | | | | | | | | | | | Anders reports that: # nft add rule ip6 filter postrouting \ flow table acct_out \{ meta iif . ip6 saddr timeout 600s counter \} while the opposite doesn't work: # nft add rule ip6 filter postrouting \ flow table acct_out \{ ip6 saddr . meta iif timeout 600s counter \} netlink_gen_flow_stmt() relies on the flow table key, that is expressed as a set element. Use the set element key instead to skip the set element wrap, otherwise get_register() abort execution: nft: netlink_linearize.c:650: netlink_gen_expr: Assertion `dreg < ctx->reg_low' failed. Reported-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add fib expressionFlorian Westphal2016-10-281-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the 'fib' expression which can be used to obtain the output interface from the route table based on either source or destination address of a packet. This can be used to e.g. add reverse path filtering: # drop if not coming from the same interface packet # arrived on # nft add rule x prerouting fib saddr . iif oif eq 0 drop # accept only if from eth0 # nft add rule x prerouting fib saddr . iif oif eq "eth0" accept # accept if from any valid interface # nft add rule x prerouting fib saddr oif accept Querying of address type is also supported. This can be used to e.g. only accept packets to addresses configured in the same interface: # fib daddr . iif type local Its also possible to use mark and verdict map, e.g.: # nft add rule x prerouting meta mark set 0xdead fib daddr . mark type vmap { blackhole : drop, prohibit : drop, unicast : accept } Signed-off-by: Florian Westphal <fw@strlen.de>
* rt: introduce routing expressionAnders K. Pedersen2016-10-281-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce rt expression for routing related data with support for nexthop (i.e. the directly connected IP address that an outgoing packet is sent to), which can be used either for matching or accounting, eg. # nft add rule filter postrouting \ ip daddr 192.168.1.0/24 rt nexthop != 192.168.0.1 drop This will drop any traffic to 192.168.1.0/24 that is not routed via 192.168.0.1. # nft add rule filter postrouting \ flow table acct { rt nexthop timeout 600s counter } # nft add rule ip6 filter postrouting \ flow table acct { rt nexthop timeout 600s counter } These rules count outgoing traffic per nexthop. Note that the timeout releases an entry if no traffic is seen for this nexthop within 10 minutes. # nft add rule inet filter postrouting \ ether type ip \ flow table acct { rt nexthop timeout 600s counter } # nft add rule inet filter postrouting \ ether type ip6 \ flow table acct { rt nexthop timeout 600s counter } Same as above, but via the inet family, where the ether type must be specified explicitly. "rt classid" is also implemented identical to "meta rtclassid", since it is more logical to have this match in the routing expression going forward. Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: fix linearize numgen typeLaura Garcia Liebana2016-10-271-1/+1
| | | | | | | | Avoid to treat numgen type attribute as a register. Fixes: 345236211715 ("src: add hash expression") Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add offset attribute for numgen expressionLaura Garcia Liebana2016-10-271-0/+1
| | | | | | | | | | | | | Add support to add an offset to the numgen generated value. Example: ct mark set numgen inc mod 2 offset 100 This will generate marks with serie like 100, 101, 100, ... Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use new range expression for != [a,b] intervalsPablo Neira Ayuso2016-10-171-25/+21
| | | | | | | Use new range expression in the kernel to fix wrong bytecode generation. This patch also adjust tests so we don't hit problems there. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix compile error due to _UNTIL renamed to _MODULUS in libnftnlLiping Zhang2016-09-121-1/+1
| | | | | | | | | | | | | In the latest libnftnl, NFTNL_EXPR_NG_UNTIL was renamed to NFTNL_EXPR_NG_MODULUS, so compile error happened: netlink_linearize.c: In function ‘netlink_gen_numgen’: netlink_linearize.c:184:26: error: ‘NFTNL_EXPR_NG_UNTIL’ undeclared (first use in this function) Also update NFTA_NG_UNTIL to NFTA_NG_MODULUS. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add hash expressionPablo Neira Ayuso2016-08-291-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is special expression that transforms an input expression into a 32-bit unsigned integer. This expression takes a modulus parameter to scale the result and the random seed so the hash result becomes harder to predict. You can use it to set the packet mark, eg. # nft add rule x y meta mark set jhash ip saddr . ip daddr mod 2 seed 0xdeadbeef You can combine this with maps too, eg. # nft add rule x y dnat to jhash ip saddr mod 2 seed 0xdeadbeef map { \ 0 : 192.168.20.100, \ 1 : 192.168.30.100 \ } Currently, this expression implements the jenkins hash implementation available in the Linux kernel: http://lxr.free-electrons.com/source/include/linux/jhash.h But it should be possible to extend it to support any other hash function type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add numgen expressionPablo Neira Ayuso2016-08-291-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new expression allows us to generate incremental and random numbers bound to a specified modulus value. The following rule sets the conntrack mark of 0 to the first packet seen, then 1 to second packet, then 0 again to the third packet and so on: # nft add rule x y ct mark set numgen inc mod 2 A more useful example is a simple load balancing scenario, where you can also use maps to set the destination NAT address based on this new numgen expression: # nft add rule nat prerouting \ dnat to numgen inc mod 2 map { 0 : 192.168.10.100, 1 : 192.168.20.200 } So this is distributing new connections in a round-robin fashion between 192.168.10.100 and 192.168.20.200. Don't forget the special NAT chain semantics: Only the first packet evaluates the rule, follow up packets rely on conntrack to apply the NAT information. You can also emulate flow distribution with different backend weights using intervals: # nft add rule nat prerouting \ dnat to numgen inc mod 10 map { 0-5 : 192.168.10.100, 6-9 : 192.168.20.200 } So 192.168.10.100 gets 60% of the workload, while 192.168.20.200 gets 40%. We can also be mixed with dynamic sets, thus weight can be updated in runtime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add quota statementPablo Neira Ayuso2016-08-291-0/+16
| | | | | | | | | | | | | This new statement is stateful, so it can be used from flow tables, eg. # nft add rule filter input \ flow table http { ip saddr timeout 60s quota over 50 mbytes } drop This basically sets a quota per source IP address of 50 mbytes after which packets are dropped. Note that the timeout releases the entry if no traffic is seen from this IP after 60 seconds. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: skip NFTNL_EXPR_DYNSET_TIMEOUT attribute if timeout is unsetPablo Neira Ayuso2016-07-121-2/+3
| | | | | | | Otherwise kernel bails out with EINVAL in case that the sets got no timeout flag. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use new definitions from libnftnlPablo Neira Ayuso2016-06-151-7/+7
| | | | | | | Use new definitions in libnftnl, so we can consider getting rid of them at some point. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_linearize: do not duplicate user data when linearizing user dataCarlos Falgueras García2016-05-251-8/+3
| | | | | | | | | | | | | Otherwise, we memory leak this area since nftnl_rule_set_data() now makes a copy of the user data which receives. This is happening since libnftnl's ("rule: Fix segfault due to invalid free of rule user data"), it is not necessary make a copy before call it. Note: Carlos originally posted this patch under the name of ("nftables: Fix memory leak linearizing user data"). Signed-off-by: Carlos Falgueras García <carlosfg@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add flow statementPatrick McHardy2016-05-131-4/+36
| | | | | | | | | | | | | | | The flow statement allows to instantiate per flow statements for user defined flows. This can so far be used for per flow accounting or limiting, similar to what the iptables hashlimit provides. Flows can be aged using the timeout option. Examples: # nft filter input flow ip saddr . tcp dport limit rate 10/second # nft filter input flow table acct iif . ip saddr timeout 60s counter Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* stmt: support generating stateful statements outside of rule contextPatrick McHardy2016-05-131-30/+50
| | | | | | | | | | The flow statement contains a stateful per flow statement, which is not directly part of the rule. Allow generating these statements without adding them to the rule and mark the supported statements using a new flag STMT_F_STATEFUL. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: move payload sub-byte matching to the evaluation stepPablo Neira Ayuso2016-05-111-99/+0
| | | | | | | | | | | | | | | | | | | | Generating the bitwise logic to match sub-byte payload fields from the linearize step has several problems: 1) When the bits are split between two bytes and the payload field is smaller than one byte, we need to extend the expression length on both sides (payload and constant) of the relational expression. 2) Explicit bitmask operations on sub-byte payload fields need to be merge to the implicit bitmask operation, otherwise we generate two bitwise instructions. This is not resolved by this patch, but we should have a look at some point to this. With this approach, we can benefit from the binary operation transfer for shifts to provide a generic way to adjust the constant side of the expression. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: Use libnftnl user data TLV infrastructureCarlos Falgueras García2016-04-141-3/+22
| | | | | | | | | Now it is possible to store multiple variable length user data into rule. Modify the parser in order to fill the nftnl_udata with the comment, and the print function for extract these commentary and print it to user. Signed-off-by: Carlos Falgueras García <carlosfg@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinarize: shift constant for ranges tooFlorian Westphal2016-03-101-0/+2
| | | | | | | | | ... else rule like vlan pcp 1-3 won't work and will be displayed as 0-0 (reverse direction already works since range is represented as two lte/gte compare expressions). Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>