summaryrefslogtreecommitdiffstats
path: root/src/netlink_delinearize.c
Commit message (Collapse)AuthorAgeFilesLines
* netlink_delinearize: Fix resource leaksPhil Sutter2018-03-021-52/+92
| | | | | | | | | | | | | | | | | | Most of the cases are basically the same: Error path fails to free the previously allocated statement or expression. A few cases received special treatment though: - In netlink_parse_payload_stmt(), the leak is easily avoided by code reordering. - In netlink_parse_exthdr(), there's no point in introducing a goto label since there is but a single affected error check. - In netlink_parse_hash() non-error path leaked as well if sreg contained a concatenated expression. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Review switch statements for unmarked fall through casesPhil Sutter2018-02-281-0/+1
| | | | | | | | | | | | | | While revisiting all of them, clear a few oddities as well: - There's no point in marking empty fall through cases: They are easy to spot and a common concept when using switch(). - Fix indenting of break statement in one occasion. - Drop needless braces around one case which doesn't declare variables. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
* meta: introduce datatype ifname_typeArturo Borrero Gonzalez2018-02-251-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | This new datatype is a string subtype. It will allow us to build named maps/sets using meta keys like 'iifname', 'oifname', 'ibriport' or 'obriport'. Example: table inet t { set s { type ifname elements = { "eth0", "eth1" } } chain c { iifname @s accept oifname @s accept } } Signed-off-by: Arturo Borrero Gonzalez <arturo@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Spelling fixesVille Skyttä2018-02-151-1/+1
| | | | | Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: add meta_may_dependency_kill()Pablo Neira Ayuso2018-02-151-1/+71
| | | | | | | | | | | | | Do not exercise dependency removal for protocol key network payload expressions in bridge, netdev and inet families from meta expressions, more specifically: * inet: nfproto and ether type. * netdev and bridge: meta protocol and ether type. need to be left in place. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: get rid of __payload_dependency_kill()Pablo Neira Ayuso2018-02-151-6/+3
| | | | | | Use payload_dependency_release() instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add payload_dependency_exists()Pablo Neira Ayuso2018-02-151-6/+9
| | | | | | | | | | | | | | | This helper function tells us if there is already a protocol key payload expression, ie. those with EXPR_F_PROTOCOL flag set on, that we might want to remove since we can infer from another expression in the upper protocol base, eg. ip protocol tcp tcp dport 22 'ip protocol tcp' can be removed in the ip family since it is redundant, but not in the netdev, bridge and inet families, where we cannot make assumptions on the layer 3 protocol. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: pass family to payload_dependency_kill()Pablo Neira Ayuso2018-02-151-7/+11
| | | | | | | | | This context information is very relevant when deciding if a redundant dependency needs to be removed or not, specifically for the inet, bridge and netdev families. This new parameter is used by follow up patch entitled ("payload: add payload_may_dependency_kill()"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: add assertion to prevent infinite loopPablo Neira Ayuso2018-02-021-1/+4
| | | | | | | | | | | | | | | | | | | | | | | The following configuration: table inet filter { chain input { ct original ip daddr {1.2.3.4} accept } } is triggering an infinite loop. This problem also exists with concatenations and ct ip {s,d}addr. Until we have a solution for this, let's just prevent infinite loops. Now we hit this: # nft list ruleset nft: netlink_delinearize.c:124: netlink_parse_concat_expr: Assertion `consumed > 0' failed. Abort Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix protocol context update on big-endian systemsPhil Sutter2017-12-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an obscure bug on big-endian systems when trying to list a rule containing the expression 'ct helper tftp' which triggers the assert() call in mpz_get_type(). Florian identified the cause: ct_expr_pctx_update() is called for the relational expression which calls mpz_get_uint32() to get RHS value (assuming it is a protocol number). On big-endian systems, the misinterpreted value exceeds UINT_MAX. Expressions' pctx_update() callback should only be called for protocol matches, so ct_meta_common_postprocess() lacked a check for 'left->flags & EXPR_F_PROTOCOL' like the one already present in payload_expr_pctx_update(). In order to fix this in a clean way, this patch introduces a wrapper relational_expr_pctx_update() to be used instead of directly calling LHS's pctx_update() callback which unifies the necessary checks (and adds one more assert): - assert(expr->ops->type == EXPR_RELATIONAL) -> This is new, just to ensure the wrapper is called properly. - assert(expr->op == OP_EQ) -> This was moved from {ct,meta,payload}_expr_pctx_update(). - left->ops->pctx_update != NULL -> This was taken from expr_evaluate_relational(), a necessary requirement for the introduced wrapper to function at all. - (left->flags & EXPR_F_PROTOCOL) != 0 -> The crucial missing check which led to the problem. Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: deprecate "flow table" syntax, replace it by "meter"Pablo Neira Ayuso2017-11-241-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | According to bugzilla 1137: "flow tables" should not be syntactically unique. "Flow tables are always named, but they don't conform to the way sets, maps, and dictionaries work in terms of "add" and "delete" and all that. They are also "flow tables" instead of one word like "flows" or "throttle" or something. It seems weird to just have these break the syntactic expectations." Personally, I never liked the reference to "table" since we have very specific semantics in terms of what a "table" is netfilter for long time. This patch promotes "meter" as the new keyword. The former syntax is still accepted for a while, just to reduce chances of breaking things. At some point the former syntax will just be removed. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1137 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Arturo Borrero Gonzalez <arturo@netfilter.org>
* src: unifiy meta and ct postprocessingFlorian Westphal2017-09-291-28/+22
| | | | | | | | | From postprocess point of view meta and ct are logically the same, except that their storage area overlaps (union type), so if we extract the relevant fields we can move all of it into a single helper and support dependency store/kill for both expressions. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add alternate syntax for ct saddrFlorian Westphal2017-09-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | current syntax is: ct original saddr $address problem is that in inet, bridge etc. we lack context to figure out if this should fetch ipv6 or ipv4 from the conntrack structure. $address might not exist, rhs could e.g. be a set reference. One way to do this is to have users manually specifiy the dependeny: ct l3proto ipv4 ct original saddr $address Thats ugly, and, moreover, only needed for table families other than ip or ipv6. Pablo suggested to instead specify ip saddr, ip6 saddr: ct original ip saddr $address and let nft handle the dependency injection. This adds the required parts to the scanner and the grammar, next commit adds code to eval step to make use of this. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: store expression as set key instead of data typeFlorian Westphal2017-09-271-6/+6
| | | | | | | | | | | | Doing so retains legth information in case of unqualified data types, e.g. we now have 'meta iifname' expression instead of an (unqualified) string type. This allows to eventually use iifnames as set keys without adding yet another special data type for them. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add debugging mask to context structurePablo Neira Ayuso2017-08-231-1/+2
| | | | | | | So this toggle is not global anymore. Update name that fits better with the semantics of this variable. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add tcp options set supportFlorian Westphal2017-08-221-2/+19
| | | | | | | | | | | | This adds support for tcp mss mangling: nft add rule filter input tcp option maxseg size 1200 Its also possible to change other tcp option fields, but maxseg is one of the more useful ones to change. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: introduce struct nft_cacheVarsha Rao2017-08-141-2/+3
| | | | | | | | | | Pass variable cache_initialized and structure list_head as members of structure nft_cache. Joint work with Pablo Neira. Signed-off-by: Varsha Rao <rvarsha016@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: prefer ct event set foo,bar over 'set foo|bar'Florian Westphal2017-06-071-1/+6
| | | | | | | | Translates binop representation to a list-based one, so nft prints "ct event destroy,new" instead of 'ct event destroy|new'. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* expression: don't trim off unary expression on delinearizationPablo Neira Ayuso2017-05-291-4/+1
| | | | | | | | | | | This transformation introduces an unnecessary asymmetry between the linearization and delinearization steps that prevent rule deletion by name to work fine. Moreover, do not print htonl and ntonl from unary expression, this syntax is not allowed by the parser. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: reject: remove dependency for tcp-resetsFlorian Westphal2017-05-181-0/+6
| | | | | | We can remove a l4 dependency in ip/ipv6 families. Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink_delink_delinearize: don't store dependency unless relop checks is eq ↵Florian Westphal2017-05-151-1/+1
| | | | | | | | | | | check 'ip protocol ne 6' is not a dependency for nexthdr protocol, and must not be stored as such. Fixes: 0b858391781ba308 ("src: annotate follow up dependency just after killing another") Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: don't kill dependencies accross statementsFlorian Westphal2017-05-081-1/+27
| | | | | | | | | | | | | | | | | | | | nft currently translates ip protocol tcp meta mark set 1 tcp dport 22 to mark set 0x00000001 tcp dport 22 This is wrong, the latter form is same as mark set 0x00000001 ip protocol tcp tcp dport 22 and thats not correct (original rule sets mark for tcp packets only). We need to clear the dependency stack whenever we see a statement other than stmt_expr, as these will have side effects (counter, payload mangling, logging and the like). Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* hash: generate a random seed if seed option is emptyLiping Zhang2017-04-151-1/+3
| | | | | | | | | | | Typing the "nft add rule x y ct mark set jhash ip saddr mod 2" will not generate a random seed, instead, the seed will always be zero. So if seed option is empty, we shoulde not set the NFTA_HASH_SEED attribute, then a random seed will be generated in the kernel. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* exthdr: Add support for exthdr specific flagsPhil Sutter2017-03-101-2/+3
| | | | | | | | | This allows to have custom flags in exthdr expression, which is necessary for upcoming existence checks (of both IPv6 extension headers as well as TCP options). Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: hash: support of symmetric hashLaura Garcia Liebana2017-03-061-14/+21
| | | | | | | | | | | | | | | | | | | | | This patch provides symmetric hash support according to source ip address and port, and destination ip address and port. The new attribute NFTA_HASH_TYPE has been included to support different types of hashing functions. Currently supported NFT_HASH_JENKINS through jhash and NFT_HASH_SYM through symhash. The main difference between both types are: - jhash requires an expression with sreg, symhash doesn't. - symhash supports modulus and offset, but not seed. Examples: nft add rule ip nat prerouting ct mark set jhash ip saddr mod 2 nft add rule ip nat prerouting ct mark set symhash mod 2 Signed-off-by: Laura Garcia Liebana <laura.garcia@zevenet.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support zone set statement with optional directionFlorian Westphal2017-02-281-1/+5
| | | | | | | | | | | | nft automatically understands 'ct zone set 1' but when a direction is specified too we get a parser error since they are currently only allowed for plain ct expressions. This permits the existing syntax ('ct original zone') for all tokens with an optional direction also for set statements. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: remove integer_type_postprocess()Pablo Neira Ayuso2017-02-251-29/+0
| | | | | | | | | | Not required anymore since the set definition now comes with the right byteorder for integer types via NFTA_SET_USERDATA area. So we don't need to look at the lhs anymore. Note that this was a workaround that does not work with named sets, where we cannot assume we have a lhs, since it is valid to have a named set that is not referenced from any rule. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: automatically kill dependencies for exthdr and tcpoptManuel Messner2017-02-121-1/+1
| | | | | | | | | | | | | | | | | | | This patch automatically removes the dependencies for exthdr and tcpopt. # nft add rule filter input tcp option maxseg kind 3 counter. # nft list table filter input Before: # ip protocol 6 tcp option maxseg kind 3 counter After: # tcp option maxseg kind 3 counter Thus allowing to write tests as follows: # tcp option maxseg kind 3;ok Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add TCP option matchingManuel Messner2017-02-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables nft to match against TCP options. Currently these TCP options are supported: * End of Option List (eol) * No-Operation (noop) * Maximum Segment Size (maxseg) * Window Scale (window) * SACK Permitted (sack_permitted) * SACK (sack) * Timestamps (timestamp) Syntax: tcp options $option_name [$offset] $field_name Example: # count all incoming packets with a specific maximum segment size `x` # nft add rule filter input tcp option maxseg size x counter # count all incoming packets with a SACK TCP option where the third # (counted from zero) left field is greater `x`. # nft add rule filter input tcp option sack 2 left \> x counter If the offset (the `2` in the example above) is zero, it can optionally be omitted. For all non-SACK TCP options it is always zero, thus can be left out. Option names and field names are parsed from templates, similar to meta and ct options rather than via keywords to prevent adding more keywords than necessary. Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
* exthdr: prepare for tcp supportManuel Messner2017-02-121-1/+3
| | | | | | | | | | | right now exthdr only deals with ipv6 extension headers, followup patch will enable tcp option matching. This adds the 'op' arg to exthdr_init. Signed-off-by: Manuel Messner <mm@skelett.io> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add support for stateful object mapsPablo Neira Ayuso2017-01-031-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | You can create these maps using explicit map declarations: # nft add table filter # nft add chain filter input { type filter hook input priority 0\; } # nft add map filter badguys { type ipv4_addr : counter \; } # nft add rule filter input counter name ip saddr map @badguys # nft add counter filter badguy1 # nft add counter filter badguy2 # nft add element filter badguys { 192.168.2.3 : "badguy1" } # nft add element filter badguys { 192.168.2.4 : "badguy2" } Or through implicit map definitions: table ip filter { counter http-traffic { packets 8 bytes 672 } chain input { type filter hook input priority 0; policy accept; counter name tcp dport map { 80 : "http-traffic", 443 : "http-traffic"} } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add stateful object reference expressionPablo Neira Ayuso2017-01-031-0/+33
| | | | | | | | | This patch adds a new objref statement to refer to existing stateful objects from rules, eg. # nft add rule filter input counter name test counter Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add used quota supportPablo Neira Ayuso2017-01-031-0/+2
| | | | | | | | | | | | | table ip x { chain y { type filter hook forward priority 0; policy accept; quota over 200 mbytes used 1143 kbytes drop } } This patch allows us to list and to restore used quota. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Interpret OP_NEQ against a set as OP_LOOKUPAnatole Denis2016-11-291-0/+10
| | | | | | | | | | Now that the support for inverted matching is in the kernel and in libnftnl, add it to nftables too. This fixes bug #888 Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add log flags syntax supportLiping Zhang2016-11-241-0/+4
| | | | | | | | | | | | | | | | | | | | | Now NF_LOG_XXX is exposed to the userspace, we can set it explicitly. Like iptables LOG target, we can log TCP sequence numbers, TCP options, IP options, UID owning local socket and decode MAC header. Note the log flags are mutually exclusive with group. Some examples are listed below: # nft add rule t c log flags tcp sequence,options # nft add rule t c log flags ip options # nft add rule t c log flags skuid # nft add rule t c log flags ether # nft add rule t c log flags all # nft add rule t c log flags all group 1 <cmdline>:1:14-16: Error: flags and group are mutually exclusive add rule t c log flags all group 1 ^^^ Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add notrack supportPablo Neira Ayuso2016-11-141-0/+8
| | | | | | | This patch adds the notrack statement, to skip connection tracking for certain packets. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add offset attribute for hash expressionLaura Garcia Liebana2016-11-091-2/+3
| | | | | | | | | | | Add support to add an offset to the hash generator, eg. ct mark set hash ip saddr mod 10 offset 100 This will generate marks with series between 100-109. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add fib expressionFlorian Westphal2016-10-281-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the 'fib' expression which can be used to obtain the output interface from the route table based on either source or destination address of a packet. This can be used to e.g. add reverse path filtering: # drop if not coming from the same interface packet # arrived on # nft add rule x prerouting fib saddr . iif oif eq 0 drop # accept only if from eth0 # nft add rule x prerouting fib saddr . iif oif eq "eth0" accept # accept if from any valid interface # nft add rule x prerouting fib saddr oif accept Querying of address type is also supported. This can be used to e.g. only accept packets to addresses configured in the same interface: # fib daddr . iif type local Its also possible to use mark and verdict map, e.g.: # nft add rule x prerouting meta mark set 0xdead fib daddr . mark type vmap { blackhole : drop, prohibit : drop, unicast : accept } Signed-off-by: Florian Westphal <fw@strlen.de>
* rt: introduce routing expressionAnders K. Pedersen2016-10-281-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce rt expression for routing related data with support for nexthop (i.e. the directly connected IP address that an outgoing packet is sent to), which can be used either for matching or accounting, eg. # nft add rule filter postrouting \ ip daddr 192.168.1.0/24 rt nexthop != 192.168.0.1 drop This will drop any traffic to 192.168.1.0/24 that is not routed via 192.168.0.1. # nft add rule filter postrouting \ flow table acct { rt nexthop timeout 600s counter } # nft add rule ip6 filter postrouting \ flow table acct { rt nexthop timeout 600s counter } These rules count outgoing traffic per nexthop. Note that the timeout releases an entry if no traffic is seen for this nexthop within 10 minutes. # nft add rule inet filter postrouting \ ether type ip \ flow table acct { rt nexthop timeout 600s counter } # nft add rule inet filter postrouting \ ether type ip6 \ flow table acct { rt nexthop timeout 600s counter } Same as above, but via the inet family, where the ether type must be specified explicitly. "rt classid" is also implemented identical to "meta rtclassid", since it is more logical to have this match in the routing expression going forward. Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add offset attribute for numgen expressionLaura Garcia Liebana2016-10-271-2/+3
| | | | | | | | | | | | | Add support to add an offset to the numgen generated value. Example: ct mark set numgen inc mod 2 offset 100 This will generate marks with serie like 100, 101, 100, ... Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use new range expression for != [a,b] intervalsPablo Neira Ayuso2016-10-171-0/+45
| | | | | | | Use new range expression in the kernel to fix wrong bytecode generation. This patch also adjust tests so we don't hit problems there. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: fix compile error due to _UNTIL renamed to _MODULUS in libnftnlLiping Zhang2016-09-121-1/+1
| | | | | | | | | | | | | In the latest libnftnl, NFTNL_EXPR_NG_UNTIL was renamed to NFTNL_EXPR_NG_MODULUS, so compile error happened: netlink_linearize.c: In function ‘netlink_gen_numgen’: netlink_linearize.c:184:26: error: ‘NFTNL_EXPR_NG_UNTIL’ undeclared (first use in this function) Also update NFTA_NG_UNTIL to NFTA_NG_MODULUS. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: remove byteorder conversionFlorian Westphal2016-09-091-2/+0
| | | | | | | | | This is what made ether addresses get formatted correctly with plain payload expression (ether saddr 00:11 ...) when listing rules. Not needed anymore since etheraddr_type is now BIG_ENDIAN. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink_delinearize: Avoid potential null pointer derefPablo Neira Ayuso2016-09-071-0/+13
| | | | | | | | | | | | | Phil Sutter says: As netlink_get_register() may return NULL, we must not pass the returned data unchecked to expr_set_type() as that will dereference it. Since the parser has failed at that point anyway, by returning early we can skip the useless statement allocation that follows in netlink_parse_ct_stmt(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Phil Sutter <phil@nwl.cc>
* src: add hash expressionPablo Neira Ayuso2016-08-291-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is special expression that transforms an input expression into a 32-bit unsigned integer. This expression takes a modulus parameter to scale the result and the random seed so the hash result becomes harder to predict. You can use it to set the packet mark, eg. # nft add rule x y meta mark set jhash ip saddr . ip daddr mod 2 seed 0xdeadbeef You can combine this with maps too, eg. # nft add rule x y dnat to jhash ip saddr mod 2 seed 0xdeadbeef map { \ 0 : 192.168.20.100, \ 1 : 192.168.30.100 \ } Currently, this expression implements the jenkins hash implementation available in the Linux kernel: http://lxr.free-electrons.com/source/include/linux/jhash.h But it should be possible to extend it to support any other hash function type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add numgen expressionPablo Neira Ayuso2016-08-291-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new expression allows us to generate incremental and random numbers bound to a specified modulus value. The following rule sets the conntrack mark of 0 to the first packet seen, then 1 to second packet, then 0 again to the third packet and so on: # nft add rule x y ct mark set numgen inc mod 2 A more useful example is a simple load balancing scenario, where you can also use maps to set the destination NAT address based on this new numgen expression: # nft add rule nat prerouting \ dnat to numgen inc mod 2 map { 0 : 192.168.10.100, 1 : 192.168.20.200 } So this is distributing new connections in a round-robin fashion between 192.168.10.100 and 192.168.20.200. Don't forget the special NAT chain semantics: Only the first packet evaluates the rule, follow up packets rely on conntrack to apply the NAT information. You can also emulate flow distribution with different backend weights using intervals: # nft add rule nat prerouting \ dnat to numgen inc mod 10 map { 0-5 : 192.168.10.100, 6-9 : 192.168.20.200 } So 192.168.10.100 gets 60% of the workload, while 192.168.20.200 gets 40%. We can also be mixed with dynamic sets, thus weight can be updated in runtime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add quota statementPablo Neira Ayuso2016-08-291-0/+14
| | | | | | | | | | | | | This new statement is stateful, so it can be used from flow tables, eg. # nft add rule filter input \ flow table http { ip saddr timeout 60s quota over 50 mbytes } drop This basically sets a quota per source IP address of 50 mbytes after which packets are dropped. Note that the timeout releases the entry if no traffic is seen from this IP after 60 seconds. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: decode payload statmentFlorian Westphal2016-08-011-5/+178
| | | | | | | | This allows nft to display payload set operations if the header isn't byte aligned or has non-byte divisible sizes. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netlink: add __binop_adjust helperFlorian Westphal2016-08-011-4/+9
| | | | | | | | | | | | binop_adjust takes an expression whose LHS is expected to be the binop expression that we use to adjust a payload expression based on a mask (to match sub-byte headers like iphdr->version). A followup patch has to pass the binop directly, so add add a helper for it. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add xt compat supportPablo Neira Ayuso2016-07-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At compilation time, you have to pass this option. # ./configure --with-xtables And libxtables needs to be installed in your system. This patch allows to list a ruleset containing xt extensions loaded through iptables-compat-restore tool. Example: $ iptables-save > ruleset $ cat ruleset *filter :INPUT ACCEPT [0:0] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [0:0] -A INPUT -p tcp -m multiport --dports 80,81 -j REJECT COMMIT $ sudo iptables-compat-restore ruleset $ sudo nft list rulseset table ip filter { chain INPUT { type filter hook input priority 0; policy accept; ip protocol tcp tcp dport { 80,81} counter packets 0 bytes 0 reject } chain FORWARD { type filter hook forward priority 0; policy drop; } chain OUTPUT { type filter hook output priority 0; policy accept; } } A translation of the extension is shown if this is available. In other case, match or target definition is preceded by a hash. For example, classify target has not translation: $ sudo nft list chain mangle POSTROUTING table ip mangle { chain POSTROUTING { type filter hook postrouting priority -150; policy accept; ip protocol tcp tcp dport 80 counter packets 0 bytes 0 # CLASSIFY set 20:10 ^^^ } } If the whole ruleset is translatable, the users can (re)load it using "nft -f" and get nft native support for all their rules. This patch is joint work by the authors listed below. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>