nftables - nft command line tool

	Commit message (Collapse)	Author	Age	Files	Lines
*	src: use ibrname and obrname	Pablo Neira Ayuso	2018-04-19	1	-2/+2
\| \| \| \| \| \| \| \| \|	Legacy tool name is 'brctl' and so the 'br' prefix is already known. If we use ibrname and obrname it looks consistent with iifname and oifname. So let's this instead of ibridgename and obridgename since Florian likes this too. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: rename ibrportname, obrportname	Florian Westphal	2018-04-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For bridge, iifname is the port name, whereas 'ibrport' is the logical name of the bridge ("br0") the port ("iifname") is enslaved to. So, 'ibrport' is a misnomer. libnftl calls these 'bri_iifname' and 'bri_oifname', which is good but using 'briiifname' in nft is rather ugly, so use 'ibridgename' and 'obridgename' instead. Old names are still recognized, listing shows the new names. Signed-off-by: Florian Westphal <fw@strlen.de>
*	scanner: add helpers token	Florian Westphal	2018-04-17	1	-0/+1
\| \| \| \| \| \| \| \| \|	without it, you get: nft list ct helpers table filter Error: syntax error, unexpected string, expecting helper or helpers Fixes: 14fd3ad720f6e ("src: prepare for future ct timeout policy support") Signed-off-by: Florian Westphal <fw@strlen.de>
*	erec: Review erec_print()	Phil Sutter	2018-04-14	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A new requirement to erec for the upcoming JSON support is printing records with file input descriptors without open stream. The approach is to treat 'name' field as file name, open it, extract the offending line and close it again. Further changes to libnftables input parsing routines though have shown that the whole concept of file pointer reuse in erec is tedious and not worth keeping: * Closed files are to be supported as well, so there needs to be fallback code for opening the file anyway. * When input descriptor is duplicated from parser state into an error record, the file pointer is copied as well. Therefore care has to be taken to not free the parser state before any error records have been printed. This is the only point where old and duplicated input descriptors are connected. Therefore drop struct input_descriptor's 'fp' field and just always open the file by name. This way also the old stream offset doesn't have to be restored after reading. While being at it, this patch fixes two other (potential) problems: * If the offending line from input contains tabs, add them at the right position in the marker buffer as well to avoid misalignment. * The input file may not be seekable (/dev/stdin for instance), so skip printing of offending line and markers if it couldn't be read properly. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	libnftables: Fix for input without trailing newline	Phil Sutter	2018-04-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Input parser implementation requires a newline at end of input, otherwise the last pattern may not be recognized correctly. If input comes from a file, the culprit was YY_INPUT macro not expecting the last line not ending with a newline, so the last word wasn't accepted. This is easily fixed by checking for feof(yyin) in there. A simple test case for that is: \| echo -en "table ip t {\nchain c {\n}\n}" >/tmp/foo \| nft -f /tmp/foo Input from a string buffer is a bit more tricky: The culprit here is that detection of classid pattern is done by checking the character following it which makes it impossible to sit right at end of input and I haven't found an alternative to that. After dropping the manual newline appending when combining argv into a single buffer in main(), a rule like this won't be recognized anymore: \| nft add rule ip t c meta priority feed:babe Since a direct call to run_cmd_from_buffer() via libnftables bypasses the sanitizing done in main() entirely, it has to happen in libnftables instead which means creating a newline-terminated duplicate of the input buffer. Note that main() created a buffer one byte longer than needed since it accounts for whitespace at end of each argv but doesn't add it to the buffer for the last one, so buffer length is reduced by two bytes instead of just one although only one less character is printed into it. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: Adding support for segment routing header 'srh'	Ahmed Abdelsalam	2018-03-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Segment Routing Header "SRH" is new type of IPv6 Routing extension header (type 4). SRH contains a list of segments (each is represented as an IPv6 address) to be visited by packets during the journey from source to destination. The SRH specification are defined in the below IETF SRH draft. https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-07 Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: support for get element command	Pablo Neira Ayuso	2018-03-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	You need a Linux kernel >= 4.15 to use this feature. This patch allows us to dump the content of an existing set. # nft list ruleset table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2, 3.3.3.3, 5.5.5.5-6.6.6.6 } } } You check if a single element exists in the set: # nft get element x x { 1.1.1.5 } table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2 } } } Output means '1.1.1.5' belongs to the '1.1.1.1-2.2.2.2' interval. You can also check for intervals: # nft get element x x { 1.1.1.1-2.2.2.2 } table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2 } } } If you try to check for an element that doesn't exist, an error is displayed. # nft get element x x { 1.1.1.0 } Error: Could not receive set elements: No such file or directory get element x x { 1.1.1.0 } ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You can also check for multiple elements in one go: # nft get element x x { 1.1.1.5, 5.5.5.10 } table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2, 5.5.5.5-6.6.6.6 } } } You can also use this to fetch the existing timeout for specific elements, in case you have a set with timeouts in place: # nft get element w z { 2.2.2.2 } table ip w { set z { type ipv4_addr timeout 30s elements = { 2.2.2.2 expires 17s } } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: flow offload support	Pablo Neira Ayuso	2018-03-05	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This patch allows us to refer to existing flowtables: # nft add rule x x flow offload @m Packets matching this rule create an entry in the flow table 'm', hence, follow up packets that get to the flowtable at ingress bypass the classic forwarding path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add support to add flowtables	Pablo Neira Ayuso	2018-03-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows you to create flowtable: # nft add table x # nft add flowtable x m { hook ingress priority 10\; devices = { eth0, wlan0 }\; } You have to specify hook and priority. So far, only the ingress hook is supported. The priority represents where this flowtable is placed in the ingress hook, which is registered to the devices that the user specifies. You can also use the 'create' command instead to bail out in case that there is an existing flowtable with this name. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: support for flowtable listing	Pablo Neira Ayuso	2018-03-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows you to dump existing flowtable. # nft list ruleset table ip x { flowtable x { hook ingress priority 10 devices = { eth0, tap0 } } } You can also list existing flowtables via: # nft list flowtables table ip x { flowtable x { hook ingress priority 10 devices = { eth0, tap0 } } } You need a Linux kernel >= 4.16-rc to test this new feature. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	Added undefine/redefine keywords	David Fabian	2018-02-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a small patch to nft which adds two new keywords - undefine and redefine. undefine simply undefines a variable from the current scope. redefine allows one to change a variable definition. We have a firewall written in bash (using iptables) that is organized by customer VLANs. Each VLAN has its own set of bash variables holding things like uplink iface names, gateway IPs, etc. We want to rewrite the firewall to nftables but are stuck on the fact that nft variables cannot be overridden in the same scope. We have each VLAN configuration in a separate file containing pre/post-routing, input, output and forward rules,and we include those files to a master firewall configuration. One solution is to rename all the variables with some VLAN specific (pre/su)ffix. But that is cumbersome. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add 'auto-merge' option to sets	Pablo Neira Ayuso	2018-01-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After discussions with Karel here: https://bugzilla.netfilter.org/show_bug.cgi?id=1184 And later on with Phil Sutter, we decided to disable the automatic merge feature in sets with intervals. This feature is problematic because it introduces an inconsistency between what we add and what we later on get. This is going to get worse with the upcoming timeout support for intervals. Therefore, we turned off this by default. However, Jeff Kletsky and folks like this feature, so let's restore this behaviour on demand with this new 'auto-merge' statement, that you can place on the set definition, eg. # nft list ruleset table ip x { ... set y { type ipv4_addr flags interval auto-merge } } # nft add element x z { 1.1.1.1-2.2.2.2, 1.1.1.2 } Regarding implementation details: Given this feature only makes sense from userspace, let's store this in the set user data area, so nft knows it has to do automatic merge of adjacent/overlapping elements as per user request. # nft add set x z { type ipv4_addr\; auto-merge\; } Error: auto-merge only works with interval sets add set x z { type ipv4_addr; auto-merge; } ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=1216 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: Add import command for low level json	Shyam Saini	2018-01-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new operation allows to import low level virtual machine ruleset in json to make incremental changes using the parse functions of libnftnl. A basic way to test this new functionality is: $ cat file.json \| nft import vm json where the file.json is a ruleset exported in low level json format. To export json rules in low level virtual machine format we need to specify "vm" token before json. See below $ nft export vm json and $ nft export/import json will do no operations. Same goes with "$nft monitor" Highly based on work from Alvaro Neira <alvaroneay@gmail.com> and Arturo Borrero <arturo@netfilter.org> Acked-by: Arturo Borrero Gonzalez <arturo@netfilter.org> Signed-off-by: Shyam Saini <mayhs11saini@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: deprecate "flow table" syntax, replace it by "meter"	Pablo Neira Ayuso	2017-11-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to bugzilla 1137: "flow tables" should not be syntactically unique. "Flow tables are always named, but they don't conform to the way sets, maps, and dictionaries work in terms of "add" and "delete" and all that. They are also "flow tables" instead of one word like "flows" or "throttle" or something. It seems weird to just have these break the syntactic expectations." Personally, I never liked the reference to "table" since we have very specific semantics in terms of what a "table" is netfilter for long time. This patch promotes "meter" as the new keyword. The former syntax is still accepted for a while, just to reduce chances of breaking things. At some point the former syntax will just be removed. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1137 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Arturo Borrero Gonzalez <arturo@netfilter.org>
*	libnftables: Introduce getters and setters for everything	Phil Sutter	2017-10-24	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces getter/setter pairs for all parts in struct nft_ctx (and contained structs) which should be configurable. Most of them are simple ones, just allowing to get/set a given field: * nft_ctx_{get,set}_dry_run() -> ctx->check * nft_ctx_output_{get,set}_numeric() -> ctx->output.numeric * nft_ctx_output_{get,set}_stateless() -> ctx->output.stateless * nft_ctx_output_{get,set}_ip2name() -> ctx->output.ip2name * nft_ctx_output_{get,set}_debug() -> ctx->debug_mask * nft_ctx_output_{get,set}_handle() -> ctx->output.handle * nft_ctx_output_{get,set}_echo() -> ctx->output.echo A more complicated case is include paths handling: In order to keep the API simple, remove INCLUDE_PATHS_MAX restraint and dynamically allocate nft_ctx field include_paths instead. So there is: * nft_ctx_add_include_path() -> add an include path to the list * nft_ctx_clear_include_paths() -> flush the list of include paths Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	scanner: IPv4-Mapped IPv6 addresses support	Pablo Neira Ayuso	2017-10-09	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The scanner rejects IPv4-Mapped IPv6 addresses, eg. # cat test #!/usr/sbin/nft -f flush ruleset table inet global { set blackhole_ipv6 { type ipv6_addr flags interval elements = { ::ffff:0.0.0.0/96 } } } # nft -f test test:8:30-38: Error: syntax error, unexpected string, expecting comma or '}' elements = { ::ffff:0.0.0.0/96 } ^^^^^^^^^^ According to RFC4291, Sect. 2.5.5.2. IPv4-Mapped IPv6 Address: \| 80 bits \| 16 \| 32 bits \| +--------------------------------------+--------------------------+ \|0000..............................0000\|FFFF\| IPv4 address \| +--------------------------------------+----+---------------------+ Update scanner bits to parse this. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1188 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser_bison: use keywords in ct expression	Pablo Neira Ayuso	2017-09-27	1	-0/+9
\| \| \| \| \| \| \| \|	Using string give us more chances to hit shift/reduce conflicts when extending this grammar, more specifically, from the stmt_expr rule, so add keywords for this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add stateful object support for limit	Pablo M. Bermudo Garay	2017-09-04	1	-0/+1
\| \| \| \| \| \| \| \|	This patch adds support for a new type of stateful object: limit. Creation, deletion and listing operations are supported. Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	scanner: Make use of yylex_init_extra()	Phil Sutter	2017-08-24	1	-2/+1
\| \| \| \| \| \|	This combines the calls to yylex_init() and yyset_extra(). Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: Fix for wrong parameter type of scanner_destroy()	Phil Sutter	2017-08-24	1	-1/+1
\| \| \| \| \| \| \| \|	The function takes the scanner as argument, not the state. This wasn't a real issue since scanner is a void pointer, which means it's only casted around without need. So this fix is a rather cosmetic one. Signed-off-by: Phil Sutter <phil@nwl.cc>
*	scanner: Fix for memleak due to unclosed file pointer	Phil Sutter	2017-08-24	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When including a file, it is opened by fopen() and therefore needs to be closed after scanning has finished using fclose(), otherwise valgrind will report a memleak. This patch changes struct input_descriptor to track the opened FILE pointer instead of the file descriptor so the pointer is available for closing in scanner_destroy(). While at it, change erec_print() to work on the open FILE pointer so it doesn't have to call fileno() in beforehand. And as a little bonus, use C99 initializer of the buffer to get rid of the call to memset(). Note that it is necessary to call erec_print_list() prior to destroying the scanner, otherwise it will start manipulating an already freed FILE pointer (and therefore crash the program). Signed-off-by: Phil Sutter <phil@nwl.cc>
*	src: add include_paths to struct nft_ctx	Pablo Neira Ayuso	2017-08-23	1	-5/+5
\| \| \| \| \| \| \|	Not convenient to keep this as static for the upcoming library, so let's move it where it belongs. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: handle rule tracing as an monitor object	Pablo Neira Ayuso	2017-08-02	1	-0/+1
\| \| \| \| \| \|	Traces are not an event type, they should be handled as an object. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	scanner: free filename when destroying scanner	Eric Leblond	2017-07-17	1	-2/+9
\| \| \| \| \| \| \| \|	To be able to do so we duplicate the name in the indesc if it is set. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	scanner: support for wildcards in include statements.	Ismo Puustinen	2017-06-27	1	-119/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use glob() to find paths in include statements. The rules are these: 1. If no files can be found in the pattern with wildcards, do not return an error. 2. Do not match any files beginning with '.'. 3. Do not handle include directories anymore. For example, the statement: include "foo/" would now need to be rewritten: include "foo/*" Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	scanner: add files in include dirs in alphabetical order.	Ismo Puustinen	2017-06-07	1	-27/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This means that if you have a directory structure like this /foo /foo/02_rules.nft /foo/01_rules.nft where *.nft files in directory /foo are nft scripts, then an include statement in another nft script like this include "/foo/" guarantees that "01_rules.nft" is loaded before "02_rules.nft". Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	scanner: add support for include directories	Ismo Puustinen	2017-06-06	1	-23/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a string after "include" keyword points to a directory instead of a file, consider the directory to contain only nft rule files and try to load them all. This helps with a use case where services drop their own firewall configuration files into a directory and nft needs to include those without knowing the exact file names. File loading order from the include directory is not specified, so the files inside an include directory should not depend on each other. Fixes(Bug 1154 - Allow include statement to operate on directories and/or wildcards). Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	exthdr: Implement existence check	Phil Sutter	2017-03-10	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	This allows to check for existence of an IPv6 extension or TCP option header by using the following syntax: \| exthdr frag exists \| tcpopt window exists Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	Introduce boolean datatype and boolean expression	Phil Sutter	2017-03-10	1	-0/+3
\| \| \| \| \|	Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: hash: support of symmetric hash	Laura Garcia Liebana	2017-03-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch provides symmetric hash support according to source ip address and port, and destination ip address and port. The new attribute NFTA_HASH_TYPE has been included to support different types of hashing functions. Currently supported NFT_HASH_JENKINS through jhash and NFT_HASH_SYM through symhash. The main difference between both types are: - jhash requires an expression with sreg, symhash doesn't. - symhash supports modulus and offset, but not seed. Examples: nft add rule ip nat prerouting ct mark set jhash ip saddr mod 2 nft add rule ip nat prerouting ct mark set symhash mod 2 Signed-off-by: Laura Garcia Liebana <laura.garcia@zevenet.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: revisit tcp options support	Pablo Neira Ayuso	2017-02-28	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rework syntax, add tokens so we can extend the grammar more easily. This has triggered several syntax changes with regards to the original patch, specifically: tcp option sack0 left 1 There is no space between sack and the block number anymore, no more offset field, now they are a single field. Just like we do with rt, rt0 and rt2. This simplifies our grammar and that is good since it makes our life easier when extending it later on to accomodate new features. I have also renamed sack_permitted to sack-permitted. I couldn't find any option using underscore so far, so let's keep it consistent with what we have. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add conntrack zone support	Florian Westphal	2017-02-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This enables zone get/set support. As the zone can be optionally tied to a direction as well we need a new token for this (unless we turn reply/original into tokens in which case we could handle zone via STRING). There was some discussion on how zone set support should be handled, especially 'zone set 1'. There are several issues to consider: 1. its not possible to change a zone 'later on', any given conntrack flow has exactly one zone for its entire lifetime. 2. to create conntracks in a given zone, the zone therefore has to be assigned before the packet gets picked up by conntrack (so that lookup finds the correct existing flow or the flow is created with the desired zone id). In iptables, this is enforced because zones are assigned with CT target and this is restricted to the 'raw' table in iptables, which runs after defragmentation but before connection tracking. 3. Thus, in nftables the 'ct zone set' rule needs to hook before conntrack too, e.g. via table raw { chain pre { type filter hook prerouting priority -300; iif eth3 ct zone set 23 } chain out { type filter hook output priority -300; oif eth3 ct zone set 23 } } ... but this is not enforced. There were two alternatives to better document this. One was to use an explicit 'template' keyword: nft ... template zone set 23 ... but 'connection tracking templates' are a kernel detail that users should not and need not know about. The other one was to use the meta keyword instead since we're (from a practical point of view) assigning the zone to the packet, not the conntrack: nft ... meta zone set 23 However, next patch also supports 'directional' zones, and nft ... meta original zone 23 makes no sense because 'direction' refers to a direction as understood by the connection tracker. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add TCP option matching	Manuel Messner	2017-02-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch enables nft to match against TCP options. Currently these TCP options are supported: * End of Option List (eol) * No-Operation (noop) * Maximum Segment Size (maxseg) * Window Scale (window) * SACK Permitted (sack_permitted) * SACK (sack) * Timestamps (timestamp) Syntax: tcp options $option_name [$offset] $field_name Example: # count all incoming packets with a specific maximum segment size `x` # nft add rule filter input tcp option maxseg size x counter # count all incoming packets with a SACK TCP option where the third # (counted from zero) left field is greater `x`. # nft add rule filter input tcp option sack 2 left \> x counter If the offset (the `2` in the example above) is zero, it can optionally be omitted. For all non-SACK TCP options it is always zero, thus can be left out. Option names and field names are parsed from templates, similar to meta and ct options rather than via keywords to prevent adding more keywords than necessary. Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
*	ct: add average bytes per packet counter support	Liping Zhang	2017-01-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to connbytes extension in iptables, now you can use it to match average bytes per packet a connection has transferred so far. For example, match avgpkt in "BOTH" diretion: # nft add rule x y ct avgpkt \> 100 Match avgpkt in reply direction: # nft add rule x y ct reply avgpkt \< 900 Or match avgpkt in original direction: # nft add rule x y ct original avgpkt \> 200 Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	scanner: fix search_in_include_path test	Anatole Denis	2017-01-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	clang emits a warning in this function as we're using a boolean as the third argument to strncmp. Indeed, this function only checks the first byte of the path as is, so files beginning with . will be incorrectly included from the current working directory instead of the include directory. Fixes: f92a1a5c4a87 ("scanner: honor absolute and relative paths via include file") Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add stateful object reference expression	Pablo Neira Ayuso	2017-01-03	1	-0/+1
\| \| \| \| \| \| \| \| \|	This patch adds a new objref statement to refer to existing stateful objects from rules, eg. # nft add rule filter input counter name test counter Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: reset internal stateful objects	Pablo Neira Ayuso	2017-01-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows you to atomically dump and reset stateful objects, eg. # nft list counters table ip filter { counter test { packets 1024 bytes 100000 } } # nft reset quotas table filter counter test { packets 1024 bytes 100000 } # nft reset quotas table filter counter test { packets 0 bytes 0 } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: listing of stateful objects	Pablo Neira Ayuso	2017-01-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows you to dump existing stateful objects, eg. # nft list ruleset table ip filter { counter test { packets 64 bytes 1268 } quota test { over 1 mbytes used 1268 bytes } chain input { type filter hook input priority 0; policy accept; quota name test drop counter name test } } # nft list quotas table ip filter { quota test { over 1 mbytes used 1268 bytes } } # nft list counters table ip filter { counter test { packets 64 bytes 1268 } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add used quota support	Pablo Neira Ayuso	2017-01-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	table ip x { chain y { type filter hook forward priority 0; policy accept; quota over 200 mbytes used 1143 kbytes drop } } This patch allows us to list and to restore used quota. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add log flags syntax support	Liping Zhang	2016-11-24	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now NF_LOG_XXX is exposed to the userspace, we can set it explicitly. Like iptables LOG target, we can log TCP sequence numbers, TCP options, IP options, UID owning local socket and decode MAC header. Note the log flags are mutually exclusive with group. Some examples are listed below: # nft add rule t c log flags tcp sequence,options # nft add rule t c log flags ip options # nft add rule t c log flags skuid # nft add rule t c log flags ether # nft add rule t c log flags all # nft add rule t c log flags all group 1 <cmdline>:1:14-16: Error: flags and group are mutually exclusive add rule t c log flags all group 1 ^^^ Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add notrack support	Pablo Neira Ayuso	2016-11-14	1	-0/+2
\| \| \| \| \| \| \|	This patch adds the notrack statement, to skip connection tracking for certain packets. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add fib expression	Florian Westphal	2016-10-28	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds the 'fib' expression which can be used to obtain the output interface from the route table based on either source or destination address of a packet. This can be used to e.g. add reverse path filtering: # drop if not coming from the same interface packet # arrived on # nft add rule x prerouting fib saddr . iif oif eq 0 drop # accept only if from eth0 # nft add rule x prerouting fib saddr . iif oif eq "eth0" accept # accept if from any valid interface # nft add rule x prerouting fib saddr oif accept Querying of address type is also supported. This can be used to e.g. only accept packets to addresses configured in the same interface: # fib daddr . iif type local Its also possible to use mark and verdict map, e.g.: # nft add rule x prerouting meta mark set 0xdead fib daddr . mark type vmap { blackhole : drop, prohibit : drop, unicast : accept } Signed-off-by: Florian Westphal <fw@strlen.de>
*	rt: introduce routing expression	Anders K. Pedersen	2016-10-28	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce rt expression for routing related data with support for nexthop (i.e. the directly connected IP address that an outgoing packet is sent to), which can be used either for matching or accounting, eg. # nft add rule filter postrouting \ ip daddr 192.168.1.0/24 rt nexthop != 192.168.0.1 drop This will drop any traffic to 192.168.1.0/24 that is not routed via 192.168.0.1. # nft add rule filter postrouting \ flow table acct { rt nexthop timeout 600s counter } # nft add rule ip6 filter postrouting \ flow table acct { rt nexthop timeout 600s counter } These rules count outgoing traffic per nexthop. Note that the timeout releases an entry if no traffic is seen for this nexthop within 10 minutes. # nft add rule inet filter postrouting \ ether type ip \ flow table acct { rt nexthop timeout 600s counter } # nft add rule inet filter postrouting \ ether type ip6 \ flow table acct { rt nexthop timeout 600s counter } Same as above, but via the inet family, where the ether type must be specified explicitly. "rt classid" is also implemented identical to "meta rtclassid", since it is more logical to have this match in the routing expression going forward. Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	meta: allow resolving meta keys at run time	Florian Westphal	2016-10-27	1	-2/+0
\| \| \| \| \| \| \| \| \|	use the meta template to translate the textual token to the enum value. This allows to remove two keywords from the scanner and also means we do not need to introduce new keywords when more meta keys get added. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	ct: allow resolving ct keys at run time	Florian Westphal	2016-10-27	1	-6/+0
\| \| \| \| \| \| \|	... and remove those keywords we no longer need. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	parser: add offset keyword and parser rule	Pablo Neira Ayuso	2016-10-27	1	-0/+1
\| \| \| \| \| \|	This is required by the numgen and jhash expressions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: don't need keyword for log level	Pablo Neira Ayuso	2016-10-21	1	-8/+0
\| \| \| \| \| \| \|	We can handle log levels without keywords in our grammar, use string instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add hash expression	Pablo Neira Ayuso	2016-08-29	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is special expression that transforms an input expression into a 32-bit unsigned integer. This expression takes a modulus parameter to scale the result and the random seed so the hash result becomes harder to predict. You can use it to set the packet mark, eg. # nft add rule x y meta mark set jhash ip saddr . ip daddr mod 2 seed 0xdeadbeef You can combine this with maps too, eg. # nft add rule x y dnat to jhash ip saddr mod 2 seed 0xdeadbeef map { \ 0 : 192.168.20.100, \ 1 : 192.168.30.100 \ } Currently, this expression implements the jenkins hash implementation available in the Linux kernel: http://lxr.free-electrons.com/source/include/linux/jhash.h But it should be possible to extend it to support any other hash function type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add numgen expression	Pablo Neira Ayuso	2016-08-29	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new expression allows us to generate incremental and random numbers bound to a specified modulus value. The following rule sets the conntrack mark of 0 to the first packet seen, then 1 to second packet, then 0 again to the third packet and so on: # nft add rule x y ct mark set numgen inc mod 2 A more useful example is a simple load balancing scenario, where you can also use maps to set the destination NAT address based on this new numgen expression: # nft add rule nat prerouting \ dnat to numgen inc mod 2 map { 0 : 192.168.10.100, 1 : 192.168.20.200 } So this is distributing new connections in a round-robin fashion between 192.168.10.100 and 192.168.20.200. Don't forget the special NAT chain semantics: Only the first packet evaluates the rule, follow up packets rely on conntrack to apply the NAT information. You can also emulate flow distribution with different backend weights using intervals: # nft add rule nat prerouting \ dnat to numgen inc mod 10 map { 0-5 : 192.168.10.100, 6-9 : 192.168.20.200 } So 192.168.10.100 gets 60% of the workload, while 192.168.20.200 gets 40%. We can also be mixed with dynamic sets, thus weight can be updated in runtime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	src: add quota statement	Pablo Neira Ayuso	2016-08-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	This new statement is stateful, so it can be used from flow tables, eg. # nft add rule filter input \ flow table http { ip saddr timeout 60s quota over 50 mbytes } drop This basically sets a quota per source IP address of 50 mbytes after which packets are dropped. Note that the timeout releases the entry if no traffic is seen from this IP after 60 seconds. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>