summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* src: add datatype->describe()Pablo Neira Ayuso2021-03-251-0/+1
| | | | | | | | | | | | As an alternative to print the datatype values when no symbol table is available. Use it to print protocols available via getprotobynumber() which actually refers to /etc/protocols. Not very efficient, getprotobynumber() causes a series of open()/close() calls on /etc/protocols, but this is called from a non-critical path. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1503 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nftables: add flags offload to flowtableFrank Wunderlich2021-03-251-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | allow flags (currently only offload) in flowtables like it is stated here: https://lwn.net/Articles/804384/ tested on mt7622/Bananapi-R64 table ip filter { flowtable f { hook ingress priority filter + 1 devices = { lan3, lan0, wan } flags offload; } chain forward { type filter hook forward priority filter; policy accept; ip protocol { tcp, udp } flow add @f } } table ip nat { chain post { type nat hook postrouting priority filter; policy accept; oifname "wan" masquerade } } Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* scanner: log: move to own scopeFlorian Westphal2021-03-241-0/+2
| | | | | | | GROUP and PREFIX are used by igmp and nat, so they can't be moved out of INITIAL scope yet. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: counter: move to own scopeFlorian Westphal2021-03-241-0/+1
| | | | | | move bytes/packets away from initial state. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: add support for scope nestingFlorian Westphal2021-03-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding a COUNTER scope introduces parsing errors. Example: add rule ... counter ip saddr 1.2.3.4 This is supposed to be COUNTER IP SADDR SYMBOL but it will be parsed as COUNTER IP STRING SYMBOL ... and rule fails with unknown saddr. This is because IP state change gets popped right after it was pushed. bison parser invokes scanner_pop_start_cond() helper via 'close_scope_counter' rule after it has processed the entire 'counter' rule. But that happens *after* flex has executed the 'IP' rule. IOW, the sequence of events is not the exepcted "COUNTER close_scope_counter IP SADDR SYMBOL close_scope_ip", it is "COUNTER IP close_scope_counter". close_scope_counter pops the just-pushed SCANSTATE_IP and returns the scanner to SCANSTATE_COUNTER, so next input token (saddr) gets parsed as a string, which gets then rejected from bison. To resolve this, defer the pop operation until the current state is done. scanner_pop_start_cond() already gets the scope that it has been completed as an argument, so we can compare it to the active state. If those are not the same, just defer the pop operation until the bison reports its done with the active flex scope. This leads to following sequence of events: 1. flex switches to SCANSTATE_COUNTER 2. flex switches to SCANSTATE_IP 3. bison calls scanner_pop_start_cond(SCANSTATE_COUNTER) 4. flex remains in SCANSTATE_IP, bison continues 5. bison calls scanner_pop_start_cond(SCANSTATE_IP) once the entire ip rule has completed: this pops both IP and COUNTER. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: secmark: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: quota: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | | | ... and move "used" keyword to it. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: limit: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | | | Moves rate and burst out of INITIAL. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: vlan: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | | | ID needs to remain exposed as its used by ct, icmp, icmp6 and so on. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: arp: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | | | allows to move the arp specific tokens out of the INITIAL scope. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: add ether scopeFlorian Westphal2021-03-161-0/+1
| | | | | | | just like previous change: useless as-is, but prepares for removal of saddr/daddr from INITIAL scope. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: add fib scopeFlorian Westphal2021-03-161-0/+1
| | | | | | | | | makes no sense as-is because all keywords need to stay in the INITIAL scope. This can be changed after all saddr/daddr users have been scoped. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ip6: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | | | move flowlabel and hoplimit. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ip: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | | | Move the ip option names (rr, lsrr, ...) out of INITIAL scope. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ct: move to own scopeFlorian Westphal2021-03-161-0/+1
| | | | | | | | | | | | This allows moving multiple ct specific keywords out of INITIAL scope. Next few patches follow same pattern: 1. add a scope_close_XXX rule 2. add a SCANSTATE_XXX & make flex switch to it when encountering XXX keyword 3. make bison leave SCANSTATE_XXXX when it has seen the complete expression. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: move remaining cache functions in rule.c to cache.cPablo Neira Ayuso2021-03-111-2/+4
| | | | | | Move all the cache logic to src/cache.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* scanner: socket: move to own scopeFlorian Westphal2021-03-111-0/+1
| | | | Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: rt: move to own scopeFlorian Westphal2021-03-111-0/+1
| | | | | | | | classid and nexthop can be moved out of INIT scope. Rest are still needed because tehy are used by other expressions as well. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: ipsec: move to own scopeFlorian Westphal2021-03-111-0/+1
| | | | | | ... and hide the ipsec specific tokens from the INITITAL scope. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: queue: move to own scopeFlorian Westphal2021-03-111-0/+1
| | | | | | allows to remove 3 queue specific keywords from INITIAL scope. Signed-off-by: Florian Westphal <fw@strlen.de>
* scanner: introduce start condition stackFlorian Westphal2021-03-111-0/+8
| | | | | | | | | | | | | | | | | | | | Add a small initial chunk of flex start conditionals. This starts with two low-hanging fruits, numgen and j/symhash. NUMGEN and HASH start conditions are entered from flex when the corresponding expression token is encountered. Flex returns to the INIT condition when the bison parser has seen a complete numgen/hash statement. This intentionally uses a stack rather than BEGIN() to eventually support nested states. The scanner_pop_start_cond() function argument is not used yet, but will need to be used later to deal with nesting. Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: remove nft_mnl_socket_reopen()Pablo Neira Ayuso2021-03-051-1/+0
| | | | | | | | | | | | | | nft_mnl_socket_reopen() was introduced to deal with the EINTR case. By reopening the netlink socket, pending netlink messages that are part of a stale netlink dump are implicitly drop. This patch replaces the nft_mnl_socket_reopen() strategy by pulling out all of the remaining netlink message to restart in a clean state. This is implicitly fixing up a bug in the table ownership support, which assumes that the netlink socket remains open until nft_ctx_free() is invoked. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* table: support for the table owner flagPablo Neira Ayuso2021-03-022-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | Add new flag to allow userspace process to own tables: Tables that have an owner can only be updated/destroyed by the owner. The table is destroyed either if the owner process calls nft_ctx_free() or owner process is terminated (implicit table release). The ruleset listing includes the program name that owns the table: nft> list ruleset table ip x { # progname nft flags owner chain y { type filter hook input priority filter; policy accept; counter packets 1 bytes 309 } } Original code to pretty print the netlink portID to program name has been extracted from the conntrack userspace utility. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* table: rework flags printingPablo Neira Ayuso2021-03-021-1/+1
| | | | | | | Simplify routine to print the table flags. Add table_flag_name() and use it from json too. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add negation match on singleton bitmask valuePablo Neira Ayuso2021-02-051-0/+1
| | | | | | | | | | | | | | | | | This patch provides a shortcut for: ct status and dnat == 0 which allows to check for the packet whose dnat bit is unset: # nft add rule x y ct status ! dnat counter This operation is only available for expression with a bitmask basetype, eg. # nft describe ct status ct expression, datatype ct_status (conntrack status) (basetype bitmask, integer), 32 bits Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: resync nf_tables.h cache copyPablo Neira Ayuso2021-01-061-1/+25
| | | | | | Get this header in sync with nf-next as of 5.11-rc. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cli: add libedit supportPablo Neira Ayuso2021-01-051-1/+1
| | | | | | | | Extend cli to support for libedit readline shim code: ./configure --with-cli=editline Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: set on flags to request multi-statement supportPablo Neira Ayuso2021-01-041-0/+3
| | | | | | | Old kernel reject requests for element with multiple statements because userspace sets on the flags for multi-statements. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set element multi-statement supportPablo Neira Ayuso2020-12-183-2/+9
| | | | | | | | Extend the set element infrastructure to support for several statements. This patch places the statements right after the key when printing it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support for multi-statement in dynamic sets and mapsPablo Neira Ayuso2020-12-171-2/+2
| | | | | | | | This patch allows for two statements for dynamic set updates, e.g. nft rule x y add @y { ip daddr limit rate 1/second counter } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* payload: auto-remove simple icmp/icmpv6 dependency expressionsFlorian Westphal2020-12-091-1/+3
| | | | | | | | | | | | Instead of: icmpv6 type packet-too-big icmpv6 mtu 1280 display just icmpv6 mtu 1280 The dependency added for id/sequence is still kept, its handled by a anon set instead to cover both the echo 'request' and 'reply' cases. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add auto-dependencies for ipv6 icmp6Florian Westphal2020-12-091-0/+4
| | | | | | Extend the earlier commit to also cover icmpv6. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add auto-dependencies for ipv4 icmpFlorian Westphal2020-12-092-1/+18
| | | | | | | | | | | | | | The ICMP header has field values that are only exist for certain types. Mark the icmp proto 'type' field as a nextheader field and add a new th description to store the icmp type dependency. This can later be re-used for other protocol dependend definitions such as mptcp options -- which are all share the same tcp option number and have a special 4 bit marker inside the mptcp option space that tells how the remaining option looks like. Signed-off-by: Florian Westphal <fw@strlen.de>
* proto: reduce size of proto_desc structureFlorian Westphal2020-12-091-7/+7
| | | | | | | This will need an additional field. We can compress state here to avoid further size increase. Signed-off-by: Florian Westphal <fw@strlen.de>
* exthdr: remove unused proto_key member from structFlorian Westphal2020-12-091-1/+0
| | | | | | also, no need for this struct to be in the parser. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: enable json echo output when reading native syntaxJose M. Guisado Gomez2020-12-022-0/+7
| | | | | | | | | | | | | | | | | | | | This patch fixes a bug in which nft did not print any output when specifying --echo and --json and reading nft native syntax. This patch respects behavior when input is json, in which the output would be the identical input plus the handles. Adds a json_echo member inside struct nft_ctx to build and store the json object containing the json command objects, the object is built using a mock monitor to reuse monitor json code. This json object is only used when we are sure we have not read json from input. [ added json_alloc_echo() to compile without json support --pablo ] Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=1446 Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Tested-by: Eric Garver <eric@garver.life> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* proto: Fix ARP header field orderingPhil Sutter2020-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | In ARP header, destination ether address sits between source IP and destination IP addresses. Enum arp_hdr_fields had this wrong, which in turn caused wrong ordering of entries in proto_arp->templates. When expanding a combined payload expression, code assumes that template entries are ordered by header offset, therefore the destination ether address match was printed as raw if an earlier field was matched as well: | arp saddr ip 192.168.1.1 arp daddr ether 3e:d1:3f:d6:12:0b was printed as: | arp saddr ip 192.168.1.1 @nh,144,48 69068440080907 Note: Although strictly not necessary, reorder fields in proto_arp->templates as well to match their actual ordering, just to avoid confusion. Fixes: 4b0f2a712b579 ("src: support for arp sender and target ethernet and IPv4 addresses") Signed-off-by: Phil Sutter <phil@nwl.cc>
* tcpopt: allow to check for presence of any tcp optionFlorian Westphal2020-11-091-1/+2
| | | | | | | | | | | | | nft currently doesn't allow to check for presence of arbitrary tcp options. Only known options where nft provides a template can be tested for. This allows to test for presence of raw protocol values as well. Example: tcp option 42 exists Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopt: split tcpopt_hdr_fields into per-option enumFlorian Westphal2020-11-091-10/+36
| | | | | | | | | | | | | | | | | | Currently we're limited to ten template fields in exthdr_desc struct. Using a single enum for all tpc option fields thus won't work indefinitely (TCPOPTHDR_FIELD_TSECR is 9) when new option templates get added. Fortunately we can just use one enum per tcp option to avoid this. As a side effect this also allows to simplify the sack offset calculations. Rather than computing that on-the-fly, just add extra fields to the SACK template. expr->exthdr.offset now holds the 'raw' value, filled in from the option template. This would ease implementation of 'raw option matching' using offset and length to load from the option. Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopts: clean up parser -> tcpopt.c plumbingFlorian Westphal2020-11-091-17/+18
| | | | | | | | | | | | | | | | | tcpopt template mapping is asymmetric: one mapping is to match dumped netlink exthdr expression to the original tcp option template. This struct is indexed by the raw, on-write kind/type number. The other mapping maps parsed options to the tcp option template. Remove the latter. The parser is changed to translate the textual option name, e.g. "maxseg" to the on-wire number. This avoids the second mapping, it will also allow to more easily support raw option matching in a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: improve rule error reportingPablo Neira Ayuso2020-10-201-2/+25
| | | | | | | | | | | | | | | | | | | | | Kernel provides information regarding expression since 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). A common mistake is to refer a chain which does not exist, e.g. # nft add rule x y jump test Error: Could not process rule: No such file or directory add rule x y jump test ^^^^ Use the existing netlink extended error reporting infrastructure to provide better error reporting as in the example above. Requires Linux kernel patch 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: constify location parameter in cmd_add_loc()Pablo Neira Ayuso2020-10-191-3/+3
| | | | | | | Constify pointer to location object to compile check for unintentional updates. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: larger number of error locationsPablo Neira Ayuso2020-10-191-1/+1
| | | | | | | | | | Statically store up to 32 locations per command, if the number of locations is larger than 32, then skip rather than hit assertion. Revisit this later to dynamically store location per command using a hashtable. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* proto: add sctp crc32 checksum fixupFlorian Westphal2020-10-152-0/+3
| | | | | | | | | | Stateless SCTP header mangling doesn't work reliably. This tells the kernel to update the checksum field using the sctp crc32 algorithm. Note that this needs additional kernel support to work. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: ingress inet supportPablo Neira Ayuso2020-10-131-0/+1
| | | | | | | | | | | | | | | | | | Add support for inet ingress chains. table inet filter { chain ingress { type filter hook ingress device "veth0" priority filter; policy accept; } chain input { type filter hook input priority filter; policy accept; } chain forward { type filter hook forward priority filter; policy accept; } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for chainsJose M. Guisado Gomez2020-09-301-0/+1
| | | | | | | | | | | | | | | | | | | | This patch enables the user to specify a comment when adding a chain. Relies on kernel space supporting userdata for chains. > nft add table ip filter > nft add chain ip filter input { comment "test"\; type filter hook input priority 0\; policy accept\; } > list ruleset table ip filter { chain input { comment "test" type filter hook input priority filter; policy accept; } } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: context tracking for multiple transport protocolsPablo Neira Ayuso2020-09-152-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch extends the protocol context infrastructure to track multiple transport protocols when they are specified from sets. This removes errors like: "transport protocol mapping is only valid after transport protocol match" when invoking: # nft add rule x z meta l4proto { tcp, udp } dnat to 1.1.1.1:80 This patch also catches conflicts like: # nft add rule x z ip protocol { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 Error: conflicting protocols specified: udp vs. tcp add rule x z ip protocol { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 ^^^^^^^^^ and: # nft add rule x z meta l4proto { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 Error: conflicting protocols specified: udp vs. tcp add rule x z meta l4proto { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 ^^^^^^^^^ Note that: - the singleton protocol context tracker is left in place until the existing users are updated to use this new multiprotocol tracker. Moving forward, it would be good to consolidate things around this new multiprotocol context tracker infrastructure. - link and network layers are not updated to use this infrastructure yet. The code that deals with vlan conflicts relies on forcing protocol context updates to the singleton protocol base. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for objectsJose M. Guisado Gomez2020-09-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enables specifying an optional comment when declaring named objects. The comment is to be specified inside the object's block ({} block) Relies on libnftnl exporting nftnl_obj_get_data and kernel space support to store the comments. For consistency, this patch makes the comment be printed first when listing objects. Adds a testcase importing all commented named objects except for secmark, although it's supported. Example: Adding a quota with a comment > add table inet filter > nft add quota inet filter q { over 1200 bytes \; comment "test_comment"\; } > list ruleset table inet filter { quota q { comment "test_comment" over 1200 bytes } } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* socket: add support for "wildcard" keyBalazs Scheidler2020-08-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iptables had a "-m socket --transparent" which didn't match sockets that are bound to all addresses (e.g. 0.0.0.0 for ipv4, and ::0 for ipv6). It was possible to override this behavior by using --nowildcard, in which case it did match zero bound sockets as well. The issue is that nftables never included the wildcard check, so in effect it behaved like "iptables -m socket --transparent --nowildcard" with no means to exclude wildcarded listeners. This is a problem as a user-space process that binds to 0.0.0.0:<port> that enables IP_TRANSPARENT would effectively intercept traffic going in _any_ direction on the specific port, whereas in most cases, transparent proxies would only need this for one specific address. The solution is to add "socket wildcard" key to the nft_socket module, which makes it possible to match on the wildcardness of a socket from one's ruleset. This is how to use it: table inet haproxy { chain prerouting { type filter hook prerouting priority -150; policy accept; socket transparent 1 socket wildcard 0 mark set 0x00000001 } } This patch effectively depends on its counterpart in the kernel. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support when adding tablesJose M. Guisado Gomez2020-08-281-0/+1
| | | | | | | | | | | | | | | | | | | Adds userdata building logic if a comment is specified when creating a new table. Adds netlink userdata parsing callback function. Relies on kernel supporting userdata for nft_table. Example: > nft add table ip x { comment "test"\; } > nft list ruleset table ip x { comment "test" } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>