summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* exthdr: remove unused proto_key member from structFlorian Westphal2020-12-092-5/+0
| | | | | | also, no need for this struct to be in the parser. Signed-off-by: Florian Westphal <fw@strlen.de>
* parser_bison: double close_scope() call for implicit chainsPablo Neira Ayuso2020-12-081-1/+1
| | | | | | | | Call close_scope() from chain_block_alloc only. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1485 Fixes: c330152b7f77 ("src: support for implicit chain bindings") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* monitor: fix formatting of if statementsJose M. Guisado Gomez2020-12-081-6/+6
| | | | | | | Replace some "if(" introduced in cb7e02f4 by "if (" Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* monitor: add assignment check for json_echoJose M. Guisado Gomez2020-12-081-2/+2
| | | | | | | | | | | | | When --echo and --json is specified and native syntax is read, only the last instruction is printed. This happens because the reference to the json_echo is reassigned each time netlink_echo_callback is executed for an instruction to be echoed. Add an assignment check for json_echo to avoid reassigning it. Fixes: cb7e02f44d6a (src: enable json echo output when reading native syntax) Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: report EPERM for non-root usersPablo Neira Ayuso2020-12-042-2/+7
| | | | | | | | | $ /usr/sbin/nft list ruleset Operation not permitted (you must be root) Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1372 Acked-by: Arturo Borrero Gonzalez <arturo@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: reply netlink error message might be larger than MNL_SOCKET_BUFFER_SIZEPablo Neira Ayuso2020-12-041-1/+4
| | | | | | | | | | | | | | | | | Netlink attribute maximum size is 65536 bytes (given nla_len is 16-bits). NFTA_SET_ELEM_LIST_ELEMENTS stores as many set elements as possible that can fit into this netlink attribute. Netlink messages with NLMSG_ERROR type originating from the kernel contain the original netlink message as payload, they might be larger than 65536 bytes. Add NFT_MNL_ACK_MAXSIZE which estimates the maximum Netlink header coming as (error) reply from the kernel. This estimate is based on the maximum netlink message size that nft sends from userspace. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1464 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: allow to restore limit from dynamic setPablo Neira Ayuso2020-12-041-0/+32
| | | | | | | Update parser to allow to restore limit per set element in dynamic set. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1477 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: Fix seqnum_to_json() functionalityPhil Sutter2020-12-041-4/+23
| | | | | | | | | | | | | | | | | | | Introduction of json_cmd_assoc_hash missed that by the time the hash table insert happens, the struct cmd object's 'seqnum' field which is used as key is not initialized yet. This doesn't happen until nft_netlink() prepares the batch object which records the lowest seqnum. Therefore push all json_cmd_assoc objects into a temporary list until the first lookup happens. At this time, all referenced cmd objects have their seqnum set and the list entries can be moved into the hash table for fast lookups. To expose such problems in the future, make json_events_cb() emit an error message if the passed message has a handle but no assoc entry is found for its seqnum. Fixes: 389a0e1edc89a ("json: echo: Speedup seqnum_to_json()") Cc: Derek Dai <daiderek@gmail.com> Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: enable json echo output when reading native syntaxJose M. Guisado Gomez2020-12-023-17/+58
| | | | | | | | | | | | | | | | | | | | This patch fixes a bug in which nft did not print any output when specifying --echo and --json and reading nft native syntax. This patch respects behavior when input is json, in which the output would be the identical input plus the handles. Adds a json_echo member inside struct nft_ctx to build and store the json object containing the json command objects, the object is built using a mock monitor to reuse monitor json code. This json object is only used when we are sure we have not read json from input. [ added json_alloc_echo() to compile without json support --pablo ] Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=1446 Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Tested-by: Eric Garver <eric@garver.life> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: echo: Speedup seqnum_to_json()Phil Sutter2020-11-231-10/+18
| | | | | | | | | | | | | | | | | | | | | | | | | Derek Dai reports: "If there are a lot of command in JSON node, seqnum_to_json() will slow down application (eg: firewalld) dramatically since it iterate whole command list every time." He sent a patch implementing a lookup table, but we can do better: Speed this up by introducing a hash table to store the struct json_cmd_assoc objects in, taking their netlink sequence number as key. Quickly tested restoring a ruleset containing about 19k rules: | # time ./before/nft -jeaf large_ruleset.json >/dev/null | 4.85user 0.47system 0:05.48elapsed 97%CPU (0avgtext+0avgdata 69732maxresident)k | 0inputs+0outputs (15major+16937minor)pagefaults 0swaps | # time ./after/nft -jeaf large_ruleset.json >/dev/null | 0.18user 0.44system 0:00.70elapsed 89%CPU (0avgtext+0avgdata 68484maxresident)k | 0inputs+0outputs (15major+16645minor)pagefaults 0swaps Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1479 Reported-by: Derek Dai <daiderek@gmail.com> Signed-off-by: Phil Sutter <phil@nwl.cc>
* proto: Fix ARP header field orderingPhil Sutter2020-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | In ARP header, destination ether address sits between source IP and destination IP addresses. Enum arp_hdr_fields had this wrong, which in turn caused wrong ordering of entries in proto_arp->templates. When expanding a combined payload expression, code assumes that template entries are ordered by header offset, therefore the destination ether address match was printed as raw if an earlier field was matched as well: | arp saddr ip 192.168.1.1 arp daddr ether 3e:d1:3f:d6:12:0b was printed as: | arp saddr ip 192.168.1.1 @nh,144,48 69068440080907 Note: Although strictly not necessary, reorder fields in proto_arp->templates as well to match their actual ordering, just to avoid confusion. Fixes: 4b0f2a712b579 ("src: support for arp sender and target ethernet and IPv4 addresses") Signed-off-by: Phil Sutter <phil@nwl.cc>
* json: tcp: add raw tcp option match supportFlorian Westphal2020-11-092-26/+48
| | | | | | | | | To similar change as in previous one, this time for the jason (de)serialization. Re-uses the raw payload match syntax, i.e. base,offset,length. Signed-off-by: Florian Westphal <fw@strlen.de>
* tcp: add raw tcp option match supportFlorian Westphal2020-11-093-4/+16
| | | | | | tcp option @42,16,4 (@kind,offset,length). Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopt: allow to check for presence of any tcp optionFlorian Westphal2020-11-095-7/+57
| | | | | | | | | | | | | nft currently doesn't allow to check for presence of arbitrary tcp options. Only known options where nft provides a template can be tested for. This allows to test for presence of raw protocol values as well. Example: tcp option 42 exists Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopt: split tcpopt_hdr_fields into per-option enumFlorian Westphal2020-11-098-95/+103
| | | | | | | | | | | | | | | | | | Currently we're limited to ten template fields in exthdr_desc struct. Using a single enum for all tpc option fields thus won't work indefinitely (TCPOPTHDR_FIELD_TSECR is 9) when new option templates get added. Fortunately we can just use one enum per tcp option to avoid this. As a side effect this also allows to simplify the sack offset calculations. Rather than computing that on-the-fly, just add extra fields to the SACK template. expr->exthdr.offset now holds the 'raw' value, filled in from the option template. This would ease implementation of 'raw option matching' using offset and length to load from the option. Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopt: rename noop to nopFlorian Westphal2020-11-091-1/+1
| | | | | | | 'nop' is the tcp padding "option". "noop" is retained for compatibility on parser side. Signed-off-by: Florian Westphal <fw@strlen.de>
* tcpopts: clean up parser -> tcpopt.c plumbingFlorian Westphal2020-11-094-76/+55
| | | | | | | | | | | | | | | | | tcpopt template mapping is asymmetric: one mapping is to match dumped netlink exthdr expression to the original tcp option template. This struct is indexed by the raw, on-write kind/type number. The other mapping maps parsed options to the tcp option template. Remove the latter. The parser is changed to translate the textual option name, e.g. "maxseg" to the on-wire number. This avoids the second mapping, it will also allow to more easily support raw option matching in a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de>
* parser: merge sack-perm/sack-permitted and maxseg/mssFlorian Westphal2020-11-093-12/+10
| | | | | | | | | | | | | | | | | | | | | | One was added by the tcp option parsing ocde, the other by synproxy. So we have: synproxy ... sack-perm synproxy ... mss and tcp option maxseg tcp option sack-permitted This kills the extra tokens on the scanner/parser side, so sack-perm and sack-permitted can both be used. Likewise, 'synproxy maxseg' and 'tcp option mss size 42' will work too. On the output side, the shorter form is now preferred, i.e. sack-perm and mss. Signed-off-by: Florian Westphal <fw@strlen.de>
* json: add missing nat_type flag and netmap nat flagFlorian Westphal2020-11-052-10/+103
| | | | | | | | | | | | | | | | JSON in/output doesn't know about nat_type and thus cannot save/restore nat mappings involving prefixes or concatenations because the snat statement lacks the prefix/concat/interval type flags. Furthermore, bison parser was extended to support netmap. This is done with an internal 'netmap' flag that is passed to the kernel. We need to dump/restore that as well. Also make sure ip/snat.t passes in json mode. Fixes: 35a6b10c1bc4 ("src: add netmap support") Fixes: 9599d9d25a6b ("src: NAT support for intervals in maps") Signed-off-by: Florian Westphal <fw@strlen.de>
* src: Optimize prefix matches on byte-boundariesPhil Sutter2020-11-042-3/+6
| | | | | | | | | | | | | | | | If a prefix expression's length is on a byte-boundary, it is sufficient to just reduce the length passed to "cmp" expression. No need for explicit bitwise modification of data on LHS. The relevant code is already there, used for string prefix matches. There is one exception though, namely zero-length prefixes: Kernel doesn't accept zero-length "cmp" expressions, so keep them in the old code-path for now. This patch depends upon the previous one to correctly parse odd-sized payload matches but has to extend support for non-payload LHS as well. In practice, this is needed for "ct" expressions as they allow matching against IP address prefixes, too. Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: Support odd-sized payload matchesPhil Sutter2020-11-042-0/+11
| | | | | | | | | | | | When expanding a payload match, don't disregard oversized templates at the right offset. A more flexible user may extract less bytes from the packet if only parts of a field are interesting, e.g. only the prefix of source/destination address. Support that by using the template, but fix the length. Later when creating a relational expression for it, detect the unusually small payload expression length and turn the RHS value into a prefix expression. Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: add netdev support for reject defaultJose M. Guisado Gomez2020-11-021-0/+1
| | | | | | | | | | Enables not specifying any icmp type and code when using reject inside netdev. This patch completely enables using reject for the netdev family. Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Revert "monitor: do not print generation ID with --echo"Pablo Neira Ayuso2020-10-231-1/+1
| | | | | | | | | | | | | Revert 0e258556f7f3 ("monitor: do not print generation ID with --echo"). There is actually a kernel bug which is preventing from displaying this generation ID message. Update the tests/shell to remove the last line of the --echo output which displays the generation ID once the "netfilter: nftables: fix netlink report logic in flowtable and genid" kernel fix is applied. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* monitor: do not print generation ID with --echoPablo Neira Ayuso2020-10-221-1/+1
| | | | | | This fixes testcases/sets/0036add_set_element_expiration_0 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: Fix memleak in set_dtype_json()Phil Sutter2020-10-221-1/+1
| | | | | | | | | Turns out json_string() already dups the input, so the temporary dup passed to it is lost. Fixes: e70354f53e9f6 ("libnftables: Implement JSON output support") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: UAF in interval_map_decompose()Pablo Neira Ayuso2020-10-201-3/+5
| | | | | | | | | | | reported by tests/monitor# bash run-tests.sh ... SUMMARY: AddressSanitizer: heap-use-after-free /home/pablo/devel/scm/git-netfilter/nftables/src/expression.c:1385 in expr_ops Due to incorrect structure layout when calling interval_expr_copy(). Fixes: c1f0476fd590 ("segtree: copy expr data to closing element") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: improve rule error reportingPablo Neira Ayuso2020-10-202-60/+176
| | | | | | | | | | | | | | | | | | | | | Kernel provides information regarding expression since 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). A common mistake is to refer a chain which does not exist, e.g. # nft add rule x y jump test Error: Could not process rule: No such file or directory add rule x y jump test ^^^^ Use the existing netlink extended error reporting infrastructure to provide better error reporting as in the example above. Requires Linux kernel patch 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: constify location parameter in cmd_add_loc()Pablo Neira Ayuso2020-10-192-9/+10
| | | | | | | Constify pointer to location object to compile check for unintentional updates. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* rule: larger number of error locationsPablo Neira Ayuso2020-10-191-1/+3
| | | | | | | | | | Statically store up to 32 locations per command, if the number of locations is larger than 32, then skip rather than hit assertion. Revisit this later to dynamically store location per command using a hashtable. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* segtree: copy expr data to closing elementFlorian Westphal2020-10-151-40/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When last expr has no closing element we did not propagate expr properties such as comment or expire date to the newly allocated set elem. Before: nft create table t nft 'add set t s { type ipv4_addr; flags interval; timeout 60s; }' nft add element t s { 224.0.0.0/3 } nft list set t s | grep -o 'elements.*' elements = { 224.0.0.0-255.255.255.255 } nft flush set t s nft add element t s { 224.0.0.0/4, 240.0.0.0/4 } nft list set t s | grep -o 'elements.*' elements = { 224.0.0.0/4 expires 55s152ms, 240.0.0.0-255.255.255.255 } nft delete set t s nft 'add set t s { type ipv4_addr; flags interval; auto-merge; timeout 60s; }' nft add element t s { 224.0.0.0/4, 240.0.0.0/4 } nft list set t s | grep -o 'elements.*' elements = { 224.0.0.0-255.255.255.255 } After: elements = { 224.0.0.0-255.255.255.255 expires 58s515ms } elements = { 224.0.0.0/4 expires 54s622ms, 240.0.0.0-255.255.255.255 expires 54s622ms } elements = { 224.0.0.0-255.255.255.255 expires 57s92ms } Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1454 Signed-off-by: Florian Westphal <fw@strlen.de>
* proto: add sctp crc32 checksum fixupFlorian Westphal2020-10-152-1/+9
| | | | | | | | | | Stateless SCTP header mangling doesn't work reliably. This tells the kernel to update the checksum field using the sctp crc32 algorithm. Note that this needs additional kernel support to work. Signed-off-by: Florian Westphal <fw@strlen.de>
* src: ingress inet supportPablo Neira Ayuso2020-10-132-2/+8
| | | | | | | | | | | | | | | | | | Add support for inet ingress chains. table inet filter { chain ingress { type filter hook ingress device "veth0" priority filter; policy accept; } chain input { type filter hook input priority filter; policy accept; } chain forward { type filter hook forward priority filter; policy accept; } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Solves Bug 1462 - `nft -j list set` does not show countersGopal Yadav2020-10-081-1/+7
| | | | | | | | | Element counters reside in 'stmt' field as counter statement. Append them to 'elem' object as additional 'counter' property, generated by counter_stmt_json(). Signed-off-by: Gopal Yadav <gopunop@gmail.com> Signed-off-by: Phil Sutter <phil@nwl.cc>
* evaluate: Reject quoted strings containing only wildcardPhil Sutter2020-10-061-2/+5
| | | | | | | | | | | | | | | | | | | | | Fix for an assertion fail when trying to match against an all-wildcard interface name: | % nft add rule t c iifname '"*"' | nft: expression.c:402: constant_expr_alloc: Assertion `(((len) + (8) - 1) / (8)) > 0' failed. | zsh: abort nft add rule t c iifname '"*"' Fix this by detecting the string in expr_evaluate_string() and returning an error message: | % nft add rule t c iifname '"*"' | Error: All-wildcard strings are not supported | add rule t c iifname "*" | ^^^ While being at it, drop the 'datalen >= 1' clause from the following conditional as together with the added check for 'datalen == 0', all possible other values have been caught already.
* src: add comment support for chainsJose M. Guisado Gomez2020-09-304-0/+54
| | | | | | | | | | | | | | | | | | | | This patch enables the user to specify a comment when adding a chain. Relies on kernel space supporting userdata for chains. > nft add table ip filter > nft add chain ip filter input { comment "test"\; type filter hook input priority 0\; policy accept\; } > list ruleset table ip filter { chain input { comment "test" type filter hook input priority filter; policy accept; } } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* json: Combining --terse with --json has no effectGopal Yadav2020-09-221-1/+1
| | | | | | | | --terse with --json is ignored, fix this. This patch also includes a test. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1388 Signed-off-by: Gopal Yadav <gopunop@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: context tracking for multiple transport protocolsPablo Neira Ayuso2020-09-156-17/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch extends the protocol context infrastructure to track multiple transport protocols when they are specified from sets. This removes errors like: "transport protocol mapping is only valid after transport protocol match" when invoking: # nft add rule x z meta l4proto { tcp, udp } dnat to 1.1.1.1:80 This patch also catches conflicts like: # nft add rule x z ip protocol { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 Error: conflicting protocols specified: udp vs. tcp add rule x z ip protocol { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 ^^^^^^^^^ and: # nft add rule x z meta l4proto { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 Error: conflicting protocols specified: udp vs. tcp add rule x z meta l4proto { tcp, udp } tcp dport 20 dnat to 1.1.1.1:80 ^^^^^^^^^ Note that: - the singleton protocol context tracker is left in place until the existing users are updated to use this new multiprotocol tracker. Moving forward, it would be good to consolidate things around this new multiprotocol context tracker infrastructure. - link and network layers are not updated to use this infrastructure yet. The code that deals with vlan conflicts relies on forcing protocol context updates to the singleton protocol base. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* evaluate: remove one indent level in __expr_evaluate_payload()Pablo Neira Ayuso2020-09-141-25/+24
| | | | | | | If there is protocol context for this base, just return from function to remove one level of indentation. This patch is cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: avoid repeated command list traversal on errorsJindrich Makovicka2020-09-141-2/+14
| | | | | | | | | Because the command seqnums are monotonic, repeated traversals of the cmds list from the beginning are not necessary as long as the error seqnums are also monotonic. Signed-off-by: Jindrich Makovicka <makovick@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: larger receive socket buffer for netlink errorsPablo Neira Ayuso2020-09-141-16/+5
| | | | | | | | Assume each error in the batch will result in a 1k notification for the non-echo flag set on case as described in 860671662d3f ("mnl: fix --echo buffer size again"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_bison: fail when specifying multiple commentsJose M. Guisado Gomez2020-09-141-0/+63
| | | | | | | | | | | | | | | | | | | | Before this patch grammar supported specifying multiple comments, and only the last value would be assigned. This patch adds a function to test if an attribute is already assigned and, if so, calls erec_queue with this attribute location. Use this function in order to check for duplication (or more) of comments for actions that support it. > nft add table inet filter { flags "dormant"\; comment "test"\; comment "another"\;} Error: You can only specify this once. This statement is duplicated. add table inet filter { flags dormant; comment test; comment another;} ^^^^^^^^^^^^^^^^ Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for objectsJose M. Guisado Gomez2020-09-084-0/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enables specifying an optional comment when declaring named objects. The comment is to be specified inside the object's block ({} block) Relies on libnftnl exporting nftnl_obj_get_data and kernel space support to store the comments. For consistency, this patch makes the comment be printed first when listing objects. Adds a testcase importing all commented named objects except for secmark, although it's supported. Example: Adding a quota with a comment > add table inet filter > nft add quota inet filter q { over 1200 bytes \; comment "test_comment"\; } > list ruleset table inet filter { quota q { comment "test_comment" over 1200 bytes } } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mergesort: find base value expression type via recursionPablo Neira Ayuso2020-09-041-25/+40
| | | | | | | | | | | | | | | | Sets that store flags might contain a mixture of values and binary operations. Find the base value type via recursion to compare the expressions. Make sure concatenations are listed in a deterministic way via concat_expr_msort_value() which builds a mpz value with the tuple. Adjust a few tests after this update since listing differs after this update. Fixes: 14ee0a979b62 ("src: sort set elements in netlink_get_setelems()") Fixes: 3926a3369bb5 ("mergesort: unbreak listing with binops") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src/scanner.l: fix whitespace issue for the TRANSPARENT keywordBalazs Scheidler2020-08-291-1/+1
| | | | | Signed-off-by: Balazs Scheidler <bazsi77@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* socket: add support for "wildcard" keyBalazs Scheidler2020-08-295-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iptables had a "-m socket --transparent" which didn't match sockets that are bound to all addresses (e.g. 0.0.0.0 for ipv4, and ::0 for ipv6). It was possible to override this behavior by using --nowildcard, in which case it did match zero bound sockets as well. The issue is that nftables never included the wildcard check, so in effect it behaved like "iptables -m socket --transparent --nowildcard" with no means to exclude wildcarded listeners. This is a problem as a user-space process that binds to 0.0.0.0:<port> that enables IP_TRANSPARENT would effectively intercept traffic going in _any_ direction on the specific port, whereas in most cases, transparent proxies would only need this for one specific address. The solution is to add "socket wildcard" key to the nft_socket module, which makes it possible to match on the wildcardness of a socket from one's ruleset. This is how to use it: table inet haproxy { chain prerouting { type filter hook prerouting priority -150; policy accept; socket transparent 1 socket wildcard 0 mark set 0x00000001 } } This patch effectively depends on its counterpart in the kernel. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support when adding tablesJose M. Guisado Gomez2020-08-284-2/+56
| | | | | | | | | | | | | | | | | | | Adds userdata building logic if a comment is specified when creating a new table. Adds netlink userdata parsing callback function. Relies on kernel supporting userdata for nft_table. Example: > nft add table ip x { comment "test"\; } > nft list ruleset table ip x { comment "test" } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add chain hashtable cachePablo Neira Ayuso2020-08-264-62/+116
| | | | | | | | | | | | | | | | | | | | | | | | This significantly improves ruleset listing time with large rulesets (~50k rules) with _lots_ of non-base chains. # time nft list ruleset &> /dev/null Before this patch: real 0m11,172s user 0m6,810s sys 0m4,220s After this patch: real 0m4,747s user 0m0,802s sys 0m3,912s This patch also removes list_bindings from netlink_ctx since there is no need to keep a temporary list of chains anymore. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add expression handler hashtablePablo Neira Ayuso2020-08-262-10/+38
| | | | | | | netlink_parsers is actually small, but update this code to use a hashtable instead since more expressions may come in the future. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nftables: dump raw element info from libnftnl when netlink debugging is onFlorian Westphal2020-08-201-2/+38
| | | | | | | | | | | | | | | | Example: nft --debug=netlink list ruleset inet firewall @knock_candidates_ipv4 element 0100007f 00007b00 : 0 [end] element 0200007f 0000f1ff : 0 [end] element 0100007f 00007a00 : 0 [end] inet firewall @__set0 element 00000100 : 0 [end] element 00000200 : 0 [end] inet firewall knock-input 3 [ meta load l4proto => reg 1 ] ... Signed-off-by: Florian Westphal <fw@strlen.de>
* mergesort: unbreak listing with binopsPablo Neira Ayuso2020-08-201-0/+2
| | | | | | | | | | | | | tcp flags == {syn, syn|ack} tcp flags & (fin|syn|rst|psh|ack|urg) == {ack, psh|ack, fin, fin|psh|ack} results in: BUG: Unknown expression binop nft: mergesort.c:47: expr_msort_cmp: Assertion `0' failed. Aborted (core dumped) Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>