summaryrefslogtreecommitdiffstats
path: root/src/mnl.c
Commit message (Collapse)AuthorAgeFilesLines
* mnl: Don't use nftnl_set_set()Phil Sutter2019-10-151-1/+1
| | | | | | | | | The function is unsafe to use as it effectively bypasses data length checks. Instead use nftnl_set_set_str() which at least asserts a const char pointer is passed. Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: do not cache sender buffer sizePablo Neira Ayuso2019-09-221-6/+6
| | | | | | | | | | | SO_SNDBUF never fails, this socket option just provides a hint to the kernel. SO_SNDBUFFORCE sets the buffer size to zero if the value goes over INT_MAX. Userspace is caching the buffer hint that sends to the kernel, so it might leave userspace out of sync if the kernel ignores the hint. Do not make assumptions, fetch the sender buffer size from the kernel via getsockopt(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add synproxy stateful object supportFernando Fernandez Mancera2019-09-131-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Add support for "synproxy" stateful object. For example (for TCP port 80 and using maps with saddr): table ip foo { synproxy https-synproxy { mss 1460 wscale 7 timestamp sack-perm } synproxy other-synproxy { mss 1460 wscale 5 } chain bar { tcp dport 80 synproxy name "https-synproxy" synproxy name ip saddr map { 192.168.1.0/24 : "https-synproxy", 192.168.2.0/24 : "other-synproxy" } } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: fix --echo buffer size againPablo Neira Ayuso2019-09-101-12/+14
| | | | | | | | | | | | | | | | If restart is triggered with --echo, it causes rules to be duplicated which is not correct. Remove restart logic. 1. If user passes --echo, use a default 4mb buffer. 2. assume each element in the batch will result in a 1k notification. This passes tests both in x86_64 and s390. Joint work with Florian Westphal. Fixes: 877baf9538f6 ("src: mnl: retry when we hit -ENOBUFS") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: mnl: retry when we hit -ENOBUFSFlorian Westphal2019-08-141-2/+10
| | | | | | | | | | | | | | | | | | | | tests/shell/testcases/transactions/0049huge_0 still fails with ENOBUFS error after endian fix done in previous patch. Its enough to increase the scale factor (4) on s390x, but rather than continue with these "guess the proper size" game, just increase the buffer size and retry up to 3 times. This makes above test work on s390x. So, implement what Pablo suggested in the earlier commit: We could also explore increasing the buffer and retry if mnl_nft_socket_sendmsg() hits ENOBUFS if we ever hit this problem again. v2: call setsockopt unconditionally, then increase on error. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: mnl: fix setting rcvbuffer sizeFlorian Westphal2019-08-131-1/+1
| | | | | | | | | | | | | Kernel expects socklen_t (int). Using size_t causes kernel to read upper 0-bits. This caused tests/shell/testcases/transactions/0049huge_0 to fail on s390x -- it uses 'echo' mode and will quickly overrun the tiny buffer size set due to this bug. Fixes: 89c82c261bb5 ("mnl: estimate receiver buffer size") Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow variable in chain policyFernando Fernandez Mancera2019-08-081-3/+6
| | | | | | | | | | | | This patch allows you to use variables in chain policy definition, e.g. define default_policy = "accept" add table ip foo add chain ip foo bar {type filter hook input priority filter; policy $default_policy} Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow variables in the chain priority specificationFernando Fernandez Mancera2019-08-081-4/+9
| | | | | | | | | | | | | | | | | This patch allows you to use variables in chain priority definitions, e.g. define prio = filter define prionum = 10 define prioffset = "filter - 150" add table ip foo add chain ip foo bar { type filter hook input priority $prio; } add chain ip foo ber { type filter hook input priority $prionum; } add chain ip foo bor { type filter hook input priority $prioffset; } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add ct expectations supportStéphane Veyret2019-07-161-0/+13
| | | | | | | This modification allow to directly add/list/delete expectations. Signed-off-by: Stéphane Veyret <sveyret@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set_is_datamap(), set_is_objmap() and set_is_map() helpersPablo Neira Ayuso2019-07-161-3/+3
| | | | | | | | | | | | | Two map types are currently possible: * data maps, ie. set_is_datamap(). * object maps, ie. set_is_objmap(). This patch adds helper functions to check for the map type. set_is_map() allows you to check for either map type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove unnecessary NLM_F_ACK flagsPablo Neira Ayuso2019-07-031-7/+7
| | | | | | | | | | | On error, the kernel already sends to userspace an acknowledgement for the table and chain deletion case. In case of NLM_F_DUMP, the NLM_F_ACK is not required as the kernel always sends a NLMSG_DONE at the end of the dumping, even if the list of objects is empty. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: Support intra-transaction rule referencesPhil Sutter2019-06-071-0/+4
| | | | | | | | | | | | | | | | | | | | | | | A rule may be added before or after another one using index keyword. To support for the other rule being added within the same batch, one has to make use of NFTNL_RULE_ID and NFTNL_RULE_POSITION_ID attributes. This patch does just that among a few more crucial things: * If cache is complete enough to contain rules, update cache when evaluating rule commands so later index references resolve correctly. * Reduce rule_translate_index() to its core code which is the actual linking of rules and consequently rename the function. The removed bits are pulled into the calling rule_evaluate() to reduce code duplication in between cache updates with and without rule reference. * Pass the current command op to rule_evaluate() as indicator whether to insert before or after a referenced rule or at beginning or end of chain in cache. Exploit this from chain_evaluate() to avoid adding the chain's rules a second time. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: bogus error when running monitor modePablo Neira Ayuso2019-06-071-3/+1
| | | | | | | | | | Fix bogus error message: # nft monitor Cannot set up netlink socket buffer size to 16777216 bytes, falling back to 16777216 bytes Fixes: bcf60fb819bf ("mnl: add mnl_set_rcvbuffer() and use it") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: generation ID is 32-bit longPablo Neira Ayuso2019-06-071-3/+8
| | | | | | | | Update mnl_genid_get() to return 32-bit long generation ID. Add nft_genid_u16() which allows us to catch ruleset updates from the netlink dump path via 16-bit long nfnetlink resource ID field. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: single cache_update() call to build cache before evaluationPablo Neira Ayuso2019-06-061-7/+1
| | | | | | | | | | | | | | | This patch allows us to make one single cache_update() call. Thus, there is not need to rebuild an incomplete cache from the middle of the batch processing. Note that nft_run_cmd_from_filename() does not need a full netlink dump to build the cache anymore, this should speed nft -f with incremental updates and very large rulesets. cache_evaluate() calculates the netlink dump to populate the cache that this batch needs. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: Simplify mnl_batch_talk()Phil Sutter2019-05-311-16/+13
| | | | | | | | | | | | | | | | | | By mimicking mnl_nft_event_listener() code, mnl_batch_talk() may be simplified quite a bit: * Turn the conditional loop into an unconditional one. * Call select() at loop start, which merges the two call sites. * Check readfds content after select() returned instead of in loop condition - if fd is not set, break to return error state stored in 'err' variable. * Old code checked that select() return code is > 0, but that was redundant: if FD_ISSET() returns true, select return code was 1. * Move 'nlh' helper variable definition into error handling block, it is not used outside of it. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: Initialize fd_set before select(), not afterPhil Sutter2019-05-311-3/+3
| | | | | | | | | | Calling FD_SET() in between return of select() and call to FD_ISSET() effectively renders the whole thing useless: FD_ISSET() will always return true no matter what select() actually did. Fixes: a72315d2bad47 ("src: add rule batching support") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: mnl_batch_talk() returns -1 on internal netlink errorsPablo Neira Ayuso2019-05-311-5/+2
| | | | | | Display an error in case internal netlink plumbing hits problems. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: estimate receiver buffer sizePablo Neira Ayuso2019-05-311-2/+7
| | | | | | | | | | | | | | | | | Set a receiver buffer size based on the number of commands and the average message size, this is useful for the --echo option in order to avoid ENOBUFS errors. On the kernel side, each skbuff consumes truesize from the socket queue (although it uses NLMSG_GOODSIZE to allocate it), which is approximately four times the estimated size per message that we get in turn for each echo message to ensure enough receiver buffer space. We could also explore increasing the buffer and retry if mnl_nft_socket_sendmsg() hits ENOBUFS if we ever hit this problem again. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: add mnl_nft_batch_to_msg()Pablo Neira Ayuso2019-05-311-18/+36
| | | | | | This function transforms the batch into a msghdr object. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: call mnl_set_sndbuffer() from mnl_batch_talk()Pablo Neira Ayuso2019-05-311-1/+2
| | | | | | Instead of mnl_nft_socket_sendmsg(), just a cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: mnl_set_rcvbuffer() skips buffer size update if it is too smallPablo Neira Ayuso2019-05-311-0/+11
| | | | | | | Check for existing buffer size, if this is larger than the newer buffer size, skip this size update. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: add mnl_set_rcvbuffer() and use itPablo Neira Ayuso2019-05-311-14/+23
| | | | | | This new function allows us to set the netlink receiver buffer. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: use UDATA defines from libnftnlPhil Sutter2019-05-031-3/+3
| | | | | | | | | | | | | Userdata attribute names have been added to libnftnl, use them instead of the local copy. While being at it, rename udata_get_comment() in netlink_delinearize.c and the callback it uses since the function is specific to rules. Also integrate the existence check for NFTNL_RULE_USERDATA into it along with the call to nftnl_rule_get_data(). Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: name is ignored when deleting a tableFlorian Westphal2019-01-181-4/+0
| | | | | | | | | | | | | nlt is reallocated, leaking first allocation and also removing the table name/handle that was set on nlt object. Add a test case for this as well, the batch is supposed to fail when trying to delete a non-existant table, rather than wiping all tables in the same address family. Fixes: 12c362e2214a0 ("mnl: remove alloc_nftnl_table()") Reported-by: Mikhail Morfikov <mmorfikov@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* src: remove deprecated code for export/import commandsPablo Neira Ayuso2018-12-271-60/+0
| | | | | | | | | | | | | | | | | | Update parser to display this error message: # nft export json Error: JSON export is no longer supported, use 'nft -j list ruleset' instead export json ^^^^^^^^^^^^ Just like: # nft export vm json Error: JSON export is no longer supported, use 'nft -j list ruleset' instead export vm json ^^^^^^^^^^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add nft_ctx_output_{get,set}_echo() to nft_ctx_output_{get,set}_flagsPablo Neira Ayuso2018-10-291-1/+1
| | | | | | | | Add NFT_CTX_OUTPUT_ECHO flag and echo the command that has been send to the kernel. Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove alloc_nftnl_flowtable()Pablo Neira Ayuso2018-10-241-16/+54
| | | | | | | We can remove alloc_nftnl_flowtable() and consolidate infrastructure in the src/mnl.c file. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: Improve error checking in mnl_nft_event_listener()Phil Sutter2018-10-241-2/+5
| | | | | | | | When trying to adjust receive buffer size, the second call to setsockopt() was not error-checked. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: use either name or handle to refer to objectsPablo Neira Ayuso2018-10-231-4/+5
| | | | | | We can only specify either name or handle to refer to objects. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove alloc_nftnl_obj()Pablo Neira Ayuso2018-10-231-16/+91
| | | | | | | We can remove alloc_nftnl_obj() and consolidate infrastructure in the src/mnl.c file. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: move socket open and reopen to mnl.cPablo Neira Ayuso2018-10-231-0/+22
| | | | | | | These functions are part of the mnl backend, move them there. Remove netlink_close_sock(), use direct call to mnl_socket_close(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: pass struct nft_ctx through struct netlink_ctxPablo Neira Ayuso2018-10-221-15/+13
| | | | Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove alloc_nftnl_set()Pablo Neira Ayuso2018-10-101-26/+166
| | | | | | | We can remove alloc_nftnl_set() and consolidate infrastructure in the src/mnl.c file. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove alloc_nftnl_rule()Pablo Neira Ayuso2018-10-101-17/+73
| | | | | | | We can remove alloc_nftnl_rule() and consolidate infrastructure in the src/mnl.c file. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove alloc_nftnl_chain()Pablo Neira Ayuso2018-10-041-12/+86
| | | | | | | | The netlink layer sits in between the mnl and the rule layers, remove it. We can remove alloc_nftnl_chain() and consolidate infrastructure in the src/mnl.c file. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove alloc_nftnl_table()Pablo Neira Ayuso2018-10-041-12/+43
| | | | | | | | The netlink layer sits in between the mnl and the rule layers, remove it. We can remove alloc_nftnl_table() and consolidate infrastructure in the src/mnl.c file. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: mnl: make nft_mnl_talk() publicFernando Fernandez Mancera2018-08-231-1/+1
| | | | | | | | As we are going to use the function nft_mnl_talk() from the incoming nftnl_osf.c, we make it public. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: remove non-batch netlink codePablo Neira Ayuso2018-04-201-133/+0
| | | | | | This functions have no clients anymore. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove unused batch support checksPablo Neira Ayuso2018-03-071-64/+0
| | | | | | Follow up after cc8c5fd02448 ("netlink: remove non-batching routine"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for get element commandPablo Neira Ayuso2018-03-071-2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | You need a Linux kernel >= 4.15 to use this feature. This patch allows us to dump the content of an existing set. # nft list ruleset table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2, 3.3.3.3, 5.5.5.5-6.6.6.6 } } } You check if a single element exists in the set: # nft get element x x { 1.1.1.5 } table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2 } } } Output means '1.1.1.5' belongs to the '1.1.1.1-2.2.2.2' interval. You can also check for intervals: # nft get element x x { 1.1.1.1-2.2.2.2 } table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2 } } } If you try to check for an element that doesn't exist, an error is displayed. # nft get element x x { 1.1.1.0 } Error: Could not receive set elements: No such file or directory get element x x { 1.1.1.0 } ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You can also check for multiple elements in one go: # nft get element x x { 1.1.1.5, 5.5.5.10 } table ip x { set x { type ipv4_addr flags interval elements = { 1.1.1.1-2.2.2.2, 5.5.5.5-6.6.6.6 } } } You can also use this to fetch the existing timeout for specific elements, in case you have a set with timeouts in place: # nft get element w z { 2.2.2.2 } table ip w { set z { type ipv4_addr timeout 30s elements = { 2.2.2.2 expires 17s } } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: delete flowtablePablo Neira Ayuso2018-03-051-0/+16
| | | | | | | | This patch allows you to delete an existing flowtable: # nft delete flowtable x m Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support to add flowtablesPablo Neira Ayuso2018-03-051-0/+16
| | | | | | | | | | | | | | | | | This patch allows you to create flowtable: # nft add table x # nft add flowtable x m { hook ingress priority 10\; devices = { eth0, wlan0 }\; } You have to specify hook and priority. So far, only the ingress hook is supported. The priority represents where this flowtable is placed in the ingress hook, which is registered to the devices that the user specifies. You can also use the 'create' command instead to bail out in case that there is an existing flowtable with this name. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for flowtable listingPablo Neira Ayuso2018-03-051-0/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to dump existing flowtable. # nft list ruleset table ip x { flowtable x { hook ingress priority 10 devices = { eth0, tap0 } } } You can also list existing flowtables via: # nft list flowtables table ip x { flowtable x { hook ingress priority 10 devices = { eth0, tap0 } } } You need a Linux kernel >= 4.16-rc to test this new feature. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Eliminate struct mnl_ctxPhil Sutter2017-11-161-139/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The issue leading to this patch was that debug output in nft_mnl_talk() bypasses the application-defined output_fp. While investigating, another problem was discovered: Most of the ad-hoc defined mnl_ctx objects have their field 'debug_mask' set to zero regardless of what netlink_ctx contains (this affects non-batch code path only). The intuitive solution to both of those issues required to extend function parameters of all the non-batch functions as well as the common nft_mnl_talk() one. Instead of complicating them even further, this patch instead makes them accept a pointer to netlink_ctx as first parameter to gather both the old (nf_sock, seqnum) and the new values (debug_mask, octx) from. Since after the above change struct mnl_ctx was not really used anymore, so the remaining places were adjusted as well to allow for removing the struct altogether. Note that cache routines needed special treatment: Although parameters of cache_update() make it a candidate for the same change, it can't be converted since it is called in evaluation phase sometimes in which there is no netlink context available (but just eval context instead). Since netlink_genid_get() needs a netlink context though, the ad-hoc netlink_ctx definition from cache_init() is moved into cache_update() to have it available there already. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* libnftables: Get rid of explicit cache flushesPhil Sutter2017-10-261-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the past, CLI as a potentially long running process had to make sure it kept it's cache up to date with kernel's rule set. A simple test case is this: | shell a | shell b | | # nft -i | # nft add table ip t | | | nft> list ruleset | | table ip t { | | } | # nft flush ruleset | | | nft> list ruleset | | nft> In order to make sure interactive CLI wouldn't incorrectly list the table again in the second 'list' command, it immediately flushed it's cache after every command execution. This patch eliminates the need for that by making cache updates depend on kernel's generation ID: A cache update stores the current rule set's ID in struct nft_cache, consecutive calls to cache_update() compare that stored value to the current generation ID received from kernel - if the stored value is zero (i.e. no previous cache update did happen) or if it doesn't match the kernel's value (i.e. cache is outdated) the cache is flushed and fully initialized again. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add nft_ prefix to everything exposed through include/nftables/nftables.hPablo Neira Ayuso2017-10-241-3/+3
| | | | | | | | Prepend nft_ prefix before these are exposed, reduce chances we hit symbol namespace pollution problems when mixing libnftables with other existing libraries. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: fix broken sequence number allocationPablo Neira Ayuso2017-10-021-1/+1
| | | | | | | | Wrong arithmetics with pointer. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1178 Fixes: 0d9d04c31481 ("src: make netlink sequence number non-static") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: do not set NLM_F_CREATE in deletion requestsPablo Neira Ayuso2017-09-081-2/+5
| | | | | | | This flag is not legal there, it only makes sense for addition requests. This patch has no impact at all in any of the nf_tables kernel versions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: fix error handling in mnl_batch_talkEric Leblond2017-08-241-2/+5
| | | | | | | | | | | | If one of the command is failing we should return an error. Pablo says: "This is not a real issue since nft_netlink() returns an error in case the list of errors is not empty. But we can indeed simplify things by removing that explicit assignment in nft_netlink() so mnl_batch_talk() consistently reports when if an error has happened. Signee-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>