path: root/iptables/nft.c
Commit message | Author | Age | Files | Lines
* nft-cache: Fetch cache per table | Phil Sutter | 2020-05-11 | 1 | -1/+1
  Restore per-table operation of cache routines as initially implemented in commit e2883c5531e6e ("nft-cache: Support partial cache per table"). As before, this doesn't limit fetching of tables (their number is supposed to be low) but instead limits fetching of sets, chains and rules to the specified table.

  For this to behave correctly when restoring without flushing over multiple tables, cache must be freed fully after each commit - otherwise the previous table's cache level is reused for the current one. The exception being fake cache, used for flushing restore: NFT_CL_FAKE is set just once at program startup, so it must stay set otherwise consecutive tables cause pointless cache fetching.

  The sole use-case requiring a multi-table cache, iptables-save, is indicated by req->table being NULL. Therefore, req->table assignment is a bit sloppy: All calls to nft_cache_level_set() are assumed to set the same table value, collision detection exists merely to catch programming mistakes.

  Make nft_fini() call nft_release_cache() instead of flush_chain_cache(), the former does a full cache deinit including cache_req contents.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
* nft: remove cache build calls | Pablo Neira Ayuso | 2020-05-11 | 1 | -21/+0
  The cache requirements are now calculated once from the parsing phase. There is no need to call __nft_build_cache() from several spots in the codepath anymore.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* nft: restore among support | Pablo Neira Ayuso | 2020-05-11 | 1 | -0/+15
  Update among support to work again with the new parser and cache logic.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* nft: calculate cache requirements from list of commands | Pablo Neira Ayuso | 2020-05-11 | 1 | -2/+15
  This patch uses the new list of commands to calculate the cache requirements. The rationale after this update is the following:

  #1 Parsing, which builds the list of commands and also calculates cache level requirements.
  #2 Cache building.
  #3 Translate commands to jobs.
  #4 Translate jobs to netlink.

  This patch removes the pre-parsing code in xtables-restore.c to calculate the cache. After this patch, cache is calculated only once, there is no need to cancel and refetch for an in-transit transaction.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
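The first phase listed above, deriving one cache requirement from the whole command list, could be sketched roughly as follows. All names here (demo_cache_level, demo_cmd, demo_required_cache_level) are hypothetical and not taken from nft.c. The mapping from command to level is an assumption for illustration, apart from two points stated elsewhere in this log: zeroing counters and inserting by index both need a rule cache.

```c
#include <stddef.h>

/* Hypothetical cache levels, ordered from cheapest to most expensive. */
enum demo_cache_level { DEMO_CL_NONE, DEMO_CL_TABLES, DEMO_CL_CHAINS, DEMO_CL_RULES };

/* Hypothetical command kinds produced by the parsing phase. */
enum demo_cmd_kind { DEMO_NEW_CHAIN, DEMO_APPEND, DEMO_INSERT_AT, DEMO_ZERO };

struct demo_cmd {
	enum demo_cmd_kind kind;
	const char *chain;
};

/* Cache level a single command needs (illustrative mapping only). */
enum demo_cache_level demo_cmd_cache_level(const struct demo_cmd *c)
{
	switch (c->kind) {
	case DEMO_NEW_CHAIN:
	case DEMO_APPEND:
		return DEMO_CL_CHAINS;	/* assumption: chain list is enough */
	case DEMO_INSERT_AT:
	case DEMO_ZERO:
		return DEMO_CL_RULES;	/* rules must be present in cache */
	}
	return DEMO_CL_NONE;
}

/* The requirement for the whole list is the highest level any command needs. */
enum demo_cache_level demo_required_cache_level(const struct demo_cmd *cmds, size_t n)
{
	enum demo_cache_level level = DEMO_CL_NONE;

	for (size_t i = 0; i < n; i++) {
		enum demo_cache_level l = demo_cmd_cache_level(&cmds[i]);

		if (l > level)
			level = l;
	}
	return level;
}
```

Because the level is known before any cache is built, the cache can be fetched once at the right depth instead of being extended piecemeal while commands are processed.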
* nft: split parsing from netlink commands | Pablo Neira Ayuso | 2020-05-11 | 1 | -59/+199
  This patch updates the parser to generate a list of command objects. This list of commands is then transformed to a list of netlink jobs.

  This new command object stores the rule using the nftnl representation via nft_rule_new().

  To reduce the number of updates in this patch, the nft_*_rule_find() functions have been updated to restore the native representation to skip the update of the rule comparison code.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* ebtables-restore: Drop custom table flush routine | Phil Sutter | 2020-05-11 | 1 | -21/+0
  At least since flushing xtables-restore doesn't fetch chains from kernel anymore, problems with pending policy rule delete jobs can't happen anymore.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
* xtables: Review nft_init() | Phil Sutter | 2020-02-24 | 1 | -1/+8
  Move common code into nft_init(), such as:

  * initial zeroing nft_handle fields
  * family ops lookup and assignment to 'ops' field
  * setting of 'family' field

  This requires minor adjustments in xtables_restore_main() so extra field initialization doesn't happen before nft_init() call.

  As a side-effect, this fixes segfaulting xtables-monitor binary when printing rules for trace event as in that code-path 'ops' field wasn't initialized.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
* nft: Drop pointless assignment | Phil Sutter | 2020-02-18 | 1 | -1/+0
  No need to set 'i' to zero here, it is not used before the next assignment.

  Fixes: 77e6a93d5c9dc ("xtables: add and set "implict" flag on transaction objects")
  Signed-off-by: Phil Sutter <phil@nwl.cc>
* ebtables: among: Support mixed MAC and MAC/IP entries | Phil Sutter | 2020-02-18 | 1 | -1/+19
  Powered by Stefano's support for concatenated ranges, a full among match replacement can be implemented. The trick is to add MAC-only elements as a concatenation of MAC and zero-length prefix, i.e. a range from 0.0.0.0 till 255.255.255.255.

  Although not quite needed, detection of pure MAC-only matches is left in place. For those, no implicit 'meta protocol' match is added (which is required otherwise at least to keep nft output correct) and no concat type is used for the set.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
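The "zero-length prefix" trick rests on simple arithmetic: a prefix of length 0 covers the whole IPv4 address space. A minimal sketch of that conversion, using hypothetical names (demo_ip4_range, demo_prefix_to_range) and host byte order, and assuming a prefix length of at most 32:

```c
#include <stdint.h>

/* Inclusive IPv4 range in host byte order (hypothetical helper type). */
struct demo_ip4_range {
	uint32_t lo;
	uint32_t hi;
};

/* Convert addr/plen into the inclusive range it covers; plen must be <= 32. */
struct demo_ip4_range demo_prefix_to_range(uint32_t addr, unsigned int plen)
{
	/* Shifting a 32-bit value by 32 is undefined in C, so plen 0 is special-cased. */
	uint32_t mask = plen == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - plen));
	struct demo_ip4_range r;

	r.lo = addr & mask;
	r.hi = r.lo | (uint32_t)~mask;
	return r;
}
```

With plen 0 the result is 0x00000000 through 0xffffffff, i.e. 0.0.0.0 through 255.255.255.255, so a MAC-only entry paired with that range matches any IP address.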
* nft: bridge: Rudimental among extension support | Phil Sutter | 2019-11-25 | 1 | -0/+149
  Support among match as far as possible given the limitations of nftables sets, namely limited to homogeneous MAC address only or MAC and IP address only matches.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Support parsing lookup expression | Phil Sutter | 2019-11-25 | 1 | -1/+2
  Add required glue code to support family specific lookup expression parsers implemented as family_ops callback.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Support NFT_COMPAT_SET_ADD | Phil Sutter | 2019-11-25 | 1 | -0/+58
  Implement the required infrastructure to create sets as part of a batch job commit.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Eliminate pointless calls to nft_family_ops_lookup() | Phil Sutter | 2019-11-25 | 1 | -10/+5
  If nft_handle is available, use its 'ops' field instead of performing a new lookup. For the same reason, there is no need to pass ops pointer to __nft_print_header().

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: family_ops: Pass nft_handle to 'rule_to_cs' callback | Phil Sutter | 2019-11-25 | 1 | -10/+8
  This is the actual callback used to parse nftables rules. Pass nft_handle to it so it can access the cache (and possible sets therein).

  Having to pass nft_handle to nft_rule_print_save() allows to simplify it a bit since no family ops lookup has to be done anymore.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: family_ops: Pass nft_handle to 'print_rule' callback | Phil Sutter | 2019-11-25 | 1 | -9/+10
  Prepare for 'rule_to_cs' callback to receive nft_handle pointer so it is able to access cache for set lookups.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: family_ops: Pass nft_handle to 'rule_find' callback | Phil Sutter | 2019-11-25 | 1 | -1/+1
  In order to prepare for rules containing set references, nft handle has to be passed to nft_rule_to_iptables_command_state() in order to let it access the set in cache.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: family_ops: Pass nft_handle to 'add' callback | Phil Sutter | 2019-11-25 | 1 | -2/+3
  In order for add_match() to create anonymous sets when converting xtables matches it needs access to nft handle. So pass it along from callers of family ops' add callback.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Fix -Z for rules with NFTA_RULE_COMPAT | Phil Sutter | 2019-11-15 | 1 | -0/+39
  The special nested attribute NFTA_RULE_COMPAT holds information about any present l4proto match (given via '-p' parameter) in input. The match is contained as meta expression as well, but some xtables extensions explicitly check its value (see e.g. xt_TPROXY).

  This nested attribute is input only, the information is lost after parsing (and initialization of compat extensions). So in order to feed a rule back to kernel with zeroed counters, the attribute has to be reconstructed based on the rule's expressions.

  Other code paths are not affected since rule_to_cs() callback will populate respective fields in struct iptables_command_state and 'add' callback (which is the inverse to rule_to_cs()) calls add_compat() in any case.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Reviewed-by: Florian Westphal <fw@strlen.de>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: CMD_ZERO needs a rule cache | Phil Sutter | 2019-11-15 | 1 | -0/+2
  In order to zero rule counters, they have to be fetched from kernel. Fix this for both standalone calls as well as xtables-restore --noflush.

  Fixes: b5cb6e631c828 ("nft-cache: Fetch only chains in nft_chain_list_get()")
  Fixes: 09cb517949e69 ("xtables-restore: Improve performance of --noflush operation")
  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Reviewed-by: Florian Westphal <fw@strlen.de>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Use ARRAY_SIZE() macro in nft_strerror() | Phil Sutter | 2019-10-23 | 1 | -1/+1
  Variable 'table' is an array of type struct table_struct, so this is a classical use-case for ARRAY_SIZE() macro.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
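For readers unfamiliar with the idiom, ARRAY_SIZE() is commonly defined as the array's total size divided by the size of one element. The sketch below shows that common definition applied to an error-message lookup. The table contents and the names demo_err_entry, demo_table and demo_strerror are made up for illustration and do not reproduce nft_strerror().

```c
#include <stddef.h>

/* Common definition; only valid for true arrays, not for pointers. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_err_entry {
	int err;
	const char *message;
};

/* Illustrative table; entries are invented for this sketch. */
static const struct demo_err_entry demo_table[] = {
	{ 1, "first error" },
	{ 2, "second error" },
	{ 3, "third error" },
};

/* Walk the table without hard-coding its length. */
const char *demo_strerror(int err)
{
	for (size_t i = 0; i < ARRAY_SIZE(demo_table); i++) {
		if (demo_table[i].err == err)
			return demo_table[i].message;
	}
	return "unknown error";
}
```

Adding or removing a table entry then needs no separate length constant to be kept in sync.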
* nft: Optimize flushing all chains of a table | Phil Sutter | 2019-10-17 | 1 | -14/+18
  Leverage nftables' support for flushing all chains of a table by omitting NFTNL_RULE_CHAIN attribute in NFT_MSG_DELRULE payload.

  The only caveat is with verbose output, as that still requires to have a list of (existing) chains to iterate over. Apart from that, implementing this shortcut is pretty straightforward: Don't retrieve a chain list and just call __nft_rule_flush() directly which doesn't set above attribute if chain name pointer is NULL.

  A bigger deal is keeping rule cache consistent: Instead of just clearing rule list for each flushed chain, flush_rule_cache() is updated to iterate over all cached chains of the given table, clearing their rule lists if not called for a specific chain.

  While being at it, sort local variable declarations in nft_rule_flush() from longest to shortest and drop the loop-local 'chain_name' variable (but instead use 'chain' function parameter which is not used at that point).

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
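The cache-consistency rule described above (a NULL chain name means "every cached chain of this table") can be illustrated with a toy cache. The flat array, the demo_chain type and demo_flush_rule_cache() are assumptions made for this sketch; the real cache uses libnftnl lists.

```c
#include <stddef.h>
#include <string.h>

/* Toy cached chain: the rule list is reduced to a counter. */
struct demo_chain {
	const char *table;
	const char *name;
	int nrules;
};

/*
 * Clear the cached rules of one chain, or of all chains in the table when
 * 'chain' is NULL. Returns the number of chains whose rules were cleared.
 */
int demo_flush_rule_cache(struct demo_chain *chains, size_t n,
			  const char *table, const char *chain)
{
	int touched = 0;

	for (size_t i = 0; i < n; i++) {
		if (strcmp(chains[i].table, table) != 0)
			continue;
		if (chain && strcmp(chains[i].name, chain) != 0)
			continue;
		chains[i].nrules = 0;
		touched++;
	}
	return touched;
}
```

Keeping both cases in one function mirrors how the kernel request treats a missing chain attribute as a table-wide flush.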
* nft: Support nft_is_table_compatible() per chain | Phil Sutter | 2019-10-17 | 1 | -8/+24
  When operating on a single chain only, compatibility checking causes unwanted overhead by checking all chains of the current table. Avoid this by accepting the current chain name as parameter and pass it along to nft_chain_list_get().

  While being at it, introduce nft_assert_table_compatible() which calls xtables_error() in case compatibility check fails. If a chain name was given, include that in error message.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Reduce cache overhead of nft_chain_builtin_init() | Phil Sutter | 2019-10-17 | 1 | -4/+5
  There is no need for a full chain cache, fetch only the few builtin chains that might need to be created.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft-cache: Support partial rule cache per chain | Phil Sutter | 2019-10-17 | 1 | -18/+17
  Accept an additional chain name pointer in __nft_build_cache() and pass it along to fetch only that specific chain and its rules.

  Enhance nft_build_cache() to take an optional nftnl_chain pointer to fetch rules for.

  Enhance nft_chain_list_get() to take an optional chain name. If cache level doesn't include chains already, it will fetch only the specified chain from kernel (if existing) and add that to table's chain list which is returned. This keeps operations for all chains of a table or a specific one within the same code path in nft.c.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft-cache: Fetch only chains in nft_chain_list_get() | Phil Sutter | 2019-10-17 | 1 | -0/+20
  The function is used to return the given table's chains, so fetching chain cache is enough. Add calls to nft_build_cache() in places where a rule cache is required.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Extract cache routines into nft-cache.c | Phil Sutter | 2019-10-10 | 1 | -356/+4
  The amount of code dealing with caching only is considerable and hence deserves an own source file.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Avoid nested cache fetching | Phil Sutter | 2019-10-10 | 1 | -2/+1
  Don't call fetch_table_cache() from within fetch_chain_cache() but instead from __nft_build_cache(). Since that is the only caller of fetch_chain_cache(), this change should not have any effect in practice.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Pass nft_handle to flush_cache() | Phil Sutter | 2019-10-10 | 1 | -17/+11
  This allows to call nft_table_builtin_find() and hence removes the only real user of __nft_table_builtin_find(). Consequently remove the latter by integrating it into its sole caller.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* xtables-restore: Minimize caching when flushing | Phil Sutter | 2019-09-30 | 1 | -0/+17
  Unless --noflush was given, xtables-restore merely needs the list of tables to decide whether to delete it or not. Introduce nft_fake_cache() function which populates table list, initializes chain lists (so nft_chain_list_get() returns an empty list instead of NULL) and sets 'have_cache' to turn any later calls to nft_build_cache() into nops.

  If --noflush was given, call nft_build_cache() just once instead of for each table line in input.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Florian Westphal <fw@strlen.de>
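The effect of setting 'have_cache' up front can be shown with a toy handle. Only the 'have_cache' flag name comes from the message above. The demo_handle type, the fetch counter and both functions are hypothetical stand-ins, and the sketch leaves out the table and chain list setup that the real nft_fake_cache() is described as doing.

```c
#include <stdbool.h>

/* Toy handle: counts how often the (expensive) kernel dump would run. */
struct demo_handle {
	bool have_cache;
	int kernel_fetches;
};

/* Build the cache unless one is already marked as present. */
void demo_build_cache(struct demo_handle *h)
{
	if (h->have_cache)
		return;			/* later calls are no-ops */
	h->kernel_fetches++;		/* stands in for the netlink dump */
	h->have_cache = true;
}

/* Mark the cache as present without fetching rules or chains. */
void demo_fake_cache(struct demo_handle *h)
{
	h->have_cache = true;
}
```

When restoring with flush, the existing ruleset is going to be replaced anyway, so skipping the dump loses nothing.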
* nft: Make nftnl_table_list_get() fetch only tables | Phil Sutter | 2019-09-30 | 1 | -1/+2
  No need for a full cache to serve the list of tables.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Florian Westphal <fw@strlen.de>
* nft: Fix for add and delete of same rule in single batch | Phil Sutter | 2019-09-30 | 1 | -0/+3
  Another corner-case found when extending restore ordering test: If a delete command in a dump referenced a rule added earlier within the same dump, kernel would reject the resulting NFT_MSG_DELRULE command.

  Catch this by assigning the rule to delete a RULE_ID value if it doesn't have a handle yet. Since __nft_rule_del() does not duplicate the nftnl_rule object when creating the NFT_COMPAT_RULE_DELETE command, this RULE_ID value is added to both NEWRULE and DELRULE commands - exactly what is needed to establish the reference.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Florian Westphal <fw@strlen.de>
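Why a single assignment reaches both commands follows from the two jobs sharing one rule object. A toy model of that sharing is sketched below. The types, the ID counter and demo_rule_del() are hypothetical, and a handle of 0 is assumed here to mean "not yet known to the kernel".

```c
#include <stdint.h>

/* Toy rule: handle 0 means the kernel has not assigned one yet (assumption). */
struct demo_rule {
	uint64_t handle;
	uint32_t rule_id;
};

enum demo_job_type { DEMO_RULE_ADD, DEMO_RULE_DEL };

/* A job points at the rule rather than owning a copy of it. */
struct demo_rule_job {
	enum demo_job_type type;
	struct demo_rule *rule;
};

static uint32_t demo_next_rule_id = 1;

/* Queue a delete; give handle-less rules an ID the kernel can match on. */
struct demo_rule_job demo_rule_del(struct demo_rule *r)
{
	struct demo_rule_job job = { DEMO_RULE_DEL, r };

	if (r->handle == 0 && r->rule_id == 0)
		r->rule_id = demo_next_rule_id++;
	return job;
}
```

Since the earlier add job holds the same pointer, it sees the ID as well once the batch is serialized, which is what ties the delete to the not-yet-committed rule.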
* nft: Get rid of NFT_COMPAT_EXPR_MAX define | Phil Sutter | 2019-09-26 | 1 | -4/+2
  Instead simply use ARRAY_SIZE() macro to not overstep supported_exprs array.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* xtables_error() does not return | Phil Sutter | 2019-09-25 | 1 | -6/+2
  It's a define which resolves into a callback which in turn is declared with noreturn attribute. It will never return, therefore drop all explicit exit() calls or other dead code immediately following it.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Florian Westphal <fw@strlen.de>
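The consequence of a noreturn declaration can be seen in a small stand-alone sketch. demo_fatal() and demo_checked_div() are hypothetical and unrelated to the real xtables_error() callback; the attribute syntax shown is the GCC/Clang one.

```c
#include <stdio.h>
#include <stdlib.h>

/* Tells the compiler that control never comes back from this function. */
__attribute__((noreturn)) void demo_fatal(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(EXIT_FAILURE);
}

int demo_checked_div(int a, int b)
{
	if (b == 0)
		demo_fatal("division by zero");
	/* No exit() or extra return is needed after demo_fatal():
	 * with b == 0, execution never reaches this line. */
	return a / b;
}
```

Any exit() placed directly after such a call is unreachable, which is why it can be dropped without changing behaviour.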
* nft Increase mnl_talk() receive buffer size | Phil Sutter | 2019-09-23 | 1 | -1/+1
  This improves cache population quite a bit and therefore helps when dealing with large rulesets. A simple hard to improve use-case is listing the last rule in a large chain. These are the average program run times depending on number of rules:

  rule count | legacy | nft old | nft new
  ---------------------------------------
      50,000 | .052s  | .611s   | .406s
     100,000 | .115s  | 2.12s   | 1.24s
     150,000 | .265s  | 7.63s   | 4.14s
     200,000 | .411s  | 21.0s   | 10.6s

  So while legacy iptables is still magnitudes faster, this simple change doubles iptables-nft performance in ideal cases.

  Note that using a larger buffer than 32KB doesn't further improve performance since linux kernel won't transmit more data at once. This limit was set (actually extended from 16KB) in kernel commit d35c99ff77ecb ("netlink: do not enter direct reclaim from netlink_dump()").

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Introduce nft_bridge_commit() | Phil Sutter | 2019-09-23 | 1 | -2/+6
  No need to check family value from nft_commit() if we can have a dedicated callback for bridge family.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Use nftnl_*_set_str() functions | Phil Sutter | 2019-09-23 | 1 | -14/+14
  Although it doesn't make a difference in practice, they are the correct API functions to use when assigning string attributes. While doing so, also drop the needless casts to non-const.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Drop stale include directive | Phil Sutter | 2019-08-01 | 1 | -1/+0
  This is a leftover, the file does not exist in fresh clones.

  Fixes: 06fd5e46d46f7 ("xtables: Drop support for /etc/xtables.conf")
  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* xtables: Drop support for /etc/xtables.conf | Phil Sutter | 2019-07-29 | 1 | -154/+8
  As decided upon at NFWS2019, drop support for configurable nftables base chains to use with iptables-nft.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Set errno in nft_rule_flush() | Phil Sutter | 2019-07-29 | 1 | -1/+3
  When trying to flush a non-existent chain, errno gets set in nft_xtables_config_load(). That is an unintended side-effect and when support for xtables.conf is later removed, iptables-nft will emit the generic "Incompatible with this kernel." error message instead of "No chain/target/match by that name." as it should.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Make nft_for_each_table() more versatile | Phil Sutter | 2019-07-23 | 1 | -3/+3
  Support passing arbitrary data (via void pointer) to the callback.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: exit in case we can't fetch current genid | Florian Westphal | 2019-07-15 | 1 | -2/+8
  When running iptables -nL as non-root user, iptables would loop indefinitely. With this change, it will fail with

    iptables v1.8.3 (nf_tables): Could not fetch rule set generation id: Permission denied (you must be root)

  Reported-by: Amish <anon.amish@gmail.com>
  Signed-off-by: Florian Westphal <fw@strlen.de>
  Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Move send/receive buffer sizes into nft_handle | Phil Sutter | 2019-07-03 | 1 | -10/+7
  Store them next to the mnl_socket pointer. While being at it, add a comment to mnl_set_rcvbuffer() explaining why the buffer size is changed.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Pass nft_handle down to mnl_batch_talk() | Phil Sutter | 2019-07-03 | 1 | -23/+18
  From there, pass it along to mnl_nft_socket_sendmsg() and further down to mnl_set_{snd,rcv}buffer(). This prepares the code path for keeping stored socket buffer sizes in struct nft_handle.

  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: Set socket receive buffer | Phil Sutter | 2019-07-03 | 1 | -4/+23
  When trying to delete user-defined chains in a large ruleset, iptables-nft aborts with "No buffer space available". This can be reproduced using the following script:

  | #! /bin/bash
  | iptables-nft-restore <(
  |
  | echo "*filter"
  | for i in $(seq 0 200000);do
  |         printf ":chain_%06x - [0:0]\n" $i
  | done
  | for i in $(seq 0 200000);do
  |         printf -- "-A INPUT -j chain_%06x\n" $i
  |         printf -- "-A INPUT -j chain_%06x\n" $i
  | done
  | echo COMMIT
  |
  | )
  | iptables-nft -X

  The problem seems to be the sheer amount of netlink error messages sent back to user space (one EBUSY for each chain). To solve this, set receive buffer size depending on number of commands sent to kernel.

  Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
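Scaling the receive buffer with the number of queued commands might look roughly like the sketch below. The per-command estimate, the floor parameter and both function names are assumptions made for illustration; the message above does not state the actual formula or which socket option is used. setsockopt() with SO_RCVBUF is the standard POSIX call, and the kernel may adjust or cap the requested value.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical estimate of reply bytes the kernel may send per command. */
#define DEMO_REPLY_BYTES_PER_CMD 64u

/* Buffer size to request: grows with the batch, never below 'floor_bytes'. */
size_t demo_rcvbuf_size(size_t num_cmds, size_t floor_bytes)
{
	size_t want = num_cmds * DEMO_REPLY_BYTES_PER_CMD;

	return want > floor_bytes ? want : floor_bytes;
}

/* Ask the kernel for a receive buffer of 'bytes'; returns 0 on success. */
int demo_set_rcvbuf(int fd, size_t bytes)
{
	int val = (int)bytes;

	return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, sizeof(val));
}
```

A caller would compute the size once per batch, for example demo_set_rcvbuf(fd, demo_rcvbuf_size(num_cmds, 4096)), before sending the batch, so one error reply per command still fits.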
* nft: reset netlink sender buffer size of socket restart | Pablo Neira Ayuso | 2019-05-20 | 1 | -0/+1
  Otherwise, mnl_set_sndbuffer() skips the buffer update after socket restart. Then, sendmsg() fails with EMSGSIZE later on when sending the batch to the kernel.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: do not retry on EINTR | Pablo Neira Ayuso | 2019-05-20 | 1 | -21/+5
  Patch ab1cd3b510fa ("nft: ensure cache consistency") already handles consistency via generation ID.

  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: don't care about previous state in ERESTART | Pablo Neira Ayuso | 2019-05-20 | 1 | -7/+10
  We need to re-evaluate based on the existing cache generation.

  Fixes: 58d7de0181f6 ("xtables: handle concurrent ruleset modifications")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nft: don't skip table addition from ERESTART | Pablo Neira Ayuso | 2019-05-20 | 1 | -9/+1
  I don't find a scenario that triggers this case.

  Fixes: 58d7de0181f6 ("xtables: handle concurrent ruleset modifications")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* xtables: Fix for explicit rule flushes | Phil Sutter | 2019-05-20 | 1 | -1/+1
  The commit this fixes added a new parameter to __nft_rule_flush() to mark a rule flush job as implicit or not. Yet the code added to that function ignores the parameter and instead always sets batch job's 'implicit' flag to 1.

  Fixes: 77e6a93d5c9dc ("xtables: add and set "implict" flag on transaction objects")
  Signed-off-by: Phil Sutter <phil@nwl.cc>
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
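The class of bug described above, a parameter that is accepted but then overridden by a constant, is easy to show in miniature. The demo_flush_job type and demo_rule_flush_job() are hypothetical and only mirror the shape of the fix, not the real __nft_rule_flush().

```c
#include <stdbool.h>

/* Toy batch job carrying the flag in question. */
struct demo_flush_job {
	const char *chain;
	bool implicit;
};

/* Create a flush job that honours the caller's 'implicit' argument. */
struct demo_flush_job demo_rule_flush_job(const char *chain, bool implicit)
{
	struct demo_flush_job job;

	job.chain = chain;
	/* The buggy shape would be "job.implicit = true;" here, which
	 * marks even user-requested flushes as implicit. */
	job.implicit = implicit;
	return job;
}
```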
* nft: keep original cache in case of ERESTART | Pablo Neira Ayuso | 2019-05-20 | 1 | -3/+20
  Phil Sutter says:

  "The problem is that data in h->obj_list potentially sits in cache, too. At least rules have to be there so insert with index works correctly. If the cache is flushed before regenerating the batch, use-after-free occurs which crashes the program."

  This patch keeps around the original cache until we have refreshed the batch.

  Fixes: 862818ac3a0de ("xtables: add and use nft_build_cache")
  Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>