summaryrefslogtreecommitdiffstats
path: root/src/mnl.c
Commit message (Collapse)AuthorAgeFilesLines
* src: remove xfree() and use plain free()Thomas Haller2023-11-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xmalloc() (and similar x-functions) are used for allocation. They wrap malloc()/realloc() but will abort the program on ENOMEM. The meaning of xmalloc() is that it wraps malloc() but aborts on failure. I don't think x-functions should have the notion, that this were potentially a different memory allocator that must be paired with a particular xfree(). Even if the original intent was that the allocator is abstracted (and possibly not backed by standard malloc()/free()), then that doesn't seem a good idea. Nowadays libc allocators are pretty good, and we would need a very special use cases to switch to something else. In other words, it will never happen that xmalloc() is not backed by malloc(). Also there were a few places, where a xmalloc() was already "wrongly" paired with free() (for example, iface_cache_release(), exit_cookie(), nft_run_cmd_from_buffer()). Or note how pid2name() returns an allocated string from fscanf(), which needs to be freed with free() (and not xfree()). This requirement bubbles up the callers portid2name() and name_by_portid(). This case was actually handled correctly and the buffer was freed with free(). But it shows that mixing different allocators is cumbersome to get right. Of course, we don't actually have different allocators and whether to use free() or xfree() makes no different. The point is that xfree() serves no actual purpose except raising irrelevant questions about whether x-functions are correctly paired with xfree(). Note that xfree() also used to accept const pointers. It is bad to unconditionally for all deallocations. Instead prefer to use plain free(). To free a const pointer use free_const() which obviously wraps free, as indicated by the name. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add free_const() and use it instead of xfree()Thomas Haller2023-11-091-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Almost everywhere xmalloc() and friends is used instead of malloc(). This is almost everywhere paired with xfree(). xfree() has two problems. First, it brings the wrong notion that xmalloc() should be paired with xfree(), as if xmalloc() would not use the plain malloc() allocator. In practices, xfree() just wraps free(), and it wouldn't make sense any other way. xfree() should go away. This will be addressed in the next commit. The problem addressed by this commit is that xfree() accepts a const pointer. Paired with the practice of almost always using xfree() instead of free(), all our calls to xfree() cast away constness of the pointer, regardless whether that is necessary. Declaring a pointer as const should help us to catch wrong uses. If the xfree() function always casts aways const, the compiler doesn't help. There are many places that rightly cast away const during free. But not all of them. Add a free_const() macro, which is like free(), but accepts const pointers. We should always make an intentional choice whether to use free() or free_const(). Having a free_const() macro makes this very common choice clearer, instead of adding a (void*) cast at many places. Note that we now pair xmalloc() allocations with a free() call (instead of xfree(). That inconsistency will be resolved in the next commit. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: include <string.h> in <nft.h>Thomas Haller2023-09-281-1/+0
| | | | | | | | <string.h> provides strcmp(), as such it's very basic and used everywhere. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* include: include <stdlib.h> in <nft.h>Thomas Haller2023-09-111-1/+0
| | | | | | | | | | | | | | It provides malloc()/free(), which is so basic that we need it everywhere. Include via <nft.h>. The ultimate purpose is to define more things in <nft.h>. While it has not corresponding C sources, <nft.h> can contain macros and static inline functions, and is a good place for things that we shall have everywhere. Since <stdlib.h> provides malloc()/free() and size_t, that is a very basic dependency, that will be needed for that. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add <nft.h> header and include it as firstThomas Haller2023-08-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | <config.h> is generated by the configure script. As it contains our feature detection, it want to use it everywhere. Likewise, in some of our sources, we define _GNU_SOURCE. This defines the C variant we want to use. Such a define need to come before anything else, and it would be confusing if different source files adhere to a different C variant. It would be good to use autoconf's AC_USE_SYSTEM_EXTENSIONS, in which case we would also need to ensure that <config.h> is always included as first. Instead of going through all source files and include <config.h> as first, add a new header "include/nft.h", which is supposed to be included in all our sources (and as first). This will also allow us later to prepare some common base, like include <stdbool.h> everywhere. We aim that headers are self-contained, so that they can be included in any order. Which, by the way, already didn't work because some headers define _GNU_SOURCE, which would only work if the header gets included as first. <nft.h> is however an exception to the rule: everything we compile shall rely on having <nft.h> header included as first. This applies to source files (which explicitly include <nft.h>) and to internal header files (which are only compiled indirectly, by being included from a source file). Note that <config.h> has no include guards, which is at least ugly to include multiple times. It doesn't cause problems in practice, because it only contains defines and the compiler doesn't warn about redefining a macro with the same value. Still, <nft.h> also ensures to include <config.h> exactly once. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Implement 'reset {set,map,element}' commandsPhil Sutter2023-07-131-4/+18
| | | | | | | | | | | All these are used to reset state in set/map elements, i.e. reset the timeout or zero quota and counter values. While 'reset element' expects a (list of) elements to be specified which should be reset, 'reset set/map' will reset all elements in the given set/map. Signed-off-by: Phil Sutter <phil@nwl.cc>
* mnl: support bpf id decode in nft list hooksFlorian Westphal2023-05-221-0/+40
| | | | | | | | | | | This allows 'nft list hooks' to also display the bpf program id attached. Example: hook input { -0000000128 nf_hook_run_bpf id 6 .. Signed-off-by: Florian Westphal <fw@strlen.de>
* mnl: incomplete extended error reporting for singleton device in chainPablo Neira Ayuso2023-04-251-0/+1
| | | | | | | | | | | Fix error reporting when single device is specifies in chain: # nft add chain netdev filter ingress '{ devices = { x }; }' add chain netdev filter ingress { devices = { x }; } ^ Fixes: a66b5ad9540d ("src: allow for updating devices on existing netdev chain") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: handle singleton element in netdevice setPablo Neira Ayuso2023-04-251-14/+32
| | | | | | | | expr_evaluate_set() turns sets with singleton element into value, nft_dev_add() expects a list of expression, so it crashes. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1676 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow for updating devices on existing netdev chainPablo Neira Ayuso2023-04-241-52/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to add/remove devices to an existing chain: # cat ruleset.nft table netdev x { chain y { type filter hook ingress devices = { eth0 } priority 0; policy accept; } } # nft -f ruleset.nft # nft add chain netdev x y '{ devices = { eth1 }; }' # nft list ruleset table netdev x { chain y { type filter hook ingress devices = { eth0, eth1 } priority 0; policy accept; } } # nft delete chain netdev x y '{ devices = { eth0 }; }' # nft list ruleset table netdev x { chain y { type filter hook ingress devices = { eth1 } priority 0; policy accept; } } This feature allows for creating an empty netdev chain, with no devices. In such case, no packets are seen until a device is registered. This patch includes extended netlink error reporting: # nft add chain netdev x y '{ devices = { x } ; }' Error: Could not process rule: No such file or directory add chain netdev x y { devices = { x } ; } ^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: flowtable support for extended netlink error reportingPablo Neira Ayuso2023-04-241-60/+82
| | | | | | | | | | | | | | | This patch extends existing flowtable support to improve error reporting: # nft add flowtable inet x y '{ devices = { x } ; }' Error: Could not process rule: No such file or directory add flowtable inet x y { devices = { x } ; } ^ # nft delete flowtable inet x y '{ devices = { x } ; }' Error: Could not process rule: No such file or directory delete flowtable inet x y { devices = { x } ; } ^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: set SO_SNDBUF before SO_SNDBUFFORCEPablo Neira Ayuso2023-04-241-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set SO_SNDBUF before SO_SNDBUFFORCE: Unpriviledged user namespace does not have CAP_NET_ADMIN on the host (user_init_ns) namespace. SO_SNDBUF always succeeds in Linux, always try SO_SNDBUFFORCE after it. Moreover, suggest the user to bump socket limits if EMSGSIZE after having see EPERM previously, when calling SO_SNDBUFFORCE. Provide a hint to the user too: # nft -f test.nft netlink: Error: Could not process rule: Message too long Please, rise /proc/sys/net/core/wmem_max on the host namespace. Hint: 4194304 bytes Dave Pfike says: Prior to this patch, nft inside a systemd-nspawn container was failing to install my ruleset (which includes a large-ish map), with the error netlink: Error: Could not process rule: Message too long strace reveals: setsockopt(3, SOL_SOCKET, SO_SNDBUFFORCE, [524288], 4) = -1 EPERM (Operation not permitted) This is despite the nspawn process supposedly having CAP_NET_ADMIN. A web search reveals at least one other user having the same issue: https://old.reddit.com/r/Proxmox/comments/scnoav/lxc_container_debian_11_nftables_geoblocking/ Reported-by: Dave Pifke <dave@pifke.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cmd: move command functions to src/cmd.cPablo Neira Ayuso2023-03-111-0/+1
| | | | | | Move several command functions to src/cmd.c to debloat src/rule.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support to command "destroy"Fernando F. Mancera2023-02-061-10/+36
| | | | | | | | | | | | | | | | | | | | | | | | | "destroy" command performs a deletion as "delete" command but does not fail if the object does not exist. As there is no NLM_F_* flag for ignoring such error, it needs to be ignored directly on error handling. Example of use: # nft list ruleset table ip filter { chain output { } } # nft destroy table ip missingtable # echo $? 0 # nft list ruleset table ip filter { chain output { } } Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Implement 'reset rule' and 'reset rules' commandsPhil Sutter2023-01-181-4/+14
| | | | | | | | Reset rule counters and quotas in kernel, i.e. without having to reload them. Requires respective kernel patch to support NFT_MSG_GETRULE_RESET message type. Signed-off-by: Phil Sutter <phil@nwl.cc>
* mnl: dump_nf_hooks() leaks memory in error pathPhil Sutter2023-01-131-2/+9
| | | | | | | Have to free the basehook object before returning to caller. Fixes: 4694f7230195b ("src: add support for base hook dumping") Signed-off-by: Phil Sutter <phil@nwl.cc>
* src: Add GPLv2+ header to .c files of recent creationPablo Neira Ayuso2023-01-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch comes after a proposal of mine at NFWS 2022 that resulted in agreement to license recent .c files under GPLv2+ by the attendees at this meeting: - Stefano Brivio - Fernando F. Mancera - Phil Sutter - Jozsef Kadlecsik - Florian Westphal - Laura Garcia - Arturo Borrero - Pablo Neira It has already happened that one of the external library dependencies was moved to GPLv3+ (libreadline), resulting in a change to libedit by default in b4dded0ca78d ("configure: default to libedit for cli"). I have added the GPLv2+ header to the following files: Authors ------- src/cmd.c Pablo src/fib.c Florian src/hash.c Pablo src/iface.c Pablo src/json.c Phil + fixes from occasional contributors src/libnftables.c Eric Leblond and Phil src/mergesort.c Elise Lenion src/misspell.c Pablo src/mnl.c Pablo + fixes from occasional contributors src/monitor.c Arturo src/numgen.c Pablo src/osf.c Fernando src/owner.c Pablo src/parser_json.c Phil + fixes from occasional contributors src/print.c Phil src/xfrm.c Florian src/xt.c Pablo Eric Leblond and Elise Lennion did not attend NFWS 2022, but they acknowledged this license update already in the past when I proposed this to them in private emails. Update COPYING file too to refer that we are now moving towards GPLv2 or any later. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: store netlink error location for set elementsPablo Neira Ayuso2022-06-271-11/+17
| | | | | | | | | | | | | | | | | Store set element location in the per-command netlink error location array. This allows for fine grain error reporting when adding and deleting elements. # nft -f test.nft test.nft:5:4-20: Error: Could not process rule: File exists 00:01:45:09:0b:26 : drop, ^^^^^^^^^^^^^^^^^ test.nft contains a large map with one redundant entry. Thus, users do not have to find the needle in the stack. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: update mnl_nft_setelem_del() to allow for more reusePablo Neira Ayuso2022-04-131-3/+3
| | | | | | Pass handle and element list as parameters to allow for code reuse. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow to use typeof of raw expressions in set declarationPablo Neira Ayuso2022-03-291-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the dynamic datatype to allocate an instance of TYPE_INTEGER and set length and byteorder. Add missing information to the set userdata area for raw payload expressions which allows to rebuild the set typeof from the listing path. A few examples: - With anonymous sets: nft add rule x y ip saddr . @ih,32,32 { 1.1.1.1 . 0x14, 2.2.2.2 . 0x1e } - With named sets: table x { set y { typeof ip saddr . @ih,32,32 elements = { 1.1.1.1 . 0x14 } } } Incremental updates are also supported, eg. nft add element x y { 3.3.3.3 . 0x28 } expr_evaluate_concat() is used to evaluate both set key definitions and set key values, using two different function might help to simplify this code in the future. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: do not use the nft_cache_filter object from mnl.cPablo Neira Ayuso2022-01-151-7/+5
| | | | | | Pass the table and chain strings to mnl_nft_rule_dump() instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: Support filtering for a specific flowtablePhil Sutter2021-12-031-4/+10
| | | | | | | | | | Extend nft_cache_filter to hold a flowtable name so 'list flowtable' command causes fetching the requested flowtable only. Dump flowtables just once instead of for each table, merely assign fetched data to tables inside the loop. Signed-off-by: Phil Sutter <phil@nwl.cc>
* cache: Filter set list on server sidePhil Sutter2021-12-031-4/+11
| | | | | | | | | Fetch either all tables' sets at once, a specific table's sets or even a specific set if needed instead of iterating over the list of previously fetched tables and fetching for each, then ignoring anything returned that doesn't match the filter. Signed-off-by: Phil Sutter <phil@nwl.cc>
* cache: Filter chain list on kernel sidePhil Sutter2021-12-031-3/+18
| | | | | | | | | | | | | When operating on a specific chain, add payload to NFT_MSG_GETCHAIN so kernel returns only relevant data. Since ENOENT is an expected return code, do not treat this as error. While being at it, improve code in chain_cache_cb() a bit: - Check chain's family first, it is a less expensive check than comparing table names. - Do not extract chain name of uninteresting chains. Signed-off-by: Phil Sutter <phil@nwl.cc>
* cache: Filter rule list on kernel sidePhil Sutter2021-12-031-2/+19
| | | | | | | | | | | Instead of fetching all existing rules in kernel's ruleset and filtering in user space, add payload to the dump request specifying the table and chain to filter for. Since list_rule_cb() no longer needs the filter, pass only netlink_ctx to the callback and drop struct rule_cache_dump_ctx. Signed-off-by: Phil Sutter <phil@nwl.cc>
* cache: Filter tables on kernel sidePhil Sutter2021-12-031-3/+19
| | | | | | | | | | | Instead of requesting a dump of all tables and filtering the data in user space, construct a non-dump request if filter contains a table so kernel returns only that single table. This should improve nft performance in rulesets with many tables present. Signed-off-by: Phil Sutter <phil@nwl.cc>
* mnl: different signedness compilation warningPablo Neira Ayuso2021-11-191-1/+1
| | | | | | | | | mnl.c: In function ‘mnl_batch_talk’: mnl.c:417:17: warning: comparison of integer expressions of different signedness: ‘unsigned in’ and ‘long int’ [-Wsign-compare] if (rcvbufsiz < NFT_MNL_ECHO_RCVBUFF_DEFAULT) ^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: do not build nftnl_set element listPablo Neira Ayuso2021-11-081-23/+89
| | | | | | | | | | | | Do not call alloc_setelem_cache() to build the set element list in nftnl_set. Instead, translate one single set element expression to nftnl_set_elem object at a time and use this object to build the netlink header. Using a huge test set containing 1.1 million element blocklist, this patch is reducing userspace memory consumption by 40%. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: revisit hook listingPablo Neira Ayuso2021-08-061-98/+241
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update this command to display the hook datapath for a packet depending on its family. This patch also includes: - Group of existing hooks based on the hook location. - Order hooks by priority, from INT_MIN to INT_MAX. - Do not add sign to priority zero. - Refresh include/linux/netfilter/nfnetlink_hook.h cache copy. - Use NFNLA_CHAIN_* attributes to print the chain family, table and name. If NFNLA_CHAIN_* attributes are not available, display the hookfn name. - Update syntax: remove optional hook parameter, promote the 'device' argument. The following example shows the hook datapath for IPv4 packets coming in from netdevice 'eth0': # nft list hooks ip device eth0 family ip { hook ingress { +0000000010 chain netdev x y [nf_tables] +0000000300 chain inet m w [nf_tables] } hook input { -0000000100 chain ip a b [nf_tables] +0000000300 chain inet m z [nf_tables] } hook forward { -0000000225 selinux_ipv4_forward 0000000000 chain ip a c [nf_tables] } hook output { -0000000225 selinux_ipv4_output } hook postrouting { +0000000225 selinux_ipv4_postroute } } Note that the listing above includes the existing netdev and inet hooks/chains which *might* interfer in the travel of an incoming IPv4 packet. This allows users to debug the pipeline, basically, to understand in what order the hooks/chains are evaluated for the IPv4 packets. If the netdevice is not specified, then the ingress hooks are not shown. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: replace opencoded NFT_SET_ANONYMOUS set flag check by set_is_anonymous()Pablo Neira Ayuso2021-06-141-1/+1
| | | | | | | | Use set_is_anonymous() to check for the NFT_SET_ANONYMOUS set flag instead. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add support for base hook dumpingFlorian Westphal2021-06-091-1/+327
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Example output: $ nft list hook ip input family ip hook input { +0000000000 nft_do_chain_inet [nf_tables] # nft table ip filter chain input +0000000010 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain filter_INPUT +0000000100 nf_nat_ipv4_local_in [nf_nat] +2147483647 ipv4_confirm [nf_conntrack] } $ nft list hooks netdev type ingress device lo family netdev hook ingress device lo { +0000000000 nft_do_chain_netdev [nf_tables] } $ nft list hooks inet family ip hook prerouting { -0000000400 ipv4_conntrack_defrag [nf_defrag_ipv4] -0000000300 iptable_raw_hook [iptable_raw] -0000000290 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain raw_PREROUTING -0000000200 ipv4_conntrack_in [nf_conntrack] -0000000140 nft_do_chain_inet [nf_tables] # nft table ip firewalld chain mangle_PREROUTING -0000000100 nf_nat_ipv4_pre_routing [nf_nat] } ... 'nft list hooks' will display everyting except the netdev family via successive dump request for all family:hook combinations. Signed-off-by: Florian Westphal <fw@strlen.de>
* libnftables: location-based error reporting for chain typePablo Neira Ayuso2021-05-201-1/+7
| | | | | | | | | | | | | | | | | Store the location of the chain type for better error reporting. Several users that compile custom kernels reported that error reporting is misleading when accidentally selecting CONFIG_NFT_NAT=n. After this patch, a better hint is provided: # nft 'add chain x y { type nat hook prerouting priority dstnat; }' Error: Could not process rule: No such file or directory add chain x y { type nat hook prerouting priority dstnat; } ^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: unbreak deletion by table handlePablo Neira Ayuso2021-05-021-1/+1
| | | | | | | | | Use NFTA_TABLE_HANDLE instead of NFTA_TABLE_NAME to refer to the table 64-bit unique handle. Fixes: 7840b9224d5b ("evaluate: remove table from cache on delete table") Fixes: f8aec603aa7e ("src: initial extended netlink error reporting") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: Increase BATCH_PAGE_SIZE to support huge rulesetsPhil Sutter2021-04-211-4/+4
| | | | | | | Apply the same change from iptables-nft to nftables to keep them in sync with regards to max supported transaction sizes. Signed-off-by: Phil Sutter <phil@nwl.cc>
* mnl: do not set flowtable flags twicePablo Neira Ayuso2021-03-311-5/+0
| | | | | | | | Flags are already set on from mnl_nft_flowtable_add(), remove duplicated code. Fixes: e6cc9f37385 ("nftables: add flags offload to flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* nftables: add flags offload to flowtableFrank Wunderlich2021-03-251-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | allow flags (currently only offload) in flowtables like it is stated here: https://lwn.net/Articles/804384/ tested on mt7622/Bananapi-R64 table ip filter { flowtable f { hook ingress priority filter + 1 devices = { lan3, lan0, wan } flags offload; } chain forward { type filter hook forward priority filter; policy accept; ip protocol { tcp, udp } flow add @f } } table ip nat { chain post { type nat hook postrouting priority filter; policy accept; oifname "wan" masquerade } } Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: Set NFTNL_SET_DATA_TYPE before dumping set elementsPhil Sutter2021-03-091-0/+3
| | | | | | | | In combination with libnftnl's commit "set_elem: Fix printing of verdict map elements", This adds the vmap target to netlink dumps. Adjust dumps in tests/py accordingly. Signed-off-by: Phil Sutter <phil@nwl.cc>
* mnl: remove nft_mnl_socket_reopen()Pablo Neira Ayuso2021-03-051-13/+17
| | | | | | | | | | | | | | nft_mnl_socket_reopen() was introduced to deal with the EINTR case. By reopening the netlink socket, pending netlink messages that are part of a stale netlink dump are implicitly drop. This patch replaces the nft_mnl_socket_reopen() strategy by pulling out all of the remaining netlink message to restart in a clean state. This is implicitly fixing up a bug in the table ownership support, which assumes that the netlink socket remains open until nft_ctx_free() is invoked. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add set element multi-statement supportPablo Neira Ayuso2020-12-181-3/+14
| | | | | | | | Extend the set element infrastructure to support for several statements. This patch places the statements right after the key when printing it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: reply netlink error message might be larger than MNL_SOCKET_BUFFER_SIZEPablo Neira Ayuso2020-12-041-1/+4
| | | | | | | | | | | | | | | | | Netlink attribute maximum size is 65536 bytes (given nla_len is 16-bits). NFTA_SET_ELEM_LIST_ELEMENTS stores as many set elements as possible that can fit into this netlink attribute. Netlink messages with NLMSG_ERROR type originating from the kernel contain the original netlink message as payload, they might be larger than 65536 bytes. Add NFT_MNL_ACK_MAXSIZE which estimates the maximum Netlink header coming as (error) reply from the kernel. This estimate is based on the maximum netlink message size that nft sends from userspace. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1464 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: improve rule error reportingPablo Neira Ayuso2020-10-201-2/+63
| | | | | | | | | | | | | | | | | | | | | Kernel provides information regarding expression since 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). A common mistake is to refer a chain which does not exist, e.g. # nft add rule x y jump test Error: Could not process rule: No such file or directory add rule x y jump test ^^^^ Use the existing netlink extended error reporting infrastructure to provide better error reporting as in the example above. Requires Linux kernel patch 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for chainsJose M. Guisado Gomez2020-09-301-0/+11
| | | | | | | | | | | | | | | | | | | | This patch enables the user to specify a comment when adding a chain. Relies on kernel space supporting userdata for chains. > nft add table ip filter > nft add chain ip filter input { comment "test"\; type filter hook input priority 0\; policy accept\; } > list ruleset table ip filter { chain input { comment "test" type filter hook input priority filter; policy accept; } } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* mnl: larger receive socket buffer for netlink errorsPablo Neira Ayuso2020-09-141-16/+5
| | | | | | | | Assume each error in the batch will result in a 1k notification for the non-echo flag set on case as described in 860671662d3f ("mnl: fix --echo buffer size again"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for objectsJose M. Guisado Gomez2020-09-081-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enables specifying an optional comment when declaring named objects. The comment is to be specified inside the object's block ({} block) Relies on libnftnl exporting nftnl_obj_get_data and kernel space support to store the comments. For consistency, this patch makes the comment be printed first when listing objects. Adds a testcase importing all commented named objects except for secmark, although it's supported. Example: Adding a quota with a comment > add table inet filter > nft add quota inet filter q { over 1200 bytes \; comment "test_comment"\; } > list ruleset table inet filter { quota q { comment "test_comment" over 1200 bytes } } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support when adding tablesJose M. Guisado Gomez2020-08-281-2/+15
| | | | | | | | | | | | | | | | | | | Adds userdata building logic if a comment is specified when creating a new table. Adds netlink userdata parsing callback function. Relies on kernel supporting userdata for nft_table. Example: > nft add table ip x { comment "test"\; } > nft list ruleset table ip x { comment "test" } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add comment support for set declarationsJose M. Guisado Gomez2020-08-121-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow users to add a comment when declaring a named set. Adds set output handling the comment in both nftables and json format. $ nft add table ip x $ nft add set ip x s {type ipv4_addr\; comment "some_addrs"\; elements = {1.1.1.1, 1.2.3.4}} $ nft list ruleset table ip x { set s { type ipv4_addr; comment "some_addrs" elements = { 1.1.1.1, 1.2.3.4 } } } $ nft --json list ruleset { "nftables": [ { "metainfo": { "json_schema_version": 1, "release_name": "Capital Idea #2", "version": "0.9.6" } }, { "table": { "family": "ip", "handle": 4857, "name": "x" } }, { "set": { "comment": "some_addrs", "elem": [ "1.1.1.1", "1.2.3.4" ], "family": "ip", "handle": 1, "name": "s", "table": "x", "type": "ipv4_addr" } } ] } Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for implicit chain bindingsPablo Neira Ayuso2020-07-151-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows you to group rules in a subchain, e.g. table inet x { chain y { type filter hook input priority 0; tcp dport 22 jump { ip saddr { 127.0.0.0/8, 172.23.0.0/16, 192.168.13.0/24 } accept ip6 saddr ::1/128 accept; } } } This also supports for the `goto' chain verdict. This patch adds a new chain binding list to avoid a chain list lookup from the delinearize path for the usual chains. This can be simplified later on with a single hashtable per table for all chains. From the shell, you have to use the explicit separator ';', in bash you have to escape this: # nft add rule inet x y tcp dport 80 jump { ip saddr 127.0.0.1 accept\; ip6 saddr ::1 accept \; } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: allow flowtable definitions with no devicesPablo Neira Ayuso2020-06-021-4/+6
| | | | | | | | | | | | | The listing shows no devices: # nft list ruleset table ip x { flowtable y { hook ingress priority filter } } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: delete devices to an existing flowtablePablo Neira Ayuso2020-06-021-0/+11
| | | | | | | | This patch allows you to remove a device to an existing flowtable: # nft delete flowtable x y { devices = { eth0 } \; } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: add devices to an existing flowtablePablo Neira Ayuso2020-06-021-5/+11
| | | | | | | | This patch allows you to add new devices to an existing flowtables. # nft add flowtable x y { devices = { eth0 } \; } Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>