| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When adding/deleting large number of elements in one step in ipset, it can
take a reasonable amount of time and can result in soft lockup errors. The
patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of
consecutive elements to add/delete") tried to fix it by limiting the max
elements to process at all. However it was not enough, it is still possible
that we get hung tasks. Lowering the limit is not reasonable, so the
approach in this patch is as follows: rely on the method used at resizing
sets and save the state when we reach a smaller internal batch limit,
unlock/lock and proceed from the saved state. Thus we can avoid long
continuous tasks and at the same time removed the limit to add/delete large
number of elements in one step.
The nfnl mutex is held during the whole operation which prevents one to issue
other ipset commands in parallel.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com
Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete")
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hash:net,port,net set type supports /0 subnets. However, the patch
commit 5f7b51bf09baca8e titled "netfilter: ipset: Limit the maximal range
of consecutive elements to add/delete" did not take into account it and
resulted in an endless loop. The bug is actually older but the patch
5f7b51bf09baca8e brings it out earlier.
Handle /0 subnets properly in hash:net,port,net set types.
Reported-by: Марк Коренберг <socketpair@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Use a more modern alternative to gzip.
Suggested-by: Jan Engelhardt <jengelh@inai.de>
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
| |
|
|
|
|
| |
hash:net,iface sets
|
|
|
|
|
|
|
| |
The patch "netfilter: ipset: enforce documented limit to prevent allocating
huge memory" was too strict and prevented to add up to 64 clashing elements
to a hash:net,iface type of set. This patch fixes the issue and now the type
behaves as documented.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hash:ip type had a test for netmask, add a similar test for bitmask
feature as well, and add another test where bitmask is not a valid
netmask.
Repeat the same three tests for hash:ip,port and hash:net,net.
Add a test to make sure bitmask and netmask options cannot be added at the
same time.
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
| |
We added bitmask support to hash:ip and added both netmask and bitmask
to hash:net,net and hash:ip,port
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
Create a new revision of hash:netnet and add support for bitmask
parameter. The set did not support netmask so we'll add both netmask and
bitmask.
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
Create a new revision of hash:ipport and add support for bitmask
parameter. The set did not support netmask so we'll add both netmask and
bitmask.
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Create a new revision of hash:ip and add support for bitmask parameter.
The set already had support for netmask so only add bitmask here.
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new parameter to complement the existing 'netmask' option. The
main difference between netmask and bitmask is that bitmask takes any
arbitrary ip address as input, it does not have to be a valid netmask.
The name of the new parameter is 'bitmask'. This lets us mask out
arbitrary bits in the ip address, for example:
ipset create set1 hash:ip bitmask 255.128.255.0
ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new parameter to complement the existing 'netmask' option. The
main difference between netmask and bitmask is that bitmask takes any
arbitrary ip address as input, it does not have to be a valid netmask.
The name of the new parameter is 'bitmask'. This lets us mask out
arbitrary bits in the ip address, for example:
ipset create set1 hash:ip bitmask 255.128.255.0
ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduced a regression: commit 48596a8ddc46 ("netfilter:
ipset: Fix adding an IPv4 range containing more than 2^31 addresses")
The variable e.ip is passed to adtfn() function which finally adds the
ip address to the set. The patch above refactored the for loop and moved
e.ip = htonl(ip) to the end of the for loop.
What this means is that if the value of "ip" changes between the first
assignement of e.ip and the forloop, then e.ip is pointing to a
different ip address than "ip".
Test case:
$ ipset create jdtest_tmp hash:ip family inet hashsize 2048 maxelem 100000
$ ipset add jdtest_tmp 10.0.1.1/31
ipset v6.21.1: Element cannot be added to the set: it's already added
The value of ip gets updated inside the "else if (tb[IPSET_ATTR_CIDR])"
block but e.ip is still pointing to the old value.
Reviewed-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for FORTIFY_SOURCE doing bounds-check on memcpy(),
switch from __nlmsg_put to nlmsg_put(), and explain the bounds check
for dealing with the memcpy() across a composite flexible array struct.
Avoids this future run-time warning:
memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16)
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: syzbot <syzkaller@googlegroups.com>
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220901071336.1418572-1-keescook@chromium.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
| |
And fix a typo committed by me in em_sched.c too.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are a couple of places in net/sched/ that check skb->protocol and act
on the value there. However, in the presence of VLAN tags, the value stored
in skb->protocol can be inconsistent based on whether VLAN acceleration is
enabled. The commit quoted in the Fixes tag below fixed the users of
skb->protocol to use a helper that will always see the VLAN ethertype.
However, most of the callers don't actually handle the VLAN ethertype, but
expect to find the IP header type in the protocol field. This means that
things like changing the ECN field, or parsing diffserv values, stops
working if there's a VLAN tag, or if there are multiple nested VLAN
tags (QinQ).
To fix this, change the helper to take an argument that indicates whether
the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
make sure to skip all of them, so behaviour is consistent even in QinQ
mode.
To make the helper usable from the ECN code, move it to if_vlan.h instead
of pkt_sched.h.
v3:
- Remove empty lines
- Move vlan variable definitions inside loop in skb_protocol()
- Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
bpf_skb_ecn_set_ce()
v2:
- Use eth_type_vlan() helper in skb_protocol()
- Also fix code that reads skb->protocol directly
- Change a couple of 'if/else if' statements to switch constructs to avoid
calling the helper twice
Reported-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
| |
em_sched.c was left out, fix it now.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the SPDX-License-Identifier tag has been added, the corresponding
license text has not been removed.
Remove it now.
Also, in xt_connmark.h, move the copyright text at the top of the file
which is a much more common pattern.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Daniel Xu reported that the hash:net,iface type of the ipset subsystem does
not limit adding the same network with different interfaces to a set, which
can lead to huge memory usage or allocation failure.
The quick reproducer is
$ ipset create ACL.IN.ALL_PERMIT hash:net,iface hashsize 1048576 timeout 0
$ for i in $(seq 0 100); do /sbin/ipset add ACL.IN.ALL_PERMIT 0.0.0.0/0,kaf_$i timeout 0 -exist; done
The backtrace when vmalloc fails:
[Tue Oct 25 00:13:08 2022] ipset: vmalloc error: size 1073741848, exceeds total pages
<...>
[Tue Oct 25 00:13:08 2022] Call Trace:
[Tue Oct 25 00:13:08 2022] <TASK>
[Tue Oct 25 00:13:08 2022] dump_stack_lvl+0x48/0x60
[Tue Oct 25 00:13:08 2022] warn_alloc+0x155/0x180
[Tue Oct 25 00:13:08 2022] __vmalloc_node_range+0x72a/0x760
[Tue Oct 25 00:13:08 2022] ? hash_netiface4_add+0x7c0/0xb20
[Tue Oct 25 00:13:08 2022] ? __kmalloc_large_node+0x4a/0x90
[Tue Oct 25 00:13:08 2022] kvmalloc_node+0xa6/0xd0
[Tue Oct 25 00:13:08 2022] ? hash_netiface4_resize+0x99/0x710
<...>
The fix is to enforce the limit documented in the ipset(8) manpage:
> The internal restriction of the hash:net,iface set type is that the same
> network prefix cannot be stored with more than 64 different interfaces
> in a single set.
Reported-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 7661809d493b426e979f39ab512e3adf41fbcc69
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed Jul 14 09:45:49 2021 -0700
mm: don't allow oversized kvmalloc() calls
limits the max allocatable memory via kvmalloc() to MAX_INT. Apply the
same limit in ipset.
Reported-by: syzbot+3493b1873fb3ea827986@syzkaller.appspotmail.com
Reported-by: syzbot+2b8443c35458a617c904@syzkaller.appspotmail.com
Reported-by: syzbot+ee5cb15f4a0e85e0d54e@syzkaller.appspotmail.com
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Executing /usr/sbin/ipset-translate results in the ipset functionality being run, rather than the ipset-translate functionality.
# ipset-translate destroy fred
This command is not supported, use `ipset-translate restore < file'
# /usr/sbin/ipset-translate destroy fred
ipset v7.15: The set with the given name does not exist
use basename() to resolve the issue.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1626
Signed-off-by: Quentin Armitage <quentin@armitage.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The parser assumes the set is an IPv4 ipset because IPSET_OPT_FAMILY is
not set.
# ipset-translate restore < ./ipset-mwan3_set_connected_ipv6.dump
add table inet global
add set inet global mwan3_connected_v6 { type ipv6_addr; flags interval; }
flush set inet global mwan3_connected_v6
ipset v7.15: Error in line 4: Syntax error: '64' is out of range 0-32
Remove ipset_xlate_type_get(), call ipset_xlate_set_get() instead to
obtain the set type and family.
Reported-by: Florian Eckert <fe@dev.tdt.de>
Fixes: 325af556cd3a ("add ipset to nftables translation infrastructure")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
originally reported in
https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/thread/ZIXKNQHSSCQ4ZLEGYYKLAXQ4PQ5EYFGZ/
by Larry Len Rainey
Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clang warns:
net/netfilter/ipset/ip_set_hash_ipportnet.c:249:29: warning: variable
'port_to' is uninitialized when used here [-Wuninitialized]
if (((u64)ip_to - ip + 1)*(port_to - port + 1) > IPSET_MAX_RANGE)
^~~~~~~
net/netfilter/ipset/ip_set_hash_ipportnet.c:167:45: note: initialize the
variable 'port_to' to silence this warning
u32 ip = 0, ip_to = 0, p = 0, port, port_to;
^
= 0
net/netfilter/ipset/ip_set_hash_ipportnet.c:249:39: warning: variable
'port' is uninitialized when used here [-Wuninitialized]
if (((u64)ip_to - ip + 1)*(port_to - port + 1) > IPSET_MAX_RANGE)
^~~~
net/netfilter/ipset/ip_set_hash_ipportnet.c:167:36: note: initialize the
variable 'port' to silence this warning
u32 ip = 0, ip_to = 0, p = 0, port, port_to;
^
= 0
2 warnings generated.
The range check was added before port and port_to are initialized.
Shuffle the check after the initialization so that the check works
properly.
Fixes: 7fb6c63025ff ("netfilter: ipset: Limit the maximal range of consecutive elements to
add/delete")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
| |
The number of hosts in a netblock must be a power of two,
so use shift instead of division.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
| |
A new function was not added to libipset.map at the previous release,
fix it. Reported by Jan Engelhardt.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
| |
Actually, this is the part of it which allows specifying protocols
by number :-)
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
| |
Avoid possible number overflows when calculating the number of
consecutive elements. Also, compute properly the consecutive
elements in the case of hash:net* types.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
| |
This allows us to optimise and reduce restore time by specifying
protocol numbers, especially for large ipsets.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
The range size of consecutive elements were not limited. Thus one
could define a huge range which may result soft lockup errors due
to the long execution time. Now the range size is limited to 2^20
entries. Reported by Brad Spengler.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This test checks that the translation from ipset to nftables is correct.
term$ cd tests/xlate
term$ ./runtest.sh
in case that the translation is not correct, it shows the diff with expected
translation output.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch provides the ipset-translate utility which allows you to
translate your existing ipset file to nftables.
The ipset-translate utility is actually a symlink to ipset, which checks
for 'argv[0] == ipset-translate' to exercise the translation path.
You can translate your ipset file through:
ipset-translate restore < sets.ipt
This patch reuses the existing parser and API to represent the sets and
the elements.
There is a new ipset_xlate_set dummy object that allows to store a
created set to fetch the type without interactions with the kernel.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Do not call restore() from ipset_parser(). Instead, ipset_parser()
returns the IPSET_CMD_RESTORE command and the caller invokes restore().
This patch comes in preparation for the ipset to nftables translation
infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
ipset_parse_argv() parses, builds and send the netlink messages to the
kernel. This patch extracts the parser and wrap it around the new
ipset_parser() function.
This patch comes is preparation for the ipset to nftables translation
infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
| |
Backport patch "netfilter: use nfnetlink_unicast()" from
Pablo Neira Ayuso <pablo@netfilter.org>.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
| |
Backport patch "netfilter: nfnetlink: consolidate callback type"
from Pablo Neira Ayuso <pablo@netfilter.org>.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|