summaryrefslogtreecommitdiffstats
path: root/kernel/net/netfilter
Commit message (Collapse)AuthorAgeFilesLines
* netfilter: ipset: Suppress false sparse warningsJozsef Kadlecsik2024-02-121-2/+2
| | | | | | | | Due to the code reorganization the functions in question now run by call_rcu(), not under rcu locking and pointer access. This produces false sparse warning which are suppressed by the patch. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: remove set destroy at ip_set module removalJozsef Kadlecsik2024-02-051-24/+3
| | | | | | | | | The ip_set module can only be removed when all set module type modules are already removed. A set type module can only be removed when all sets belonging to the given type are already removed. So it is not possible that there's any set defined at ip_set module removal. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Cleanup the code of destroy operation and explain the two ↵Jozsef Kadlecsik2024-02-051-11/+33
| | | | | | stages in comments Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Missing gc cancellations fixedJozsef Kadlecsik2024-02-042-2/+4
| | | | | | | | | | | | | | | | | | The patch fdb8e12cc2cc ("netfilter: ipset: fix performance regression in swap operation") missed to add the calls to gc cancellations at the error path of create operations and at module unload. Also, because the half of the destroy operations now executed by a function registered by call_rcu(), neither NFNL_SUBSYS_IPSET mutex or rcu read lock is held and therefore the checking of them results false warnings. Reported-by: syzbot+52bbc0ad036f6f0d4a25@syzkaller.appspotmail.com Reported-by: Brad Spengler <spender@grsecurity.net> Reported-by: Стас Ничипорович <stasn77@gmail.com> Fixes: fdb8e12cc2cc ("netfilter: ipset: fix performance regression in swap operation") Tested-by: Brad Spengler <spender@grsecurity.net> Tested-by: Стас Ничипорович <stasn77@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* treewide: Convert del_timer*() to timer_shutdown*()Steven Rostedt (Google)2024-01-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to several bugs caused by timers being re-armed after they are shutdown and just before they are freed, a new state of timers was added called "shutdown". After a timer is set to this state, then it can no longer be re-armed. The following script was run to find all the trivial locations where del_timer() or del_timer_sync() is called in the same function that the object holding the timer is freed. It also ignores any locations where the timer->function is modified between the del_timer*() and the free(), as that is not considered a "trivial" case. This was created by using a coccinelle script and the following commands: $ cat timer.cocci @@ expression ptr, slab; identifier timer, rfield; @@ ( - del_timer(&ptr->timer); + timer_shutdown(&ptr->timer); | - del_timer_sync(&ptr->timer); + timer_shutdown_sync(&ptr->timer); ) ... when strict when != ptr->timer ( kfree_rcu(ptr, rfield); | kmem_cache_free(slab, ptr); | kfree(ptr); ) $ spatch timer.cocci . > /tmp/t.patch $ patch -p1 < /tmp/t.patch Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ] Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ] Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ] Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: fix race condition between swap/destroy and kernel side ↵Jozsef Kadlecsik2024-01-294-19/+61
| | | | | | | | | | | | | | | | | | | | | | | add/del/test v4 The patch "netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test", commit 28628fa9 fixes a race condition. But the synchronize_rcu() added to the swap function unnecessarily slows it down: it can safely be moved to destroy and use call_rcu() instead. Eric Dumazet pointed out that simply calling the destroy functions as rcu callback does not work: sets with timeout use garbage collectors which need cancelling at destroy which can wait. Therefore the destroy functions are split into two: cancelling garbage collectors safely at executing the command received by netlink and moving the remaining part only into the rcu callback. Link: https://lore.kernel.org/lkml/C0829B10-EAA6-4809-874E-E1E9C05A8D84@automattic.com/ Fixes: 28628fa952fe ("netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test") Reported-by: Ale Crismani <ale.crismani@automattic.com> Reported-by: David Wang <00107082@163.com> Tested-by: David Wang <00107082@163.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: fix race condition between swap/destroy and kernel side ↵Jozsef Kadlecsik2023-12-111-24/+6
| | | | | | | | | | add/del/test v3 Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair. So there's no need to extend the rcu read locked area in ipset itself. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: fix race condition between swap/destroy and kernel side ↵Jozsef Kadlecsik2023-11-041-3/+3
| | | | | | | | | add/del/test v2 synchronize_rcu() is moved into ip_set_swap() in order not to burden ip_set_destroy() unnecessarily when all sets are destroyed Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: fix race condition between swap/destroy and kernel side ↵Jozsef Kadlecsik2023-10-191-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | add/del/test Linkui Xiao reported that there's a race condition when ipset swap and destroy is called, which can lead to crash in add/del/test element operations. Swap then destroy are usual operations to replace a set with another one in a production system. The issue can in some cases be reproduced with the script: ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576 ipset add hash_ip1 172.20.0.0/16 ipset add hash_ip1 192.168.0.0/16 iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT while [ 1 ] do # ... Ongoing traffic... ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576 ipset add hash_ip2 172.20.0.0/16 ipset swap hash_ip1 hash_ip2 ipset destroy hash_ip2 sleep 0.05 done In the race case the possible order of the operations are CPU0 CPU1 ip_set_test ipset swap hash_ip1 hash_ip2 ipset destroy hash_ip2 hash_net_kadt Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy removed it, hash_net_kadt crashes. The fix is to protect both the list of the sets and the set pointers in an extended RCU region and before calling destroy, wait to finish all started rcu_read_lock(). The first version of the patch was written by Linkui Xiao <xiaolinkui@kylinos.cn>. Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/ Reported by: Linkui Xiao <xiaolinkui@kylinos.cn> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAPJozsef Kadlecsik2023-09-181-2/+10
| | | | | | | | | | | | | Kyle Zeng reported that there is a race between IPSET_CMD_ADD and IPSET_CMD_SWAP in netfilter/ip_set, which can lead to the invocation of `__ip_set_put` on a wrong `set`, triggering the `BUG_ON(set->ref == 0);` check in it. The race is caused by using the wrong reference counter, i.e. the ref counter instead of ref_netlink. Reported-by: Kyle Zeng <zengyhkyle@gmail.com> Tested-by: Kyle Zeng <zengyhkyle@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ↵Kyle Zeng2023-09-181-0/+1
| | | | | | | | | | | | | | | | | | ip_set_hash_netportnet.c The missing IP_SET_HASH_WITH_NET0 macro in ip_set_hash_netportnet can lead to the use of wrong `CIDR_POS(c)` for calculating array offsets, which can lead to integer underflow. As a result, it leads to slab out-of-bound access. This patch adds back the IP_SET_HASH_WITH_NET0 macro to ip_set_hash_netportnet to address the issue. Fixes: 886503f34d63 ("netfilter: ipset: actually allow allowable CIDR 0 in hash:net,port,net") Suggested-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: refactor deprecated strncpyJustin Stitt2023-09-181-6/+6
| | | | | | | | | Use `strscpy_pad` instead of `strncpy`. Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* netfilter: ipset: remove rcu_read_lock_bh pair from ip_set_testFlorian Westphal2023-09-181-2/+0
| | | | | | | | | | | | Callers already hold rcu_read_lock. Prior to RCU conversion this used to be a read_lock_bh(), but now the bh-disable isn't needed anymore. Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Replace strlcpy with strscpyAzeem Shaikh2023-09-181-5/+5
| | | | | | | | | | | | | | | | | | | | | | | strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). Direct replacement is safe here since return value from all callers of STRLCPY macro were ignored. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230613003437.3538694-1-azeemshaikh38@gmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Add schedule point in call_ad().Kuniyuki Iwashima2023-09-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzkaller found a repro that causes Hung Task [0] with ipset. The repro first creates an ipset and then tries to delete a large number of IPs from the ipset concurrently: IPSET_ATTR_IPADDR_IPV4 : 172.20.20.187 IPSET_ATTR_CIDR : 2 The first deleting thread hogs a CPU with nfnl_lock(NFNL_SUBSYS_IPSET) held, and other threads wait for it to be released. Previously, the same issue existed in set->variant->uadt() that could run so long under ip_set_lock(set). Commit 5e29dc36bd5e ("netfilter: ipset: Rework long task execution when adding/deleting entries") tried to fix it, but the issue still exists in the caller with another mutex. While adding/deleting many IPs, we should release the CPU periodically to prevent someone from abusing ipset to hang the system. Note we need to increment the ipset's refcnt to prevent the ipset from being destroyed while rescheduling. [0]: INFO: task syz-executor174:268 blocked for more than 143 seconds. Not tainted 6.4.0-rc1-00145-gba79e9a73284 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor174 state:D stack:0 pid:268 ppid:260 flags:0x0000000d Call trace: __switch_to+0x308/0x714 arch/arm64/kernel/process.c:556 context_switch kernel/sched/core.c:5343 [inline] __schedule+0xd84/0x1648 kernel/sched/core.c:6669 schedule+0xf0/0x214 kernel/sched/core.c:6745 schedule_preempt_disabled+0x58/0xf0 kernel/sched/core.c:6804 __mutex_lock_common kernel/locking/mutex.c:679 [inline] __mutex_lock+0x6fc/0xdb0 kernel/locking/mutex.c:747 __mutex_lock_slowpath+0x14/0x20 kernel/locking/mutex.c:1035 mutex_lock+0x98/0xf0 kernel/locking/mutex.c:286 nfnl_lock net/netfilter/nfnetlink.c:98 [inline] nfnetlink_rcv_msg+0x480/0x70c net/netfilter/nfnetlink.c:295 netlink_rcv_skb+0x1c0/0x350 net/netlink/af_netlink.c:2546 nfnetlink_rcv+0x18c/0x199c net/netfilter/nfnetlink.c:658 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x664/0x8cc net/netlink/af_netlink.c:1365 netlink_sendmsg+0x6d0/0xa4c net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x4b8/0x810 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1f8/0x2a4 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2593 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x84/0x270 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x134/0x24c arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193 el0_svc+0x2c/0x7c arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Reported-by: syzkaller <syzkaller@googlegroups.com> Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* net: Kconfig: fix spellosRandy Dunlap2023-09-181-1/+1
| | | | | | | | | | | | | | | | Fix spelling in net/ Kconfig files. (reported by codespell) Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: coreteam@netfilter.org Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Link: https://lore.kernel.org/r/20230124181724.18166-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function.Gavrilov Ilia2023-01-281-2/+2
| | | | | | | | | | | | | | | | When first_ip is 0, last_ip is 0xFFFFFFFF, and netmask is 31, the value of an arithmetic expression 2 << (netmask - mask_bits - 1) is subject to overflow due to a failure casting operands to a larger data type before performing the arithmetic. Note that it's harmless since the value will be checked at the next step. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: b9fed748185a ("netfilter: ipset: Check and reject crazy /0 input parameters") Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Rework long task execution when adding/deleting entriesJozsef Kadlecsik2022-12-3010-80/+67
| | | | | | | | | | | | | | | | | | | | | When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. The patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") tried to fix it by limiting the max elements to process at all. However it was not enough, it is still possible that we get hung tasks. Lowering the limit is not reasonable, so the approach in this patch is as follows: rely on the method used at resizing sets and save the state when we reach a smaller internal batch limit, unlock/lock and proceed from the saved state. Thus we can avoid long continuous tasks and at the same time removed the limit to add/delete large number of elements in one step. The nfnl mutex is held during the whole operation which prevents one to issue other ipset commands in parallel. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete")
* netfilter: ipset: fix hash:net,port,net hang with /0 subnetJozsef Kadlecsik2022-12-301-19/+21
| | | | | | | | | | | | | The hash:net,port,net set type supports /0 subnets. However, the patch commit 5f7b51bf09baca8e titled "netfilter: ipset: Limit the maximal range of consecutive elements to add/delete" did not take into account it and resulted in an endless loop. The bug is actually older but the patch 5f7b51bf09baca8e brings it out earlier. Handle /0 subnets properly in hash:net,port,net set types. Reported-by: Марк Коренберг <socketpair@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: restore allowing 64 clashing elements in hash:net,ifaceJozsef Kadlecsik2022-11-211-1/+1
| | | | | | | The patch "netfilter: ipset: enforce documented limit to prevent allocating huge memory" was too strict and prevented to add up to 64 clashing elements to a hash:net,iface type of set. This patch fixes the issue and now the type behaves as documented.
* netfilter: ipset: Add support for new bitmask parameterVishwanath Pai2022-11-204-26/+114
| | | | | | | | | | | | | | | Add a new parameter to complement the existing 'netmask' option. The main difference between netmask and bitmask is that bitmask takes any arbitrary ip address as input, it does not have to be a valid netmask. The name of the new parameter is 'bitmask'. This lets us mask out arbitrary bits in the ip address, for example: ipset create set1 hash:ip bitmask 255.128.255.0 ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80 Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: regression in ip_set_hash_ip.cVishwanath Pai2022-11-071-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | This patch introduced a regression: commit 48596a8ddc46 ("netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses") The variable e.ip is passed to adtfn() function which finally adds the ip address to the set. The patch above refactored the for loop and moved e.ip = htonl(ip) to the end of the for loop. What this means is that if the value of "ip" changes between the first assignement of e.ip and the forloop, then e.ip is pointing to a different ip address than "ip". Test case: $ ipset create jdtest_tmp hash:ip family inet hashsize 2048 maxelem 100000 $ ipset add jdtest_tmp 10.0.1.1/31 ipset v6.21.1: Element cannot be added to the set: it's already added The value of ip gets updated inside the "else if (tb[IPSET_ATTR_CIDR])" block but e.ip is still pointing to the old value. Reviewed-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: move from strlcpy with unused retval to strscpyWolfram Sang2022-11-071-2/+2
| | | | | | | | | | | Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Florian Westphal <fw@strlen.de>
* netlink: Bounds-check struct nlmsgerr creationKees Cook2022-11-071-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | In preparation for FORTIFY_SOURCE doing bounds-check on memcpy(), switch from __nlmsg_put to nlmsg_put(), and explain the bounds check for dealing with the memcpy() across a composite flexible array struct. Avoids this future run-time warning: memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16) Cc: Jakub Kicinski <kuba@kernel.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: syzbot <syzkaller@googlegroups.com> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220901071336.1418572-1-keescook@chromium.org Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: ipset: enforce documented limit to prevent allocating huge memoryJozsef Kadlecsik2022-11-071-24/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Xu reported that the hash:net,iface type of the ipset subsystem does not limit adding the same network with different interfaces to a set, which can lead to huge memory usage or allocation failure. The quick reproducer is $ ipset create ACL.IN.ALL_PERMIT hash:net,iface hashsize 1048576 timeout 0 $ for i in $(seq 0 100); do /sbin/ipset add ACL.IN.ALL_PERMIT 0.0.0.0/0,kaf_$i timeout 0 -exist; done The backtrace when vmalloc fails: [Tue Oct 25 00:13:08 2022] ipset: vmalloc error: size 1073741848, exceeds total pages <...> [Tue Oct 25 00:13:08 2022] Call Trace: [Tue Oct 25 00:13:08 2022] <TASK> [Tue Oct 25 00:13:08 2022] dump_stack_lvl+0x48/0x60 [Tue Oct 25 00:13:08 2022] warn_alloc+0x155/0x180 [Tue Oct 25 00:13:08 2022] __vmalloc_node_range+0x72a/0x760 [Tue Oct 25 00:13:08 2022] ? hash_netiface4_add+0x7c0/0xb20 [Tue Oct 25 00:13:08 2022] ? __kmalloc_large_node+0x4a/0x90 [Tue Oct 25 00:13:08 2022] kvmalloc_node+0xa6/0xd0 [Tue Oct 25 00:13:08 2022] ? hash_netiface4_resize+0x99/0x710 <...> The fix is to enforce the limit documented in the ipset(8) manpage: > The internal restriction of the hash:net,iface set type is that the same > network prefix cannot be stored with more than 64 different interfaces > in a single set. Reported-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Fix oversized kvmalloc() callsJozsef Kadlecsik2022-10-171-2/+2
| | | | | | | | | | | | | | | | commit 7661809d493b426e979f39ab512e3adf41fbcc69 Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed Jul 14 09:45:49 2021 -0700 mm: don't allow oversized kvmalloc() calls limits the max allocatable memory via kvmalloc() to MAX_INT. Apply the same limit in ipset. Reported-by: syzbot+3493b1873fb3ea827986@syzkaller.appspotmail.com Reported-by: syzbot+2b8443c35458a617c904@syzkaller.appspotmail.com Reported-by: syzbot+ee5cb15f4a0e85e0d54e@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Fix maximal range check in hash_ipportnet4_uadt()Nathan Chancellor2021-08-031-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clang warns: net/netfilter/ipset/ip_set_hash_ipportnet.c:249:29: warning: variable 'port_to' is uninitialized when used here [-Wuninitialized] if (((u64)ip_to - ip + 1)*(port_to - port + 1) > IPSET_MAX_RANGE) ^~~~~~~ net/netfilter/ipset/ip_set_hash_ipportnet.c:167:45: note: initialize the variable 'port_to' to silence this warning u32 ip = 0, ip_to = 0, p = 0, port, port_to; ^ = 0 net/netfilter/ipset/ip_set_hash_ipportnet.c:249:39: warning: variable 'port' is uninitialized when used here [-Wuninitialized] if (((u64)ip_to - ip + 1)*(port_to - port + 1) > IPSET_MAX_RANGE) ^~~~ net/netfilter/ipset/ip_set_hash_ipportnet.c:167:36: note: initialize the variable 'port' to silence this warning u32 ip = 0, ip_to = 0, p = 0, port, port_to; ^ = 0 2 warnings generated. The range check was added before port and port_to are initialized. Shuffle the check after the initialization so that the check works properly. Fixes: 7fb6c63025ff ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* 64bit division isn't allowed on 32bit, replace it with shiftJozsef Kadlecsik2021-07-281-1/+2
| | | | | | | The number of hosts in a netblock must be a power of two, so use shift instead of division. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Limit the maximal range of consecutive elements to add/delete fixJozsef Kadlecsik2021-07-1610-19/+70
| | | | | | | | Avoid possible number overflows when calculating the number of consecutive elements. Also, compute properly the consecutive elements in the case of hash:net* types. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Limit the maximal range of consecutive elements to add/deleteJozsef Kadlecsik2021-07-149-0/+26
| | | | | | | | | The range size of consecutive elements were not limited. Thus one could define a huge range which may result soft lockup errors due to the long execution time. Now the range size is limited to 2^20 entries. Reported by Brad Spengler. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Backport "netfilter: use nfnetlink_unicast()"Jozsef Kadlecsik2021-06-261-40/+10
| | | | | | | Backport patch "netfilter: use nfnetlink_unicast()" from Pablo Neira Ayuso <pablo@netfilter.org>. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Backport "netfilter: nfnetlink: consolidate callback type"Jozsef Kadlecsik2021-06-261-0/+16
| | | | | | | Backport patch "netfilter: nfnetlink: consolidate callback type" from Pablo Neira Ayuso <pablo@netfilter.org>. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Backport "netfilter: nfnetlink: add struct nfnl_info and pass it to callbacks"Jozsef Kadlecsik2021-06-261-52/+71
| | | | | | | Backport patch "netfilter: nfnetlink: add struct nfnl_info and pass it to callbacks" from Pablo Neira Ayuso <pablo@netfilter.org>. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Backport "netfilter: add helper function to set up the nfnetlink header and ↵Jozsef Kadlecsik2021-06-261-14/+3
| | | | | | | | | use it" Backport patch "netfilter: add helper function to set up the nfnetlink header and use it" from Pablo Neira Ayuso <pablo@netfilter.org>. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Fix patch "Handle false warning from -Wstringop-overflow"Jozsef Kadlecsik2020-12-201-1/+1
| | | | | | Return code of strscpy() was not handled properly. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* treewide: rename nla_strlcpy to nla_strscpy.Francis Laniel2020-12-201-2/+2
| | | | | | | | | Calls to nla_strlcpy are now replaced by calls to nla_strscpy which is the new name of this function. Signed-off-by: Francis Laniel <laniel_francis@privacyrequired.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* netfilter: ipset: fix shift-out-of-bounds in htable_bits()Vasily Averin2020-12-201-19/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | htable_bits() can call jhash_size(32) and trigger shift-out-of-bounds UBSAN: shift-out-of-bounds in net/netfilter/ipset/ip_set_hash_gen.h:151:6 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 0 PID: 8498 Comm: syz-executor519 Not tainted 5.10.0-rc7-next-20201208-syzkaller #0 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395 htable_bits net/netfilter/ipset/ip_set_hash_gen.h:151 [inline] hash_mac_create.cold+0x58/0x9b net/netfilter/ipset/ip_set_hash_gen.h:1524 ip_set_create+0x610/0x1380 net/netfilter/ipset/ip_set_core.c:1115 nfnetlink_rcv_msg+0xecc/0x1180 net/netfilter/nfnetlink.c:252 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494 nfnetlink_rcv+0x1ac/0x420 net/netfilter/nfnetlink.c:600 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2345 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2432 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This patch replaces htable_bits() by simple fls(hashsize - 1) call: it alone returns valid nbits both for round and non-round hashsizes. It is normal to set any nbits here because it is validated inside following htable_size() call which returns 0 for nbits>31. Fixes: 1feab10d7e6d("netfilter: ipset: Unified hash type generation") Reported-by: syzbot+d66bfadebca46cf61a2b@syzkaller.appspotmail.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: fixes possible oops in mtype_resizeVasily Averin2020-12-201-9/+17
| | | | | | | | | | | | | | | | | | currently mtype_resize() can cause oops t = ip_set_alloc(htable_size(htable_bits)); if (!t) { ret = -ENOMEM; goto out; } t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits)); Increased htable_bits can force htable_size() to return 0. In own turn ip_set_alloc(0) returns not 0 but ZERO_SIZE_PTR, so follwoing access to t->hregion should trigger an OOPS. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Handle false warning from -Wstringop-overflowJozsef Kadlecsik2020-12-141-1/+1
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Move compiler specific compatibility support to separated fileJozsef Kadlecsik2020-12-071-0/+1
| | | | | | Kernel compatibility support was broken in 7.9, reported by Ed W. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Complete backward compatibility fix for package copy of <linux/jhash.h>Jozsef Kadlecsik2020-11-191-2/+0
| | | | | | | An unnecessary condition prevented to compile pfxlen.c with the patch 202cfef66b3a1e0988d applied, it's fixed. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: enable memory accounting for ipset allocationsVasily Averin2020-11-191-16/+1
| | | | | | | | | | | | Currently netadmin inside non-trusted container can quickly allocate whole node's memory via request of huge ipset hashtable. Other ipset-related memory allocations should be restricted too. v2: fixed typo ALLOC -> ACCOUNT Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: prevent uninit-value in hash_ip6_addEric Dumazet2020-11-191-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot found that we are not validating user input properly before copying 16 bytes [1]. Using NLA_BINARY in ipaddr_policy[] for IPv6 address is not correct, since it ensures at most 16 bytes were provided. We should instead make sure user provided exactly 16 bytes. In old kernels (before v4.20), fix would be to remove the NLA_BINARY, since NLA_POLICY_EXACT_LEN() was not yet available. [1] BUG: KMSAN: uninit-value in hash_ip6_add+0x1cba/0x3a50 net/netfilter/ipset/ip_set_hash_gen.h:892 CPU: 1 PID: 11611 Comm: syz-executor.0 Not tainted 5.10.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197 hash_ip6_add+0x1cba/0x3a50 net/netfilter/ipset/ip_set_hash_gen.h:892 hash_ip6_uadt+0x976/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:267 call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720 ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808 ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833 nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252 netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494 nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x6d5/0x830 net/socket.c:2440 __do_sys_sendmsg net/socket.c:2449 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45deb9 Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fe2e503fc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000029ec0 RCX: 000000000045deb9 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003 RBP: 000000000118bf60 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c R13: 000000000169fb7f R14: 00007fe2e50409c0 R15: 000000000118bf2c Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline] kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:289 __msan_chain_origin+0x57/0xa0 mm/kmsan/kmsan_instr.c:147 ip6_netmask include/linux/netfilter/ipset/pfxlen.h:49 [inline] hash_ip6_netmask net/netfilter/ipset/ip_set_hash_ip.c:185 [inline] hash_ip6_uadt+0xb1c/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:263 call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720 ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808 ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833 nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252 netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494 nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x6d5/0x830 net/socket.c:2440 __do_sys_sendmsg net/socket.c:2449 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline] kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:289 kmsan_memcpy_memmove_metadata+0x25e/0x2d0 mm/kmsan/kmsan.c:226 kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:246 __msan_memcpy+0x46/0x60 mm/kmsan/kmsan_instr.c:110 ip_set_get_ipaddr6+0x2cb/0x370 net/netfilter/ipset/ip_set_core.c:310 hash_ip6_uadt+0x439/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:255 call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720 ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808 ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833 nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252 netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494 nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x6d5/0x830 net/socket.c:2440 __do_sys_sendmsg net/socket.c:2449 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline] kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104 kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76 slab_alloc_node mm/slub.c:2906 [inline] __kmalloc_node_track_caller+0xc61/0x15f0 mm/slub.c:4512 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x309/0xae0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1094 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline] netlink_sendmsg+0xdb8/0x1840 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x6d5/0x830 net/socket.c:2440 __do_sys_sendmsg net/socket.c:2449 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Compatibility: Check for the fourth arg of list_for_each_entry_rcu()Jozsef Kadlecsik2020-11-191-2/+2
| | | | | | | A forth argument of list_for_each_entry_rcu() was introduced, handle the compatibility issue. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Expose the initval hash parameter to userspaceJozsef Kadlecsik2020-09-2113-16/+33
| | | | | | It makes possible to reproduce exactly the same set after a save/restore. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Add bucketsize parameter to all hash typesJozsef Kadlecsik2020-09-2114-39/+71
| | | | | | | | | The parameter defines the upper limit in any hash bucket at adding new entries from userspace - if the limit would be exceeded, ipset doubles the hash size and rehashes. It means the set may consume more memory but gives faster evaluation at matching in the set. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Support the -exist flag with the destroy commandJozsef Kadlecsik2020-09-201-1/+3
| | | | | | | | The -exist flag was supported with the create, add and delete commands. In order to gracefully handle the destroy command with nonexistent sets, the -exist flag is added to destroy too. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: Use fallthrough pseudo-keywordGustavo A. R. Silva2020-09-201-1/+1
| | | | | | | | | | | | Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: Replace zero-length array with flexible-array memberGustavo A. R. Silva2020-09-204-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] Lastly, fix checkpatch.pl warning WARNING: __aligned(size) is preferred over __attribute__((aligned(size))) in net/bridge/netfilter/ebtables.c This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: call ip_set_free() instead of kfree()Eric Dumazet2020-09-204-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever ip_set_alloc() is used, allocated memory can either use kmalloc() or vmalloc(). We should call kvfree() or ip_set_free() invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 21935 Comm: syz-executor.3 Not tainted 5.8.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__phys_addr+0xa7/0x110 arch/x86/mm/physaddr.c:28 Code: 1d 7a 09 4c 89 e3 31 ff 48 d3 eb 48 89 de e8 d0 58 3f 00 48 85 db 75 0d e8 26 5c 3f 00 4c 89 e0 5b 5d 41 5c c3 e8 19 5c 3f 00 <0f> 0b e8 12 5c 3f 00 48 c7 c0 10 10 a8 89 48 ba 00 00 00 00 00 fc RSP: 0000:ffffc900018572c0 EFLAGS: 00010046 RAX: 0000000000040000 RBX: 0000000000000001 RCX: ffffc9000fac3000 RDX: 0000000000040000 RSI: ffffffff8133f437 RDI: 0000000000000007 RBP: ffffc90098aff000 R08: 0000000000000000 R09: ffff8880ae636cdb R10: 0000000000000000 R11: 0000000000000000 R12: 0000408018aff000 R13: 0000000000080000 R14: 000000000000001d R15: ffffc900018573d8 FS: 00007fc540c66700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc9dcd67200 CR3: 0000000059411000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: virt_to_head_page include/linux/mm.h:841 [inline] virt_to_cache mm/slab.h:474 [inline] kfree+0x77/0x2c0 mm/slab.c:3749 hash_net_create+0xbb2/0xd70 net/netfilter/ipset/ip_set_hash_gen.h:1536 ip_set_create+0x6a2/0x13c0 net/netfilter/ipset/ip_set_core.c:1128 nfnetlink_rcv_msg+0xbe8/0xea0 net/netfilter/nfnetlink.c:230 netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2469 nfnetlink_rcv+0x1ac/0x420 net/netfilter/nfnetlink.c:564 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2352 ___sys_sendmsg+0xf3/0x170 net/socket.c:2406 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45cb19 Code: Bad RIP value. RSP: 002b:00007fc540c65c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004fed80 RCX: 000000000045cb19 RDX: 0000000000000000 RSI: 0000000020001080 RDI: 0000000000000003 RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 000000000000095e R14: 00000000004cc295 R15: 00007fc540c666d4 Fixes: f66ee0410b1c ("netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports") Fixes: 03c8b234e61a ("netfilter: ipset: Generalize extensions support") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>