summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ipset 7.18 releasedv7.18Jozsef Kadlecsik2023-09-193-1/+28
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Add json output to list commandThomas Oberhammer2023-09-186-7/+97
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAPJozsef Kadlecsik2023-09-181-2/+10
| | | | | | | | | | | | | Kyle Zeng reported that there is a race between IPSET_CMD_ADD and IPSET_CMD_SWAP in netfilter/ip_set, which can lead to the invocation of `__ip_set_put` on a wrong `set`, triggering the `BUG_ON(set->ref == 0);` check in it. The race is caused by using the wrong reference counter, i.e. the ref counter instead of ref_netlink. Reported-by: Kyle Zeng <zengyhkyle@gmail.com> Tested-by: Kyle Zeng <zengyhkyle@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ↵Kyle Zeng2023-09-181-0/+1
| | | | | | | | | | | | | | | | | | ip_set_hash_netportnet.c The missing IP_SET_HASH_WITH_NET0 macro in ip_set_hash_netportnet can lead to the use of wrong `CIDR_POS(c)` for calculating array offsets, which can lead to integer underflow. As a result, it leads to slab out-of-bound access. This patch adds back the IP_SET_HASH_WITH_NET0 macro to ip_set_hash_netportnet to address the issue. Fixes: 886503f34d63 ("netfilter: ipset: actually allow allowable CIDR 0 in hash:net,port,net") Suggested-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* compatibility: handle strscpy_pad()Jozsef Kadlecsik2023-09-182-0/+26
|
* netfilter: ipset: refactor deprecated strncpyJustin Stitt2023-09-181-6/+6
| | | | | | | | | Use `strscpy_pad` instead of `strncpy`. Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* netfilter: ipset: remove rcu_read_lock_bh pair from ip_set_testFlorian Westphal2023-09-181-2/+0
| | | | | | | | | | | | Callers already hold rcu_read_lock. Prior to RCU conversion this used to be a read_lock_bh(), but now the bh-disable isn't needed anymore. Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Replace strlcpy with strscpyAzeem Shaikh2023-09-181-5/+5
| | | | | | | | | | | | | | | | | | | | | | | strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). Direct replacement is safe here since return value from all callers of STRLCPY macro were ignored. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230613003437.3538694-1-azeemshaikh38@gmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Add schedule point in call_ad().Kuniyuki Iwashima2023-09-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzkaller found a repro that causes Hung Task [0] with ipset. The repro first creates an ipset and then tries to delete a large number of IPs from the ipset concurrently: IPSET_ATTR_IPADDR_IPV4 : 172.20.20.187 IPSET_ATTR_CIDR : 2 The first deleting thread hogs a CPU with nfnl_lock(NFNL_SUBSYS_IPSET) held, and other threads wait for it to be released. Previously, the same issue existed in set->variant->uadt() that could run so long under ip_set_lock(set). Commit 5e29dc36bd5e ("netfilter: ipset: Rework long task execution when adding/deleting entries") tried to fix it, but the issue still exists in the caller with another mutex. While adding/deleting many IPs, we should release the CPU periodically to prevent someone from abusing ipset to hang the system. Note we need to increment the ipset's refcnt to prevent the ipset from being destroyed while rescheduling. [0]: INFO: task syz-executor174:268 blocked for more than 143 seconds. Not tainted 6.4.0-rc1-00145-gba79e9a73284 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor174 state:D stack:0 pid:268 ppid:260 flags:0x0000000d Call trace: __switch_to+0x308/0x714 arch/arm64/kernel/process.c:556 context_switch kernel/sched/core.c:5343 [inline] __schedule+0xd84/0x1648 kernel/sched/core.c:6669 schedule+0xf0/0x214 kernel/sched/core.c:6745 schedule_preempt_disabled+0x58/0xf0 kernel/sched/core.c:6804 __mutex_lock_common kernel/locking/mutex.c:679 [inline] __mutex_lock+0x6fc/0xdb0 kernel/locking/mutex.c:747 __mutex_lock_slowpath+0x14/0x20 kernel/locking/mutex.c:1035 mutex_lock+0x98/0xf0 kernel/locking/mutex.c:286 nfnl_lock net/netfilter/nfnetlink.c:98 [inline] nfnetlink_rcv_msg+0x480/0x70c net/netfilter/nfnetlink.c:295 netlink_rcv_skb+0x1c0/0x350 net/netlink/af_netlink.c:2546 nfnetlink_rcv+0x18c/0x199c net/netfilter/nfnetlink.c:658 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x664/0x8cc net/netlink/af_netlink.c:1365 netlink_sendmsg+0x6d0/0xa4c net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x4b8/0x810 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1f8/0x2a4 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2593 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x84/0x270 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x134/0x24c arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193 el0_svc+0x2c/0x7c arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Reported-by: syzkaller <syzkaller@googlegroups.com> Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* net: Kconfig: fix spellosRandy Dunlap2023-09-181-1/+1
| | | | | | | | | | | | | | | | Fix spelling in net/ Kconfig files. (reported by codespell) Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: coreteam@netfilter.org Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Link: https://lore.kernel.org/r/20230124181724.18166-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* tests: hash:ip,port.t: Replace VRRP by GRE protocolPhil Sutter2023-03-102-5/+5
| | | | | | | | Some systems may not have "vrrp" as alias to "carp" yet, so use a protocol which is less likely to cause problems for testing purposes. Fixes: a67aa712ed912 ("tests: hash:ip,port.t: 'vrrp' is printed as 'carp'") Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: hash:ip,port.t: 'vrrp' is printed as 'carp'Phil Sutter2023-03-101-1/+1
| | | | | | | | | | | | % grep vrrp /etc/protocols | carp 112 CARP vrrp # Common Address Redundancy Protocol Nowadays, carp seems to be the preferred name for protocol 112. Simply change the expected output for lack of idea for a backwards compatible change which is not simply using another protocol. Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: cidr.sh: Add ipcalc fallbackPhil Sutter2023-03-101-4/+28
| | | | | | If netmask is not available, ipcalc may be a viable replacement. Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: xlate: Make test input validPhil Sutter2023-03-103-5/+15
| | | | | | | | | | | | | Make sure ipset at least accepts the test input by running it against plain ipset once for sanity. This exposed two issues: * Set 'hip5' doesn't have comment support, so add the commented elements to 'hip6' instead (likely a typo). * Set 'bip1' range 2.0.0.1-2.1.0.1 exceeds the max allowed for bitmap sets. Reduce it accordingly. Fixes: 7587d1c4b5465 ("tests: add tests ipset to nftables") Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: xlate: Test built binary by defaultPhil Sutter2023-03-102-6/+3
| | | | | | | | | Testing the host's iptables-translate by default is unintuitive. Since the ipset-translate symlink is created upon 'make install', add a local symlink to the repository pointing at a built binary in src/. Using this by default is consistent with the regular testsuite. Signed-off-by: Phil Sutter <phil@nwl.cc>
* xlate: Drop dead codePhil Sutter2023-03-101-3/+0
| | | | | | | | Set type is not needed when manipulating elements, the assigned variable was unused in that case. Fixes: 325af556cd3a6 ("add ipset to nftables translation infrastructure") Signed-off-by: Phil Sutter <phil@nwl.cc>
* xlate: Fix for fd leak in error pathPhil Sutter2023-03-101-1/+1
| | | | | | | A rather cosmetic issue though, the program will terminate anyway. Fixes: 325af556cd3a6 ("add ipset to nftables translation infrastructure") Signed-off-by: Phil Sutter <phil@nwl.cc>
* configure.ac: fix bashismsSam James2023-01-281-3/+3
| | | | | | | | | | | | | configure scripts need to be runnable with a POSIX-compliant /bin/sh. On many (but not all!) systems, /bin/sh is provided by Bash, so errors like this aren't spotted. Notably Debian defaults to /bin/sh provided by dash which doesn't tolerate such bashisms as '=='. This retains compatibility with bash. Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* lib/Makefile.am: fix pkgconfig dirSam James2023-01-281-1/+1
| | | | | Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function.Gavrilov Ilia2023-01-281-2/+2
| | | | | | | | | | | | | | | | When first_ip is 0, last_ip is 0xFFFFFFFF, and netmask is 31, the value of an arithmetic expression 2 << (netmask - mask_bits - 1) is subject to overflow due to a failure casting operands to a larger data type before performing the arithmetic. Note that it's harmless since the value will be checked at the next step. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: b9fed748185a ("netfilter: ipset: Check and reject crazy /0 input parameters") Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* ipset 7.17 releasedv7.17Jozsef Kadlecsik2022-12-303-1/+11
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Tests: When verifying comments/timeouts, make sure entries don't expireJozsef Kadlecsik2022-12-302-2/+2
|
* Tests: Make sure the internal batches add the correct number of elementsJozsef Kadlecsik2022-12-301-0/+6
|
* Tests: Verify that hash:net,port,net type can handle 0/0 properlyJozsef Kadlecsik2022-12-301-0/+6
|
* netfilter: ipset: Rework long task execution when adding/deleting entriesJozsef Kadlecsik2022-12-3011-81/+68
| | | | | | | | | | | | | | | | | | | | | When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. The patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") tried to fix it by limiting the max elements to process at all. However it was not enough, it is still possible that we get hung tasks. Lowering the limit is not reasonable, so the approach in this patch is as follows: rely on the method used at resizing sets and save the state when we reach a smaller internal batch limit, unlock/lock and proceed from the saved state. Thus we can avoid long continuous tasks and at the same time removed the limit to add/delete large number of elements in one step. The nfnl mutex is held during the whole operation which prevents one to issue other ipset commands in parallel. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete")
* netfilter: ipset: fix hash:net,port,net hang with /0 subnetJozsef Kadlecsik2022-12-301-19/+21
| | | | | | | | | | | | | The hash:net,port,net set type supports /0 subnets. However, the patch commit 5f7b51bf09baca8e titled "netfilter: ipset: Limit the maximal range of consecutive elements to add/delete" did not take into account it and resulted in an endless loop. The bug is actually older but the patch 5f7b51bf09baca8e brings it out earlier. Handle /0 subnets properly in hash:net,port,net set types. Reported-by: Марк Коренберг <socketpair@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Makefile: Create LZMA-compressed dist-filesPhil Sutter2022-12-101-1/+1
| | | | | | | | | Use a more modern alternative to gzip. Suggested-by: Jan Engelhardt <jengelh@inai.de> Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* ipset 7.16 releasedv7.16Jozsef Kadlecsik2022-11-213-1/+37
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* Add new ipset_parse_bitmask() function to the library interface.Jozsef Kadlecsik2022-11-212-1/+5
|
* test: Make sure no more than 64 clashing elements can be added to ↵Jozsef Kadlecsik2022-11-211-0/+4
| | | | hash:net,iface sets
* netfilter: ipset: restore allowing 64 clashing elements in hash:net,ifaceJozsef Kadlecsik2022-11-211-1/+1
| | | | | | | The patch "netfilter: ipset: enforce documented limit to prevent allocating huge memory" was too strict and prevented to add up to 64 clashing elements to a hash:net,iface type of set. This patch fixes the issue and now the type behaves as documented.
* Fix all debug mode warningsJozsef Kadlecsik2022-11-203-17/+24
|
* netfilter: ipset: add tests for the new bitmask featureVishwanath Pai2022-11-2016-1/+426
| | | | | | | | | | | | | | The hash:ip type had a test for netmask, add a similar test for bitmask feature as well, and add another test where bitmask is not a valid netmask. Repeat the same three tests for hash:ip,port and hash:net,net. Add a test to make sure bitmask and netmask options cannot be added at the same time. Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Update the man page to include netmask/bitmask optionsVishwanath Pai2022-11-201-3/+30
| | | | | | | | We added bitmask support to hash:ip and added both netmask and bitmask to hash:net,net and hash:ip,port Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Add bitmask support to hash:netnetVishwanath Pai2022-11-201-0/+101
| | | | | | | | | | Create a new revision of hash:netnet and add support for bitmask parameter. The set did not support netmask so we'll add both netmask and bitmask. Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Add bitmask support to hash:ipportVishwanath Pai2022-11-201-0/+108
| | | | | | | | | | Create a new revision of hash:ipport and add support for bitmask parameter. The set did not support netmask so we'll add both netmask and bitmask. Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Add bitmask support to hash:ipVishwanath Pai2022-11-201-0/+86
| | | | | | | | | Create a new revision of hash:ip and add support for bitmask parameter. The set already had support for netmask so only add bitmask here. Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Add support for new bitmask parameterVishwanath Pai2022-11-2011-3/+83
| | | | | | | | | | | | | | | Add a new parameter to complement the existing 'netmask' option. The main difference between netmask and bitmask is that bitmask takes any arbitrary ip address as input, it does not have to be a valid netmask. The name of the new parameter is 'bitmask'. This lets us mask out arbitrary bits in the ip address, for example: ipset create set1 hash:ip bitmask 255.128.255.0 ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80 Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Add support for new bitmask parameterVishwanath Pai2022-11-206-26/+126
| | | | | | | | | | | | | | | Add a new parameter to complement the existing 'netmask' option. The main difference between netmask and bitmask is that bitmask takes any arbitrary ip address as input, it does not have to be a valid netmask. The name of the new parameter is 'bitmask'. This lets us mask out arbitrary bits in the ip address, for example: ipset create set1 hash:ip bitmask 255.128.255.0 ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80 Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: regression in ip_set_hash_ip.cVishwanath Pai2022-11-071-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | This patch introduced a regression: commit 48596a8ddc46 ("netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses") The variable e.ip is passed to adtfn() function which finally adds the ip address to the set. The patch above refactored the for loop and moved e.ip = htonl(ip) to the end of the for loop. What this means is that if the value of "ip" changes between the first assignement of e.ip and the forloop, then e.ip is pointing to a different ip address than "ip". Test case: $ ipset create jdtest_tmp hash:ip family inet hashsize 2048 maxelem 100000 $ ipset add jdtest_tmp 10.0.1.1/31 ipset v6.21.1: Element cannot be added to the set: it's already added The value of ip gets updated inside the "else if (tb[IPSET_ATTR_CIDR])" block but e.ip is still pointing to the old value. Reviewed-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: move from strlcpy with unused retval to strscpyWolfram Sang2022-11-071-2/+2
| | | | | | | | | | | Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Florian Westphal <fw@strlen.de>
* compatibility: handle unsafe_memcpy()Jozsef Kadlecsik2022-11-071-0/+6
|
* netlink: Bounds-check struct nlmsgerr creationKees Cook2022-11-071-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | In preparation for FORTIFY_SOURCE doing bounds-check on memcpy(), switch from __nlmsg_put to nlmsg_put(), and explain the bounds check for dealing with the memcpy() across a composite flexible array struct. Avoids this future run-time warning: memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16) Cc: Jakub Kicinski <kuba@kernel.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: syzbot <syzkaller@googlegroups.com> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220901071336.1418572-1-keescook@chromium.org Signed-off-by: David S. Miller <davem@davemloft.net>
* compatibility: move to skb_protocol in the code from tc_skb_protocolJozsef Kadlecsik2022-11-072-6/+4
| | | | And fix a typo committed by me in em_sched.c too.
* Compatibility: check kvcalloc, kvfree, kvzalloc in slab.h tooJozsef Kadlecsik2022-11-071-1/+13
|
* sched: consistently handle layer3 header accesses in the presence of VLANsToke Høiland-Jørgensen2022-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are a couple of places in net/sched/ that check skb->protocol and act on the value there. However, in the presence of VLAN tags, the value stored in skb->protocol can be inconsistent based on whether VLAN acceleration is enabled. The commit quoted in the Fixes tag below fixed the users of skb->protocol to use a helper that will always see the VLAN ethertype. However, most of the callers don't actually handle the VLAN ethertype, but expect to find the IP header type in the protocol field. This means that things like changing the ECN field, or parsing diffserv values, stops working if there's a VLAN tag, or if there are multiple nested VLAN tags (QinQ). To fix this, change the helper to take an argument that indicates whether the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we make sure to skip all of them, so behaviour is consistent even in QinQ mode. To make the helper usable from the ECN code, move it to if_vlan.h instead of pkt_sched.h. v3: - Remove empty lines - Move vlan variable definitions inside loop in skb_protocol() - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and bpf_skb_ecn_set_ce() v2: - Use eth_type_vlan() helper in skb_protocol() - Also fix code that reads skb->protocol directly - Change a couple of 'if/else if' statements to switch constructs to avoid calling the helper twice Reported-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com> Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner2022-11-071-4/+1
| | | | em_sched.c was left out, fix it now.
* headers: Remove some left-over license text in include/uapi/linux/netfilter/Christophe JAILLET2022-11-071-4/+0
| | | | | | | | | | | | | When the SPDX-License-Identifier tag has been added, the corresponding license text has not been removed. Remove it now. Also, in xt_connmark.h, move the copyright text at the top of the file which is a much more common pattern. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Florian Westphal <fw@strlen.de>
* netfilter: ipset: enforce documented limit to prevent allocating huge memoryJozsef Kadlecsik2022-11-071-24/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Xu reported that the hash:net,iface type of the ipset subsystem does not limit adding the same network with different interfaces to a set, which can lead to huge memory usage or allocation failure. The quick reproducer is $ ipset create ACL.IN.ALL_PERMIT hash:net,iface hashsize 1048576 timeout 0 $ for i in $(seq 0 100); do /sbin/ipset add ACL.IN.ALL_PERMIT 0.0.0.0/0,kaf_$i timeout 0 -exist; done The backtrace when vmalloc fails: [Tue Oct 25 00:13:08 2022] ipset: vmalloc error: size 1073741848, exceeds total pages <...> [Tue Oct 25 00:13:08 2022] Call Trace: [Tue Oct 25 00:13:08 2022] <TASK> [Tue Oct 25 00:13:08 2022] dump_stack_lvl+0x48/0x60 [Tue Oct 25 00:13:08 2022] warn_alloc+0x155/0x180 [Tue Oct 25 00:13:08 2022] __vmalloc_node_range+0x72a/0x760 [Tue Oct 25 00:13:08 2022] ? hash_netiface4_add+0x7c0/0xb20 [Tue Oct 25 00:13:08 2022] ? __kmalloc_large_node+0x4a/0x90 [Tue Oct 25 00:13:08 2022] kvmalloc_node+0xa6/0xd0 [Tue Oct 25 00:13:08 2022] ? hash_netiface4_resize+0x99/0x710 <...> The fix is to enforce the limit documented in the ipset(8) manpage: > The internal restriction of the hash:net,iface set type is that the same > network prefix cannot be stored with more than 64 different interfaces > in a single set. Reported-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: Fix oversized kvmalloc() callsJozsef Kadlecsik2022-10-171-2/+2
| | | | | | | | | | | | | | | | commit 7661809d493b426e979f39ab512e3adf41fbcc69 Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed Jul 14 09:45:49 2021 -0700 mm: don't allow oversized kvmalloc() calls limits the max allocatable memory via kvmalloc() to MAX_INT. Apply the same limit in ipset. Reported-by: syzbot+3493b1873fb3ea827986@syzkaller.appspotmail.com Reported-by: syzbot+2b8443c35458a617c904@syzkaller.appspotmail.com Reported-by: syzbot+ee5cb15f4a0e85e0d54e@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>