summaryrefslogtreecommitdiffstats
path: root/kernel/net/netfilter/ipset/ip_set_bitmap_gen.h
Commit message (Collapse)AuthorAgeFilesLines
* netfilter: ipset: fix race condition between swap/destroy and kernel side ↵Jozsef Kadlecsik2024-01-291-3/+11
| | | | | | | | | | | | | | | | | | | | | | | add/del/test v4 The patch "netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test", commit 28628fa9 fixes a race condition. But the synchronize_rcu() added to the swap function unnecessarily slows it down: it can safely be moved to destroy and use call_rcu() instead. Eric Dumazet pointed out that simply calling the destroy functions as rcu callback does not work: sets with timeout use garbage collectors which need cancelling at destroy which can wait. Therefore the destroy functions are split into two: cancelling garbage collectors safely at executing the command received by netlink and moving the remaining part only into the rcu callback. Link: https://lore.kernel.org/lkml/C0829B10-EAA6-4809-874E-E1E9C05A8D84@automattic.com/ Fixes: 28628fa952fe ("netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test") Reported-by: Ale Crismani <ale.crismani@automattic.com> Reported-by: David Wang <00107082@163.com> Tested-by: David Wang <00107082@163.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: ipset: use bitmap infrastructure completelyJozsef Kadlecsik2020-01-191-1/+1
| | | | | | | | | | | | | | | | The bitmap allocation did not use full unsigned long sizes when calculating the required size and that was triggered by KASAN as slab-out-of-bounds read in several places. The patch fixes all of them. Reported-by: syzbot+fabca5cbf5e54f3fe2de@syzkaller.appspotmail.com Reported-by: syzbot+827ced406c9a1d9570ed@syzkaller.appspotmail.com Reported-by: syzbot+190d63957b22ef673ea5@syzkaller.appspotmail.com Reported-by: syzbot+dfccdb2bdb4a12ad425e@syzkaller.appspotmail.com Reported-by: syzbot+df0d0f5895ef1f41a65b@syzkaller.appspotmail.com Reported-by: syzbot+b08bd19bb37513357fd4@syzkaller.appspotmail.com Reported-by: syzbot+53cdd0ec0bbabd53370a@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* netfilter: fix a use-after-free in mtype_destroy()Cong Wang2020-01-151-1/+1
| | | | | | | | | | | | map->members is freed by ip_set_free() right before using it in mtype_ext_cleanup() again. So we just have to move it down. Reported-by: syzbot+4c3cc6dbe7259dbf9054@syzkaller.appspotmail.com Fixes: 40cd63bf33b2 ("netfilter: ipset: Support extensions which need a per data destroy function") Cc: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner2019-10-311-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* netfilter: ipset: remove inline from static functions in .c files.Jeremy Sowden2019-10-071-1/+1
| | | | | | | | | | | | | The inline function-specifier should not be used for static functions defined in .c files since it bloats the kernel. Instead leave the compiler to decide which functions to inline. While a couple of the files affected (ip_set_*_gen.h) are technically headers, they contain templates for generating the common parts of particular set-types and so we treat them like .c files. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
* ipset: update my email addressJozsef Kadlecsik2019-06-051-1/+1
| | | | | | | | | It's better to use my kadlec@netfilter.org email address in the source code. I might not be able to use kadlec@blackhole.kfki.hu in the future. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* netfilter: ipset: add resched points during set listingFlorian Westphal2018-01-041-0/+1
| | | | | | | | | | | | | | When sets are extremely large we can get softlockup during ipset -L. We could fix this by adding cond_resched_rcu() at the right location during iteration, but this only works if RCU nesting depth is 1. At this time entire variant->list() is called under under rcu_read_lock_bh. This used to be a read_lock_bh() but as rcu doesn't really lock anything, it does not appear to be needed, so remove it (ipset increments set reference count before this, so a set deletion should not be possible). Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
* Fix "don't update counters" mode when counters used at the matchingJozsef Kadlecsik2018-01-041-8/+1
| | | | The matching of the counters was not taken into account, fixed.
* Backport patch: netfilter: ipset: Convert timers to use timer_setup()Jozsef Kadlecsik2018-01-031-5/+4
|
* Backport missing part of patch: netfilter: Remove unnecessary cast on void ↵Jozsef Kadlecsik2017-09-111-2/+1
| | | | pointer
* netfilter: ipset: Remove unnecessary cast on void pointersimran singhal2017-03-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The following Coccinelle script was used to detect this: @r@ expression x; void* e; type T; identifier f; @@ ( *((T *)e) | ((T *)x)[...] | ((T*)x)->f | - (T*) e ) Signed-off-by: simran singhal <singhalsimran0@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Correct the reported memory size for bitmap:* typesJozsef Kadlecsik2016-10-131-3/+6
| | | | | | | The patch "Fix extension alignmen" (c7cf6f3b) removed counting the non-dynamic extensions into the used up memory area, fixed. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* netfilter: ipset: use setup_timer() and mod_timer().Muhammad Falak R Wani2016-05-191-5/+2
| | | | | | | | | | | | | | | Use setup_timer() and instead of init_timer(), being the preferred way of setting up a timer. Also, quoting the mod_timer() function comment: -> mod_timer() is a more efficient way to update the expire field of an active timer (if the timer is inactive it will be activated). Use setup_timer() and mod_timer() to setup and arm a timer, making the code compact and easier to read. Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* netfilter: ipset: fix race condition in ipset save, swap and deleteVishwanath Pai2016-03-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fix adds a new reference counter (ref_netlink) for the struct ip_set. The other reference counter (ref) can be swapped out by ip_set_swap and we need a separate counter to keep track of references for netlink events like dump. Using the same ref counter for dump causes a race condition which can be demonstrated by the following script: ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \ counters ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset save & ipset swap hash_ip3 hash_ip2 ipset destroy hash_ip3 /* will crash the machine */ Swap will exchange the values of ref so destroy will see ref = 0 instead of ref = 1. With this fix in place swap will not succeed because ipset save still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink). Both delete and swap will error out if ref_netlink != 0 on the set. Note: The changes to *_head functions is because previously we would increment ref whenever we called these functions, we don't do that anymore. Reviewed-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Fix extension alignmentJozsef Kadlecsik2015-11-071-14/+7
| | | | | | | | | | | | | The data extensions in ipset lacked the proper memory alignment and thus could lead to kernel crash on several architectures. Therefore the structures have been reorganized and alignment attributes added where needed. The patch was tested on armv7h by Gerhard Wiesinger and on x86_64, sparc64 by Jozsef Kadlecsik. Reported-by: Gerhard Wiesinger <lists@wiesinger.com> Tested-by: Gerhard Wiesinger <lists@wiesinger.com> Tested-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Count non-static extension memory into the set memory size for userspaceJozsef Kadlecsik2015-06-261-2/+3
| | | | | | | | Non-static (i.e. comment) extension was not counted into the memory size. A new internal counter is introduced for this. In the case of the hash types the sizes of the arrays are counted there as well so that we can avoid to scan the whole set when just the header data is requested.
* Add element count to all set types headerJozsef Kadlecsik2015-06-251-1/+9
| | | | | | It is better to list the set elements for all set types, thus the header information is uniform. Element counts are therefore added to the bitmap and list types.
* netfilter: ipset: Separate memsize calculation code into dedicated functionsJozsef Kadlecsik2015-05-061-4/+14
| | | | | | | | | | Hash types already has it's memsize calculation code in separate functions. Do the same for *bitmap* and *list* sets. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Make sure bitmap:ip,mac detects the proper MAC even when it's overwrittenJozsef Kadlecsik2015-03-291-2/+8
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* RCU safe comment extension handlingJozsef Kadlecsik2015-03-291-4/+10
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Fix coding styles reported by checkpatch.plJozsef Kadlecsik2015-01-061-7/+9
|
* Remove unnecessary integer RCU handling and fix sparse warningsJozsef Kadlecsik2014-11-271-2/+2
|
* Introduce RCU in all set types instead of rwlock per setJozsef Kadlecsik2014-11-181-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Performance is tested by Jesper Dangaard Brouer: Simple drop in FORWARD ~~~~~~~~~~~~~~~~~~~~~~ Dropping via simple iptables net-mask match:: iptables -t raw -N simple || iptables -t raw -F simple iptables -t raw -I simple -s 198.18.0.0/15 -j DROP iptables -t raw -D PREROUTING -j simple iptables -t raw -I PREROUTING -j simple Drop performance in "raw": 11.3Mpps Generator: sending 12.2Mpps (tx:12264083 pps) Drop via original ipset in RAW table ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Create a set with lots of elements:: sudo ./ipset destroy test echo "create test hash:ip hashsize 65536" > test.set for x in `seq 0 255`; do for y in `seq 0 255`; do echo "add test 198.18.$x.$y" >> test.set done done sudo ./ipset restore < test.set Dropping via ipset:: iptables -t raw -F iptables -t raw -N net198 || iptables -t raw -F net198 iptables -t raw -I net198 -m set --match-set test src -j DROP iptables -t raw -I PREROUTING -j net198 Drop performance in "raw" with ipset: 8Mpps Perf report numbers ipset drop in "raw":: + 24.65% ksoftirqd/1 [ip_set] [k] ip_set_test - 21.42% ksoftirqd/1 [kernel.kallsyms] [k] _raw_read_lock_bh - _raw_read_lock_bh + 99.88% ip_set_test - 19.42% ksoftirqd/1 [kernel.kallsyms] [k] _raw_read_unlock_bh - _raw_read_unlock_bh + 99.72% ip_set_test + 4.31% ksoftirqd/1 [ip_set_hash_ip] [k] hash_ip4_kadt + 2.27% ksoftirqd/1 [ixgbe] [k] ixgbe_fetch_rx_buffer + 2.18% ksoftirqd/1 [ip_tables] [k] ipt_do_table + 1.81% ksoftirqd/1 [ip_set_hash_ip] [k] hash_ip4_test + 1.61% ksoftirqd/1 [kernel.kallsyms] [k] __netif_receive_skb_core + 1.44% ksoftirqd/1 [kernel.kallsyms] [k] build_skb + 1.42% ksoftirqd/1 [kernel.kallsyms] [k] ip_rcv + 1.36% ksoftirqd/1 [kernel.kallsyms] [k] __local_bh_enable_ip + 1.16% ksoftirqd/1 [kernel.kallsyms] [k] dev_gro_receive + 1.09% ksoftirqd/1 [kernel.kallsyms] [k] __rcu_read_unlock + 0.96% ksoftirqd/1 [ixgbe] [k] ixgbe_clean_rx_irq + 0.95% ksoftirqd/1 [kernel.kallsyms] [k] __netdev_alloc_frag + 0.88% ksoftirqd/1 [kernel.kallsyms] [k] kmem_cache_alloc + 0.87% ksoftirqd/1 [xt_set] [k] set_match_v3 + 0.85% ksoftirqd/1 [kernel.kallsyms] [k] inet_gro_receive + 0.83% ksoftirqd/1 [kernel.kallsyms] [k] nf_iterate + 0.76% ksoftirqd/1 [kernel.kallsyms] [k] put_compound_page + 0.75% ksoftirqd/1 [kernel.kallsyms] [k] __rcu_read_lock Drop via ipset in RAW table with RCU-locking ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ With RCU locking, the RW-lock is gone. Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps Performance-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
* netfilter: ipset: Add skbinfo extension kernel support for the bitmap set types.Anton Danilov2014-09-081-0/+4
| | | | | | | | Add skbinfo extension kernel support for the bitmap set types. Inroduce the new revisions of bitmap_ip, bitmap_ipmac and bitmap_port set types. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Use netlink callback dump args onlyJozsef Kadlecsik2013-10-021-5/+6
| | | | | Instead of cb->data, use callback dump args only and introduce symbolic names instead of plain numbers at accessing the argument members.
* Use a common function at listing the extensions of the elementsJozsef Kadlecsik2013-09-251-19/+10
|
* netfilter: ipset: Support comments in bitmap-type ipsets.Oliver Smith2013-09-231-6/+8
| | | | | | | | | | | | | | This provides kernel support for creating bitmap ipsets with comment support. As is the case for hashes, this incurs a penalty when flushing or destroying the entire ipset as the entries must first be walked in order to free the comment strings. This penalty is of course far less than the cost of listing an ipset to userspace. Any set created without support for comments will be flushed/destroyed as before. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Support extensions which need a per data destroy functionJozsef Kadlecsik2013-09-091-7/+31
|
* Move extension data to set structureJozsef Kadlecsik2013-09-071-31/+28
| | | | | | Default timeout and extension offsets are moved to struct set, because all set types supports all extensions and it makes possible to generalize extension support.
* Rename extension offset ids to extension idsJozsef Kadlecsik2013-09-061-2/+2
|
* Rename simple macro names to avoid namespace issues.Jozsef Kadlecsik2013-05-011-25/+22
| | | | Reported-by: David Laight <David.Laight@ACULAB.COM>
* Don't call ip_nest_end needlessly in the error pathJozsef Kadlecsik2013-04-271-1/+1
| | | | Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
* The bitmap types with counter supportJozsef Kadlecsik2013-04-091-1/+13
| | | | Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* Unified bitmap type generationJozsef Kadlecsik2013-04-091-0/+265
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>