| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The inline function-specifier should not be used for static functions
defined in .c files since it bloats the kernel. Instead leave the
compiler to decide which functions to inline.
While a couple of the files affected (ip_set_*_gen.h) are technically
headers, they contain templates for generating the common parts of
particular set-types and so we treat them like .c files.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
It's better to use my kadlec@netfilter.org email address in
the source code. I might not be able to use
kadlec@blackhole.kfki.hu in the future.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When sets are extremely large we can get softlockup during ipset -L.
We could fix this by adding cond_resched_rcu() at the right location
during iteration, but this only works if RCU nesting depth is 1.
At this time entire variant->list() is called under under rcu_read_lock_bh.
This used to be a read_lock_bh() but as rcu doesn't really lock anything,
it does not appear to be needed, so remove it (ipset increments set
reference count before this, so a set deletion should not be possible).
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
| |
The matching of the counters was not taken into account, fixed.
|
| |
|
|
|
|
| |
pointer
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following Coccinelle script was used to detect this:
@r@
expression x;
void* e;
type T;
identifier f;
@@
(
*((T *)e)
|
((T *)x)[...]
|
((T*)x)->f
|
- (T*)
e
)
Signed-off-by: simran singhal <singhalsimran0@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
|
|
|
| |
The patch "Fix extension alignmen" (c7cf6f3b) removed counting
the non-dynamic extensions into the used up memory area, fixed.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use setup_timer() and instead of init_timer(), being the preferred way
of setting up a timer.
Also, quoting the mod_timer() function comment:
-> mod_timer() is a more efficient way to update the expire field of an
active timer (if the timer is inactive it will be activated).
Use setup_timer() and mod_timer() to setup and arm a timer, making the
code compact and easier to read.
Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fix adds a new reference counter (ref_netlink) for the struct ip_set.
The other reference counter (ref) can be swapped out by ip_set_swap and we
need a separate counter to keep track of references for netlink events
like dump. Using the same ref counter for dump causes a race condition
which can be demonstrated by the following script:
ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \
counters
ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \
counters
ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \
counters
ipset save &
ipset swap hash_ip3 hash_ip2
ipset destroy hash_ip3 /* will crash the machine */
Swap will exchange the values of ref so destroy will see ref = 0 instead of
ref = 1. With this fix in place swap will not succeed because ipset save
still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink).
Both delete and swap will error out if ref_netlink != 0 on the set.
Note: The changes to *_head functions is because previously we would
increment ref whenever we called these functions, we don't do that
anymore.
Reviewed-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The data extensions in ipset lacked the proper memory alignment and
thus could lead to kernel crash on several architectures. Therefore
the structures have been reorganized and alignment attributes added
where needed. The patch was tested on armv7h by Gerhard Wiesinger and
on x86_64, sparc64 by Jozsef Kadlecsik.
Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Gerhard Wiesinger <lists@wiesinger.com>
Tested-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
|
|
|
|
| |
Non-static (i.e. comment) extension was not counted into the memory
size. A new internal counter is introduced for this. In the case of
the hash types the sizes of the arrays are counted there as well so
that we can avoid to scan the whole set when just the header data
is requested.
|
|
|
|
|
|
| |
It is better to list the set elements for all set types, thus the
header information is uniform. Element counts are therefore added
to the bitmap and list types.
|
|
|
|
|
|
|
|
|
|
| |
Hash types already has it's memsize calculation code in separate
functions. Do the same for *bitmap* and *list* sets.
Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.
Suggested-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Performance is tested by Jesper Dangaard Brouer:
Simple drop in FORWARD
~~~~~~~~~~~~~~~~~~~~~~
Dropping via simple iptables net-mask match::
iptables -t raw -N simple || iptables -t raw -F simple
iptables -t raw -I simple -s 198.18.0.0/15 -j DROP
iptables -t raw -D PREROUTING -j simple
iptables -t raw -I PREROUTING -j simple
Drop performance in "raw": 11.3Mpps
Generator: sending 12.2Mpps (tx:12264083 pps)
Drop via original ipset in RAW table
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a set with lots of elements::
sudo ./ipset destroy test
echo "create test hash:ip hashsize 65536" > test.set
for x in `seq 0 255`; do
for y in `seq 0 255`; do
echo "add test 198.18.$x.$y" >> test.set
done
done
sudo ./ipset restore < test.set
Dropping via ipset::
iptables -t raw -F
iptables -t raw -N net198 || iptables -t raw -F net198
iptables -t raw -I net198 -m set --match-set test src -j DROP
iptables -t raw -I PREROUTING -j net198
Drop performance in "raw" with ipset: 8Mpps
Perf report numbers ipset drop in "raw"::
+ 24.65% ksoftirqd/1 [ip_set] [k] ip_set_test
- 21.42% ksoftirqd/1 [kernel.kallsyms] [k] _raw_read_lock_bh
- _raw_read_lock_bh
+ 99.88% ip_set_test
- 19.42% ksoftirqd/1 [kernel.kallsyms] [k] _raw_read_unlock_bh
- _raw_read_unlock_bh
+ 99.72% ip_set_test
+ 4.31% ksoftirqd/1 [ip_set_hash_ip] [k] hash_ip4_kadt
+ 2.27% ksoftirqd/1 [ixgbe] [k] ixgbe_fetch_rx_buffer
+ 2.18% ksoftirqd/1 [ip_tables] [k] ipt_do_table
+ 1.81% ksoftirqd/1 [ip_set_hash_ip] [k] hash_ip4_test
+ 1.61% ksoftirqd/1 [kernel.kallsyms] [k] __netif_receive_skb_core
+ 1.44% ksoftirqd/1 [kernel.kallsyms] [k] build_skb
+ 1.42% ksoftirqd/1 [kernel.kallsyms] [k] ip_rcv
+ 1.36% ksoftirqd/1 [kernel.kallsyms] [k] __local_bh_enable_ip
+ 1.16% ksoftirqd/1 [kernel.kallsyms] [k] dev_gro_receive
+ 1.09% ksoftirqd/1 [kernel.kallsyms] [k] __rcu_read_unlock
+ 0.96% ksoftirqd/1 [ixgbe] [k] ixgbe_clean_rx_irq
+ 0.95% ksoftirqd/1 [kernel.kallsyms] [k] __netdev_alloc_frag
+ 0.88% ksoftirqd/1 [kernel.kallsyms] [k] kmem_cache_alloc
+ 0.87% ksoftirqd/1 [xt_set] [k] set_match_v3
+ 0.85% ksoftirqd/1 [kernel.kallsyms] [k] inet_gro_receive
+ 0.83% ksoftirqd/1 [kernel.kallsyms] [k] nf_iterate
+ 0.76% ksoftirqd/1 [kernel.kallsyms] [k] put_compound_page
+ 0.75% ksoftirqd/1 [kernel.kallsyms] [k] __rcu_read_lock
Drop via ipset in RAW table with RCU-locking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With RCU locking, the RW-lock is gone.
Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps
Performance-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
|
|
|
|
|
|
|
|
| |
Add skbinfo extension kernel support for the bitmap set types.
Inroduce the new revisions of bitmap_ip, bitmap_ipmac and bitmap_port set types.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
|
|
|
| |
Instead of cb->data, use callback dump args only and introduce symbolic
names instead of plain numbers at accessing the argument members.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This provides kernel support for creating bitmap ipsets with comment
support.
As is the case for hashes, this incurs a penalty when flushing or
destroying the entire ipset as the entries must first be walked in order
to free the comment strings. This penalty is of course far less than the
cost of listing an ipset to userspace. Any set created without support
for comments will be flushed/destroyed as before.
Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
| |
|
|
|
|
|
|
| |
Default timeout and extension offsets are moved to struct set, because
all set types supports all extensions and it makes possible to generalize
extension support.
|
| |
|
|
|
|
| |
Reported-by: David Laight <David.Laight@ACULAB.COM>
|
|
|
|
| |
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
| |
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|
|
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
|