| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This supports both IPv4:
# nft --debug=netlink add rule ip filter forward ip ecn ce counter
ip filter forward
[ payload load 1b @ network header + 1 => reg 1 ]
[ bitwise reg 1 = (reg=1 & 0x00000003 ) ^ 0x00000000 ]
[ cmp eq reg 1 0x00000003 ]
[ counter pkts 0 bytes 0 ]
For IPv6:
# nft --debug=netlink add rule ip6 filter forward ip6 ecn ce counter
ip6 filter forward
[ payload load 1b @ network header + 1 => reg 1 ]
[ bitwise reg 1 = (reg=1 & 0x00000030 ) ^ 0x00000000 ]
[ cmp eq reg 1 0x00000030 ]
[ counter pkts 0 bytes 0 ]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This supports both IPv4:
# nft --debug=netlink add rule filter forward ip dscp cs1 counter
ip filter forward
[ payload load 1b @ network header + 1 => reg 1 ]
[ bitwise reg 1 = (reg=1 & 0x000000fc ) ^ 0x00000000 ]
[ cmp neq reg 1 0x00000080 ]
[ counter pkts 0 bytes 0 ]
And also IPv6, note that in this case we take two bytes from the payload:
# nft --debug=netlink add rule ip6 filter input ip6 dscp cs4 counter
ip6 filter input
[ payload load 2b @ network header + 0 => reg 1 ]
[ bitwise reg 1 = (reg=1 & 0x0000c00f ) ^ 0x00000000 ]
[ cmp eq reg 1 0x00000008 ]
[ counter pkts 0 bytes 0 ]
Given the DSCP is split in two bytes, the less significant nibble
of the first byte and the two most significant 2 bits of the second
byte.
The 8 bit traffic class in RFC2460 after the version field are used for
DSCP (6 bit) and ECN (2 bit). Support for ECN comes in a follow up
patch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Add the first non-matching segment if the set is empty or if the set
becomes empty after the element removal.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Allow explicit compound expression to initialize the set intervals.
Incremental updates to interval sets require this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... can now display nftables nftrace debug information.
$ nft filter input tcp dport 10000 nftrace set 1
$ nft filter input icmp type echo-request nftrace set 1
$ nft -nn monitor trace
trace id e1f5055f ip filter input packet: iif eth0 ether saddr 63:f6:4b:00:54:52 ether daddr c9:4b:a9:00:54:52 ip saddr 192.168.122.1 ip daddr 192.168.122.83 ip tos 0 ip ttl 64 ip id 32315 ip length 84 icmp type echo-request icmp code 0 icmp id 10087 icmp sequence 1
trace id e1f5055f ip filter input rule icmp type echo-request nftrace set 1 (verdict continue)
trace id e1f5055f ip filter input verdict continue
trace id e1f5055f ip filter input
trace id 74e47ad2 ip filter input packet: iif vlan0 ether saddr 63:f6:4b:00:54:52 ether daddr c9:4b:a9:00:54:52 vlan pcp 0 vlan cfi 1 vlan id 1000 ip saddr 10.0.0.1 ip daddr 10.0.0.2 ip tos 0 ip ttl 64 ip id 49030 ip length 84 icmp type echo-request icmp code 0 icmp id 10095 icmp sequence 1
trace id 74e47ad2 ip filter input rule icmp type echo-request nftrace set 1 (verdict continue)
trace id 74e47ad2 ip filter input verdict continue
trace id 74e47ad2 ip filter input
trace id 3030de23 ip filter input packet: iif vlan0 ether saddr 63:f6:4b:00:54:52 ether daddr c9:4b:a9:00:54:52 vlan pcp 0 vlan cfi 1 vlan id 1000 ip saddr 10.0.0.1 ip daddr 10.0.0.2 ip tos 16 ip ttl 64 ip id 59062 ip length 60 tcp sport 55438 tcp dport 10000 tcp flags == syn tcp window 29200
trace id 3030de23 ip filter input rule tcp dport 10000 nftrace set 1 (verdict continue)
trace id 3030de23 ip filter input verdict continue
trace id 3030de23 ip filter input
Based on a patch from Florian Westphal, which again was based on a patch
from Markus Kötter.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
| |
The next patch introduces packet decoding for tracing messages based on
the proto definitions. In order to provide a readable output, add a filter
to surpress uninteresting header fields and allow to specify and explicit
output order.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
| |
Add payload_is_stacked() to determine whether a protocol expression match defines
a stacked protocol on the same layer.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
| |
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
| |
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add rule ip6 filter input ip6 saddr ::1/128 ip6 daddr ::1/128 fails,
we ask to compare a 32byte immediate which is not supported:
[ payload load 32b @ network header + 8 => reg 1 ]
[ cmp eq reg 1 0x00000000 0x00000000 0x00000000 0x01000000 0x00000000 0x00000000 0x00000000 0x02000000 ]
We would need to use two cmps in this case, i.e.:
[ payload load 32b @ network header + 8 => reg 1 ]
[ cmp eq reg 1 0x00000000 0x00000000 0x00000000 0x01000000 ]
[ cmp eq reg 2 0x00000000 0x00000000 0x00000000 0x02000000 ]
Seems however that this requires a bit more changes to how nft
handles register allocations, we'd also need to undo the constant merge.
Lets disable merging for now so that we generate
[ payload load 16b @ network header + 8 => reg 1 ]
[ cmp eq reg 1 0x00000000 0x00000000 0x00000000 0x01000000 ]
[ payload load 16b @ network header + 24 => reg 1 ]
[ cmp eq reg 1 0x00000000 0x00000000 0x00000000 0x02000000 ]
... if merge would bring us over the 128 bit register size.
Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=1032
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Now it is possible to store multiple variable length user data into rule.
Modify the parser in order to fill the nftnl_udata with the comment, and
the print function for extract these commentary and print it to user.
Signed-off-by: Carlos Falgueras García <carlosfg@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Store the parser location structure for handle and position IDs so we
can use this information from the evaluation step, to provide better
error reporting.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
|
|
|
|
|
|
|
| |
Added missing dist. file mini-gmp.h in include/Makefile.am
Signed-off-by: Magnus Öberg <magnus.oberg@westermo.se>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Provide full support for masquerading by allowing port range selection, eg.
# nft add rule nat postrouting ip protocol tcp masquerade to :1024-10024
Signed-off-by: Shivani Bhardwaj <shivanib134@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables nft to display
frag frag-off 33
... by considering a mask during binop postprocess in case
the initial template lookup done when the exthdr expression was
created did not yield a match.
In the above example, kernel netlink data specifies 16bits,
but the frag field is only 13bits wide.
We use the implicit binop mask to re-do the template lookup with
corrected offset and size information.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
Its possible that we cannot find the template without also
considering an implicit mask. For this we need to store the offset.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Should treat this as if user would have asked to match ipv6 header field.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch add support for the forward statement, only available at the
netdev family.
# nft add table netdev filter
# nft add chain netdev filter ingress { type filter hook ingress device eth0 priority 0\; }
# nft add rule netdev filter ingress fwd to dummy0
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So far it was only possible to match packet under a rate limit, this
patch allows you to explicitly indicate if you want to match packets
that goes over or until the rate limit, eg.
... limit rate over 3/second counter log prefix "OVERLIMIT: " drop
... limit rate over 3 mbytes/second counter log prefix "OVERLIMIT: " drop
... ct state invalid limit rate until 1/second counter log prefix "INVALID: "
When listing rate limit until, this shows:
... ct state invalid limit rate 1/second counter log prefix "INVALID: "
thus, the existing syntax is still valid (i.e. default to rate limit until).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
packets and bytes need special treatment -- we want to be able to get
packet/byte counter in either direction, but also express
'fetch in *BOTH* directions', i.e.
ct packets original + ct packets reply > 1000
This either requires a '+' expression, a new 'both' direction, or
keys where direction is optional, i.e.
ct packets > 12345 ; original + reply
ct original packets > 12345 ; original
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A few keys in the ct expression are directional, i.e.
we need to tell kernel if it should fetch REPLY or ORIGINAL direction.
Split ct_keys into ct_keys & ct_keys_dir, the latter are those keys
that the kernel rejects unless also given a direction.
During postprocessing we also need to invoke ct_expr_update_type,
problem is that e.g. ct saddr can be any family (ip, ipv6) so we need
to update the expected data type based on the network base.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
| |
So we can use the 'redirect' reserve word as constant from the rhs
expression. Thus, we can use it as icmp type.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
This relies on NFT_META_PROTOCOL instead of ethernet protocol type
header field to prepare support for non-ethernet protocols in the
future.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update bitfield definitions to match according to the way they are
expressed in RFC and IEEE specifications.
This required a bit of update for c3f0501 ("src: netlink_linearize:
handle sub-byte lengths").
>From the linearize step, to calculate the shift based on the bitfield
offset, we need to obtain the length of the word in bytes:
len = round_up(expr->len, BITS_PER_BYTE);
Then, we substract the offset bits and the bitfield length.
shift = len - (offset + expr->len);
From the delinearize, payload_expr_trim() needs to obtain the real
offset through:
off = round_up(mask->len, BITS_PER_BYTE) - mask_len;
For vlan id (offset 12), this gets the position of the last bit set in
the mask (ie. 12), then we substract the length we fetch in bytes (16),
so we obtain the real bitfield offset (4).
Then, we add that to the original payload offset that was expressed in
bytes:
payload_offset += off;
Note that payload_expr_trim() now also adjusts the payload expression to
its real length and offset so we don't need to propagate the mask
expression.
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Add support for payload mangling using the payload statement. The syntax
is similar to the other data changing statements:
nft filter output tcp dport set 25
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
| |
The checksum key is used to determine the correct position where to update
the checksum for the payload statement.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
| |
The comment does not belong to the handle, it belongs to the rule.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Error: conflicting protocols specified: inet vs. ether
tcp dport 22 iiftype ether ether saddr 00:0f:54:0c:11:4
^^^^^^^^^^^
This allows the implicit inet proto dependency to get replaced
by an ethernet one.
This is possible since by the time we detect the conflict the
meta dependency for the network protocol has already been added.
So we only need to add another dependency on the Linklayer frame type.
Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=981
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Contrary to iptables, we use the asterisk character '*' as wildcard.
# nft --debug=netlink add rule test test iifname eth\*
ip test test
[ meta load iifname => reg 1 ]
[ cmp eq reg 1 0x00687465 ]
Note that this generates an optimized comparison without bitwise.
In case you want to match a device that contains an asterisk, you have
to escape the asterisk, ie.
# nft add rule test test iifname eth\\*
The wildcard string handling occurs from the evaluation step, where we
convert from:
relational
/ \
/ \
meta value
oifname eth*
to:
relational
/ \
/ \
meta prefix
ofiname
As Patrick suggested, this not actually a wildcard but a prefix since it
only applies to the string when placed at the end.
More comments:
* This relaxes the left->size > right->size from netlink_parse_cmp()
for strings since the optimization that this patch applies may now
result in bogus errors.
* This patch can be later on extended to apply a similar optimization to
payload expressions when:
expr->len % BITS_PER_BYTE == 0
For meta and ct, the kernel checks for the exact length of the attributes
(it expects integer 32 bits) so we can't do it unless we relax that.
* Wildcard strings are not supported from sets and maps yet. Error
reporting is not very good at this stage since expr_evaluate_prefix()
doesn't have enough context (ctx->set is NULL, the set object is
currently created later after evaluating the lhs and rhs of the
relational). I'll be following up on this later.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modify the parser and add necessary functions to provide the command "nft
replace rule <ruleid_spec> <new_rule>"
Example of use:
# nft list ruleset -a
table ip filter {
chain output {
ip daddr 8.8.8.7 counter packets 0 bytes 0 # handle 3
}
}
# nft replace rule filter output handle 3 ip daddr 8.8.8.8 counter
# nft list ruleset -a
table ip filter {
chain output {
ip daddr 8.8.8.8 counter packets 0 bytes 0 # handle 3
}
}
Signed-off-by: Carlos Falgueras García <carlosfg@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
# nft list chains
table ip filter {
chain test1 {
}
chain test2 {
}
chain input {
type filter hook input priority 0; policy accept;
}
}
table ip6 filter {
chain test1 {
}
chain input {
type filter hook input priority 0; policy accept;
}
}
You can also filter out per family:
# nft list chains ip
table ip x {
chain y {
}
chain xz {
}
chain input {
type filter hook input priority 0; policy accept;
}
}
# nft list chains ip6
table ip6 filter {
chain x {
}
chain input {
type filter hook input priority 0; policy accept;
}
}
This command only shows the chain declarations, so the content (the
definition) is omitted.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
This allows you to clone packets to destination address, eg.
... dup to 172.20.0.2
... dup to 172.20.0.2 device eth1
... dup to ip saddr map { 192.168.0.2 : 172.20.0.2, ... } device eth1
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
... limit rate 1024 mbytes/second burst 10240 bytes
... limit rate 1/second burst 3 packets
This parameter is optional.
You need a Linux kernel >= 4.3-rc1.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
This example show how to accept packets below the ratelimit:
... limit rate 1024 mbytes/second counter accept
You need a Linux kernel >= 4.3-rc1.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
This allows to list rules that check fields that are not aligned on byte
boundary.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
currently 'vlan id 42' or even 'vlan type ip' doesn't work since
we expect ethernet header but get vlan.
So if we want to add another protocol header to the same base, we
attempt to figure out if the new header can fit on top of the existing
one (i.e. proto_find_num gives a protocol number when asking to find
link between the two).
We also annotate protocol description for eth and vlan with the full
header size and track the offset from the current base.
Otherwise, 'vlan type ip' fetches the protocol field from mac header
offset 0, which is some mac address.
Instead, we must consider full size of ethernet header.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
| |
Adapt the nftables code to use the new symbols in libnftnl. This patch contains
quite some renaming to reserve the nft_ prefix for our high level library.
Explicitly request libnftnl 1.0.5 at configure stage.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
When adding declared chains to the cache, we may hold more than one single
reference from struct cmd and the cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
We may hold multiple references to table objects in follow up patches when
adding object declarations to the cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces the generic object cache that is populated during the
evaluation phase.
The first client of this infrastructure are table objects. As a result, there
is a single call to netlink_list_tables().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This branch adds support for the new 'netdev' family. This also resolves a
simple conflict with the default chain policy printing.
Conflicts:
src/rule.c
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch adds support for the new 'netdev' table. So far, this table allows
you to create filter chains from ingress.
The following example shows a very simple base configuration with one table that
contains a basechain that is attached to the 'eth0':
# nft list table netdev filter
table netdev filter {
chain eth0-ingress {
type filter hook ingress device eth0 priority 0; policy accept;
}
}
You can test that this works by adding a simple rule with counters:
# nft add rule netdev filter eth0-ingress counter
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|/
|
|
|
|
| |
Set human readable hookname chain->hookstr field from delinearize.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
| |
Pad all but the last sub-expressions of a concat expressions.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prepare netlink_linearize for 32 bit register usage:
Switch to use 16 data registers of 32 bit each. A helper function takes
care of mapping the registers to the NFT_REG32 values and, if the
register refers to the beginning of an 128 bit area, the old NFT_REG_1-4
values for compatibility.
New register reservation and release helper function take the size into
account and reserve the required amount of registers.
The reservation and release functions will so far still always allocate
128 bit. If no other expression in a rule uses a 32 bit register directly,
these will be mapped to the old register values, meaning everything
continues to work with old kernel versions.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
| |
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The set statement is used to dynamically add or update elements in a set.
Syntax:
# nft filter input set add tcp dport @myset
# nft filter input set add ip saddr timeout 10s @myset
# nft filter input set update ip saddr timeout 10s @myset
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
| |
Syntax:
# nft add element filter test { 192.168.0.1 comment "some host" }
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support specifying per element timeout values and displaying the expiration
time.
If an element should not use the default timeout value of the set, an
element specific value can be specified as follows:
# nft add element filter test { 192.168.0.1, 192.168.0.2 timeout 10m}
For listing of elements that use the default timeout value, just the
expiration time is shown, otherwise the element specific timeout value
is also displayed:
set test {
type ipv4_addr
timeout 1h
elements = { 192.168.0.2 timeout 10m expires 9m59s, 192.168.0.1 expires 59m59s}
}
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Timeout support can be enabled in one of two ways:
1. Using a default timeout value:
set test {
type ipv4_addr;
timeout 1h;
}
2. Using the timeout flag without a default:
set test {
type ipv4_addr;
flags timeout;
}
Optionally a garbage collection interval can be specified using
gc-interval <interval>;
Signed-off-by: Patrick McHardy <kaber@trash.net>
|