| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The flow statement allows instantiating per-flow statements for user-defined
flows. So far this can be used for per-flow accounting or limiting,
similar to what the iptables hashlimit provides. Flows can be aged using
the timeout option.
Examples:
# nft filter input flow ip saddr . tcp dport limit rate 10/second
# nft filter input flow table acct iif . ip saddr timeout 60s counter
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
Return the parsed statement instead of adding it to the rule in order to
parse statements contained in the flow statement.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
This provides a generic way to transfer shifts from the left hand side
to the right hand range side of a relational expression when performing
transformations from the evaluation step.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
Add payload_is_stacked() to determine whether a protocol expression match defines
a stacked protocol on the same layer.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
| |
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code contains multiple scattered fragments that fiddle with the
protocol contexts to work around the fact that stacked headers update the
context for the incorrect layer.
Fix this by updating the correct layer in payload_expr_pctx_update() and
also take care of offset adjustments there and only there. Remove all
manual protocol context fiddling and change protocol context debugging to
also print the offset for stacked headers.
All previously successful testcases pass.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
| |
It is now possible to store multiple variable-length user data items in a rule.
Modify the parser to fill the nftnl_udata with the comment, and the print
function to extract the comment and print it to the user.
Signed-off-by: Carlos Falgueras García <carlosfg@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Store the parser location structure for handle and position IDs so we
can use this information from the evaluation step, to provide better
error reporting.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
|
|
|
|
|
|
|
|
|
| |
Provide full support for masquerading by allowing port range selection, eg.
# nft add rule nat postrouting ip protocol tcp masquerade to :1024-10024
Signed-off-by: Shivani Bhardwaj <shivanib134@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables nft to display
frag frag-off 33
... by considering a mask during binop postprocess in case
the initial template lookup done when the exthdr expression was
created did not yield a match.
In the above example, kernel netlink data specifies 16bits,
but the frag field is only 13bits wide.
We use the implicit binop mask to re-do the template lookup with
corrected offset and size information.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
binop_postprocess takes care of removing masks if we're dealing
with payload expressions that have non-byte divisible sizes
or offsets.
Same can happen when matching some extension header fields, i.e.
this also needs to handle exthdr expression, not just payload.
So rename payload to left and move test for left type to
binop_postprocess.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
The exthdr expression requires a dependency on ipv6; we can
thus remove an ipv6 protocol test if it's present.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
payload_match_postprocess() expects a relational with a payload on its lhs
and a value on the rhs.
Moreover, payload_match_expand() releases the previous expression, so
valgrind reports a use-after-free when pruning the implicit binop.
Fix this by calling payload_match_postprocess() first.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
| |
The inet and netdev families generate two implicit dependencies to check
for the interface type, so we have to check just after killing an implicit
dependency if there is another that we should annotate to kill it as well.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for the forward statement, which is only available in
the netdev family.
# nft add table netdev filter
# nft add chain netdev filter ingress { type filter hook ingress device eth0 priority 0\; }
# nft add rule netdev filter ingress fwd to dummy0
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check for OP_EQ before removing a dependency, else we may zap wrong one,
changing the meaning of the rule.
Listing without patch:
ip protocol udp udp dport ssh
ip protocol udp udp dport ssh
counter packets 1 bytes 308 ip protocol udp udp dport ssh
With patch:
ip protocol != tcp udp dport ssh
ip protocol != udp udp dport ssh
ip protocol != tcp counter packets 1 bytes 308 udp dport ssh
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
old nft list:
mark set unknown unknown & 0xfff [invalid type] map { 3 : 0x00000017, 1 : 0x0000002a}
new:
mark set vlan id map { 3 : 0x00000017, 1 : 0x0000002a}
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
Just move the payload trim part to a separate function.
Next patch will add a second call site to deal with map ops
that use a lookup based on a binop result.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So far it was only possible to match packets under a rate limit. This
patch allows you to explicitly indicate whether you want to match packets
that go over the rate limit or stay within it (until), eg.
... limit rate over 3/second counter log prefix "OVERLIMIT: " drop
... limit rate over 3 mbytes/second counter log prefix "OVERLIMIT: " drop
... ct state invalid limit rate until 1/second counter log prefix "INVALID: "
When listing rate limit until, this shows:
... ct state invalid limit rate 1/second counter log prefix "INVALID: "
thus, the existing syntax is still valid (i.e. default to rate limit until).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
During delinearization we attempt to remove masks, for instance
ip saddr $x/32 (the mask matches the entire size).
However, in some special cases the lhs size is unknown (0); this
happens e.g. with
'ct saddr original 1.2.3.4/24', which had its '/24' chopped off.
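The check described above can be sketched roughly as follows. This is a minimal illustration, not the nftables code: the helper name `mask_is_redundant()` is hypothetical, and it works on plain 32-bit integers rather than on the arbitrary-precision values nft uses internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if `mask` covers every bit of a
 * left-hand side that is `lhs_len` bits wide (at most 32 here), in
 * which case the mask carries no information and can be dropped.
 * A length of 0 means "unknown", so the mask must be kept. */
static bool mask_is_redundant(uint32_t mask, unsigned int lhs_len)
{
	uint32_t full;

	assert(lhs_len <= 32);
	if (lhs_len == 0)
		return false;	/* size unknown: keep the mask */

	full = (lhs_len == 32) ? UINT32_MAX : ((UINT32_C(1) << lhs_len) - 1);
	return (mask & full) == full;
}
```

With an unknown length the helper always keeps the mask, which is the conservative behaviour a case like '/24' on a 'ct saddr' match needs.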
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
expr->len 0 can appear for some data types whose size can be different
based on some external state, e.g. the conntrack src/dst addresses.
The nft type is 'invalid/0-length' in the template definition, the
size is set (on linearization) based on the network base family,
i.e. the type is changed to ip or ipv6 address at a later stage.
For delinearization, skip zero-length expressions as concat type
and give expr_postprocess a chance to fix the types.
Without this change the previous patch will result in nft consuming all
available memory when trying to display e.g. a 'ct saddr' rule.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A few keys in the ct expression are directional, i.e.
we need to tell kernel if it should fetch REPLY or ORIGINAL direction.
Split ct_keys into ct_keys & ct_keys_dir, the latter are those keys
that the kernel rejects unless also given a direction.
During postprocessing we also need to invoke ct_expr_update_type,
problem is that e.g. ct saddr can be any family (ip, ipv6) so we need
to update the expected data type based on the network base.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
564b0e7c13f9 ("netlink_delinearize: postprocess expression before range
merge") crashes nft when the previous statement is removed via
payload_dependency_kill() as this pointer is not valid anymore.
Move the pointer to the previous statement to rule_pp_ctx and invalidate
it when required.
Reported-by: "Pablo M. Bermudo Garay" <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
Dependency statements go away after postprocess, so we should consider
them for possible range merges.
This problem was uncovered when adding support for sub-byte payload
ranges.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update bitfield definitions to match the way they are
expressed in RFC and IEEE specifications.
This required a small update to c3f0501 ("src: netlink_linearize:
handle sub-byte lengths").
From the linearize step, to calculate the shift based on the bitfield
offset, we need to obtain the length of the word rounded up to whole bytes:
len = round_up(expr->len, BITS_PER_BYTE);
Then, we subtract the offset bits and the bitfield length.
shift = len - (offset + expr->len);
From the delinearize, payload_expr_trim() needs to obtain the real
offset through:
off = round_up(mask->len, BITS_PER_BYTE) - mask_len;
For vlan id (offset 12), this gets the position of the last bit set in
the mask (ie. 12), then we subtract it from the length we fetch, rounded up
to whole bytes (16), so we obtain the real bitfield offset (4).
Then, we add that to the original payload offset that was expressed in
bytes:
payload_offset += off;
Note that payload_expr_trim() now also adjusts the payload expression to
its real length and offset so we don't need to propagate the mask
expression.
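The arithmetic above can be checked with a small sketch. The helper names `round_up_to()`, `bitfield_shift()` and `bitfield_offset()` are made up for illustration and are not the functions used in nft; `offset` is assumed to be the bit offset of the field inside the fetched word.

```c
#include <assert.h>

#define BITS_PER_BYTE 8

/* Round n up to the next multiple of b (b > 0). */
static unsigned int round_up_to(unsigned int n, unsigned int b)
{
	return (n + b - 1) / b * b;
}

/* Linearize side: right shift needed to bring a bitfield of `field_len`
 * bits, starting `offset` bits into its word, down to bit 0. */
static unsigned int bitfield_shift(unsigned int offset, unsigned int field_len)
{
	unsigned int len = round_up_to(field_len, BITS_PER_BYTE);

	/* The formula only holds when the field fits in the rounded word. */
	assert(offset + field_len <= len);
	return len - (offset + field_len);
}

/* Delinearize side: given the width in bits of the fetched word and the
 * position of the highest bit set in the mask (counted from 1), recover
 * the bit offset of the field inside the word. */
static unsigned int bitfield_offset(unsigned int word_len, unsigned int mask_len)
{
	return round_up_to(word_len, BITS_PER_BYTE) - mask_len;
}
```

For the vlan id (12 bits, 4 bits into a 16-bit word) `bitfield_shift(4, 12)` gives 0 and `bitfield_offset(16, 12)` gives 4, matching the numbers quoted above.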
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
| |
We have to clone the payload expression before attaching it to the lhs
of the relational expression, this payload expression is located at the
lhs of the binary operation that is released thereafter.
Fixes: 39f15c2 ("nft: support listing expressions that use non-byte header fields")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
| |
The conversion to the new libnftnl API has left a lot of indentation damage
in the netlink functions. Fix it up.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
| |
Add support for payload mangling using the payload statement. The syntax
is similar to the other data changing statements:
nft filter output tcp dport set 25
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
| |
The comment does not belong to the handle, it belongs to the rule.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Contrary to iptables, we use the asterisk character '*' as wildcard.
# nft --debug=netlink add rule test test iifname eth\*
ip test test
[ meta load iifname => reg 1 ]
[ cmp eq reg 1 0x00687465 ]
Note that this generates an optimized comparison without bitwise.
In case you want to match a device that contains an asterisk, you have
to escape the asterisk, ie.
# nft add rule test test iifname eth\\*
The wildcard string handling occurs from the evaluation step, where we
convert from:
        relational
         /    \
        /      \
     meta     value
    oifname   eth*
to:
        relational
         /    \
        /      \
     meta     prefix
    oifname
As Patrick suggested, this is not actually a wildcard but a prefix since it
only applies to the string when placed at the end.
More comments:
* This relaxes the left->size > right->size from netlink_parse_cmp()
for strings since the optimization that this patch applies may now
result in bogus errors.
* This patch can be later on extended to apply a similar optimization to
payload expressions when:
expr->len % BITS_PER_BYTE == 0
For meta and ct, the kernel checks for the exact length of the attributes
(it expects integer 32 bits) so we can't do it unless we relax that.
* Wildcard strings are not supported from sets and maps yet. Error
reporting is not very good at this stage since expr_evaluate_prefix()
doesn't have enough context (ctx->set is NULL, the set object is
currently created later after evaluating the lhs and rhs of the
relational). I'll be following up on this later.
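As a rough illustration of the prefix detection described in this message, the sketch below decides whether an interface-name pattern ends in an unescaped asterisk and, if so, how many bits have to be compared. The helper `ifname_prefix_bits()` is hypothetical and is not the code used by the evaluation step; it also ignores corner cases such as an escaped backslash in front of the asterisk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_BYTE 8

/* Hypothetical helper: inspect an interface-name pattern and decide
 * whether it ends in an unescaped '*'.
 *
 * Returns the prefix length in bits when the pattern is a prefix match
 * ("eth*" -> 24), or -1 when it must be matched as an exact string
 * (no trailing '*', or the '*' is escaped as "\*"). */
static int ifname_prefix_bits(const char *pattern)
{
	size_t len = strlen(pattern);

	if (len == 0 || pattern[len - 1] != '*')
		return -1;
	if (len >= 2 && pattern[len - 2] == '\\')
		return -1;	/* escaped asterisk: literal match */

	return (int)((len - 1) * BITS_PER_BYTE);
}
```

For "eth*" the prefix covers the three bytes of "eth", which is why a plain comparison on those bytes is enough and no bitwise operation is needed.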
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
This allows you to clone packets to a destination address, eg.
... dup to 172.20.0.2
... dup to 172.20.0.2 device eth1
... dup to ip saddr map { 192.168.0.2 : 172.20.0.2, ... } device eth1
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
| |
... limit rate 1024 mbytes/second burst 10240 bytes
... limit rate 1/second burst 3 packets
This parameter is optional.
You need a Linux kernel >= 4.3-rc1.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
| |
This example shows how to accept packets below the ratelimit:
... limit rate 1024 mbytes/second counter accept
You need a Linux kernel >= 4.3-rc1.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
|
|
|
|
|
|
|
| |
This allows listing rules that check fields that are not aligned on a byte
boundary.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently 'vlan id 42' or even 'vlan type ip' doesn't work since
we expect an ethernet header but get vlan.
So if we want to add another protocol header to the same base, we
attempt to figure out if the new header can fit on top of the existing
one (i.e. proto_find_num gives a protocol number when asking to find a
link between the two).
We also annotate protocol description for eth and vlan with the full
header size and track the offset from the current base.
Otherwise, 'vlan type ip' fetches the protocol field from mac header
offset 0, which is some mac address.
Instead, we must consider full size of ethernet header.
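The offset tracking can be pictured with the sketch below. The structure and helper are hypothetical and only illustrate the arithmetic; they assume an untagged ethernet header of 14 bytes and that the vlan type field sits 16 bits into the vlan header, after the tag control information.

```c
#include <assert.h>

#define BITS_PER_BYTE	8
#define ETH_HDR_BYTES	14	/* dst MAC (6) + src MAC (6) + ethertype (2) */

/* Hypothetical description of a field inside a header stacked on top
 * of another header at the same (link layer) base. */
struct stacked_field {
	unsigned int base_offset;	/* bits consumed by the lower headers */
	unsigned int field_offset;	/* bit offset inside its own header */
};

/* Absolute bit offset of the field from the start of the base. */
static unsigned int stacked_field_offset(const struct stacked_field *f)
{
	return f->base_offset + f->field_offset;
}
```

With `base_offset` set to `ETH_HDR_BYTES * BITS_PER_BYTE` the vlan type lands at bit 128 (byte 16) of the frame; with a base offset of 0 it would point into the destination mac address, which is the failure described above.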
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
|
|
|
|
|
|
|
| |
Adapt the nftables code to use the new symbols in libnftnl. This patch contains
quite some renaming to reserve the nft_ prefix for our high level library.
Explicitly request libnftnl 1.0.5 at configure stage.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Florian Westphal says:
09565a4b1ed4863d44c4509a93c50f44efd12771 ("netlink_delinearize: consolidate
range printing") causes nft to segfault on 32bit machine when printing l4proto
ranges.
The problem is that meta_expr_pctx_update() assumes that right is a value, but
after this change it can also be a range.
Thus, expr->value contents are undefined (it's a union). On x86_64 this is also
broken, but by virtue of struct layout and pointer sizes, value->_mp_size will
almost always be 0, so mpz_get_uint8() returns 0.
But on x86-32 _mp_size will be a huge value (it contains the expr->right pointer
of the range), so we crash in libgmp.
Pablo says:
We shouldn't call pctx_update(); before the transformation we had
there an expr->op == { OP_GT, OP_GTE, OP_LT, OP_LTE }. So we never
entered that path as the assert in payload_expr_pctx_update()
indicates.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Florian Westphal <fw@strlen.de>
|
|
|
|
| |
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When the RHS length differs from the LHS length (which is only the
first expression), both expressions are assumed to be concat expressions.
The LHS concat expression is reconstructed from the available register
values, advancing by the number of registers required by the subexpressions'
register space, until the RHS length has been reached.
The RHS concat expression is reconstructed by splitting the data value
into multiple subexpressions based on the LHS concat expressions types.
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Introduce a helper function to translate register numbers received from the
kernel from the compat values to the NFT_REG32 values.
Internally we use the register numbers 0-16:
* 0 is the verdict register in both old and new addressing modes.
* 1-16 are the 32 bit data registers
The NFT_REG32_00 values are mapped to 1-16; the NFT_REG_1-NFT_REG_4
values each use up 4 registers starting at 1 (1, 5, 9, 13).
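The mapping can be sketched as below. This is an illustration only: the function name `reg_to_internal()` is hypothetical, and the numeric register values are assumed to mirror the kernel's nf_tables uapi header; they are redefined under different names so the snippet stands alone.

```c
#include <assert.h>

/* Register numbers as exposed by the kernel. The values are assumed to
 * match the nf_tables uapi header and are redefined here under
 * different names only to keep the sketch self-contained. */
enum {
	REG_VERDICT	= 0,
	REG_COMPAT_1	= 1,	/* NFT_REG_1, 128 bit */
	REG_COMPAT_4	= 4,	/* NFT_REG_4, 128 bit */
	REG32_00	= 8,	/* NFT_REG32_00, 32 bit */
	REG32_15	= 23	/* NFT_REG32_15, 32 bit */
};

/* One 128 bit compat register spans four 32 bit registers. */
#define REGS32_PER_COMPAT 4

/* Translate a kernel register number to the internal 0-16 numbering. */
static unsigned int reg_to_internal(unsigned int reg)
{
	if (reg >= REG_COMPAT_1 && reg <= REG_COMPAT_4)
		return 1 + (reg - REG_COMPAT_1) * REGS32_PER_COMPAT;
	if (reg >= REG32_00 && reg <= REG32_15)
		return 1 + (reg - REG32_00);
	return reg;	/* the verdict register (0) is unchanged */
}
```

Both addressing modes end up in the same internal space, so NFT_REG_2 and NFT_REG32_04 both translate to internal register 5.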
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
before:
table ip filter {
chain test {
cpu { 67108864, 50331648, 33554432}
}
}
after:
table ip filter {
chain test {
cpu { 4, 3, 2 }
}
}
Related to 525323352904 ("expr: add set_elem_expr as container for set element
attributes").
We'll have to revisit this once we have support to use integer datatypes from
set declarations, see: http://patchwork.ozlabs.org/patch/480068/
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|\ \ \
| |_|/
|/| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch adds a routine to the postprocess stage to check if the previous
expression statement and the current actually represent a range, so we can
provide a more compact listing, eg.
# nft -nn list table test
table ip test {
chain test {
tcp dport 22
tcp dport 22-23
tcp dport != 22-23
ct mark != 0x00000016-0x00000017
ct mark 0x00000016-0x00000017
mark 0x00000016-0x00000017
mark != 0x00000016-0x00000017
}
}
To do so, the context state stores a pointer to the current statement. This
pointer needs to be invalidated in case the current statement is replaced.
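A simplified sketch of such a merge check is shown below. The types and the function `range_merge()` are hypothetical, values are plain integers, and the pairing of operators (">=" followed by "<=" for a range, "<" followed by ">" for a negated range) is an assumption about how the two comparisons arrive from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum rel_op { OP_EQ, OP_NEQ, OP_LT, OP_GT, OP_LTE, OP_GTE };

/* Hypothetical, flattened view of a relational expression statement. */
struct relational {
	int		lhs_id;	/* identifies the left-hand side, e.g. "tcp dport" */
	enum rel_op	op;
	uint32_t	value;
};

struct range_match {
	int		lhs_id;
	bool		negated;
	uint32_t	low, high;
};

/* Try to merge the previous and the current statement into one range.
 * Returns false, leaving *out untouched, when they do not form one. */
static bool range_merge(const struct relational *prev,
			const struct relational *cur,
			struct range_match *out)
{
	bool negated;

	if (prev == NULL || prev->lhs_id != cur->lhs_id)
		return false;

	if (prev->op == OP_GTE && cur->op == OP_LTE)
		negated = false;	/* x >= low, x <= high */
	else if (prev->op == OP_LT && cur->op == OP_GT)
		negated = true;		/* x < low, x > high */
	else
		return false;

	out->lhs_id = cur->lhs_id;
	out->negated = negated;
	out->low = prev->value;
	out->high = cur->value;
	return true;
}
```

Passing NULL for `prev` models the case where the pointer to the previous statement has been invalidated, so no merge is attempted.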
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This function encapsulates the payload expansion logic. This change is required
by the follow-up patch to consolidate range printing.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch is required by the range postprocess routine that comes in follow up
patches.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |/
| |
| |
| |
| |
| | |
Instead of a copy of the context variable.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The set statement is used to dynamically add or update elements in a set.
Syntax:
# nft filter input set add tcp dport @myset
# nft filter input set add ip saddr timeout 10s @myset
# nft filter input set update ip saddr timeout 10s @myset
Signed-off-by: Patrick McHardy <kaber@trash.net>
|