Diffstat (limited to 'doc/primary-expression.txt')
-rw-r--r-- doc/primary-expression.txt 139
1 files changed, 117 insertions, 22 deletions
diff --git a/doc/primary-expression.txt b/doc/primary-expression.txt
index 0316a7e1..782494bd 100644
--- a/doc/primary-expression.txt
+++ b/doc/primary-expression.txt
@@ -36,6 +36,13 @@ add such a rule, it will stop matching if the interface gets renamed and it
will match again in case interface gets deleted and later a new interface
with the same name is created.
+Like with iptables, wildcard matching on interface name prefixes is available for
+*iifname* and *oifname* matches by appending an asterisk (*) character. Note
+however that unlike iptables, nftables does not accept interface names
+consisting of the wildcard character only - users are supposed to just skip
+those always-matching expressions. To match on a literal asterisk
+character, escape it with a backslash (\).
+
.Meta expression types
[options="header"]
|==================
@@ -76,6 +83,12 @@ ifname
|oiftype|
Output interface hardware type|
iface_type
+|sdif|
+Slave device input interface index |
+iface_index
+|sdifname|
+Slave device interface name|
+ifname
|skuid|
UID associated with originating socket|
uid
@@ -110,7 +123,7 @@ integer (32 bit)
pseudo-random number|
integer (32 bit)
|ipsec|
-boolean|
+true if packet was ipsec encrypted |
boolean (1 bit)
|iifkind|
Input interface kind |
@@ -149,43 +162,54 @@ Device group (32 bit number). Can be specified numerically or as symbolic name d
Packet type: *host* (addressed to local host), *broadcast* (to all),
*multicast* (to group), *other* (addressed to another host).
|ifkind|
-Interface kind (16 byte string). Does not have to exist.
+Interface kind (16 byte string). See TYPES in ip-link(8) for a list.
|time|
Either an integer or a date in ISO format. For example: "2019-06-06 17:00".
Hour and seconds are optional and can be omitted if desired. If omitted,
midnight will be assumed.
The following three would be equivalent: "2019-06-06", "2019-06-06 00:00"
-and "2019-06-06 00:00:00".
+and "2019-06-06 00:00:00". Use a range expression such as
+"2019-06-06 10:00"-"2019-06-10 14:00" for matching a time range.
When an integer is given, it is assumed to be a UNIX timestamp.
|day|
Either a day of week ("Monday", "Tuesday", etc.), or an integer between 0 and 6.
Strings are matched case-insensitively, and a full match is not expected (e.g. "Mon" would match "Monday").
-When an integer is given, 0 is Sunday and 6 is Saturday.
+When an integer is given, 0 is Sunday and 6 is Saturday. Use a range expression
+such as "Monday"-"Wednesday" for matching a week day range.
|hour|
A string representing an hour in 24-hour format. Seconds can optionally be specified.
-For example, 17:00 and 17:00:00 would be equivalent.
+For example, 17:00 and 17:00:00 would be equivalent. Use a range expression such
+as "17:00"-"19:00" for matching a time range.
|=============================
.Using meta expressions
-----------------------
# qualified meta expression
filter output meta oif eth0
+filter forward meta iifkind { "tun", "veth" }
# unqualified meta expression
filter output oif eth0
-# packet was subject to ipsec processing
+# incoming packet was subject to ipsec processing
raw prerouting meta ipsec exists accept
+
+# match incoming packet from 03:00 to 14:00 local time
+raw prerouting meta hour "03:00"-"14:00" counter accept
-----------------------
SOCKET EXPRESSION
~~~~~~~~~~~~~~~~~
[verse]
-*socket* {*transparent* | *mark*}
+*socket* {*transparent* | *mark* | *wildcard*}
+*socket* *cgroupv2* *level* 'NUM'
Socket expression can be used to search for an existing open TCP/UDP socket and
its attributes that can be associated with a packet. It looks for an established
-or non-zero bound listening socket (possibly with a non-local address).
+or non-zero bound listening socket (possibly with a non-local address). You can
+also use it to match on the socket cgroupv2 at a given ancestor level, e.g. if
+the socket belongs to cgroupv2 'a/b', ancestor level 1 checks for a match on
+cgroup 'a' and ancestor level 2 checks for a match on cgroup 'b'.
.Available socket attributes
[options="header"]
@@ -195,22 +219,30 @@ or non-zero bound listening socket (possibly with a non-local address).
Value of the IP_TRANSPARENT socket option in the found socket. It can be 0 or 1.|
boolean (1 bit)
|mark| Value of the socket mark (SOL_SOCKET, SO_MARK). | mark
+|wildcard|
+Indicates whether the socket is wildcard-bound (e.g. 0.0.0.0 or ::0). |
+boolean (1 bit)
+|cgroupv2|
+cgroup version 2 for this socket (path from /sys/fs/cgroup)|
+cgroupv2
|==================
.Using socket expression
------------------------
-# Mark packets that correspond to a transparent socket
+# Mark packets that correspond to a transparent socket. "socket wildcard 0"
+# means that zero-bound listener sockets are NOT matched (which is usually
+# exactly what you want).
table inet x {
chain y {
- type filter hook prerouting priority -150; policy accept;
- socket transparent 1 mark set 0x00000001 accept
+ type filter hook prerouting priority mangle; policy accept;
+ socket transparent 1 socket wildcard 0 mark set 0x00000001 accept
}
}
# Trace packets that correspond to a socket with a mark value of 15
table inet x {
chain y {
- type filter hook prerouting priority -150; policy accept;
+ type filter hook prerouting priority mangle; policy accept;
socket mark 0x0000000f nftrace set 1
}
}
@@ -218,10 +250,18 @@ table inet x {
# Set packet mark to socket mark
table inet x {
chain y {
- type filter hook prerouting priority -150; policy accept;
+ type filter hook prerouting priority mangle; policy accept;
tcp dport 8080 mark set socket mark
}
}
+
+# Count packets for cgroupv2 "user.slice" at level 1
+table inet x {
+ chain y {
+ type filter hook input priority filter; policy accept;
+ socket cgroupv2 level 1 "user.slice" counter
+ }
+}
----------------------
OSF EXPRESSION
@@ -261,7 +301,7 @@ If no TTL attribute is passed, make a true IP header and fingerprint TTL true co
# Accept packets that match the "Linux" OS genre signature without comparing TTL.
table inet x {
chain y {
- type filter hook input priority 0; policy accept;
+ type filter hook input priority filter; policy accept;
osf ttl skip name "Linux"
}
}
@@ -305,13 +345,7 @@ If no route was found for the source address/input interface combination, the ou
In case the input interface is specified as part of the input key, the output interface index is always the same as the input interface index or zero.
If only 'saddr oif' is given, then oif can be any interface index or zero.
-In this example, 'saddr . iif' lookups up routing information based on the source address and the input interface.
-oif picks the output interface index from the routing information.
-If no route was found for the source address/input interface combination, the output interface index is zero.
-In case the input interface is specified as part of the input key, the output interface index is always the same as the input interface index or zero.
-If only 'saddr oif' is given, then oif can be any interface index or zero.
-
-# drop packets to address not configured on ininterface
+# drop packets to address not configured on incoming interface
filter prerouting fib daddr . iif type != { local, broadcast, multicast } drop
# perform lookup in a specific 'blackhole' table (0xdead, needs an appropriate ip rule)
@@ -355,13 +389,15 @@ Routing Realm (32 bit number). Can be specified numerically or as symbolic name
--------------------------
# IP family independent rt expression
filter output rt classid 10
-filter output rt ipsec missing
# IP family dependent rt expressions
ip filter output rt nexthop 192.168.0.1
ip6 filter output rt nexthop fd00::1
inet filter output rt ip nexthop 192.168.0.1
inet filter output rt ip6 nexthop fd00::1
+
+# outgoing packet will be encapsulated/encrypted by ipsec
+filter output rt ipsec exists
--------------------------
IPSEC EXPRESSIONS
@@ -397,3 +433,62 @@ ipv4_addr/ipv6_addr
Destination address of the tunnel|
ipv4_addr/ipv6_addr
|=================================
+
+*Note:* When using xfrm_interface, this expression is not usable in the output
+hook as the plain packet does not traverse it with IPsec info attached. Use a
+chain in the postrouting hook instead.
+
+NUMGEN EXPRESSION
+~~~~~~~~~~~~~~~~~
+
+[verse]
+*numgen* {*inc* | *random*} *mod* 'NUM' [ *offset* 'NUM' ]
+
+Create a number generator. The *inc* or *random* keywords control its
+operation mode: In *inc* mode, the last returned value is simply incremented.
+In *random* mode, a new random number is returned. The value after *mod*
+keyword specifies an upper boundary (read: modulus) which is not reached by
+returned numbers. The optional *offset* allows one to increment the returned value
+by a fixed offset.
+
+A typical use-case for *numgen* is load-balancing:
+
+.Using numgen expression
+------------------------
+# round-robin between 192.168.10.100 and 192.168.20.200:
+add rule nat prerouting dnat to numgen inc mod 2 map \
+ { 0 : 192.168.10.100, 1 : 192.168.20.200 }
+
+# probability-based with odd bias using intervals:
+add rule nat prerouting dnat to numgen random mod 10 map \
+ { 0-2 : 192.168.10.100, 3-9 : 192.168.20.200 }
+------------------------
+
+HASH EXPRESSIONS
+~~~~~~~~~~~~~~~~
+
+[verse]
+*jhash* {*ip saddr* | *ip6 daddr* | *tcp dport* | *udp sport* | *ether saddr*} [*.* ...] *mod* 'NUM' [ *seed* 'NUM' ] [ *offset* 'NUM' ]
+*symhash* *mod* 'NUM' [ *offset* 'NUM' ]
+
+Use a hashing function to generate a number. The functions available are
+*jhash*, known as Jenkins Hash, and *symhash*, for Symmetric Hash.
+*jhash* requires an expression to determine the parameters of the packet
+header to apply the hashing; concatenations are possible as well. The value
+after *mod* keyword specifies an upper boundary (read: modulus) which is
+not reached by returned numbers. The optional *seed* is used to specify an
+init value used as seed in the hashing function. The optional *offset*
+allows one to increment the returned value by a fixed offset.
+
+A typical use-case for *jhash* and *symhash* is load-balancing:
+
+.Using hash expressions
+------------------------
+# load balance based on source ip between 2 ip addresses:
+add rule nat prerouting dnat to jhash ip saddr mod 2 map \
+ { 0 : 192.168.10.100, 1 : 192.168.20.200 }
+
+# symmetric load balancing between 2 ip addresses:
+add rule nat prerouting dnat to symhash mod 2 map \
+ { 0 : 192.168.10.100, 1 : 192.168.20.200 }
+------------------------
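
The *mod* and *offset* arithmetic that the added NUMGEN and HASH sections describe can be modelled outside nftables. The Python below is only a rough sketch under stated assumptions, not the kernel implementation: the names `numgen_inc`, `numgen_random` and `hash_select` are hypothetical, the counter's starting value is chosen arbitrarily, and `zlib.crc32` merely stands in for the Jenkins hash. It illustrates that returned values stay below the modulus before the offset is added, and that a hash maps the same key to the same bucket every time.

```python
import itertools
import random
import zlib


def numgen_inc(mod, offset=0):
    """Model of 'numgen inc mod N [offset M]': a counter that wraps at mod.

    Assumption: this model starts counting at 0; the real starting value
    is not stated in the documentation above.
    """
    for counter in itertools.count():
        yield offset + counter % mod


def numgen_random(mod, offset=0, rng=random):
    """Model of 'numgen random mod N [offset M]': a value in [offset, offset + mod)."""
    return offset + rng.randrange(mod)


def hash_select(key, mod, seed=0, offset=0):
    """Model of a hash expression with mod/seed/offset.

    crc32 is only a stand-in for the Jenkins hash; what matters here is
    that equal keys always land in the same bucket.
    """
    return offset + zlib.crc32(key, seed) % mod


def pick_interval(value, intervals):
    """Look up value in a list of (low, high, target) inclusive intervals."""
    for low, high, target in intervals:
        if low <= value <= high:
            return target
    raise KeyError(value)


if __name__ == "__main__":
    backends = {0: "192.168.10.100", 1: "192.168.20.200"}

    # Round-robin: consecutive values alternate between the two map keys.
    gen = numgen_inc(2)
    print([backends[next(gen)] for _ in range(4)])

    # Weighted choice: values 0-2 and 3-9 split traffic roughly 30/70.
    weighted = [(0, 2, "192.168.10.100"), (3, 9, "192.168.20.200")]
    print(pick_interval(numgen_random(10), weighted))

    # Hash-based: the same source address always selects the same backend.
    print(backends[hash_select(b"10.0.0.1", 2)])
```

With `offset` set, every returned value is shifted by that constant, so a map keyed on the result has to use the shifted keys (for example `10` and `11` for `mod 2 offset 10`).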