VERDICT STATEMENT
~~~~~~~~~~~~~~~~~
The verdict statement alters control flow in the ruleset and issues policy decisions for packets.

[verse]
{*accept* | *drop* | *queue* | *continue* | *return*}
{*jump* | *goto*} 'chain'

*accept* and *drop* are absolute verdicts -- they terminate ruleset evaluation immediately.

[horizontal]
*accept*:: Terminate ruleset evaluation and accept the packet.
The packet can still be dropped later by another hook. For instance, a packet
accepted in the forward hook can still be dropped in the postrouting hook, or in
another forward base chain that has a higher priority number and is therefore
evaluated later in the processing pipeline.
*drop*:: Terminate ruleset evaluation and drop the packet.
The drop takes effect immediately; no further chains or hooks are evaluated.
The packet cannot be accepted again in a later chain, because later chains
are no longer evaluated for it.
*queue*:: Terminate ruleset evaluation and queue the packet to userspace.
Userspace must provide a drop or accept verdict.  In case of accept, processing
resumes with the next base chain hook, not the rule following the queue verdict.
*continue*:: Continue ruleset evaluation with the next rule. This
 is the default behaviour in case a rule issues no verdict.
*return*:: Return from the current chain and continue evaluation at the
 next rule in the calling chain. If issued in a base chain, it is equivalent to the
 base chain policy.
*jump* 'chain':: Continue evaluation at the first rule in 'chain'. The current
 position in the ruleset is pushed to a call stack, and evaluation continues
 there once the new chain has been entirely evaluated or a *return* verdict is issued.
 If a rule in the chain issues an absolute verdict, ruleset evaluation
 terminates immediately and the specific action is taken.
*goto* 'chain':: Similar to *jump*, but the current position is not pushed to the
 call stack. After the new chain has been evaluated, evaluation continues at the
 last chain on the call stack rather than in the chain containing the goto statement.

.Using verdict statements
-------------------
# process packets from eth0 and the internal network in from_lan
# chain, drop all packets from eth0 with different source addresses.

filter input iif eth0 ip saddr 192.168.0.0/24 jump from_lan
filter input iif eth0 drop
-------------------
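
The difference between *jump*, *goto* and *return* can be illustrated with a
small ruleset. This is only a minimal sketch: the table, chain and interface
names as well as the addresses are assumptions chosen for illustration.

.Sketch: jump, goto and return in one ruleset
-------------------
table ip filter {
	chain from_lan {
		# hand this host back to the rule following the jump in "input"
		ip saddr 192.168.0.1 return
		tcp dport 22 accept
		# end of chain: evaluation resumes after the jump in "input"
	}

	chain log_drop {
		log prefix "dropped: "
		drop
	}

	chain input {
		type filter hook input priority 0; policy accept;
		iifname "eth0" ip saddr 192.168.0.0/24 jump from_lan
		# goto pushes no return position, so "input" is not resumed
		iifname "eth0" goto log_drop
		# only reached by packets that did not arrive on eth0
		counter
	}
}
-------------------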

PAYLOAD STATEMENT
~~~~~~~~~~~~~~~~~
[verse]
'payload_expression' *set* 'value'

The payload statement alters packet content. It can be used for example to
set ip DSCP (diffserv) header field or ipv6 flow labels.

.route some packets instead of bridging
---------------------------------------
# redirect tcp:http from 192.168.0.0/16 to local machine for routing instead of bridging
# assumes 00:11:22:33:44:55 is local MAC address.
bridge input meta iif eth0 ip saddr 192.168.0.0/16 tcp dport 80 meta pkttype set unicast ether daddr set 00:11:22:33:44:55
---------------------------------------

.Set IPv4 DSCP header field
---------------------------
ip forward ip dscp set 42
---------------------------

EXTENSION HEADER STATEMENT
~~~~~~~~~~~~~~~~~~~~~~~~~~
[verse]
'extension_header_expression' *set* 'value'

The extension header statement alters packet content in variable-sized headers.
This can currently be used to alter the TCP Maximum segment size of packets,
similar to the TCPMSS target in iptables.

.change tcp mss
---------------
tcp flags syn tcp option maxseg size set 1360
# set a size based on route information:
tcp flags syn tcp option maxseg size set rt mtu
---------------

LOG STATEMENT
~~~~~~~~~~~~~
[verse]
*log* [*prefix* 'quoted_string'] [*level* 'syslog-level'] [*flags* 'log-flags']
*log* *group* 'nflog_group' [*prefix* 'quoted_string'] [*queue-threshold* 'value'] [*snaplen* 'size']
*log level audit*

The log statement enables logging of matching packets. When this statement is
used from a rule, the Linux kernel will print some information on all matching
packets, such as header fields, via the kernel log (where it can be read with
dmesg(1) or read in the syslog).

In the second form of invocation (if 'nflog_group' is specified), the Linux
kernel will pass the packet to nfnetlink_log which will multicast the packet
through a netlink socket to the specified multicast group. One or more userspace
processes may subscribe to the group to receive the packets, see
libnetfilter_log documentation for details.

In the third form of invocation (if level audit is specified), the Linux
kernel writes a message into the audit buffer suitably formatted for reading
with auditd. Therefore no further formatting options (such as prefix or flags)
are allowed in this mode.

This is a non-terminating statement, so the rule evaluation continues after
the packet is logged.

.log statement options
[options="header"]
|==================
|Keyword | Description | Type
|prefix|
Log message prefix|
quoted string
|level|
Syslog level of logging |
string: emerg, alert, crit, err, warn [default], notice, info, debug, audit
|group|
NFLOG group to send messages to|
unsigned integer (16 bit)
|snaplen|
Length of packet payload to include in netlink message |
unsigned integer (32 bit)
|queue-threshold|
Number of packets to queue inside the kernel before sending them to userspace |
unsigned integer (32 bit)
|==================================

.log-flags
[options="header"]
|==================
| Flag | Description
|tcp sequence|
Log TCP sequence numbers.
|tcp options|
Log options from the TCP packet header.
|ip options|
Log options from the IP/IPv6 packet header.
|skuid|
Log the userid of the process which generated the packet.
|ether|
Decode MAC addresses and protocol.
|all|
Enable all log flags listed above.
|==============================

.Using log statement
--------------------
# log the UID which generated the packet and ip options
ip filter output log flags skuid flags ip options

# log the tcp sequence numbers and tcp options from the TCP packet
ip filter output log flags tcp sequence,options

# enable all supported log flags
ip6 filter output log flags all
-----------------------
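
The three forms of invocation described above can be sketched as follows. The
prefix strings, the group number and the snaplen value are assumptions picked
for illustration.

.Sketch: the three forms of the log statement
--------------------
# first form: kernel log, with a prefix and an explicit syslog level
ip filter input tcp dport 22 log prefix "ssh: " level info

# second form: pass the first 128 bytes of each packet to NFLOG group 2
ip filter input log group 2 prefix "nflog: " snaplen 128

# third form: write to the audit buffer, no further options allowed
ip filter output log level audit
--------------------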

REJECT STATEMENT
~~~~~~~~~~~~~~~~
[verse]
____
*reject* [ *with* 'REJECT_WITH' ]

'REJECT_WITH' := *icmp type* 'icmp_code' |
                 *icmpv6 type* 'icmpv6_code' |
                 *icmpx type* 'icmpx_code' |
                 *tcp reset*
____

A reject statement sends back an error packet in response to the matched
packet. Otherwise it is equivalent to drop, so it is a terminating statement
that ends rule traversal. This statement is only valid in the input,
forward and output chains, and in user-defined chains which are only called
from those chains.

.different ICMP reject variants are meant for use in different table families
[options="header"]
|==================
|Variant |Family | Type
|icmp|
ip|
icmp_code
|icmpv6|
ip6|
icmpv6_code
|icmpx|
inet|
icmpx_code
|==================

For a description of the different types and a list of supported keywords refer
to the DATA TYPES section above. The common default reject value is
*port-unreachable*. +

Note that in the bridge family, the reject statement is only allowed in base
chains which hook into input or prerouting.
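
The variants from the table above might be used as in the following sketch.
The table and chain names and the port numbers are assumptions for
illustration only.

.Sketch: reject variants per family
--------------------
# default reject value
ip filter input udp dport 9999 reject

# answer TCP connection attempts with a TCP RST
ip filter input tcp dport 23 reject with tcp reset

# family-specific ICMP error codes
ip filter input reject with icmp type host-unreachable
ip6 filter input reject with icmpv6 type admin-prohibited

# icmpx covers both IPv4 and IPv6 in the inet family
inet filter input reject with icmpx type admin-prohibited
--------------------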

COUNTER STATEMENT
~~~~~~~~~~~~~~~~~
A counter statement sets the hit count of packets along with the number of bytes.

[verse]
*counter* *packets* 'number' *bytes* 'number'
*counter* { *packets* 'number' | *bytes* 'number' }
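
A short sketch of both forms follows. The table and chain names and the ports
are assumptions for illustration.

.Sketch: using the counter statement
--------------------
# count packets and bytes of incoming HTTP traffic, then accept it
ip filter input tcp dport 80 counter accept

# same, but with explicit initial values for both counters
ip filter input tcp dport 443 counter packets 0 bytes 0 accept
--------------------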

CONNTRACK STATEMENT
~~~~~~~~~~~~~~~~~~~
The conntrack statement can be used to set the conntrack mark and conntrack labels.

[verse]
*ct* {*mark* | *event* | *label* | *zone*} *set* 'value'

The ct statement sets metadata associated with a connection. The zone id
has to be assigned before a conntrack lookup takes place, i.e. this has to be
done in prerouting and possibly output (if locally generated packets need to be
placed in a distinct zone), with a hook priority of -300.

.Conntrack statement types
[options="header"]
|==================
|Keyword| Description| Value
|event|
conntrack event bits |
bitmask, integer (32 bit)
|helper|
name of ct helper object to assign to the connection |
quoted string
|mark|
Connection tracking mark |
mark
|label|
Connection tracking label|
label
|zone|
conntrack zone|
integer (16 bit)
|==================

.save packet nfmark in conntrack
--------------------------------
ct mark set meta mark
--------------------------------

.set zone mapped via interface
------------------------------
table inet raw {
  chain prerouting {
      type filter hook prerouting priority -300;
      ct zone set iif map { "eth1" : 1, "veth1" : 2 }
  }
  chain output {
      type filter hook output priority -300;
      ct zone set oif map { "eth1" : 1, "veth1" : 2 }
  }
}
------------------------------------------------------

.restrict events reported by ctnetlink
--------------------------------------
ct event set new,related,destroy
--------------------------------------

META STATEMENT
~~~~~~~~~~~~~~
A meta statement sets the value of a meta expression. The existing meta fields
are: priority, mark, pkttype, nftrace. +

[verse]
*meta* {*mark* | *priority* | *pkttype* | *nftrace*} *set* 'value'

A meta statement sets metadata associated with a packet. +

.Meta statement types
[options="header"]
|==================
|Keyword| Description| Value
|priority |
TC packet priority|
tc_handle
|mark|
Packet mark |
mark
|pkttype |
packet type |
pkt_type
|nftrace |
ruleset packet tracing on/off. Use *monitor trace* command to watch traces|
0, 1
|==========================
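
Two of the keywords from the table above are shown in the following sketch.
The table and chain names, the ports and the mark value are assumptions for
illustration.

.Sketch: setting the packet mark and enabling tracing
--------------------
# set the packet mark on outgoing HTTP traffic
ip filter output tcp dport 80 meta mark set 0x2a

# enable ruleset tracing for incoming SSH packets;
# the traces can then be watched with: nft monitor trace
ip filter input tcp dport 22 meta nftrace set 1
--------------------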

LIMIT STATEMENT
~~~~~~~~~~~~~~~
[verse]
____
*limit rate* [*over*] 'packet_number' */* 'TIME_UNIT' [*burst* 'packet_number' *packets*]
*limit rate* [*over*] 'byte_number' 'BYTE_UNIT' */* 'TIME_UNIT' [*burst* 'byte_number' 'BYTE_UNIT']

'TIME_UNIT' := *second* | *minute* | *hour* | *day*
'BYTE_UNIT' := *bytes* | *kbytes* | *mbytes*
____

A limit statement matches at a limited rate using a token bucket filter. A rule
using this statement will match until this limit is reached. It can be used in
combination with the log statement to give limited logging. The optional
*over* keyword inverts this: the rule then matches only traffic that exceeds
the specified rate.

.limit statement values
[options="header"]
|==================
|Value | Description | Type
|packet_number |
Number of packets |
unsigned integer (32 bit)
|byte_number |
Number of bytes |
unsigned integer (32 bit)
|========================
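
The following sketch shows packet-based and byte-based limits, with and
without the *over* keyword. The table and chain names, the rates and the log
prefix are assumptions for illustration.

.Sketch: using the limit statement
--------------------
# accept at most 10 ICMP echo requests per second, drop the excess
ip filter input icmp type echo-request limit rate 10/second accept
ip filter input icmp type echo-request drop

# limited logging: at most 5 log lines per minute, with a burst of 10 packets
ip filter input limit rate 5/minute burst 10 packets log prefix "input: "

# drop everything that exceeds 10 mbytes per second
ip filter input limit rate over 10 mbytes/second drop
--------------------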

NAT STATEMENTS
~~~~~~~~~~~~~~
[verse]
____
*snat to* 'address' [*:*'port'] ['PRF_FLAGS']
*snat to* 'address' *-* 'address' [*:*'port' *-* 'port'] ['PRF_FLAGS']
*snat* { *ip* | *ip6* } *to* 'address' *-* 'address' [*:*'port' *-* 'port'] ['PR_FLAGS']
*dnat to* 'address' [*:*'port'] ['PRF_FLAGS']
*dnat to* 'address' [*:*'port' *-* 'port'] ['PR_FLAGS']
*dnat* { *ip* | *ip6* } *to* 'address' [*:*'port' *-* 'port'] ['PR_FLAGS']
*masquerade to* [*:*'port'] ['PRF_FLAGS']
*masquerade to* [*:*'port' *-* 'port'] ['PRF_FLAGS']
*redirect to* [*:*'port'] ['PRF_FLAGS']
*redirect to* [*:*'port' *-* 'port'] ['PRF_FLAGS']

'PRF_FLAGS' := 'PRF_FLAG' [*,* 'PRF_FLAGS']
'PR_FLAGS'  := 'PR_FLAG' [*,* 'PR_FLAGS']
'PRF_FLAG'  := 'PR_FLAG' | *fully-random*
'PR_FLAG'   := *persistent* | *random*
____

The nat statements are only valid from nat chain types. +

The *snat* and *masquerade* statements specify that the source address of the
packet should be modified. While *snat* is only valid in the postrouting and
input chains, *masquerade* makes sense only in postrouting. The *dnat* and
*redirect* statements specify that the destination address of the packet should
be modified; they are only valid in the prerouting and output chains. You can
also use non-base chains which are called from base chains of nat chain type.
All future packets in this connection will also be mangled, and rules should
cease being examined.

The *masquerade* statement is a special form of snat which always uses the
outgoing interface's IP address to translate to. It is particularly useful on
gateways with dynamic (public) IP addresses.

The *redirect* statement is a special form of dnat which always translates the
destination address to the local host's one. It comes in handy if one only wants
to alter the destination port of incoming traffic on different interfaces.

When used in the inet family (available with kernel 5.2), the dnat and snat
statements require the use of the ip or ip6 keyword in case an address is
provided, see the examples below.

Before kernel 4.18, nat statements required both prerouting and postrouting base
chains to be present, since otherwise packets on the return path would not be
seen by netfilter and therefore no reverse translation would take place.

.NAT statement values
[options="header"]
|==================
|Expression| Description| Type
|address|
Specifies that the source/destination address of the packet should be modified.
You may specify a mapping to relate a list of tuples composed of arbitrary
expression key with address value. |
ipv4_addr, ipv6_addr, e.g. abcd::1234, or you can use a mapping, e.g. meta mark map { 10 : 192.168.1.2, 20 : 192.168.1.3 }
|port|
Specifies that the source/destination port of the packet should be modified. |
port number (16 bit)
|===============================

.NAT statement flags
[options="header"]
|==================
|Flag| Description
|persistent |
Gives a client the same source-/destination-address for each connection.
|random|
In kernel 5.0 and newer this is the same as fully-random.
In earlier kernels the port mapping will be randomized using a seeded MD5
hash mix using source and destination address and destination port.

|fully-random|
If used then port mapping is generated based on a 32-bit pseudo-random algorithm.
|=============================

.Using NAT statements
---------------------
# create a suitable table/chain setup for all further examples
add table nat
add chain nat prerouting { type nat hook prerouting priority 0; }
add chain nat postrouting { type nat hook postrouting priority 100; }

# translate source addresses of all packets leaving via eth0 to address 1.2.3.4
add rule nat postrouting oif eth0 snat to 1.2.3.4

# redirect all traffic entering via eth0 to destination address 192.168.1.120
add rule nat prerouting iif eth0 dnat to 192.168.1.120

# translate source addresses of all packets leaving via eth0 to whatever
# locally generated packets would use as source to reach the same destination
add rule nat postrouting oif eth0 masquerade

# redirect incoming TCP traffic for port 22 to port 2222
add rule nat prerouting tcp dport 22 redirect to :2222

# inet family:
# handle ip dnat:
add rule inet nat prerouting dnat ip to 10.0.2.99
# handle ip6 dnat:
add rule inet nat prerouting dnat ip6 to fe80::dead
# this masquerades both ipv4 and ipv6:
add rule inet nat postrouting meta oif ppp0 masquerade

------------------------
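
Port ranges and flags can be added to these statements as in the sketch below.
It reuses the table and chains created above; the interface, the ports and the
port range are assumptions for illustration. A transport protocol match is
included in the first rule because a port mapping is applied to it.

.Sketch: port range and flags with masquerade
---------------------
# masquerade outgoing HTTP traffic, mapping source ports into 1024-2048
add rule nat postrouting oif eth0 tcp dport 80 masquerade to :1024-2048

# masquerade with randomized port mapping and persistent address selection
add rule nat postrouting oif eth0 masquerade random,persistent
---------------------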

TPROXY STATEMENT
~~~~~~~~~~~~~~~~
Tproxy redirects the packet to a local socket without changing the packet header
in any way. If any of the arguments is missing, the data of the incoming packet
is used as parameter. A rule using tproxy also requires a match that ensures
the presence of a transport protocol header.

[verse]
*tproxy to* 'address'*:*'port'
*tproxy to* {'address' | *:*'port'}

This syntax can be used in *ip/ip6* tables where network layer protocol is
obvious. Either IP address or port can be specified, but at least one of them is
necessary.

[verse]
*tproxy* {*ip* | *ip6*} *to* 'address'[*:*'port']
*tproxy to :*'port'

This syntax can be used in *inet* tables. The *ip/ip6* parameter defines the
family the rule will match. The *address* parameter must be of this family.
When only *port* is defined, the address family should not be specified. In
this case the rule will match for both families.

.tproxy attributes
[options="header"]
|=================
| Name | Description
| address | IP address the listening socket with IP_TRANSPARENT option is bound to.
| port | Port the listening socket with IP_TRANSPARENT option is bound to.
|=================

.Example ruleset for tproxy statement
-------------------------------------
table ip x {
    chain y {
        type filter hook prerouting priority -150; policy accept;
        tcp dport ntp tproxy to 1.1.1.1
        udp dport ssh tproxy to :2222
    }
}
table ip6 x {
    chain y {
       type filter hook prerouting priority -150; policy accept;
       tcp dport ntp tproxy to [dead::beef]
       udp dport ssh tproxy to :2222
    }
}
table inet x {
    chain y {
        type filter hook prerouting priority -150; policy accept;
        tcp dport 321 tproxy to :ssh
        tcp dport 99 tproxy ip to 1.1.1.1:999
        udp dport 155 tproxy ip6 to [dead::beef]:smux
    }
}
-------------------------------------

SYNPROXY STATEMENT
~~~~~~~~~~~~~~~~~~
This statement processes the TCP three-way handshake in parallel in netfilter
context in order to protect either the local system or a backend system. This
statement requires connection tracking because sequence numbers need to be
translated.

[verse]
*synproxy* [*mss* 'mss_value'] [*wscale* 'wscale_value'] ['SYNPROXY_FLAGS']

.synproxy statement attributes
[options="header"]
|=================
| Name | Description
| mss | Maximum segment size announced to clients. This must match the backend.
| wscale | Window scale announced to clients. This must match the backend.
|=================

.synproxy statement flags
[options="header"]
|=================
| Flag | Description
| sack-perm |
Pass client selective acknowledgement option to backend (will be disabled if
not present).
| timestamp |
Pass client timestamp option to backend (will be disabled if not present, also
needed for selective acknowledgement and window scaling).
|=================

.Example ruleset for synproxy statement
---------------------------------------
Determine tcp options used by backend, from an external system

              tcpdump -pni eth0 -c 1 'tcp[tcpflags] == (tcp-syn|tcp-ack)'
                  port 80 &
              telnet 192.0.2.42 80
              18:57:24.693307 IP 192.0.2.42.80 > 192.0.2.43.48757:
                  Flags [S.], seq 360414582, ack 788841994, win 14480,
                  options [mss 1460,sackOK,
                  TS val 1409056151 ecr 9690221,
                  nop,wscale 9],
                  length 0

Switch tcp_loose mode off, so conntrack will mark out-of-flow packets as state INVALID.

              echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose

Make SYN packets untracked.

	table ip x {
		chain y {
			type filter hook prerouting priority raw; policy accept;
			tcp flags syn notrack
		}
	}

Catch UNTRACKED (SYN packets) and INVALID (3WHS ACK packets) states and send
them to SYNPROXY. This rule will respond to SYN packets with SYN+ACK
syncookies, create ESTABLISHED for valid client response (3WHS ACK packets) and
drop incorrect cookies. Flag combinations not expected during 3WHS (e.g.
SYN+FIN, SYN+ACK) will not match, and evaluation continues. Finally, drop
invalid packets; these are out-of-flow packets that were not matched by SYNPROXY.

	table ip x {
		chain z {
			type filter hook input priority filter; policy accept;
			ct state { invalid, untracked } synproxy mss 1460 wscale 9 timestamp sack-perm
			ct state invalid drop
		}
	}

The ruleset resulting from the steps above should be similar to the one below.

	table ip x {
		chain y {
			type filter hook prerouting priority raw; policy accept;
			tcp flags syn notrack
		}

		chain z {
			type filter hook input priority filter; policy accept;
			ct state { invalid, untracked } synproxy mss 1460 wscale 9 timestamp sack-perm
			ct state invalid drop
		}
	}
---------------------------------------

FLOW STATEMENT
~~~~~~~~~~~~~~
A flow statement selects the flows whose forwarding is to be accelerated by
bypassing the layer 3 network stack. You have to specify the name of the
flowtable to which the flow is to be offloaded.

[verse]
*flow add @*'flowtable'
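
A flowtable has to be declared before it can be referenced. The following is a
minimal sketch; the table, flowtable and chain names and the devices are
assumptions for illustration.

.Sketch: offloading TCP and UDP flows to a flowtable
--------------------
table inet x {
	flowtable f {
		hook ingress priority 0; devices = { eth0, eth1 };
	}

	chain forward {
		type filter hook forward priority 0; policy accept;
		# add TCP and UDP flows to flowtable f
		meta l4proto { tcp, udp } flow add @f
		# counts only packets that still traverse this chain
		counter
	}
}
--------------------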

QUEUE STATEMENT
~~~~~~~~~~~~~~~
This statement passes the packet to userspace using the nfnetlink_queue handler.
The packet is put into the queue identified by its 16-bit queue number.
Userspace can inspect and modify the packet if desired. Userspace must then drop
or re-inject the packet into the kernel. See libnetfilter_queue documentation
for details.

[verse]
____
*queue* [*num* 'queue_number'] [*bypass*]
*queue* [*num* 'queue_number_from' - 'queue_number_to'] ['QUEUE_FLAGS']

'QUEUE_FLAGS' := 'QUEUE_FLAG' [*,* 'QUEUE_FLAGS']
'QUEUE_FLAG'  := *bypass* | *fanout*
____


.queue statement values
[options="header"]
|==================
|Value | Description | Type
|queue_number |
Sets queue number, default is 0. |
unsigned integer (16 bit)
|queue_number_from |
Sets initial queue in the range, if fanout is used. |
unsigned integer (16 bit)
|queue_number_to |
Sets closing queue in the range, if fanout is used. |
unsigned integer (16 bit)
|=====================

.queue statement flags
[options="header"]
|==================
|Flag | Description
|bypass |
Let packets go through if no userspace application is listening on the queue,
instead of dropping them. Before using this flag, read libnetfilter_queue
documentation for performance tuning recommendations.
|fanout |
Distribute packets between several queues.
|===============================
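
The values and flags above might be combined as in the following sketch. The
table and chain names, the ports and the queue numbers are assumptions for
illustration.

.Sketch: using the queue statement
--------------------
# send incoming HTTP packets to queue 1
ip filter input tcp dport 80 queue num 1

# same for HTTPS on queue 2, but let packets pass if nobody listens
ip filter input tcp dport 443 queue num 2 bypass

# distribute forwarded packets between queues 4 to 7
ip filter forward queue num 4-7 bypass,fanout
--------------------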

DUP STATEMENT
~~~~~~~~~~~~~
The dup statement is used to duplicate a packet and send the copy to a different
destination.

[verse]
*dup to* 'device'
*dup to* 'address' *device* 'device'

.Dup statement values
[options="header"]
|==================
|Expression | Description | Type
|address |
Specifies that the copy of the packet should be sent to a new gateway.|
ipv4_addr, ipv6_addr, e.g. abcd::1234, or you can use a mapping, e.g. ip saddr map { 192.168.1.2 : 10.1.1.1 }
|device |
Specifies that the copy should be transmitted via device. |
string
|===================


.Using the dup statement
------------------------
# send to machine with ip address 10.2.3.4 on eth0
ip filter forward dup to 10.2.3.4 device "eth0"

# copy raw frame to another interface
netdev ingress dup to "eth0"
dup to "eth0"

# combine with map dst addr to gateways
dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" }
------------------------

FWD STATEMENT
~~~~~~~~~~~~~
The fwd statement is used to redirect a raw packet to another interface. It is
only available in the netdev family ingress hook. It is similar to the dup
statement except that no copy is made.

*fwd to* 'device'
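
Since the statement is bound to the netdev ingress hook, a complete example
needs a base chain attached to a device. The following is a minimal sketch; the
table and chain names, the devices and the port are assumptions for
illustration.

.Sketch: forwarding raw packets to another interface
--------------------
table netdev x {
	chain y {
		type filter hook ingress device eth0 priority 0; policy accept;
		# hand matching packets received on eth0 over to eth1, without a copy
		udp dport 4789 fwd to "eth1"
	}
}
--------------------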

SET STATEMENT
~~~~~~~~~~~~~
The set statement is used to dynamically add or update elements in a set from
the packet path. The set 'setname' must already exist in the given table and must
have been created with one or both of the dynamic and the timeout flags. The
dynamic flag is required if the set statement expression includes a stateful
object. The timeout flag is implied if the set is created with a timeout, and is
required if the set statement updates elements, rather than adding them.
Furthermore, these sets should specify a maximum set size (to prevent
memory exhaustion), and their elements should have a timeout (so their number
will not grow indefinitely), either from the set definition or from the statement
that adds or updates them. The set statement can be used to e.g. create dynamic
blacklists.

[verse]
{*add* | *update*} *@*'setname' *{* 'expression' [*timeout* 'timeout'] [*comment* 'string'] *}*

.Example for simple blacklist
-----------------------------
# declare a set, bound to table "filter", in family "ip". Timeout and size are mandatory because we will add elements from packet path.
nft add set ip filter blackhole "{ type ipv4_addr; flags timeout; size 65536; }"

# whitelist internal interface.
nft add rule ip filter input meta iifname "internal" accept

# drop packets coming from blacklisted ip addresses.
nft add rule ip filter input ip saddr @blackhole counter drop

# add source ip addresses to the blacklist if more than 10 tcp connection requests occurred per second and ip address.
# entries will timeout after one minute, after which they might be re-added if limit condition persists.
nft add rule ip filter input tcp flags syn tcp dport ssh meter flood size 128000 { ip saddr timeout 10s limit rate over 10/second} add @blackhole { ip saddr timeout 1m } drop

# inspect state of the rate limit meter:
nft list meter ip filter flood

# inspect content of blackhole:
nft list set ip filter blackhole

# manually add two addresses to the set:
nft add element filter blackhole { 10.2.3.4, 10.23.1.42 }
-----------------------------------------------
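
The *update* form can be sketched in ruleset file syntax as follows. The table,
set and chain names, the timeout and the size are assumptions for illustration.
The timeout comes from the set definition here, so the statement itself does
not specify one.

.Sketch: tracking recent SSH clients with update
-----------------------------
table ip filter {
	set recent_ssh {
		type ipv4_addr
		flags dynamic,timeout
		timeout 5m
		size 65536
	}

	chain input {
		type filter hook input priority 0; policy accept;
		# insert the source address, or refresh its timeout if already present
		tcp dport 22 ct state new update @recent_ssh { ip saddr }
	}
}
-----------------------------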

MAP STATEMENT
~~~~~~~~~~~~~
The map statement is used to look up data based on some specific input key.

[verse]
____
'expression' *map* *{* 'MAP_ELEMENTS' *}*

'MAP_ELEMENTS' := 'MAP_ELEMENT' [*,* 'MAP_ELEMENTS']
'MAP_ELEMENT'  := 'key' *:* 'value'
____

The 'key' is a value returned by 'expression'.
// XXX: Write about where map statement can be used (list of statements?)

.Using the map statement
------------------------
# select DNAT target based on TCP dport:
# connections to port 80 are redirected to 192.168.1.100,
# connections to port 8888 are redirected to 192.168.1.101
nft add rule ip nat prerouting dnat to tcp dport map { 80 : 192.168.1.100, 8888 : 192.168.1.101 }

# source address based SNAT:
# packets from net 192.168.1.0/24 will appear as originating from 10.0.0.1,
# packets from net 192.168.2.0/24 will appear as originating from 10.0.0.2
nft add rule ip nat postrouting snat to ip saddr map { 192.168.1.0/24 : 10.0.0.1, 192.168.2.0/24 : 10.0.0.2 }
------------------------

VMAP STATEMENT
~~~~~~~~~~~~~~
The verdict map (vmap) statement works analogously to the map statement, but
contains verdicts as values.

[verse]
____
'expression' *vmap* *{* 'VMAP_ELEMENTS' *}*

'VMAP_ELEMENTS' := 'VMAP_ELEMENT' [*,* 'VMAP_ELEMENTS']
'VMAP_ELEMENT'  := 'key' *:* 'verdict'
____

.Using the vmap statement
-------------------------
# jump to different chains depending on layer 4 protocol type:
nft add rule ip filter input ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain , icmp : jump icmp-chain }
------------------------
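
Absolute verdicts and jumps can be mixed in one verdict map, as in the sketch
below. The table and chain names and the ports are assumptions for
illustration.

.Sketch: mixing accept, drop and jump in a verdict map
-------------------------
table ip filter {
	chain web {
		counter accept
	}

	chain input {
		type filter hook input priority 0; policy drop;
		# ports not listed in the map do not match and fall through to the policy
		tcp dport vmap { 22 : accept, 23 : drop, 80 : jump web }
	}
}
-------------------------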