3 ip, esp, gre, icmp, icmpv6, ipmux, rudp, tcp, udp, il \- network protocols over IP
7 .B bind -a #I\fIspec\fP /net
13 .BI /net/ipifc/ n /status
14 .BI /net/ipifc/ n /ctl
39 .BI /net/tcp/ n /local
40 .BI /net/tcp/ n /remote
41 .BI /net/tcp/ n /status
42 .BI /net/tcp/ n /listen
49 device provides the interface to Internet Protocol stacks.
51 is an integer from 0 to 15 identifying a stack.
52 Each stack implements IPv4 and IPv6.
53 Each stack is independent of all others:
54 the only information transfer between them is via programs that
55 mount multiple stacks.
56 Normally a system uses only one stack.
57 However, multiple stacks can be used for debugging
58 new IP networks or implementing firewalls or proxy
61 All addresses used are 16-byte IPv6 addresses.
62 IPv4 addresses are a subset of the IPv6 addresses and both standard
65 In binary representation, all v4 addresses start with these 12 bytes, in hex:
68 00 00 00 00 00 00 00 00 00 00 ff ff
71 .SS "Configuring interfaces
72 Each stack may have multiple interfaces and each interface
73 may have multiple addresses.
80 file, and numbered subdirectories for each physical interface.
84 file reserves an interface.
85 The file descriptor returned from the
87 will point to the control file,
89 of the newly allocated interface.
92 returns a text string representing the number of the interface.
95 alters aspects of the interface.
98 messages are those described under
99 .B "Protocol directories"
101 .TF "\fLbind loopback\fR"
107 .BI "bind ether " path
108 Treat the device mounted at
110 as an Ethernet medium carrying IP and ARP packets
111 and associate it with this interface.
118 and use the three connections for IPv4, IPv6 and
122 Treat this interface as a packet interface. Assume
123 a user program will read and write the
125 file to receive and transmit IP packets to the kernel.
126 This is used by programs such as
128 to mediate IP packet transfer between the kernel and
129 a PPP-encoded device.
131 .BI "bind netdev " path
132 Treat this interface as a packet interface.
135 and read and write the resulting file descriptor
136 to receive and transmit IP packets.
139 Treat this interface as a local loopback. Anything
140 written to it will be looped back.
146 Disassociate the physical device from an IP interface.
148 .BI add\ "local mask remote mtu " proxy
151 .BI try\ "local mask remote mtu " proxy
153 Add a local IP address to the interface.
157 address as a tentative address
158 if it's an IPv6 address.
165 arguments are all optional.
168 is the class mask for the local address.
177 (maximum transmission unit)
178 is 1514 for Ethernet and 4096 for packet media.
181 is the size in bytes of the largest packet that this interface can send.
183 if specified, means that this machine should answer
184 ARP requests for the remote address.
186 does this to make remote machines appear
187 to be connected to the local Ethernet.
189 .BI remove\ "local mask"
190 Remove a local IP address from an interface.
193 Set the maximum transmission unit for this device to
195 The mtu is the maximum size of the packet including any
196 medium-specific headers.
199 Set the maximum transmit speed in bits per second.
202 Set the maximum burst delay in milliseconds. (The default is 40ms.)
205 has been set and packets in flight exceed the maximum burst
206 delay then packets sent on the interface are discarded until
207 the load drops below the maximum.
212 is missing or non-zero) or disallow
214 is 0) forwarding packets between this interface and others.
217 When forwarding, allow packets from this interface to be
218 echoed back on the same interface.
221 Reassemble IP fragments before forwarding to this interface
223 .\" remainder from netif.c (thus called from devether.c),
224 .\" except add6 and ra6 from ipifc.c
232 Set the interface into promiscuous mode,
233 which makes it accept all incoming packets,
234 whether addressed to it or not.
237 marks the Ethernet packet
239 as being in use, if not already in use
243 of -1 means `all' but appears to be a no-op.
245 .BI addmulti\ Media-addr
248 on this interface as a local address.
250 .BI remmulti\ Media-addr
251 Remove the multicast address
256 Make the wireless interface scan for base stations.
259 Set the interface to pass only packet headers, omitting the data.
261 .\" remainder from ipifc.c; tedious, so put them last
264 .BI "add6 " "v6addr pfx-len [onlink auto validlt preflt]"
265 Add the local IPv6 address
270 See RFC 2461 §6.2.1 for more detail.
271 The remaining arguments are optional:
276 flag: address is `on-link'
282 valid life-time in seconds
285 preferred life-time in seconds
289 .BI "ra6 " "keyword value ..."
290 Set IPv6 router advertisement (RA) parameter
295 and the meanings of their values follow.
296 See RFC 2461 §6.2.1 for more detail.
297 Flags are true iff non-zero.
299 .TF "\fLreachtime\fR"
302 flag: receive and process RAs.
305 flag: generate and send RAs.
308 flag: ``Managed address configuration'',
312 flag: ``Other stateful configuration'',
316 ``maximum time allowed between sending unsolicited multicast''
317 RAs from the interface, in ms.
320 ``minimum time allowed between sending unsolicited multicast''
321 RAs from the interface, in ms.
324 ``value to be placed in MTU options sent by the router.''
328 sets the Reachable Time field in RAs sent by the router.
329 ``Zero means unspecified (by this router).''
332 sets the Retrans Timer field in RAs sent by the router.
333 ``Zero means unspecified (by this router).''
336 default value of the Cur Hop Limit field in RAs sent by the router.
337 Should be set to the ``current diameter of the Internet.''
338 ``Zero means unspecified (by this router).''
341 sets the Router Lifetime field of RAs sent from the interface, in ms.
342 Zero means the router is not to be used as a default router.
346 Reading the interface's
348 file returns information about the interface. The first line
349 is composed of white-space-separated fields; the first two
350 fields are: device and maxmtu. Subsequent lines list the
351 IP addresses assigned to that interface. The columns are:
352 ip address, network mask, network address and valid/preferred
353 life times in milliseconds. See
361 controls information about IP routing.
362 When read, it returns one line per routing entry.
363 Each line contains eight white-space-separated fields:
364 target address, target mask, address of next hop, flags,
365 tag, interface number, source address, source mask.
366 The entry used for routing an IP packet is the one with
367 the longest destination and source mask for which
368 destination address ANDed with target mask equals the
369 target and also the source ANDed with the source mask equals the source address.
371 The one-character flags are:
387 local unicast address
396 The tag is an arbitrary string of up to 4 characters. It is normally used to
397 indicate what routing protocol originated the route.
401 changes the route table. The messages are:
402 .TF "\fLtag \fIstring\fR"
406 Remove routes of the specified tag, or all routes if
413 with all subsequent routes added via this file descriptor.
415 .BI add\ "target mask nexthop"
417 .BI add\ "target mask nexthop interface"
419 .BI add\ "target mask nexthop source smask"
421 .BI add\ "target mask nexthop interface source smask"
423 .BI add\ "target mask nexthop tag interface source smask"
425 .BI add\ "target mask nexthop type tag interface source smask"
426 Add the route to the table. If one already exists with the
427 same target and mask, replace it. The
429 can be given as either the interface number or a local
430 IP address on the desired interface.
432 .BI remove\ "target mask"
434 .BI remove\ "target mask nexthop"
436 .BI remove\ "target mask source smask"
438 .BI remove\ "target mask nexthop source smask"
440 .BI remove\ "target mask nexthop interface source smask"
442 .BI remove\ "target mask nexthop tag interface source smask"
444 .BI remove\ "target mask nexthop type tag interface source smask"
445 Remove the matching route.
447 .SS "Address resolution
450 controls information about address resolution.
451 The kernel automatically updates the v4 ARP and v6 Neighbour Discovery
452 information for Ethernet interfaces.
453 When read, the file returns one line per address containing the
454 type of medium, the status of the entry (OK, WAIT), the IP
455 address, the medium address and the IP address of the interface
456 where the entry is valid.
459 administers the ARP information.
460 The control messages are:
461 .TF "\fLdel \fIIP-addr\fR"
467 .BI add\ "type IP-addr Media-addr Interface-IP-addr"
468 Add an entry or replace an existing one for the
469 same IP address. The optional interface IP address specifies the
470 interface where the ARP entry will be valid. This is needed
471 for IPv6 link local addresses.
474 Delete an individual entry.
476 ARP entries do not time out. The ARP table is a
477 cache with an LRU replacement policy. The IP stack
478 listens for all ARP requests and, if the requester is in
479 the table, the entry is updated.
480 Also, whenever a new address is configured onto an
481 Ethernet, an ARP request is sent to help
482 update the table on other systems.
484 Currently, the only medium type is
489 .SS "Debugging and stack information
490 If any process is holding
492 open, the IP stack queues debugging information to it.
493 This is intended primarily for debugging the IP stack.
494 The information provided is implementation-defined;
495 see the source for details. Generally, what is returned is error messages
500 controls debugging. The control messages are:
501 .TF "\fLclear \fIarglist\fR"
506 is a space-separated list of items for which to enable debugging.
507 The possible items are:
527 is a space-separated list of items for which to disable debugging.
532 is non-zero, restrict debugging to only those
533 packets whose source or destination is that
538 can be read or written by
539 programs. It is normally used by
541 to leave configuration information for other programs
549 may contain up to 1024 bytes.
553 is a read-only file containing all the IP addresses
554 considered local. Each line in the file contains
555 three white-space-separated fields: IP address, usage count,
556 and flags. The usage count is the number of interfaces to which
557 the address applies. The flags are the same as for routing
562 .SS "Protocol directories
566 supports IP as well as several protocols that run over it:
567 TCP, UDP, RUDP, ICMP, IL, GRE, and ESP.
568 TCP and UDP provide the standard Internet
569 protocols for reliable stream and unreliable datagram
571 RUDP is a locally-developed reliable datagram protocol based on UDP.
572 ICMP is IP's catch-all control protocol used to send
573 low level error messages and to implement
575 GRE is a general encapsulation protocol.
576 ESP is the encapsulation protocol for IPsec.
577 IL provides a reliable datagram service for communication
578 between Plan 9 machines but is now deprecated.
580 Each protocol is a subdirectory of the IP stack.
581 The top level directory of each protocol contains a
585 file, and subdirectories numbered from zero to the number of connections
586 opened for this protocol.
590 file reserves a connection. The file descriptor returned from the
592 will point to the control file,
594 of the newly allocated connection.
598 string representing the number of the
600 Connections may be used either to listen for incoming calls
601 or to initiate calls to other machines.
603 A connection is controlled by writing text strings to the associated
606 After a connection has been established data may be read from
609 A connection can be actively established using the
613 A connection can be established passively by first
618 to bind to a local port and then
623 to receive incoming calls.
625 The following control messages are supported:
626 .TF "\fLremmulti \fIip\fR"
629 .BI connect\ ip-address ! port "!r " local
630 Establish a connection to the remote
636 is specified, it is used as the local port number.
641 is, the system will allocate
642 a restricted port number (less than 1024) for the connection to allow communication
648 Otherwise a free port number starting at 5000 is chosen.
649 The connect fails if the combination of local and remote address/port pairs
650 is already assigned to another port.
654 is a decimal port number or
666 calls for any port that no process has explicitly announced.
667 The local IP address cannot be set.
669 fails if the connection is already announced or connected.
673 is a decimal port number or
675 Set the local port number to
677 This exists to support emulation
678 of BSD sockets by the APE libraries (see
680 and is not otherwise used.
684 .\" Set the maximum number of unanswered (queued) incoming
685 .\" connections to an announced port to
689 .\" is set to five. If more than
691 .\" connections are pending,
692 .\" further requests for a service will be rejected.
695 Set the time to live IP field in outgoing packets to
699 Set the service type IP field in outgoing packets to
703 Don't break (UDP) connections because of ICMP errors.
705 .BI addmulti\ "ifc-ip [ mcast-ip ]"
708 on this multicast interface as a local address.
712 use it as the interface's multicast address.
717 from this multicast interface.
719 Port numbers must be in the range 1 to 32767.
721 Several files report the status of a
727 files contain the IP address and port number for the remote and local side of the
730 file contains protocol-dependent information to help debug network connections.
731 On receiving an error or EOF reading or writing the
735 file contains the reason for error.
737 A process may accept incoming connections by
744 will block until a new connection request arrives.
747 will return an open file descriptor which points to the control file of the
748 newly accepted connection.
749 This procedure will accept all calls for the
755 TCP connections are reliable point-to-point byte streams; there are no
757 A connection is determined by the address and port numbers of the two
761 files support the following additional messages:
762 .TF "\fLkeepalive\fI n\fR"
766 close down this TCP connection
769 turn on keep alive messages.
771 if given, is the milliseconds between keepalives
775 emit TCP checksums of zero if
777 is zero; otherwise, and by default,
778 TCP checksums are computed and sent normally.
780 .BI tcpporthogdefense \ onoff
784 enables the TCP port-hog defense for all TCP connections;
789 The defense is a solution to hijacked systems staking out ports
790 as a form of denial-of-service attack.
791 To avoid stateless TCP conversation hogs,
793 picks a TCP sequence number at random for keepalives.
794 If that number gets acked by the other end,
796 shuts down the connection.
798 notably ones that perform stateful inspection,
799 discard such out-of-specification keepalives,
800 so connections through such firewalls
801 will be killed after five minutes
802 by the lack of keepalives.
805 UDP connections carry unreliable and unordered datagrams. A read from
807 will return the next datagram, discarding anything
808 that doesn't fit in the read buffer.
809 A write is sent as a single datagram.
811 By default, a UDP connection is a point-to-point link.
814 establishes a local and remote address/port pair or
817 each datagram coming from a different remote address/port pair
818 establishes a new incoming connection.
819 However, many-to-one semantics is also possible.
827 then all messages sent to the announced port
828 are received on the announced connection prefixed
829 with the corresponding structure,
834 typedef struct Udphdr Udphdr;
837 uchar raddr[16]; /* V6 remote address and port */
838 uchar laddr[16]; /* V6 local address and port */
839 uchar ifcaddr[16]; /* V6 interface address (receive only) */
840 uchar rport[2]; /* remote port */
841 uchar lport[2]; /* local port */
845 Before a write, a user must prefix a similar structure to each message.
846 The system overrides the user specified local port with the announced
847 one. If the user specifies an address that isn't a unicast address in
849 that too is overridden.
850 Since the prefixed structure is the same in read and write, it is relatively
851 easy to write a server that responds to client requests by just copying new
852 data into the message body and then writing back the same buffer that was
855 In this case (writing
866 the usual sequence of
870 must be executed before performing I/O on the corresponding
875 RUDP is a reliable datagram protocol based on UDP,
876 currently only for IPv4.
877 Packets are delivered in order.
878 RUDP does not support
880 One must write either
884 followed immediately by
889 Unlike TCP, the reboot of one end of a connection does
890 not force a closing of the connection. Communications will
891 resume when the rebooted machine resumes talking. Any unacknowledged
892 packets queued before the reboot will be lost. A reboot can
893 be detected by reading the
895 file. It will contain the message
897 .BI hangup\ address ! port
903 are of the far side of the connection.
904 Retransmitting a datagram more than 10 times
905 is treated like a reboot:
906 all queued messages are dropped, an error is queued to the
908 file, and the conversation resumes.
912 files accept the following messages:
913 .TF "\fLranddrop \fI[ percent ]\fR"
920 .BI "hangup " "IP port"
921 Drop the connection to address
926 .BI "randdrop " "[ percent ]"
933 ICMP is a datagram protocol for IPv4 used to exchange control requests and
934 their responses with other machines' IP implementations.
935 ICMP is primarily a kernel-to-kernel protocol, but it is possible
936 to generate `echo request' and read `echo reply' packets from user programs.
939 ICMPv6 is the IPv6 equivalent of ICMP.
947 a user must prefix each message with a corresponding structure,
953 * user level icmpv6 with control message "headers"
955 typedef struct Icmp6hdr Icmp6hdr;
958 uchar laddr[IPaddrlen]; /* local address */
959 uchar raddr[IPaddrlen]; /* remote address */
963 In this case (writing
974 the usual sequence of
978 must be executed before performing I/O on the corresponding
983 IL is a reliable point-to-point datagram protocol that runs over IPv4.
984 Like TCP, IL delivers datagrams
985 reliably and in order. Also like TCP, a connection is
986 determined by the address and port numbers of the two ends.
987 Like UDP, each read and write transfers a single datagram.
989 IL is efficient for LANs but doesn't have the
990 congestion control features needed for use through
992 It is no longer necessary, except to communicate with old standalone
995 Its use is now deprecated.
998 GRE is the encapsulation protocol used by PPTP.
999 The kernel implements just enough of the protocol
1001 Our implementation encapsulates in IPv4, per RFC 1702.
1003 is not allowed in GRE, only
1005 Since GRE has no port numbers, the port number in the connect
1006 is actually the 16-bit
1008 field in the GRE header.
1010 Reads and writes transfer a
1011 GRE datagram starting at the GRE header.
1012 On write, the kernel fills in the
1014 field with the port number specified
1015 in the connect message.
1020 ESP is the Encapsulating Security Payload (RFC 1827, obsoleted by RFC 4303)
1021 for IPsec (RFC 4301).
1022 We currently implement only tunnel mode, not transport mode.
1023 It is used to set up an encrypted tunnel between machines.
1024 Like GRE, ESP has no port numbers. Instead, the
1027 message is the SPI (Security Association Identifier (sic)).
1028 IP packets are written to and read from
1030 The kernel encrypts any packets written to
1032 appends a MAC, and prefixes an ESP header before
1033 sending to the other end of the tunnel.
1034 Received packets are checked against their MACs,
1035 decrypted, and queued for reading from
1039 is the hexadecimal encoding of a key,
1042 The control messages are:
1043 .TF "\fLesp \fIalg secret\fR"
1046 .BI esp\ "alg secret
1047 Encrypt with the algorithm,
1052 Possible algorithms are:
1062 Use the hash algorithm,
1066 as the key for generating the MAC.
1067 Possible algorithms are:
1072 .BR aes_xcbc_mac_96 .
1075 Turn on header mode. Every buffer read from
1077 starts with 4 unused bytes, and the first 4 bytes
1078 of every buffer written to
1083 Turn off header mode.
1085 .SS "IP packet filter
1088 looks like another protocol directory.
1089 It is a packet filter built on top of IP.
1091 subdirectory represents a different filter.
1092 The connect messages written to the
1094 file describe the filter. Packets matching the filter can be read on the
1096 file. Packets written to the
1098 file are routed to an interface and transmitted.
1100 A filter is a semicolon-separated list of
1101 relations. Each relation describes a portion
1102 of a packet to match. The possible relations are:
1103 .TF "\fLdata[\fIn\fL:\fIm\fL]=\fIexpr\fR "
1107 the IP protocol number must be
1110 .BI data[ n : m ]= expr
1115 following the IP packet must match
1118 .BI iph[ n : m ]= expr
1123 of the IP packet header must match
1127 the packet must have been received on an interface whose address
1132 The source address in the packet must match
1136 The destination address in the packet must match
1144 .IB \ value | value | ...
1148 .IB \ value | value & mask
1150 If a mask is given, the relevant field is first ANDed with
1151 the mask. The result is compared against the value or list
1152 of values for a match. In the case of
1157 the value is a dot-formatted IP address and the mask is a dot-formatted
1158 IP mask. In the case of
1163 both value and mask are strings of 2 hexadecimal digits representing
1166 A packet is delivered to only one filter.
1167 The filters are merged into a single comparison tree.
1168 If two filters match the same packet, the following
1169 rules apply in order (here `>' means `is preferred to'):
1171 protocol > data > source > destination > interface
1173 lower data offsets > higher data offsets
1175 longer matches > shorter matches
1179 So far this has just been used to implement a version of
1181 and 6to4 tunnelling.
1188 files are read only and contain statistics useful to network monitoring.
1194 returns a list of 19 tagged and newline-separated fields representing:
1199 forwarding status (0 and 2 mean forwarding off,
1204 input address errors
1206 input packets for unknown protocols
1207 input packets discarded
1208 input packets delivered to higher level protocols
1210 output packets discarded
1211 output packets with no route
1212 timed out fragments in reassembly queue
1213 requested reassemblies
1214 successful reassemblies
1216 successful fragmentations
1217 unsuccessful fragmentations
1228 returns a list of 26 tagged and newline-separated fields representing:
1234 bad received messages
1235 unreachables received
1236 time exceededs received
1237 input parameter problems received
1238 source quenches received
1240 echo requests received
1241 echo replies received
1243 timestamp replies received
1244 address mask requests received
1245 address mask replies received
1250 input parameter problems sent
1251 source quenches sent
1256 timestamp replies sent
1257 address mask requests sent
1258 address mask replies sent
1265 returns a list of 11 tagged and newline-separated fields representing:
1270 maximum number of connections
1271 total outgoing calls
1272 total incoming calls
1273 number of established connections to be reset
1274 number of currently established connections
1277 segments retransmitted
1279 bad received segments
1280 transmission failures
1287 returns a list of 4 tagged and newline-separated fields representing:
1293 datagrams received for bad ports
1294 malformed datagrams received
1302 returns a list of 6 tagged and newline-separated fields representing:
1308 header length errors
1309 out of order messages
1310 retransmitted messages
1319 returns a single tagged field representing:
1323 header length errors
1335 .TF "\fL/lib/rfc/rfc2822"
1341 IPv6 address architecture
1349 has not been heavily used and should be considered experimental.
1350 It may disappear in favor of a more traditional packet filter in the future.