The Organization of Networks in Plan 9

Dave Presotto
Phil Winterbottom

presotto,philw@plan9.bell-labs.com

Originally appeared in Proc. of the Winter 1993 USENIX Conf., San Diego, CA.
ABSTRACT

In a distributed system networks are of paramount importance. This paper describes the implementation, design philosophy, and organization of network support in Plan 9. Topics include network requirements for distributed systems, our kernel implementation, network naming, user interfaces, and performance. We also observe that much of this organization is relevant to current systems.
Introduction

Plan 9 [Pike90] is a general-purpose, multi-user, portable distributed system implemented on a variety of computers and networks. What distinguishes Plan 9 is its organization. The goals of this organization were to reduce administration and to promote resource sharing. One of the keys to its success as a distributed system is the organization and management of its networks.
A Plan 9 system comprises file servers, CPU servers and terminals. The file servers and CPU servers are typically centrally located multiprocessor machines with large memories and high speed interconnects. A variety of workstation-class machines serve as terminals connected to the central servers using several networks and protocols. The architecture of the system demands a hierarchy of network speeds matching the needs of the components. Connections between file servers and CPU servers are high-bandwidth point-to-point fiber links. Connections from the servers fan out to local terminals using medium speed networks such as Ethernet [Met80] and Datakit [Fra80]. Low speed connections via the Internet and the AT&T backbone serve users in Oregon and Illinois. Basic Rate ISDN data service and 9600 baud serial lines provide slow links to users at home.
Since CPU servers and terminals use the same kernel, users may choose to run programs locally on their terminals or remotely on CPU servers. The organization of Plan 9 hides the details of system connectivity, allowing both users and administrators to configure their environment to be as distributed or centralized as they wish. Simple commands support the construction of a locally represented name space spanning many machines and networks. At work, users tend to use their terminals like workstations, running interactive programs locally and reserving the CPU servers for data or compute intensive jobs such as compiling and computing chess endgames. At home or when connected over a slow network, users tend to do most work on the CPU server to minimize traffic on the slow links. The goal of the network organization is to provide the same environment to the user wherever resources are used.
Kernel Network Support
Networks play a central role in any distributed system. This is particularly true in Plan 9 where most resources are provided by servers external to the kernel. The importance of the networking code within the kernel is reflected by its size; of 25,000 lines of kernel code, 12,500 are network and protocol related. Networks are continually being added and the fraction of code devoted to communications is growing. Moreover, the network code is complex. Protocol implementations consist almost entirely of synchronization and dynamic memory management, areas demanding subtle error recovery strategies. The kernel currently supports Datakit, point-to-point fiber links, an Internet (IP) protocol suite and ISDN data service. The variety of networks and machines has raised issues not addressed by other systems running on commercial hardware supporting only Ethernet or FDDI.
The File System protocol
A central idea in Plan 9 is the representation of a resource as a hierarchical file system. Each process assembles a view of the system by building a name space [Needham] connecting its resources. File systems need not represent disc files; in fact, most Plan 9 file systems have no permanent storage. A typical file system dynamically represents some resource like a set of network connections or the process table. Communication between the kernel, device drivers, and local or remote file servers uses a protocol called 9P. The protocol consists of 17 messages describing operations on files and directories. Kernel resident device and protocol drivers use a procedural version of the protocol while external file servers use an RPC form. Nearly all traffic between Plan 9 systems consists of 9P messages.

9P relies on several properties of the underlying transport protocol. It assumes messages arrive reliably and in sequence and that delimiters between messages are preserved. When a protocol does not meet these requirements (for example, TCP does not preserve delimiters), we provide mechanisms to marshal messages before handing them to the system.
A kernel data structure, the channel, is a handle to a file server. Operations on a channel generate the following 9P messages. The session and attach messages authenticate a connection, established by means external to 9P, and validate its user. The result is an authenticated channel referencing the root of the server. The clone message makes a new channel identical to an existing channel, much like the dup system call. A channel may be moved to a file on the server using a walk message to descend each level in the hierarchy. The stat and wstat messages read and write the attributes of the file referenced by a channel. The open message prepares a channel for subsequent read and write messages to access the contents of the file. Create and remove perform the actions implied by their names on the file referenced by the channel. The clunk message discards a channel without affecting the file.
A kernel resident file server called the mount driver converts the procedural version of 9P into RPCs. The mount system call provides a file descriptor, which can be a pipe to a user process or a network connection to a remote machine, to be associated with the mount point. After a mount, operations on the file tree below the mount point are sent as messages to the file server. The mount driver manages buffers, packs and unpacks parameters from messages, and demultiplexes among processes using the file server.
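To make the flow concrete, the following minimal sketch (the server path /n/remote is hypothetical) shows ordinary file I/O whose every step the mount driver translates into 9P RPCs:

#include <u.h>
#include <libc.h>

void
main(void)
{
	char buf[512];
	int fd, n;

	/* the path lookup generates walk messages, then an open */
	fd = open("/n/remote/doc/README", OREAD);
	if(fd < 0)
		sysfatal("open: %r");
	/* each read becomes a 9P read RPC to the file server */
	while((n = read(fd, buf, sizeof buf)) > 0)
		write(1, buf, n);
	close(fd);	/* clunk discards the channel */
	exits(nil);
}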
Kernel Organization

The network code in the kernel is divided into three layers: hardware interface, protocol processing, and program interface. A device driver typically uses streams to connect the two interface layers. Additional stream modules may be pushed on a device to process protocols. Each device driver is a kernel-resident file system. Simple device drivers serve a single level directory containing just a few files; for example, we represent each UART by a data and a control file.

--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia1
--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia1ctl
--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia2
--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia2ctl
The control file is used to control the device; writing the string b1200 to /dev/eia1ctl sets the line to 1200 baud.
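In program form, device control is just file I/O; a minimal sketch, assuming the /dev/eia1ctl file above:

#include <u.h>
#include <libc.h>

void
main(void)
{
	int fd;

	fd = open("/dev/eia1ctl", OWRITE);
	if(fd < 0)
		sysfatal("open: %r");
	/* the driver parses the ASCII command and sets the line speed */
	if(write(fd, "b1200", 5) != 5)
		sysfatal("write: %r");
	exits(nil);
}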
Multiplexed devices present a more complex interface structure. For example, the LANCE Ethernet driver serves a two level file tree (Figure 1) providing

- device control and configuration
- user-level protocols like ARP
- diagnostic interfaces for snooping software.

The top directory contains a single clone file and a numbered directory for each connection. Each connection directory corresponds to an Ethernet packet type. Opening the clone file finds an unused connection directory and opens its ctl file. Reading the control file returns the ASCII connection number; the user process can use this value to construct the name of the proper connection directory. In each connection directory files named ctl, data, stats, and type provide access to the connection.
Writing the string connect 2048 to the ctl file sets the packet type to 2048 and configures the connection to receive all IP packets sent to the machine. Subsequent reads of the type file yield the string 2048. The data file accesses the media; reading it returns the next packet of the selected type. Writing it queues a packet for transmission after appending a packet header containing the source address and packet type. The stats file returns ASCII text containing the interface address, packet input/output counts, error statistics, and general information about the state of the interface.

If several connections on an interface are configured for a particular packet type, each receives a copy of the incoming packets. The special packet type -1 selects all packets. Writing the string connect -1 to the ctl file configures a conversation to receive all packets on the Ethernet.
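The following minimal sketch performs that sequence by hand; the interface directory /net/ether and the buffer sizes are assumptions:

#include <u.h>
#include <libc.h>

void
main(void)
{
	char num[16], path[64], buf[2048];
	int cfd, dfd, n;

	/* opening the clone file reserves an unused connection */
	cfd = open("/net/ether/clone", ORDWR);
	if(cfd < 0)
		sysfatal("clone: %r");
	n = read(cfd, num, sizeof num - 1);	/* ASCII connection number */
	if(n <= 0)
		sysfatal("read: %r");
	num[n] = 0;
	/* select Ethernet packet type 2048 (IP) */
	if(write(cfd, "connect 2048", 12) < 0)
		sysfatal("connect: %r");
	snprint(path, sizeof path, "/net/ether/%s/data", num);
	dfd = open(path, ORDWR);
	if(dfd < 0)
		sysfatal("data: %r");
	n = read(dfd, buf, sizeof buf);		/* next IP packet */
	print("got %d byte packet\n", n);
	exits(nil);
}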
Although the driver interface may seem elaborate, the representation of a device as a set of files using ASCII strings for communication has several advantages. Any mechanism supporting remote access to files immediately allows a remote machine to use our interfaces as gateways. Using ASCII strings to control the interface avoids byte order problems and ensures a uniform representation for devices on the same machine and even allows devices to be accessed remotely. Representing dissimilar devices by the same set of files allows common tools to serve several networks or interfaces. Programs like stty are replaced by echo and shell redirection.
Protocol devices

Network connections are represented as pseudo-devices called protocol devices. Protocol device drivers exist for the Datakit URP protocol and for each of the Internet IP protocols TCP, UDP, and IL. IL, described below, is a new communication protocol used by Plan 9 for transmitting file system RPC's. All protocol devices look identical so user programs contain no network-specific code.
Each protocol device driver serves a directory structure similar to that of the Ethernet driver. The top directory contains a clone file and a numbered directory for each connection. Each connection directory contains files to control one connection and to send and receive information. A TCP connection directory looks like this:
--rw-rw---- I 0 ehg bootes 0 Jul 13 21:14 ctl
--rw-rw---- I 0 ehg bootes 0 Jul 13 21:14 data
--rw-rw---- I 0 ehg bootes 0 Jul 13 21:14 listen
--r--r--r-- I 0 bootes bootes 0 Jul 13 21:14 local
--r--r--r-- I 0 bootes bootes 0 Jul 13 21:14 remote
--r--r--r-- I 0 bootes bootes 0 Jul 13 21:14 status
cpu% cat local remote status
tcp/2 1 Established connect
The local, remote, and status files supply information about the state of the connection. The data and ctl files provide access to the process end of the stream implementing the protocol. The listen file is used to accept incoming calls from the network.
The following steps establish a connection.

1. The clone device of the appropriate protocol directory is opened to reserve an unused connection.
2. The file descriptor returned by the open points to the ctl file of the new connection. Reading that file descriptor returns an ASCII string containing the connection number.
3. A protocol/network specific ASCII address string is written to the ctl file.
4. The path of the data file is constructed using the connection number. When the data file is opened the connection is established.

A process can read and write this file descriptor to send and receive messages from the network. If the process opens the listen file it blocks until an incoming call is received. An address string written to the ctl file before the listen selects the ports or services the process is prepared to accept. When an incoming call is received, the open completes and returns a file descriptor pointing to the ctl file of the new connection. Reading the ctl file yields a connection number used to construct the path of the data file. A connection remains established while any of the files in the connection directory are referenced or until a close is received from the network.
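A minimal sketch of these four steps for a TCP connection; the destination address and port are illustrative:

#include <u.h>
#include <libc.h>

void
main(void)
{
	char num[16], path[64];
	int cfd, dfd, n;

	cfd = open("/net/tcp/clone", ORDWR);		/* step 1 */
	if(cfd < 0)
		sysfatal("clone: %r");
	n = read(cfd, num, sizeof num - 1);		/* step 2 */
	if(n <= 0)
		sysfatal("read: %r");
	num[n] = 0;
	/* step 3: protocol-specific ASCII address string */
	if(write(cfd, "connect 135.104.9.31!564", 24) < 0)
		sysfatal("connect: %r");
	snprint(path, sizeof path, "/net/tcp/%s/data", num);
	dfd = open(path, ORDWR);			/* step 4 */
	if(dfd < 0)
		sysfatal("data: %r");
	/* dfd may now be read and written to exchange data */
	exits(nil);
}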
Streams

A stream [Rit84a][Presotto] is a bidirectional channel connecting a physical or pseudo-device to user processes. The user processes insert and remove data at one end of the stream. Kernel processes acting on behalf of a device insert data at the other end. Asynchronous communications channels such as pipes, TCP conversations, Datakit conversations, and RS232 lines are implemented using streams.
A stream comprises a linear list of processing modules. Each module has both an upstream (toward the process) and downstream (toward the device) put routine. Calling the put routine of the module on either end of the stream inserts data into the stream. Each module calls the succeeding one to send data up or down the stream.

An instance of a processing module is represented by a pair of queues, one for each direction. The queues point to the put procedures and can be used to queue information traveling along the stream. Some put routines queue data locally and send it along the stream at some later time, either due to a subsequent call or an asynchronous event such as a retransmission timer or a device interrupt. Processing modules create helper kernel processes to provide a context for handling asynchronous events. For example, a helper kernel process awakens periodically to perform any necessary TCP retransmissions. The use of kernel processes instead of serialized run-to-completion service routines differs from the implementation of Unix streams. Unix service routines cannot use any blocking kernel resource and they lack local, long-lived state. Helper kernel processes solve these problems and simplify the stream code.
There is no implicit synchronization in our streams. Each processing module must ensure that concurrent processes using the stream are synchronized. This maximizes concurrency but introduces the possibility of deadlock. However, deadlocks are easily avoided by careful programming; to date they have not caused us problems.

Information is represented by linked lists of kernel structures called blocks. Each block contains a type, some state flags, and pointers to an optional buffer. Block buffers can hold either data or control information, i.e., directives to the processing modules. Blocks and block buffers are dynamically allocated from kernel memory.
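The declarations below are a rough sketch of these structures; the field names and layout are illustrative rather than the kernel's exact definitions:

typedef struct Block Block;
typedef struct Queue Queue;

struct Block {
	Block		*next;	/* next block in the list */
	int		type;	/* data or control */
	int		flags;	/* state, e.g. a delimiter mark */
	unsigned char	*base;	/* optional buffer */
	unsigned char	*rptr;	/* first unconsumed byte */
	unsigned char	*wptr;	/* first free byte */
};

struct Queue {
	void	(*put)(Queue*, Block*);	/* this module's put routine */
	Queue	*next;			/* next module in this direction */
	Block	*first;			/* blocks queued locally */
	Block	*last;
};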
A stream is represented at user level as two files, ctl and data. The actual names can be changed by the device driver using the stream, as we saw earlier in the example of the UART driver. The first process to open either file creates the stream automatically. The last close destroys it.

Writing to the data file copies the data into kernel blocks and passes them to the downstream put routine of the first processing module. A write of less than 32K is guaranteed to be contained by a single block. Concurrent writes to the same stream are not synchronized, although the 32K block size assures atomic writes for most protocols. The last block written is flagged with a delimiter to alert downstream modules that care about write boundaries. In most cases the first put routine calls the second, the second calls the third, and so on until the data is output. As a consequence, most data is output without context switching.

Reading the data file returns data queued at the top of the stream. The read terminates when the read count is reached or when the end of a delimited block is encountered. A per stream read lock ensures only one process can read from a stream at a time and guarantees that the bytes read were contiguous bytes from the stream.
Like UNIX streams [Rit84a], Plan 9 streams can be dynamically configured. The stream system intercepts and interprets the following control blocks:

push name - adds an instance of the processing module name to the top of the stream.
pop - removes the top module of the stream.
hangup - sends a hangup message up the stream from the device end.
Other control blocks are module-specific and are interpreted by each processing module as they pass.

The convoluted syntax and semantics of the UNIX ioctl system call convinced us to leave it out of Plan 9. Instead, device control operations are implemented by writing to the ctl file of the stream. Writing to the ctl file is identical to writing to a data file except the blocks are of type control. A processing module parses each control block it sees. Commands in control blocks are ASCII strings, so byte ordering is not an issue when one system controls streams in a name space implemented on another processor. The time to parse control blocks is not important, since control operations are rare.
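For example, pushing a module is an ordinary write of an ASCII command; in this minimal sketch the device path and module name are illustrative:

#include <u.h>
#include <libc.h>

void
main(void)
{
	int cfd;

	cfd = open("/net/dk/5/ctl", OWRITE);
	if(cfd < 0)
		sysfatal("open: %r");
	/* the stream system interprets this control block itself */
	if(write(cfd, "push urp", 8) < 0)
		sysfatal("push: %r");
	exits(nil);
}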
Device Interfaces

The module at the downstream end of the stream is part of a device interface. The particulars of the interface vary with the device. Most device interfaces consist of an interrupt routine, an output put routine, and a kernel process. The output put routine stages data for the device and starts the device if it is stopped. The interrupt routine wakes up the kernel process whenever the device has input to be processed or needs more output staged. The kernel process puts information up the stream or stages more data for output. The division of labor among the different pieces varies depending on how much must be done at interrupt level. However, the interrupt routine may not allocate blocks or call a put routine since both actions require a process context.
Multiplexing

The conversations using a protocol device must be multiplexed onto a single physical wire. We push a multiplexer processing module onto the physical device stream to group the conversations. The device end modules on the conversations add the necessary header onto downstream messages and then put them to the module downstream of the multiplexer. The multiplexing module looks at each message moving up its stream and puts it to the correct conversation stream after stripping the header controlling the demultiplexing.

This is similar to the Unix implementation of multiplexer streams. The major difference is that we have no general structure that corresponds to a multiplexer. Each attempt to produce a generalized multiplexer created a more complicated structure and underlined the basic difficulty of generalizing this mechanism. We now code each multiplexer from scratch and favor simplicity over generality.
Reflections

Despite five years' experience and the efforts of many programmers, we remain dissatisfied with the stream mechanism. Performance is not an issue; the time to process protocols and drive device interfaces continues to dwarf the time spent allocating, freeing, and moving blocks of data. However the mechanism remains inordinately complex. Much of the complexity results from our efforts to make streams dynamically configurable, to reuse processing modules on different devices and to provide kernel synchronization to ensure data structures don't disappear under foot. This is particularly irritating since we seldom use these properties.

Streams remain in our kernel because we are unable to devise a better alternative. Larry Peterson's X-kernel [Pet89a] is the closest contender but doesn't offer enough advantage to switch. If we were to rewrite the streams code, we would probably statically allocate resources for a large fixed number of conversations and burn memory in favor of less complexity.
The IL Protocol

None of the standard IP protocols is suitable for transmission of 9P messages over an Ethernet or the Internet. TCP has a high overhead and does not preserve delimiters. UDP, while cheap, does not provide reliable sequenced delivery. Early versions of the system used a custom protocol that was efficient but unsatisfactory for internetwork transmission. When we implemented IP, TCP, and UDP we looked around for a suitable replacement with the following properties:

- Reliable datagram service with sequenced delivery
- Internetworking using IP
- Low complexity, high performance
- Adaptive timeouts

None met our needs so a new protocol was designed.
IL is a lightweight protocol designed to be encapsulated by IP. It is a connection-based protocol providing reliable transmission of sequenced messages between machines. No provision is made for flow control since the protocol is designed to transport RPC messages between client and server. A small outstanding message window prevents too many incoming messages from being buffered; messages outside the window are discarded and must be retransmitted. Connection setup uses a two way handshake to generate initial sequence numbers at each end of the connection; subsequent data messages increment the sequence numbers, allowing the receiver to resequence out of order messages. In contrast to other protocols, IL does not do blind retransmission. If a message is lost and a timeout occurs, a query message is sent. The query message is a small control message containing the current sequence numbers as seen by the sender. The receiver responds to a query by retransmitting missing messages. This allows the protocol to behave well in congested networks, where blind retransmission would cause further congestion.

Like TCP, IL has adaptive timeouts. A round-trip timer is used to calculate acknowledge and retransmission times in terms of the network speed. This allows the protocol to perform well on both the Internet and on local Ethernets.

In keeping with the minimalist design of the rest of the kernel, IL is small. The entire protocol is 847 lines of code, compared to 2200 lines for TCP. IL is our protocol of choice.
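The header sketched below is a reconstruction from this description, not a definitive wire format; an IL message rides inside an IP packet, carries source and destination ports, and labels each message with a sequence id and an acknowledgement. The fields are byte arrays so the layout is independent of host byte order:

typedef struct Ilhdr Ilhdr;
struct Ilhdr {
	unsigned char	sum[2];		/* checksum */
	unsigned char	len[2];		/* length of the IL message */
	unsigned char	type;		/* data, ack, query, reply, ... */
	unsigned char	spec;		/* special/window field */
	unsigned char	src[2];		/* source port */
	unsigned char	dst[2];		/* destination port */
	unsigned char	id[4];		/* sequence number of this message */
	unsigned char	ack[4];		/* highest in-sequence message seen */
};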
Network Addressing

A uniform interface to protocols and devices is not sufficient to support the transparency we require. Since each network uses a different addressing scheme, the ASCII strings written to a control file have no common format. As a result, every tool must know the specifics of the networks it is capable of addressing. Moreover, since each machine supplies a subset of the available networks, each user must be aware of the networks supported by every terminal and server machine. This is obviously unacceptable.
Several possible solutions were considered and rejected; one deserves more discussion. We could have used a user-level file server to represent the network name space as a Plan 9 file tree. This global naming scheme has been implemented in other distributed systems. The file hierarchy provides paths to directories representing network domains. Each directory contains files representing the names of the machines in that domain; an example might be the path /net/name/usa/edu/mit/ai. Each machine file contains information like the IP address of the machine. We rejected this representation for several reasons. First, it is hard to devise a hierarchy encompassing all representations of the various network addressing schemes in a uniform manner. Datakit and Ethernet address strings have nothing in common. Second, the address of a machine is often only a small part of the information required to connect to a service on the machine. For example, the IP protocols require symbolic service names to be mapped into numeric port numbers, some of which are privileged and hence special. Information of this sort is hard to represent in terms of file operations. Finally, the size and number of the networks being represented burden users with an unacceptably large amount of information about the organization of the network and its connectivity. In this case the Plan 9 representation of a resource as a file is not appropriate.
If tools are to be network independent, a third-party server must resolve network names. A server on each machine, with local knowledge, can select the best network for any particular destination machine or service. Since the network devices present a common interface, the only operation which differs between networks is name resolution. A symbolic name must be translated to the path of the clone file of a protocol device and an ASCII address string to write to the ctl file. A connection server (CS) provides this service.
The Database

On most systems several files such as /etc/hosts, /etc/networks, /etc/services, and /etc/hosts.equiv hold network information. Much time and effort is spent administering these files and keeping them mutually consistent. Tools attempt to automatically derive one or more of the files from information in other files but maintenance continues to be difficult and error prone.
Since we were writing an entirely new system, we were free to try a simpler approach. One database on a shared server contains all the information needed for network administration. Two ASCII files comprise the main database: /lib/ndb/local contains locally administered information and /lib/ndb/global contains information imported from elsewhere. The files contain sets of attribute/value pairs of the form attr=value, where attr and value are alphanumeric strings. Systems are described by multi-line entries; a header line at the left margin begins each entry followed by zero or more indented attribute/value pairs specifying names, addresses, properties, etc. For example, the entry for our CPU server specifies a domain name, an IP address, an Ethernet address, a Datakit address, a boot file, and supported protocols.
sys = helix
	dom=helix.research.bell-labs.com
	bootf=/mips/9power
	ip=135.104.9.31 ether=0800690222f0
	dk=nj/astro/helix
	proto=il
If several systems share entries such as network mask and gateway, we specify that information with the network or subnetwork instead of the system. The following entries define a Class B IP network and a few subnets derived from it. The entry for the network specifies the IP mask, file system, and authentication server for all systems on the network. Each subnetwork specifies its default IP gateway.
ipnet=mh-astro-net ip=135.104.0.0 ipmask=255.255.255.0
	fs=bootes.research.bell-labs.com
	auth=p9auth.research.bell-labs.com
ipnet=unix-room ip=135.104.117.0
	ipgw=135.104.117.1
ipnet=third-floor ip=135.104.51.0
	ipgw=135.104.51.1
ipnet=fourth-floor ip=135.104.52.0
	ipgw=135.104.52.1
Database entries also define the mapping of service names to port numbers for TCP, UDP, and IL.
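For example, entries of the following form associate service names with ports; the values shown are illustrative, using the attr=value syntax above, the well-known TCP ports, and the IL 9fs port seen in the connection server example below:

tcp=echo	port=7
tcp=discard	port=9
tcp=telnet	port=23
il=9fs		port=17008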
All programs read the database directly so consistency problems are rare. However the database files can become large. Our global file, containing all information about both Datakit and Internet systems in AT&T, has 43,000 lines. To speed searches, we build hash table files for each attribute we expect to search often. The hash file entries point to entries in the master files. Every hash file contains the modification time of its master file so we can avoid using an out-of-date hash table. Searches for attributes that aren't hashed or whose hash table is out-of-date still work, they just take longer.
Connection Server

On each system a user level connection server process, CS, translates symbolic names to addresses. CS uses information about available networks, the network database, and other servers (such as DNS) to translate names. CS is a file server serving a single file, /net/cs. A client writes a symbolic name to /net/cs then reads one line for each matching destination reachable from this host. The lines are of the form filename message, where filename is the path of the clone file to open for a new connection and message is the string to write to it to make the connection. The following example illustrates this. ndb/csquery is a program that prompts for strings to write to /net/cs and prints the replies.
cpu% ndb/csquery
> net!helix!9fs
/net/il/clone 135.104.9.31!17008
/net/dk/clone nj/astro/helix!9fs
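In program form, a query to CS is just file I/O on /net/cs; the following minimal sketch (error handling abbreviated) performs the query shown above:

#include <u.h>
#include <libc.h>

void
main(void)
{
	char buf[128];
	int fd, n;

	fd = open("/net/cs", ORDWR);
	if(fd < 0)
		sysfatal("open /net/cs: %r");
	if(write(fd, "net!helix!9fs", 13) < 0)
		sysfatal("write: %r");
	seek(fd, 0, 0);		/* rewind to read the answers */
	while((n = read(fd, buf, sizeof buf - 1)) > 0){
		buf[n] = 0;
		print("%s\n", buf);	/* e.g. /net/il/clone 135.104.9.31!17008 */
	}
	exits(nil);
}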
CS provides meta-name translation to perform complicated searches. The special network name net selects any network in common between source and destination supporting the specified service. A host name of the form $attr is the name of an attribute in the network database. The database search returns the value of the matching attribute/value pair most closely associated with the source host. Most closely associated is defined on a per network basis. For example, the symbolic name tcp!$auth!rexauth causes CS to search for the auth attribute in the database entry for the source system, then its subnetwork (if there is one) and then its network.
cpu% ndb/csquery
> net!$auth!rexauth
/net/il/clone 135.104.9.34!17021
/net/dk/clone nj/astro/p9auth!rexauth
/net/il/clone 135.104.9.6!17021
/net/dk/clone nj/astro/musca!rexauth
Normally CS derives naming information from its database files. For domain names however, CS first consults another user level process, the domain name server (DNS). If no DNS is reachable, CS relies on its own tables.
Like CS, the domain name server is a user level process providing one file, /net/dns. A client writes a request of the form domain-name type, where type is a domain name service resource record type. DNS performs a recursive query through the Internet domain name system producing one line per resource record found. The client reads /net/dns to retrieve the records. Like other domain name servers, DNS caches information learned from the network. DNS is implemented as a multi-process shared memory application with separate processes listening for network and local requests.
Library routines

The section on protocol devices described the details of making and receiving connections across a network. The dance is straightforward but tedious. Library routines are provided to relieve the programmer of the details.
The dial library call establishes a connection to a remote destination. It returns an open file descriptor for the data file in the connection directory.

	int dial(char *dest, char *local, char *dir, int *cfdp)

dest is the symbolic name/address of the destination. local is the local address. Since most networks do not support this, it is usually zero. dir is a pointer to a buffer to hold the path name of the protocol directory representing this connection. dial fills this buffer if the pointer is non-zero. cfdp is a pointer to a file descriptor for the ctl file of the connection. If the pointer is non-zero, dial opens the control file and stores the file descriptor there.
Most calls to dial specify only a destination name, leaving all other arguments zero. dial uses CS to translate the symbolic name to all possible destination addresses and attempts to connect to each in turn until one works. Specifying the special name net in the network portion of the destination allows CS to pick a network/protocol in common with the destination for which the requested service is valid. For example, assume the system research.bell-labs.com has the Datakit address nj/astro/research and IP addresses 135.104.117.5 and 129.11.4.1. The call

	fd = dial("net!research.bell-labs.com!login", 0, 0, 0);

tries in succession to connect to nj/astro/research!login on the Datakit and both 135.104.117.5!513 and 129.11.4.1!513 on the Internet, until one succeeds.

dial accepts addresses as well as symbolic names. For example, the destinations tcp!135.104.117.5!513 and tcp!research.bell-labs.com!login are both legal references to the same machine.
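A complete client needs little more than the dial call itself; in this minimal sketch the destination name and service are illustrative:

#include <u.h>
#include <libc.h>

void
main(void)
{
	char buf[128];
	int fd, n;

	/* let CS pick any common network offering the echo service */
	fd = dial("net!helix!echo", 0, 0, 0);
	if(fd < 0)
		sysfatal("dial: %r");
	write(fd, "hello\n", 6);
	if((n = read(fd, buf, sizeof buf)) > 0)
		write(1, buf, n);	/* the echoed bytes */
	exits(nil);
}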
A process uses four routines to listen for incoming connections. announce registers the process's intention to receive connections, listen waits for calls, and finally accept or reject is called to complete the connection.

announce returns an open file descriptor for the ctl file of a connection and fills dir with the path of the protocol directory for the announcement.

	int announce(char *addr, char *dir)

addr is the symbolic name/address announced; if it does not contain a service, the announcement is for all services not explicitly announced. Thus, one can easily write the equivalent of the inetd program without having to announce each separate service. An announcement remains in force until the control file is closed.

listen returns an open file descriptor for the ctl file and fills ldir with the path of the protocol directory for the received connection. It is passed dir from the announcement.

	int listen(char *dir, char *ldir)

accept and reject are called with the control file descriptor and ldir returned by listen. Some networks such as Datakit accept a reason for a rejection; networks such as IP ignore the third argument.

	int accept(int ctl, char *ldir)
	int reject(int ctl, char *ldir, char *reason)
The following code implements a typical TCP listener. It announces itself, listens for connections, and forks a new process for each. The new process echoes data on the connection until the remote end closes it. The "*" in the symbolic name means the announcement is valid for any addresses bound to the machine the program is run on.
#include <u.h>
#include <libc.h>

int
main(void)
{
	int afd, lcfd, dfd, n;
	char adir[40], ldir[40];
	char buf[256];

	/* announce the echo service on any local address */
	afd = announce("tcp!*!echo", adir);
	if(afd < 0)
		return -1;

	for(;;){
		/* listen for a call */
		lcfd = listen(adir, ldir);
		if(lcfd < 0)
			return -1;

		/* fork a process to echo */
		switch(fork()){
		case 0:
			/* accept the call and open the data file */
			dfd = accept(lcfd, ldir);
			if(dfd < 0)
				return -1;

			/* echo until EOF */
			while((n = read(dfd, buf, sizeof(buf))) > 0)
				write(dfd, buf, n);
			return 0;
		case -1:
			perror("forking");
		default:
			close(lcfd);
			break;
		}
	}
}
User Level

Communication between Plan 9 machines is done almost exclusively in terms of 9P messages. Only the two services cpu and exportfs are used. The cpu service is analogous to rlogin. However, rather than emulating a terminal session across the network, cpu creates a process on the remote machine whose name space is an analogue of the window in which it was invoked. exportfs is a user level file server which allows a piece of name space to be exported from machine to machine across a network. It is used by the cpu command to serve the files in the terminal's name space when they are accessed from the cpu server.
By convention, the protocol and device driver file systems are mounted in a directory called /net. Although the per-process name space allows users to configure an arbitrary view of the system, in practice their profiles build a conventional name space.
Exportfs

Exportfs is invoked by an incoming network call. The listener (the Plan 9 equivalent of inetd) runs the profile of the user requesting the service to construct a name space before starting exportfs. After an initial protocol establishes the root of the file tree being exported, the remote process mounts the connection, allowing exportfs to act as a relay file server. Operations in the imported file tree are executed on the remote server and the results returned. As a result the name space of the remote machine appears to be exported into a local file tree.

The import command calls exportfs on a remote machine, mounts the result in the local name space, and then exits. No local process is required to serve mounts; 9P messages are generated by the kernel's mount driver and sent directly over the network.
Exportfs must be multithreaded since the system calls open, read and write may block. Plan 9 does not implement the select system call but does allow processes to share file descriptors, memory and other resources. Exportfs and the configurable name space provide a means of sharing resources between machines. It is a building block for constructing complex name spaces served from many machines.
The simplicity of the interfaces encourages naive users to exploit the potential of a richly connected environment. Using these tools it is easy to gateway between networks. For example a terminal with only a Datakit connection can import from the server helix:

	import -a helix /net
The import command makes a Datakit connection to the machine helix where it starts an instance of exportfs to serve /net. The import command mounts the remote /net directory after (the -a option to import) the existing contents of the local /net directory. The directory then contains the union of the local and remote contents of /net. Local entries supersede remote ones of the same name so networks on the local machine are chosen in preference to those supplied remotely. However, unique entries in the remote directory are now visible in the local /net directory. All the networks connected to helix are now available in the terminal. The effect on the name space is shown by the following example:

	philw-gnot% import -a musca /net
Ftpfs

We decided to make our interface to FTP a file system rather than the traditional command. ftpfs dials the FTP port of a remote system, prompts for login and password, sets image mode, and mounts the remote file system onto /n/ftp. Files and directories are cached to reduce traffic. The cache is updated whenever a file is created. Ftpfs works with TOPS-20, VMS, and various Unix flavors as the remote system.
Cyclone Fiber Links

The file servers and CPU servers are connected by high-performance point-to-point links. A link consists of two VME cards connected by a pair of optical fibers. The VME cards use 33MHz Intel 960 processors and AMD's TAXI fiber transmitter/receivers to drive the lines at 125 Mbit/sec. Software in the VME card reduces latency by copying messages from system memory to fiber without intermediate buffering.
Performance

We measured both latency and throughput of reading and writing bytes between two processes for a number of different paths. Measurements were made on two- and four-CPU SGI Power Series processors. The CPUs are 25 MHz MIPS 3000s. The latency is measured as the round trip time for a byte sent from one process to another and back again. Throughput is measured using 16k writes from one process to another.

Table 1 - Performance

	test		throughput	latency
			MBytes/sec	millisec
	URP/Datakit	0.22		1.75
Conclusion

The representation of all resources as file systems coupled with an ASCII interface has proved more powerful than we had originally imagined. Resources can be used by any computer in our networks independent of byte ordering or CPU type. The connection server provides an elegant means of decoupling tools from the networks they use. Users successfully use Plan 9 without knowing the topology of the system or the networks they use. More information about 9P can be found in Section 5 of the Plan 9 Programmer's Manual.
References

[Pike90] R. Pike, D. Presotto, K. Thompson, H. Trickey, ``Plan 9 from Bell Labs'', UKUUG Proc. of the Summer 1990 Conf., London, England, 1990.

[Needham] R. Needham, ``Names'', in Distributed Systems, S. Mullender, ed., Addison Wesley, 1989.

[Presotto] D. Presotto, ``Multiprocessor Streams for Plan 9'', UKUUG Proc. of the Summer 1990 Conf., London, England, 1990.

[Met80] R. Metcalfe, D. Boggs, C. Crane, E. Taft and J. Hupp, ``The Ethernet Local Network: Three Reports'', XEROX Palo Alto Research Center, February 1980.

[Fra80] A. G. Fraser, ``Datakit - A Modular Network for Synchronous and Asynchronous Traffic'', Proc. Int'l Conf. on Communication, 1980.

[Pet89a] L. Peterson, ``RPC in the X-Kernel: Evaluating New Design Techniques'', Proc. Twelfth Symp. on Op. Sys. Princ., Litchfield Park, AZ, December 1989.

[Rit84a] D. M. Ritchie, ``A Stream Input-Output System'', AT&T Bell Laboratories Technical Journal, 63(8), October 1984.