The block device I/O system is like a
protocol stack of filters.
There is a set of pseudo-devices that call
recursively to other pseudo-devices and real devices.
The protocol stack is compiled from a configuration
string that specifies the order of pseudo-devices and devices.
Each pseudo-device and device has a set of entry points
that corresponds to the operations that the file system
The most notable operations are
The device stack can best be described by
describing the syntax of the configuration string
that specifies the stack.
Configuration strings are used
during the setup of the file system.
In the following recursive definition,
string that specifies a block device.
.IP "\fID\fP = (\fIDD\fP...)"
This is a set of devices that
are concatenated to form a single device.
The size of the concatenated device is the
sum of the sizes of each sub-device.
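As a rough illustration of the concatenation rule, the sketch below maps a block address on the concatenated device to a sub-device and a local block address. The function name
.CW cat_lookup
and the list-of-sizes representation are hypothetical; they are not taken from the file server source.

```python
def cat_lookup(sizes, addr):
    """Map a block address on a concatenated device to
    (sub-device index, block address within that sub-device)."""
    for i, size in enumerate(sizes):
        if addr < size:
            return i, addr
        addr -= size  # skip past this sub-device
    raise IndexError("block address beyond end of concatenated device")

sizes = [100, 50, 200]  # hypothetical sub-device sizes, in blocks
total = sum(sizes)      # size of the concatenated device
```

With these sizes, block 100 of the concatenated device is block 0 of the second sub-device.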
.IP "\fID\fP = [\fIDD\fP...]"
This is the interleaving of the
If there are N devices in the list,
then the pseudo-device is the N-way block
interleaving of the sub-devices.
The size of the interleaved device is
N times the size of the smallest sub-device.
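One plausible reading of N-way block interleaving is that consecutive blocks rotate through the sub-devices in list order. The sketch below assumes that layout; the name
.CW interleave_lookup
is hypothetical, and the real driver may number blocks differently.

```python
def interleave_size(sizes):
    """Size of the interleaved device: N times the smallest sub-device."""
    return len(sizes) * min(sizes)

def interleave_lookup(sizes, addr):
    """Map a block address to (sub-device index, local block address),
    assuming block b lives on sub-device b mod N at local block b div N."""
    n = len(sizes)
    if not 0 <= addr < interleave_size(sizes):
        raise IndexError("block address beyond end of interleaved device")
    return addr % n, addr // n

sizes = [100, 80, 120]  # hypothetical sub-device sizes, in blocks
```

Blocks of the larger sub-devices beyond the size of the smallest one are never addressed, which is why the total is N times the minimum rather than the sum.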
.IP "\fID\fP = {\fIDD\fP...}"
This is a set of devices that
constitute a `mirror' of the first sub-device, and form a single device.
A write to the device is performed,
at the same block address,
on the sub-devices, in right-to-left order.
A read from the device is performed on each sub-device,
in left-to-right order, until a read succeeds without error,
or the set is exhausted.
One can think of this as a poor man's RAID 1.
The size of the device is the size of the smallest sub-device.
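The read and write ordering described above can be sketched with in-memory stand-ins for the sub-devices. The classes
.CW MemDev
and
.CW Mirror
are hypothetical and exist only for illustration.

```python
class MemDev:
    """Hypothetical in-memory block device; `broken` makes every read fail."""
    def __init__(self, nblocks, broken=False):
        self.blocks = [None] * nblocks
        self.broken = broken

    def read(self, addr):
        if self.broken:
            raise OSError("read error")
        return self.blocks[addr]

    def write(self, addr, data):
        self.blocks[addr] = data


class Mirror:
    def __init__(self, devs):
        self.devs = devs
        # The mirror is as large as its smallest sub-device.
        self.size = min(len(d.blocks) for d in devs)

    def write(self, addr, data):
        # Same block address on every sub-device, right to left.
        for d in reversed(self.devs):
            d.write(addr, data)

    def read(self, addr):
        # Left to right until one read succeeds.
        for d in self.devs:
            try:
                return d.read(addr)
            except OSError:
                continue
        raise OSError("read failed on every sub-device")
```

A read still succeeds when the first sub-device fails, as long as a later one holds the block.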
.IP "\fID\fP = \f(CWp\fP\fIDN1.N2\fP"
This is a partition of a sub-device.
The sub-device is partitioned into 100 equal pieces.
If the size of the sub-device is not divisible by 100,
then there will be some slop thrown away at the top.
The pseudo-device starts at the N1-th piece and
continues for N2 pieces. Thus
last third of the device
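The partition arithmetic can be sketched as follows, assuming integer division and zero-based piece numbering. The function name
.CW partition
and the example values (a 1050-block device, N1 = 67, N2 = 33) are hypothetical.

```python
def partition(size, n1, n2):
    """Return (first block, length in blocks) of the partition that starts
    at piece n1 and spans n2 pieces of a sub-device of `size` blocks."""
    piece = size // 100  # any remainder is slop at the top, never used
    return n1 * piece, n2 * piece

start, length = partition(1050, 67, 33)
slop = 1050 - 100 * (1050 // 100)  # blocks thrown away at the top
```

Here each piece is 10 blocks, so the partition covers blocks 670 through 999 and the top 50 blocks are unused.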
.IP "\fID\fP = \f(CWf\fP\fID\fP"
This is a fake write-once-read-many device simulated by a
second read-write device.
This second device is partitioned
into a set of block flags and a set of blocks.
The flags are used to generate errors if a
block is ever written twice or read without being written first.
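The flag checks can be sketched as below. The class
.CW FakeWorm
is hypothetical, and it keeps the flags and blocks in two Python lists rather than in two regions of one underlying device.

```python
class FakeWorm:
    """Write-once-read-many behaviour enforced with one flag per block."""
    def __init__(self, nblocks):
        self.written = [False] * nblocks  # the block flags
        self.blocks = [None] * nblocks    # the data blocks

    def write(self, addr, data):
        if self.written[addr]:
            raise OSError("block %d already written" % addr)
        self.blocks[addr] = data
        self.written[addr] = True

    def read(self, addr):
        if not self.written[addr]:
            raise OSError("block %d read before being written" % addr)
        return self.blocks[addr]
```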
.IP "\fID\fP = \f(CWx\fP\fID\fP"
This is a byte-swapped version of the file system on D.
Since the file server currently writes integers in metadata to disk
in native byte order, moving a file system to a machine of the other
major byte order (e.g., MIPS to Pentium)
It knows the sizes of the various integer fields in the file system metadata.
Ideally, the file server would follow the Plan 9 religion and write a consistent
byte order on disk, regardless of processor.
In the meantime, it should be possible to automatically determine the need
for byte-swapping by examining data in the super-block of each file system,
though this has not been implemented yet.
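Swapping by field width, rather than reversing whole blocks, can be sketched as follows. The function
.CW swap_fields
and the two-field layout are hypothetical and do not reflect the actual on-disk metadata format.

```python
import struct

def swap_fields(block, widths):
    """Reverse the byte order of each leading integer field.
    `widths` lists the field sizes in bytes; bytes after the last
    field (names, data) are copied unchanged."""
    out = bytearray()
    off = 0
    for width in widths:
        out += block[off:off + width][::-1]
        off += width
    out += block[off:]
    return bytes(out)

# A hypothetical record: a 32-bit and a 16-bit integer, then a name.
big = struct.pack(">IH", 0x01020304, 0x0506) + b"name"
little = swap_fields(big, [4, 2])
```

Knowing the field sizes matters because character data such as names must not be reversed.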
.IP "\fID\fP = \f(CWc\fP\fIDD\fP"
This is the cache/WORM device made up of a cache (read-write)
device and a WORM (write-once-read-many) device.
.IP "\fID\fP = \f(CWo\fP"
This is the dump file system that is the
two-level hierarchy of all dumps ever taken on a cache/WORM.
The read-only root of the cache/WORM file system
(on the dump taken Feb 18, 1995) can
in this pseudo device.
The second dump taken that day will be
.IP "\fID\fP = \f(CWw\fP\fIN1.N2.N3\fP"
This is a SCSI disk on controller N1, target N2 and logical unit number N3.
.IP "\fID\fP = \f(CWh\fP\fIN1.N2.0\fP"
This is an (E)IDE or *ATA disk on controller N1, target N2
(target 0 is the IDE master, 1 the slave device).
These disks are currently run via programmed I/O, not DMA,
so they tend to be slower to access than SCSI disks.
.IP "\fID\fP = \f(CWr\fP\fIN1\fP"
but refers to a side of a WORM disc.
.IP "\fID\fP = \f(CWl\fP\fIN1\fP"
but one block from the SCSI disk is removed for labeling.
.IP "\fID\fP = \f(CWj(\fP\fID\d\s-2\&1\s+2\u\fID\d\s-2\&2\s+2\u\f(CW*)\fID\d\s-2\&3\s+2\u\f1"
is the juke box SCSI interface.
.I D\d\s-2\&2\s+2\u 's
are the SCSI drives in the juke box
.I D\d\s-2\&3\s+2\u 's
are the demountable platters in the juke box.
must be pseudo devices of
devices any of the configuration numbers
can be replaced by an iterator of the form
N1 can be greater than N2, indicating a descending sequence.
is the interleaved SCSI disks on SCSI targets
2 through 6 of SCSI controller 0.
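Iterator expansion can be sketched as a textual rewrite, assuming the iterator is written
.CW <N1-N2>
as in the configuration strings quoted in this section. The function
.CW expand
is hypothetical and is not the file server's own parser.

```python
import re

ITER = re.compile(r"<(\d+)-(\d+)>")

def expand(spec):
    """Expand every <N1-N2> iterator in a device specification into
    the list of specifications it stands for."""
    m = ITER.search(spec)
    if m is None:
        return [spec]
    first, last = int(m.group(1)), int(m.group(2))
    step = 1 if first <= last else -1  # N1 > N2 counts downward
    out = []
    for n in range(first, last + step, step):
        out.extend(expand(spec[:m.start()] + str(n) + spec[m.end():]))
    return out

cache = expand("w1.<0-5>.0")    # ascending
drives = expand("w1.<6-0>.0")   # descending
platters = expand("l<0-236>") + expand("l<238-474>")
```

Under this reading,
.CW w1.<0-5>.0
names six disks, and the two
.CW l
iterators together name 474 devices.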
The main file system on
is defined by the configuration string
c[w1.<0-5>.0]j(w6w5w4w3w2)(l<0-236>l<238-474>)
This is a cache/WORM driver.
The cache is six interleaved disks on SCSI controller 1
targets 0, 1, 2, 3, 4, and 5.
The WORM half of the cache/WORM
is 474 jukebox disks.
has a main file system defined by
c[w<1-3>]j(w1.<6-0>.0)(l<0-124>l<128-252>)
matters here, since the optical jukebox's WORM drives'
run in descending order relative to the numbers of the drives
(e.g., the jukebox controller is SCSI target 6,
drive #1 is SCSI target 5,
and drive #6 is SCSI target 0).
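The drive-to-target mapping in this example can be checked with a few lines, assuming the first device produced by the descending iterator is the jukebox interface and the rest are the drives in order. The variable names are hypothetical.

```python
# <6-0> expands to the SCSI targets 6, 5, ..., 0.
targets = list(range(6, -1, -1))

# The first device is the jukebox interface; the rest are its drives.
interface, drive_targets = targets[0], targets[1:]

# Drives are numbered from 1 in the order they appear.
drive_to_target = {n: t for n, t in enumerate(drive_targets, start=1)}
```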