.HTML "A Manual for the Plan 9 assembler
.ta 8n +8n +8n +8n +8n +8n +8n
A Manual for the Plan 9 assembler
rob@plan9.bell-labs.com
There is an assembler for each of the MIPS, SPARC, Intel 386,
Intel 960, AMD 29000, Motorola 68020 and 68000, Motorola PowerPC,
AMD64, DEC Alpha, and Acorn ARM.
The 68020 assembler is the oldest and in many ways the prototype.
The assemblers are really just variations of a single program:
they share many properties such as left-to-right assignment order for
instruction operands and the synthesis of macro instructions
to hide the peculiarities of the load and store structure of the machines.
To keep things concrete, the first part of this manual is
specifically about the 68020.
At the end is a description of the differences among
the other assemblers.
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
is a prerequisite for this manual.
All pre-defined symbols in the assembler are upper-case.
floating-point registers are
is used by the C compiler to point to data, enabling short addresses to
is constant and must be set during C program initialization
to the address of the externally-defined symbol
The following hardware registers are defined in the assembler; their
meaning should be obvious given a 68020 manual:
The assembler also defines several pseudo-registers that
is the frame pointer, so
is the first argument,
is the second, and so on.
is the local stack pointer, where automatic variables are held
(SP is a pseudo-register only on the 68020);
is the first automatic, and so on as with
is the top-of-stack register, used for pushing parameters to procedures,
saving temporary values, and so on.
The assembler and loader track these pseudo-registers so
the above statements are true regardless of what has been
pushed on the hardware stack, pointed to by
refers to the hardware stack pointer, but beware of mixed use of
and the above stack-related pseudo-registers, which will cause trouble.
instruction is observed by the loader to
alter SP; the loader will thus insert a corresponding pop before all returns.
The assembler accepts a label-like name to be attached to
to help document that
is the first argument to a routine.
The name goes in the symbol table but has no significance to the result
All external references must be made relative to some pseudo-register,
(the virtual program counter) or
(the ``static base'' register).
counts instructions, not bytes of data.
For example, to branch to the second following instruction, that is,
to skip one instruction, one may write
Labels are also allowed, as in
When using labels, there is no
refers to the beginning of the address space of the program.
Thus, references to global data and procedures are written as
to push the address of a global array on the stack, or
MOVL array+4(SB), TOS
to push the second (4-byte) element of the array.
Note the use of an offset; the complete list of addressing modes is given below.
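The offset in a reference such as array+4(SB) is simply the element index times the element size.
The following Python sketch does that arithmetic and builds the operand text.
The helper name sb_operand is hypothetical; it is not part of any Plan 9 tool and only illustrates the notation.

```python
def sb_operand(name, index, elem_size):
    """Return operand text for element `index` of a global array.

    Assumes elements are laid out contiguously from offset 0,
    each occupying `elem_size` bytes.
    """
    if index < 0 or elem_size <= 0:
        raise ValueError("index must be >= 0 and elem_size must be > 0")
    offset = index * elem_size
    return f"{name}+{offset}(SB)"


# Second (index 1) element of an array of 4-byte items.
print(sb_operand("array", 1, 4))
```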
Similarly, subroutine calls must use
File-static variables have syntax
will be filled in at load time by a unique integer.
When a program starts, it must execute
before accessing any global data.
(On machines such as the MIPS and SPARC that cannot load a register
in a single instruction, constants are loaded through the static base
register. The loader recognizes code that initializes the static
base register and treats it specially. You must be careful, however,
not to load large constants on such machines when the static base
register is not set up, such as early in interrupt routines.)
Expressions are mostly what one might expect.
Where an offset or a constant is expected,
a primary expression with unary operators is allowed.
A general C constant expression is allowed in parentheses.
Source files are preprocessed exactly as in the C compiler, so
The simple addressing modes are shared by all the assemblers.
Here, for completeness, follows a table of all the 68020 addressing modes,
since that machine has the richest set.
is an offset, which if zero may be elided, and
is a displacement, which is a constant between -128 and 127 inclusive.
Many of the modes listed have the same name;
scrutiny of the format will show what default is being applied.
For instance, indexed mode with no address register supplied operates
as though a zero-valued register were used.
For "offset" read "displacement."
For "\f(CW.s\fP" read one of
to indicate the size and scaling of the data.
.TS
l lfCW .
floating-point register	F0
special names	CAAR, CACR, etc.
floating point constant	$fcon
external symbol	name+o(SB)
local symbol	name<>+o(SB)
automatic symbol	name+o(SP)
address of external	$name+o(SB)
address of local	$name<>+o(SB)
indirect post-increment	(A0)+
indirect pre-decrement	-(A0)
indirect with offset	o(A0)
indexed with offset	o()(R0.s)
indexed with offset	o(A0)(R0.s)
external indexed	name+o(SB)(R0.s)
local indexed	name<>+o(SB)(R0.s)
automatic indexed	name+o(SP)(R0.s)
parameter indexed	name+o(FP)(R0.s)
offset indirect post-indexed	d(o())(R0.s)
offset indirect post-indexed	d(o(A0))(R0.s)
external indirect post-indexed	d(name+o(SB))(R0.s)
local indirect post-indexed	d(name<>+o(SB))(R0.s)
automatic indirect post-indexed	d(name+o(SP))(R0.s)
parameter indirect post-indexed	d(name+o(FP))(R0.s)
offset indirect pre-indexed	d(o()(R0.s))
offset indirect pre-indexed	d(o(A0))
offset indirect pre-indexed	d(o(A0)(R0.s))
external indirect pre-indexed	d(name+o(SB))
external indirect pre-indexed	d(name+o(SB)(R0.s))
local indirect pre-indexed	d(name<>+o(SB))
local indirect pre-indexed	d(name<>+o(SB)(R0.s))
automatic indirect pre-indexed	d(name+o(SP))
automatic indirect pre-indexed	d(name+o(SP)(R0.s))
parameter indirect pre-indexed	d(name+o(FP))
parameter indirect pre-indexed	d(name+o(FP)(R0.s))
.TE
Placing data in the instruction stream, say for interrupt vectors, is easy:
the pseudo-instructions
lay down the value of their single argument, of the appropriate size,
as if it were an instruction:
places the long 12345 (base 10)
in the instruction stream.
the only such operator is
and it lays down 32-bit quantities.
The 386 has all three:
to that for 64-bit values.
The 960 has only one,
Placing information in the data section is more painful.
The pseudo-instruction
does the work, given two arguments: an address at which to place the item,
and the value to place there. For example, to define a character array
containing the characters
and a terminating null:
DATA array+0(SB)/1, $'a'
DATA array+1(SB)/1, $'b'
DATA array+2(SB)/1, $'c'
DATA array+0(SB)/4, $"abc\ez"
defines the number of bytes to define,
makes the symbol global, and the
says how many bytes the symbol occupies.
Uninitialized data is zeroed automatically.
is equivalent to the C
statement may contain a maximum of eight bytes;
build larger strings piecewise.
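Building a long string piecewise is tedious by hand, so a small generator can help.
The Python sketch below splits a byte string into pieces of at most eight bytes and prints one DATA statement per piece, in the style of the examples above.
The function names are hypothetical, and two things are assumed rather than checked against an assembler: that a string operand may have any width from 1 to 8, and that only printable ASCII plus NUL (written \z, as in the string example above) needs to be handled.
The symbol must still be declared global and given a size separately.

```python
def quote_piece(piece):
    """Render bytes as the body of an assembler string constant.

    Assumption: only printable ASCII (without quote or backslash)
    and NUL are supported; NUL is written as \\z.
    """
    out = []
    for b in piece:
        if b == 0:
            out.append("\\z")
        elif 0x20 <= b < 0x7F and chr(b) not in '"\\':
            out.append(chr(b))
        else:
            raise ValueError(f"unsupported byte value {b}")
    return "".join(out)


def data_statements(name, payload, chunk=8):
    """Return DATA lines that lay down `payload` at name+0(SB) onward."""
    if not 1 <= chunk <= 8:
        raise ValueError("a DATA statement holds at most eight bytes")
    lines = []
    for off in range(0, len(payload), chunk):
        piece = payload[off:off + chunk]
        lines.append(
            f'DATA {name}+{off}(SB)/{len(piece)}, $"{quote_piece(piece)}"'
        )
    return lines


for line in data_statements("msg", b"hello, world\0"):
    print(line)
```

Each piece carries its own offset, so the statements can appear in any order in the source file.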
Two pseudo-instructions,
allow the (obsolete) Alef compilers to build dynamic type information during the load
pseudo-instruction has two forms:
DYNT , ALEF_SI_5+0(SB)
DYNT ALEF_AS+0(SB), ALEF_SI_5+0(SB)
defines the symbol to be a small unique integer constant, chosen by the loader,
which is some multiple of the word size. In the second form,
defines the second symbol in the same way,
places the address of the most recently
defined text symbol in the array specified by the first symbol at the
index defined by the value of the second symbol,
and then adjusts the size of the array accordingly.
pseudo-instruction takes the same parameters as a
statement. Its symbol is used as the base of an array and the
data item is installed in the array at the offset specified by the most recent
The size of the array is adjusted accordingly.
pseudo-instructions are not implemented on the 68020.
Entry points are defined by the pseudo-operation
which takes as arguments the name of the procedure (including the ubiquitous
and the number of bytes of automatic storage to pre-allocate on the stack,
which will usually be zero when writing assembly language programs.
On machines with a link register, such as the MIPS and SPARC,
the special value -4 instructs the loader to generate no PC save
and restore instructions, even if the function is not a leaf.
Here is a complete procedure that returns the sum
of its two arguments:
An optional middle argument
pseudo-op is a bit field of options to the loader.
Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
will not be profiled; the first version above would be.
Subroutines with peculiar state, such as system call routines,
should not be profiled.
Setting the 2 bit allows multiple definitions of the same
symbol in a program; the loader will place only one such function in the image.
It was emitted only by the Alef compilers.
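Because the optional argument is a bit field, the two options can be combined by adding or OR-ing their values.
The Python sketch below decodes such a value into the two meanings described above.
The constant names NOPROF and DUPOK and the function name are chosen here for illustration and should be treated as assumptions; only the bit values 1 and 2 come from the text.

```python
NOPROF = 1  # assumed name: the 1 bit, suspend profiling of this function
DUPOK = 2   # assumed name: the 2 bit, allow multiple definitions


def describe_text_flags(flags):
    """List the loader options selected by a flag value."""
    notes = []
    if flags & NOPROF:
        notes.append("not profiled")
    if flags & DUPOK:
        notes.append("duplicate definitions allowed")
    return notes


for value in (0, 1, 2, 3):
    print(value, describe_text_flags(value))
```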
Subroutines to be called from C should place their result in
even if it is an address.
Floating point values are returned in
Functions that return a structure to a C program
receive as their first argument the address of the location to
is unused in the calling protocol for such procedures.
Each routine is responsible for saving its own registers across the calls it makes,
so a called subroutine is free to use any registers without saving them (``caller saves'').
are the exceptions as described above.
If you get confused, try using the
and compiling a sample program.
The standard output is valid input to the assembler.
The instruction set of the assembler is not identical to that
It is chosen to match what the compiler generates, augmented
slightly by specific needs of the operating system.
does not distinguish between the various forms of
instruction: move quick, move address, etc. Instead the context
does the job. For example,
A number of instructions do not have the syntax necessary to specify
their entire capabilities. Notable examples are the bitfield
multiply and divide instructions, etc.
For a complete set of generated instruction names (in
notation, not Motorola's) see the file
.CW /sys/src/cmd/2c/2.out.h .
Despite its name, this file contains an enumeration of the
instructions that appear in the intermediate files generated
by the compiler, which correspond exactly to lines of assembly language.
The MC68000 assembler,
is essentially the same, honoring the appropriate subset of the instructions
and addressing modes.
The definitions of these are, nonetheless, part of
Laying down instructions
The loader modifies the code produced by the assembler and compiler.
copies short sequences of code to eliminate branches,
and discards unreachable code.
The first instruction of every function is assumed to be reachable.
The pseudo-instruction
which you may see in compiler output,
means no instruction at all, rather than an instruction that does nothing.
The loader discards all
instruction, or any other instruction not known to the assembler, use a
Such instructions on RISCs are not scheduled by the loader and must have
their delay slots filled manually.
The registers are only addressed by number:
is the stack pointer;
is used as the static base pointer, the analogue of
Its value is the address of the global symbol
The register holding returned values from subroutines is
When a function is called, space for the first argument
but in C (not Alef) the value is passed in
as a temporary. The system uses
as interrupt-time temporaries. Therefore none of these registers
should be used in user code.
The control registers are not known to the assembler.
Instead they are numbered registers
Use this trick to access, say,
Floating point registers are called
must be initialized to the value 0.0,
this is done by the operating system.
The instructions and their syntax are different from those of the manufacturer's
and kin; instead there are
(move byte) pseudo-instructions. If the operand is unsigned, the instructions
The order of operands is from left to right in dataflow order, just as
on the 68020 but not as in MIPS documentation.
instructions are reversed with respect to the book; for example, a
The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
It understands the 64-bit instructions
The assembler does not have any cache, load-linked, or store-conditional instructions.
Some assembler instructions are expanded into multiple instructions by the loader.
For example, the loader may convert the load of a 32-bit constant into an
Assembler instructions should be laid out as if there
were no load, branch, or floating point compare delay slots;
the loader will rearrange\(em\f2schedule\f1\(emthe instructions
to guarantee correctness and improve performance.
The only exception is that the correct scheduling of instructions
that use control registers varies from model to model of machine
(and is often undocumented) so you should schedule such instructions
by hand to guarantee correct behavior.
when it needs a true no-op instruction.
Use exactly this instruction when scheduling code manually;
the loader recognizes it and schedules the code before it and after it independently. Also,
pseudo-ops are scheduled like no-ops.
pseudo-op disables instruction scheduling
(scheduling is enabled by default);
Branch folding, code copying, and dead code elimination are
disabled for instructions that are not scheduled.
Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
Registers have numerical names only:
Forget about register windows: Plan 9 doesn't use them at all.
The machine has 32 global registers, period.
[sic] is the stack pointer.
is the static base register, with value the address of
is the return register and also the register holding the first
argument to a C (not Alef) function, again with space reserved at
is the loader temporary.
Floating-point registers are exactly as on the MIPS.
The control registers are known by names such as
The instructions to access these registers are
instructions, for example
for the SPARC instruction
Move instructions are similar to those on the MIPS: pseudo-operations
that turn into appropriate sequences of
instructions, adds, etc.
Instructions read from left to right. Because the arguments are
the condition codes are not inverted as on the MIPS.
The syntax for alternate address space (ASI) accesses is, for example, to move a word from ASI 2:
The syntax for double indexing is
The SPARC's instruction scheduling is similar to the MIPS's.
The official no-op instruction is:
Registers are numbered
it is initialized to the address of
must be zero; this should be done manually early in execution by
is the loader temporary.
There is no support for floating point.
The Intel calling convention is not supported and cannot be used; use
Instructions are mostly as in the book. The major change is that
The extension character for
The assembler assumes 32-bit protected mode.
The register names are
The stack pointer (not a pseudo-register) is
and the return register is
There is no physical frame pointer but, as for the MIPS,
is a pseudo-register that acts as
Opcode names are mostly the same as those listed in the Intel manual
appended to identify 32-bit,
16-bit, and 8-bit operations.
The exceptions are loads, stores, and conditionals.
All load and store opcodes to and from general registers, special registers
or memory are written
instruction. If you need to access
you must mention it explicitly in a
There are many examples of illegal moves, for example,
that the loader actually implements as pseudo-operations.
The names of conditions in all conditional instructions
follow the conventions of the 68020 instead of those of the Intel
The addressing modes have syntax like
can be replaced by offsets from
to access names, for example
.CW extern+5(SB)(AX*2) .
Other notes: Non-relative
are legal loop instructions. Only
are recognized repeaters. These are not prefixes, but rather
stand-alone opcodes that precede the strings, for example
Segment override prefixes in
fields are not supported.
The assembler assumes 64-bit mode unless a
pseudo-operation is given:
to change to 32-bit mode.
The effect is mainly to diagnose instructions that are illegal in
the given mode, but the loader will also assume 32-bit operands and addresses,
and 32-bit PC values for call and return.
The assembler's conventions are similar to those for the 386, above.
The architecture provides extra fixed-point registers
All registers are 64 bits wide, but instructions access the low-order 8, 16, and 32 bits
as described in the processor handbook.
puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32-bit values, which are sign-extended
to 64 bits in 64-bit operations; the exception is
which allows 64-bit literals.
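The two widening rules above (clearing the top half on a 32-bit move, and sign-extending a 32-bit literal in a 64-bit operation) can be modeled with plain integer arithmetic.
The Python sketch below is only a model of the arithmetic, not of the assembler; all function names are hypothetical.
The last function checks whether a 64-bit value can be expressed as a sign-extended 32-bit literal.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def zero_extend_32(value):
    """Model a 32-bit move: keep the low 32 bits, clear the top 32."""
    return value & MASK32


def sign_extend_32(value):
    """Model a 32-bit literal in a 64-bit operation: copy bit 31 upward."""
    value &= MASK32
    if value & 0x80000000:
        value |= MASK64 ^ MASK32  # set bits 32..63
    return value


def fits_signed_32(value64):
    """True if the 64-bit pattern equals its own low half sign-extended."""
    value64 &= MASK64
    return sign_extend_32(value64) == value64


print(hex(zero_extend_32(0xFFFFFFFF80000000)))
print(hex(sign_extend_32(0x80000000)))
print(fits_signed_32(0x80000000))
```

A value such as 0x80000000 fails the check because sign extension would set the upper 32 bits, so it needs the 64-bit literal form mentioned above.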
The external registers in Plan 9's C are allocated from
There are many new instructions, including the MMX and XMM media instructions,
and conditional move instructions.
As with the 386 instruction names,
all new 64-bit integer instructions, and the MMX and XMM instructions
for `long word' (32 bits) and
for `quad word' (64 bits).
Some instructions use
(`octword') for 128-bit values, where the processor handbook
The assembler also consistently uses
XMM instructions, instead of
can be used to move values to and from control registers, even when
the registers might be 64 bits.
The assembler often accepts the handbook's name to ease conversion
of existing code (but remember that the operand order is uniformly
source then destination).
type is 64 bits, but passed and returned by value, not by reference.
More notably, C pointer values are 64 bits, and thus
.CW "unsigned long long"
are the only integer types wide enough to hold a pointer value.
The C compiler and library use the XMM floating-point instructions, not
the old 387 ones, although the latter are implemented by assembler and loader.
Unlike the 386, the first integer or pointer argument is passed in a register, which is
for an integer or pointer (it can be referred to in assembly code by the pseudonym
holds the return value from subroutines as before.
Floating-point results are returned in
although currently the first floating-point parameter is not passed in a register.
All parameters less than 8 bytes in length have 8-byte slots reserved on the stack
to preserve alignment and simplify variable-length argument list access,
including the first parameter when passed in a register,
even though bytes 4 to 7 are not initialized.
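The slot rule can be illustrated by computing where each parameter would sit in the argument area.
The Python sketch below is a model under stated assumptions: offsets are counted from the start of the argument area, every parameter (including the first, even when it travels in a register) gets a slot, and parameters larger than 8 bytes are assumed to be rounded up to a multiple of 8.
That last rounding rule is a guess made for the sketch, not something the text states, and the function name is hypothetical.

```python
SLOT = 8  # bytes reserved for any parameter shorter than 8 bytes


def param_offsets(sizes):
    """Return (offsets, total) for parameters of the given byte sizes.

    Assumption: each parameter occupies its size rounded up to a
    multiple of SLOT, and slots are laid out in declaration order.
    """
    offsets = []
    offset = 0
    for size in sizes:
        if size <= 0:
            raise ValueError("parameter sizes must be positive")
        offsets.append(offset)
        offset += (size + SLOT - 1) // SLOT * SLOT
    return offsets, offset


# An int, a char, a pointer, and a short.
print(param_offsets([4, 1, 8, 2]))
```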
On the Alpha, all registers are 64 bits. The architecture handles 32-bit values
by giving them a canonical format (sign extension in the case of integer registers).
Registers are numbered
holds the return value from subroutines, and also the first parameter.
is the stack pointer,
is the link register, and
are linker temporaries.
Floating point registers are numbered
The extension character for
follows DEC's notation:
for quadword (64 bits).
Byte and ``word'' loads and stores may be made unsigned
refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
The PowerPC follows the Plan 9 model set by the MIPS and SPARC,
not the elaborate ABIs.
The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
there is no support for the older POWER instructions.
is initialized to zero; this is done by C start up code
and assumed by the compiler and loader.
is the stack pointer.
is the static base register, with value the address of
is the return register and also the register holding the first
argument to a C function, with space reserved at
is the loader temporary.
The external registers in Plan 9's C are allocated from
Floating point registers are called
By convention, several registers are initialized
to specific values; this is done by the operating system.
must be initialized to the value
.CW 0x4330000080000000
(used by float-to-int conversion),
As on the MIPS and SPARC, the assembler accepts arbitrary literals
and others where `immediate' variants exist,
and the loader generates sequences
The register indirect addressing modes use the same syntax as the SPARC,
including double indexing when allowed.
The instruction names are generally derived from the Motorola ones,
subject to slight transformation:
marking the setting of condition codes is replaced by
represents `OE=1' it is replaced by
As well as the three-operand conditional branch instruction
the assembler provides pseudo-instructions for the common cases:
The unconditional branch instruction is
Indirect branches use
Load or store operations are replaced by
variants in the usual way:
(move halfword with sign extension), and
(move byte with sign extension, a pseudo-instruction),
with unsigned variants
`Load or store with update' versions are
Load or store multiple is
The exceptions are the string instructions, which are
and the reservation instructions
all with operands in the usual data-flow order.
Floating-point load or store instructions are
The register to register move instructions
The assembler knows the commonly used special purpose registers:
The rest, which are often architecture-dependent, are referenced as
The segment registers of the 60x series are similarly
can also be a register name, as in
Moves between special purpose registers and general purpose ones,
when allowed by the architecture,
The fields of the condition register
They are used by the
(move field) pseudo-instruction,
They are also accepted in
the conditional branch instruction, for example
The assembler provides access to
The stack pointer is
the link register is
and the static base register is
is the return register and also the register holding
the first argument to a subroutine.
The assembler supports the
It also knows about coprocessor registers
Floating registers are
As with the other architectures, loads and stores are called
for load word or store word, and
load or store multiple,
depending on the operands.
Addressing modes are supported by suffixes to the instructions:
(decrement after), and
These can only be used with the
The move multiple instruction,
defines a range of registers using brackets, e.g.
addressing mode bits
are written in the same manner, for example,
instruction to access user
when in another processor mode.
Shifts and rotates in addressing modes are supported by binary operators
(logical left shift),
(logical right shift),
(arithmetic right shift), and
(rotate right); for example
The assembler does not support indexing by a shifted expression;
only names can be doubly indexed.
Any instruction can be followed by a suffix that makes the instruction conditional:
and so on, as in the ARM manual, with synonyms
and logical instructions
suffix, as ARM allows, to set condition codes.
coprocessor instructions is largely as in the manual, with the usual adjustments.
The assembler directly supports only the ARM floating-point coprocessor
operations used by the compiler:
suffix selecting single or double precision.
Floating-point load or store become
Conversion instructions are also specified by moves:
For details about this assembly language, which was built for the AMD 29240,
look at the sources or examine compiler output.