.HTML "A Manual for the Plan 9 assembler
.ft CW
.ta 8n +8n +8n +8n +8n +8n +8n
.ft
.TL
A Manual for the Plan 9 assembler
.AU
Rob Pike
rob@plan9.bell-labs.com
.SH
Machines
.PP
There is an assembler for each of the MIPS, SPARC, Intel 386,
Intel 960, AMD 29000, Motorola 68020 and 68000, Motorola Power PC,
AMD64, DEC Alpha, and Acorn ARM.
The 68020 assembler,
.CW 2a ,
is the oldest and in many ways the prototype.
The assemblers are really just variations of a single program:
they share many properties such as left-to-right assignment order for
instruction operands and the synthesis of macro instructions
such as
.CW MOVE
to hide the peculiarities of the load and store structure of the machines.
To keep things concrete, the first part of this manual is
specifically about the 68020.
At the end is a description of the differences among
the other assemblers.
.PP
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
is a prerequisite for this manual.
.SH
Registers
.PP
All pre-defined symbols in the assembler are upper-case.
Data registers are
.CW R0
through
.CW R7 ;
address registers are
.CW A0
through
.CW A7 ;
floating-point registers are
.CW F0
through
.CW F7 .
.PP
A pointer in
.CW A6
is used by the C compiler to point to data, enabling short addresses to
be used more often.
The value of
.CW A6
is constant and must be set during C program initialization
to the address of the externally-defined symbol
.CW a6base .
.PP
The following hardware registers are defined in the assembler; their
meaning should be obvious given a 68020 manual:
.CW CAAR ,
.CW CACR ,
.CW CCR ,
.CW DFC ,
.CW ISP ,
.CW MSP ,
.CW SFC ,
.CW SR ,
.CW USP ,
and
.CW VBR .
.PP
The assembler also defines several pseudo-registers that
manipulate the stack:
.CW FP ,
.CW SP ,
and
.CW TOS .
.CW FP
is the frame pointer, so
.CW 0(FP)
is the first argument,
.CW 4(FP)
is the second, and so on.
.CW SP
is the local stack pointer, where automatic variables are held
(SP is a pseudo-register only on the 68020);
.CW 0(SP)
is the first automatic, and so on as with
.CW FP .
Finally,
.CW TOS
is the top-of-stack register, used for pushing parameters to procedures,
saving temporary values, and so on.
.PP
The assembler and loader track these pseudo-registers so
the above statements are true regardless of what has been
pushed on the hardware stack, pointed to by
.CW A7 .
The name
.CW A7
refers to the hardware stack pointer, but beware of mixed use of
.CW A7
and the above stack-related pseudo-registers, which will cause trouble.
Note, too, that the loader observes that the
.CW PEA
instruction alters
.CW SP
and therefore inserts a corresponding pop before all returns.
The assembler accepts a label-like name to be attached to
.CW FP
and
.CW SP
uses, such as
.CW p+0(FP) ,
to help document that
.CW p
is the first argument to a routine.
The name goes in the symbol table but has no significance to the result
of the program.
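.PP
To make the argument layout concrete, the following Python sketch
forms the annotated
.CW FP
operands for a list of argument names.
It assumes every argument occupies 4 bytes, as in this manual's examples;
the helper name
.CW fp_operands
is hypothetical and is not part of any Plan 9 tool.
.P1
```python
def fp_operands(names, size=4):
    """Return name+offset(FP) operands for consecutive arguments.

    Assumes each argument occupies `size` bytes (4 in this manual's
    68020 examples); larger or smaller arguments are not modelled.
    """
    return [f"{name}+{i * size}(FP)" for i, name in enumerate(names)]
```
.P2
Given the names arg1 and arg2 it yields
.CW arg1+0(FP)
and
.CW arg2+4(FP) .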
.SH
Referring to data
.PP
All external references must be made relative to some pseudo-register,
either
.CW PC
(the virtual program counter) or
.CW SB
(the ``static base'' register).
.CW PC
counts instructions, not bytes of data.
For example, to branch to the second following instruction, that is,
to skip one instruction, one may write
.P1
        BRA     2(PC)
.P2
Labels are also allowed, as in
.P1
        BRA     return
        NOP
return:
        RTS
.P2
When using labels, there is no
.CW (PC)
annotation.
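.PP
Because
.CW PC
counts instructions, resolving a branch is index arithmetic on the
instruction list rather than on byte addresses.
The Python sketch below models that rule under the assumption that
instructions are numbered from zero in program order;
.CW branch_target
and its
.CW labels
table are illustrative names only.
.P1
```python
def branch_target(index, operand, labels):
    """Return the index of the instruction a branch at `index` reaches.

    `operand` is either an instruction count such as "2(PC)" or a label
    name looked up in `labels`, a dict from label to instruction index.
    """
    if operand.endswith("(PC)"):
        return index + int(operand[:-len("(PC)")])
    return labels[operand]
```
.P2
With the branch at index 0, both forms shown above reach index 2,
skipping the one instruction in between.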
.PP
The pseudo-register
.CW SB
refers to the beginning of the address space of the program.
Thus, references to global data and procedures are written as
offsets to
.CW SB ,
as in
.P1
        MOVL    $array(SB), TOS
.P2
to push the address of a global array on the stack, or
.P1
        MOVL    array+4(SB), TOS
.P2
to push the second (4-byte) element of the array.
Note the use of an offset; the complete list of addressing modes is given below.
Similarly, subroutine calls must use
.CW SB :
.P1
        BSR     exit(SB)
.P2
File-static variables have syntax
.P1
        local<>+4(SB)
.P2
The
.CW <>
will be filled in at load time by a unique integer.
.PP
When a program starts, it must execute
.P1
        MOVL    $a6base(SB), A6
.P2
before accessing any global data.
(On machines such as the MIPS and SPARC that cannot load a register
in a single instruction, constants are loaded through the static base
register.  The loader recognizes code that initializes the static
base register and treats it specially.  You must be careful, however,
not to load large constants on such machines when the static base
register is not set up, such as early in interrupt routines.)
.SH
Expressions
.PP
Expressions are mostly what one might expect.
Where an offset or a constant is expected,
a primary expression with unary operators is allowed.
A general C constant expression is allowed in parentheses.
.PP
Source files are preprocessed exactly as in the C compiler, so
.CW #define
and
.CW #include
work.
.SH
Addressing modes
.PP
The simple addressing modes are shared by all the assemblers.
Here, for completeness, follows a table of all the 68020 addressing modes,
since that machine has the richest set.
In the table,
.CW o
is an offset, which if zero may be elided, and
.CW d
is a displacement, which is a constant between -128 and 127 inclusive.
Many of the modes listed have the same name;
scrutiny of the format will show what default is being applied.
For instance, indexed mode with no address register supplied operates
as though a zero-valued register were used.
For "offset" read "displacement."
For "\f(CW.s\fP" read one of
.CW .L
or
.CW .W
followed by
.CW *1 ,
.CW *2 ,
.CW *4 ,
or
.CW *8
to indicate the size and scaling of the data.
.IP
.TS
l lfCW.
data register	R0
address register	A0
floating-point register	F0
special names	CAAR, CACR, etc.
constant	$con
floating point constant	$fcon
external symbol	name+o(SB)
local symbol	name<>+o(SB)
automatic symbol	name+o(SP)
argument	name+o(FP)
address of external	$name+o(SB)
address of local	$name<>+o(SB)
indirect post-increment	(A0)+
indirect pre-decrement	-(A0)
indirect with offset	o(A0)
indexed with offset	o()(R0.s)
indexed with offset	o(A0)(R0.s)
external indexed	name+o(SB)(R0.s)
local indexed	name<>+o(SB)(R0.s)
automatic indexed	name+o(SP)(R0.s)
parameter indexed	name+o(FP)(R0.s)
offset indirect post-indexed	d(o())(R0.s)
offset indirect post-indexed	d(o(A0))(R0.s)
external indirect post-indexed	d(name+o(SB))(R0.s)
local indirect post-indexed	d(name<>+o(SB))(R0.s)
automatic indirect post-indexed	d(name+o(SP))(R0.s)
parameter indirect post-indexed	d(name+o(FP))(R0.s)
offset indirect pre-indexed	d(o()(R0.s))
offset indirect pre-indexed	d(o(A0))
offset indirect pre-indexed	d(o(A0)(R0.s))
external indirect pre-indexed	d(name+o(SB))
external indirect pre-indexed	d(name+o(SB)(R0.s))
local indirect pre-indexed	d(name<>+o(SB))
local indirect pre-indexed	d(name<>+o(SB)(R0.s))
automatic indirect pre-indexed	d(name+o(SP))
automatic indirect pre-indexed	d(name+o(SP)(R0.s))
parameter indirect pre-indexed	d(name+o(FP))
parameter indirect pre-indexed	d(name+o(FP)(R0.s))
.TE
.in
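.PP
As a rough guide to what the size and scale suffix means, the Python sketch
below computes the address selected by the indexed mode
.CW o(A0)(R0.s) .
It relies on the 68020's hardware definition of indexed addressing
(word-sized indexes are sign-extended before scaling), which is an assumption
drawn from the processor manual rather than from this document;
the function and parameter names are hypothetical.
.P1
```python
def indexed_address(offset, base, index, size="L", scale=1):
    """Address of o(A0)(R0.s): offset + base + index * scale.

    With size "W" only the low 16 bits of the index register are used,
    sign-extended; with "L" the full 32-bit register is used.
    The result is reduced modulo 2**32.
    """
    if size not in ("L", "W"):
        raise ValueError("size must be L or W")
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    bits = 16 if size == "W" else 32
    index &= (1 << bits) - 1
    if index >= 1 << (bits - 1):      # sign-extend the index
        index -= 1 << bits
    return (offset + base + index * scale) & 0xFFFFFFFF
```
.P2
An index register holding 0xFFFF therefore counts as -1 with
.CW .W
but as 65535 with
.CW .L .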
.SH
Laying down data
.PP
Placing data in the instruction stream, say for interrupt vectors, is easy:
the pseudo-instructions
.CW LONG
and
.CW WORD
(but not
.CW BYTE )
lay down the value of their single argument, of the appropriate size,
as if it were an instruction:
.P1
        LONG    $12345
.P2
places the long 12345 (base 10)
in the instruction stream.
(On most machines,
the only such operator is
.CW WORD
and it lays down 32-bit quantities.
The 386 has all three:
.CW LONG ,
.CW WORD ,
and
.CW BYTE .
The AMD64 adds
.CW QUAD
to that for 64-bit values.
The 960 has only one,
.CW LONG .)
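.PP
For readers who want to see the bytes, the Python sketch below shows what
.CW LONG
and
.CW WORD
would lay down on the 68020.
It assumes big-endian byte order and sizes of 4 and 2 bytes respectively,
neither of which is stated in this manual;
.CW lay_down
is a hypothetical name.
.P1
```python
def lay_down(value, size):
    """Bytes emitted for a LONG (size 4) or WORD (size 2) on the 68020.

    Assumes big-endian order, most significant byte first; the value is
    truncated to `size` bytes.
    """
    return (value & ((1 << (8 * size)) - 1)).to_bytes(size, "big")
```
.P2
Under those assumptions the decimal value 12345 is laid down as the four
bytes 00 00 30 39 (hexadecimal).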
.PP
Placing information in the data section is more painful.
The pseudo-instruction
.CW DATA
does the work, given two arguments: an address at which to place the item,
including its size,
and the value to place there.  For example, to define a character array
.CW array
containing the characters
.CW abc
and a terminating null:
.P1
        DATA    array+0(SB)/1, $'a'
        DATA    array+1(SB)/1, $'b'
        DATA    array+2(SB)/1, $'c'
        GLOBL   array(SB), $4
.P2
or
.P1
        DATA    array+0(SB)/4, $"abc\ez"
        GLOBL   array(SB), $4
.P2
The
.CW /1
gives the number of bytes to initialize,
.CW GLOBL
makes the symbol global, and the
.CW $4
says how many bytes the symbol occupies.
Uninitialized data is zeroed automatically.
The character
.CW \ez
is equivalent to the C
.CW \e0 .
The string in a
.CW DATA
statement may contain a maximum of eight bytes;
build larger strings piecewise.
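.PP
Building a long string piecewise is mechanical, so it can be generated.
The Python sketch below is one way to do that; it is not a Plan 9 tool, and
.CW data_statements
is a hypothetical name.
It assumes the text is plain ASCII with no quotes or backslashes (so no
escapes are needed) and that a final piece shorter than eight bytes is acceptable.
.P1
```python
def data_statements(name, text, limit=8):
    """Return DATA and GLOBL lines defining `text` plus a terminating null.

    Assumes `text` is plain ASCII with no quotes or backslashes.  The
    null comes from the automatic zeroing of the one extra byte that
    GLOBL reserves beyond the initialized pieces.
    """
    lines = []
    for off in range(0, len(text), limit):
        chunk = text[off:off + limit]
        lines.append(f'DATA    {name}+{off}(SB)/{len(chunk)}, $"{chunk}"')
    lines.append(f"GLOBL   {name}(SB), ${len(text) + 1}")
    return lines
```
.P2
For the 12-byte text ``hello, world'' it produces one 8-byte piece at offset 0,
one 4-byte piece at offset 8, and a
.CW GLOBL
of 13 bytes.
.PP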
Two pseudo-instructions,
.CW DYNT
and
.CW INIT ,
allow the (obsolete) Alef compilers to build dynamic type information during the load
phase.
The
.CW DYNT
pseudo-instruction has two forms:
.P1
        DYNT    , ALEF_SI_5+0(SB)
        DYNT    ALEF_AS+0(SB), ALEF_SI_5+0(SB)
.P2
In the first form,
.CW DYNT
defines the symbol to be a small unique integer constant, chosen by the loader,
which is some multiple of the word size.  In the second form,
.CW DYNT
defines the second symbol in the same way,
places the address of the most recently
defined text symbol in the array specified by the first symbol at the
index defined by the value of the second symbol,
and then adjusts the size of the array accordingly.
.PP
The
.CW INIT
pseudo-instruction takes the same parameters as a
.CW DATA
statement.  Its symbol is used as the base of an array and the
data item is installed in the array at the offset specified by the most recent
.CW DYNT
pseudo-instruction.
The size of the array is adjusted accordingly.
The
.CW DYNT
and
.CW INIT
pseudo-instructions are not implemented on the 68020.
.SH
Defining a procedure
.PP
Entry points are defined by the pseudo-operation
.CW TEXT ,
which takes as arguments the name of the procedure (including the ubiquitous
.CW (SB) )
and the number of bytes of automatic storage to pre-allocate on the stack,
which will usually be zero when writing assembly language programs.
On machines with a link register, such as the MIPS and SPARC,
the special value -4 instructs the loader to generate no PC save
and restore instructions, even if the function is not a leaf.
Here is a complete procedure that returns the sum
of its two arguments:
.P1
TEXT    sum(SB), $0
        MOVL    arg1+0(FP), R0
        ADDL    arg2+4(FP), R0
        RTS
.P2
An optional middle argument
to the
.CW TEXT
pseudo-op is a bit field of options to the loader.
Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
the program.
For example,
.P1
TEXT    sum(SB), 1, $0
        MOVL    arg1+0(FP), R0
        ADDL    arg2+4(FP), R0
        RTS
.P2
will not be profiled; the first version above would be.
Subroutines with peculiar state, such as system call routines,
should not be profiled.
.PP
Setting the 2 bit allows multiple definitions of the same
.CW TEXT
symbol in a program; the loader will place only one such function in the image.
This bit was emitted only by the Alef compilers.
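.PP
Since the middle argument is a bit field, the two options can be combined.
The Python sketch below decodes such a value; the names
.CW NOPROF
and
.CW DUPOK
are labels chosen here for illustration, since this manual refers to the
bits only by their values 1 and 2.
.P1
```python
NOPROF = 1   # the 1 bit: do not profile this function
DUPOK = 2    # the 2 bit: multiple definitions are allowed

def text_options(flags):
    """Decode the optional middle argument of TEXT into option names."""
    names = []
    if flags & NOPROF:
        names.append("NOPROF")
    if flags & DUPOK:
        names.append("DUPOK")
    return names
```
.P2
A value of 3 therefore requests both behaviours, and 0 (or omitting the
argument) requests neither.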
.PP
Subroutines to be called from C should place their result in
.CW R0 ,
even if it is an address.
Floating point values are returned in
.CW F0 .
Functions that return a structure to a C program
receive as their first argument the address of the location to
store the result;
.CW R0
is unused in the calling protocol for such procedures.
The caller is responsible for saving any registers it needs preserved,
so a called subroutine is free to use any registers without saving them (``caller saves'').
.CW A6
and
.CW A7
are the exceptions as described above.
.SH
When in doubt
.PP
If you get confused, try using the
.CW -S
option to
.CW 2c
and compiling a sample program.
The standard output is valid input to the assembler.
.SH
Instructions
.PP
The instruction set of the assembler is not identical to that
of the machine.
It is chosen to match what the compiler generates, augmented
slightly by specific needs of the operating system.
For example,
.CW 2a
does not distinguish between the various forms of
.CW MOVE
instruction: move quick, move address, etc.  Instead the context
does the job.  For example,
.P1
        MOVL    $1, R1
        MOVL    A0, R2
        MOVW    SR, R3
.P2
generates official
.CW MOVEQ ,
.CW MOVEA ,
and
.CW MOVESR
instructions.
A number of instructions do not have the syntax necessary to specify
their entire capabilities.  Notable examples are the bitfield
instructions, the
multiply and divide instructions, etc.
For a complete set of generated instruction names (in
.CW 2a
notation, not Motorola's) see the file
.CW /sys/src/cmd/2c/2.out.h .
Despite its name, this file contains an enumeration of the
instructions that appear in the intermediate files generated
by the compiler, which correspond exactly to lines of assembly language.
.PP
The MC68000 assembler,
.CW 1a ,
is essentially the same, honoring the appropriate subset of the instructions
and addressing modes.
The definitions of these are, nonetheless, part of
.CW 2.out.h .
.SH
Laying down instructions
.PP
The loader modifies the code produced by the assembler and compiler.
It folds branches,
copies short sequences of code to eliminate branches,
and discards unreachable code.
The first instruction of every function is assumed to be reachable.
The pseudo-instruction
.CW NOP ,
which you may see in compiler output,
means no instruction at all, rather than an instruction that does nothing.
The loader discards all
.CW NOP 's.
.PP
To generate a true
.CW NOP
instruction, or any other instruction not known to the assembler, use a
.CW WORD
pseudo-instruction.
Such instructions on RISCs are not scheduled by the loader and must have
their delay slots filled manually.
.SH
MIPS
.PP
The registers are only addressed by number:
.CW R0
through
.CW R31 .
.CW R29
is the stack pointer;
.CW R30
is used as the static base pointer, the analogue of
.CW A6
on the 68020.
Its value is the address of the global symbol
.CW setR30(SB) .
The register holding returned values from subroutines is
.CW R1 .
When a function is called, space for the first argument
is reserved at
.CW 0(FP)
but in C (not Alef) the value is passed in
.CW R1
instead.
.PP
The loader uses
.CW R28
as a temporary.  The system uses
.CW R26
and
.CW R27
as interrupt-time temporaries.  Therefore none of these registers
should be used in user code.
.PP
The control registers are not known to the assembler.
Instead they are numbered registers
.CW M0 ,
.CW M1 ,
etc.
Use this trick to access, say,
.CW STATUS :
.P1
#define STATUS  12
        MOVW    M(STATUS), R1
.P2
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention,
.CW F24
must be initialized to the value 0.0,
.CW F26
to 0.5,
.CW F28
to 1.0, and
.CW F30
to 2.0;
this is done by the operating system.
.PP
The instructions and their syntax are different from those of the manufacturer's
manual.
There are no
.CW lui
and kin; instead there are
.CW MOVW
(move word),
.CW MOVH
(move halfword),
and
.CW MOVB
(move byte) pseudo-instructions.  If the operand is unsigned, the instructions
are
.CW MOVHU
and
.CW MOVBU .
The order of operands is from left to right in dataflow order, just as
on the 68020 but not as in MIPS documentation.
This means that the
.CW Bcond
instructions are reversed with respect to the book; for example, a
.CW va
.CW BGTZ
generates a MIPS
.CW bltz
instruction.
.PP
The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
It understands the 64-bit instructions
.CW MOVV ,
.CW MOVVL ,
.CW ADDV ,
.CW ADDVU ,
.CW SUBV ,
.CW SUBVU ,
.CW MULV ,
.CW MULVU ,
.CW DIVV ,
.CW DIVVU ,
.CW SLLV ,
.CW SRLV ,
and
.CW SRAV .
The assembler does not have any cache, load-linked, or store-conditional instructions.
.PP
Some assembler instructions are expanded into multiple instructions by the loader.
For example, the loader may convert the load of a 32-bit constant into an
.CW lui
followed by an
.CW ori .
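.PP
The arithmetic behind that expansion is a split of the constant into two
16-bit halves.
The Python sketch below shows the split and checks that the pair of
instructions rebuilds the original value; it models only the arithmetic,
not the loader's actual code, and the function names are hypothetical.
.P1
```python
def split_constant(value):
    """Split a 32-bit constant into the 16-bit halves for lui and ori."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF

def rebuild(upper, lower):
    """Register contents after lui upper followed by ori lower.

    lui places `upper` in the high half and clears the low half;
    ori then sets the low half from its zero-extended immediate.
    """
    return (upper << 16) | lower
```
.P2
For the constant 0x12345678 the halves are 0x1234 and 0x5678.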
.PP
Assembler instructions should be laid out as if there
were no load, branch, or floating point compare delay slots;
the loader will rearrange\(em\f2schedule\f1\(emthe instructions
to guarantee correctness and improve performance.
The only exception is that the correct scheduling of instructions
that use control registers varies from model to model of machine
(and is often undocumented) so you should schedule such instructions
by hand to guarantee correct behavior.
The loader generates
.P1
        NOR     R0, R0, R0
.P2
when it needs a true no-op instruction.
Use exactly this instruction when scheduling code manually;
the loader recognizes it and schedules the code before it and after it independently.  Also,
.CW WORD
pseudo-ops are scheduled like no-ops.
.PP
The
.CW NOSCHED
pseudo-op disables instruction scheduling
(scheduling is enabled by default);
.CW SCHED
re-enables it.
Branch folding, code copying, and dead code elimination are
disabled for instructions that are not scheduled.
.SH
SPARC
.PP
Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
Registers have numerical names only:
.CW R0
through
.CW R31 .
Forget about register windows: Plan 9 doesn't use them at all.
The machine has 32 global registers, period.
.CW R1
[sic] is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R7
is the return register and also the register holding the first
argument to a C (not Alef) function, again with space reserved at
.CW 0(FP) .
.CW R14
is the loader temporary.
.PP
Floating-point registers are exactly as on the MIPS.
.PP
The control registers are known by names such as
.CW FSR .
The instructions to access these registers are
.CW MOVW
instructions, for example
.P1
        MOVW    Y, R8
.P2
for the SPARC instruction
.P1
        rdy     %r8
.P2
.PP
Move instructions are similar to those on the MIPS: pseudo-operations
that turn into appropriate sequences of
.CW sethi
instructions, adds, etc.
Instructions read from left to right.  Because the arguments are
flipped to
.CW SUBCC ,
the condition codes are not inverted as on the MIPS.
.PP
Loads and stores through an alternate address space give the ASI number after the address register; for example, to move a word from ASI 2:
.P1
        MOVW    (R7, 2), R8
.P2
The syntax for double indexing is
.P1
        MOVW    (R7+R8), R9
.P2
.PP
The SPARC's instruction scheduling is similar to the MIPS's.
The official no-op instruction is:
.P1
        ORN     R0, R0, R0
.P2
.SH
i960
.PP
Registers are numbered
.CW R0
through
.CW R31 .
The stack pointer is
.CW R29 ;
the return register is
.CW R4 ;
the static base is
.CW R28 ,
which is initialized to the address of
.CW setSB(SB) .
.CW R3
must be zero; this should be done manually early in execution by
.P1
        SUBO    R3, R3
.P2
.CW R27
is the loader temporary.
.PP
There is no support for floating point.
.PP
The Intel calling convention is not supported and cannot be used; use
.CW BAL
instead.
Instructions are mostly as in the book.  The major change is that
.CW LOAD
and
.CW STORE
are both called
.CW MOV .
The extension character for
.CW MOV
is as in the manual:
.CW O
for ordinal,
.CW W
for signed, etc.
.SH
i386
.PP
The assembler assumes 32-bit protected mode.
The register names are
.CW SP ,
.CW AX ,
.CW BX ,
.CW CX ,
.CW DX ,
.CW BP ,
.CW DI ,
and
.CW SI .
The stack pointer (not a pseudo-register) is
.CW SP
and the return register is
.CW AX .
There is no physical frame pointer but, as for the MIPS,
.CW FP
is a pseudo-register that acts as
a frame pointer.
.PP
Opcode names are mostly the same as those listed in the Intel manual
with an
.CW L ,
.CW W ,
or
.CW B
appended to identify 32-bit,
16-bit, and 8-bit operations.
The exceptions are loads, stores, and conditionals.
All load and store opcodes to and from general registers, special registers
(such as
.CW CR0 ,
.CW CR3 ,
.CW GDTR ,
.CW IDTR ,
.CW SS ,
.CW CS ,
.CW DS ,
.CW ES ,
.CW FS ,
and
.CW GS )
or memory are written
as
.P1
        MOV\f2x\fP      src,dst
.P2
where
.I x
is
.CW L ,
.CW W ,
or
.CW B .
Thus to get
.CW AL
use a
.CW MOVB
instruction.  If you need to access
.CW AH ,
you must mention it explicitly in a
.CW MOVB :
.P1
        MOVB    AH, BX
.P2
There are many examples of illegal moves, for example,
.P1
        MOVB    BP, DI
.P2
that the loader actually implements as pseudo-operations.
.PP
The names of conditions in all conditional instructions
.CW J , (
.CW SET )
follow the conventions of the 68020 instead of those of the Intel
assembler:
.CW JOS ,
.CW JOC ,
.CW JCS ,
.CW JCC ,
.CW JEQ ,
.CW JNE ,
.CW JLS ,
.CW JHI ,
.CW JMI ,
.CW JPL ,
.CW JPS ,
.CW JPC ,
.CW JLT ,
.CW JGE ,
.CW JLE ,
and
.CW JGT
instead of
.CW JO ,
.CW JNO ,
.CW JB ,
.CW JNB ,
.CW JZ ,
.CW JNZ ,
.CW JBE ,
.CW JNBE ,
.CW JS ,
.CW JNS ,
.CW JP ,
.CW JNP ,
.CW JL ,
.CW JNL ,
.CW JLE ,
and
.CW JNLE .
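.PP
The two lists above correspond position by position, which makes a lookup
table easy to build when converting existing Intel-syntax code.
The Python sketch below does exactly that pairing; the table name
.CW INTEL_TO_PLAN9
is hypothetical, and only the jump mnemonics listed above are covered.
.P1
```python
PLAN9 = ("JOS JOC JCS JCC JEQ JNE JLS JHI "
         "JMI JPL JPS JPC JLT JGE JLE JGT").split()
INTEL = ("JO JNO JB JNB JZ JNZ JBE JNBE "
         "JS JNS JP JNP JL JNL JLE JNLE").split()

# Pair the two lists position by position, in the order the text gives.
INTEL_TO_PLAN9 = dict(zip(INTEL, PLAN9))
```
.P2
Looking up JZ gives JEQ and JNBE gives JHI; JLE is spelled the same in
both conventions.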
.PP
The addressing modes have syntax like
.CW AX ,
.CW (AX) ,
.CW (AX)(BX*4) ,
.CW 10(AX) ,
and
.CW 10(AX)(BX*4) .
The offsets from
.CW AX
can be replaced by offsets from
.CW FP
or
.CW SB
to access names, for example
.CW extern+5(SB)(AX*2) .
.PP
Other notes: Non-relative
.CW JMP
and
.CW CALL
have a
.CW *
added to the syntax.
Only
.CW LOOP ,
.CW LOOPEQ ,
and
.CW LOOPNE
are legal loop instructions.  Only
.CW REP
and
.CW REPN
are recognized repeaters.  These are not prefixes, but rather
stand-alone opcodes that precede the strings, for example
.P1
        CLD; REP; MOVSL
.P2
Segment override prefixes in
.CW MOD/RM
fields are not supported.
.SH
AMD64
.PP
The assembler assumes 64-bit mode unless a
.CW MODE
pseudo-operation is given:
.P1
        MODE $32
.P2
to change to 32-bit mode.
The effect is mainly to diagnose instructions that are illegal in
the given mode, but the loader will also assume 32-bit operands and addresses,
and 32-bit PC values for call and return.
The assembler's conventions are similar to those for the 386, above.
The architecture provides extra fixed-point registers
.CW R8
to
.CW R15 .
All registers are 64 bit, but instructions access low-order 8, 16 and 32 bits
as described in the processor handbook.
For example,
.CW MOVL
to
.CW AX
puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32 bit values, which are sign-extended
to 64 bits in 64 bit operations; the exception is
.CW MOVQ ,
which allows 64-bit literals.
The external registers in Plan 9's C are allocated from
.CW R15
down.
There are many new instructions, including the MMX and XMM media instructions,
and conditional move instructions.
MMX registers are
.CW M0
to
.CW M7 ,
and
XMM registers are
.CW X0
to
.CW X15 .
As with the 386 instruction names,
all new 64-bit integer instructions, and the MMX and XMM instructions
uniformly use
.CW L
for `long word' (32 bits) and
.CW Q
for `quad word' (64 bits).
Some instructions use
.CW O
(`octword') for 128-bit values, where the processor handbook
variously uses
.CW O
or
.CW DQ .
The assembler also consistently uses
.CW PL
for `packed long' in
XMM instructions, instead of
.CW Q ,
.CW DQ
or
.CW PI .
Either
.CW MOVL
or
.CW MOVQ
can be used to move values to and from control registers, even when
the registers might be 64 bits.
The assembler often accepts the handbook's name to ease conversion
of existing code (but remember that the operand order is uniformly
source then destination).
C's
.CW "long long"
type is 64 bits, but passed and returned by value, not by reference.
More notably, C pointer values are 64 bits, and thus
.CW "long long"
and
.CW "unsigned long long"
are the only integer types wide enough to hold a pointer value.
The C compiler and library use the XMM floating-point instructions, not
the old 387 ones, although the latter are implemented by assembler and loader.
Unlike the 386, the first argument, when it is an integer or pointer, is passed in a register, namely
.CW BP
(it can be referred to in assembly code by the pseudonym
.CW RARG ).
.CW AX
holds the return value from subroutines as before.
Floating-point results are returned in
.CW X0 ,
although currently the first floating-point parameter is not passed in a register.
All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
to preserve alignment and simplify variable-length argument list access,
including the first parameter when passed in a register,
even though bytes 4 to 7 are not initialized.
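.PP
The two width rules stated earlier in this section, that
.CW MOVL
clears the top 32 bits and that literals are sign-extended from 32 bits,
can be modelled with simple masking.
The Python sketch below is such a model and nothing more; the function
names are hypothetical and it treats registers as unsigned 64-bit integers.
.P1
```python
MASK64 = (1 << 64) - 1

def movl_result(value):
    """64-bit register contents after MOVL: low 32 bits kept, top cleared."""
    return value & 0xFFFFFFFF

def literal_operand(literal):
    """A signed 32-bit literal sign-extended for a 64-bit operation."""
    if not -(1 << 31) <= literal < (1 << 31):
        raise ValueError("literal does not fit in signed 32 bits")
    return literal & MASK64
```
.P2
A literal of -1 thus becomes 64 one bits in a 64-bit operation, whereas
moving the same bit pattern with
.CW MOVL
leaves only the low 32 bits set.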
.SH
Alpha
.PP
On the Alpha, all registers are 64 bits.  The architecture handles 32-bit values
by giving them a canonical format (sign extension in the case of integer registers).
Registers are numbered
.CW R0
through
.CW R31 .
.CW R0
holds the return value from subroutines, and also the first parameter.
.CW R30
is the stack pointer,
.CW R29
is the static base,
.CW R26
is the link register, and
.CW R27
and
.CW R28
are linker temporaries.
.PP
Floating point registers are numbered
.CW F0
to
.CW F31 .
.CW F28
contains
.CW 0.5 ,
.CW F29
contains
.CW 1.0 ,
and
.CW F30
contains
.CW 2.0 .
.CW F31
is always
.CW 0.0
on the Alpha.
.PP
The extension character for
.CW MOV
follows DEC's notation:
.CW B
for byte (8 bits),
.CW W
for word (16 bits),
.CW L
for long (32 bits),
and
.CW Q
for quadword (64 bits).
Byte and ``word'' loads and stores may be made unsigned
by appending a
.CW U .
.CW S
and
.CW T
refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
1058 .SH
1059 Power PC
1060 .PP
1061 The Power PC follows the Plan 9 model set by the MIPS and SPARC,
1062 not the elaborate ABIs.
1063 The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
1064 there is no support for the older POWER instructions.
1065 Registers are
1066 .CW R0
1067 through
1068 .CW R31 .
1069 .CW R0
1070 is initialized to zero; this is done by C start up code
1071 and assumed by the compiler and loader.
1072 .CW R1
1073 is the stack pointer.
1074 .CW R2
1075 is the static base register, with value the address of
1076 .CW setSB(SB) .
1077 .CW R3
1078 is the return register and also the register holding the first
1079 argument to a C function, with space reserved at
1080 .CW 0(FP)
1081 as on the MIPS.
1082 .CW R31
1083 is the loader temporary.
1084 The external registers in Plan 9's C are allocated from
1085 .CW R30
1086 down.
1087 .PP
1088 Floating point registers are called
1089 .CW F0
1090 through
1091 .CW F31 .
1092 By convention, several registers are initialized
1093 to specific values; this is done by the operating system.
1094 .CW F27
1095 must be initialized to the value
1096 .CW 0x4330000080000000
1097 (used by float-to-int conversion),
1098 .CW F28
1099 to the value 0.0,
1100 .CW F29
1101 to 0.5,
1102 .CW F30
1103 to 1.0, and
1104 .CW F31
1105 to 2.0.
1106 .PP
1107 As on the MIPS and SPARC, the assembler accepts arbitrary literals
1108 as operands to
1109 .CW MOVW ,
1110 and also to
1111 .CW ADD
1112 and others where `immediate' variants exist,
1113 and the loader generates sequences
1114 of
1115 .CW addi ,
1116 .CW addis ,
1117 .CW oris ,
1118 etc. as required.
1119 The register indirect addressing modes use the same syntax as the SPARC,
1120 including double indexing when allowed.
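.PP
A minimal sketch of these operand forms follows; the constants and
register choices are invented for illustration, and the double-indexed
form assumes the SPARC-style
.CW (R+R)
syntax:
.P1
        MOVW    $0x12345678, R4 /* too wide for one instruction; loader expands it */
        ADD     $0x10000, R4    /* immediate does not fit addi */
        MOVW    8(R3), R5       /* register indirect with offset */
        MOVW    (R3+R4), R6     /* double indexing */
.P2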
1121 .PP
1122 The instruction names are generally derived from the Motorola ones,
1123 subject to slight transformation:
1124 the
1125 .CW . ' `
1126 marking the setting of condition codes is replaced by
1127 .CW CC ,
1128 and when the letter
1129 .CW o ' `
1130 represents `OE=1' it is replaced by
1131 .CW V .
1132 Thus
1133 .CW add ,
1134 .CW addo.
1135 and
1136 .CW subfzeo.
1137 become
1138 .CW ADD ,
1139 .CW ADDVCC
1140 and
1141 .CW SUBFZEVCC .
1142 As well as the three-operand conditional branch instruction
1143 .CW BC ,
1144 the assembler provides pseudo-instructions for the common cases:
1145 .CW BEQ ,
1146 .CW BNE ,
1147 .CW BGT ,
1148 .CW BGE ,
1149 .CW BLT ,
1150 .CW BLE ,
1151 .CW BVC ,
1152 and
1153 .CW BVS .
1154 The unconditional branch instruction is
1155 .CW BR .
1156 Indirect branches use
1157 .CW "(CTR)"
1158 or
1159 .CW "(LR)"
1160 as target.
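.PP
The naming rules might be exercised as in the following sketch;
the label, registers, and the loop itself are hypothetical:
.P1
loop:
        ADD     R6, R4          /* add: condition codes untouched */
        ADDCC   R7, R5          /* add. : sets the condition codes */
        BNE     loop            /* repeat while the result is non-zero */
        BR      (LR)            /* indirect branch through the link register */
.P2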
1161 .PP
1162 Load or store operations are replaced by
1163 .CW MOV
1164 variants in the usual way:
1165 .CW MOVW
1166 (move word),
1167 .CW MOVH
1168 (move halfword with sign extension), and
1169 .CW MOVB
1170 (move byte with sign extension, a pseudo-instruction),
1171 with unsigned variants
1172 .CW MOVHZ
1173 and
1174 .CW MOVBZ ,
1175 and byte-reversing
1176 .CW MOVWBR
1177 and
1178 .CW MOVHBR .
1179 `Load or store with update' versions are
1180 .CW MOVWU ,
1181 .CW MOVHU ,
1182 and
1183 .CW MOVBZU .
1184 Load or store multiple is
1185 .CW MOVMW .
1186 The exceptions are the string instructions, which are
1187 .CW LSW
1188 and
1189 .CW STSW ,
1190 and the reservation instructions
1191 .CW lwarx
1192 and
1193 .CW stwcx. ,
1194 which are
1195 .CW LWAR
1196 and
1197 .CW STWCCC ,
1198 all with operands in the usual data-flow order.
1199 Floating-point load or store instructions are
1200 .CW FMOVD ,
1201 .CW FMOVDU ,
1202 .CW FMOVS ,
1203 and
1204 .CW FMOVSU .
1205 The register to register move instructions
1206 .CW fmr
1207 and
1208 .CW fmr.
1209 are written
1210 .CW FMOVD
1211 and
1212 .CW FMOVDCC .
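.PP
For illustration only, a few moves with the machine instruction each
is expected to produce noted in the comment; the offsets and registers
are arbitrary, and the mapping is inferred from the naming rules above
rather than checked against loader output:
.P1
        MOVW    0(R3), R4       /* lwz: load word */
        MOVH    4(R3), R5       /* lha: load halfword, sign extended */
        MOVBZ   6(R3), R6       /* lbz: load byte, zero extended */
        MOVW    R4, 8(R3)       /* stw: store word */
        MOVWU   R4, -4(R1)      /* stwu: store word with update */
        FMOVD   16(R3), F1      /* lfd: load double */
        FMOVD   F1, F2          /* fmr: register to register */
.P2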
1213 .PP
1214 The assembler knows the commonly used special purpose registers:
1215 .CW CR ,
1216 .CW CTR ,
1217 .CW DEC ,
1218 .CW LR ,
1219 .CW MSR ,
1220 and
1221 .CW XER .
1222 The rest, which are often architecture-dependent, are referenced as
1223 .CW SPR(n) .
1224 The segment registers of the 60x series are similarly
1225 .CW SEG(n) ,
1226 but
1227 .I n
1228 can also be a register name, as in
1229 .CW SEG(R3) .
1230 Moves between special purpose registers and general purpose ones,
1231 when allowed by the architecture,
1232 are written as
1233 .CW MOVW ,
1234 replacing
1235 .CW mfcr ,
1236 .CW mtcr ,
1237 .CW mfmsr ,
1238 .CW mtmsr ,
1239 .CW mtspr ,
1240 .CW mfspr ,
1241 .CW mftb ,
1242 and many others.
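.PP
A sketch of such moves, with arbitrary general purpose registers;
the choice of
.CW SPR(26)
assumes the 60x numbering, in which 26 is SRR0:
.P1
        MOVW    LR, R5          /* read the link register */
        MOVW    R5, CTR         /* load the count register */
        MOVW    MSR, R6         /* mfmsr */
        MOVW    SPR(26), R7     /* mfspr from a numbered register */
        MOVW    SEG(R3), R8     /* segment register selected by R3 */
.P2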
1243 .PP
1244 The fields of the condition register
1245 .CW CR
1246 are referenced as
1247 .CW CR(0)
1248 through
1249 .CW CR(7) .
1250 They are used by the
1251 .CW MOVFL
1252 (move field) pseudo-instruction,
1253 which produces
1254 .CW mcrf
1255 or
1256 .CW mtcrf .
1257 For example:
1258 .P1
1259         MOVFL   CR(3), CR(0)
1260         MOVFL   R3, CR(1)
1261         MOVFL   R3, $7, CR
1262 .P2
1263 They are also accepted in
1264 the conditional branch instruction, for example
1265 .P1
1266         BEQ     CR(7), label
1267 .P2
1268 Fields of the
1269 .CW FPSCR
1270 are accessed using
1271 .CW MOVFL
1272 in a similar way:
1273 .P1
1274         MOVFL   FPSCR, F0
1275         MOVFL   F0, FPSCR
1276         MOVFL   F0, $7, FPSCR
1277         MOVFL   $0, FPSCR(3)
1278 .P2
1279 producing
1280 .CW mffs ,
1281 .CW mtfsf
1282 or
1283 .CW mtfsfi ,
1284 as appropriate.
1285 .SH
1286 ARM
1287 .PP
1288 The assembler provides access to
1289 .CW R0
1290 through
1291 .CW R14
1292 and the
1293 .CW PC .
1294 The stack pointer is
1295 .CW R13 ,
1296 the link register is
1297 .CW R14 ,
1298 and the static base register is
1299 .CW R12 .
1300 .CW R0
1301 is the return register and also the register holding
1302 the first argument to a subroutine.
1303 The assembler supports the
1304 .CW CPSR
1305 and
1306 .CW SPSR
1307 registers.
1308 It also knows about coprocessor registers
1309 .CW C0
1310 through
1311 .CW C15 .
Floating-point registers are
1313 .CW F0
1314 through
1315 .CW F7 ,
1316 .CW FPSR
1317 and
1318 .CW FPCR .
1319 .PP
1320 As with the other architectures, loads and stores are called
1321 .CW MOV ,
1322 e.g.
1323 .CW MOVW
1324 for load word or store word, and
1325 .CW MOVM
1326 for
1327 load or store multiple,
1328 depending on the operands.
1329 .PP
1330 Addressing modes are supported by suffixes to the instructions:
1331 .CW .IA
1332 (increment after),
1333 .CW .IB
1334 (increment before),
1335 .CW .DA
1336 (decrement after), and
1337 .CW .DB
1338 (decrement before).
1339 These can only be used with the
1340 .CW MOV
1341 instructions.
1342 The move multiple instruction,
1343 .CW MOVM ,
1344 defines a range of registers using brackets, e.g.
1345 .CW [R0-R12] .
1346 The special
1347 .CW MOVM
1348 addressing mode bits
1349 .CW W ,
1350 .CW U ,
1351 and
1352 .CW P
1353 are written in the same manner, for example,
1354 .CW MOVM.DB.W .
1355 A
1356 .CW .S
1357 suffix allows a
1358 .CW MOVM
1359 instruction to access user
1360 .CW R13
1361 and
1362 .CW R14
1363 when in another processor mode.
1364 Shifts and rotates in addressing modes are supported by binary operators
1365 .CW <<
1366 (logical left shift),
1367 .CW >>
1368 (logical right shift),
1369 .CW ->
1370 (arithmetic right shift), and
1371 .CW @>
1372 (rotate right); for example
.CW "R7>>R2"
or
.CW "R2@>2" .
1375 The assembler does not support indexing by a shifted expression;
1376 only names can be doubly indexed.
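.PP
The suffixes and shift operators might be combined as in this sketch.
The register ranges and shift counts are arbitrary, and the use of a
shifted register as an arithmetic operand is an assumption about
what the assembler accepts:
.P1
        MOVM.DB.W [R0-R12], (R13)       /* store R0-R12 below R13, with writeback */
        MOVM.IA.W (R13), [R0-R12]       /* reload them, with writeback */
        ADD     R2<<2, R1               /* R1 += R2 shifted left by 2 */
        MOVW    R2@>8, R3               /* R3 = R2 rotated right by 8 */
.P2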
1377 .PP
1378 Any instruction can be followed by a suffix that makes the instruction conditional:
1379 .CW .EQ ,
1380 .CW .NE ,
1381 and so on, as in the ARM manual, with synonyms
1382 .CW .HS
1383 (for
1384 .CW .CS )
1385 and
1386 .CW .LO
1387 (for
1388 .CW .CC ),
1389 for example
1390 .CW ADD.NE .
1391 Arithmetic
1392 and logical instructions
1393 can have a
1394 .CW .S
1395 suffix, as ARM allows, to set condition codes.
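.PP
A short sketch of conditional execution, with arbitrary registers:
.P1
        CMP     $0, R1          /* compare R1 with zero */
        MOVW.EQ $1, R0          /* executed only if equal */
        ADD.NE  R2, R0          /* executed only if not equal */
        SUB.S   $1, R3          /* subtract and set condition codes */
.P2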
1396 .PP
1397 The syntax of the
1398 .CW MCR
1399 and
1400 .CW MRC
1401 coprocessor instructions is largely as in the manual, with the usual adjustments.
1402 The assembler directly supports only the ARM floating-point coprocessor
1403 operations used by the compiler:
1404 .CW CMP ,
1405 .CW ADD ,
1406 .CW SUB ,
1407 .CW MUL ,
1408 and
1409 .CW DIV ,
1410 all with
1411 .CW F
1412 or
1413 .CW D
1414 suffix selecting single or double precision.
Floating-point loads and stores become
1416 .CW MOVF
1417 and
1418 .CW MOVD .
1419 Conversion instructions are also specified by moves:
1420 .CW MOVWD ,
1421 .CW MOVWF ,
1422 .CW MOVDW ,
.CW MOVFW ,
1424 .CW MOVFD ,
1425 and
1426 .CW MOVDF .
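.PP
As a hedged illustration of the floating-point names, the following
sketch loads two doubles, adds them, and converts between integer and
double; the offsets and registers are invented, and the two-operand
form of
.CW ADDD
is assumed to behave like the integer
.CW ADD :
.P1
        MOVD    0(R1), F0       /* load double */
        MOVD    8(R1), F1
        ADDD    F1, F0          /* F0 += F1, double precision */
        MOVDW   F0, R2          /* convert double to word */
        MOVWD   R2, F3          /* convert word to double */
.P2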
1427 .SH
1428 AMD 29000
1429 .PP
1430 For details about this assembly language, which was built for the AMD 29240,
1431 look at the sources or examine compiler output.