Developing for a PDP-11

From CSLabsWiki
Revision as of 18:43, 5 July 2015 by Northug (talk | contribs) (Sometimes I hate middle click)

The PDP-11 was a minicomputer developed by Digital Equipment Corporation as part of its Programmed Data Processor (PDP) series. Its ISA (and those of its successors) was extremely influential in modern stored-program computer architecture; a brief glance at the assembly reveals a pleasant cross between the well-known Intel 8086 ISA in operation and ARM in its register model (registers are named rN for some natural N, and operational registers such as pc and sp are among them). Unlike many of the other PDPs--and perhaps owing to its popularity in real-time control systems, where supposedly many continue to run as intended--the PDP-11 is still a build target for modern compilers (gcc especially).

As a reference for (and expansion upon) the material in this article, you will want to keep the PDP-11 Handbook under your pillow. Rarely have I seen a platform whose intricacies are so simple that they can be fully documented in 112 pages.

Developing for the Platform

As a developer for this platform, it is worth noting that it is a 16-bit minicomputer whose instructions are built from 16-bit words: one instruction word, followed by up to two more words when an operand needs an immediate value, an absolute address, or an index offset. Addresses are similarly limited; the entire virtual address space spans 64k. Needing more than that was inconceivable at the time of its conception, but quickly became a reality. In order to cope with increasing memory capacity at lower costs, further models were released with 18-bit and 22-bit address buses--but the processor architecture did not change in any significant way. Rather, an MMU (Memory Management Unit) peripheral was added to convert the 16-bit addresses into the native address size. (Though the processor handbook, at www.bitsavers.org/pdf/dec/pdp11/handbooks/PDP11_Handbook1979.pdf, calls these "pages", they are, more or less, the first version of segmentation, a type of virtual addressing still used in x86 processors when booted to real mode.) Luckily, PDP-11s which have larger native address sizes generally boot in a 16-bit mode that permits access to the IO bus at its usual addresses for prior versions.
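To make the mapping concrete, here is a minimal Python sketch of how such an MMU could turn a 16-bit virtual address into a wider physical one. It assumes the relocation scheme the processor handbook describes: the top three bits of the virtual address pick one of eight page address registers, each of which holds a base counted in 64-byte units. The names `translate` and `par` are made up for this example, and protection and page-length checks are left out entirely.

```python
# Sketch of PDP-11 style address relocation (no protection or length checks).
# `par` is a hypothetical list of eight page address register values, each a
# base address expressed in 64-byte units.

def translate(vaddr, par):
    """Map a 16-bit virtual address to a physical address."""
    page = (vaddr >> 13) & 0o7        # top 3 bits pick one of 8 pages
    offset = vaddr & 0o17777          # low 13 bits: offset within the 8k page
    return par[page] * 64 + offset    # base is counted in 64-byte blocks


# Identity mapping for pages 0-6, with page 7 relocated to the top of an
# 18-bit address space.
par = [page * 0o200 for page in range(7)] + [0o7600]
print(oct(translate(0o001000, par)))   # unchanged by the identity mapping
print(oct(translate(0o177776, par)))   # relocated into the top 8k of 18 bits
```

With this mapping, a program that only knows about 16-bit addresses still reaches the top of physical memory through its last page, which is the spirit of the compatibility mode mentioned above.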

Finally, as one more peculiarity, the native numeric representation of the PDP-11 is octal. Each octal digit maps to a sequence of three bits:

Octal  Binary
0      000
1      001
2      010
3      011
4      100
5      101
6      110
7      111

While three-bit digits do not divide evenly into the 16-bit word of the machine (the leading digit can only be 0 or 1, so all possible words fall in the range 000000-177777), octal remains the standard for all documentation written for the platform. Ultimately, the VAX-11, DEC's successor to this model, introduced the more familiar hexadecimal notation (along with the 32-bit word size and true paging).
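If you are more used to hexadecimal, a few lines of Python make the conversion painless. This is only a convenience sketch; the helper name `to_octal` is my own invention.

```python
# Show how a 16-bit word splits into octal digits: five full 3-bit groups
# plus a single leftover top bit.

def to_octal(word):
    """Format a 16-bit word as the six-digit octal used in PDP-11 manuals."""
    return format(word & 0xFFFF, '06o')


print(to_octal(0xFFFF))   # the largest word
print(to_octal(0x1166))   # a word as a hex-minded tool would print it
```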

Getting Started

PDP-11 assembly is almost trivial enough that, with some experience, it can be written with a hex (oct?) editor. Unlike VAX and x86 instructions, every instruction is a single word, followed by at most two extra words for operands that need them, and operands have a consistent representation that is independent of the instruction in which they are encoded--such is the benefit of an orthogonal instruction set. Wikipedia's article on the topic contains more than enough information for anyone with a pencil, paper, and time to become an adequate PDP-11 (dis)assembler, and, indeed, you may want to become practiced in this if you plan to read through memory dumps.
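As a taste of how regular the encoding is, the following Python sketch pulls apart a double-operand instruction word. It assumes the usual layout for that instruction class: a 4-bit opcode followed by a 3-bit mode and 3-bit register for the source, then the same pair for the destination. The function name is hypothetical, and single-operand and branch instructions use other layouts that are not handled here.

```python
# Decode the fields of a PDP-11 double-operand instruction word.
# Layout (bits 15..0): opcode(4) | src mode(3) | src reg(3) | dst mode(3) | dst reg(3)

def decode_double_operand(word):
    return {
        'opcode': (word >> 12) & 0o17,
        'src_mode': (word >> 9) & 0o7,
        'src_reg': (word >> 6) & 0o7,
        'dst_mode': (word >> 3) & 0o7,
        'dst_reg': word & 0o7,
    }


fields = decode_double_operand(0o010546)   # mov r5, -(sp)
print(fields)
```

Notice that each operand field is exactly one octal digit, so in 010546 you can read off "mode 0, register 5" and "mode 4, register 6" by eye. That is a large part of why octal stuck around.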

But, of course, setting up an assembler and compiler is infinitely more pleasant. Most examples of assembly are written for MACRO-11, an assembler that is still available from an author who made a version of it for his Windows-based emulator (the source is still "cross-platform"). However, a byproduct of building gcc and binutils is gas, the GNU assembler, which--when targeting the PDP-11--understands all of the MACRO-11 syntax I've thrown at it so far (though it is not its native syntax; you have been warned).

Without further ado, then, let's set up a gcc cross-compiler. This is a fairly fundamental step in compiling for any non-native architecture (I would be very surprised if a host viewing this is on a PDP-11) and tends to be the most imposing, though it's not as hard as it seems--no one seems to document it well.

First things first, gcc depends on a matching binutils somewhere--this is where it gets its assembler, its linker, and various other tools. Get a binutils snapshot (the latest and greatest version still works at the time of writing--I'm using 2.24) and extract it somewhere; I like doing so in /tmp because the source tree is still available in pure form as the downloaded archive if I need it:

   cd /tmp
   mkdir build
   cd build
   tar xvf /path/to/binutils-version-stuff.tar.bz2

When this is done (and it may take a little bit), we can set up the build. A relatively undocumented feature of most GNU compiler-related projects is that they expect an out-of-tree build--and may break if you try to build in tree--so don't use "configure" from where it's situated! Although redundant, I like to put build directories inside the working directory of the repository, and name them with their target (so that I may build multiple targets at once).

Targets for binutils and gcc consist of, at the least, a machine architecture and a binary format for output, separated by a dash. We will be using pdp11-aout for this demonstration, as pdp11-elf does not compile in binutils at present. Besides, a.out format was the native binary format for this machine (when running any of the Unices or derivatives it supported).

At this point, you will also want to choose a prefix. The default is /usr/local, but that requires root privileges (sudo make install). If you do not have these, you can still install into a directory you own (like $HOME), but remember to be sure that the directory you choose is in the relevant PATHs (particularly, <dir>/bin in PATH and <dir>/lib in LD_LIBRARY_PATH).

Let's get to it, then. As above, feel free to change the --target and --prefix arguments to configure:

   cd binutils-version-stuff
   mkdir build-pdp11-aout
   cd build-pdp11-aout
   ../configure --target=pdp11-aout --prefix=/usr/local
   make

Several minutes later (on fast machines; worse for slower ones), the build should finish without error. When that time comes:

   sudo make install

(or just make install if you don't have sudo--and you own the prefix directory.)

If your build errors out, you may have to choose a different target (especially in binary format). There are many ways the build can go wrong, so I couldn't possibly cover them all here. Just remember that, if you choose a different target, you will need to be consistent about it for the next step.

With binutils made and installed, you should be ready for gcc. The setup is about the same, so forgive me if I elide the details.

   cd /tmp/build
   tar xvf /path/to/gcc-version-stuff.tar.bz2
   cd gcc-version-stuff
   mkdir build-pdp11-aout
   cd build-pdp11-aout
   ../configure --target=pdp11-aout --prefix=/usr/local
   make
   sudo make install

If all goes well, after this procedure, you should be able to type pdp11-aout-gcc --version at a prompt and get back the version of GCC you just compiled.

If all hasn't gone well yet, it turns out that building GCC isn't exactly turn-key; fortunately, the GCC developers host an easy-to-read list of GCC dependencies, which includes their multiprecision libraries (GMP, MPFR, MPC). Don't worry: you only need these on the host platform, and you don't need to build them from source. Though instructions vary widely between the platforms you are building on, I found it sufficient to run the following on Ubuntu/Debian:

   sudo apt-get install libgmp-dev libmpfr-dev libmpc-dev

Other Linuces likely have similar packages available from their package managers, or at least ways of building these libraries if need be.

Similarly, you may have to turn off some of the runtime libraries. For example, you may get errors in the build process that resemble:

   configure: error: Can't find stdio.h.
   You must have a usable C system for the target already installed, at least
   including headers and, preferably, the library, before you can configure
   the Objective C runtime system.  If necessary, install gcc now with
    `LANGUAGES=c', then the target library, then build with `LANGUAGES=objc'.
   make[1]: *** [configure-target-libobjc] Error 1
   make[1]: Leaving directory `/tmp/build/gcc-5.1.0/build-pdp11-aout'
   make: *** [all] Error 2

While there may or may not be instructions at the end, the easier and more consistent thing to do is to simply disable building that portion--in this case, the run-time library for Objective C (not the Objective C compiler itself). The important bit is the make part of the error, configure-target-libobjc; anything past the configure-target- bit is the name of the feature; just go back to your ../configure line and add a disable for it:

   ../configure --target=pdp11-aout --prefix=/usr/local --disable-libobjc
   make
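Since this tends to happen several times in a row, the name-mangling step can be scripted. The sketch below is a made-up helper (`disable_flag` is not part of GCC or make) that applies the rule above to one line of make output; it assumes the failing step is named configure-target-<feature> and that a matching --disable-<feature> switch exists, which held for the libraries I had to turn off but is not guaranteed for every one.

```python
import re

def disable_flag(make_error_line):
    """Return the --disable-<feature> flag for a failed configure-target-* step,
    or None if the line does not name one."""
    match = re.search(r'configure-target-([\w+.-]+)', make_error_line)
    return '--disable-' + match.group(1) if match else None


line = 'make[1]: *** [configure-target-libobjc] Error 1'
print(disable_flag(line))
```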

You will probably have to do this more than once. When I was through, I had to disable most of the runtime libraries, and stack smashing protection:

   ../configure --target=pdp11-aout --prefix=/usr/local --disable-libstdc++-v3 --disable-libssp --disable-libgfortran --disable-libobjc

The reason is simply that the runtimes for these languages usually expect "nice" features like <stdio.h> that aren't reliably in existence for the target architecture (libssp, additionally, requires some more convoluted features). One can fairly readily build a small, standalone libc using any of those that are about nowadays (newlib and glibc come to mind), but I won't cover that here (yet). Keep in mind that the compilers themselves are already built at this point; however, the code they generate can't link yet.

Building Software

So, if you've gotten this far, you've probably gotten a working toolchain and a fair understanding of the platform. Awesome! cd to /tmp (or your favorite place for crap) and try out your stuff:

$ echo 'int main() { return 0; }' > foo.c; pdp11-aout-gcc foo.c
/usr/local/lib/gcc/pdp11-aout/5.1.0/../../../../pdp11-aout/bin/ld: cannot find crt0.o: No such file or directory
/usr/local/lib/gcc/pdp11-aout/5.1.0/../../../../pdp11-aout/bin/ld: cannot find -lc
collect2: error: ld returned 1 exit status

Oops. crt0.o is a file compiled from crt0.s, usually provided by the platform for getting the runtime set up before calling into main() (and responsible for the exit() call made when main returns). Since we really don't have a target ABI (yet), we don't have crt0.s or crt0.o. We can tell gcc to ignore that hitch with the "-nostdlib" option (which will, among other things, make it not link against a libc):

$ pdp11-aout-gcc -nostdlib foo.c
/tmp/ccTlcvbE.o:/tmp/ccTlcvbE.o:(.text+0xa): undefined reference to `___main'
collect2: error: ld returned 1 exit status

Oops again. Since crt0.s is responsible for calling main, "main" is not really the starting point for our program. In fact, the default is a function called "start"--again, usually in crt0.s. Let's change things around a little more:

$ echo 'int start() { return 0; }' > foo.c; pdp11-aout-gcc -nostdlib foo.c

No errors--that's hopeful. Let's see what things look like:

$ pdp11-aout-objdump -D a.out

a.out:     file format a.out-pdp11


Disassembly of section .text:

00000000 <_start>:
   0:   1166            mov     r5, -(sp)
   2:   1185            mov     sp, r5
   4:   0a00            clr     r0
   6:   1585            mov     (sp)+, r5
   8:   0087            rts     pc

(Feel free to giggle that "a.out", the resulting executable, is really in a.out format.)

Excellent! The instructions at 0 and 2 are the prologue of the function; they resemble the PUSH %ebp; MOV %esp, %ebp; lines in x86. Instruction 4 clears r0, the return register--"return 0". Finally, 6-8 are the epilogue of the function, having the same role as POP %ebp; RET; in x86.

This should be enough to get you started with writing C programs for the platform. Experiment! You can learn quite a few things about how data moves around, how procedure calls work, and so forth--and you can do it with a modern compiler that supports just about every C project under the sun! (I still wouldn't recommend building Linux yet, though.)

Application Binary Interface

Most of the stuff in this section is already covered in the PDP-11 Handbook. Seriously. Go read it.

The ABI of the PDP-11, as I've observed it being emitted by GCC, is corroborated exactly by the UNIX v5 specifications. In particular, should you be looking to link assembly to C, you'll want to know the calling convention:

Register  Usage                                   Saved By
r0, r1    Return registers (up to 32-bit)         Caller
r2-r4     Local variables (3x16bit)               Callee
r5        Frame pointer (fp, bp--rarely aliased)
r6        Stack pointer (sp--usually aliased)
r7        Program counter (pc--usually aliased)

Interrupts are a slightly different beast. Unlike x86-compatibles, the Interrupt Vector Table is fixed at location 0 in memory, and has a fixed size (sources argue on the size, but 256 words/512 bytes is a safe bet [citation needed]). All interrupts to the processor possess a vector, which is the (necessarily even!) address of two words in the IVT. Most hardware devices can be programmed, either via software or hardware, to alter their vector, whereas certain software instructions generate interrupts to vectors on their own. Some common ones are as follows:

Vector (octal)  Cause
0               User device interrupt
4               Bus error, Illegal instruction, Stack overflow (like a General Protection Fault on x86)
10              Reserved instruction (attempt to execute privileged code)
14              BPT and trace trap (for debugging)
20              IOT trap
24              Power fail trap (raised immediately by the processor when the PSU detects loss of line power, but before it loses all power)
30              EMT trap (parameterized by the lower octet of the EMT, more or less)
34              TRAP trap (idem)
40-54           System software communication (id est, as the system developer, reserved for you :)
60              Teleprinter (TTY out) interrupt
64              Teletype keyboard (TTY in) interrupt
70              Paper tape punch (PT out) interrupt
74              Paper tape reader (PT in) interrupt
100-374         Device interrupts
400             Redline for Stack Overflow (attempts to address deferred through SP when SP is here or lower will raise Stack Overflow)

The documentation of many devices lists their vectors and their configurability.

The overall trap procedure can be emulated in a processor by the following MACRO-11 for some vector VEC:

   PSW = 177776
   mov @#PSW, -(sp)
   mov pc, -(sp)
   mov @#VEC+2, @#PSW
   mov @#VEC, pc

Note that the processor status word (PSW) is memory-mapped at 177776. Also note that the EMT and TRAP instructions are related; they both begin with (octal) 104xxx, where xxx is 0-377 for EMT and 400-777 for TRAP. From within a TRAP or EMT interrupt handler, you can determine the value of the instruction using something similar to:

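   mov r0, -(sp)
   mov 2(sp), r0     ; saved PC, which points just past the EMT/TRAP
   mov -2(r0), r0    ; fetch the EMT/TRAP instruction word itself
   ...
   mov (sp)+, r0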
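Once the instruction word is in hand, splitting it into "which instruction" and "which parameter" is a matter of masking. Here is a small Python sketch of that decoding, using the 104xxx ranges given above; the function name `classify_trap` is invented for the example.

```python
def classify_trap(word):
    """Return ('EMT' or 'TRAP', parameter) for a 104xxx word, else None."""
    if (word & 0o177000) != 0o104000:
        return None
    name = 'TRAP' if word & 0o400 else 'EMT'
    return name, word & 0o377      # the parameter is the low byte


print(classify_trap(0o104042))   # an EMT
print(classify_trap(0o104777))   # a TRAP
print(classify_trap(0o010546))   # neither (this is a mov)
```

In a handler written in assembly, the same effect comes from a bic against the upper byte of the fetched word.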

It is critical that, from within a service routine, all registers are callee saved. After operation, assuming the stack pointer is back where it was when the handler was entered, one can use the RTI instruction to return from the handler, undoing the trap entry procedure. RTI behaves like the following:

   add #4, sp
   mov -2(sp), @#PSW
   mov -4(sp), pc

Do not use RTS pc for returning from an interrupt handler--you will corrupt the stack!
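If the stack juggling is hard to picture, a toy model may help. The Python sketch below mimics the trap entry and RTI sequences described above on a dictionary standing in for memory; every name in it (`trap`, `rti`, `cpu`, `mem`) is made up for illustration, and it models nothing beyond the two pushes and two pops.

```python
# Toy model of trap entry and RTI. `mem` maps even byte addresses to 16-bit
# words; `cpu` holds only the registers this example needs.

def trap(cpu, mem, vec):
    """Push PSW and PC, then load the new PC and PSW from the vector."""
    cpu['sp'] -= 2
    mem[cpu['sp']] = cpu['psw']
    cpu['sp'] -= 2
    mem[cpu['sp']] = cpu['pc']
    cpu['pc'] = mem[vec]
    cpu['psw'] = mem[vec + 2]


def rti(cpu, mem):
    """Pop PC, then PSW, undoing trap()."""
    cpu['pc'] = mem[cpu['sp']]
    cpu['sp'] += 2
    cpu['psw'] = mem[cpu['sp']]
    cpu['sp'] += 2


mem = {0o34: 0o2000, 0o36: 0o340}    # TRAP vector: handler at 2000, IPL 7
cpu = {'pc': 0o1002, 'sp': 0o1000, 'psw': 0}
trap(cpu, mem, 0o34)
print(oct(cpu['pc']), oct(cpu['psw']), oct(cpu['sp']))
rti(cpu, mem)
print(oct(cpu['pc']), oct(cpu['psw']), oct(cpu['sp']))
```

The model also shows why RTS pc is the wrong way out: it would pop only the saved PC and leave the saved PSW sitting on the stack.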

Bits <7:5> of the PSW (zero-indexed) are the Interrupt Priority Level. They specify a threshold above which an interrupt needs to be (in level) to cause service by the processor. Software interrupts occupy the lowest levels (0-3, respectively Stack Overflow, Trace Trap, Trap Instruction (as above), Bus Error), whereas levels 4-7 are reserved for hardware devices through four physical bus request lines, named BR4-BR7. Most devices can be somehow configured to use a different level. If the processor chooses to honor a bus request, it will assert the matching bus grant line, BG4-BG7, which indicates to the device that it should put its vector on the bus. Note that you, the systems programmer, have the ability to control how nested interrupts can be by what the vector's stored PSW's IPL is set to. A good practice is setting the IPL to the expected level of the received interrupt.
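For building the PSW words that go into your vectors, the bit twiddling looks like this in Python. It is a sketch of the rule as stated above (a request is serviced only when its level is above the current IPL); the helper names are my own.

```python
IPL_SHIFT = 5
IPL_MASK = 0o7 << IPL_SHIFT     # bits <7:5>, i.e. 0o340


def get_ipl(psw):
    """Extract the priority level (0-7) from a PSW value."""
    return (psw & IPL_MASK) >> IPL_SHIFT


def with_ipl(psw, level):
    """Return psw with its priority field replaced by level (0-7)."""
    return (psw & ~IPL_MASK & 0xFFFF) | ((level & 0o7) << IPL_SHIFT)


def is_serviced(psw, request_level):
    """A request is honored only if its level is above the current IPL."""
    return request_level > get_ipl(psw)


vector_psw = with_ipl(0, 4)     # PSW to store in the vector of a BR4 device
print(oct(vector_psw), is_serviced(vector_psw, 4), is_serviced(vector_psw, 5))
```

Storing a PSW like `vector_psw` in a BR4 device's vector keeps that device (and anything lower) from interrupting its own handler, while BR5-BR7 devices can still nest.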

Working with SIMH

SIMH is an excellent little historic computer simulator that includes support for the PDP-11, amongst a long list of other contemporaneous systems. There are various ways to get it, including downloading Windows binaries, getting Debian packages, or building it from source tarballs. I won't cover that build process (I got it from a dpkg myself :), but I promise you it should be trivial after building a cross-compiler.

For how well it works, SIMH is one of the most abhorrently-documented projects I've seen. For example, each simulator supports different load formats; the one we're interested in, the PDP-11, states in the distributed PDF document that "load" will receive "standard binary format tapes". No, these aren't .tar files. The only place you can find any documentation whatsoever on the format actually accepted is by cracking open the source:

/* Binary loader.
   Loader format consists of blocks, optionally preceded, separated, and
   followed by zeroes.  Each block consists of:
        001             ---
        xxx              |
        lo_count         |
        hi_count         |
        lo_origin        > count bytes
        hi_origin        |
        data byte        |
        :                |
        data byte       ---
        checksum
   If the byte count is exactly six, the block is the last on the tape, and
   there is no checksum.  If the origin is not 000001, then the origin is
   the PC at which to start the program.
*/

simh/PDP11/pdp11_sys.c, lines 218-237

It's actually not too terrible, if you forgive the fact that there's no documentation for the algorithm computing the checksum. We can ignore the fact that it takes three octal digits to represent a byte for the moment, since we'll be dealing with that anyway.

Lucky for you, I already pounded my head against the wall and banged out this "small" Python script that should do the right thing:

#!/usr/bin/env python3
# PDP-11 crappy "terp" (tape) format for loading into simh/pdp11's terp dervs.

import struct
import sys


def checksum(pkt):
    # All bytes of a block, checksum included, must sum to 0 modulo 256.
    # The modulo also covers a sum that is already 0 (the old chr(256) crash).
    return bytes([-sum(pkt) % 256])


if len(sys.argv) < 2:
    print('''Usage: python3 mkterp.py {BLOCK} {BLOCK} {BLOCK}
where each {BLOCK} is:
	[-O <origin>] to set the origin (default 0)
	-d <file> a binary file to read in data (terminates the block).
OR:
	-o <fname> to set the output file name (last instance wins).
OR:
	-p <pc> to set the address at which the program shall start (last instance wins).
Addresses use Python integer syntax (0o1000 is octal, 0x200 is hex).
''')
    sys.exit()

ofile = 'image.out'
pc = 1  # For some reason, 1 does not set PC to any special value on load.
blocks = []  # [{'org', 'fname'}]
org = 0

i = 1
while i < len(sys.argv):
    if sys.argv[i] == '-o':
        ofile = sys.argv[i+1]
        i += 2
    elif sys.argv[i] == '-p':
        pc = int(sys.argv[i+1], 0)  # int(..., 0) instead of eval on user input
        i += 2
    elif sys.argv[i] == '-O':
        org = int(sys.argv[i+1], 0)
        i += 2
    elif sys.argv[i] == '-d':
        fname = sys.argv[i+1]
        i += 2
        blocks.append({'org': org, 'fname': fname})
        org = 0
    else:
        print('Unrecognized option:', sys.argv[i], 'skipped')
        i += 1

print('Blocks to be put into %s:' % (ofile,))
for block in blocks:
    print('File', block['fname'], '@', block['org'])

with open(ofile, 'wb') as of:
    for block in blocks:
        with open(block['fname'], 'rb') as inf:
            data = inf.read()
        # Header: 001, pad byte, count (header + data), origin; little-endian.
        pkt = struct.pack('<HHH', 1, 6+len(data), block['org']) + data
        of.write(pkt + checksum(pkt))
    # A block whose count is exactly six ends the tape and carries the start PC.
    pkt = struct.pack('<HHH', 1, 6, pc)
    of.write(pkt + checksum(pkt))
print('Complete.')

The usage of this script should look something like the following:

   python mkterp.py -O ldaddr_A -d A -O ldaddr_B -d B -O ldaddr_C -d C ... [-o outfile] [-p startPC]

where A is to be loaded into memory starting at ldaddr_A, B is to be loaded into memory at ldaddr_B, and so on. The output file is cooked and ready to serve with "load" on the simulator command prompt.
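If "load" rejects an image, it helps to be able to read the file back. The following is a minimal Python sketch of a reader for the block format quoted from the SIMH source above. It assumes the checksum rule the script uses (all bytes of a block, checksum included, sum to zero modulo 256), and the name `read_blocks` is hypothetical; it is a debugging aid, not a copy of what SIMH does.

```python
import struct

def read_blocks(image):
    """Parse a loader image (bytes) into (blocks, start_pc).

    blocks is a list of (origin, data) pairs. Raises ValueError on a bad
    header, a bad checksum, or a truncated image.
    """
    blocks = []
    pos = 0
    while pos < len(image):
        if image[pos] == 0:              # zero padding between blocks
            pos += 1
            continue
        if image[pos] != 1 or len(image) - pos < 6:
            raise ValueError('bad block header at offset %d' % pos)
        count, origin = struct.unpack_from('<HH', image, pos + 2)
        if count < 6:
            raise ValueError('bad byte count at offset %d' % pos)
        if count == 6:                   # last block: origin is the start PC
            return blocks, origin
        end = pos + count                # index of the checksum byte
        if end >= len(image):
            raise ValueError('truncated block at offset %d' % pos)
        if sum(image[pos:end + 1]) % 256 != 0:
            raise ValueError('bad checksum at offset %d' % pos)
        blocks.append((origin, image[pos + 6:end]))
        pos = end + 1
    raise ValueError('no end block found')


# Build a one-block image by hand (the payload bytes are arbitrary) and read it back.
data = b'\x05\x0a'
pkt = struct.pack('<HHH', 1, 6 + len(data), 0o1000) + data
image = pkt + bytes([-sum(pkt) % 256]) + struct.pack('<HHH', 1, 6, 0o1000)
print(read_blocks(image))
```

Running it over the output of mkterp.py (read in with open(..., 'rb').read()) should list the same origins and lengths that the script printed.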