Difference between revisions of "Developing for a PDP-11"


Revision as of 06:20, 2 June 2015

The PDP-11 was a minicomputer developed by Digital Equipment Corporation as one of the Programmed Data Processors series. These ISAs (and their successors) were extremely influential in modern stored-program computer architecture; a brief glance at the assembly reveals that it resembles a pleasant cross between the well-known Intel 8086 ISA in operation and ARM with its register model (registers are named rN for some natural N, and operational registers such as pc and sp are amongst them). Unlike many of the other PDPs--and perhaps owing to its popularity in real-time control systems, where supposedly many continue to run as intended--the PDP-11 is still a build target for modern compilers (gcc especially).

As a reference for (and expansion upon) the material in this article, you will want to keep the PDP-11 Handbook under your pillow. Rarely have I seen a platform simple enough that it can be fully documented in 112 pages.

Developing for the Platform

As a developer for this platform, it is worth noting that it is a 16-bit minicomputer whose instructions are built from 16-bit words: one word for the instruction itself, followed by up to two more words when an operand needs an immediate value, an absolute address, or an index. Addresses are similarly limited; the entire virtual address space spans 64k, and needing more than that was inconceivable at the time of its conception, but quickly became a reality. In order to cope with increasing memory capacity at lower costs, further models were released with 18-bit and 22-bit address busses--but the processor architecture did not change in any significant way. Rather, an MMU (Memory Management Unit) peripheral was added to convert the 16-bit addresses into the native address size. (Though the [www.bitsavers.org/pdf/dec/pdp11/handbooks/PDP11_Handbook1979.pdf processor handbook] calls these "pages", they are, more or less, the first version of segmentation, a type of virtual addressing still used in x86 processors when booted to real mode.) Luckily, PDP-11s which have larger native address sizes generally boot in a 16-bit mode that permits access to the IO bus at the same addresses as on prior versions.

Finally, as one more peculiarity, the native numeric representation of the PDP-11 is octal. Each octal digit maps to a sequence of three bits:

Octal  Binary
0      000
1      001
2      010
3      011
4      100
5      101
6      110
7      111

While three-bit digits do not divide evenly into the 16-bit word of the machine (the leading digit is only ever 0 or 1, so all possible words are in the range 000000-177777), octal remains the standard for all documentation written for the platform. Ultimately, the VAX-11, another DEC successor to this model, introduced the more familiar hexadecimal notation (along with the 32-bit word size and true paging).
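As a quick illustration of the notation, here is a minimal Python sketch that prints a 16-bit word the way the PDP-11 documentation does and expands octal digits into their three-bit groups. The function names are my own, purely for illustration.

```python
def to_octal(word):
    """Format a 16-bit word as the six octal digits used in PDP-11 documentation."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("not a 16-bit word")
    return format(word, "06o")


def octal_digits_to_bits(text):
    """Expand each octal digit into its three-bit group."""
    return " ".join(format(int(digit, 8), "03b") for digit in text)


print(to_octal(0xFFFF))                # 177777
print(octal_digits_to_bits("177777"))  # 001 111 111 111 111 111
```

Note how the leading group only ever needs its lowest bit, which is why the largest word is 177777 and not 777777.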

Getting Started

PDP-11 assembly is almost trivial enough that, with some experience, it can be written with a hex (oct?) editor. Unlike the VAX and x86 instruction sets, every instruction is a whole number of 16-bit words (one for the instruction, plus at most two operand words), and operands have a consistent representation that is independent of the instruction for which they are encoded--such is the benefit of an orthogonal instruction set. Wikipedia's article on the topic contains more than enough information for anyone with a pencil, paper, and time to become an adequate PDP-11 (dis)assembler, and, indeed, you may want to become practiced in this if you plan to read through memory dumps.
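To give a taste of how regular the encoding is, the following Python sketch splits a double-operand instruction word into its fields. The layout used here (a byte flag in bit 15, a 3-bit opcode, then two 6-bit operand fields that each hold a 3-bit mode and a 3-bit register) is the one I know from the processor handbook; the function names are hypothetical, and single-operand and branch instructions are not handled.

```python
def decode_operand(field):
    """Split a 6-bit operand field into (addressing mode, register number)."""
    return (field >> 3) & 0o7, field & 0o7


def decode_double_operand(word):
    """Split a double-operand instruction word into its fields."""
    return {
        "byte": (word >> 15) & 1,       # 1 selects the byte variant
        "opcode": (word >> 12) & 0o7,   # 1 is MOV
        "src": decode_operand((word >> 6) & 0o77),
        "dst": decode_operand(word & 0o77),
    }


# 010001 is "mov r0, r1": opcode 1, source mode 0 register 0, destination mode 0 register 1.
print(decode_double_operand(0o010001))
```

Because each octal digit lines up with a mode or register field, the same decoding can be done by eye on a memory dump.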

But, of course, setting up an assembler and compiler is infinitely more pleasant. Most examples of assembly are written for the MACRO-11 assembler; a version of it is still available from an author who ported it for his Windows-based emulator (the source is still "cross-platform"). However, a byproduct of building gcc and binutils is gas, the GNU assembler, which--when targeting the PDP-11--understands all of the MACRO-11 syntax I've thrown at it so far (though it is not its native syntax; you have been warned).

Without further ado, then, let's set up a gcc cross-compiler. This is a fairly fundamental step in compiling for any non-native architecture (I would be very surprised if a host viewing this is on a PDP-11) and tends to be the most imposing one. It's not as hard as it seems; it's just that no one seems to document it well.

First things first, gcc depends on a matching binutils somewhere--this is where it derives its assembly and various other features. Get a binutils snapshot (the latest and greatest version still works at the time of writing--I'm using 2.24) and extract it somewhere; I like doing so in /tmp because the source tree is still available in pure form as the downloaded archive if I need it:

   cd /tmp
   mkdir build
   cd build
   tar xvf /path/to/binutils-version-stuff.tar.bz2

When this is done (and it may take a little bit), we can set up the build. A relatively undocumented feature of most GNU compiler-related projects is that they expect an out-of-tree build--and may break if you try to build in tree--so don't run "configure" from the directory where it's situated! Although it is redundant, I like to put build directories inside the extracted source tree, and name them after their target (so that I may build multiple targets at once).

Targets for binutils and gcc consist of, at the least, a machine architecture and a binary format for output, separated by a dash. We will be using pdp11-aout for this demonstration, as pdp11-elf does not compile in binutils at present. Besides, a.out format was the native binary format for this machine (when running any of the Unices or derivatives it supported).

At this point, you will also want to choose a prefix. The default is /usr/local, but that requires root privileges (sudo make install). If you do not have these, you can still install into a directory you own (like $HOME), but remember to be sure that the directory you choose is in the relevant PATHs (particularly, <dir>/bin in PATH and <dir>/lib in LD_LIBRARY_PATH).

Let's get to it, then. As above, feel free to change the --target and --prefix arguments to configure:

   cd binutils-version-stuff
   mkdir build-pdp11-aout
   cd build-pdp11-aout
   ../configure --target=pdp11-aout --prefix=/usr/local
   make

Several minutes later (on fast machines; worse for slower ones), the build should finish without error. When that time comes:

   sudo make install

(or just make install if you don't have sudo--and you own the prefix directory.)

If your build errors out, you may have to choose a different target (especially in binary format). There are many ways the build can go wrong, so I couldn't possibly cover them all here. Just remember that, if you choose a different target, you will need to be consistent about it for the next step.

With binutils made and installed, you should be ready for gcc. The setup is about the same, so forgive me if I elide the details.

   cd /tmp/build
   tar xvf /path/to/gcc-version-stuff.tar.bz2
   cd gcc-version-stuff
   mkdir build-pdp11-aout
   cd build-pdp11-aout
   ../configure --target=pdp11-aout --prefix=/usr/local
   make
   sudo make install

If all goes well, after this procedure, you should be able to type pdp11-aout-gcc --version at a prompt and get back the version of GCC you just compiled.

If all hasn't yet gone well, it turns out that GCC building isn't exactly turn-key; fortunately, the GCC developers host an easy-to-read list of GCC dependencies, which includes their multiprecision libraries (GMP, MPFR, MPC). Don't worry: you only need these for the host platform, so prebuilt packages will do and you don't need to build them from source. Though instructions vary widely between the platforms you are building on, I found it sufficient to run the following on Ubuntu/Debian:

   sudo apt-get install libgmp-dev libmpfr-dev libmpc-dev

Other Linuces likely have similar packages available from their package managers, or at least ways of building these libraries if need be.

Similarly, you may have to turn off some of the runtime libraries. For example, if you get errors in the build process that resemble:

   configure: error: Can't find stdio.h.
   You must have a usable C system for the target already installed, at least
   including headers and, preferably, the library, before you can configure
   the Objective C runtime system.  If necessary, install gcc now with
    `LANGUAGES=c', then the target library, then build with `LANGUAGES=objc'.
   make[1]: *** [configure-target-libobjc] Error 1
   make[1]: Leaving directory `/tmp/build/gcc-5.1.0/build-pdp11-aout'
   make: *** [all] Error 2

While there may or may not be instructions at the end, the easier and more consistent thing to do is to simply disable building that portion--in this case, the run-time library for Objective C (not the Objective C compiler itself). The important bit is the make part of the error, configure-target-libobjc; anything past the configure-target- bit is the name of the feature; just go back to your ../configure line and add a disable for it:

   ../configure --target=pdp11-aout --prefix=/usr/local --disable-libobjc
   make
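If you end up repeating this a lot, the rule above is mechanical enough to script. Here is a minimal Python sketch that pulls the feature name out of a make error line like the one quoted earlier; the function name is my own, and it assumes the error always has the [configure-target-<feature>] shape shown above.

```python
import re


def disable_flag(make_error_line):
    """Return the --disable-<feature> flag for a failed configure-target-<feature> step, or None."""
    match = re.search(r"\[configure-target-([^\]\s]+)\]", make_error_line)
    return ("--disable-" + match.group(1)) if match else None


line = "make[1]: *** [configure-target-libobjc] Error 1"
print(disable_flag(line))  # --disable-libobjc
```

Append the printed flag to your ../configure line and run make again.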

You will probably have to do this more than once. When I was through, I had to disable most of the runtime libraries, and stack smashing protection:

   ../configure --target=pdp11-aout --prefix=/usr/local --disable-libstdc++-v3 --disable-libssp --disable-libgfortran --disable-libobjc

The reason is simply that the runtimes for these languages usually expect "nice" features like <stdio.h> that aren't reliably in existence for the target architecture (libssp, additionally, requires some more convoluted features). One can fairly readily build a small, standalone libc using any of those that are about nowadays (newlib and glibc come to mind), but I won't cover that here (yet). Keep in mind that the compilers themselves are already built at this point; however, the code they generate can't be linked at the moment.

Application Binary Interface

The ABI of the PDP-11, as I've observed it being emitted by GCC, is corroborated exactly by the UNIX v5 specifications. In particular, should you be looking to link assembly to C, you'll want to know the calling convention:

Register  Usage                                   Saved By
r0-r1     Return registers (up to 32-bit)         Caller
r2-r4     Local variables (3x16bit)               Callee
r5        Frame pointer (fp, bp--rarely aliased)
r6        Stack pointer (sp--usually aliased)
r7        Program counter (pc--usually aliased)

Interrupts are a slightly different beast. Unlike x86-compatibles, the Interrupt Vector Table is fixed at location 0 in memory, and has a fixed size (sources disagree on the size, but 256 words/512 bytes is a safe bet [citation needed]). Every interrupt to the processor has a vector, which is the (necessarily even!) address of two words in the IVT: the handler's address, followed by the PSW to load on entry. Most hardware devices can be programmed, either via software or hardware, to alter their vector, whereas certain software instructions generate interrupts to vectors on their own. Some common ones are as follows:

Vector (octal)  Cause
0               User device interrupt
4               Bus error, Illegal instruction, Stack overflow (like a General Protection Fault on x86)
10              Reserved instruction (attempt to execute privileged code)
14              BPT and trace trap (for debugging)
20              IOT trap
24              Power fail trap (raised immediately by the processor when the PSU detects loss of line power, but before it loses all power)
30              EMT trap (parameterized by the lower octet of the EMT, more or less)
34              TRAP trap (idem)
40-54           System software communication (id est, as the system developer, reserved for you :)
60              Teleprinter (TTY out) interrupt
64              Teletype keyboard (TTY in) interrupt
70              Paper tape punch (PT out) interrupt
74              Paper tape reader (PT in) interrupt
100-374         Device interrupts
400             Redline for Stack Overflow (attempts to address deferred through SP when SP is here or lower will raise Stack Overflow)

The documentation of many devices lists their vectors and their configurability.

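What the processor does on a trap is roughly equivalent to the following MACRO-11, for some vector VEC:

   PSW = 177776
   mov @#PSW, -(sp)      ; push the current PSW
   mov pc, -(sp)         ; push the return address
   mov @#VEC+2, @#PSW    ; load the new PSW from the second word of the vector
   mov @#VEC, pc         ; jump to the handler named by the first word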

Note that the processor status word (PSW) is memory-mapped at 177776. Also note that the EMT and TRAP instructions are related; they both begin with (octal) 104xxx, where xxx is 0-377 for EMT and 400-777 for TRAP. From within a TRAP or EMT interrupt handler, you can find the instruction (and thus its value) through the saved PC on the stack, using something similar to:

   mov r0, -(sp)    ; save r0
   mov 2(sp), r0    ; r0 = saved PC, which points just past the EMT/TRAP
   ...
   mov (sp)+, r0    ; restore r0
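Once you have the instruction word in hand, telling EMT from TRAP and extracting the parameter is a matter of masking. The Python sketch below mirrors the 104xxx split described above; the function name is hypothetical, and the same masks translate directly to bic/bit instructions in a real handler.

```python
def classify_trap_instruction(word):
    """Return ("EMT" or "TRAP", low byte) for a 104xxx instruction word, else None."""
    if (word & 0o177000) != 0o104000:
        return None
    name = "TRAP" if word & 0o400 else "EMT"
    return name, word & 0o377


print(classify_trap_instruction(0o104003))  # ('EMT', 3)
print(classify_trap_instruction(0o104405))  # ('TRAP', 5)
```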

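It is critical that, from within a service routine, all registers are callee saved: the handler must restore every register it touches. Once the work is done, and assuming the stack pointer is back where it was when the handler was entered, use the RTI instruction to return from the handler. RTI undoes the trap entry procedure, behaving roughly like:

   add #4, sp           ; drop the two words pushed on entry
   mov -2(sp), @#PSW    ; restore the saved PSW
   mov -4(sp), pc       ; resume at the saved return address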

Do not use RTS pc to return from an interrupt handler--you will corrupt the stack!
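To see why, here is a toy Python model of the entry and return sequences above. It is only a sketch: memory is a dict of words, the class and method names are invented for illustration, and the starting pc and sp values are arbitrary. It tracks nothing but pc, sp and the PSW.

```python
class ToyCpu:
    """Just enough state to follow the stack through a trap."""

    def __init__(self):
        self.mem = {}
        self.pc = 0o1000
        self.sp = 0o776
        self.psw = 0

    def push(self, value):
        self.sp -= 2
        self.mem[self.sp] = value

    def pop(self):
        value = self.mem[self.sp]
        self.sp += 2
        return value

    def trap(self, vec):
        self.push(self.psw)            # mov @#PSW, -(sp)
        self.push(self.pc)             # mov pc, -(sp)
        self.psw = self.mem[vec + 2]   # mov @#VEC+2, @#PSW
        self.pc = self.mem[vec]        # mov @#VEC, pc

    def rti(self):
        self.pc = self.pop()           # pops the return address...
        self.psw = self.pop()          # ...and the saved PSW

    def rts_pc(self):
        self.pc = self.pop()           # pops the return address only


cpu = ToyCpu()
cpu.mem[0o34] = 0o2000   # handler address in the TRAP vector
cpu.mem[0o36] = 0o340    # PSW to load on entry
cpu.trap(0o34)
cpu.rts_pc()
print(oct(cpu.sp))       # one word short of where it started
```

With rti in place of rts_pc, sp ends up back at its starting value and the old PSW is restored; with rts_pc, the saved PSW is left behind on the stack and the handler's PSW stays in effect.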

Bits <7:5> of the PSW (zero-indexed) are the Interrupt Priority Level. They specify a threshold: an interrupt is serviced by the processor only if its level is above it. Software interrupts occupy the lowest levels (0-3, respectively Stack Overflow, Trace Trap, Trap Instruction (as above), Bus Error), whereas levels 4-7 are reserved for hardware devices, which request service through four physical lines named BR4-BR7. Most devices can be somehow configured to use a different level. If the processor chooses to honor a bus request, it will raise the matching BG4-BG7 line, which indicates to the device that it should put its vector on the bus. Note that you, the systems programmer, control how deeply interrupts can nest through the IPL of the PSW stored in each vector. A good practice is setting that IPL to the expected level of the received interrupt.
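The bit-twiddling involved is small enough to show in full. This Python sketch extracts and sets the IPL field and applies the "strictly above the threshold" rule from the paragraph above; the function names are my own.

```python
IPL_MASK = 0o340  # bits <7:5> of the PSW


def priority_level(psw):
    """Extract the Interrupt Priority Level from a PSW value."""
    return (psw & IPL_MASK) >> 5


def with_priority(psw, level):
    """Return psw with its IPL field replaced by level (0-7)."""
    return (psw & ~IPL_MASK) | ((level & 0o7) << 5)


def is_serviced(psw, request_level):
    """An interrupt is serviced only if its level is above the current IPL."""
    return request_level > priority_level(psw)


psw = with_priority(0, 4)     # e.g. the PSW stored in the vector of a BR4 device
print(oct(psw))               # 0o200
print(is_serviced(psw, 4))    # False
print(is_serviced(psw, 5))    # True
```

So a handler entered with its vector's IPL set to 4 can still be interrupted by BR5-BR7 devices, but not by another BR4 request.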

Working with SIMH

SIMH is an excellent little historic computer simulator that includes support for the PDP-11, amongst a long list of other contemporaneous systems. There are various ways to get it, including downloading Windows binaries, getting Debian packages, or building it from source tarballs. I won't cover that build process (I got it from a dpkg myself :), but I promise you it should be trivial after building a cross-compiler.

For how well it works, SIMH is one of the most abhorrently-documented projects I've seen. For example, each simulator supports different load formats; the one we're interested in, the PDP-11, states in the distributed PDF document that "load" will receive "standard binary format tapes". No, these aren't .tar files. The only place you can find any documentation whatsoever on the format actually accepted is by cracking open the source:

/* Binary loader.
   Loader format consists of blocks, optionally preceded, separated, and
   followed by zeroes.  Each block consists of:
        001             ---
        xxx              |
        lo_count         |
        hi_count         |
        lo_origin        > count bytes
        hi_origin        |
        data byte        |
        :                |
        data byte       ---
        checksum
   If the byte count is exactly six, the block is the last on the tape, and
   there is no checksum.  If the origin is not 000001, then the origin is
   the PC at which to start the program.
*/

simh/PDP11/pdp11_sys.c, lines 218-237

It's actually not too terrible, if you forgive the fact that there's no documentation for the algorithm computing the checksum. We can ignore the fact that it takes three octal digits to represent a byte for the moment, since we'll be dealing with that anyway.

Lucky for you, I already pounded my head against the wall and banged out this "small" Python script that should do the right thing:

# PDP-11 crappy "terp" (tape) format for loading into simh/pdp11's terp dervs.
# Written for Python 3.

import struct
import sys

USAGE = '''Usage: python mkterp.py {BLOCK} {BLOCK} {BLOCK}
where each {BLOCK} is:
    [-O <origin>] to set the origin (default 0)
    -d <file> a binary file to read in data (terminates the block).
OR:
    -o <fname> to set the output file name (last instance wins).
OR:
    -p <pc> to set the address at which the program shall start (last instance wins).
Numbers use Python literal syntax (0o1000 is octal, 0x200 is hex).
'''


def block(origin, data=b''):
    # Header (001, 000, byte count, origin), then the data, then a checksum
    # byte chosen so that the whole block sums to 0 modulo 256.
    pkt = struct.pack('<HHH', 1, 6 + len(data), origin) + data
    return pkt + bytes([-sum(pkt) & 0xFF])


def main(argv):
    if len(argv) < 2:
        print(USAGE)
        return 1

    ofile = 'image.out'
    pc = 1  # An origin of 1 in the final block leaves the PC alone on load.
    blocks = []  # [{'org', 'fname'}]
    org = 0

    i = 1
    while i < len(argv):
        opt = argv[i]
        if opt in ('-o', '-p', '-O', '-d') and i + 1 >= len(argv):
            print('Missing argument for', opt)
            return 1
        if opt == '-o':
            ofile = argv[i + 1]
            i += 2
        elif opt == '-p':
            pc = int(argv[i + 1], 0)
            i += 2
        elif opt == '-O':
            org = int(argv[i + 1], 0)
            i += 2
        elif opt == '-d':
            blocks.append({'org': org, 'fname': argv[i + 1]})
            org = 0
            i += 2
        else:
            print('Unrecognized option:', opt, 'skipped')
            i += 1

    print('Blocks to be put into %s:' % (ofile,))
    for blk in blocks:
        print('File', blk['fname'], '@', blk['org'])

    with open(ofile, 'wb') as of:
        for blk in blocks:
            with open(blk['fname'], 'rb') as inf:
                of.write(block(blk['org'], inf.read()))
        # A block with a byte count of exactly six ends the tape; its origin is the start PC.
        of.write(block(pc))
    print('Complete.')
    return 0


if __name__ == '__main__':
    sys.exit(main(sys.argv))

The usage of this script should look something like the following:

   python mkterp.py -O ldaddr_A -d A -O ldaddr_B -d B -O ldaddr_C -d C ... [-o outfile] [-p startPC]

where A is to be loaded into memory starting at ldaddr_A, B is to be loaded into memory at ldaddr_B, and so on. The output file is cooked and ready to serve with "load" on the simulator command prompt.
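If you want to sanity-check an image before feeding it to the simulator, a small reader for the same format is handy. The sketch below is my own reading of the loader comment quoted above, not SIMH's code: it assumes the checksum byte makes each block sum to 0 modulo 256 (the convention the generator script uses), and it is stricter than the real loader in that it rejects anything other than zero padding between blocks. The function name is hypothetical.

```python
import struct


def read_blocks(image):
    """Yield (origin, data) per block; the last item is (start_pc, None)."""
    pos = 0
    while pos < len(image):
        if image[pos] == 0:               # zero padding between blocks
            pos += 1
            continue
        lead, _unused, count, origin = struct.unpack_from("<BBHH", image, pos)
        if lead != 1:
            raise ValueError("unexpected byte at offset %d" % pos)
        if count == 6:                    # final block: origin is the start PC
            yield origin, None
            return
        blk = image[pos:pos + count + 1]  # header + data + checksum byte
        if len(blk) != count + 1 or sum(blk) % 256 != 0:
            raise ValueError("truncated block or bad checksum at offset %d" % pos)
        yield origin, blk[6:count]
        pos += count + 1


# Build a one-block image by hand and read it back.
pkt = struct.pack("<HHH", 1, 6 + 2, 0o1000) + b"\x01\x02"
image = pkt + bytes([-sum(pkt) & 0xFF]) + struct.pack("<HHH", 1, 6, 1)
for origin, data in read_blocks(image):
    print(oct(origin), data)
```

Running it over the output of mkterp.py should list each file's load address and contents, followed by the start PC.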