This article shows how Multics mechanisms for segmentation, kernel and user rings of execution, and execution stacks cooperate to implement debug breakpoints in a secure, efficient, and extensible fashion. Another article describes the Multics Execution Environment in more detail.
Specifically, this article analyzes how the debug command uses a Master Mode Entry 2 mme2 instruction to implement a breakpoint in an object segment. Executing this instruction causes a MME2 fault in the running program, which invokes the debug command, permitting in-depth investigation of the suspended program.
The analysis grew from a shorter summary of this topic, assembled to help Charles Anthony in repairing the dps8m simulator's handling of the Restore Control Unit rcu instruction. Mapping of a user program fault onto a condition signaled in the program's ring of execution may be of interest to other readers. It is used for several fault types (overflow, divide check, mme, mme1, mme2, mme3).
NOTE: Several links in this analysis open the Processor Manual (AL39), and jump to a specific page in the PDF file. The Google Chrome browser does the best job of handling each download. It opens to the correct page; and subsequent links to other pages reuse the AL39 document in the browser's cache. As this analysis is published, other browsers have flaws in this mechanism.
Contents
A Simple Program
Output from this Program
Object Code
Debugging the Program
Setting a Breakpoint
Execution at a Breakpoint
Ring 0: Breakpoint Instruction Generates a Fault
Executing a Breakpoint Instruction
Fault Cycle
Fault and Interrupt Vectors
Fault Cycle for MME2
Ring 0 Fault Handler: fim$signal_entry
Ring 0 Exit Routine: signaller
Preparing to Call the Outer Ring Signal Mechanism
User Ring: Breakpoint as a Condition
Support Routine: return_to_ring_0_ - return_to_ring_n_ Label
Support Routine: signal_
Support Routine: sct_manager_
Handler for the Breakpoint
Support Routine: debug$mme2_fault
Restarting Execution at the Breakpoint
After Debug Handles the Breakpoint
Support Routine: return_to_ring_0_ - Primary Entrypoint
Return to Ring 0
Ring 0 Entrance Routine: restart_fault
Validating User Ring Control Unit Data
Restarting the Verified Machine Conditions
User Ring: Program Continues Execution
COMPILATION LISTING OF SEGMENT hello
Compiled by: Multics PL/I Compiler, Release 33e, of October 6, 1992
Compiled at: Arrakis
Compiled on: 01/28/17  1126.7 mst Sat
Options: table list

      1 hello: proc;
      2
      3   dcl ioa_$nnl entry() options(variable);
      4
      5   call ioa_$nnl("Hello,");
      6
      7   call ioa_$nnl(" world!^/");
      8
      9   end hello;
This analysis begins as a thought experiment, with creation of a user program in which a debug breakpoint can be set. To keep things simple, we'll use a "Hello, world!" program. The PL/I compiler listing shows source line numbers for this program.
The program is modified from its simplest form to permit setting a breakpoint between output of "Hello," and " world!". This will give visual confirmation of: where the breakpoint is set; when it is actually reached; and when the program ends.
hello
Hello, world!
r 07:53 0.032 0
Just the line "Hello, world!" is expected as output when no breakpoint is set. Input lines are shown in blue font.
This analysis uses the debug command to set a breakpoint in source line 7 of the program. Code generated by PL/I is shown in this compiler listing fragment. In the PL/I listing, instruction operands are given in decimal. For example, at instruction location 000027, the operand of the ldaq instruction is an offset of -19 words from the current instruction counter ic. We shall see that the debug command displays such operands in octal, rather than decimal.
STATEMENT 1 ON LINE 7
call ioa_$nnl(" world!^/");

000027  aa    777755 2370 04   ldaq    -19,ic        000004 = 040167157162 154144041136
000030  aa  6 00102 7571 00    staq    pr6|66
000031  aa    057000 2350 03   lda     24064,du
000032  aa  6 00104 7551 00    sta     pr6|68
000033  aa  6 00102 3521 00    epp2    pr6|66
000034  aa  6 00112 2521 00    spri2   pr6|74
000035  aa    777743 3520 04   epp2    -29,ic        000000 = 524000000011
000036  aa  6 00114 2521 00    spri2   pr6|76
000037  aa  6 00110 6211 00    eax1    pr6|72
000040  aa    004000 4310 07   fld     2048,dl
000041  aa  6 00044 3701 20    epp4    pr6|36,*
000042  la  4 00012 3521 20    epp2    pr4|10,*      ioa_$nnl
000043  aa  0 00622 7001 00    tsx0    pr0|402       call_ext_out_desc
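The decimal and octal views of an operand can be reconciled with a little arithmetic. The following Python sketch is not part of Multics; it assumes only that the address field is 18 bits wide and holds a two's-complement value, and the function names are made up for illustration.

```python
def signed18(field):
    """Interpret an 18-bit address field as a two's-complement integer."""
    field &= 0o777777
    return field - (1 << 18) if field & 0o400000 else field


def show_operand(field):
    """Return the operand as (decimal text, octal text)."""
    value = signed18(field)
    sign = "-" if value < 0 else ""
    return str(value), f"{sign}{abs(value):o}"


print(show_operand(0o777755))  # address field of the ldaq at 000027
print(show_operand(0o057000))  # address field of the lda at 000031
print(show_operand(0o000102))  # offset used by the staq at 000030
```

The first element of each pair is the form used in the PL/I listing (-19, 24064, 66); the second is the same value written in octal (-23, 57000, 102).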
The analysis continues by using debug to set a breakpoint in the hello program.
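The following debug command and debugging requests print source line 7, display its first three instructions, and set a breakpoint at the second of those instructions (hello|30):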
debug
/hello/&a7,s
call ioa_$nnl(" world!^/");
,i3
LINE NUMBER 7
    27    777755 2370 04   ldaq    -23,ic        000004 = 040167157162 154144041136
    30  6 00102 7571 00    staq    pr6|102
    31    057000 2350 03   lda     57000,du
+1<
Break 1 set at hello|30 (line 7) At instruction:
    30  6 00102 7571 00    staq    pr6|102
Debug sets a breakpoint in object code by saving the instruction at the breakpoint location in a per-user segment called the breakmap.
The .bl request lists all the breakpoints saved in the breakmap for the selected object segment.
.bl
Break 1 set at hello|30 (line 7) At instruction:
30 6 00102 7571 00 staq pr6|102
After a breakpoint is set, the program location now contains a Master Mode Entry 2 mme2 instruction, with an operand set to the break number. When executed, this instruction causes an event that results in signalling the mme2 condition.
-1,i3
LINE NUMBER 7
    27    777755 2370 04   ldaq    -23,ic        000004 = 040167157162 154144041136
    30    000001 0040 00   mme2    1
    31    057000 2350 03   lda     57000,du
.q
r 07:53 0.149 0
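The save-and-replace step can be modeled in a few lines of Python. This is a sketch only: the real breakmap layout is not described here, so a dictionary stands in for it, the object segment is a dictionary of 36-bit words keyed by offset, and the function names are hypothetical. The opcode value 004 and its position in the word are read off the "000001 0040 00" display above.

```python
MME2_OPCODE = 0o004  # assumed from the "0040" opcode column in the display


def mme2_word(break_number):
    """Build a 36-bit mme2 word whose address field holds the break number."""
    return ((break_number & 0o777777) << 18) | (MME2_OPCODE << 9)


def set_break(segment, offset, breakmap):
    """Save the original word in the breakmap, then plant a mme2 in its place."""
    break_number = max(breakmap, default=0) + 1
    breakmap[break_number] = (offset, segment[offset])
    segment[offset] = mme2_word(break_number)
    return break_number


def reset_break(segment, break_number, breakmap):
    """Put the saved instruction back and forget the breakpoint."""
    offset, original = breakmap.pop(break_number)
    segment[offset] = original


hello = {0o30: 0o600102757100}  # staq pr6|102, as shown by debug
breakmap = {}
number = set_break(hello, 0o30, breakmap)
print(number, f"{hello[0o30]:012o}")
reset_break(hello, number, breakmap)
print(f"{hello[0o30]:012o}")
```

Because the original word lives in the breakmap rather than in the object segment, the program stays runnable only as long as something can substitute that word when the mme2 is reached.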
hello
Hello,Break 1 at line 7 of hello - at 367|30
When a breakpoint is set, running the hello program encounters the breakpoint. The debug on-unit for the mme2 condition accepts debugging requests.
.t
DEPTH NAME CONDITION
0 initialize_process_|704
1 listen_|34522
2 abbrev_processor|14103
3 command_processor_|2101
4 hello|30 mme2
A stack trace request shows the hello command being invoked, with execution stopped at hello|30 with a mme2 condition.
5 return_to_ring_0_|0
6 signal_|10017
7 sct_manager_$call_handler|11461
8 mme2_fault_handler_|3756
9 mme2_fault|6622
10 db_parse|14311
However, the routines that signal the mme2 condition and invoke debug also have frames on the stack. These support routine stack frames are shown in gray here, because they are usually omitted from the debug stack trace. This focuses user attention on the program that reached a breakpoint. Later sections of this analysis reveal how these support routines are invoked by execution of the mme2 instruction.
.c
world!
r 12:11 0.116 1
A continue request tells debug to restart program execution; the original instruction at the breakpoint is executed, followed by remaining code in the hello object, which outputs the second part of hello's output line. When the program ends, Multics displays a standard ready line.
Note that the breakpoint mme2 instruction remains in the object segment until explicitly reset by the user, or until program source is recompiled. Thus, subsequent program invocations repeat the scenario outlined here.
To learn how a breakpoint interrupts execution of its containing program, start with the components introduced above.
When the running program executes the mme2 instruction, it causes a MME2 fault event to occur. The processor suspends execution of the program running in the user ring, and begins a Fault Cycle in ring 0.
The Control Unit is the section of the processor that oversees the flow of instruction executions. Two sorts of events can temporarily suspend program execution and redirect the flow to other instructions: faults and interrupts. When such events occur, the processor begins a Fault Cycle to redirect instruction execution. For a MME2 fault at hello|30, or for other faults and interrupts, the Fault Cycle includes the following:
The Fault Cycle references a fault or interrupt vector entry specific to the particular fault or interrupt that has occurred. Such entries are defined in the Multics fault vector structure, which holds several arrays containing:
Each interrupt or fault vector entry is a pair of instructions to be invoked by the Execute Double (xed) instruction generated by the Fault Cycle. For Multics vector entries:
When the hello program executes the breakpoint mme2 instruction, that instruction starts a MME2 fault cycle. The MME2 fault has a fault number of 21 (decimal). Its fault vector entry is at address 152 (octal). The xed instruction executes the pair of instructions in that fault vector.
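The relationship between a fault number and its vector address can be sketched as follows. The base address of octal 100 is an assumption (fault pairs placed directly after 32 two-word interrupt pairs), not something stated in this article; the two-word entry size follows from each entry being an instruction pair for xed.

```python
FAULT_VECTOR_BASE = 0o100  # assumed location of the first fault pair
WORDS_PER_ENTRY = 2        # each entry is an instruction pair run by xed


def fault_vector_address(fault_number):
    """Return the address of the instruction pair for a fault number."""
    if not 0 <= fault_number < 32:
        raise ValueError("fault numbers are assumed to run from 0 to 31")
    return FAULT_VECTOR_BASE + WORDS_PER_ENTRY * fault_number


print(f"{fault_vector_address(21):o}")  # MME2
```

Under these assumptions, fault number 21 maps to octal 152.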
The fault vector entry begins with an scu instruction, which copies the Safe-Stored Control Unit Data to the location given in the scu data ptr for the MME2 fault: pds$signal_data.scu. The Process Data Segment pds is a per-process, ring 0 segment created in the process directory. Its pages are wired in memory while its owning process is running on the processor.
The xed execution continues with the second instruction of the MME2 fault vector entry, which transfers to the ring 0 handler for this fault. The fault handler ptr for MME2 points to fim$signal_entry, an entrypoint in the Fault Interceptor Module.
At this point in handling the MME2 fault, only three instructions have been executed:
None of these instructions has changed, nor depends upon, any pointer register or register of any kind. All program-visible registers still contain data they held when the fault occurred.
The Fault Interceptor Module, fim.alm, is a ring 0 kernel segment whose pages are wired in memory. This prevents taking a page fault while processing another type of fault. The fim operates in privileged instruction mode, with interrupts inhibited to prevent I/O interrupts and other low-priority faults from interrupting its handling of the fault or interrupt.
The fim defines several fault handler entrypoints. Most of these provide similar services, but store data for different groups of faults in distinct locations. The MME2 fault is one of a group of faults that are handled directly in the faulting ring of execution. The fault event is converted to a condition that is signaled in this faulting ring. The fim$signal_entry handles such faults.
The first operation performed by the handler saves register contents as they were at the time of the fault. These are stored in the pds$signal_data (adjacent to the Control Unit Data saved by the scu instruction). The handler begins with a Store Pointer Registers as ITS Pairs spri instruction. It continues with a Store Registers sreg instruction. The handler can now use these registers for its own purposes.
Next, the handler uses the fault number, stored in the Control Unit Data at mc.scu.fi_num, to access an internal array of fault information (at fim's fault_table: label). This table:
The fim$signal_entry call_signaller code path ends fault handling by: storing the condition name at pds$condition_name; and transferring to signaller$signaller, to complete handling of the MME2 fault.
Notice that fim performs a simple set of operations: reading from static data tables, and storing data in permanent segments. It does nothing that requires temporary storage, and thereby avoids creating (pushing) a stack frame onto any kernel stack segment.
The fim handler captured machine conditions at the time of the MME2 fault, and chose a condition name to be signaled when reporting the fault event in the outer ring. The next step is to actually signal that condition on the stack of the faulting program, which is usually an outer ring stack. The ring 0 signaller.alm segment performs this operation.
signaller is a ring 0 segment that operates in privileged instruction mode, with interrupts inhibited to prevent I/O interrupts and other low-priority faults from interrupting its extended handling of the fault.
signaller$signaller performs checks for different user environments when selecting the stack on which to signal the condition.
For a MME2 fault caused by a debug breakpoint, the selected stack is most likely the stack for the user's login ring. That is the ring containing object segments in which breakpoints can be set.
Having decided on which stack to signal the condition, the signaller creates a new stack frame at the end of this stack.
At this point, signaller is running in ring 0 without a stack frame. There are no active frames on any stack for this process. The diagram shows the frame for hello highlighted in orange, because execution of the program owning that stack frame was suspended by the MME2 fault.
When signaller creates the new stack frame on the selected stack, it sets the Stack Pointer Register pr6 pointing to that frame. It assigns ownership of the frame to the program entrypoint return_to_ring_0_$return_to_ring_0_.
Signaller then copies machine conditions for the fault into the new stack frame.
Signaller also copies the name of the condition to be signaled from pds$condition_name into the new stack frame.
Signaller constructs an argument list to pass the saved information to the outer ring signal mechanism. The call is of the form:
call signal_( condition_name, mc_ptr, null, null );
Signaller loads a pointer to this argument list into the Argument List Pointer Register pr0. Signaller loads a pointer to the outer ring signal program from the outer ring stack header stack_header.signal_ptr into pr3.
Finally, signaller constructs an outer-ring pointer to the external label: return_to_ring_0_$return_to_ring_n_. Then it "returns" through this pointer, using a Return Control Double rtcd instruction.
Returning to the original ring of execution, a mme2 condition is signaled to indicate that execution has reached the program breakpoint.
Code in return_to_ring_0_ is now running in the outer ring. The current Stack Frame Pointer (pr6) points to its stack frame, just created by signaller. Therefore, that stack frame is shown in green.
Interrupts and low priority faults are no longer inhibited. So any pending interrupts or faults (e.g., ring alarm faults) that arose while running in ring 0 code can now occur.
Code at the return_to_ring_n_ label consists of only four instructions:
nop
nop
nop
callsp  bb|0
The three nop instructions give the processor an opportunity to process any pending interrupts or low-priority faults delayed while running in ring 0 with interrupts inhibited.
Officially, the callsp instruction is known as a call6 instruction. Its operand bb|0 is an ALM synonym for pr3|0, which contains the outer ring signal_ptr value. call6 references pr6, which correctly points to the return_to_ring_0_ stack frame. And signaller loaded the Argument List Pointer Register (pr0) for this call with the address of the arg_list constructed in this stack frame (shown in the diagram above).
This call invokes the outer ring signal mechanism.
In the user ring, stack_header.signal_ptr points to the signal_ program. This program supports two kinds of condition handlers. The normal handler is an on-unit attached to the stack frame of a running program. This on-unit is active only while the stack frame appears on the stack.
However, debug must implement breakpoints set in a program when debug is no longer running. For example, breakpoints set in a program should be handled if some program failure triggers creation of a new process. The user reinvokes the program being debugged from a command, without restarting debug.
For such cases, signal_ supports a less-used type of condition handler called a static handler. Before searching for a normal condition on-unit, signal_ calls sct_manager_$call_handler, asking it to invoke any static handler for the condition being signaled. It passes the following information:
call sct_manager_$call_handler( mc_ptr, condition_name, ..., continue);
If sct_manager_$call_handler can handle the condition, it sets continue="0"b to stop further signalling.
The sct_manager_ looks in the System Condition Table (SCT) for static handlers. The SCT for a ring is located just after the declared stack_header structure in the Stack Header region (first page) of the stack segment. The stack_header.sct_ptr points to the SCT.
For each possible value of mc.fcode (the machine conditions fault code), a handler slot is reserved in the SCT. For MME2, the fault code is 21. sct_handler(21) points to process_overseer_$mme2_fault_handler_. This entry was added to the SCT by initialize_process_ when the process was created.
sct_manager_ calls the static handler, with all of its arguments. It permits the handler to decide how to set the continue argument.
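The order in which signal_ consults static handlers and on-units can be sketched in Python. This is a simplified model, not the Multics code: the SCT is a dictionary indexed by fault code, the argument lists are shortened, on-units are reduced to a list of (frame owner, condition names) pairs, and the Continue class is a stand-in for the bit(1) continue parameter.

```python
class Continue:
    """Mutable stand-in for the continue output parameter."""

    def __init__(self):
        self.flag = True  # True plays the role of "1"b: keep signalling


def mme2_fault_handler_(mc, condition_name, cont):
    """Proxy static handler: would run debug, then mark the condition handled."""
    mc["handled_by"] = "debug$mme2_fault"
    cont.flag = False


def sct_call_handler(sct, mc, condition_name, cont):
    """Invoke the static handler registered for this fault code, if any."""
    handler = sct.get(mc["fcode"])
    if handler is not None:
        handler(mc, condition_name, cont)


def signal_(sct, on_units, condition_name, mc):
    """Try static handlers first, then on-units from the newest frame back."""
    cont = Continue()
    sct_call_handler(sct, mc, condition_name, cont)
    if not cont.flag:
        return "static handler"
    for frame_owner, conditions in reversed(on_units):
        if condition_name in conditions:
            return f"on-unit in {frame_owner}"
    return "unhandled"


sct = {21: mme2_fault_handler_}
print(signal_(sct, [], "mme2", {"fcode": 21}))
print(signal_({}, [("hello", {"mme2"})], "mme2", {"fcode": 21}))
```

The first call is handled without any on-unit being present on the stack, which is the property debug relies on.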
The debug break handling entrypoint is invoked to accept debugging requests.
process_overseer_$mme2_fault_handler_ is a proxy handler that actually calls the debug$mme2_fault static handler. It passes all incoming arguments to the debug handler.
debug$mme2_fault is called with a pointer to the machine conditions captured for the MME2 fault. These are the machine conditions placed in the return_to_ring_0_ stack frame by signaller. debug uses them to locate the breakpoint that executed the mme2 instruction, and to find that breakpoint entry in the breakmap.
It calls db_parse to parse incoming debug request lines from the user: requests to trace the stack, set or reset breakpoints, continue execution after a breakpoint is hit, etc. For a .c request, it calls db_break$restart to continue execution from the point at which it was suspended by the current breakpoint.
db_break$restart handles restarting of execution when continuing from a debug breakpoint.
When continuing from a regular breakpoint that is still set, db_break$restart wants to continue from the breakpoint by executing:
But the mme2 instruction still remains at that breakpoint location in the object segment, ready to break during a future execution of this hello code path.
To execute the instruction originally at the breakpoint, db_break$restart replaces the mme2 instruction at mc.scu.even_inst with the original instruction from the breakmap entry.
db_break$restart then returns to the restart_fault: action in debug$mme2_fault.
Upon return from db_break$restart, the debug$mme2_fault static handler returns to its caller, the proxy static handler.
The proxy handler, process_overseer_$mme2_fault_handler_, sets its continue parameter to "0"b, telling signal_ the mme2 condition has been handled. It then returns to its caller.
The sct_manager_$call_handler that invoked the static handler now returns to its caller.
signal_, seeing continue="0"b, returns to its caller.
In Support Routine: return_to_ring_0_ - return_to_ring_n_ Label (see above), signaller exited ring 0 by doing a rtcd instruction to the return_to_ring_0_$return_to_ring_n_ label. Code at that external label called the outer ring signal_ program.
But signaller set the owner of the stack frame to the primary entrypoint: return_to_ring_0_$return_to_ring_0_. It also set the stack_frame.return_ptr to this same code location. Thus, when signal_ returns, the return operator transfers to this code.
Primary entrypoint code is:
return_to_ring_0_:
        callsp  restart_fault_ptr,*     call into ring zero

        even
restart_fault_ptr:
        its     75|0
restart_fault_ptr points to the restart_fault|0 entrypoint in segment 75 (octal).
The kernel must validate the machine conditions to be restarted. If acceptable, it restarts execution based upon the registers and instructions in those machine conditions.
restart_fault is a gate into ring 0, with ring brackets: 0,0,7. Like all gate segments, the first section of code is a transfer vector with an entry for each of the permitted gate entrypoints. The two entrypoints are:
000000  tra     restart_entry
000001  tra     cleanup_entry
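The effect of ring brackets 0,0,7 and a two-entry transfer vector can be illustrated with a small Python model. It reflects a general reading of ring bracket rules (calls from rings above the second bracket, up to the third, must land on a gate entry and then run in the second bracket's ring) and is an assumption rather than a description of the hardware checks; the function name and parameters are hypothetical.

```python
def enter_gate(caller_ring, brackets, entry_offset, gate_entry_count):
    """Return the ring of execution after a call, or raise if not permitted."""
    r1, r2, r3 = brackets
    if not r1 <= r2 <= r3:
        raise ValueError("ring brackets must be non-decreasing")
    if caller_ring > r3:
        raise PermissionError("caller ring is outside the call bracket")
    if caller_ring > r2 and not 0 <= entry_offset < gate_entry_count:
        raise PermissionError("inward calls must land on a transfer vector entry")
    # Inward calls drop to r2; calls from inside the bracket keep their ring.
    return max(r1, min(caller_ring, r2))


RESTART_FAULT = (0, 0, 7)
print(enter_gate(4, RESTART_FAULT, 0, 2))  # restart_entry, called from ring 4
print(enter_gate(4, RESTART_FAULT, 1, 2))  # cleanup_entry, called from ring 4
```

A call from ring 4 to any offset past the transfer vector raises PermissionError in this model, which is the reason the gate begins with a fixed list of tra instructions.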
restart_fault operates with interrupts inhibited to prevent I/O interrupts and other low-priority faults from interrupting its restarting of faulting code.
restart_fault is co-framer (along with signaller) of storage in return_to_ring_0_'s stack frame. It knows the layout of the automatic variables in that stack frame; in particular, it uses the machine conditions (mc) structure located in this stack frame. And while signaller creates the stack frame, restart_fault removes it if the fault is restarted.
restart_fault's restart_entry code prepares for a possible restart as follows:
The outer ring can make changes to the Control Unit Data to correct the cause of the fault, and request ring 0 to then restart using this corrected instruction data. restart_fault must ensure these changes:
Therefore, restart_fault imposes the following restrictions on the Control Unit Data to be restarted.
After verification, restart_fault continues:
It uses information in the copied machine conditions to restart execution where the fault occurred. This begins with restoring registers, in reverse order of usefulness (because restart_fault is using pointer registers to access the machine conditions).
So index registers, A, Q, etc. are restored first, using an lreg instruction. Then pointer registers are restored, using an lpri instruction.
Execute a Restore Control Unit rcu instruction to load the verified, possibly modified Control Unit Data from mc.scu to the processor Control Unit internal registers. This restarts execution of the faulting program in its original ring of execution.
Note that the rcu instruction may be used only when running in privileged mode. Attempting to restore damaged or malicious Control Unit Data could:
Therefore, its use is restricted to the ring 0 kernel software.
debug does modify the machine conditions to run the original instruction at the breakpoint, which it copied from the breakmap into mc.scu.even_inst.
When the machine conditions are restarted, the hello program continues execution with the original instruction that was at the breakpoint, followed by the remaining instructions in the program.
Note that the Control Unit reads instructions from memory two words at a time, starting with an even location. Since most instructions fill one word, this means two instructions may be present in the Control Unit Data. The breakpoint examined here was at an even location. So the mme2 instruction and the lda that followed it were in the Control Unit when the fault occurred.
When restarting from the machine conditions altered by debug, debug replaced the mme2 instruction causing the fault with the original staq instruction in the mc.scu.even_inst. It left the lda untouched in mc.scu.odd_inst. Thus, execution continued with: the two instructions loaded from machine conditions into the Control Unit; followed by remaining instructions starting at hello|32.
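The restart sequence for this even-location breakpoint can be traced with a short Python model. Instructions are kept as mnemonic strings for readability, the machine conditions are a plain dictionary, and the function names are invented for this sketch. Only the even-location case described above is modeled; how debug treats a breakpoint at an odd location is not covered here, so the sketch refuses that case instead of guessing.

```python
def safe_store_pair(segment, ic):
    """Model the two words the Control Unit holds when a fault occurs at ic."""
    even = ic & ~1  # instruction pairs start on an even word
    return {"ic": ic, "even_inst": segment[even], "odd_inst": segment[even + 1]}


def patch_even_inst(mc, original):
    """Return a copy of mc with the breakpoint word replaced by the saved one."""
    if mc["ic"] % 2:
        raise NotImplementedError("only the even-location case is described")
    patched = dict(mc)
    patched["even_inst"] = original
    return patched


def restart(mc, segment, count):
    """List the first `count` instructions run after the restart."""
    executed = [mc["even_inst"], mc["odd_inst"]]
    next_word = (mc["ic"] & ~1) + 2
    while len(executed) < count:
        executed.append(segment[next_word])
        next_word += 1
    return executed[:count]


hello = {
    0o30: "mme2 1",
    0o31: "lda 57000,du",
    0o32: "sta pr6|104",
    0o33: "epp2 pr6|102",
}
mc = patch_even_inst(safe_store_pair(hello, 0o30), "staq pr6|102")
print(restart(mc, hello, 4))
print(hello[0o30])
```

The trace begins with the staq taken from the patched machine conditions, while the object segment itself still holds the mme2 at offset 30.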
As the hello program continues execution, note that the breakpoint remains set in the object segment. The next traversal of this code path will also stop at the breakpoint.
The breakpoint remains in the object segment until: it is reset via the debug .br request; or until the source is recompiled.
Original 08 Feb 2017 GCD