MULTICS TECHNICAL BULLETIN                              MTB-757-01

To:       MTB Distribution
From:     Rick Gray
Date:     October 20, 1986
Subject:  ALM Symbol Table Support

-----------------------------------

1. Abstract

This MTB describes features to be added to the ALM assembler that will be of interest to ALM programmers and those compiler writers who are using ALM as an intermediary. The features will include the ability to:

   - position text entry sequences,
   - specify more information in text entry sequences,
   - create a full or partial symbol table for debugging.

These features will provide the assembler with symbolic debugging information that will be organized into a format known to probe and debug. The information will be supplied to the assembler using two new pseudo operators. These pseudo operators are described in this MTB and will allow the programmer or compiler writer to provide the assembler with information for the symbolic debugging of ALM programs. Note that this document does not describe any mechanism for the explicit specification of statement map information.

Thanks to Ward Anderson who has done most of the work on symbol table support for ALM.

-----------------------------------

Comments on this MTB should be sent to the author -

via Multics mail to:
   JRGray.Multics on System M

via telephone to:
   (403) 284-6400
   (403) 284-6410

via forum on System-M to:
   >udd>m>DGHowe>mtgs_dir>c>c_imp (c)

ALM Symbol Tables                                       MTB-757-01

2. Preface

The two ALM pseudo operators described in this MTB will provide ALM programmers and compiler writers with the opportunity to symbolically debug ALM programs using probe and debug. This MTB will limit itself to describing the use of the two pseudo operators. The first section describes the pseudo operator that will create entry sequences. The second section describes the pseudo operator that will create runtime symbols. The pseudo operator that will be used by compilers to specify statement map information for high level source code will not be included in this MTB.
                        TABLE OF CONTENTS

Section     Subject
=======     =======

1           Abstract
2           Preface
3           Introduction
4           New Control Arguments
5           The 'ext_entry' Pseudo Operator
5.1         . . Function
5.2         . . Syntax and Arguments
5.3         . . Sections Of The Entry_block
5.3.1       . . . . Entry_sequence
5.3.2       . . . . Call_sequence
5.3.3       . . . . Offset_sequence
5.3.4       . . . . Transfer_sequence
5.4         . . Environment
5.4.1       . . . . Position of the entry_block
5.4.2       . . . . The Statement Map
5.4.3       . . . . Positioning the entry_block
5.4.4       . . . . Runtime Symbol Blocks
5.4.5       . . . . Scope Rules
6           The 'rt_symbol' Pseudo Operator
6.1         . . Function
6.2         . . Syntax and Arguments
7           Changes to the ALM Info Segment
7.1         . . New Control Arguments
7.2         . . New Pseudo Operators

3. Introduction

The creation of a C compiler on Multics has prompted the changes to ALM described in this MTB. The Multics C compiler has been written in such a way that it generates ALM source code as an intermediate step in the compilation process. As a result, the ALM assembler will be responsible for the creation of the object segment.

Presently, the ALM assembler does not create an object segment containing symbolic debugging information. The C compiler specifications require symbolic debugging support for programs written in C. The responsibility of providing symbolic debugging support is passed to the ALM assembler.

As a step towards this design goal, all programs written in ALM will be able to generate symbolic debugging information. This will be accomplished through the use of the appropriate command line arguments and the pseudo operators described in this MTB.

An alternate solution was tested but found to be ineffective. It involved joining information created by operators such as vfd, zero, oct, etc. to the symbol section.
The complexity of creating the necessary links from one block of information to the next became very involved and time consuming, so this approach was not implemented.

4. New Control Arguments

The addition of two new control arguments to ALM will allow for the control and use of the new symbol table features. The new control arguments will be called -table and -brief_table. Note that the 'ext_entry' pseudo-op will produce no debugging information unless -table or -brief_table is specified. The 'rt_symbol' pseudo-op is ignored unless -table is specified. A brief description of the two new control arguments follows:

-brief_table, -bftb
   generate statement map information. This argument will cause ALM to produce the minimal symbol table information necessary for mapping ALM source lines to text locations.

-table, -tb
   generate full symbol table information. This argument will cause ALM to produce symbol table information on statement locations and on runtime values and locations.

5. The 'ext_entry' Pseudo Operator

5.1. Function

The ext_entry pseudo operator will create:

   1. an entry_sequence (as described in AG91-04, G-3).

   2. a call_sequence comprised of instructions that will establish a stack frame and call the PL/1 operator ext_entry.

   3. an offset_sequence with offsets to debugging information located in the linkage section and symbol block.

   4. a transfer_sequence that will transfer control to the entrypoint.

The above sections will be created to establish a C-like environment suitable for symbolic debugging. These sections will be collectively referred to as the entry_block in the rest of this MTB and are described in detail in Section 5.3.

Note that the use of the ext_entry pseudo-operation will not leave pr0 pointing to the argument list. The argument list pointer can be accessed from its location in the stack frame (location 26, i.e. pr6|26).

5.2.
Syntax and Arguments

Syntax:

        ext_entry  elabel/clabel,stack_size,dlabel
           .
           .
           .
           .
elabel:
           .
           .

Arguments:

elabel (required)
   is the name of the label that identifies the entrypoint for the entry_block.

clabel (optional)
   is specified by following the elabel with a slash and then a name. The name 'clabel' is given the value of the address of the code sequence associated with the entrypoint.

stack_size (optional)
   if specified, is a decimal number that specifies the size of the stack frame.

   if not specified, is decimal 64 plus the number of words required by the temp, tempd and temp8 pseudo operators.

   If the stack_size is not a multiple of 16, it is increased to the next multiple of 16.

dlabel (optional)
   is the name of the label that identifies the descriptor information. There is no default value.

5.3. Sections Of The Entry_block

5.3.1. Entry_sequence

The entry_sequence will be as described in AG91-04 G-3.

The descr_relp_offset and reserved fields will be included in the entry_sequence if the dlabel argument is provided. They will be omitted if the dlabel argument is not provided. The address of the word identified by dlabel will be converted to an 18 bit offset relative to the base of the text section, and stored in the upper half of the word.

The def_relp field will be set to the offset, relative to the base of the definition section, of the definition associated with this external entry.

The flags fields will be determined as follows:

   1. The basic_indicator field will always be set to "0"b.

   2. The revision_1 field will always be set to "1"b.

   3. If the dlabel argument is provided, the has_descriptors field will be set to "1"b; otherwise it will be set to "0"b.

   4. If the dlabel argument is provided, the variable field will be set to "0"b; otherwise it will be set to "1"b.

   5. The function field will always be set to "0"b.

5.3.2.
Call_sequence

The following instructions will establish a stack frame by calling the PL/1 ext_entry operator.

        eax7    stack_size      " create stack frame
        epp2    pr7|28,*        " set pr2 to base of PL/1 operators
        tsp2    pr2|549         " transfer to ext_entry and set pr2

5.3.3. Offset_sequence

There will be two words in this section. They will be set to zero initially and remain zero unless the -table option is given as a command line argument.

The first word will always remain zero. It was once used to contain 2 times the number of arguments expected by the entrypoint, but this field is currently ignored by operators and debugging tools.

When an entry_block is created by the ext_entry pseudo, a runtime_block will also be created, but in the symbol block. The runtime_block will be present for debugging purposes only. The offset of the runtime_block, relative to the base of the symbol block, will be set by the assembler and stored in the lower half of the second word in the offset_sequence. The upper half of the second word will be set to the offset of the symbol_table link relative to the base of the linkage section.

5.3.4. Transfer_sequence

The transfer_sequence will be an unconditional transfer instruction that transfers control to the entrypoint of the external entry. This instruction will allow the entry_block to be separated from the first instruction in the program. This feature will prove useful when the programmer wishes to create a declaration section or include parameter information within the scope of the entry_block. The entrypoint of an external entry will be identified by the label whose name is that of the external entry. For example:

        ext_entry  elabel
           .
           .
           .
        tra     elabel
        oct     000000000001    " fill
        oct     000000000030    " fill
elabel: lda     10,dl

Note that the label used to identify the entrypoint should never identify the entry_block.
For example:

elabel: ext_entry  elabel,100,dlabel
        lda     10,dl

The results of incorrectly specifying the entrypoint, as shown in the previous example, cannot be determined until runtime. After the transfer instruction in the transfer_sequence executes, the first word of the entry_sequence (if parameters are not specified) or the first two words (if parameters are specified) will be interpreted as instructions. Assuming these words are valid instructions, the most probable event will be stack overflow, as the call_sequence will be executed many times. The eax7 instruction will increase the value of pr7 to exceed its limit. There are currently no hooks in ALM to identify or prevent this situation from occurring. To avoid this situation, the example shown above should read:

        ext_entry  elabel,100,dlabel
elabel: lda     10,dl

5.4. Environment

5.4.1. Position of the entry_block

The ALM ext_entry pseudo operator will create entry_blocks whose structure and function will be very similar to those of external entrypoints in a PL/1 program. If used in the same context, they will perform an identical function.

An important characteristic of an object segment generated by the PL/1 compiler is the position of the entry_block with respect to the code associated with that external entry. The entry_block always precedes the code, so control flows sequentially from entry_block to the first instruction in the body of the program. This configuration facilitates the sequential order of the statement_map, which is a list of structures in the symbol section, used by symbolic debuggers.

5.4.2. The Statement Map

A statement_map will be generated for ALM source when the -table or -brief_table command line arguments are specified. The assembler will produce a statement_map based on ALM instructions found in the "alm_probe_table_$optable".
The table will be created by omitting from defops.incl.alm those pseudo operators and instructions found in alm_probe_list.incl.pl1. There will be a separate statement_map associated with every runtime_block. The statement_maps will be contiguous, but separated by an invalid statement_map entry. The invalid statement_map entries will be intended as markers only and can be bypassed by advancing the position counter in probe.

If the statement_map is not ordered sequentially, it is malformed and the debugging facilities will not function properly. It will be the programmer's responsibility to ensure the entry_block precedes the body of the program in ALM.

5.4.3. Positioning the entry_block

ALM allows the programmer to position and relocate regions of the text section using the use and join pseudo operators. This ALM feature will not adversely affect the statement_map. The assembler will compensate for the reordering and relocation caused by the use and join pseudo operators. An ext_entry pseudo operator located after the body of the program in the source file may be positioned before the body in the object segment without compromising the integrity of the statement_map. This will be useful when the stack_size is not known until the end of the program, and code has already been emitted.

5.4.4. Runtime Symbol Blocks

The assembler will create runtime symbol blocks when the -table argument is specified. A runtime symbol block will be created for every label symbol and every symbol defined by the temp, tempd and temp8 pseudo operators.

5.4.5. Scope Rules

The C language allows symbols to be known locally or globally. Local symbols are known in the function where the declaration occurs. Global symbols are not declared within a function and are known to all functions sharing the same source file. The ext_entry pseudo has been designed to facilitate such a feature.
Symbols in the ALM environment are those established using the temp, tempd, temp8 and rt_symbol pseudo operators as well as labels. The following rules are used to determine the scope of a symbol:

   1. All symbols appearing before the first occurrence of an ext_entry pseudo operator in the source file will be regarded as global.

   2. After the occurrence of the first ext_entry pseudo in the source file, symbols will be local to a single runtime_block.

Each ext_entry pseudo will have an active and inactive state. An ext_entry will become active when first encountered in the source file. Any previously active ext_entry will become inactive, and be replaced by the new occurrence. An ext_entry will remain in the active state until the next ext_entry is encountered or the end of the source file is reached. All symbols occurring while there is an active ext_entry will be known locally to the runtime_block associated with that ext_entry.

6. The 'rt_symbol' Pseudo Operator

6.1. Function

The rt_symbol pseudo operator will create a runtime symbol block within the symbol table when certain requirements are fulfilled. The requirements are as follows:

   1. The program must be assembled with the -table option.

   2. The ext_entry pseudo operator must be used to create entrypoints.

This pseudo operator will not reserve stack space. The symbols that will be created with this pseudo operator will be windows to previously established memory locations.

6.2. Syntax and Arguments

Syntax:

        rt_symbol  location,level,attributes,type,name[a1:a2:a3]

Arguments:

location (required)
   is an expression that results in an arithmetic value. The result of the expression is the offset of the symbol relative to the base of the stack frame. Locations start at offset 64 (decimal).

level (optional)
   is the structure level of the symbol. Level 0 indicates the symbol is not part of a structure. If the level field is omitted, the level is assumed to be 0.
attributes (optional)
   are a list of one or more of the following symbol attributes:

      a. aligned
      b. unaligned
      c. external
      d. internal
      e. static
      f. unsigned
      g. signed
      h. based

type (required)
   is one of the following symbol types:

         Name         PL/1 equivalent
         ============================
      a. char         char(1) aligned
      b. integer      fixed bin(35) signed aligned
      c. short        fixed bin(17) signed aligned
      d. long         fixed bin(71) signed aligned
      e. double       float bin(63) signed aligned
      f. float        float bin(27) signed aligned
      g. pointer      pointer aligned
      h. struct       ---
      i. string       char(?)

name (required)
   is the name of the runtime symbol. The name may be up to 32 characters in length and may include underscores.

[a1:a2:a3] (optional)
   are array indices for all types except for the 'string' type. When used in conjunction with the 'string' type, the first index is interpreted as the length of the string.

7. Changes to the ALM Info Segment

The following changes should be made to the info segment alm.info:

7.1. New Control Arguments

Insert descriptions for three new control arguments -table, -brief_table, and -no_table.

-brief_table, -bftb
   generates partial symbol table giving correspondence between ALM source line numbers and object locations.

-table, -tb
   generates full symbol table. This will generate debugging information used to find the location of ALM source lines and information about runtime symbols and variables.

-no_table, -ntb
   do not generate debugging information. (Default)

7.2. New Pseudo Operators

Insert descriptions for the two new pseudo-ops:

ext_entry label{/code_label}{,size}{,name}
   -- make a probe-able entry sequence for 'label' with stackframe size of 'size' and with descriptors at label 'name'. If 'code_label' is specified then it is assigned the value of the address of the code associated with the entry sequence.

rt_symbol loc{,level}{,attributes},type,name{[a1:a2:a3]}
   -- Define runtime symbol at stack offset 'loc', structure level 'level', type 'type', name 'name', attributes as specified and indexes (a1 etc.) as specified.
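
The arithmetic behind two of the rules in Section 5 can be illustrated with a short sketch. It is not part of ALM and is written in Python purely for illustration. It models the stack_size rule of Section 5.2 and the layout of the second offset_sequence word of Section 5.3.3. The function and parameter names are hypothetical, and the sketch assumes that rounding to a multiple of 16 applies to an explicitly given stack_size as well as to the default, which the text does not state outright.

```python
HALF_BITS = 18                      # half of a 36-bit word
HALF_MASK = (1 << HALF_BITS) - 1    # largest value that fits in one half
DEFAULT_BASE_WORDS = 64             # decimal 64, per Section 5.2


def frame_size(stack_size=None, temp_words=0):
    """Return the frame size in words, rounded up to a multiple of 16.

    temp_words is the number of words required by temp, tempd and temp8
    (hypothetical parameter; it only matters when stack_size is omitted).
    Assumption: an explicit stack_size is rounded the same way as the default.
    """
    if stack_size is None:
        size = DEFAULT_BASE_WORDS + temp_words
    else:
        size = stack_size
    return (size + 15) // 16 * 16


def pack_offset_word(link_offset, runtime_block_offset):
    """Pack the second offset_sequence word.

    Upper half: offset of the symbol_table link in the linkage section.
    Lower half: offset of the runtime_block in the symbol block.
    """
    for value in (link_offset, runtime_block_offset):
        if not 0 <= value <= HALF_MASK:
            raise ValueError("offset does not fit in 18 bits")
    return (link_offset << HALF_BITS) | runtime_block_offset


if __name__ == "__main__":
    print(frame_size(100))                                # explicit size
    print(frame_size(temp_words=3))                       # default plus temps
    print(format(pack_offset_word(0o10, 0o1234), "012o"))  # 12 octal digits
```

Under these assumptions, a requested size of 100 words becomes 112, and a default frame with three words of temporaries becomes 80. The packed word is printed as twelve octal digits, the same form used by the oct pseudo operator in the examples above.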