Multics Technical Bulletin                                MTB-688
Multics C Impl. Spec.

To:       Distribution

From:     Douglas Howe

Date:     26 April 1985

Subject:  Multics C Implementation Specification

1.  Abstract

This document contains the specifications  required to bring up a
System V Release 2.0 compatible C  on Multics.  The C compiler on
Multics  will  be  as  Multics  compatible  as  possible  without
becoming incompatible with System V Release 2.0 C.(1)

Changes will be marked with change bars.                          |

Comments should be sent to the author:

     via Multics mail to:

        DGHowe.Multics

     via posted mail to:

        Douglas G.  Howe
        Advanced Computing Technology Centre
        Foothills Professional Building
        1620 29th St., N.W.
        Calgary Alberta Canada   T2N-4L7

     via telephone to:

        (403)-270-5400
        (403)-270-5437 (Howe)

     via forum on System-M to:

        >udd>m>DGHowe>mtgs_dir>c>c_imp (c)

_________________________________________________________________

Multics project  internal documentation; not to  be reproduced or
distributed outside the Multics project.

(1) Unix and System V Release 2.0 are registered trademarks of
    AT&T



                        TABLE OF CONTENTS

Section    Subject
=======    =======

1          Abstract
2          Preface
3          Introduction
3.1        . . Goal
3.2        . . References For This Document
4          Execution Environment
4.1        . . Stack Disciplines
4.2        . . Argument List Creation
5          Object Segment Format
5.1        . . Symbol Section
5.2        . . Statement Map
6          Entrypoints
6.1        . . Main Entrypoints
7          Calling Conventions
7.1        . . Calling C to C
7.2        . . Calling a Main Program
7.3        . . Calls from C to Non-C Procedures
7.4        . . Calls from Non-C Procedures to C Functions
8          Storage Allocation
9          Code Conventions
9.1        . . Forbidden Instructions
9.2        . . Use of Pointer Registers
9.3        . . Identifiers
10         General Information
10.1       . . Data Type Sizes
10.1.1     . . . . Conversion of Data Types



2.  Preface

This document defines the format of a C object segment on Multics
and describes how C programs should use pl1_operators_. This MTB,
MTB-647, and the other related MTBs are intended to supply most
of the documentation needed to implement the C compiler for
Multics.

We wish to thank those people  who have made this possible either
by  creating  tools  for  analysis or  through  input  of subject
matter.  These  people are Ron  Barstad, Greg Baryza,  Rick Gray,
Steve Herbst,  Dave Mason, Audrey  Neal, Tom Oke,  Doug Robinson,
Melanie Weaver and Brian Westcott.



3.  Introduction

3.1.  Goal

The goal of this project is to create a Multics Native C
compiler. This compiler will allow the porting of existing
software and the use of basic Multics tools. The compiler is to
be compatible with System V Release 2.0 while losing as little as
possible of Multics. It will be accompanied by the C runtime
library, with some routines redesigned to understand the Multics
environment. To accomplish this goal, the compiler will be
developed in two versions, defined as follows:

 I.  Demo Compiler
     This  version of  the compiler will  be used to  bring up C.
     This will  be done using an  alm(1) intermediate; C programs
     will be translated to alm  source code, and then compiled on
     Multics.   In the  initial transfer  of the  compiler a Unix
     system  will  be used  to  generate the  alm source.   It is
     intended that  this version will  be usable in  some form to
     allow third party software to be brought to Multics.

 II. Production Compiler
     This will be the first general release of the compiler and
     will be an extension of the demo compiler. It has not yet
     been decided whether the second version will still generate
     alm source or will generate object code directly. The second
     version should include some improvements in efficiency and
     will be able to use probe to some extent. The full
     definition of this version will be given at a later date.

_________________________________________________________________

(1) Assembler Language Multics



3.2.  References For This Document

1) MTB-647, by Greg Baryza.

2) The C Programming Language
   Kernighan, Brian W. & Ritchie, Dennis M.
   Prentice-Hall (1978)
   Englewood Cliffs, New Jersey

4) Multics Programmers Reference Manual (10.2 AG91-03A)
   (hereafter referred to as MPRM)

5) MTB-689, The C Runtime System on Multics, by Doug Howe.

6) MTB-691, The C External Execution Environment, by Doug Howe.

8) MTB entitled The Multics Link Editor, by Dean Elhard and Doug
   Howe.

9) MTB-707, C Required Changes To ALM Specification.



4.  Execution Environment

The execution  environment to be used  by the production compiler
on  Multics  will allow  the use  of Multics  user tools  such as
probe, trace and profile. It will be compatible with the current
PL/I environment.

The Multics  standard execution environment is  documented in the
MPRM Appendix H.

4.1.  Stack Disciplines

Like  most  other languages,  C will  use the  same stack  as the
Multics   command  environment   for  its   local  storage.   All
activities that  affect the size  of the stack,  such as pushing,
popping   and   extending  stack   frames,   will  be   done  via
`pl1_operators_'.

4.2.  Argument List Creation

In Multics, all calls that pass arguments should create a
structure defining where the arguments can be found and where a
set of descriptors defining their data types can be located. A
complete description can be found on page H-20 of the MPRM.

C has no runtime requirement for argument descriptors.  Therefore
argument descriptors will not be  included in the demo version of
the compiler.   These argument descriptors  will be added  to the
production version of the compiler as required for the support of
various Multics tools.

If required, a new descriptor structure and a  new method for the |
calling sequence will be designed.                                |
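
As an illustration only, the following C sketch models the idea
described above: the caller copies its by-value arguments into
temporaries and builds a list of argument addresses, with room
for a parallel list of descriptor addresses that the demo
compiler would leave empty. The structure and all names in it
(arg_list_sketch, MAX_ARGS, add_two, call_add_two) are
hypothetical and greatly simplified; the actual layout is the one
defined in the MPRM.

```c
#include <stddef.h>

#define MAX_ARGS 8   /* hypothetical limit, for this sketch only */

/* Simplified model of an argument list: counts, the addresses of
   the arguments, and the addresses of their descriptors.
   This is NOT the layout defined in the MPRM. */
struct arg_list_sketch {
    size_t arg_count;
    size_t desc_count;              /* 0 when no descriptors are passed */
    void *arg_ptrs[MAX_ARGS];
    const void *desc_ptrs[MAX_ARGS];
};

/* Callee: receives only the list and reads its arguments through it. */
static int add_two(const struct arg_list_sketch *al)
{
    const int *a = al->arg_ptrs[0];
    const int *b = al->arg_ptrs[1];
    return *a + *b;
}

/* Caller: copies the by-value arguments, then builds the list. */
int call_add_two(int x, int y)
{
    int copy_x = x;                 /* copying is done by the caller */
    int copy_y = y;
    struct arg_list_sketch al = {0};

    al.arg_count = 2;
    al.desc_count = 0;              /* demo compiler: no descriptors */
    al.arg_ptrs[0] = &copy_x;
    al.arg_ptrs[1] = &copy_y;
    return add_two(&al);
}
```

With these definitions, call_add_two(2, 3) passes the addresses
of two temporaries and receives their sum.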

5.  Object Segment Format

The  object  segments  generated by  the  C compiler  will  be in
Multics standard  format by default due  to the use of  alm as an
intermediate  language.   This  format  is  defined  in  the MPRM
Appendix G.  Declarations for  all structured items are included.
There are two  exceptions to the format:  the  Symbol Section and
the Statement Map.



5.1.  Symbol Section

Because alm is used as the intermediate language of the compiler,
C object segments will lack complete Symbol Section information
in the demo version. Complete Symbol Section information will be
added either as a function of the C compiler or as a series of    |
pseudo ops added to ALM (see MTB 707).                            |

5.2.  Statement Map

The production version of the compiler will produce a statement
map, either through pseudo ops in alm or through direct object
creation. The Statement Map will refer to the original source
segment. Macros will be seen in their non-expanded form.



6.  Entrypoints

All entrypoints (except for static and main_ -- see below) will
be defined as external entrypoints referring to the
`pl1_operators_' entry ext_entry to perform the stack set up.

All entries of functions that push their own stack frames must be
preceded  by the  structured information described  for the entry
sequence on page  G-3 of the MPRM.  This will  be generated by an |
ALM pseudo op as defined in MTB 707.                              |

6.1.  Main Entrypoints

Because of the Multics standard entry procedure, the C `main'
program would not be found by the standard searching method. For
this reason, and to provide a place for initial set up, C
programs containing a `main' program will have an added
entrypoint called `main_', as is currently done with Fortran. The
entrypoint `main' will be defined as an external entrypoint.

The  entrypoint main_  will have to  perform a  series of precise
functions.  These functions will be  fully defined in another MTB
entitled  The   C  External  Execution   Environment  (MTB  691).
Initially `main_'  will be a separate  program generated and link
edited with the main program.



7.  Calling Conventions

Copying of  arguments to be passed  by value will be  done by the
caller.  As  usual in Multics, if  the name of the  routine to be
called  contains  a "$"  it will  be  assumed to  be of  the form
segment_name$entry_name.
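
The "$" convention can be sketched in C as follows. This is a
minimal illustration, not the compiler's own code: the function
name split_external_name, its return codes and the use of
caller-supplied buffers are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Splits a name of the form segment_name$entry_name at the first "$".
   Returns 1 if a "$" was found, 0 if the name has no "$" (the whole
   name is copied to entry and seg is left empty), and -1 if either
   part does not fit in a buffer of cap characters. */
int split_external_name(const char *name, char *seg, char *entry,
                        size_t cap)
{
    const char *dollar = strchr(name, '$');
    size_t seg_len;

    if (cap == 0)
        return -1;
    if (dollar == NULL) {
        if (strlen(name) >= cap)
            return -1;
        seg[0] = '\0';
        strcpy(entry, name);
        return 0;
    }
    seg_len = (size_t)(dollar - name);
    if (seg_len >= cap || strlen(dollar + 1) >= cap)
        return -1;
    memcpy(seg, name, seg_len);     /* text before the "$" */
    seg[seg_len] = '\0';
    strcpy(entry, dollar + 1);      /* text after the "$" */
    return 1;
}
```

For example, the name "hcs_$initiate" would be split into the
segment name "hcs_" and the entry name "initiate".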

There  are four  different situations that  involve calls.  These
are:   calls  from  one  C function  to  another,  calls  to main
programs, calls from  C to non-C procedures and  calls from non-C
procedures to C functions.

7.1.  Calling C to C

A  call  from  C to  C  will be  done  directly with  the  use of
`pl1_operators_'.   The  types  of   the  arguments  will  be  as
described in MTB 647.

7.2.  Calling a Main Program

C programs will be callable in two ways: through `main_', which
expects its arguments in the standard Multics command processor
format, and through `main', which expects its arguments in the
standard C argc, argv format. The normal entry sequence for a C
program will be via the command processor linking to `main_'.
Within an execution unit, calls to main will be resolved to the
standard C entry `main'. Although both entrypoints are accessible
to the user, it will remain the user's responsibility to ensure
that the correct values are passed as parameters.
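
The difference between the two argument formats can be sketched
as below. The actual duties of `main_' are defined in MTB 691;
this example only assumes, for illustration, that each command
argument arrives as an address and a length with no terminating
NUL, and shows how an argc, argv pair could be built from such
arguments. The type cmd_arg and the function build_argv are
hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one command argument: address and length,
   with no terminating NUL character. */
struct cmd_arg {
    const char *ptr;
    size_t len;
};

/* Builds a C-style argv in caller-supplied storage. argv[0] is the
   program name, argv[1..nargs] are NUL-terminated copies of the
   arguments, and argv[nargs + 1] is NULL. Returns argc, or -1 if
   the storage supplied is too small. */
int build_argv(const char *progname, const struct cmd_arg *args,
               size_t nargs, char **argv, size_t argv_cap,
               char *buf, size_t buf_cap)
{
    size_t name_len = strlen(progname);
    size_t used, i;

    if (argv_cap < nargs + 2 || name_len + 1 > buf_cap)
        return -1;
    memcpy(buf, progname, name_len + 1);
    argv[0] = buf;
    used = name_len + 1;

    for (i = 0; i < nargs; i++) {
        if (args[i].len + 1 > buf_cap - used)
            return -1;
        memcpy(buf + used, args[i].ptr, args[i].len);
        buf[used + args[i].len] = '\0';     /* add the C terminator */
        argv[i + 1] = buf + used;
        used += args[i].len + 1;
    }
    argv[nargs + 1] = NULL;
    return (int)(nargs + 1);
}
```

A wrapper in the position of `main_' could then call `main' with
the resulting argc and argv.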

7.3.  Calls from C to Non-C Procedures

C  will be  able to  call non-C  functions if  the non-C function
being called understands the data  types being passed to it.  For
this reason  only pointers and  some basic arithmetic  data types
will be compatible with non-C languages.



7.4.  Calls from Non-C Procedures to C Functions

Non-C  functions  will  be able  to  call  C if  the  C functions
understand the data types being  passed to them.  For this reason
only  pointers  and  some  basic arithmetic  data  types  will be
compatible with C.

8.  Storage Allocation

C will follow the Multics standard for the allocation of its
variables. The only exception to this standard is due to the
definition of C external variables. The normal C environment
defines external variables on a per-execution-process basis,
while Multics external variables exist on a per-login-process
basis. For this reason C external and static variables will be
allocated as normal external variables, but the execution unit
will be expected to be linked as defined in MTB 691.
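
The following sketch shows why the lifetime of external variables
matters to a C program. It is an illustration under stated
assumptions only: run_program stands for one execution of a
program, and reset_externals is a hypothetical stand-in for
whatever re-initialization the linking scheme of MTB 691
provides.

```c
/* An external variable that a C program expects to start at its
   initial value every time the program is run. */
int invocation_count = 0;

/* Stands for the work done by one execution of the program. */
int run_program(void)
{
    invocation_count = invocation_count + 1;
    return invocation_count;
}

/* Hypothetical: restores the initial values, as a C program
   expects at the start of each execution. */
void reset_externals(void)
{
    invocation_count = 0;
}
```

If the variable lived for the whole login process, a second run
would see the value left by the first; after reset_externals the
program again sees the initial value, which is the behaviour a C
program expects.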



9.  Code Conventions

Multics has a  few conventions that must be  followed.  The major
conventions are listed in the following paragraphs.

9.1.  Forbidden Instructions

C will not use any alm instructions that may become obsolete in
future releases of Multics.

9.2.  Use of Pointer Registers

Pointer  registers  are widely  used  in Multics  because  of the
segmented  address  space.   Everything  outside  of  the current
segment  must be  addressed via  a segment  number.  Some pointer
registers have defined uses:

- PR6 should always point to the current stack frame.

- PR0   is   set   by    the   operators   for   programs   using
  `pl1_operators_'.  It  points to the  `pl1_operators_' transfer
  vector  except during  a call, when  it points  to the argument
  list.  The  following instruction can  be used to  reset PR0 to
  the  transfer vector:   epp0 pr7|28,*  where PR7  points to the
  base of the stack.

- PR4  is  usually  used  when a  pointer  to  the linkage/static
  section is needed.   The entry operators store it  in the stack
  frame,  so  it can  be reloaded  by the  following instruction:
  epp4 pr6|36,*

- PR7 points to  the base of the stack segment  when a program is
  entered.  It may  be reused by the program  and reloaded by the
  instruction epbp7 pr6|0,*

`pl1_operators_'  does  not  save  the  values  of  other pointer
registers across calls.



9.3.  Identifiers

In the demo version of the compiler, variable names will likely
have a maximum length of 32 characters. A name must begin with a
letter or an underscore, followed by any sequence of letters,
digits, underscores or $ characters. An identifier that contains
a single $ will be taken to represent a Multics external
identifier. The following BNF style grammar defines a variable
name.

 <character>      ::= a|b|c|d|.....|y|z|A|B|C|D|......|Y|Z|_
 <character str>  ::= <character> | <character> <character str>
 <number>         ::= 0|1|2|3|.....|8|9 | <number> <number>
 <identifier>     ::= <character>[ <character str>| <number>| "$"]*

External    Identifiers     on    Multics    have     the    form
"segname$entry_name" where:

 <segname> ::= <character>[<character str>| <number>]*
 <entry_name> ::= <identifier>
where []* means zero or more.
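
The grammar above can be checked with a short C routine such as
the following. This is a sketch only: the names
classify_identifier, ident_kind and MAX_IDENT_LEN are
hypothetical, the 32 character limit is the likely limit
mentioned above, and names containing more than one $ are simply
reported as plain identifiers because their treatment is not
specified here.

```c
#include <ctype.h>
#include <string.h>

#define MAX_IDENT_LEN 32    /* likely limit in the demo compiler */

enum ident_kind { IDENT_INVALID, IDENT_PLAIN, IDENT_EXTERNAL };

/* Classifies a name according to the grammar: a letter or
   underscore, followed by letters, digits, underscores or "$". */
enum ident_kind classify_identifier(const char *s)
{
    size_t len = strlen(s);
    size_t dollars = 0;
    size_t i;

    if (len == 0 || len > MAX_IDENT_LEN)
        return IDENT_INVALID;
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return IDENT_INVALID;

    for (i = 1; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c == '$')
            dollars++;
        else if (!(isalnum(c) || c == '_'))
            return IDENT_INVALID;
    }
    /* Exactly one "$" marks a Multics external identifier. */
    return dollars == 1 ? IDENT_EXTERNAL : IDENT_PLAIN;
}
```

For example, "count_1" is classified as a plain identifier,
"ioa_$nnl" as an external identifier, and "1abc" as invalid.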



10.  General Information

10.1.  Data Type Sizes

At the time of this writing, the following sizes are proposed for
the basic data types:

Type            Size                              Alignment
====            ====                              =========
short int       36 or 18 bits                     word or half word
int             36 bits                           word
long int        72 bits                           double word
unsigned int    36 bits                           word (1)
unsigned long   72 bits                           double word (2)
char             9 bits / char                    word
float           36 bits ( 8 bit exponent,
                         28 bit mantissa)         word
double          72 bits ( 8 bit exponent,
                         64 bit mantissa)         double word
pointer ITS     72 bits                           double word
pointer packed  36 bits                           word

In the  demo version of the  compiler short int types  will be 36
bits long and  will be word aligned.  Hopefully,  short ints will
be 18 bits  long and half word aligned  in the production version
of the compiler.

_________________________________________________________________

(1) This is a change from MTB 647

(2) This is a change from MTB 647
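
The proposed sizes can be written down as data and used to work
out where successive variables would fall. The sketch below is an
illustration only: the names type_info, proposed_types, align_up
and layout_bits are hypothetical, short int is given its demo
size, and the rule that each variable is placed at the next
offset satisfying its alignment is an assumption, not a statement
of how the compiler will lay out storage.

```c
#include <stddef.h>

struct type_info {
    const char *name;
    unsigned size_bits;
    unsigned align_bits;    /* 18 half word, 36 word, 72 double word */
};

/* The sizes proposed in the table above. */
const struct type_info proposed_types[] = {
    { "short int (demo)", 36, 36 },
    { "int",              36, 36 },
    { "long int",         72, 72 },
    { "unsigned int",     36, 36 },
    { "unsigned long",    72, 72 },
    { "char",              9, 36 },
    { "float",            36, 36 },
    { "double",           72, 72 },
    { "pointer (ITS)",    72, 72 },
    { "pointer (packed)", 36, 36 },
};

/* Rounds a bit offset up to the next multiple of align_bits. */
unsigned align_up(unsigned offset_bits, unsigned align_bits)
{
    return (offset_bits + align_bits - 1) / align_bits * align_bits;
}

/* Places each item at the next suitably aligned offset and returns
   the offset just past the last item, in bits. */
unsigned layout_bits(const struct type_info *items, size_t n)
{
    unsigned offset = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        offset = align_up(offset, items[i].align_bits);
        offset += items[i].size_bits;
    }
    return offset;
}
```

Under these assumptions a char followed by a long int occupies
144 bits: the char takes 9 bits, the long int starts at the next
double word boundary (bit 72) and ends at bit 144.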



10.1.1.  Conversion of Data Types

Conversion of C Pointers will be handled as follows.

1. The size of a pointer in C will be 72 bits.
2. Conversion of a value of zero in an int will lead to a null
   pointer, that is, a pointer value of -1|1.
3. Conversion of  int to pointer  or pointer to int  will be done
   via the pack and unpack pointer instructions.
4. No conversion will  take place on the passing  or receiving of
   pointers as parameters.
5. Conversion of pointers  to long ints or long  ints to pointers
   will be done directly on a bit to bit relationship.
6. Conversion of a null pointer will  lead to an integer value of |
   zero.                                                          |
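
Rules 2, 3 and 6 can be sketched in C as follows. This is a model
for illustration, not the generated code: the type mx_ptr and the
functions are hypothetical, a 36 bit int is held in the low bits
of a 64 bit integer, and the packed layout used here (6 bit bit
offset, 12 bit segment number, 18 bit word offset) is an
assumption about the packed pointer format that is not stated in
this document.

```c
#include <stdint.h>

/* Hypothetical model of an unpacked pointer: segno|wordno(bitno). */
struct mx_ptr {
    int segno;
    uint32_t wordno;
    uint32_t bitno;
};

/* The null pointer, -1|1. */
static const struct mx_ptr MX_NULL = { -1, 1, 0 };

int mx_is_null(struct mx_ptr p)
{
    return p.segno == -1 && p.wordno == 1;
}

/* Pointer to int: a null pointer gives zero (rule 6); any other
   pointer is packed into 36 bits (rule 3, assumed layout). */
uint64_t ptr_to_int(struct mx_ptr p)
{
    if (mx_is_null(p))
        return 0;
    return ((uint64_t)(p.bitno & 077) << 30)
         | ((uint64_t)((unsigned)p.segno & 07777) << 18)
         | (uint64_t)(p.wordno & 0777777);
}

/* Int to pointer: zero gives the null pointer (rule 2); any other
   value is unpacked (rule 3, assumed layout). */
struct mx_ptr int_to_ptr(uint64_t v)
{
    struct mx_ptr p;

    if (v == 0)
        return MX_NULL;
    p.bitno = (uint32_t)((v >> 30) & 077);
    p.segno = (int)((v >> 18) & 07777);
    p.wordno = (uint32_t)(v & 0777777);
    return p;
}
```

In this model a pointer such as 234|1000 (octal) survives a round
trip through an int, while zero and the null pointer convert to
each other.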