Multics Technical Bulletin                                MTB-634
DM:  Shutdown

To:       Distribution

From:     Lee A. Newcomb

Date:     10/11/83

Subject:  Data Management:  System Shutdown

1 ABSTRACT

     A data management system should be shut down when the Multics
system it is running on is shut down.  This allows data management
to be  made available more  quickly to users in  the next Multics
bootload by  avoiding crash recovery.   It also gives  some extra
insurance that users' protected files are consistent.  The reader
should note  that some hardcore  changes are required  to support
new inter-process signals (IPS) and the required static handlers.

Comments should be sent to the author:

via Multics forum:
   >udd>Multics>Spratt>meetings>DMS_Development

via Multics Mail:
   Newcomb.Multics on System M or
   LNewcomb.Multics on MIT Multics.

via US Mail:
   Lee A. Newcomb
   Honeywell Information Systems, Inc.
   4 Cambridge Center
   Cambridge, Massachusetts 02142

via telephone:
   (HVN) 261-9332, or (617) 492-9332

_________________________________________________________________

Multics  project  internal  working  documentation.   Not  to  be
reproduced or distributed outside the Multics project without the
consent of the author or the author's management.



                            CONTENTS

                 1 Abstract
                 2 Introduction
                 3 Basic Shutdown Steps
                 4 Start Shutdown and Warn Users
                    4.1 Mark DMS State
                    4.2 Stop New Transactions
                    4.3 Warn Users
                    4.4 Set Daemon Timer
                 5 User Shutdown
                    5.1 Mark DMS State
                    5.2 Signal User Processes
                    5.3 Set Daemon Logout Timer
                 6 Final Shutdown
                 7 Problems
                    7.1 Invalidating User DMS References
                    7.2 IPS' (inter-process signals) Ignored
                    7.3 Hardcore Changes to support new IPS'


2 INTRODUCTION

     The  major objective  of Data  Management is  to keep users'
protected  files  consistent.   To  this end,  there  are several
termination  mechanisms  employed  to recover  from  user process
deadlocks,  a Multics  system crash  (with or  without ESD), etc.
The shutdown of a running Data Management System (DMS) is an
extra guarantee that protected files are correct.  Just as
important, users will see a DMS available more quickly after a
Multics bootload since crash recovery, which can be very time
consuming, is avoided.

     DMS  shutdown  will  normally  occur just  prior  to Multics
system shutdown.  There are also occasions when a system
administrator may wish to shut down a DMS without taking down
Multics.  For example, there may be some priority jobs that must
run and do not use any protected files.  It is possible to shut
down the DMS until these jobs are finished and then bring it back
up.  It is expected these occasions will be very rare; however,
this capability is easy to add to Data Management and also
facilitates development testing.

     It must be remembered that DMS shutdown is not an absolute
necessity; the DMS crash recovery mechanism will put protected
files back in order.  If DMS shutdown does complete, recovery at
the next DMS bootload will have nothing to do.  If shutdown does
not complete, recovery will still roll back any transactions left
over.  Whatever part of shutdown does complete means less crash
recovery time and faster availability on the next bootload.

     The reader is assumed to be familiar with the MTBs covering
the initialization and recovery of a DMS:  MTB-508, MTB-592, and
MTB-603.

3 BASIC SHUTDOWN STEPS

     Following  are the  basic steps  in the  shutdown of  a Data
Management System.  These will be discussed in detail later.  The
objective  is  to  have  no  transactions  in  progress  when the
caretaker  Daemon  of the  DMS logs  out, implying  all protected
files and before journals are closed (and therefore consistent).

    o  The DMS state is set to "shutdown warning"; no new
       transactions are allowed to begin.  The
       "dm_shutdown_warning_" inter-process signal (IPS) is sent
       to the current users of the DMS to warn them that the DMS
       is shutting down and that there is a finite amount of time
       to finish their work.


    o  When the  time limit for  users to finish  is reached, the
       DMS    state   is    set   to    "user   shutdown".    The
       "dm_user_shutdown_" IPS is sent  to any remaining users of
       the DMS.   The default action  for user processes  in this
       case will be to call transaction_manager_$user_shutdown to
       finish any active transactions,  close all protected files
       and journals, and invalidate their per-process DM data.

    o  When all transactions have been finished, the DMS state is
       set to "normal shutdown" and the Daemon logs out.  At this
       time,  all protected  files in this  system are consistent
       and crash recovery will do nothing on the next bootload of
       DMS.
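
     The three steps above amount to a one-way progression of the
DMS state.  The following Python sketch is illustrative only and
is not DMS code.  The state strings are the ones quoted above,
except "running", which is an assumed name for normal operation;
the names DmsState, NEXT_STATE, and advance are hypothetical.

```python
from enum import Enum


class DmsState(Enum):
    RUNNING = "running"  # assumed name; the bulletin does not name this state
    SHUTDOWN_WARNING = "shutdown warning"
    USER_SHUTDOWN = "user shutdown"
    NORMAL_SHUTDOWN = "normal shutdown"


# Each state has at most one successor; shutdown never moves backward.
NEXT_STATE = {
    DmsState.RUNNING: DmsState.SHUTDOWN_WARNING,
    DmsState.SHUTDOWN_WARNING: DmsState.USER_SHUTDOWN,
    DmsState.USER_SHUTDOWN: DmsState.NORMAL_SHUTDOWN,
}


def advance(state):
    """Return the state that follows `state` in the shutdown sequence."""
    if state not in NEXT_STATE:
        raise ValueError("no state follows " + repr(state.value))
    return NEXT_STATE[state]
```

     Keeping the successor table separate from the state names
makes it plain that "normal shutdown" is terminal.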

4 START SHUTDOWN AND WARN USERS

     At some time, someone or something decides a running DMS
should be shut down and informs the DMS' caretaker Daemon.  It is
anticipated this will normally occur when it is decided to shut
down the Multics system running the DMS.  There will also be an
administrative interface to allow a privileged user to start DMS
shutdown.

4.1 Mark DMS State

     The current  state of the  DMS (in dm_system_data_)  must be
set to "shutdown warning".  This  is in case the caretaker Daemon
dies  before completing  the shutdown  tasks.  A  new Daemon will
note this state  and pick up the shutdown  work instead of trying
to continue normal DMS operation.

4.2 Stop New Transactions

     No  new  transactions  will  be  started  once  shutdown has
started.        This       is      enforced       by      calling
transaction_manager_$begins_off  to  set  a  global  flag  in the
current  DM  system.   This  does not  prohibit  currently active
transactions from continuing.
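
     The effect of the global flag can be sketched as follows.
This is a toy Python model, not the real transaction_manager_.
Only the entry name begins_off comes from this bulletin; the
class, its fields, and the begin_txn and finish_txn methods are
hypothetical.

```python
class ToyTransactionManager:
    """Toy stand-in for transaction_manager_; all names but begins_off are assumed."""

    def __init__(self):
        self.begins_allowed = True  # the global flag
        self.active = set()         # ids of transactions in progress

    def begins_off(self):
        """Refuse all future begins; active transactions are untouched."""
        self.begins_allowed = False

    def begin_txn(self, txn_id):
        """Start a transaction; return False if begins are turned off."""
        if not self.begins_allowed:
            return False
        self.active.add(txn_id)
        return True

    def finish_txn(self, txn_id):
        """Finish (commit or abort) an active transaction."""
        self.active.discard(txn_id)
```

     Note that begins_off only changes the flag, so a transaction
already in progress can still run to completion.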


4.3 Warn Users

     Send  the "dm_shutdown_warning_"  inter-process signal (IPS)
to all users  of the current DMS.  The  default static handler in
the user ring for this IPS reports to the user the amount of time
remaining  to  finish  a  transaction  before  DMS  user shutdown
actually  occurs.   If  the  process  does  not  have  an  active
transaction, the static  handler will act as if  the user's grace
time has expired.  See the "USER SHUTDOWN" section below.
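
     The decision made by the default static handler might look
like the following Python sketch.  It is a model only:  the
function name, its arguments, the returned action strings, and
the message text are all assumptions made for illustration.

```python
def on_shutdown_warning(has_active_txn, seconds_remaining):
    """Model the default static handler for dm_shutdown_warning_.

    Returns an (action, message) pair; both values are illustrative.
    """
    if not has_active_txn:
        # Nothing to finish: behave as if the grace time has expired.
        return ("user_shutdown", None)
    message = "DMS is shutting down; %d seconds remain" % seconds_remaining
    return ("report", message)
```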

4.4 Set Daemon Timer

     The DMS caretaker Daemon then sets a timer to wake itself up
when the user  grace time is over to force  users out of the DMS.
This  will  be the  DMS shutdown  time for  users, not  the final
shutdown time.

5 USER SHUTDOWN

     When  the  Daemon's timer  for user  shutdown goes  off, all
active transactions must be aborted  or abandoned (this allows us
to  use the  normal rollback  procedures for  shutdown instead of
writing  new  ones).   In addition,  all  users of  the  DMS must
invalidate  their  references to  DMS per-system  and per-process
data.   This is  mainly to  avoid segment  faults in  the DM ring
(which is  generally lower than  a user's login ring)  if the DMS
bootload  directory is  deleted (expected  to be  the most common
case).

5.1 Mark DMS State

     The current state of the  DMS (in dm_system_data_) is set to
"user shutdown".   Again, this is  in case the  current caretaker
Daemon dies and a new one must pick up the shutdown work.


5.2 Signal User Processes

     The Daemon sends the "dm_user_shutdown_" IPS to all users of
the DMS.  The default static handler in the user ring for this
signal will call the new program
transaction_manager_$user_shutdown.  If the user has an active
transaction, this entry will call
transaction_manager_$abandon_txn so the Daemon may roll back the
transaction using the currently existing code for this function.
In addition, the user_shutdown entry will invalidate the user's
DMS per-process data (e.g., dm_data_, lm_data_) and references to
per-system tables, and terminate the Data Management ring
transfer vectors.  This termination allows the user to use a new
DMS if one is booted again in this Multics bootload; without it,
the user must new_proc.  Shutting down the DMS while Multics
stays up is expected to be rare and may only be used in
development and testing.
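
     The work of the user_shutdown entry can be sketched in
Python as below.  This is a model under stated assumptions, not
the planned implementation:  the ToyUserProcess class and its
fields are hypothetical stand-ins for the per-process state, and
the abandoned_txns list stands in for whatever the Daemon uses to
find abandoned transactions.

```python
class ToyUserProcess:
    """Hypothetical per-process view of the DMS."""

    def __init__(self, active_txn=None):
        self.active_txn = active_txn        # id of the active transaction, if any
        self.per_process_data_valid = True  # stands in for dm_data_, lm_data_
        self.per_system_refs_valid = True   # references to per-system tables
        self.transfer_vectors_known = True  # DM ring transfer vectors


def user_shutdown(process, abandoned_txns):
    """Model of the user_shutdown steps described above."""
    if process.active_txn is not None:
        # Abandon the transaction so the Daemon can roll it back later.
        abandoned_txns.append(process.active_txn)
        process.active_txn = None
    # Invalidate everything that refers to this DMS bootload.
    process.per_process_data_valid = False
    process.per_system_refs_valid = False
    process.transfer_vectors_known = False
```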

5.3 Set Daemon Logout Timer

     The Daemon now sets a timer for when it is to log out.  This
is much like the user warning timer:  it defines the finite
amount of time the Daemon has to clean up transactions abandoned
by users.  This may be unnecessary if shutdown is occurring as
part of Multics shutdown.
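
     The relation between the user grace timer of section 4.4 and
the logout timer can be shown with a short Python sketch.  The
function name and the two interval parameters are hypothetical;
this bulletin does not say how either interval is chosen.

```python
from datetime import datetime, timedelta


def shutdown_schedule(start, user_grace, daemon_cleanup):
    """Return (user_shutdown_at, daemon_logout_at) for a shutdown begun at `start`."""
    # Users get `user_grace` to finish after the warning is sent.
    user_shutdown_at = start + user_grace
    # The logout timer is set only once user shutdown has begun.
    daemon_logout_at = user_shutdown_at + daemon_cleanup
    return user_shutdown_at, daemon_logout_at
```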

6 FINAL SHUTDOWN

     When  there  are  no more  users  of the  DMS  bootload, the
caretaker Daemon  will mark the  DMS state as  "normal shutdown".
It      will      then      call      the      new      procedure
dm_dir_$old_bootload_dir_disposition  to either  rename or delete
the  current bootload  directory just  as the  DMS crash recovery
mechanism would when it is finished.

     When the above two steps are finished, the Daemon will log
out.  It may log out without doing any of the above if forced by
the Multics operator.
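
     The ordering of the final steps can be summarized in a small
Python sketch.  It is a model only:  the function and its
arguments are hypothetical, and the actions are returned as
descriptive strings rather than performed.

```python
def final_shutdown(user_count, forced_by_operator=False):
    """Return the ordered actions the Daemon takes, as illustrative strings."""
    if forced_by_operator:
        # The operator may force a logout that skips the other steps.
        return ["log out"]
    if user_count > 0:
        # Users of this DMS bootload remain; final shutdown has not begun.
        return []
    return [
        'mark state "normal shutdown"',
        "call dm_dir_$old_bootload_dir_disposition",
        "log out",
    ]
```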


7 PROBLEMS

7.1 Invalidating User DMS References

     Users  who  do not  have  active transactions  must  also be
notified  that the  DMS is shutting  down so  they may invalidate
their per-process  data and references to  per-system tables, and
terminate  references  to  the   Data  Management  ring  transfer
vectors.  This is only a concern  when Multics is not going down,
just the DMS,  and it is expected that the  DMS will be re-booted
within  this Multics  bootload.  If  a user  has references  to a
previous DMS (now inactive), the user's process will take segment
faults in  the Data Management  ring if attempts are  made to use
the shut down DMS.

     There are  several options in  this case.  One  method is to
follow the  scheme presented in the  main description of shutdown
above.  This is more work for the Daemon and requires more coding
effort, but  is easier for  users and for  booting multiple DMS's
within the same Multics invocation for development testing.

     The most  convenient solution for development  is to only be
concerned  with  users  having  active  transactions.   Since DMS
shutdown  will usually  coincide with  Multics shutdown,  the DMS
will not be  re-booted to cause segment faults  in the inner ring
for a  user without an  active transaction at  DMS shutdown time.
If the DMS is re-booted, a warning  could be sent to all users to
new_proc or call the user_shutdown entry in transaction_manager_.

     It would  also be possible  to handle segment  faults in the
inner  ring  code,  but  the  faults  would  require considerable
analysis.  There are better ways to use our time.

7.2 IPS' (inter-process signals) Ignored

     It is possible for the user to mask the
"dm_shutdown_warning_" and "dm_user_shutdown_" IPS'.  In this
case, the user process may never receive the shutdown warning or
call the user_shutdown entry.  This is an unavoidable problem
with a signal-based scheme.  The Daemon certainly cannot wait
forever for the user process to respond.  As a first step, the
Daemon will simply ignore the fact that an active transaction
still exists when the time comes for it to log out; recovery at
the next DMS bootload will roll the transaction back.


     An  alternative  solution  is  for the  caretaker  Daemon to
forcibly  take  over  any  transaction not  abandoned  by  a user
process  which ignores  the "dm_user_shutdown_"  IPS.  The Daemon
would process those transactions given up voluntarily first, and
then take over any left over.  In addition, users without
transactions, but  with DMS per-system  bootload tables initiated
must   be   "kicked   out".    This   all   amounts   to  running
transaction_manager_$user_shutdown for a user (I wonder if we can
charge extra for this?).  This will require some modifications to
transaction_manager_  to be  able to take  over transactions; and
will  probably  also increase  the Daemon's  working set  to keep
pointers to all users' per-process data in the DM ring.  There is
a minor advantage to this:  force takeover could allow for future
handling of transaction  timeouts by the Daemon on  an active DMS
to ease holding of before journals, deadlocks, etc.
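
     The processing order under the takeover alternative can be
sketched in Python.  The function name and the (txn_id,
abandoned_by_user) pair format are assumptions for illustration;
nothing here reflects the actual layout of transaction_manager_
data.

```python
def takeover_order(transactions):
    """Order transaction ids for the Daemon under the takeover scheme.

    `transactions` is a list of (txn_id, abandoned_by_user) pairs.
    Voluntarily abandoned transactions come first, followed by those
    the Daemon must take over by force; input order is kept within
    each group.
    """
    voluntary = [txn for txn, abandoned in transactions if abandoned]
    forced = [txn for txn, abandoned in transactions if not abandoned]
    return voluntary + forced
```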

     Another solution is to force the user to log out, which will
cause the Daemon to be notified that the TDT entry for the
process needs cleaning out.  This is a rather drastic measure
that only matters if the DMS is being shut down, Multics will
stay up, and the DMS will be re-booted later in the same Multics
invocation.  If the forced takeover of user transactions and
forced invalidation of user DMS data is used, this method is
unnecessary.  If required, a temporary interface with the
Initializer to destroy the process could be created.

7.3 Hardcore Changes to support new IPS'

     This is not strictly a problem, just a subtask of the
strategy presented above.  Two new IPS need to be created and
static handlers written to take care of the signals.  In
addition, some programs that deal with the character
representation of the IPS names must be modified (e.g., sys_info,
create_ips_mask_), and the default static handlers must be set up
in all stacks greater than the DM ring (except when the DM ring
IS the user's login ring); this requires modifications to
make_stack_.

     It is also being proposed that the four character limitation
on the name  of an IPS be expanded to  32 characters.  This gives
us  the  ability  to name  the  IPS' according  to  function more
clearly  and make  them self-documenting.  Instead  of the "dmw_"
and "dms_"  signals with the  four character restriction,  we get
"dm_shutdown_warning_" and "dm_user_shutdown_".

     The system programs dealing with IPS are limited enough that
the above changes should not be hard to do.
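
     The effect of the proposed name-length change can be checked
with a few lines of Python.  The constant and function names are
hypothetical; the limits and the four signal names are the ones
given above.

```python
OLD_NAME_LIMIT = 4   # current limit on IPS names, in characters
NEW_NAME_LIMIT = 32  # proposed limit


def fits(name, limit):
    """Return True if an IPS name is no longer than `limit` characters."""
    return len(name) <= limit
```

     Under the old limit only "dmw_" and "dms_" fit; under the
proposed limit the two descriptive names fit as well.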