Multics Technical Bulletin                                MTB-487
DM: Medium Range Goals

To:       Distribution
From:     André Bensoussan
Date:     03/26/81
Subject:  Data Management: Medium Range Goals

1 PURPOSE

The long range goals for Multics Data Management have been documented in MTB 481. A subset of these goals has been selected to be the medium range goals for Multics Data Management, using the following guidelines:

o  The resulting system must be attractive to users with small and medium size data bases. It should be extendable to support very large data bases without changing most of its existing parts.

o  The selected features must be implementable within a 2 year period, to be included in one of the MR10 releases around the end of 1982.

This memo represents a common agreement between the Phoenix group and the Cambridge group as to what our medium range goals should be.

_________________________________________________________________

Multics project internal working documentation. Not to be reproduced or distributed outside the Multics project without the consent of the author or the author's management.

Comments should be sent to the author:

via Multics Mail:
   Bensoussan.Multics on System M.

via US Mail:
   André Bensoussan
   Honeywell Information Systems, inc.
   4 Cambridge Center
   Cambridge, Massachusetts 02142

via telephone:
   (HVN) 261-9334, or (617) 492-9334

CONTENTS

1 Purpose
2 Atomicity
3 Reliability, Recovery
4 Performance
5 File Implementation
6 Capacity
7 Concurrency Control
8 Work Involved

2 ATOMICITY

o  Start, commit and abort transaction will be implemented.

o  Checkpoint/Restart will also be provided.
A checkpoint in the middle of a transaction will allow a subsequent roll back of the transaction to stop at the checkpoint mark, and the transaction to resume its execution at the checkpoint mark rather than at the beginning. It differs from a commit in that the transaction is not finished, the locks are not released and the work done is not committed.

3 RELIABILITY, RECOVERY

An optional recovery mechanism will be provided.

A before image journal will be kept and will be used to roll back aborted transactions in order to preserve the integrity of the data base in the following cases:

-  Quit not followed by start
-  Transaction abort
-  Process abort
-  System abort with successful ESD.

An after image journal will be kept and will be used to reconstruct the data base in the following cases:

-  ESD failure
-  Media failure

The reconstruction process will start with a saved copy of the data base to which it will apply all modifications recorded in the after image journal, up to the last committed transaction.

The recovery facility should be optional. A user who does not want it should be able to turn it off.

4 PERFORMANCE

Our goal for performance is stated in terms of virtual CPU time as well as number of I/O's.

In terms of virtual CPU time, MRDS, without journalization, must perform 5 times as fast as in MR6.5.

In terms of I/O's, one should be able to perform a simple update with fewer than 10 input-output operations on data base pages and control structures, in a balanced configuration.

Reaching this goal will require improving MRDS itself, as well as the lower layers supporting it. The new interface that will be presented to MRDS, as a replacement for vfile, will provide an array of Access Methods, and will be better adapted to MRDS needs. It should contribute to improving the performance as a whole.

5 FILE IMPLEMENTATION

The medium range goals do not include the large file implementation.
Files will still be MSF's but they will not be directly accessible to programs that manage their content. All accesses will be made through a get and put interface, which will completely hide the way the file is implemented. Above this interface, the fact that a file is implemented as a single segment, an MSF, or any other construct, or even the fact that the file resides in a different computer, is immaterial.

The get and put interface will assign segment numbers to MSF components, address them using segment number/offset, and will move the requested data to (or from) the user area. Since segment numbers assigned to components are never passed over the interface, the get and put module can reassign the same segment number to different components, as long as it keeps track of the assignment.

Since roll back will never be performed after an ESD failure (at least in the medium range plan), it is acceptable to let page control write modified pages to disk at any time, without any danger of compromising the integrity of the data base.

6 CAPACITY

Some of the current MRDS limitations, such as the number of tuples per relation, come from the current vfile implementation and will be relaxed by the introduction of the new access methods. Some other limitations, such as the number of relations per data base and the number of attributes per relation, come from MRDS itself and will be relaxed by MRDS modifications that will be documented in a separate MTB.

The capacity limitations due to MSF's will still be present; that is, the maximum size for a single file will remain 1 billion bytes. However, the maximum number of segments known to a process will no longer be a limitation associated with MSF implementation: the get and put interface will be capable of multiplexing segment numbers among components since it never passes segment numbers back to its callers.
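The multiplexing of segment numbers described above can be illustrated with a small sketch. It is written in Python for convenience and is not the planned implementation: the class name GetPutFile, the tiny component size, the least-recently-used reassignment policy and the in-memory byte arrays standing in for MSF components are all assumptions made for the example. The only property taken from the text is that callers address the file by byte offset and never see a segment number, so the module is free to reassign one.

```python
from collections import OrderedDict

COMPONENT_SIZE = 16  # bytes per component; deliberately tiny (assumption)


class GetPutFile:
    """Toy file made of fixed-size components, reached only through get/put."""

    def __init__(self, n_components, max_known_segments):
        # Backing store standing in for the MSF components (hypothetical).
        self._components = [bytearray(COMPONENT_SIZE) for _ in range(n_components)]
        self._free_segnos = list(range(max_known_segments))
        # component index -> segment number, least recently used first
        self._known = OrderedDict()
        # segment number -> storage currently addressed through that number
        self._address_space = {}
        self.reassignments = 0

    def _segno_for(self, component):
        """Return a segment number for the component, reassigning one if needed."""
        if component in self._known:
            self._known.move_to_end(component)
            return self._known[component]
        if self._free_segnos:
            segno = self._free_segnos.pop()
        else:
            # Reuse the number held by the least recently used component.
            _, segno = self._known.popitem(last=False)
            self.reassignments += 1
        self._known[component] = segno
        self._address_space[segno] = self._components[component]
        return segno

    def _pieces(self, offset, length):
        """Split a byte range into (component, offset in component, count)."""
        size = len(self._components) * COMPONENT_SIZE
        if offset < 0 or length < 0 or offset + length > size:
            raise ValueError("byte range outside the file")
        while length > 0:
            component, within = divmod(offset, COMPONENT_SIZE)
            count = min(length, COMPONENT_SIZE - within)
            yield component, within, count
            offset += count
            length -= count

    def put(self, offset, data):
        """Move data from the caller's area into the file."""
        done = 0
        for component, within, count in self._pieces(offset, len(data)):
            segno = self._segno_for(component)
            # Address the component by segment number / offset.
            self._address_space[segno][within:within + count] = data[done:done + count]
            done += count

    def get(self, offset, length):
        """Move data from the file into the caller's area."""
        out = bytearray()
        for component, within, count in self._pieces(offset, length):
            segno = self._segno_for(component)
            out += self._address_space[segno][within:within + count]
        return bytes(out)


# Four components, but only two segment numbers available at any time.
f = GetPutFile(n_components=4, max_known_segments=2)
f.put(10, b"hello world, multics")   # spans components 0 and 1
f.put(50, b"xyz")                    # component 3: forces a reassignment
print(f.get(10, 20), f.reassignments)
```

In this sketch the file has more components than the two segment numbers it is allowed to hold, yet every get and put succeeds, because the table of assignments is private to the module.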
7 CONCURRENCY CONTROL

The Multics supervisor will provide optional lock and unlock primitives, with deadlock detection.

After evaluation of physical vs logical locking, logical locking has been chosen. This means that MRDS and the new Access Methods will be responsible for defining the objects to be locked, their lock identifiers, the granularity and the hierarchy of locks, and will call the supervisor primitives to perform the locking.

The concurrency control facility should be optional. A user who does not want it should be able to turn it off.

8 WORK INVOLVED

o  Get and put interface to the storage system.

o  Access Methods: vfile will be replaced by a new interface to a set of Access Methods providing file, record, index and hash table management and whatever other functions the data base manager(s) need. This new Access Method interface will use the get and put interface to access the data base. It will not be called "vfile" and will not attempt to produce files which are bit by bit identical to those produced by the current vfile. A facility will be provided to convert old files into new files. The current vfile will still be present in the system but will no longer be used by MRDS.

o  Transaction manager

o  Checkpoint manager

o  Lock manager

o  Recovery manager

   -  Saved data base
   -  After image journal
   -  Before image journal

o  MRDS

   o  Must be modified to use the new Access Methods interface. This is expected to be a major project; but the work involved cannot properly be evaluated at this point in time because no description of the new interface is available yet. The Access Methods interface must be well specified prior to MRDS starting the coding changes to use it.

   o  The new concurrency control may require MRDS to provide a new user interface.

   o  Some of the current limitations on the size of relations being supported will be relaxed.
      However, the support of very large data bases will probably require a complete rework of the methods used to do some of the relational operations such as joins.

   o  A facility will be provided to convert old data bases into new ones.

o  Documentation
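As an illustration of how the before image journal of section 3 and the checkpoint of section 2 fit together, the following sketch may help. It is written in Python and is only a toy under stated assumptions: the data base is a plain dictionary, the class and method names (Transaction, put, checkpoint, rollback_to_checkpoint) are invented for the example, and locking and the after image journal are left out entirely.

```python
class Transaction:
    """Toy transaction over a dict, with a before image journal (hypothetical)."""

    _MISSING = object()  # before image of a key that did not exist yet

    def __init__(self, database):
        self._db = database
        self._journal = []      # (key, before image) pairs, oldest first
        self._checkpoint = 0    # journal position of the latest checkpoint mark
        self.active = True

    def put(self, key, value):
        # Record the before image first, then modify the data base in place.
        self._journal.append((key, self._db.get(key, self._MISSING)))
        self._db[key] = value

    def checkpoint(self):
        # Only a mark: the transaction is not finished and nothing is committed.
        self._checkpoint = len(self._journal)

    def _undo_to(self, position):
        # Apply before images newest first until the journal is cut back.
        while len(self._journal) > position:
            key, before = self._journal.pop()
            if before is self._MISSING:
                self._db.pop(key, None)
            else:
                self._db[key] = before

    def rollback_to_checkpoint(self):
        # Roll back stops at the mark; the transaction can resume from there.
        self._undo_to(self._checkpoint)

    def abort(self):
        # Roll back all the way to the start of the transaction.
        self._undo_to(0)
        self._checkpoint = 0
        self.active = False

    def commit(self):
        # The work is kept; the before images are no longer needed.
        self._journal.clear()
        self._checkpoint = 0
        self.active = False


db = {"a": 1}
t = Transaction(db)
t.put("a", 2)
t.put("b", 3)
t.checkpoint()
t.put("a", 9)
t.rollback_to_checkpoint()   # undoes only the work done after the mark
print(db, t.active)
```

The sketch shows the distinction drawn in section 2: a checkpoint leaves the transaction active and its earlier work uncommitted, while abort undoes everything and commit discards the journal.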