MULTICS TECHNICAL BULLETIN                                   MTB-666

From:     W. Olin Sibert
Date:     July 4, 1984
Subject:  New Logging Facilities
To:       MTB Distribution

ABSTRACT

This MTB describes a new facility for collecting and displaying information in log files. Because logged information is critical to system integrity and security, the mechanism must be as robust as possible. It must also operate in hostile environments, such as the ring zero supervisor and system initialization. Consequently, it is implemented using very simple techniques for managing log contents, and with minimal dependence on other system facilities.

This MTB covers the following topics:

* Overview
* Organization of new log segments
* New interfaces for perusing logs
* Log message format
* Replacing the existing syserr mechanism
* Subroutine interfaces for log messages
* Appendix A: MR11 SRB Notice
* Appendix B: Info files
* Appendix C: Summary of changes to existing programs
* Appendix D: Differences from prototype implementation

The initial result will be replacement of the existing syserr log mechanism. The new primitives are designed, however, so that any other application (such as the Internet) requiring logged information can easily be converted to use them.

Comments should be sent to the System-M forum meeting:

   >udd>Multics>Sibert>logging>logging

or via Multics mail to Sibert -at System-M.

_________________________________________________________________
Multics Project internal working documentation. Not to be reproduced or distributed outside the project without consent of the author or Director, Multics Development Center.

INTRODUCTION:

Why do we need a new set of log mechanisms? Well, for starters, we already have two mechanisms, incompatible with each other. There is a "general-purpose" log mechanism, used principally by the Answering Service and Message Coordinator, and another used solely for recording syserr messages (messages produced by the supervisor and other privileged code).
The new mechanism ultimately will replace both of the existing log mechanisms, though for MR11.0, only the syserr log is due to be replaced. The new mechanism has many advantages when compared to either of the current ones. Problems with the current mechanisms are detailed in the next section; succeeding sections describe various aspects of the new mechanism.

The important advantages of the new mechanism are:

1) Greatly improved robustness. It is intended that it will be impossible to damage new format logs in such a way as to cause the log perusal tools or the log subroutines to malfunction catastrophically. Of course, in the case of damage, information will be lost, but the system will continue to operate.

2) Reduced storage usage. The new logs require less storage for overhead information than either of the current formats.

3) Improved efficiency. The new mechanism can write messages faster than the existing syserr mechanism, which is important for handling the higher volume of messages from increased B-2 security auditing.

4) Improved log reading tools. The new log printing command, print_sys_log, allows more selection options than either of the current ones. It is integrated with the new log monitoring command, monitor_sys_log, so that the syserr log can now be monitored. The performance of syserr log printing is also greatly improved, by elimination of a message copying step.

5) Simple application interface. The new log subroutine interfaces are easily used in arbitrary application environments.

6) Better support of binary data in messages. The ability to specify an interpretation procedure for binary messages means that non-system applications will be able to create their own binary data message types and interpret them using standard software. Binary data may now be up to 16K words long.

7) Lockless operation.
New format logs are manipulable entirely without locks, making them usable in any environment without worrying about going blocked, etc.

General Purpose Logs:

The general-purpose log mechanism is very limited in its capabilities: log messages are fixed in size, limited to ASCII strings, and the only attributes they have are a time stamp and a severity. This format of log is also very easily confused by file system damage (pages of zeros), although because the log messages are fixed format, a page of zeros can only do a limited amount of damage.

A general-purpose log consists of a family of segments, each of which records the name of the previous member of the family. Like the messages themselves, the segments are fixed in size and format: each one holds precisely 2048 messages, and, when it overflows, a new segment is automatically created.

By convention, the segments for any particular log are found in two directories: one directory where the "live" log segment is, as well as any that have been created since the last time they were copied, and another directory to which filled log segments (except for the live one) are moved, once per day, by the crank. If the crank is not used to copy filled log segments, they continue to accumulate indefinitely in the first directory.

General-purpose logs are used solely by user-ring programs. They are simply segments in the directory hierarchy, and vulnerable to all the types of damage that the directory hierarchy can sustain.

The Syserr Log:

The syserr log is handled by a completely separate mechanism. The ring zero supervisor, and certain other privileged programs, need the ability to write messages that may be of interest to system administrators and maintenance personnel. Rather than simply writing all such messages to the operator console, they are written to a per-system log called the syserr log (possibly in addition to being written to the console).
Because the supervisor runs in many differently restricted program environments, it is important that it be able to create and log syserr messages in any of those environments. Additionally, because the supervisor must be able to run without the presence of an operational file system and directory hierarchy, the syserr log must exist outside the hierarchy so it can record messages during system initialization.

These goals are achieved by having each syserr message travel through three different places before finally coming to rest. When the supervisor calls syserr, the message is placed into a wired buffer segment in ring zero (syserr_data), and optionally written to the console.

Because the wired buffer is small, and also because it does not correspond to any permanent location on disk, as soon as a message is written to the wired buffer, a wakeup is sent to a hardcore process (one that runs only in ring zero) called the SyserrLogger daemon. Upon receiving the wakeup, the daemon copies the message from the wired buffer into a ring zero paged segment (syserr_log).

The syserr_log segment is specially treated during system initialization. Unlike most other paged ring zero segments, it has a permanent home on disk, and persists across bootloads. Rather than creating it during initialization, the supervisor merely constructs a page table describing this permanent region of disk (the LOG partition) and makes it accessible in ring zero. This is done early in Collection 2 initialization; prior to this point, no syserr messages are logged, and those that would have been logged are written to the console instead.

Once in the syserr_log segment, the message stays until it is copied out into the permanent syserr log, a keyed vfile called >sc1>perm_syserr_log. This copying is performed by the Initializer: at system startup, at system shutdown, and whenever the syserr_log segment gets to be more than a settable percentage full.
Although messages are copied as quickly as possible from the wired buffer, they may stay in the syserr_log for a long time; this is done to avoid awakening the Initializer for every single message.

The syserr log perusal tools must therefore be able to read messages from both the ring zero segment and from the keyed vfile. Since these have two very different formats, a utility program, syserr_log_util_, exists to search for and read messages as if they were in only one place.

It is important that the syserr log be as available and robust as possible, because it is used to record important messages about system damage, and for security auditing. Unfortunately, despite this requirement, it is actually a very fragile thing. The ring zero syserr_log is easily damaged by a non-ESD crash, and although a program is automatically run to inspect for such damage, the damage is not fixed, and the program will not necessarily even find some types of damage. Similarly, the perm_syserr_log vfile is very vulnerable: any file system damage can make the vfile index wholly unusable, and there are no salvaging tools available.

Messages in the syserr log are more complex than those in general-purpose logs. Syserr messages can be arbitrarily long, and additionally can contain an arbitrary amount of binary data, which can be interpreted by the log printing commands. The binary data is used to avoid the necessity of translating information (such as pathnames) into a printable representation at the time the message is written, since that may not even be possible in the environment where the message is generated.

Syserr messages also have a sequence number, in addition to a timestamp. The sequence number is supposed to be unique for the life of the system, and monotonically increasing; in practice, however, this laudable goal is compromised by the inability to detect and repair damage to the sequence numbers kept in the ring zero syserr_log segment.
All knowledge of how to interpret binary syserr messages is kept in a single program (print_syserr_msg_), which makes it awkward to add new types of binary messages; consequently, this part of the facility is little used outside the file system part of the supervisor.

STRUCTURE OF NEW LOGS:

The new logging mechanism is designed so that it can be used by syserr, as well as by other applications requiring logs of events. It provides all the features of the present syserr mechanism, and has some additional features for other applications. The new mechanism makes great improvements in robustness, and should solve the reliability problems caused by syserr today.

There are two main levels of structure in the new mechanism: log segments, the lower level, and families of log segments, the higher level. The two levels are distinguished primarily by the environments in which they can run: the low level interface can run in any environment, any ring, and does not require a file system. The higher level requires a file system, and can only run outside the supervisor.

Log Segments:

The basic unit of a new log is the log segment. Each log segment is a self-contained collection of log messages, and a header, identifying the log and its contents. The subroutines dealing with contents of individual log segments(1) can all be run in any environment.

A single log segment can be updated concurrently by multiple processes. In order to eliminate the need for explicit locking, concurrent updating is handled by a mechanism that uses STACQ to simultaneously reserve space in the segment and assign a message sequence number. Once the space has been reserved, the caller can fill it in as desired.

The lockless updating strategy guarantees that any message in the log will have a sequence number greater than that of any message preceding it in the segment.(2) This monotonic increase of sequence numbers and storage allocation is the only thing guaranteed by the primitives.
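The reservation scheme described above can be sketched as a compare-and-swap retry loop. The Python below is only an illustration under stated assumptions: AtomicCell, LogSegment, reserve, and demo are hypothetical names, not part of log_segment_, and the internal lock in AtomicCell merely emulates the atomicity that a conditional-store instruction such as STACQ provides in hardware. It also assumes that the next free offset and the next sequence number are kept together, so that one conditional store updates both.

```python
import threading


class AtomicCell:
    """Stand-in for a memory cell updated by a conditional store.

    The lock below only emulates hardware atomicity for this sketch;
    it is not part of the log design, which uses no locks.
    """

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        # Store `new` only if the cell still holds `expected`.
        with self._guard:
            if self._value != expected:
                return False
            self._value = new
            return True


class LogSegment:
    """Hypothetical model of the allocation part of a log segment header."""

    def __init__(self, size_words, first_sequence=1, header_words=0):
        self.size_words = size_words
        # (next free word offset, next sequence number), updated as one unit.
        self.alloc = AtomicCell((header_words, first_sequence))

    def reserve(self, n_words):
        """Return (offset, sequence), or None if the message does not fit."""
        while True:
            old = self.alloc.load()
            offset, sequence = old
            if offset + n_words > self.size_words:
                return None
            new = (offset + n_words, sequence + 1)
            if self.alloc.compare_and_swap(old, new):
                return offset, sequence
            # Another writer got in first; retry with the fresh value.


def demo(n_threads=4, per_thread=200, words=3):
    """Reserve space from several threads and return the sorted results."""
    seg = LogSegment(size_words=n_threads * per_thread * words)
    results = []
    results_guard = threading.Lock()  # protects only the demo's result list

    def writer():
        mine = [seg.reserve(words) for _ in range(per_thread)]
        with results_guard:
            results.extend(mine)

    threads = [threading.Thread(target=writer) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seg, sorted(results)
```

Sorting the reservations by offset yields strictly increasing sequence numbers, which is the one ordering property the primitives promise. Nothing in the sketch orders the time stamps that callers fill in after reserving space.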
Note in particular that messages may not appear in the log in correct time sequence, because the caller supplies the time after the message storage is reserved. Also, the time and sequence limits in the header may be slightly inaccurate. Any errors are assumed to be small, however, and the log_search_ routine is coded to assume sufficient slop.

_________________________________________________________________
(1) log_segment_, log_search_, log_initialize_, and some entrypoints in log_salvage_ and log_wakeup_
(2) Unless one of log_segment_$create_message_number or log_write_$general was used to write the message; however, these entrypoints are intended for use only when a single process knows that it is the only one updating the log segment.

A log segment contains a flag indicating whether it is currently "in service" or not. This flag is used to accommodate the higher level message writing interfaces; when a log segment is found to be full, it is taken out of service, and a new segment created. Whenever a log segment is initialized, it must explicitly be placed "in service" before the low level primitives will reserve message space within it.

When log_segment_ is called to reserve space for a message, it returns an error code if the log segment is damaged, out of service, or if there is no room to allocate the message. It is then up to the caller to handle this condition, and this is what the log segment family interfaces do.

To be used with the low-level subroutines, a log "segment" need not be a whole segment, or begin at offset zero in a segment, although all the higher level subroutines do enforce that restriction. An application (such as the syserr replacement) can take advantage of this by using small "logs" as a temporary home for messages before copying them into a log occupying a full segment.

An individual log segment may be salvaged, if damaged, and this will be performed automatically where appropriate.
In general, if there is any valid information available in the log segment, it will be found and recovered. Sentinels are used in log messages to allow the salvager to locate messages as reliably as possible. Each log segment is a distinct entity in terms of salvaging; no information is required from outside the segment in order to perform the best possible reconstruction.

Contents of a Log Segment:

Each log segment contains a header, followed by a block of unstructured space containing sequentially allocated messages. The header of the log segment contains the following items:

* Offset of next free word within the log segment, and next sequence number to be used.

* A flag indicating whether this log is currently in use.

* Name of the log segment family to which this segment belongs; this is not necessarily the entryname of the segment in the file system (see below).

* Sequence numbers of the first and last messages in this log segment; if the log is being updated by multiple processes, this information may be slightly inaccurate.

* Time stamps for the first and last messages in this log segment; as with sequence numbers, this may be slightly inaccurate.

* List of up to 25 processes "listening" for new messages to be placed in the log, and a minimum time interval between wakeups to avoid excess wakeups.

* Pathname of previous log segment in this family; this is set when a log segment fills or is migrated. This is only meaningful when the log segments occupy whole segments.

Header information is copied from the old to the new log segment whenever a log segment fills up and is replaced with a new one.

Families of Log Segments:

When a log segment fills, another segment must be created to contain new messages.
In all but the syserr environment, this is done with file system operations: the current log segment is renamed, a new one is created, the header and control information is copied, and the new one is placed in use. In the syserr environment, this operation is somewhat more complex, but effectively does the same thing.

Like old style logs, each log segment contains (in the header) its own name, and the name of the previous segment in the family. When a log segment fills, it is renamed and a new log segment is created; the headers of both logs are updated to record the names appropriately. Older (non-live) log segments have a date/time suffix of the form YYYYMMDD.HHMMSS, which gives the time that the log segment was taken out of service and replaced by an empty one. The time is calculated in GMT to avoid time zone problems. Because of this suffix (which is 16 characters long, counting the period that separates it from the name), the original name of a log must not be more than 16 characters long.

A family of log segments consists of one or more "live" segments at the beginning (usually only one, except for syserr, where there may be two), and zero or more "history" segments. The "history" segments all must have a name consisting of the log family name followed by a date/time suffix. The "live" segments are identified by pointer, and their names in the file system are not used; however, the "family name" in the header of the live segments must be the name of the log segment family.

The names for the "history" segments are the ultimate arbiter of their positions in history, and those names must not be changed. When reading messages from a log family, all log segments with appropriately constructed names are sorted into chronological order and searched in that order. This makes the history mechanism robust against damage that may have destroyed some segment in the middle of the history; while the contents of that particular segment will be lost, previous messages will still be easily accessible.
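The naming rule for history segments, and the way a reader can order them using names alone, might look like the following. This is a minimal sketch, not the log_name_ or log_list_history_ implementation: history_name and sort_history are hypothetical helpers, and the 16-character limit is taken from the text above.

```python
import re
from datetime import datetime, timezone

MAX_FAMILY_NAME = 16  # limit stated in the text; the suffix uses the rest


def history_name(family, retired_at):
    """Build <family>.YYYYMMDD.HHMMSS from a timezone-aware datetime.

    The time is converted to GMT (UTC) before formatting.
    """
    if len(family) > MAX_FAMILY_NAME:
        raise ValueError("log family name longer than 16 characters")
    stamp = retired_at.astimezone(timezone.utc).strftime("%Y%m%d.%H%M%S")
    return f"{family}.{stamp}"


def sort_history(entry_names, family):
    """Return the history segment names of `family`, oldest first.

    Names that do not have the exact <family>.YYYYMMDD.HHMMSS form are
    ignored, so a missing or damaged member simply drops out of the list.
    """
    pattern = re.compile(re.escape(family) + r"\.(\d{8})\.(\d{6})")
    matched = []
    for name in entry_names:
        m = pattern.fullmatch(name)
        if m:
            matched.append((m.group(1) + m.group(2), name))
    return [name for _, name in sorted(matched)]
```

Because the suffix is fixed-width and most-significant-first, sorting the digit strings gives chronological order without parsing the dates.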
Because the names of the segments themselves are used when searching the history, the only use for the previous log pathname in the log segment header is to identify directories where earlier segments may be found. The previous log pathname is also used when locating newer logs when reading from a log family after the initial opening.

Subroutine Interfaces for Log Families:

The primary interfaces for families of log segments are log_read_ and log_write_. These are user-ring programs that read and write messages using the appropriate primitives, and switch between segments as necessary.

The log_write_ subroutine is responsible for renaming old log segments and creating new ones as they fill. The log_read_ subroutine is responsible for searching through families of log segments and switching between them as messages are read.

When a log segment fills and is replaced by a new one, a pointer to the old segment remains valid; however, log_read_ will not automatically update its knowledge of the segments that make up a log family to find the new segment. As long as the first segment in the family is still current, log_read_ will be able to find the newest message in that segment.

There is also a log_migrate_ interface which is used by the administrative tools to copy whole log segments from one directory to another, updating the previous log pathname as it does so.

NEW INTERFACES FOR LOG READING:

There is a new command interface for reading logs, and a new subroutine interface as well. The new command interface is called "print_sys_log", and is based on the existing print_syserr_log and print_log commands. The subroutine interface is called log_read_, and is similar to the old syserr_log_util_ interface. Both the print_syserr_log command and the syserr_log_util_ subroutine are eliminated by this installation.

The print_sys_log command is described in Appendix B, which contains its info file.
The chief differences are some renamed control arguments (the old ones are still accepted for compatibility), a -reverse control argument to print messages in reverse chronological order, and some additional control over the expansion of messages.

NOTE: As of this writing (84-06-07), no mechanism has been designed for specifying explicitly the directories where members of a log family may be found, although it is clear that some mechanism to do this is essential in order to deal with log segments copied from other systems, or simply from elsewhere in the hierarchy. A mechanism will be specified in the next revision of the MTB; it will consist of appropriate control arguments for print_sys_log and another entrypoint for log_read_.

The log_read_ interface is described in the subroutines section, below. It requires that the family of log segments be "opened" and "closed", and provides entrypoints to search for individual messages by time or sequence number, as well as for stepping sequentially through the messages. In order to avoid copying messages, it returns pointers directly into the log segments; the data pointed to should never be modified.

Callers of the log_read_ interface must be able to interpret the message structure stored in the log. Messages may be formatted for printing by the format_log_message_ subroutine; this includes performing the standard formatting for any binary data in the messages.

Messages containing user-defined binary data can be interpreted by providing a format_XXXX_log_msg_ subroutine, which will automatically be called by format_log_message_. The XXXX in the subroutine name is the "data_class" value from the sys_log_message structure; it may be up to ten characters long.

FORMAT AND CONTENTS OF LOG MESSAGES:

A message in a log segment, or copied out by a log-reading program, has the following declaration (declared in sys_log_message.incl.pl1).
The contents of a message should never be altered except between calls to the $create_message and $finish_message entrypoints in log_segment_; fields marked with "(*)" should not be altered then, either.

   dcl 1 sys_log_message_header aligned based,
         2 sentinel bit (36) aligned,                     (*)
         2 sequence fixed bin (35),                       (*)
         2 severity fixed bin (8) unal,
         2 data_class_lth fixed bin (9) unsigned unal,    (*)
         2 time fixed bin (53) unal,
         2 text_lth fixed bin (17) unal,                  (*)
         2 data_lth fixed bin (17) unal,                  (*)
         2 process_id bit (36) aligned;

   dcl 1 sys_log_message aligned based (sys_log_message_ptr),
         2 header aligned like sys_log_message_header,
         2 text unal char (sys_log_message_text_lth refer (sys_log_message.text_lth)),
         2 data_class unal char (sys_log_message_data_class_lth refer (sys_log_message.data_class_lth)),
         2 data aligned dim (sys_log_message_data_lth refer (sys_log_message.data_lth)) bit (36);

Elements of sys_log_message structure:

header
Every log message is word-aligned, and begins with this six-word header.

sentinel
This is a flag used only by the log software itself to mark the beginning of a valid message. It is set when the message is created and should never be altered. Along with the time stamp, it is used for consistency checks.

sequence
This is the sequence number assigned to the message when it was allocated in the log. It is set when the message is created, and should never be altered.

severity
This is a severity value for the message; its meaning is up to the subsystem writing the messages. It can be used in combination with the -severity argument of print_sys_log to select specific groups of messages.

time
This is the clock reading at the time the message originated. This must be set by the caller of log_segment_$create_message, because the message may be added to a log significantly later than when it was initially formatted.

text_lth
This is the length, in characters, of the text portion of the message.
It must not be zero; every message should have a text portion.

data_lth
This is the length, in words, of the binary portion of the message. If there is no binary portion, this must be zero; in that case, the data_class should be blank and the data_type must be zero also.

process_id
This is the process_id of the process generating the message. It can be used to identify processes at a later time.

text
This is the text (printable) portion of the log message.

data_class
This is a ten-character (or shorter) field used to identify the message formatting procedure for the binary data in this message. When the print_sys_log command is used to display the message, it will look for a procedure called format_XXXX_log_msg_ to format binary messages. Valid values for this field depend on the application; see print_sys_log.info for a list.

data
This is the binary (non-printable) portion of the log message. It must be interpreted by a special format procedure (see data_class, above) before it can be printed.

REPLACEMENT OF SYSERR:

As stated in the introduction, the immediate goal of this project is replacement of the existing syserr log mechanism. The existing syserr log mechanism is unreliable, and has been a common source of system problems (including, in its worst forms, total inability to boot for no apparent reason).

New Syserr Log Structure:

The new syserr log will consist of two log segments and one data segment, all kept in the LOG partition, and a new directory, >sc1>syserr_log, where older log segments will be kept. The three new segments will replace the existing >sl1>syserr_log segment, and the >sc1>syserr_log directory will replace the >sc1>perm_syserr_log MSF.

The two new log segments, >sl1>syserr_log_laurel and >sl1>syserr_log_hardy, will be filled and emptied alternately. When either one fills, it will be copied wholesale into >sc1>syserr_log, and log messages will start being placed into the other one.
If both become full before either can be copied and emptied, the oldest will be re-used, and a log overflow will occur, just as it does today. The data segment, >sl1>syserr_log_data, will describe the current state of the syserr log segments.

Additionally, the logs will be swapped at system initialization and shutdown, and, if desired, whenever a specified interval elapses, in order to ensure that the copy in the hierarchy is as up-to-date as possible. There will not be an analogue for the log copy threshold of today, however: log copying will simply take place whenever one of the two log segments is full; that is, when the log partition itself is half-full.

Syserr Interface Changes:

This conversion will involve some incompatible changes, in the interests of overall better interfaces. Because syserr is a deeply buried part of the system, the effects on system code are relatively minor and easily located. The same should be true of any user code that references syserr messages as well; an informal poll asking about site-written syserr log scanning tools, taken in March 1984, received only one response (from AFDSC). The hardcore changes are even easier to isolate.

No changes will be made to the standard syserr interface. The syserr$binary interface will be changed to include a data_class and data_type in its calling sequence, and the eleven programs(1) that reference it will be changed to use the new calling sequence.

NOTE: As of this writing, the new syserr$binary interface has not yet been specified. Subroutine documentation for it will appear in the next revision of this MTB (which, I imagine, will be the first time syserr has ever been documented. *sigh*).

Syserr initialization and the syserr daemon will be changed to accommodate the new format of syserr messages.
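The alternation between the two syserr log segments described under "New Syserr Log Structure" can be modeled in a few lines. The sketch below is a toy under stated assumptions: PairedLog and its methods are hypothetical names, Python lists stand in for the log segments and for the >sc1>syserr_log directory, and capacity is counted in messages rather than words.

```python
class PairedLog:
    """Toy model of two log segments that are filled and emptied alternately."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.segments = [[], []]        # stand-ins for the two log segments
        self.live = 0                   # index of the segment taking messages
        self.pending = [False, False]   # full, waiting to be copied out
        self.history = []               # stand-in for the history directory
        self.overflows = 0

    def write(self, message):
        if len(self.segments[self.live]) >= self.capacity:
            self._swap()
        self.segments[self.live].append(message)

    def _swap(self):
        full, other = self.live, 1 - self.live
        self.pending[full] = True
        if self.pending[other]:
            # Both are full and neither has been copied: re-use the oldest,
            # losing its messages. This is the log overflow case.
            self.overflows += 1
            self.segments[other] = []
            self.pending[other] = False
        self.live = other

    def copy_out(self):
        """Copy any full segment into the history and empty it."""
        for i in (0, 1):
            if self.pending[i]:
                self.history.append(list(self.segments[i]))
                self.segments[i] = []
                self.pending[i] = False
```

For example, with a capacity of two, writing a third message switches to the second segment and leaves the first waiting for copy_out; if copy_out is not called before the second segment also fills, the first is re-used and its two messages are lost.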
The wired buffer will be changed to have a two-part copying scheme just like the one used for the paged buffer: it will be divided into two (tiny) log segments, and one will be filled while the other is being copied out by the daemon. The daemon will still be invoked for every message, however, and will not bother to wait until one of the wired log parts is full.

The Answering Service log copier will be changed to support the new log formats; mostly, this means making the whole thing simpler.

Because the format of syserr messages changes (it becomes the sys_log_message format), the dozen or so programs that reference syserr_message.incl.pl1(2) will be changed to use the new sys_log_message data structure. This is probably the most significantly incompatible change, because there may be site-specific programs that reference this structure as well. To make this more apparent in the field, the names of the syserr_log_util_ entrypoints will be changed so that existing unconverted programs will get linkage errors.

_________________________________________________________________
(1) activate, disk_control, hardware_fault, ioi_log_status, mos_memory_check, ocdcm_, page_error, salvage_pv, scavenge_volume, scavenger, verify_lock
(2) azm_syserr_, daily_syserr_process, display_cpu_error, fnp_data_summary, heals_collect_data_, heals_cpu_reports_, io_error_summary, mos_edac_summary, mpc_data_summary, print_syserr_log, print_syserr_msg_, syserr_log_util_

Converting the Existing Syserr Log:

In order to avoid losing valuable information, a tool will be provided to convert the existing perm_syserr_log vfile into a family of segments in the >sc1>syserr_log directory. It can be run in admin mode immediately after installation of the new syserr software. If this is not done, that directory will be created automatically, and the contents of the perm_syserr_log will be lost.
The ring zero syserr log will be converted automatically during bootload, and, so long as the previous shutdown was clean and successful, no messages will be lost, because the perm_syserr_log vfile is updated at shutdown. The syserr log initialization for the new mechanism will treat the presence of an old-format log partition precisely as it would have treated an empty partition: the partition will be completely reinitialized.

Performance of New Syserr Log:

The new log mechanism will have significantly higher bandwidth than the old one, because of its simpler copying mechanism. The copying mechanism will also be runnable by any process with the necessary access, so this function can be given to the Utility SysDaemon, and, more importantly, can easily be restarted if problems occur.

The new syserr log mechanism is more storage-efficient than the old one. Each syserr message in the new logs has a six-word header. In the current system, the messages in the LOG partition have a seven-word header. The keyed vfile_ in the current system has a six-word header for each message, as well as the overhead for the keys themselves, an additional seven words per message (plus 4K words of fixed vfile_ overhead for the whole vfile_). This means that the new log segments should require about 20 to 30 percent less storage than the existing vfile_.(1)

_________________________________________________________________
(1) Thanks to Gary Dixon for providing these calculations

LOW LEVEL SUBROUTINES

The following subroutines and include files make up the low level interface for log segments. They are the only subroutines that perform any manipulations of the log segment headers. Except as noted, they are all usable from ring zero, wired environments.

sys_log_message.incl.pl1
This include file describes the format of an individual log message. It is used by all programs reading and writing logs, not just the low level subroutines.
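As a cross-check on the message format that sys_log_message.incl.pl1 declares, the header size can be derived from the field widths given in the declaration. The sketch below is an illustration, not system code: it assumes 36-bit words, 9-bit characters, that an unaligned fixed bin (n) field occupies n+1 bits (n bits for the unsigned one), and that aligned fields start on a word boundary. HEADER_FIELDS, layout, and message_words are hypothetical names.

```python
WORD_BITS = 36
CHARS_PER_WORD = 4  # assumes 9-bit characters

# (field name, width in bits, starts on a word boundary?)
HEADER_FIELDS = [
    ("sentinel", 36, True),
    ("sequence", 36, True),         # fixed bin (35), aligned
    ("severity", 9, False),         # fixed bin (8) unal
    ("data_class_lth", 9, False),   # fixed bin (9) unsigned unal
    ("time", 54, False),            # fixed bin (53) unal
    ("text_lth", 18, False),        # fixed bin (17) unal
    ("data_lth", 18, False),        # fixed bin (17) unal
    ("process_id", 36, True),
]


def layout(fields):
    """Return ({name: bit offset}, total size in words)."""
    offsets = {}
    bit = 0
    for name, width, aligned in fields:
        if aligned and bit % WORD_BITS:
            bit += WORD_BITS - bit % WORD_BITS  # pad to the next word
        offsets[name] = bit
        bit += width
    words = -(-bit // WORD_BITS)  # round up to whole words
    return offsets, words


def message_words(text_lth, data_class_lth, data_lth):
    """Words occupied by one message: header, character part, binary part."""
    _, header_words = layout(HEADER_FIELDS)
    chars = text_lth + data_class_lth
    char_words = -(-chars // CHARS_PER_WORD)  # binary data is word-aligned
    return header_words + char_words + data_lth
```

Under these assumptions the header comes to six words, and a text-only message of ten characters occupies nine words in all.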

sys_log_info.incl.pl1
This include file describes the format of a log segment. All header information is described here, with the exception of the allocation information, which is declared only in log_segment_. Most high-level programs have no need to include this file, but instead should rely on low level programs to extract the necessary information.

Error codes: the following new error_table_ codes are required to support the new log primitives. Details of when they may occur will be supplied later, but a brief description is supplied for each one.

error_table_$log_segment_full
"The log segment is full" -- when attempting to add a message and there is no room to do so.

error_table_$log_segment_damaged
"The log segment is damaged" -- any operation may return this when it discovers a problem it can't deal with. Some operations will do an automatic salvage, however.

error_table_$log_out_of_service
"The log segment is not currently in service" -- when attempting to write to a log segment that hasn't had its "in-service" bit turned on. Used to synchronize creations of new log segments.

error_table_$no_log_message
"The specified log message does not exist" -- when attempting to position to a log message that isn't there.

log_segment_
This is the subroutine that creates messages, and performs miscellaneous manipulations of the log segment header. This is the ONLY subroutine that can interpret the space and sequence number allocation information in the header of a log segment. A message is created in a log segment by calling either the create_message or create_message_number (used to assign a specific sequence number) entrypoint to reserve space for the message, filling in the message contents, and then calling the finish_message entrypoint to mark the message as completed.

log_initialize_
This subroutine initializes the headers of empty log segments.
It can either initialize the header from scratch, or copy the relevant header information from another log segment. It calls log_segment_ to set up the allocation information, but does all other initializations itself. log_search_ This subroutine searches within a single log segment for a message at or near the specified sequence number or time. It is used by the log reading primitives to do positioning. log_wakeup_ This subroutine manages the list of 25 "registered" wakeup recipients for the log. It also contains the entrypoints called to deliver the wakeups, one for use from ring zero, and one for use outside. log_salvage_ This subroutine performs consistency checks and salvages on a single log segment. Because it embodies all the knowledge of log segment salvaging, it has separate entrypoints specifying how the repairs are to be announced (if at all), some of which can be used only outside ring zero. Subroutine Outlines MTB-666 HIGH LEVEL SUBROUTINES The following subroutines and include files constitute the high level interface to log families. They can run only outside of ring zero, because they reference the file system and make calls to timer_manager_$sleep. log_read_write_data.incl.pl1 This include file declares the structures used to record "openings" of log families for reading and writing. It is included ONLY by log_read_, log_write_, and their wholly-owned subsidiaries. log_read_ This is the application interface for log reading. It contains entrypoints to open and close a log family, search for specific messages, and step sequentially through the messages in a log family. To "read" a message, log_read_ returns a pointer to the message text in the log segment. This is the interface responsible for keeping track of the entire history of a log family. log_write_ This is the application interface for log writing. It contains entrypoints to open and close a log family, and to write text-only and binary messages into the log. 
This is the interface responsible for writing messages into log segments and creating new ones when the old ones fill up. log_create_ This subroutine creates log segments. It is used only by log_write_ and the syserr applications. It can either create a new log segment from scratch, or create one with attributes identical to an existing segment. log_initiate_ This subroutine initiates log segments. It is used only by log_read_ and log_write_. It is responsible for waiting (briefly, for a caller-specified delay) until a newly-created log segment (created by another process, that is) has been initialized and placed in service. log_name_ This subroutine constructs names for the history segments, of the form <family>.YYYYMMDD.HHMMSS. log_list_history_ This subroutine, used only by log_read_, is used to list the complete set of historical logs in a log family, and create a log_read_data describing that family. Subroutine Outlines MTB-666 LOG PERUSAL TOOLS The following commands and subroutines are used to peruse information in logs. The commands are documented in info files of their own; by and large, the subroutines are not intended for use except by the standard log commands. NOTE: As of this writing (84-06-07), several of these commands and subroutines have not yet been designed, and consequently no further documentation appears elsewhere in the MTB. Detailed documentation will be provided where appropriate in the next revision of the MTB. print_sys_log Prints selected messages from a log family, with various selection and message expansion options. monitor_sys_log Prints new messages as they appear in a log ("monitors" the log). migrate_sys_log Moves log segments from one directory to another, updating the pathnames in the log headers as it does so. display_log_segment Displays header and message information from a selected log segment. This is a debugging tool. 
format_log_message_ This subroutine can be used to format or print a specific log message according to various options. It is essentially equivalent to today's print_syserr_msg_. format_XXXXXXXXXXX_log_msg_ Application writers can construct subroutines of this name to interpret specific types of binary data in their messages. The appropriate format subroutine, whose name is derived from the "data_class" field in the log message, will be called by format_log_message_ when it is expanding log messages with binary data. log_monitor_ This subroutine contains the important mechanisms for the monitor_sys_log command, and can be used by applications to perform the same sort of monitoring. Subroutine Outlines MTB-666 log_match_ This subroutine, intended for use only by monitor_sys_log and print_sys_log, matches text strings against a series of -match and -exclude strings. log_test This "command" is really a collection of miscellaneous entrypoints for testing various aspects of the logging software. It is not intended for use except as an implementation aid. Appendix A: SRB Notice MTB-666 SRB NOTICE: In MR11, the syserr logging mechanism has been replaced. In addition to being used for the syserr log, the new log mechanism is available for general use by any application needing to maintain logs of messages. This new mechanism is not currently used for any system logs other than the syserr log. Important changes: * "print_syserr_log" becomes "print_sys_log -syserr" * audit_gate_ eliminated, replaced by log segment ACLs * syserr message declaration changes * "trim_syserr_log" replaced by "date_deleter" (in crank) * >sc1>perm_syserr_log becomes >sc1>syserr_log The print_syserr_log command has been eliminated. In its place is the print_sys_log command; the interfaces are substantially similar, except that print_sys_log must be given the "-syserr" control argument to direct its attention to the syserr log. See the new print_sys_log.info distributed with this release. 
The syserr log gate, audit_gate_, has been eliminated. Instead, the syserr log appears as three segments in >sl1: syserr_log_data, syserr_log_laurel, and syserr_log_hardy. The ACLs on these segments should include "rw Initializer.SysDaemon", and "r" access for any process which formerly had access to audit_gate_. The default ACL provides "r" access for the SysDaemon, SysMaint, and SysAdmin projects only. The ACL on the syserr log segments is copied whenever a new segment is created in >sc1>syserr_log. All other commands for perusing the syserr log (such as daily_syserr_process, display_cpu_error, etc.) have been converted to use the new log_read_ interfaces. If your site has its own syserr log perusal tools, these must be modified to use log_read_, as the syserr_log_util_ interface has been eliminated. The format of syserr log messages has been changed; see the include file sys_log_message.incl.pl1 for details. The calling sequence for syserr$binary has also changed; see the source for syserr_real.pl1. The trim_syserr_log command has been deleted. In its place, you must use the date_deleter command on >sc1>syserr_log to delete old log segments. The crank has been modified to do this. A migrate_log command is included which can be used to move log segments into a history directory elsewhere in the hierarchy, if desired; see the info file for details. This is not used by the syserr log; all old syserr log segments remain in >sc1>syserr_log until trimmed. Appendix A: SRB Notice MTB-666 The mechanism available for user applications is the log_read_ and log_write_ subroutines. See the info files for details. The existing syserr log must be converted to the new format during installation of the release. This is done as follows: *** [This belongs in the installation instructions] After installing the MR11 libraries, but before starting the Answering Service (that is, at ring four "standard" command level in admin mode), run the "convert_syserr_log" program. 
This will create the directory >sc1>syserr_log, and copy the contents of the >sc1>perm_syserr_log vfile into a family of log segments in that directory. The existing >sc1>perm_syserr_log vfile is converted into the new format by running the "convert_syserr_log" program before starting the MR11 Answering Service. If this step is omitted, the >sc1>syserr_log directory will be created automatically during Answering Service initialization, but it will contain only messages generated after the installation of MR11; no messages from the perm_syserr_log will be preserved. The LOG partition will be converted automatically to the new format when MR11 is first booted, and its previous contents will be lost. This is not a problem, however, since the previous shutdown will have copied all messages from the partition into the perm_syserr_log vfile, where they can be collected using "convert_syserr_log". Appendix B, Info File: print_sys_log MTB-666 84-06-06 print_sys_log, psl Syntax: psl -syserr {-control_args} psl PATHNAME {-control_args} Function: prints selected portions of system logs, including the syserr log. Various control arguments are used to determine which portions of the log are printed, and the format of the output. Arguments: PATHNAME is the pathname of the current segment in a family of logs. Information in this segment will be used to locate earlier segments in the log family, if required. -syserr specifies that the syserr log is to be examined; the syserr log segments in >sl1 are examined to locate the members of the syserr log family. Control arguments: -reverse, -rv specifies that the log is to be examined starting with the most recent message selected by other control arguments, and proceed backwards. -forward, -fwd specifies that the log is to be examined starting with the oldest message selected by other control arguments, and proceed forwards. 
(Default)

-from TIME, -fm TIME, -from NUMBER, -fm NUMBER
   specifies that the first message examined is the first message at or after the specified time or sequence number; if -reverse is specified, the first message is the one at or before the specified value. If no -from value is specified, the default is the first message in the log, or the last if -reverse is specified. This is incompatible with -last.

-to TIME, -to NUMBER
   specifies the last message to be examined, either by message time or sequence number. If not specified, the default is all the remaining messages in the log. This is incompatible with -for.

-for TIME, -for NUMBER
   specifies a number of messages to print, or a time interval relative to the starting time (specified by -from) in which the messages must be contained. The number of messages is the actual number of messages printed, not the number of messages examined in the log. This is incompatible with -to and -last.

-last NUMBER, -lt NUMBER, -last TIME, -lt TIME
   specifies that only the last NUMBER messages, or the messages since TIME, are to be printed. If a NUMBER is specified, it specifies the actual number of messages to be printed, not the number of messages examined in the log. This is incompatible with -from and -for.

-severity S1 ... Sn, -sv S1 ... Sn
   only messages with the severity specified by an Si are printed. The severities, Si, may either be decimal integers, or ranges, consisting of a pair of decimal integers separated by a colon ("20:29"). If multiple severities are specified, all messages with any of those severities are printed. A severity value must be between -250 and 250.

-all_severities, -asv
   messages of all severities are printed. (Default)

-exclude STR1 ... STRn, -ex STR1 ... STRn
   any message whose text contains one of the specified strings STRi is not printed. A string is interpreted either as a text string, or as a regular expression if it is surrounded by slashes.
   See Notes on String Matching, below, for details.

-match STR1 ... STRn
   all messages whose text contains one of the specified strings STRi are printed. Strings are interpreted as for -exclude.

-expand {T1 ... Tn}
   in addition to printing the text portion of messages, prints the expanded representation of any binary data contained in the message. If any Ti type values are specified, only messages of the specified types are printed in expanded form. See List of Message Types, below.

-expand_octal {T1 ... Tn}, -eo {T1 ... Tn}
   in addition to printing the text portion of messages, prints the octal representation of any binary data contained in the message. The type argument(s) are interpreted as for -expand, above.

-no_expand {T1 ... Tn}, -nex {T1 ... Tn}
   does not expand messages of any of the specified types; cancels the effect of a previous -expand or -expand_octal.

-exclude_data STR1 ... STRn, -ed STR1 ... STRn
   any message whose expanded binary data contains one of the specified strings STRi is not printed. These tests are applied after the tests for matching and exclusion on the text of the message. Strings are interpreted as for -exclude.

-match_data STR1 ... STRn, -md STR1 ... STRn
   any message whose expanded binary data contains one of the specified strings STRi is printed. These tests are applied after the tests for matching and exclusion on the text of the message. Strings are interpreted as for -exclude.

-duplicates, -dup
   inhibits the printing of "=" messages for messages whose text is the same as the previous message printed. All messages are printed exactly as they appear in the log.

-no_duplicates, -ndup
   prints "=" for messages whose text is the same as the previous message printed. (Default)

-header, -he
   prints a header giving the times and sequence numbers of the first and last messages that will be examined. This is the default.

-no_header, -nhe
   suppresses printing of the header.
-limits
   reads only the first and last messages in the log and prints their times and sequence numbers. No other action is performed, regardless of what other control arguments are used.

-single, -sg
   specifies that only messages from the single log segment whose pathname was given in the command line are to be examined. This is incompatible with -syserr.

-family, -fm
   specifies that messages from the entire log family whose first segment was given in the command line are to be examined. This is incompatible with -syserr. (Default)

-absolute_pathname, -absp
   prints the absolute pathname of all log segments examined while printing log messages.

-no_absolute_pathname, -nabsp
   does not print the pathname of log segments. (Default)


List of Syserr Message Types:
   A message type is a short string, up to ten characters, specifying a particular expansion routine to be called to display the binary data in printable format. The message type is specified when the message is written, and is used to distinguish between various types of binary messages. The following types appear in the syserr log:

pc
   Page control detected error; binary data gives the pathname and location of the affected segment.
config
   Config deck information, logged when the configuration changes.
mc
   Any message containing a set of machine conditions for a fault, such as a hardware error, a fault audit message, or a crawlout from ring zero.
ioi
   Messages logged to save I/O error status from peripheral devices.


Access required:
   For logs other than the syserr log, read permission is required on all segments in the family, and search permission on all the directories containing log segments. For the syserr log, read permission is required on the segments in >sc1>syserr_log, along with status permission on the directory, and read permission is required on the following three segments in >sl1: syserr_log_data, syserr_log_laurel, and syserr_log_hardy.
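
The -severity syntax described above (single decimal integers, or LOW:HIGH ranges, each value between -250 and 250) can be illustrated with a short sketch. This is Python written only for illustration, not the print_sys_log implementation; the function names are hypothetical, and rejecting a range whose low bound exceeds its high bound is an assumption, since the info file does not say how such a range is treated.

```python
SEVERITY_MIN = -250   # limits stated in the info file
SEVERITY_MAX = 250


def parse_severity_spec(spec):
    """Parse one -severity argument, "N" or "LOW:HIGH", into (low, high)."""
    low_text, separator, high_text = spec.partition(":")
    low = int(low_text)
    high = int(high_text) if separator else low
    for value in (low, high):
        if not SEVERITY_MIN <= value <= SEVERITY_MAX:
            raise ValueError(f"severity {value} is outside -250..250")
    if low > high:
        # Assumption: a reversed range is treated as an error.
        raise ValueError(f"empty severity range: {spec}")
    return (low, high)


def severity_selected(severity, ranges):
    """True if the message severity falls in any requested range.

    An empty list stands for -all_severities, the default.
    """
    if not ranges:
        return True
    return any(low <= severity <= high for low, high in ranges)
```

With these definitions, the compatibility replacement of "-class 2" by "-severity 20:29" corresponds to the single range (20, 29).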
Notes on message selection:
   Messages are selected for printing in a series of steps, each of which filters out certain messages according to the control arguments specified. The set of messages at each step is any that were left after the previous step. If a control argument was not specified, then its corresponding step eliminates no messages. Note that the -expand control arguments do NOT select messages, but only affect how their contents are displayed.

    1) -to            (stop looking after specified message)
    2) -from          (stop looking before specified number)
    3) -for TIME      (stop looking after specified time)
    4) -last TIME     (stop looking before specified time)
    5) -severity
    6) -exclude       (eliminate matching messages)
    7) -match         (eliminate non-matching messages)
    8) -exclude_data  (eliminate matching messages)
    9) -match_data    (eliminate non-matching messages)
   10) -for NUMBER    (stop after NUMBER are printed)
   11) -last NUMBER   (stop after NUMBER are printed)


Compatibility features:
   The following control arguments are accepted for compatibility with the old print_syserr_log and print_log commands:

   -action       =>  -severity
   -next         =>  -for
   -octal, -oc   =>  -expand_octal
   -debug, -db   =>  -duplicates

   The effect of print_syserr_log's -class argument can be achieved by supplying a range to the -severity argument: "-class 2" is replaced by "-severity 20:29".


Appendix B, Info File: monitor_sys_log                          MTB-666

06-06-04  monitor_sys_log, msl

Syntax:  msl LOG_IDENTIFIER {-control_args}


Function:  prints selected portions of system logs, including the syserr log. Various control arguments are used to determine which portions of the log are printed, and the format of the output.


Arguments:
LOG_IDENTIFIER
   is either the pathname of the first segment in the log family to be monitored, or one of the following control arguments. If one of these control arguments is specified, no log pathname may be specified.
If pathname specifies a log not currently being monitored, the log specified is added to the list; otherwise, its monitoring status is altered. Control arguments (log selection): -syserr specifies that the syserr log is to be monitored. -all, -a specifies that all logs currently being monitored are affected by the other control arguments; normally, only the specified log is affected. (Default) -number N, -nb N specifies the number (from a monitor_log -status listing) of one of the logs being monitored. Control arguments (action): -add Adds the specified log to the list being monitored. This may only be given with a log pathname. (Default) -remove Removes the specified log(s) from the list being monitored. -off Turns off monitoring of the specified log(s), without removing them from the list. -on Turns monitoring back on for the specified log(s). Appendix B, Info File: monitor_sys_log MTB-666 -call STR specifies that when new entries appear in the specified log(s), their text is passed as arguments to the specified command line STR instead of being printed. If STR is a null string (""), command line processing is turned off, and new entries are printed instead. The arguments passed to the command line are: 1) name of log family 2) sequence number of message 3) severity of message 4) text of message 5) expanded text of message (if -expand specified) -status, -st displays the monitoring status of the specified log(s). -remove_exclude, -rmex clears the set of exclude strings for the specified log(s). This is processed before any of the -exclude strings are added. -remove_match, -rmm clears the set of match strings for the specified log(s). This is processed before any of the -match strings are added. -remove_exclude_data, -rmed clears the set of data exclude strings for the specified log(s). This is processed before any of the -exclude_data strings are added. -remove_match_data, -rmmd clears the set of data match strings for the specified log(s). 
   This is processed before any of the -match_data strings are added.

-time N, -tm N
   specifies the monitoring interval, in seconds; the specified log(s) will be sampled once every monitoring interval. If the specified interval is zero, periodic monitoring is turned off.

-register, -rg
   causes the user's process to be registered as a recipient of wakeups whenever a message is added to the specified log(s). This is more efficient than periodic monitoring. Registered monitoring can be used in combination with periodic monitoring, but this is usually not a useful thing to do.

-deregister, -drg
   removes the user's process from the list of registered monitors of the specified log(s).


Control arguments (message selection):

-severity S1 ... Sn, -sv S1 ... Sn
   only messages with the severity specified by an Si are processed. The severities, Si, may either be decimal integers, or ranges, consisting of a pair of decimal integers separated by a colon ("20:29"). If multiple severities are specified, all messages with any of those severities are processed. A severity value must be between -250 and 250.

-all_severities, -asv
   messages of all severities are printed. (Default)

-exclude STR1 ... STRn, -ex STR1 ... STRn
   adds the specified strings to the set of exclude strings for the specified log(s). Any message whose text contains one of the set of exclude strings for this log is not processed. A string is interpreted either as a text string, or as a regular expression if it is surrounded by slashes. See Notes on String Matching, below, for details.

-match STR1 ... STRn
   adds the specified strings to the set of match strings for the specified log(s). All messages whose text contains one of the set of match strings for this log are processed.

-expand {T1 ... Tn}
   in addition to printing the text portion of messages, prints the expanded representation of any binary data contained in the message.
   If any Ti type values are specified, only messages of the specified types are printed in expanded form. See List of Message Types, below.

-expand_octal {T1 ... Tn}, -eo {T1 ... Tn}
   in addition to printing the text portion of messages, prints the octal representation of any binary data contained in the message. The type argument(s) are interpreted as for -expand, above.

-no_expand {T1 ... Tn}, -nex {T1 ... Tn}
   does not expand messages of any of the specified types; cancels the effect of a previous -expand or -expand_octal.

-exclude_data STR1 ... STRn, -ed STR1 ... STRn
   adds the specified strings to the set of data exclude strings for the specified log(s). Any message whose expanded binary data contains one of the set of data exclude strings is not printed. These tests are applied after the tests for matching and exclusion on the text of the message. Strings are interpreted as for -exclude.

-match_data STR1 ... STRn, -md STR1 ... STRn
   adds the specified strings to the set of data match strings for the specified log(s). Any message whose expanded binary data contains one of the set of data match strings is printed. These tests are applied after the tests for matching and exclusion on the text of the message. Strings are interpreted as for -exclude.


Access required:
   For logs other than the syserr log, read permission is required on all segments in the family, and search permission on all the directories containing log segments. For the syserr log, read permission is required on the segments in >sc1>syserr_log, along with status permission on the directory, and read permission is required on the following three segments in >system_library_1: syserr_log_data, syserr_log_laurel, and syserr_log_hardy. Write access to the segments is required if -register or -deregister is specified.
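
The text and data string tests described for the control arguments above can be sketched as a small filter. This is an illustrative Python sketch, not the log_match_ implementation: the function names are hypothetical, Python's re syntax merely stands in for the Multics regular expression syntax, and treating an empty match set as "accept everything" is an assumption drawn from the rule that an unspecified control argument eliminates no messages.

```python
import re


def string_matches(pattern, text):
    """Test one -match/-exclude string against some text.

    A pattern surrounded by slashes is treated as a regular expression;
    anything else is a plain substring test.
    """
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        return re.search(pattern[1:-1], text) is not None
    return pattern in text


def message_selected(text, data_text, exclude=(), match=(),
                     exclude_data=(), match_data=()):
    """Apply the text tests first, then the tests on expanded binary data."""
    if any(string_matches(p, text) for p in exclude):
        return False
    if match and not any(string_matches(p, text) for p in match):
        return False
    if any(string_matches(p, data_text) for p in exclude_data):
        return False
    if match_data and not any(string_matches(p, data_text) for p in match_data):
        return False
    return True
```

For example, message_selected("disk error", "", exclude=["tape"], match=["disk"]) accepts the message, while adding "disk" to the exclude set rejects it before the match strings are consulted.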
Notes on message selection:
   Messages are selected for printing in a series of steps, each of which filters out certain messages according to the control arguments specified. The set of messages at each step is any that were left after the previous step. If a control argument was not specified, then its corresponding step eliminates no messages. Note that the -expand control arguments do NOT select messages, but only affect how their contents are displayed.

   1) -class and -severity
   2) -exclude       (eliminate matching messages)
   3) -match         (eliminate non-matching messages)
   4) -exclude_data  (eliminate matching messages)
   5) -match_data    (eliminate non-matching messages)


Appendix C: Changes to Existing Programs                        MTB-666

EFFECTS OF INCOMPATIBLE CHANGES:

The incompatible changes to data structures and syserr message formats will require at least minor changes in the 43 programs listed below. This list comes from the MR10.2 libraries (more or less-- System-M in April 1984). Some of these programs will be completely replaced or rewritten; others, however, will just require small amounts of attention. They are broken down into groups, by include file, later on; those that will be completely revamped have been removed from the detailed listings.
In this list, the prefixes have the following meanings:

   F:  Message format or binary type change (minor)
   M:  Medium scale modification required for different logic
   R:  Completely reimplemented
   D:  Deleted from system

AFFECTED PROGRAMS:

   F: activate                     F: poll_fnp
   M: azm_display_fdump_events     F: poll_mpc
   R: azm_syserr_                  R: print_syserr_log
   F: daily_syserr_process         R: print_syserr_msg_
   F: disk_control                 M: process_dump_segments
   F: display_cpu_error            M: real_initializer
   R: display_syserr_              F: salvage_pv
   D: display_syserr_log_part      F: scavenge_volume
   F: fnp_data_summary             F: scavenger
   F: hardware_fault               M: structure_library_5_
   R: heals_collect_data_          R: syserr_copy_paged
   R: heals_cpu_reports_           R: syserr_data
   R: io_error_summary             D: syserr_log_copy
   R: ioi_masked                   R: syserr_log_init
   R: mc_con_rec_                  R: syserr_log_man_
   F: mdc_repair_                  R: syserr_log_util_
   F: mos_edac_summary             R: syserr_logger
   F: mos_memory_check             M: syserr_real
   F: mpc_data_summary             R: syserrlog_segdamage_scan_
   F: ocdcm_                       F: system_startup_
   F: page_error                   M: verify_lock


Changes to syserr_log.incl.pl1 and syserr_data.incl.pl1

These programs will be changed substantially, since the whole syserr log format has changed. Some programs, those not directly concerned with maintaining the log segments, will only require replacement of code that manually examines the log with simpler code that uses the log_search_ entrypoints. The display and analysis tools will be reworked.


Changes to syserr_message.incl.pl1

This include file is to be deleted. At the least, programs referencing syserr_message will be changed to reference sys_log_message instead, and will also have some of the structure element names changed. Programs that process binary messages will also be changed to check the data_class and data_type, instead of the old binary message type number. The message formatting and display tools will be reimplemented.
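
The shift from a binary message type number to a data_class lookup can be illustrated with a brief sketch. It is Python written for illustration only, not system code: the function names and the formatter table are hypothetical, the name is built by simple substitution into the format_XXXXXXXXXXX_log_msg_ template (the exact derivation rule is not given in this MTB), and falling back to an octal dump when no formatter is found is an assumption.

```python
def formatter_name(data_class):
    """Build the formatter name for a data_class by simple substitution."""
    # Assumption: surrounding blanks in the data_class field are ignored.
    return f"format_{data_class.strip()}_log_msg_"


def expand_binary(data_class, words, formatters):
    """Expand binary data using the formatter registered for its data_class.

    `formatters` maps formatter names to callables that accept the list of
    data words and return printable text.
    """
    formatter = formatters.get(formatter_name(data_class))
    if formatter is None:
        # Assumed fallback: print each 36-bit word as 12 octal digits.
        return " ".join(f"{word:012o}" for word in words)
    return formatter(words)
```

A caller might register {"format_ioi_log_msg_": some_function} and pass the table to expand_binary; a message whose data_class has no registered formatter is then shown in octal.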
Changes to syserr_binary_def.incl.pl1 This include file will be deleted, and replaced with a new one defining some of the standard data_class and data_type values. The ring zero programs that reference this need only be changed for the new calling sequence to syserr$binary. The outer ring programs, primarily log scanning programs, will be changed to look for the appropriate data_class and data_type values; most of these are already covered under the changes for syserr_message. Appendix D: Changes from Prototype MTB-666 CHANGES FROM PROTOTYPE IMPLEMENTATION: An earlier version of this new logging mechanism, written by Benson Margulies, served as the base for this implementation. The two are similar in overall structure, but are quite different in actual implementation. The changes, and their justifications, are listed below: All log information in single segments The original design used families of log segments, but also included a "control segment" for each family that listed the listeners for new messages in the logs. Because it is, in general, difficult to ensure consistency between two separate segments, this information was moved into the header of the log segments themselves, and a fixed maximum limit of 25 listening processes per log was imposed. Recording full pathnames in the log segment header The original design recorded only the entryname of a previous log segment in the header of a newly created log, and used a search list to specify directories where older logs were to be found. This was changed to include the full pathname in order to eliminate the need for a search list, and also to make handling the syserr log case easier, since that requires a special case to find the first one or two segments in the log. Elimination of I/O module interface Use of an I/O module for reading logs was not particularly convenient, and also introduced considerable inefficiencies because of the required data copying. 
The log_read_ subroutine interface is sufficient to handle all existing applications, and, if an I/O module interface is required in the future, would be required as the base for it anyway. Introduction of binary data classes The original design carried over from syserr the notion of a single, system-wide set of binary message types. This makes it difficult to use user-defined binary messages, because there is no standard mechanism for locating a procedure to interpret the messages.