29 Oct 2022

expandfile History

This note describes the history of expandfile, a simple Unix command-line program for expanding templates.

History

I wrote expandfile in 2002 to replace a collection of ad hoc Perl programs that I had been using to maintain the multicians.org website since 1995. Since 2002, I've added features to expandfile occasionally.

The idea of a simple macro expander dates back to Christopher Strachey's GPM macro expander, which I had experimented with in the mid-1960s on CTSS at MIT Project MAC. Later, I used several languages, such as ALM, PL/I, and C, whose preprocessors added features to the language, such as including other input files.

Because multicians.org had mirror sites where I could not modify the web server configuration or execute code on the server, I could not dynamically generate web pages. Furthermore, at the time, people implementing web sites by generating pages dynamically from databases were encountering web server performance and security problems. I chose to create static HTML pages and have the web server serve them without run-time calculation. (I had read a little about PHP, but at that time it was considered primarily a browser enhancement for FORM processing.)

My ISP's web servers supported a "server side include" feature, which let users insert the contents of auxiliary files when serving a web page. When I tried to separate my pages into content and boilerplate, I found that I wanted to have the boilerplate be slightly different for each page, so I sought a method where

RUNOFF is an archetypal text-transformation system, and I had also used it in my CTSS days. RUNOFF's input is either text to be copied to the output or commands that change the state of execution and affect later processing. Later implementations of RUNOFF-like languages in Multics and elsewhere added macro execution and multi-pass processing.

I decided to write my own source text expansion tool that did not parse the underlying input language, similar to GPM: it would just transform text strings into other text strings, with a minimal way of defining macros. This made the program more general and freed it from dependency on the syntax of the underlying language; I didn't have to write or maintain an HTML parser, and changes in the HTML spec would rarely require the tool to change. As in GPM and RUNOFF, I could set and evaluate string variables, and expand macros that accepted string variables as arguments.
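The idea of expanding text without parsing the underlying language can be illustrated with a minimal Python sketch. This is not expandfile's code or syntax: the {{name}} and {{set name=value}} placeholders, the TOKEN pattern, and the expand function are all hypothetical names chosen for illustration. Everything outside a placeholder, HTML included, is treated as opaque text and copied through unchanged.

```python
import re

# Hypothetical placeholder syntax, NOT expandfile's actual syntax:
#   {{set name=value}} defines a string variable and expands to nothing
#   {{name}}           expands to the variable's value (empty if undefined)
TOKEN = re.compile(r"\{\{(.*?)\}\}")


def expand(text, variables=None):
    """Expand placeholders in text; all other text is copied unchanged."""
    variables = dict(variables or {})

    def replace(match):
        body = match.group(1).strip()
        if body.startswith("set "):
            # Split "name=value" at the first "=" and store the binding.
            name, _, value = body[4:].partition("=")
            variables[name.strip()] = value.strip()
            return ""
        return variables.get(body, "")

    # re.sub calls replace() for each match from left to right, so a
    # variable set earlier in the text is visible to later placeholders.
    return TOKEN.sub(replace, text)
```

For example, expand("{{set title=Home}}<h1>{{title}}</h1>") returns "<h1>Home</h1>". The expander never looks at the h1 tags, so a change in the HTML spec would not require a change to the tool.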

I looked at the Unix m4 tool, written by friends of mine from Multics days. It wasn't available for the computer I had then; Perl was working OK.

I studied the errors I made often when maintaining multicians.org, and tried to build tools that would prevent them. Having each fact in one place was good only if I made sure to regenerate all object pages that used that fact: this led me to use the make program. Another common error was forgetting to update server files when I modified a file on my computer: this suggested the use of rsync.

Using Expandfile to Generate Web Sites

I first used expandfile to translate "HTML with extensions" input (which I called HTMX) into HTML, mostly to include common boilerplate, such as page banners and CSS layout instructions used on all my web pages. I added features to allow variables in the page header and footer data, like "title" and "date updated." Adding builtin functions that could transform variables' values came next, then the ability to capture the output of external shell commands, and then integration with SQL. I used these steps to simplify my workflow for maintaining websites I created, and to eliminate special-purpose Perl programs in favor of logic in HTMX web page templates.

The big advance for me was introducing the *block builtin, and the pattern of writing HTMX files that

This pattern separates site boilerplate from page content, provides an independent source file for each HTML page, and makes it easy to regenerate a single page.

Using expandfile was valuable when I made global changes to every page on a site to change each page's appearance, to conform to changing HTML specifications, or to use new browser features.

Connecting expandfile to my local MySQL database and supporting *sqlloop was the next big step. This provided consistent formatting for lists of people, publications, glossary entries, and website page indexes, and defined a lightweight way for any HTMX file to refer to data from these lists. These changes reduced the chance that an editing mistake would screw up a whole page.

The third big step I took was using traditional Unix tools to automate site building and publishing. Using make (created for Unix by Multician Stu Feldman) to invoke expandfile only when an HTMX file was newer than its corresponding HTML file meant that I could make a one-line change to a file and then just type make install to recompile the minimum number of files and automatically rsync them to the deployment site.
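The rule that make applies here, rebuilding a target only when it is missing or older than its source, can be sketched in a few lines of Python. This is an illustration of the timestamp comparison only, not a replacement for make; the function name needs_rebuild and the file names in the usage note are assumptions.

```python
import os


def needs_rebuild(source, target):
    """Return True if target is missing or older than source.

    This mirrors the timestamp test make applies to a rule with one
    prerequisite; it ignores make's other features such as chained
    rules and phony targets.
    """
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)
```

A build script could call needs_rebuild("page.htmx", "page.html") for each source file and run the expander only where it returns True, so that a one-line edit regenerates one page instead of the whole site.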

Other Applications of Expandfile

As expandfile developed, I found other uses for it, including reformatting database files and preparing input for other programs such as input to procmail, RSS feed declarations in XML, shell scripts, input to the dot graphical compiler, and XML sitemap files for the Google crawler.

For some applications, I expand a template which generates HTMX files which are in turn expanded; this lets me do "two pass" expansions so I can add up counters and then display them above the detailed information.
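The two-pass idea can be sketched in Python as follows. This is a simplified model under stated assumptions, not how expandfile implements it: the {{total}} placeholder and the functions first_pass and second_pass are hypothetical. The first pass writes the detail lines and accumulates a counter, leaving a placeholder where the total belongs; the second pass fills the placeholder once the value is known.

```python
def first_pass(rows):
    """Emit intermediate text with a placeholder, and accumulate the counter.

    rows is a list of (name, count) pairs. Returns the intermediate text
    and a dict of variables to substitute in the second pass.
    """
    total = 0
    details = []
    for name, count in rows:
        total += count
        details.append(f"{name}: {count}")
    # The summary line appears first, but its value is only known after
    # the loop finishes, so it is left as a placeholder for pass two.
    intermediate = "Total: {{total}}\n" + "\n".join(details)
    return intermediate, {"total": str(total)}


def second_pass(intermediate, variables):
    """Replace each {{name}} placeholder with its final value."""
    for name, value in variables.items():
        intermediate = intermediate.replace("{{" + name + "}}", value)
    return intermediate
```

Calling first_pass([("a", 2), ("b", 3)]) and passing both results to second_pass yields three lines: "Total: 5", "a: 2", and "b: 3", with the total displayed above the details it summarizes.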

In 2004, I wrote a web statistics application that uses expandfile to format complex daily reports of web server usage data loaded into SQL.

For a document formatting application, I extracted data from files in a proprietary format, translated it to SQL, and loaded it into a local MySQL database. I then used expandfile's *sqlloop builtin to generate HTML, used a browser to render visually formatted output, printed the browser output to PostScript, and used page impression tools to generate a booklet.

I have also built template files that use the *shell builtin to fire off curl commands that fetch XML data from Web APIs, and then parse the result with *xmlloop to generate HTML reports.

Language Issues

I originally wrote expandfile in Perl 5, and used it on Unix, macOS and Windows, through years of evolution of my program, the Perl language, and the features provided on different platforms. I showed expandfile to friends, but they were put off by the difficulty of installing and configuring the Perl implementation:

Later Improvements

Early versions of expandfile had some features that I later decided were mistakes. Fortunately, nobody but me was using the program, and I knew where all the HTMX source files were. So I backed up the program and sources, created and tested a new program version, modified every source file that had to change, recompiled everything, compared old output to new, and accounted for changes before switching over to the new version.

Some of the changes were bug fixes or new features, such as *xmlloop and *format. A few changes were made when Perl syntax changed and the program had to be updated.

I made the following changes in early 2021: