Perl is highly admired for its ability to create Web pages "on the fly". Sometimes, however, you just need to create a few static pages with minimal effort. In fact, a CGI script may be exactly what you wish to avoid.

In my efforts to create indexing pages for a MacPerl CD-ROM recently, I ran into just such a situation. I did not want to force the user to set up a server and CGI support just to use my Web pages. On the other hand, I wanted to give the user the "feel" of dynamic pages. So, I cheated.

In brief, I created a "little language" for describing related sets of HTML files. Using a simple encoding scheme, I specified which lines should appear in which HTML file. With a bit of fancy footwork (described below), this allowed me to provide assorted buttons that would look familiar to my intended (Mac OS) audience. The language I created falls into the general class of "macro languages", so let's begin there.

Macro Languages

Macro languages allow specially-encoded input files to be "expanded" in assorted ways. A macro preprocessor can include or omit lines of information, incorporate lines from other files, calculate values from pre-set variables, and more. Although the output file may just be "data", often it is program code.

Unix is rife with macro processing languages. m4 is a very nice general-purpose macro language. The C pre-processor (cpp) is a useful, albeit rather tame, facility. The Unix shells can be used to provide a form of macro processing on any embedded scripts (for example, awk, perl, or sed). That is, shell variables can be expanded within the body of embedded awk code. Going up another level, make provides similar preprocessing for bits of embedded shell code.

Unfortunately, my development was taking place on a Macintosh, which is not so replete with macro facilities. Having MacPerl, however, I was able to continue in the face of such minor difficulties.
In fact, lacking a general-purpose macro language, I was able to create a specialized macro processor that met my needs quite nicely.

s4ways

My macro processor, named "s4ways", splits an encoded input file into four HTML files. Each of the output files contains a slightly different set of information, along with navigation links that let the user select a desired "view".

One of my HTML files contains sets of useful links (i.e. links to directories on the disc or to relevant Web sites). These links could get in the way of a simple scan, however, so I wanted to make them optional. By setting up some "buttons" at the top of the page, I could allow the user to turn this information on or off. Two switches, each with two states, give four combinations; hence, s4ways.

The s4ways "language" is very simple. The first four characters of each input line tell the processor how the line should be handled (the first two carry the code; the other two are just spacing). Because the most common case - "include this line in all output files" - should be easy to encode, s4ways uses a string of four blanks to specify it:
    <HTML>
    <HEAD>
    ...
Looks just like indented HTML; how convenient! Now, let's find ways to express some other output selections:

X   disc links
 X  Web links
__  no links
XX  both links
X_  disc links only
_X  Web links only

The first two of these, along with the common (blank) form, make up most of my input file. If I want to include a line with a CD (WWW) link, I preface it with "X " (" X"). The other codes are used for the navigation section, allowing each of my four output pages to have a unique "navigation bar". After several false starts, I settled on the following four navigation lines:

Show:  [ ] CD  [ ] WWW  [ ] All
Show:  [X] CD  [ ] WWW
Show:  [ ] CD  [X] WWW
Show:  [X] CD  [X] WWW  [ ] None

In practice, I used a pair of GIF files instead of the "[ ]" and "[X]" text shown above. This presented the user with colored squares that darkened to indicate selection:

__  Show: <A HREF="p_.html"><IMG SRC="b0.gif"></A> PTF <A
__  HREF="_w.html"><IMG SRC="b0.gif"></A> WWW <A
__  HREF="pw.html"><IMG SRC="b0.gif"></A> All
X_  Show: <A HREF="__.html"><IMG SRC="b1.gif"></A> PTF <A
X_  HREF="pw.html"><IMG SRC="b0.gif"></A> WWW
_X  Show: <A HREF="pw.html"><IMG SRC="b0.gif"></A> PTF <A
_X  HREF="__.html"><IMG SRC="b1.gif"></A> WWW
XX  Show: <A HREF="_w.html"><IMG SRC="b1.gif"></A> PTF <A
XX  HREF="p_.html"><IMG SRC="b1.gif"></A> WWW <A
XX  HREF="__.html"><IMG SRC="b0.gif"></A> None

To control the character spacing in the HTML pages, I used the <PRE> (preformatted text) tag. The rather peculiar positioning of the line breaks and angle brackets allows me to "hide" my input spacing from view: each line break falls inside a tag, where the browser ignores it. This lets my input file be (reasonably) presentable, while getting the desired output appearance. I also found, by the way, that setting BORDER=0 improved the appearance of the GIF files. (See the sampler page for a better idea of the GIFs' appearance.)
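The prefix codes above follow a simple rule: each of the two code characters constrains one switch, where a blank means "don't care", "X" means "on", and "_" means "off". As a minimal sketch (not the production script, which uses a lookup table instead), the rule might be written out as below. The helper names pages_for and wants are hypothetical; the page names are taken from the link targets in the navigation lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Page names as used in the navigation links: first character is the
# disc (PTF) switch, second is the Web switch; '_' means "off".
my @pages = ('__', '_w', 'p_', 'pw');

# Hypothetical helper: does code character $char accept a switch
# that is on ($state true) or off ($state false)?
sub wants {
    my ($char, $state) = @_;
    return 1       if $char eq ' ';   # blank: either state
    return $state  if $char eq 'X';   # X: switch must be on
    return !$state if $char eq '_';   # _: switch must be off
    die "unknown code character '$char'\n";
}

# Hypothetical helper: list the pages that receive a line whose
# two-character code is $code.
sub pages_for {
    my ($code) = @_;
    my ($first, $second) = split //, $code;
    my @hits;
    for my $page (@pages) {
        my ($disc, $web) = map { $_ ne '_' ? 1 : 0 } split //, $page;
        push @hits, $page
            if wants($first, $disc) && wants($second, $web);
    }
    return @hits;
}

for my $code ('  ', 'X ', ' X', '__', 'X_', '_X', 'XX') {
    printf "'%s' -> %s\n", $code, join(' ', pages_for($code));
}
```

For example, the code "X " selects p_ and pw (the two pages with disc links showing), while "_X" selects only _w.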
Radio Buttons

By changing the navigation code section slightly, I could have presented the user with a rather different "look and feel", but exactly the same information and choices:

Show:  [ ] CD  [ ] WWW  [ ] All  [X] None
Show:  [X] CD  [ ] WWW  [ ] All  [ ] None
Show:  [ ] CD  [X] WWW  [ ] All  [ ] None
Show:  [ ] CD  [ ] WWW  [X] All  [ ] None

Here is the code for this "radio button" format. The code for each navigation line is the same, save for the position of the darkened square:

__  Show: <A HREF="p_.html"><IMG SRC="b0.gif"></A> CD <A
__  HREF="_w.html"><IMG SRC="b0.gif"></A> WWW <A
__  HREF="pw.html"><IMG SRC="b0.gif"></A> All <A
__  HREF="__.html"><IMG SRC="b1.gif"></A> None
...

Turning Triangles

On another page, I had two major sections that I wished to make optional. Using right- and down-pointing triangles in the left margin, I was able to achieve quite a nice simulation of Apple's "turning triangle" format, which Macintosh users would find familiar:

Common text ...
> Option header
v Option header
  optional text ...

Here is the navigation code for the "turning triangle" format:

__  <A HREF="1_.html"><IMG SRC="t0.gif"></A> Header 1
X_  <A HREF="__.html"><IMG SRC="t1.gif"></A> Header 1
_X  <A HREF="12.html"><IMG SRC="t0.gif"></A> Header 1
XX  <A HREF="_2.html"><IMG SRC="t1.gif"></A> Header 1
X   Optional text 1 ...
__  <A HREF="_2.html"><IMG SRC="t0.gif"></A> Header 2
X_  <A HREF="12.html"><IMG SRC="t0.gif"></A> Header 2
_X  <A HREF="__.html"><IMG SRC="t1.gif"></A> Header 2
XX  <A HREF="1_.html"><IMG SRC="t1.gif"></A> Header 2
 X  Optional text 2 ...

To understand this code, walk through the states and transitions. In the HTML file "__.html", both triangles are pointed to the right ("t0.gif") and no optional code appears. Selecting the first triangle jumps us to "1_.html", where the top triangle points down ("t1.gif") and the top set of optional code appears.

Because the macro processing is quite simple, the implementation code is short.
(Feel free to skip over this section if you aren't a Perl aficionado!) My production script has a bit more bulletproofing and such, but it basically looks like:
#! perl
$base = 'Perl:PTF';
@keys = ('__', '_w', 'p_', 'pw');
splitter("$base:x1.mf", "$base:", '.html', @keys);
@keys = ('__', '_2', '1_', '12');
splitter("$base:x2.mf", "$base:", '.html', @keys);

sub splitter { # $in, $pre, $suf, @keys
    #
    # Read $in, write $pre{@keys}$suf
    my ($in, $pre, $suf, @keys) = @_;
    my ($s0, $s1, $s2, $s3) = @keys;
    my ($flag, $line, $text);

    # Map each two-character code to the output files (0-3) that get the line.
    #           0123
    $m{'  '} = 'XXXX'; # print on ??
    $m{' X'} = '.X.X'; # print on ?X
    $m{'X '} = '..XX'; # print on X?
    $m{'XX'} = '...X'; # print on XX only
    $m{'_X'} = '.X..'; # print on _X only
    $m{'X_'} = '..X.'; # print on X_ only
    $m{'__'} = 'X...'; # print on __ only

    open(MF, "<$in")          or die "Can't read $in: $!";
    open(S0, ">$pre$s0$suf") or die "Can't write $pre$s0$suf: $!";
    open(S1, ">$pre$s1$suf") or die "Can't write $pre$s1$suf: $!";
    open(S2, ">$pre$s2$suf") or die "Can't write $pre$s2$suf: $!";
    open(S3, ">$pre$s3$suf") or die "Can't write $pre$s3$suf: $!";

    while (defined($line = <MF>)) {
        chomp($line);
        $line .= "    " if (length($line) < 4);  # pad short lines to a full prefix
        $flag = substr($line, 0, 2);             # two-character code
        $text = substr($line, 4);                # everything after the prefix
        print(S0 "$text\n") if ($m{$flag} =~ /X.../);
        print(S1 "$text\n") if ($m{$flag} =~ /.X../);
        print(S2 "$text\n") if ($m{$flag} =~ /..X./);
        print(S3 "$text\n") if ($m{$flag} =~ /...X/);
    }
    close(MF); close(S0); close(S1); close(S2); close(S3);
}
The pre-loaded hash %m provides a level of indirection, allowing my syntax to be a bit more abstract than it might otherwise be. This paid off in easier editing of (and fewer errors in!) the input files.

Post-Mortem

Because I wrote s4ways for my own purposes, I was able to be quite lazy and self-centered about its design. The language has exactly the syntax I wanted; no more, no less. The implementation is very simple, if not very efficient. Designers of general-purpose macro languages such as m4 are required to be quite a bit more "professional" in their approach than I was. But then, that's their problem (:-).

Splitting a file into four versions is trivial. Splitting into 8 or 16 would be somewhat trickier, but no real challenge. After that, the sheer number of generated files might start to get oppressive. On the other hand, I cheerfully generated a set of 200 HTML files for another part of the disc, so moderation appears to be a matter of taste. In this other case, by the way, the files were split up on a data-driven basis.

The moral here, I guess, is that you shouldn't be afraid to generate new "languages" if the old ones start to run out of steam. For more information on little languages (as well as some educational and pleasant reading), take a look at Jon Bentley's delightful "Programming Pearls" and "More Programming Pearls" (Addison-Wesley, ISBNs 0-201-10331-1 and -11889-0) and/or follow his column in UNIX Review.
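As a rough illustration of the remark that splitting 8 or 16 ways would be "no real challenge", here is one way the scheme might be generalized to any number of switches. This is a sketch under stated assumptions, not part of s4ways: the function name split_n_ways is hypothetical, the code field is assumed to be one character per switch followed by two spacer characters, and pages are named by bit strings such as "010" rather than "_w" or "p_". It also returns the page text in a hash instead of writing files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical generalization: split @lines across 2**$n pages.
# Returns a hash mapping page name (a bit string) to page text.
sub split_n_ways {
    my ($n, @lines) = @_;
    my %out = map { sprintf("%0${n}b", $_) => '' } 0 .. 2**$n - 1;

    for my $line (@lines) {
        # Pad short lines so they carry a full (blank) prefix.
        $line .= ' ' x ($n + 2) if length($line) < $n + 2;
        my $code = substr($line, 0, $n);
        my $text = substr($line, $n + 2);

        # Turn the code into a pattern over page names:
        # X -> 1 (on), _ -> 0 (off), blank -> . (either).
        (my $pattern = $code) =~ tr/X_ /10./;
        die "bad code '$code'\n" unless $pattern =~ /^[10.]+$/;

        for my $page (keys %out) {
            $out{$page} .= "$text\n" if $page =~ /^$pattern$/;
        }
    }
    return %out;
}

# Three switches, hence eight pages.
my %pages = split_n_ways(3,
    '     <P>Always shown</P>',
    'X    <P>Switch 1 on</P>',
    '_X_  <P>Only switch 2 on</P>',
);
print "$_:\n$pages{$_}" for sort keys %pages;
```

Deriving the pattern from the code with tr removes the need for a hand-written lookup table, which would grow quickly as switches are added. The navigation lines, of course, would still have to be written by hand for each page.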