Wednesday, May 03, 2006

Macro Madness

The macros -- and madness -- I'm referring to today have little to do with Lisp, except as a cautionary tale. No, I'm talking about TeX and LaTeX, the other major language famous for its macros. I format my English résumé and French C.V. using a motley collection of LaTeX packages that I've found over the years. The English version uses resume.sty, originally by Stephen Gildea, and dates from 1988! The French one uses Didier Verna's CurVe package, which, I've found, does a good job of outputting a C.V. in a French / European style. His C.V. is a lot sexier than mine, but one has to start somewhere.

The style I adapted (i.e., copied) from Didier puts the years of an activity in the margin of an entry, boldfaces the job title and company, puts the city in normal type in parentheses, and starts right in with the description. A consulting company asked me for my C.V. in a different format: months and years of dates (not in the margin), company and city in bold, job title on the next line, followed by the job description in bullet points. For this long-time TeX amateur it seems vastly preferable to conditionalize this somehow rather than keeping two different versions of a document around. In terms of CurVe, I need the prologue of each entry to expand differently in each version and do something about establishing a list environment for the consulting version. Given this input from my wrapper macro:

\Entry{août 1999}{août 2001}{Software Design Engineer}{}{Seattle
WA, États-Unis}{blah blah}

I want:

\entry[1999 -- 2001]{\textbf{Software Design Engineer,}
(Seattle WA, États-Unis)}{blah blah}

in the regular version and:

\entry{\textbf{août 1999 -- août 2001, Seattle WA, États-Unis}\par%
\textbf{Software Design Engineer:} blah blah}

in the consulting version. This doesn't seem too bad, but there are two big complications:

  • If the years of the date are the same, I only want one year in the margin of the regular version;
  • I want to be able to write \now instead of a date and get "présent".

Furthermore, I wanted to save some typing and write dates like "août 2001" and not as parameters to a macro like \date{août}{2001} or, worse, as separate parameters to the \Entry macro. This is one of those little decisions that, in retrospect, are incredibly stupid and lead to disaster or, at best, a lot more work, but otherwise I wouldn't have anything to write about today.

Like good Lisp programmers we'll attack this in a top-down and bottom-up manner at the same time. Assuming an \ifcvstyle conditional that chooses between the two styles, the LaTeX code I wrote above suggests these definitions:

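\ifcvstyle
\newcommand{\Entry}[6]{%
\entry[\@years{#1}{#2}]{\textbf{#3, #4} (#5) #6}}%
\def\@@cvdate#1 #2\relax{#2}
\else
\newcommand{\Entry}[6]{\entry{\textbf{#1 -- #2 #4, #5}\par\textbf{#3:} #6}}
\def\@@cvdate#1 #2\relax{#1 #2}
\fi
\newcommand{\@years}[2]{\ifthenelse{\equal{\@cvdate{#1}}{\@cvdate{#2}}}{\@cvdate{#1}}{%
\@cvdate{#1} -- \@cvdate{#2}}}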

\ifthenelse is LaTeX's all-purpose conditional macro; here we use it to compare two strings because plain TeX can't do that itself. OK, this isn't too bad; we can choose the basic form of the parameters to \entry and pick apart the parts of the date using TeX's powerful, if bizarre, macro parsing capabilities. But what about \@cvdate, which we haven't defined yet? That submacro checks if the date argument is equal to \now and either returns that or proceeds with the date parsing.
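As a minimal sketch of those two ingredients (delimited-parameter parsing and a string comparison with \ifthenelse), here is a small standalone document. The names \yearof and \yearrange are hypothetical and exist only for this illustration; only the parameter text "#1 #2\relax" is taken from the definitions above.

```latex
\documentclass{article}
\usepackage{ifthen}
\begin{document}
% \yearof and \yearrange are hypothetical names for this sketch.
% #1 is delimited by the first space, #2 by the closing \relax.
\def\yearof#1 #2\relax{#2}

\yearof août 2001\relax  % typesets 2001

% \equal expands both of its arguments before comparing them as strings,
% which is why whatever appears inside it has to be expandable.
\newcommand{\yearrange}[2]{%
  \ifthenelse{\equal{\yearof#1\relax}{\yearof#2\relax}}%
    {\yearof#1\relax}%
    {\yearof#1\relax{} -- \yearof#2\relax}}

\yearrange{août 1999}{août 2001}  % typesets the range 1999 -- 2001

\yearrange{mai 2001}{août 2001}   % typesets 2001 only
\end{document}
```

The delimiters do all the parsing here: TeX matches the space and the \relax literally, so no explicit string-splitting code is needed.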

My first try at \@cvdate was something like:


The intent is to look at the first "token" (basically, a character or a control sequence beginning with a backslash); if it is \now, just use that, otherwise put the token back and let \@@cvdate go to town. But this exploded in some cryptic way. We have to enter the shady world of TeX "expandability" to understand why.

It's somewhat obvious that TeX macro definitions are simply templates into which parameters are substituted, although of course these templates can contain arbitrarily complicated code. In the course of this template substitution, called "expansion" (duh), some evaluation can take place; for example, primitive conditionals such as our \ifcvstyle can be evaluated and expanded. But many, many things in TeX cannot be, such as definitions made with \def and assignments to registers, things that complex macros invariably do. A macro is said to be expandable if it produces its intended result only through expansion. Many macros require their parameters to be expandable because they pick apart the results through various clever tricks. So, back to \ifthenelse: it is an extremely complicated macro that is not expandable, but the arguments passed to the predicate in its test part must be! Our \@cvdate is used in \@years in the test of an \ifthenelse and therefore needs to be expandable. Oh well, can't use \ifthenelse.
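One way to see the difference is to put a macro inside \edef, which performs expansion only. The following is a sketch with made-up macros (\isnow, \storeit and \scratch are hypothetical, and \now is given an ASCII body just for this example), not the code from my C.V.:

```latex
\documentclass{article}
\begin{document}
% Hypothetical macros, for illustration only.
\def\now{present}

% Expandable: the body is just a primitive conditional, no assignments.
\def\isnow#1{\ifx\now#1yes\else no\fi}
\edef\resultA{\isnow\now}  % \resultA now holds the characters "yes"
\edef\resultB{\isnow x}    % \resultB now holds the characters "no"

% Not expandable: \def is an assignment, which \edef copies instead of
% executing, so \scratch is still undefined when \edef reaches it.
\def\storeit#1{\def\scratch{#1}\scratch}
% \edef\resultC{\storeit{2001}}  % would fail: Undefined control sequence

\texttt{\meaning\resultA} and \texttt{\meaning\resultB}
\end{document}
```

Uncommenting the \resultC line shows the kind of failure you get when a non-expandable macro lands in an expansion-only context.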

TeX's primitive conditionals are expandable, so let's try again:


\ifx compares two tokens without expanding them and is itself expandable, so it will work fine. This seems like it should work, but it doesn't. Aargh! After using TeX's \tracingmacros feature, which, compared to Lisp's macroexpand, sucks, it becomes clear that the \fi token that should end the conditional is sucked into \@@cvdate as part of the parameters. WTF? It turns out that when TeX expands a conditional it doesn't just replace it with the tokens in the appropriate branch and proceed; instead it skips the unsuccessful branch and starts expanding the taken branch, eventually noticing the ending \else or \fi token and discarding it. But it can't notice this token at the time it is gathering tokens for the argument of a macro, because it ignores the meaning of most tokens at that time... are we totally screwed?
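A sketch of a definition that fails in exactly this way, assuming \@cvdate takes the first token of the date as its argument and is called as \@cvdate followed by the date and a closing \relax (that calling convention is an assumption made for this illustration):

```latex
\documentclass{article}
\begin{document}
\makeatletter
% Sketch only: the body of \now and this form of \@cvdate are assumptions.
\def\now{présent}
\def\@@cvdate#1 #2\relax{#2}

% Broken: tests the first token, then hands the rest to \@@cvdate.
\def\@cvdate#1{\ifx\now#1\now\else\@@cvdate#1\fi}

% A call such as
%   \@cvdate août 2001\relax
% first becomes
%   \ifx\now a\now\else\@@cvdate a\fi oût 2001\relax
% The test is false, so TeX skips to \else and expands \@@cvdate, which
% reads everything up to the first space as #1, namely "a\fi oût".
% The \fi is now argument text, and the conditional is never closed.
\makeatother
\end{document}
```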

No, there's a TeX idiom for handling this situation, using the somewhat bizarre \expandafter primitive. \expandafter reads one token without expanding it, then reads the next and expands it (which could consume other tokens if it takes arguments), then puts the first token in front of that expansion and carries on. This sounds promising: we want TeX to find the end of the conditional without beginning the expansion of \@@cvdate. Here's the final version of \@cvdate:


Note the double use of \expandafter. We have two tokens to save up and put after the end of the \ifx; you can chain together uses of \expandafter to save up multiple tokens like this.
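The idiom can be sketched as follows, again assuming that \@cvdate receives the first token of the date as its argument and that the call ends with \relax; treat it as an illustration of the technique rather than a verbatim listing:

```latex
\documentclass{article}
\begin{document}
\makeatletter
% Sketch only: the body of \now and the calling convention are assumptions.
\def\now{présent}
\def\@@cvdate#1 #2\relax{#2}

% Each \expandafter holds back one token (first \@@cvdate, then #1) while
% the \fi is expanded away, so \@@cvdate only starts reading its arguments
% once the conditional has been closed.
\def\@cvdate#1{\ifx\now#1\now\else\expandafter\@@cvdate\expandafter#1\fi}

\@cvdate août 2001\relax  % typesets 2001

\@cvdate\now\relax        % typesets présent
\makeatother
\end{document}
```

In the first call the held-back tokens are put back in front of "oût 2001\relax", so \@@cvdate sees the whole date again; in the second the true branch is taken and the leftover \relax does nothing.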

To Lisp hackers this is absolute brain damage. We could just gloat about the superiority of our macro system, but that would miss the point. The TeX macro language is designed to be usable by authors, not necessarily technical, with a syntax that seems "natural." That's why TeX supports complex parsing of arguments, as in our \@@cvdate macro. Since most of the world now uses LaTeX, which has a much more regular macro syntax by convention, this power is now wasted on end users, but at least the thought is there. Furthermore, there is a balance between writing simple substitution abbreviations and macros, which is easy in TeX, and more complicated packages that are likely to be written by motivated hackers who will invest the time to learn all the obscure incantations. And some novice macro writers will inevitably be sucked into the second category... Finally, it must be remembered that TeX was developed 25 years ago on machines that were incredibly puny by today's standards, so perhaps the strict separation between expansion and evaluation made sense. And the objective is to put glyphs on paper, not write cool programs: yet another design constraint.

I don't know if ultimately I saved any time with all this macrology, but I did learn something about TeX and macros in general. If you were going to write a Lispy typesetting program with an expansion mechanism that didn't suck, what would you do?


Faré said...

If I needed to rewrite my CV, I would do it with CL-Typesetting; and if I wanted a nicer front-end, I would do it with Exscribe... and for an even nicer one, I'd take advantage of programmable filters in the Scribble syntax to write some kind of Markdown filter.

The "hardest" part is to write the CL-Typesetting backend for Exscribe. Shouldn't be that hard, either. I might even do it someday...

One downside of Exscribe is its very poor error handling mechanism in case of syntax error. That could be alleviated if more of your code uses Markdown syntax.

Anonymous said...

There is an interesting project, Scribe Scheme.