[The `forthcoming' TUGboat article cited below appeared as `Self-replicating macros' by Victor Eijkhout and Ron Sommeling, TUGboat 13 (1992) no 1, p. 84] Date: Tue 7 Jan 92 16:43:29-EST From: Michael Downes Subject: 'Around the bend' #2, Exercise 7, solutions To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List "*** Exercise 7 (hard): " "In the September 1991 issue of Dr. Dobb's Journal, in an article "`Little Languages, Big Questions' (pp. 16--25), Ray Vald\'es "described a `little language' as a part of a more complex "application that is " " partitioned into two (or more) nested components: a core module " that provides a primitive set of services for an application area " (the ``engine''), and a surrounding module that provides " programmatic access to these services. The surrounding module is " typically a language interpreter for a simple, easily parsed " computer language--a ``little language''. " "Since TeX seems to fall into this category, I wonder if any Dr. Dobb's "readers who know TeX tried their hand at the challenge given in a "sidebar (`How Strong Is Your Little Language')? " " [An] informal benchmark of a language's computational power is the " programming exercise that Ken Thompson (coauthor of Unix) used to " pass the time in college. ... The goal is to write the shortest " self-reproducing program: ``More precisely stated ... to write a " source program that, when compiled and executed, will produce as " output an exact copy of its source.'' " "When I tried it it turned out to be a real challenge for me. In the "Unix world, for conventional compiled languages, the problem as "originally stated can assume output on the `standard output' stream; "but TeX already clutters up standard output with some of its built-in "messages. This leaves three alternatives in refining the statement of "the problem to be meaningful for TeX: " "1. Write a TeX program that includes the built-in messages in its "source in such a way that it exactly fulfills the the original problem "statement with standard output as the output stream. " "2. Pretend the built-in messages don't exist and write a TeX program "that reproduces an exact copy of itself (with no extra garbage) "in the middle of the built-in messages. " "3. Write on a different output stream. " "Take your pick, any or all of the above, and see what you can come up "with. I have solutions for 2 and 3 but have not gotten around to really "thinking about 1 yet. I believe it will require at least a different "algorithm than the other 2, if it is not impossible. Plenty of good answers for this one. >>Solution 1 (mine) % This solution is type 2 (print the copy in the middle of TeX's % built-in messages). It assumes plain.tex or similar has been % loaded to set the catcodes of the left and right curly braces. % % The idea is to assign the text to the token register \errhelp % (used merely because it is a convenient pre-existing token % register), and then print out \the\errhelp twice. There is a bit % of shuffling to ensure that \errhelp will swallow the last half of % the file and that the last half of the file is equal to the first % half, which contains all the preparations necessary to prepare % \errhelp for that swallowing and the subsequent message-sending. % % A space is left after every control word, because this is easier % than trying to prevent TeX from printing spaces after control % words when the message is eventually printed on screen. % % The lines are carefully arranged to break at column 79 % (including spaces) since this is the normal value for max_print_line, % a constant compiled into TeX which controls the length of screen % output lines. It would be easy to make the lines work out nicely % no matter what the working code required, by varying the length % of the macro name \selfcopy and using, say, \everyhbox or % \everyjob instead of \errhelp. % % The total number of tokens in this solution is 54. % {\gdef \selfcopy {\message {{\the \errhelp }}\message {{\the \errhelp }}\end } \aftergroup \errhelp \afterassignment \selfcopy } {\gdef \selfcopy {\message {{\the \errhelp }}\message {{\the \errhelp }}\end } \aftergroup \errhelp \afterassignment \selfcopy } >>EndSolution >>Solution 2 (mine) This variation is Type 3, writing the copy to a disk file instead of to the screen. The total number of tokens in this solution is 126. \immediate \openout 0=\jobname .cpy {\gdef ~#112{\errhelp {#112}\immediate \write 0{\the \errhelp }\immediate \write 0{\the \errhelp }\immediate \closeout 0 \end}} \newlinechar 13 \catcode `\#=3 \afterassignment ~\catcode 13=12 \immediate \openout 0=\jobname .cpy {\gdef ~#112{\errhelp {#112}\immediate \write 0{\the \errhelp }\immediate \write 0{\the \errhelp }\immediate \closeout 0 \end}} \newlinechar 13 \catcode `\#=3 \afterassignment ~\catcode 13=12 >>EndSolution I learned from Victor Eijkhout that he had submitted a short article to TUGboat discussing this very problem, well before I asked it here in 'Around the bend'. He kindly sent me a copy of the article, which contains a good discussion of the underlying ideas, and a couple of different solutions. To summarize briefly, he gave a Type 2 solution similar in length to mine, and also a solution that involved printing out the source file on PAPER! A 'Type 4' solution, in other words. I'm a little embarrassed that I didn't think of this, given that the whole idea of TeX is to print things on paper. >>Solution 2 (Victor Eijkhout) Forthcoming in TUGboat. >>EndSolution Although I'm giving them all together, as "Solution 3", Peter Schmitt actually sent in six different variations, including a Type 4 solution. His first solution, log-pl.tex is Type 2 like my first solution but comes in at 38 tokens, significantly shorter. His third solution is comparable to my second solution but once again significantly shorter (87 tokens). >>Solution 3 (Peter Schmitt) The principal structure of the solution is the following: \def \run { \write { \def \run { } \run } } \run The following TeX-File out-ini.tex when processed by INITeX produces a file out-ini.out that is identical to out-ini.tex (case (3) below): (The file consist of a single line, it is broken up to make comments possible - each occurrence ot the comment sign % has to be removed together with the rest of the line to produce identical output.) \catcode `\{1 \catcode `\}2 \catcode `\#6 % these \catcodes are required \def \run {% a macro to called at the end of the file \immediate \openout 1=out-ini.out% % opens output \def \select ##1:->##2{##2}% an auxiliary macro to extract the replacement text \immediate \write 1{% write the output file \catcode `\noexpand \{1 \catcode `\noexpand \}2 \catcode `\noexpand \#6 % % writes the first `line' of the output \noexpand \def \noexpand \run % writes \def \run {\expandafter \select \meaning \run }% writes the replacement text of \run \noexpand \run }% writes the last `line' of the program \immediate \closeout 1% close output file \end }% close input \run % start the macro Comments: (i) \immediate prevents that a dvi-file is produced. (ii) the tex-file can be shortened (less characters) by using shorter names, maybe also by using a controlsymbol for \noexpand, both possibilities do not reduce the number of tokens. Maybe some \space tokens can be removed but most of them are necessary because they are produced by \meaning. - \immediate may be omitted (produces dvi-file) - at least with my implementation closing the output file is not necessary (iii) The TeX-file can be modified to solve variations of the exercise: - If the file is to be processed by plain TeX \catcodes need not be set (see (1) below). - if the output file is replaced by standard output or the log file \message instead of \write can be used (see (1) and (2) below). Note that in this case macro names and spaces have to be adjusted so that the line breaks produced do not prevent processing the file (In the log file line breaks may occur even in control sequence names!) I have not (not yet?) been able to solve the exercise using more pleasant (predetermined) linebreaks. - It is possible to produce a log file that is identical to the input file. But since the log file contains the time of processing this will be the case only at a specific date and time (see (4) below). (The time is output before the input file is read. Therefore it is impossible to change this part of output by the input.) - Of course, the above variation can be modified to produce a screen output identical to the input file. - It is possible to pass a verbatim copy of the input to TeX and set it in \tt %%%%%%%%%%%%%%%%%%%%%%% Some of the variations: %%%%%%%%%%%%%%%%%%%%%%% (1) plain TeX --> section of log file or standard output terminal %%% log-pl.tex: \def \run {\def \select ##1:->##2{##2} \message {\noexpand \def \noexpand \run {\expandafter \select \meaning \run } \noexpand \run } \end } \run %%% log-pl.log This is TeX, Version 3.1(c)sb34 (preloaded format=plain3sm 91.4.28) 24 NOV 1991 02:15 ** &plain log-pl (log-pl.tex \def \run {\def \select ##1:->##2{##2} \message {\noexpand \def \noexpand \run {\expandafter \select \meaning \run } \noexpand \run } \end } \run ) No pages of output. (2) INITeX --> section of log file or standard output terminal %%% log-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run %%% log-ini.log This is TeX, Version 3.1(c)sb34 (INITEX) 24 NOV 1991 02:16 ** log-ini.tex (log-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run ) No pages of output. (3) INITeX --> output file %%% out-ini.tex (Note: A single line broken at the %'s!) \catcode `\{1 \catcode `\}2 \catcode `\#6 \def \run {\immediate \openout % 1=out-ini.out\def \select ##1:->##2{##2}\immediate \write 1{\catcode % `\noexpand \{1 \catcode `\noexpand \}2 \catcode `\noexpand \#6 \noexpand \def % \noexpand \run {\expandafter \select \meaning \run }\noexpand \run }% \immediate \closeout 1\end }\run (4) INITeX --> log file %%% flog-ini.tex This is TeX, Version 3.1(c)sb34 (INITEX) 24 NOV 1991 02:17 ** flog-ini.tex (flog-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run [0] ) Output written on flog-ini.dvi (1 page, 512 bytes). %%% flog-ini.log This is TeX, Version 3.1(c)sb34 (INITEX) 24 NOV 1991 02:18 ** flog-ini.tex (flog-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run [0] ) Output written on flog-ini.dvi (1 page, 512 bytes). (5) INI-TeX --> log-file (formatted) %%% fmt-log.tex This is TeX, Version 3.1(c)sb34 (INITEX) 30 NOV 1991 13:13 ** fmt-log (fmt-log.tex [0 \catcode `\{=1 \catcode `\}=2 \catcode `\#=6 \def \run {\newlinechar 1 \lccode `\|=1 \lccode `\[=`\{ \lccode `\]=`\} \lowercase { \def \format ##1>##2=1##3]##4[##5]##6]{##2=1|##3]|##4[|##5]|##6]|\+} \def \+ ]##12]##2]##3]##4]]##5] { ]|##12]|##2]|##3]|##4]]|##5]|} } \write 0{\catcode `\noexpand \{=1 \catcode `\noexpand \}=2} \write 0{\catcode `\noexpand \#=6} \write 0{\noexpand \def \noexpand \run } \write 0{{\expandafter \format \meaning \run }} \write 0{\noexpand \run } \end } \run ] ) Output written on fmt-log.dvi (1 page, 512 bytes). (6) INITeX --> dvi-file %%% dvi-ini.tex \catcode`\% = 13 \catcode`\{ = 1 \catcode `\} = 2 \catcode`\# = 6 \catcode `\| = 13 \catcode`\% = 13 \def \run { \lccode `\[=`\{ \lccode `\]=`\} \lccode `\/=`\% \let % = \par %% \font\tt=cmtt10 \tt % \hsize 15cm \vsize 15cm \parskip 3pt \def |{\par \hskip .5em} % \lowercase { % \def \fmt ##1>##2//##3/##4/##5/##6/##7/{|##2//|##3/|##4/|##5/|##6/|##7/|\+} % \def \+ ##1/##2/##3/##4//##5/##6/##7/{##1/|##2/|##3/|##4//|##5/|##6/|##7/|} % } % \string \catcode `\string \{ = 1 \string \catcode `\string \} = 2 % \string \catcode `\string \# = 6 \string \catcode `\string \| = 13 % \string \catcode `\string \% = 13 %% \string \def \string \run \lowercase { [} % \expandafter \fmt \meaning \run \lowercase {]} % \string \run % \end } \run >>EndSolution