% This is tree_doc.tex, the documentation for the treetex macro package
% as it will appear in the conference proceedings of the third European
% TeX meeting in Exeter, England, 1988.
\documentstyle[12pt,a4]{article}
\advance\voffset by -2cm
\clubpenalty=10000
\widowpenalty=10000
\def\addcontentsline#1#2#3{\relax}% Some captions are too long for some
                                  % TeX installations (buffer size too small)
\newenvironment{lemma}{\begingroup\samepage\begin{lemmma}\ }{\end{lemmma}%
  \endgroup}
\newtheorem{lemmma}{Lemma}[section]
\newenvironment{proof}{\begin{prooof}\rm\ \nopagebreak}{\end{prooof}}
\newcommand{\proofend}{\qquad\ifmmode\Box\else$\Box$\fi}
\newtheorem{prooof}{Proof}
\renewcommand{\theprooof}{} % makes sure that prooof doesn't get numbers
\newenvironment{Figure}{\begin{figure}\vspace{1\baselineskip}}%
  {\vspace{1\baselineskip}\end{figure}}
\newlength{\figspace} % space between figures in a single
\setlength{\figspace}{30pt} % Figure environment
\newcommand{\var}[1]{{\it #1\/}} % use it for names of variables
\renewcommand{\emph}[1]{{\em #1\/}} % use it for emphasized text
                                    % (This notion sticks to the
                                    % applicative style of markup.)
\renewcommand{\O}{{\rm O}} % O-notation, also for math mode
\newcommand{\T}{{\cal T}} % the set T in math mode
\newcommand{\TreeTeX}{Tree\TeX}
\newcommand{\fig}[1]{Figure~\ref{#1}}
\let\p\par
\input treetex
\Treestyle{\vdist{20pt}\minsep{16pt}}
\dummyhalfcenterdim@n=2pt
\def\Node(#1,#2){\put(#1,#2){\circle*{4}}}
\def\Edge(#1,#2,#3,#4,#5){\put(#1,#2){\line(#3,#4){#5}}}
\def\enode{\node{\external\type{dot}}}
\def\inode{\node{\type{dot}}}
\def\e{\node{\external\type{dot}}}
\def\i{\node{\type{dot}}}
\def\il{\node{\type{dot}\leftonly}}
\def\ir{\node{\type{dot}\rightonly}}
\newcommand{\stack}[3]{%
  \vtop{\settowidth{\hsize}{#1}%
        \setlength{\leftskip}{0pt plus 1fill}%
        \setlength{\baselineskip}{#2}#3}}
\let\multic\multicolumn
\newlength{\hd} % hidden digit
\setbox0\hbox{1}
\settowidth{\hd}{\usebox{0}}
\newcommand{\ds}{\hspace{\hd}} % digit space
\newcommand{\ccol}[1]{\multicolumn{1}{c}{#1}}
\hyphenation{post-or-der sym-bol Karls-ruhe bool-ean}
\begin{document}
\bibliographystyle{plain}
\title{Drawing Trees Nicely with \TeX\thanks{This work was supported by a Natural Sciences and Engineering Research Council of Canada Grant~A-5692 and a Deutsche Forschungsgemeinschaft Grant~Sto167/1-1. It was started during the first author's stay with the Data Structuring Group in Waterloo.}}
\author{Anne Br\"uggemann-Klein\thanks{Institut f\"ur Informatik, Universit\"at Freiburg, Rheinstr.~10--12, 7800~Freiburg, West~Germany}\ \and Derick Wood\thanks{Data Structuring Group, Department of Computer Science, University of Waterloo, Waterloo, Ontario, N2L~3G1, Canada}}
\maketitle
\begin{abstract}
Various algorithms have been proposed for the difficult problem of producing aesthetically pleasing drawings of trees, see~\cite{TidierTrees,TidyTrees} but implementations only exist as ``special purpose software'', designed for special environments.
Therefore, many users resort to the drawing facilities available on most personal computers, but the figures obtained in this way still look ``hand-drawn''; their quality is inferior to the quality of the surrounding text that can be realized by today's high quality text processing systems. In this paper we present an entirely new solution that integrates a tree drawing algorithm into one of the best text processing systems available. More precisely, we present a \TeX{} macro package \TreeTeX{} that produces a drawing of a tree from a purely logical description. Our approach has three advantages. First, labels for nodes can be handled in a reasonable way. On the one hand, the tree drawing algorithm can compute the widths of the labels and take them into account for the positioning of the nodes; on the other hand, all the textual parts of the document can be treated uniformly. Second, \TreeTeX{} can be trivially ported to any site running \TeX{}. Finally, modularity in the description of a tree and \TeX{}'s macro capabilities allow for libraries of subtrees and tree classes. In addition, we have implemented an option that produces drawings which make the \emph{structure} of the trees more obvious to the human eye, even though they may not be as aesthetically pleasing. \end{abstract} \section{Aesthetical criteria for drawing trees} One of the most commonly used data structures in computer science is the tree. As many people are using trees in their research or just as illustration tools, they are usually struggling with the problem of \emph{drawing} trees. We are concerned primarily with ordered trees in the sense of~\cite{ACP}, especially binary and unary-binary trees. A binary tree is a finite set of nodes which either is empty, or consists of a root and two disjoint binary trees called the left and right subtrees of the root. 
A unary-binary tree is a finite set of nodes which either is empty, or consists of a root and two disjoint unary-binary trees, or consists of a root and one nonempty unary-binary tree. An extended binary tree is a binary tree in which each node has either two nonempty subtrees or two empty subtrees. For these trees there are some basic agreements on how they should be drawn, reflecting the top-down and left-right ordering of nodes in a tree; see \cite{TidierTrees} and \cite{TidyTrees}. \begin{enumerate} \item[1.] Trees impose a distance on the nodes; no node should be closer to the root than any of its ancestors. \item[2.] Nodes of a tree at the same height should lie on a straight line, and the straight lines defining the levels should be parallel. \item[3.] The relative order of nodes on any level should be the same as in the level order traversal of the tree. \end{enumerate} These axioms guarantee that trees are drawn as planar graphs: edges do not intersect except at nodes. Two further axioms improve the aesthetical appearance of trees: \begin{enumerate} \item[4.] In a unary-binary tree, each left child should be positioned to the left of its parent, each right child to the right of its parent, and each unary child should be positioned below its parent. \item[5.] A parent should be centered over its children. \end{enumerate} An additional axiom deals with the problem of tree drawings becoming too wide and therefore exceeding the physical limit of the output medium: \begin{enumerate} \item[6.] Tree drawings should occupy as little width as possible without violating the other axioms. \end{enumerate} In \cite{TidyTrees}, Wetherell and Shannon introduce two algorithms for tree drawings, the first of which fulfills axioms~1--5, and the second 1--6. However, as Reingold and Tilford in \cite{TidierTrees} point out, there is a lack of symmetry in the algorithms of Wetherell and Shannon which may lead to unpleasant results. 
Therefore, Reingold and Tilford introduce a new structural axiom:
\begin{enumerate}
\item[7.] A subtree of a given tree should be drawn the same way regardless of where it occurs in the given tree.
\end{enumerate}
Note that axiom~7 constrains only the occurrences of a subtree within one given tree; the same tree may still be drawn differently when it occurs as a subtree in different trees. Reingold and Tilford give an algorithm which fulfills axioms~1--5 and~7. Although this algorithm doesn't fulfill axiom~6, the aesthetical improvements are well worth the additional space. \fig{algorithms} illustrates the benefits of axiom~7, and \fig{narrowtrees} shows that the algorithm of Reingold and Tilford violates axiom~6.
\begin{Figure}
\centering
\leavevmode\noindent
\begin{Tree}
\enode
\enode\enode\inode\enode\enode\inode\inode\inode
\node{\external\type{dot}\rght{\unskip\hskip2\mins@p\hskip2\dotw@dth}}
\enode\enode\inode\enode\enode\inode\inode\inode
\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\enode
\enode\enode\inode\enode\enode\inode\inode\inode
\enode
\enode\enode\inode\enode\enode\inode\inode\inode
\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\ \caption{The left tree is drawn by the algorithm of Wetherell and Shannon, and the tidier right one is drawn by the algorithm of Reingold and Tilford.}
\label{algorithms}
\vspace{\figspace}
\centering
\leavevmode\noindent
\begin{Tree}
\enode\enode\enode\enode\enode\enode\enode\enode\enode
\enode\inode\inode\inode
\enode\inode\inode\inode
\enode\inode\inode\inode
\enode\inode\inode\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\enode\enode\enode\enode\enode\enode\enode\enode
\node{\external\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\node{\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\node{\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\node{\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\ \caption{The left tree is drawn by the algorithm of Reingold and Tilford, but the right tree shows that narrower drawings fulfilling all aesthetic axioms are possible.}
\label{narrowtrees}
\end{Figure}

\section{The algorithm of Reingold and Tilford}

The algorithm of Reingold and Tilford (hereafter called ``the RT~algorithm'') takes a modular approach to the positioning of nodes: The relative positions of the nodes in a subtree are calculated independently of the rest of the tree. After the relative positions of two subtrees have been calculated, they can be joined as siblings in a larger tree by placing them as close together as possible and centering the parent node above them. Incidentally, the modularity principle is the reason that the algorithm fails to fulfill axiom~6; see~\cite{Complexity}. Two sibling subtrees are placed as close together as possible, during a postorder traversal, as follows. At each node \var{T}, imagine that its two subtrees have been drawn and cut out of paper along their contours. Then, starting with the two subtrees superimposed at their roots, move them apart until a minimal agreed-upon distance between the trees is obtained at each level. This can be done gradually: Initially, their roots are separated by some agreed-upon minimum distance. Then, at the next lower level, they are pushed apart until the minimum separation is established there. This process is continued at successively lower levels until the bottom of the shorter subtree is reached. At some levels no movement may be necessary; but at no level are the two subtrees moved closer together. When the process is complete, the position of the subtrees is fixed relative to their parent, which is centered over them. Since the subtrees will never be placed closer together afterwards, the postorder traversal can then be continued.
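The push-apart step can be summarized in one formula. The following is only a sketch under stated assumptions, and its notation is introduced here for illustration rather than taken from~\cite{TidierTrees}: nodes have zero width, $m$ is the agreed-upon minimum distance, $r_\ell$ is the horizontal position of the rightmost node of the left subtree at level~$\ell$ relative to its root, $l_\ell$ is the horizontal position of the leftmost node of the right subtree at level~$\ell$ relative to its root, and $k$ is the height of the shorter subtree. If the two roots are placed $s$ apart, the gap between the subtrees at level~$\ell$ is $s + l_\ell - r_\ell$, so the smallest admissible root separation is
\[
  s \;=\; \max_{0\le \ell\le k}\,\bigl(r_\ell - l_\ell + m\bigr).
\]
Since $r_0 = l_0 = 0$, this gives $s \ge m$, which corresponds to the initial separation of the roots; visiting the levels from top to bottom and pushing the subtrees apart only when necessary computes this maximum incrementally. With the parent centered over its children, the two roots end up at horizontal offsets $-s/2$ and $+s/2$ from the parent.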
A nontrivial implementation of this algorithm has been obtained by Reingold and Tilford that runs in time $\O(N)$, where $N$ is the number of nodes of the tree to be drawn. Their crucial idea is to keep track of the contour of the subtrees by special pointers, called threads, such that whenever two subtrees are joined, only the top part of the trees down to the lowest level of the smaller tree needs to be taken into account. The RT~algorithm is given in \cite{TidierTrees}. The nodes are positioned on a fixed grid and are considered to have zero width. No labelling is provided. The algorithm only draws binary trees, but is easily extendable to multiway trees.

\section{Improving human perception of trees}

It is commonly understood in book design that aesthetics and readability don't necessarily coincide, and, as Lamport (\cite{LaTeX}) puts it, books are meant to be read, not to be hung on walls. Therefore, readability is more important than aesthetics. When it comes to tree drawings, readability means that the structure of a tree must be easily recognizable. This criterion is not always met by the RT~algorithm. For example, there are trees with very different structures whose only common feature is that they have the same number of nodes at each level. The RT~algorithm might assign identical positions to these nodes, making it very hard to perceive the different structures. Hence, we have modified the RT~algorithm such that additional white space is inserted between subtrees of \emph{significant} nodes. Here a binary node is called significant if the minimum distance between its two subtrees is taken \emph{below} their root level. Setting the amount of additional white space to zero retains the original RT~placement. The effect of having nonzero additional white space between the subtrees of significant nodes is illustrated in \fig{addspace}.
Another feature we have added to the RT~algorithm is the option of drawing an unextended binary tree with the same placement of nodes as its associated extended version. We define the \emph{associated extended version} of a binary tree to be the binary tree obtained by replacing each empty subtree having a nonempty sibling with a subtree consisting of one node. This feature also makes the structure of a tree more prominent; see \fig{extended}.
\begin{Figure}
\centering
\leavevmode\noindent
\begin{Tree}
\e\il\e\e\i\i\il % the left subtree
\e\ir\il % the right subtree
\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\e\il\il\il % the left subtree
\e\e\i\e\i\il % the right subtree
\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\adds@p10pt
\begin{Tree}
\e\il\e\e\i\node{\type{dot}\lft{$\longrightarrow$}}\il % the left subtree
\e\ir\il % the right subtree
\node{\type{dot}\lft{$\longrightarrow$}}
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\e\il\il\il % the left subtree
\e\e\i\e\i\il % the right subtree
\node{\type{dot}\lft{$\longrightarrow$}}
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\ \adds@p0pt
\caption{The first two trees get the same placement of their nodes by the RT~algorithm, although the structure of the two trees is very different.
The alternative drawings highlight the structure of the trees by inserting additional white space between the subtrees of significant nodes ($\longrightarrow$).}
\label{addspace}
\end{Figure}
\begin{Figure}
\centering
\leavevmode\noindent
\begin{Tree} \e\e\i\il\e\e\i\i \end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree} \e\e\i\e\i\e\ir\i \end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\extended
\begin{Tree} \e\e\i\il\e\e\i\i \end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree} \e\e\i\e\i\e\ir\i \end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\\
\noextended
\begin{Tree} \e\e\i\e\i\e\e\i\i \end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\ \caption{In the first two drawings, the RT~algorithm assigns the same placement to the nodes of two trees although their structure is very different. The modified RT~algorithm highlights the structure of the trees by optionally drawing them like their extended counterpart, which is given in the second row.}
\label{extended}
\end{Figure}

\section{Trees in a document preparation environment}

Drawings of trees rarely stand alone; they are usually included in some text which is itself typeset by a text processing system. Therefore, a typical scenario is a pipe of three stages. First comes the tree drawing program which calculates the positioning of the nodes of the tree to be drawn and outputs a description of the tree drawing in some graphics language; next comes a graphics system which transforms this description into an intermediate language which can be interpreted by the output device; and finally comes the text processing system which integrates the output of the graphics system into the text. This scenario loses its linear structure once nodes have to be labelled, since the labelling influences the positioning of the nodes.
Labels usually occur inside, to the left of, to the right of, or beneath nodes (the latter only for external nodes), and their extents certainly should be taken into account by the tree drawing algorithm. But the labels have to be typeset first in order to determine their extents, preferably by the typesetting program that is used for the regular text, because this method ensures uniformity across the textual parts of the document and provides the author with the full power of the text processing system for composing the labels. Hence, a more complex communication scheme than a simple pipe is required. Although a system of two processes running simultaneously might be the most elegant solution, we wanted a system that is easily portable to a large range of hardware at our sites including personal computers with single process operating systems. Therefore, we thought of using a text processing system having programming facilities powerful enough to program a tree drawing algorithm and graphics facilities powerful enough to draw a tree. One text processing system that offers outstanding typographic quality and sufficiently powerful programming facilities is \TeX, developed by Knuth at Stanford University; see~\cite{TeXbook}. The \TeX{} system includes the following programming facilities:
\begin{enumerate}
\item[1.] datatypes:\\ integers~(256), dimensions\footnote{The term \emph{dimension} is used in \TeX\ to describe physical measurements of typographical objects, like the length of a word.}~(512), boxes~(256), tokenlists~(256), boolean variables~(unrestricted)
\item[2.] elementary statements:\\ $a:=\rm const$, $a:=b$ (all types);\\ $a:=a+b$, $a:=a*b$, $a:=a/b$ (integers and dimensions);\\ horizontal and vertical nesting of boxes
\item[3.] control constructs:\\ if-then-else statements testing relations between integers, dimensions, boxes, or boolean variables
\item[4.]
modularization constructs:\\ macros with up to 9~parameters (can be viewed as procedures without the concept of local variables). \end{enumerate} Although the programming facilities of \TeX{} hardly exceed the abilities of a Turing machine, they are sufficient to handle relatively small programs. How about the graphics facilities? Although \TeX{} has no built-in graphics facilities, it allows the placement of characters in arbitrary positions on the page. Therefore, complex pictures can be synthesized from elementary picture elements treated as characters. Lamport has included such a picture drawing environment in his macro package \LaTeX, using quarter circles of different sizes and line segments (with and without arrow heads) of different slopes as basic elements; see~\cite{LaTeX}. These elements are sufficient for drawing trees. This survey of \TeX's capabilities implies that \TeX{} may be a suitable text processing system to implement a tree drawing algorithm directly. We are basing our algorithm on the RT~algorithm, because this algorithm gives the aesthetically most pleasing results. In the first version presented here, we restrict ourselves to unary-binary trees, although our method is applicable to arbitrary multiway trees. But in order to take advantage of the text processing environment, we expand the algorithm to allow labelled nodes. In contrast to previous tree drawing programs, we feel no necessity to position the nodes of a tree on a fixed grid. While this may be reasonable for a plotter with a coarse resolution, it is certainly not necessary for \TeX, a system that is capable of handling arbitrary dimensions and produces device \emph{independent} output. \section{A representation method for \TeX{}trees} The first problem to be solved in implementing our tree drawing algorithm is how to choose a good internal representation for trees. 
A straightforward adaptation of the implementation by Reingold and Tilford requires, for each node, at least the following fields: \begin{enumerate} \item two pointers to the children of the node \item two dimensions for the offset to the left and the right child (these may be different once there are labels of different widths to the left and right of the nodes) \item two dimensions for the $x$- and $y$-coordinates of the final position of the nodes \item three or four labels \item one token to store the geometric shape (circle, square, framed text etc.) of the node. \end{enumerate} Because these data are used very frequently in calculations, they should be stored in registers (that's what variables are called in \TeX), rather than being recomputed, in order to obtain reasonably fast performance. This gives a total of $10N$ registers for a tree with $N$ nodes, which would exceed \TeX's limited supply of registers. Therefore, we present a modified algorithm hand-tailored to the abilities of \TeX{}. We start with the following observation. Suppose a unary-binary tree is constructed bottom-up, in a postorder traversal. This is done by iterating the following three steps in an order determined by the tree to be constructed. \begin{enumerate} \item Create a new subtree consisting of one external node. \item Create a new subtree by appending the two subtrees created last to a new binary node; see \fig{Construct}. \item Create a new subtree by appending the subtree created last as a left, right, or unary subtree of a new node; see \fig{Construct}. \end{enumerate} (A pointer to) each subtree that has been created in steps 1--3 is pushed onto a stack, and steps 2 and 3 remove two trees or one, respectively, from the stack before the push operation is carried out. Finally, the tree to be constructed will be the remaining tree on the stack. 
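As an illustration of the three construction steps, the following sketch describes a small tree in postorder. It uses only the \verb.\node. forms that appear in the preamble and figures of this document; the names a, b, c, L, R, and T in the comments are hypothetical labels for the subtrees on the stack, not part of \TreeTeX{}.
\begin{verbatim}
\begin{Tree}
\node{\external\type{dot}}  % step 1: push a       stack: a
\node{\external\type{dot}}  % step 1: push b       stack: a b
\node{\type{dot}}           % step 2: pop a and b, stack: L
\node{\external\type{dot}}  % step 1: push c       stack: L c
\node{\type{dot}\leftonly}  % step 3: pop c,       stack: L R
\node{\type{dot}}           % step 2: pop L and R, stack: T
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist
\end{verbatim}
After the last step a single tree remains on the stack: a binary root whose left subtree L is a binary node with two external children and whose right subtree R is a node with a left child only. The final line typesets the result in the same way as the figures of this document do.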
\begin{Figure}
\centering
\begin{Tree}
\treesymbol{\lvls{2}}%
\hspace{-\l@stlmoff}\usebox{\l@sttreebox}\hspace{\l@strmoff}
$+$
\treesymbol{\lvls{2}}%
\hspace{-\l@stlmoff}\usebox{\l@sttreebox}\hspace{\l@strmoff}\quad
$\Longrightarrow$\quad
\treesymbol{\lvls{2}}%
\treesymbol{\lvls{2}}%
\node{\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\end{Tree}
\vskip\baselineskip
\begin{Tree}
\treesymbol{\lvls{2}}%
\hspace{-\l@stlmoff}\usebox{\l@sttreebox}\hspace{\l@strmoff}\quad
$\Longrightarrow$\quad
\treesymbol{\lvls{2}}%
\node{\leftonly\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\quad or\quad
\treesymbol{\lvls{2}}%
\node{\unary\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\quad or\quad
\treesymbol{\lvls{2}}%
\node{\rightonly\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\end{Tree}
\caption{Construction steps 2 and 3}
\label{Construct}
\end{Figure}
This tree traversal is performed twice in the RT~algorithm. During the first pass, at each execution of step 2 or step 3, the relative positions of the subtree(s) and of the new node are computed. A closer examination of the RT~algorithm reveals that information about the subtree's coordinates is not needed during this pass; the contour information alone would be sufficient. Complete information is only needed in the second traversal, when the tree is actually drawn. Here a special feature of \TeX{} comes in that allows us to save registers. Unlike Pascal, \TeX{} provides the capability of storing a drawing in a single box register that can be positioned freely in later drawings. This means that in our implementation the two passes of the original RT~algorithm can be intertwined into a single pass, storing for each subtree on the stack its contour and its drawing. Although the latter is a complex object, it takes only one of \TeX's precious registers.
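The idea of keeping a finished drawing in one box register and positioning it later can be illustrated with the standard \LaTeX{} picture commands. This is only a sketch of the principle, not the code used inside \TreeTeX{}; the box name \verb.\subdrawing. and all coordinates are hypothetical.
\begin{verbatim}
\newsavebox{\subdrawing}
\setlength{\unitlength}{1pt}
% Draw a small subtree once; its reference point is the root at (0,0).
\sbox{\subdrawing}{%
  \begin{picture}(0,0)
    \put(0,0){\circle*{4}}        % root
    \put(0,0){\line(-1,-2){10}}   % edge to the left child
    \put(0,0){\line(1,-2){10}}    % edge to the right child
    \put(-10,-20){\circle*{4}}    % left child
    \put(10,-20){\circle*{4}}     % right child
  \end{picture}}
% Reuse the stored drawing twice at different positions.
\begin{picture}(80,30)(-20,-25)
  \put(0,0){\usebox{\subdrawing}}
  \put(40,0){\usebox{\subdrawing}}
\end{picture}
\end{verbatim}
However complex the stored drawing is, it occupies a single box register, and each \verb.\usebox. places a copy of it relative to its reference point.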
\section{The internal representation}

Given a tree, the corresponding \TeX{}tree is a box containing the ``drawing'' of the tree, together with some additional information about the contour of the tree. The reference point of a \TeX{}tree-box is always in the root of the tree. The height, depth, and width of the box of a \TeX{}tree are of no importance in this context. The additional information about the contour of the tree is stored in some registers for numbers and dimensions and is needed in order to put subtrees together to form a larger tree. \var{loff} is an array of dimensions which contains for each level of the tree the horizontal offset between the left end of the leftmost node at the current level and the left end of the leftmost node at the next level. \var{lmoff} holds the horizontal offset between the root and the leftmost node of the whole tree. \var{lboff} holds the horizontal offset between the root and the leftmost node at the bottom level of the tree. Finally, \var{ltop} holds the distance between the reference point of the tree and the leftmost end of the root. The same is true for \var{roff}, \var{rmoff}, \var{rboff}, and \var{rtop}; just replace ``left'' by ``right''. Finally, \var{height} holds the height of the tree, and \var{type} holds the geometric shape of the root of the tree. \fig{TeXtree} shows an example \TeX{}tree, i.e. a tree drawing and the corresponding additional information.
\begin{Figure}
\centering
\begin{Tree}
\e\ir\ir\e
\node{\type{dot}\rightonly\rght{\unskip\vrule height.8pt width5pt depth0pt}}%
\i % A
\end{Tree}
\leavevmode
\stack{-10pt}{\vd@st}{%
-10pt\\10pt\\10pt\\\var{loff}}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
15pt\\5pt\\-10pt\\\var{roff}}%
\vskip\baselineskip\raggedright
height:~3, type:~dot, ltop:~2pt, rtop:~2pt, lmoff:~-10pt, rmoff:~20pt, lboff:~10pt, rboff:~10pt.
\caption{A \TeX{}tree consists of the drawing of the tree and the additional information. The width of the dots is 4pt, the minimal separation between adjacent nodes is 16pt, making for a distance of 20pt center to center. The length of the small rule labelling one of the nodes is 5pt. The column left (right) of the tree drawing is the array \var{loff} (\var{roff}), describing the left (right) contour of the tree. At each level, the dimension given is the horizontal offset between the border at the current and at the next level. The offset between the left border of the root node and the leftmost node at level~1 is -10pt, the offset between the right border of the root node and the rightmost node at level~1 is 15pt, etc.}
\label{TeXtree}
\end{Figure}
Given two \TeX{}trees \var{A} and \var{B}, how can a new \TeX{}tree \var{C} be built that consists of a new root and has \var{A} and \var{B} as subtrees? An example is given in \fig{AddInfo}.
\begin{Figure}
\centering
\begin{Tree}
\e\ir\ir\e
\node{\type{dot}\rightonly\rght{\unskip\vrule height.8pt width5pt depth0pt}}%
\i % A
\end{Tree}
\leavevmode
A:
\stack{-10pt}{\vd@st}{%
-10pt\\10pt\\10pt\\\ \\\var{loff}(\var{A})}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
15pt\\5pt\\-10pt\\\ \\\var{roff}(\var{A})}%
\qquad
\begin{Tree}
\e\il\e\i\il\il\ir % B
\end{Tree}
\leavevmode
B:
\stack{-10pt}{\vd@st}{%
10pt\\-10pt\\-10pt\\-10pt\\-10pt\\\ \\\var{loff}(\var{B})}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
10pt\\-10pt\\-10pt\\10pt\\-30pt\\\ \\\var{roff}(\var{B})}%
\\[\figspace]
\begin{Tree}
\e\ir\ir\e
\node{\type{dot}\rightonly\rght{\unskip\vrule height.8pt width5pt depth0pt}}%
\i % A
\e\il\e\i\il\il\ir % B
\i % C
\end{Tree}
\leavevmode
C:
\stack{-10pt}{\vd@st}{%
-20pt\\-10pt\\%
\makebox[0pt][r]{\var{loff}(\var{A})$\smash{\left\{\vrule height\vd@st depth\vd@st width0pt\right.}$ }%
10pt\\10pt\\%
\makebox[0pt][r]{$\longrightarrow$ }%
10pt\\%
\makebox[0pt][r]{\raisebox{-.5\vd@st}{\var{loff}(\var{B})$\smash
{\left\{\vrule height.5\vd@st depth.5\vd@st width0pt\right.}$ }}%
\makebox[0pt][r]{-}10pt\\\ \\\var{loff}(\var{C})}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
20pt\\10pt\\-10pt\\-10pt%
\makebox[0pt][l]{\raisebox{-.5\vd@st}{ $\smash{\left\}\vrule height2.5\vd@st depth2.5\vd@st width0pt\right.}$\var{roff}(\var{B})}}%
\\10pt\\-30pt\\\ \\\var{roff}(\var{C})}%
\vspace{\figspace}
\centering
\begin{tabular}{|l|r|r|r|}
\hline
&\multic{1}{c|}{\var{A}}&\multic{1}{c|}{\var{B}}&\multic{1}{c|}{\var{C}}\\
\hline
height&\multic{1}{c|}{3}& \multic{1}{c|}{5}& \multic{1}{c|}{6}\\
type& \multic{1}{c|}{dot}&\multic{1}{c|}{dot}&\multic{1}{c|}{dot}\\
ltop& 2pt& 2pt& 2pt\\
rtop& 2pt& 2pt& 2pt\\
lmoff& -10pt& -30pt& -30pt\\
rmoff& 20pt& 10pt& 30pt\\
lboff& 10pt& -30pt& -10pt\\
rboff& 10pt& -30pt& -10pt\\
\hline
\end{tabular}\qquad
\begin{tabular}{|c|r|r|}
\hline
\multic{1}{|c|}{level}&\multic{1}{c|}{\var{totsep}}&
\multic{1}{c|}{\var{currsep}}\\
\hline
0&20pt&0/16pt\\
1&25pt&11/16pt\\
2&40pt&1/16pt\\
3&40pt&16pt\\
\hline
\end{tabular}
\caption{The \TeX{}trees \var{A} and~\var{B} are combined to form the larger \TeX{}\-tree~\var{C}. The small table gives the history of computation for \var{totsep} and \var{currsep}.}
\label{AddInfo}
\end{Figure}
First we determine which tree is higher; this is \var{B} in the example. Then we have to compute the minimal distance between the roots of \var{A} and \var{B}, such that at all levels of the trees there is free space of at least \var{minsep} between the trees when they are drawn side by side. For this purpose we keep track of two values, \var{totsep} and \var{currsep}. The variables \var{totsep} and \var{currsep} hold, respectively, the total distance between the roots and the distance between the rightmost node of \var{A} and the leftmost node of \var{B} at the current level.
In order to calculate \var{totsep} and \var{currsep}, we start at level 0 and visit each level of the trees until we reach the bottom level of the smaller tree; this is \var{A} in our example. At level 0, the distance between the roots of \var{A} and \var{B} should be at least \var{minsep}. Therefore, we set $\var{totsep}:=\var{minsep} + \var{rtop}(\var{A}) + \var{ltop}(\var{B})$ and $\var{currsep}:=\var{minsep}$. Using $\var{roff}(\var{A})$ and $\var{loff}(\var{B})$, we can proceed to calculate \var{currsep} for the next level. If $\var{currsep} < \var{minsep}$, we have to increase \var{totsep} by the difference and reset \var{currsep} to \var{minsep}. This process is iterated until we reach the lowest level of \var{A}. Then \var{totsep} holds the final distance between the roots of \var{A} and \var{B}, as calculated by the RT~algorithm. If the root of \var{C} is a significant node, then the additional space, which is 0pt by default, is added to \var{totsep}. However, the approach of synthesizing drawings from simple graphics characters allows only a finite number of orientations for the tree edges; therefore, \var{totsep} must be increased slightly to fit the next orientation available. Now we are ready to construct the box of \TeX{}tree~\var{C}. Simply put \var{A} and~\var{B} side by side, with the reference points \var{totsep}~units apart, insert a new node above them, and connect the parent and children by edges. Next, we update the additional information for \var{C}. This can be done by using the additional information for \var{A} and~\var{B}. Note that most components of $\var{roff}(\var{C})$ and $\var{loff}(\var{C})$ are the same as in the higher tree, which is \var{B} in our case. So, if we can avoid moving this information around, we only have to access $\var{height}(\var{A}) + \var{const}$ many counters in order to update the additional information for \var{C}.
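The computation of \var{totsep} and \var{currsep} can be traced on the example of \fig{AddInfo}. The update rule used below, $\var{currsep} := \var{currsep} - \var{roff}_\ell(\var{A}) + \var{loff}_\ell(\var{B})$ when moving from level~$\ell$ to level~$\ell+1$, is an assumption inferred from the description above and not a quotation of the macro code; it reproduces the values in the small table of the figure. With $\var{minsep}=16$pt and $\var{rtop}(\var{A})=\var{ltop}(\var{B})=2$pt we obtain
\begin{eqnarray*}
\mbox{level 0:} & & \var{totsep} = 16 + 2 + 2 = 20\mbox{pt},\quad
                    \var{currsep} = 16\mbox{pt};\\
\mbox{level 1:} & & \var{currsep} = 16 - 15 + 10 = 11\mbox{pt} < 16\mbox{pt},\quad
                    \var{totsep} = 20 + 5 = 25\mbox{pt};\\
\mbox{level 2:} & & \var{currsep} = 16 - 5 + (-10) = 1\mbox{pt} < 16\mbox{pt},\quad
                    \var{totsep} = 25 + 15 = 40\mbox{pt};\\
\mbox{level 3:} & & \var{currsep} = 16 - (-10) + (-10) = 16\mbox{pt},\quad
                    \var{totsep} = 40\mbox{pt}.
\end{eqnarray*}
After each correction \var{currsep} is reset to 16pt before the next level is visited. Only the levels 0 to $\var{height}(\var{A})=3$ are visited, although \var{C} has height~6.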
This implies that we can apply the same argument as in~\cite{TidierTrees}, which gives us a running time of $\O(N)$ for drawing a tree with $N$ nodes. Therefore, we must carefully design the storage allocation for the additional information of \TeX{}trees in order to fulfill the following requirements: If a new tree is built from two subtrees, the additional information of the new tree should share storage with its larger subtree. Organizational overhead, that is, pointers which keep track of the locations of different parts of additional information, must be avoided. This means that all the additional information for one \TeX{}tree should be stored in a row of consecutive dimension registers such that only one pointer granting access to the first element in this row is needed. On the other hand, each parent tree is higher and therefore needs more storage than its subtrees. So we must ensure that there is always enough space in the row for more information. The obvious way to fulfill these requirements is to use a stack and to allow only the topmost \TeX{}trees of this stack to be combined into a larger tree at any time. This leads to the following register allocation: A sequence of consecutive box registers contains the treeboxes of the subtrees in the stack. A sequence of consecutive token registers contains the type information for the nodes of the subtrees in the stack. For each subtree in the stack, a sequence of consecutive dimension registers contains the contour information of the subtree. The ordering of these groups of dimension registers reflects the ordering of the subtrees in the stack. Finally, a sequence of consecutive counter registers contains the height and the address of the first dimension register for each subtree in the stack. Four address counters store the addresses of the last treebox, type information, height, and address of contour information. A sketch of the register organization for a stack of \TeX{}trees is provided in \fig{Registers}.
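As a rough count for the example of \fig{AddInfo}, assume that the contour information of a tree of height~$h'$ consists of the six dimensions \var{lmoff}, \var{rmoff}, \var{lboff}, \var{rboff}, \var{ltop}, and \var{rtop} plus one pair $\var{loff}$, $\var{roff}$ per level below the root, as listed in \fig{Registers}. This is a sketch of the bookkeeping only, not a statement about the actual register numbers used. The trees \var{A}, \var{B}, and \var{C} of heights 3, 5, and 6 then need
\[
  2\cdot 3 + 6 = 12, \qquad 2\cdot 5 + 6 = 16, \qquad 2\cdot 6 + 6 = 18 = 16 + 2
\]
dimension registers, respectively. Under this assumption \var{C} can reuse the 16 registers of its higher subtree \var{B} and needs only two further registers for the offsets of its own root.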
\begin{Figure} Dimension registers\\ \var{lmoff}(1) \var{rmoff}(1) \var{lboff}(1) \var{rboff}(1) \var{ltop}(1) \var{rtop}(1)\\ \var{loff}($h_1$) \var{roff}($h_1$) \dots\ \var{loff}(1) \var{roff}(1)\\ \dots\\ \var{lmoff}($n$) \var{rmoff}($n$) \var{lboff}($n$) \var{rboff}($n$) \var{ltop}($n$) \var{rtop}($n$)\\ \var{loff}($h_n$) \var{roff}($h_n$) \dots\ \var{loff}(1) \var{roff}(1)\\ \ \\ Counter registers\\ \var{lasttreebox} \var{lasttreeheight} \var{lasttreeinfo} \var{lasttreetype}\\ \var{treeheight}(1) \var{diminfo}(1) \dots\ \var{treeheight}($n$) \var{diminfo}($n$)\\ \ \\ Box registers\\ \var{treebox}(1) \dots\ \var{treebox}($n$)\\ \ \\ Token registers\\ \var{type}(1) \dots\ \var{type}($n$) \caption{\var{lasttreebox}, \var{lasttreeheight}, \var{lasttreeinfo}, and \var{lasttreetype} contain pointers to \var{treebox}($n$), \var{treeheight}($n$), \var{lmoff}($n$), and \var{type}($n$), respectively; \var{diminfo}($i$) contains a pointer to \var{lmoff}($i$). Unused dimension registers are allowed between the dimension registers of subsequent trees. The counter registers \var{lasttreebox},\ldots,\var{diminfo}($n$) serve as a directory mechanism to access the \TeX{}trees on the stack.} \label{Registers} \end{Figure}

When a new node is pushed onto the stack, the treebox, type information, height, address of contour information, and contour information are stored in the next free registers of the appropriate type, and the four address counters are updated accordingly. When a new tree is formed from the topmost subtrees on the stack, the treebox, type information, height, and address of contour information of the new tree are stored in the registers formerly used by the bottommost subtree involved in the construction step, and the four address counters are updated accordingly. This means that this information is no longer accessible for the subtrees.
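
To illustrate this stack discipline, consider a tree whose root has a leaf $a$ as its left child and, as its right child, an internal node with the leaves $b$ and~$c$. The following trace is only a sketch; the names $a$, $b$, $c$ and the notation $(b\,c)$ for the tree built from $b$ and $c$ are introduced solely for this example.
\begin{center}
\begin{tabular}{|l|l|c|}
\hline
step&stack contents (bottom to top)&trees on stack\\
\hline
push $a$&$a$&1\\
push $b$&$a$, $b$&2\\
push $c$&$a$, $b$, $c$&3\\
combine $b$ and $c$&$a$, $(b\,c)$&2\\
combine $a$ and $(b\,c)$&$(a\,(b\,c))$&1\\
\hline
\end{tabular}
\end{center}
In the fourth step, the treebox, type information, height, and address of contour information of $(b\,c)$ take over the registers formerly used by $b$, and the corresponding registers of $c$ become free again. At no time are more than three trees on the stack.
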
The contour information of the new subtree is stored in the same registers as the contour information of the larger subtree used in the construction, apart from the left and right offset of the root to the left and right child, which are stored in the following dimension registers. That means that gaps can occur between the contour information of subsequent subtrees in the stack, namely when the right subtree, which is at a higher position on the stack, is higher than the left one. In order to avoid these gaps, the user can specify an option \verb.\lefttop. when entering a binary node, which makes the topmost tree in the stack the left subtree of the node.

This stack concept also has consequences for the design of the user interface, which is discussed in Section~\ref{Interface}.

\section{Space cost analysis}
Suppose we want to draw a unary-binary tree $T$ of height $h$ having $N$ nodes\footnote{The height $h$ and the number of nodes $N$ refer to the drawing of the tree. $N$ is the number of circles, squares, etc.\ actually drawn, and $h$ is the number of levels in the drawing minus 1.}. According to our internal representation, for each subtree in the stack we need
\begin{enumerate}
\item one box register to store the box of the \TeX{}tree.
\item one token register to store the type of the root of the subtree.
\item $2h^\prime+6$ dimension registers to store the additional information, where $h^\prime$ is the height of the subtree.
\item three counter registers to store the register numbers of the box register, the token register, and the first dimension register above.
\end{enumerate}
The following lemma bounds, in terms of $h$ and $N$, the number of subtrees of $T$ which are on the stack simultaneously and the sum of their heights.
\begin{lemma} \begin{enumerate} \item At any time, there are at most $h+1$ subtrees of $T$ on the stack.
\item For each set $\T$ of subtrees of $T$ which are on the stack simultaneously we have $$\sum_{T^\prime\in \T}({\rm ht}(T^\prime)+1) \le\min(N,{(h+1)(h+2)\over2}).$$ \end{enumerate} \end{lemma}
\begin{proof} \begin{enumerate} \item By induction on $h$.\label{stackdepth} \item The trees in $\T$ are pairwise disjoint, and each tree of height $h^\prime$ has at least $h^\prime+1$ nodes. This implies $$\sum_{T^\prime\in \T}({\rm ht}(T^\prime)+1) \le N.$$ The second bound is shown by induction on $h$. The basis $h=0$ is clear. Assume that the claim holds for all trees of height less than $h$. If $\T$ contains only subtrees of either the left or the right subtree of $T$, we have $$\sum_{T^\prime\in \T}({\rm ht}(T^\prime)+1)\le {h(h+1)\over2}\le{(h+1)(h+2)\over2}.$$ Otherwise, $\T$ contains the left or the right subtree $T_s$ of $T$. Then all elements of $\T-\{T_s\}$ belong to the other subtree. This implies \begin{eqnarray*} \sum_{T^\prime\in \T}({\rm ht}(T^\prime)+1)&\le& {\rm ht}(T_s)+1 +\sum_{T^\prime\in \T-\{T_s\}}({\rm ht}(T^\prime)+1)\\ &\le& h+{h(h+1)\over2}\le{(h+1)(h+2)\over2}.\proofend \end{eqnarray*} \end{enumerate} \end{proof}

Therefore, our implementation uses at most $9h+2\min(N,(h+1)(h+2)/2)$ registers. In order to compare this with the $10N$ registers used in the straightforward implementation, an estimate of the average height of a tree with $N$ nodes is needed. Several results, depending on the type of trees and on the randomization model, are cited in \fig{Stat}, which compares the number of registers used in a straightforward implementation with the average number of registers used in our implementation. This table clearly shows the advantage of our implementation.
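
As a check of how this bound is evaluated, consider an extended binary tree with $N=8$ nodes and take the average height $h=\sqrt{8\pi}\approx5.013$ from the first row of \fig{Stat}. The rounding in the following lines is ours:
\begin{eqnarray*}
{(h+1)(h+2)\over2}&\approx&{6.013\cdot7.013\over2}\approx21.09>8,
\quad\mbox{so}\quad\min(N,(h+1)(h+2)/2)=8,\\
9h+2\cdot8&\approx&45.12+16=61.12<80=10N.
\end{eqnarray*}
This agrees with the entry $61.12$ for 8~nodes in the table.
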
\begin{Figure} \centering
\begin{tabular}{|c|c|c|c|c|} \hline
&registers&\multicolumn{3}{c|}{average registers}\\ \cline{3-5}
nodes&(straight-&extended&unary-binary&binary\\
&forward)&binary trees&trees& search trees\\
&&($\sqrt{\pi n}$) \cite{AverageHeight}& ($\sqrt{3\pi n}$) ~\cite{BinaryTrees}& ($4.311\log n$) \cite{BinarySearchTrees}\\ \hline
\ds8& \ds80& \ds61.12& \ds94.15& \ds51.04\\
\ds9& \ds90& \ds65.86& 100.89& \ds55.02\\
10& 100& \ds70.44& 107.37& \ds58.80\\
11& 110& \ds74.91& 113.64& \ds62.41\\
12& 120& \ds79.26& 119.71& \ds65.87\\
20& 200& 111.34& 163.56& \ds90.48\\
30& 300& 147.37& 211.33& 117.31\\
40& 400& 180.89& 254.75& 132.58\\
50& 500& 212.80& 295.37& 143.54\\ \hline
\end{tabular}
\caption{The numbers of registers used by a straightforward implementation (second column) and by our modified implementation (third to fifth column) of the RT~algorithm are given for different types of trees and randomization models. The formula in parentheses indicates the average height of the respective class of trees, as depending on the number of nodes.} \label{Stat} \end{Figure}

\section{The user interface}\label{Interface}
\subsection{General design considerations}
The user interface of \TreeTeX{} has been designed in the spirit of the thorough separation of the logical description of document components and their layout; see~\cite{DocumentFormatting,GML}. This concept ensures both uniformity and flexibility of document layout and frees authors from layout problems which have nothing to do with the substance of their work. For some powerful implementations and projects see \cite{Tables,Karlsruhe,LaTeX,Grif,Scribe}. In this context, the description of a tree is given in a purely logical form, and layout variations are defined by a separate style command which is valid for all trees of a document.

A second design principle is to provide defaults for all specifications, thereby allowing the user to omit many definitions if the defaults match what he or she wants.
The node descriptions of a tree must be entered in postorder. This fits the internal representation of \TeX{}trees best. Although this is a natural method of describing a tree, a user might prefer more flexible description methods. However, note that instances of well-defined tree classes can be described easily by \TeX{} macros. In Section~\ref{ExampleClasses}, we give examples of macros for complete binary trees and Fibonacci trees.

\TreeTeX{} uses the picture making macros of \LaTeX. If \TreeTeX{} is used with any other macro package or format, the picture macros of \LaTeX{} are included automatically.

\subsection{The description of a tree}
The description of a tree is started by the command \verb.\beginTree. and closed by \verb.\endTree. (or \verb.\begin{Tree}. and \verb.\end{Tree}. in \LaTeX). The description can be started in any mode; it defines a box and two dimensions. The box is stored in the box register \verb.\TeXTree. and contains the drawing of the tree. The box has zero height and width, and its depth is the height of the drawing. The reference point is in the center of the root node of the tree. The dimensions are stored in the registers \verb.\leftdist. and \verb.\rightdist. and describe the distance between the reference point and the left and right margin of the drawing. These data can be used to position the drawing of the tree.

Note that the \TreeTeX{} macros don't contribute anything to the current page but only store their results in the registers \verb.\TeXTree., \verb.\leftdist., and \verb.\rightdist.. It is the user's job to put the drawing onto the page, using the commands \verb.\copy. or \verb.\box. (or \verb.\usebox. in \LaTeX).

Each matching pair of \verb.\beginTree. and \verb.\endTree. must contain the description for only \emph{one} tree. Descriptions of trees cannot be nested, and new registers cannot be allocated inside a matching pair of \verb.\beginTree. and \verb.\endTree..
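
As an illustration of these conventions, the following sketch describes a tree with three nodes and puts it onto the page. The node options \verb.\external. and \verb.\type{dot}. are the ones used in the preamble of this document. That a node without an outdegree option is binary, and the particular way of reserving the horizontal space, are assumptions of this sketch and not prescriptions of \TreeTeX.
\begin{verbatim}
\beginTree
  \node{\external\type{dot}}  % left leaf
  \node{\external\type{dot}}  % right leaf
  \node{\type{dot}}           % root, entered last (postorder)
\endTree
% The box has zero width, so the space to the left and right
% of the reference point is reserved by hand.
\centerline{\hskip\leftdist\box\TeXTree\hskip\rightdist}
\end{verbatim}
If the same drawing is needed more than once, \verb.\copy\TeXTree. can be used in place of \verb.\box\TeXTree., because \verb.\box. empties the register.
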
As already stated, each tree description defines the nodes of the tree in postorder, that is, a tree description is a particular sequence of node descriptions. A node description, in turn, consists of the macro \verb.\node., followed by a list of node options, included in braces. The list of node options may be empty. The node options describe the labels, the geometric shape (type), and the outdegree of the node. Default values are provided for all options which are not explicitly specified. The following node options are available: \begin{enumerate} \item[1.] \verb.\lft{