A troff to HTML Conversion Program
troff_to_html - a TROFF to HTML Convertor
Paul E. Dunne
Department of Computer Science
University of Liverpool
Liverpool L69 3BX
Great Britain
ped@uk.ac.liv.csc
http://www.csc.liv.ac.uk/users/ped
Description
The following describes a very basic text conversion facility
for translating a troff source file (using the -ms macro set) to
an HTML file suitable for viewing using Mosaic or Netscape.
The translator is invoked (locally) via the command
~ped/bin/troff_to_html < input_file > output_file
where input_file is a troff source file (that may include directives
for the pre-processors pic and eqn), built using the -ms macro set.
output_file will be an HTML source file.
A (uuencoded compressed) binary (HP7000 Series) is here
Source program (written in Ada) is here
A C++ version (compiling under c89) has been produced by Dave McGaw and
may be found here. Please note that this
requires a dictionary file for special characters located here
and an include (.h) file that is here.
-ms macros Recognised and Translated
- .NH Numbered headings (parameters to .NH are ignored, e.g. .NH 2 is processed as .NH)
- .SH Un-numbered headings
- .LP New paragraph; also serves as a break for .IP (see below)
- .PP New paragraph; also serves as a break for .IP (see below)
- .IP Indented paragraphs (paragraph labels are ignored); these are processed
  as un-ordered lists, the list being terminated when a .LP or .PP directive is encountered.
- .DS/.DE The text between these is left unaltered, hence .DS/.DE translate to HTML <pre> and </pre>.
- .EQ/.EN An (extremely crude) attempt is made to process the eqn text between these.
- .PS/.PE Translated to an anchor to a given file `picture<n>.gif' (where <n> starts from 1).
  This file should be created (by the user) to hold the relevant picture.
- .TS/.TE Simple table handling.
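The macro handling listed above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the Ada or C++ source: the function name `translate_ms`, the choice of `<h2>` for headings, and the exact tags emitted for paragraphs and list items are all assumptions. The .EQ/.EN and .TS/.TE pairs are not modelled here.

```python
import html


def translate_ms(lines):
    """Translate troff -ms source lines into a list of HTML lines (sketch)."""
    out = []
    in_list = False   # inside a run of .IP paragraphs
    in_pic = False    # between .PS and .PE (the pic source itself is skipped)
    heading = False   # the next text line is a heading title
    pictures = 0      # counter for picture<n>.gif, starting from 1

    def close_list():
        nonlocal in_list
        if in_list:
            out.append("</ul>")
            in_list = False

    for line in lines:
        macro = line.split()[0] if line.startswith(".") else None
        if in_pic:
            in_pic = macro != ".PE"
        elif macro in (".NH", ".SH"):
            # Parameters such as the 2 in ".NH 2" are ignored.
            close_list()
            heading = True
        elif macro in (".LP", ".PP"):
            close_list()
            out.append("<p>")
        elif macro == ".IP":
            # Paragraph labels are ignored; each .IP becomes a list item.
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>")
        elif macro == ".DS":
            out.append("<pre>")
        elif macro == ".DE":
            out.append("</pre>")
        elif macro == ".PS":
            pictures += 1
            name = f"picture{pictures}.gif"
            out.append(f'<a href="{name}">{name}</a>')
            in_pic = True
        elif macro is None:
            text = html.escape(line, quote=False)
            if heading:
                out.append(f"<h2>{text}</h2>")  # heading level is an assumption
                heading = False
            else:
                out.append(text)
        # Any other line opening with a full stop is ignored.
    close_list()
    return out
```

Calling `translate_ms` on a list of source lines returns the HTML lines in order; the user would still have to supply each `picture<n>.gif` file by hand, as noted above.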
Extensions August 1996: Two artificial directives have been added to allow printing
of $ symbols. These are:
- .NE (No Eqn) Treat $ in text as a character, not as the opening of an eqn expression (default).
- .EO (Eqn On) Treat $ in text as the start of an eqn expression.
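The effect of the two directives can be pictured as a single on/off switch. The Python sketch below is an assumption-laden illustration: the name `translate_dollars` is hypothetical, and wrapping the eqn text in `<i>` merely stands in for whatever the translator really does with it. The `$^` literal form is not modelled.

```python
import re

# A $...$ span with no $ inside it.
EQN_SPAN = re.compile(r"\$([^$]*)\$")


def translate_dollars(lines):
    """Apply .NE/.EO switching to a list of source lines (sketch)."""
    out = []
    eqn_on = False  # .NE behaviour is the default
    for line in lines:
        first = line.split()[:1]
        if first == [".NE"]:
            eqn_on = False
        elif first == [".EO"]:
            eqn_on = True
        elif eqn_on:
            # Placeholder rendering: the real eqn processing is not specified.
            out.append(EQN_SPAN.sub(lambda m: f"<i>{m.group(1)}</i>", line))
        else:
            out.append(line)  # $ is an ordinary character
    return out
```

With the switch off, a line such as "costs $5 and $6" passes through unchanged, which is the purpose of the .NE directive.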
Ada Version
No limit on the number of table rows or columns. New version (July 1996) uses HTML <table>
feature.
C++ Version
Also incorporates HTML <table> (errors in previous version corrected, July 1996).
Multiple row formats allowed (max 70 rows/20 columns).
Note: No font changes or eqn inside table entries. Multi-column spans generally
don't look very good. It is assumed that the table specification is valid (otherwise
the conversion program will crash).
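As a rough picture of simple table handling, the Python sketch below converts the lines between .TS and .TE into HTML <table> markup. It is not taken from either version of the program. It assumes the usual tbl layout (an optional options line ending in `;`, format lines ending with a line terminated by `.`, then tab-separated data rows), and the name `table_to_html` is hypothetical. In line with the note above, it handles no font changes, eqn, or column spans, and it raises an error on a malformed specification.

```python
import html


def table_to_html(body):
    """Convert the lines between .TS and .TE into HTML table lines (sketch)."""
    rows = list(body)
    # Drop the optional global options line, e.g. "center;".
    if rows and rows[0].rstrip().endswith(";"):
        rows = rows[1:]
    # The format section ends with the first line terminated by ".".
    for i, line in enumerate(rows):
        if line.rstrip().endswith("."):
            rows = rows[i + 1:]
            break
    else:
        raise ValueError("table format section is not terminated by '.'")
    out = ["<table>"]
    for line in rows:
        if line.strip() in ("_", "="):
            continue  # horizontal rules are dropped in this sketch
        cells = "".join(
            f"<td>{html.escape(cell, quote=False)}</td>" for cell in line.split("\t")
        )
        out.append(f"<tr>{cells}</tr>")
    out.append("</table>")
    return out
```

The format lines are read only to find where the data starts; column alignment is discarded, which keeps the sketch short at the cost of fidelity.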
troff Directives Recognised and Translated
- .ce Printed as a 3rd level heading. Parameters to .ce are ignored.
- .ft Change font. Roman (R), Italic (I), Bold (B), Courier (C) are all recognised.
- .ul Print next line of text in italic font.
- .so Creates an anchor to the file indicated by the .so command. The link in
  the text will be indicated as `anchor<n>'.
- .br/.sp Causes a line break. Parameters to .sp are ignored.
- Text in the form $...$ is regarded as introducing an eqn expression. If the first
  character after $ is the character ^ then the text is left in its literal form.
In addition, diacritical marks (umlaut, grave, and acute accents) are dealt with.
Note: All lines opening with a full-stop are ignored unless the line
is a directive recognised by the translator.
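The directive rules in this section, including the rule that unrecognised lines opening with a full stop are dropped, can be sketched as follows. This Python fragment is illustrative only. The name `translate_requests`, the mapping of Courier to `<tt>`, and the choice to number anchors from 1 are assumptions; $...$ text and diacritical marks are not handled.

```python
import html

# Font letter -> HTML tag; Roman closes any open font tag.
FONT_TAGS = {"R": None, "I": "i", "B": "b", "C": "tt"}


def translate_requests(lines):
    """Translate plain troff directives into a list of HTML lines (sketch)."""
    out = []
    font = None      # currently open font tag, if any
    pending = None   # tag to wrap around the next text line (.ce or .ul)
    anchors = 0      # counter for anchor<n> link text

    def set_font(tag):
        nonlocal font
        if font:
            out.append(f"</{font}>")
        font = tag
        if font:
            out.append(f"<{font}>")

    for line in lines:
        if line.startswith("."):
            words = line.split()
            request, args = words[0], words[1:]
            if request == ".ce":
                pending = "h3"   # parameters to .ce are ignored
            elif request == ".ul":
                pending = "i"
            elif request == ".ft" and args and args[0] in FONT_TAGS:
                set_font(FONT_TAGS[args[0]])
            elif request == ".so" and args:
                anchors += 1
                out.append(f'<a href="{html.escape(args[0])}">anchor{anchors}</a>')
            elif request in (".br", ".sp"):
                out.append("<br>")  # parameters to .sp are ignored
            # Every other line opening with a full stop is ignored.
            continue
        text = html.escape(line, quote=False)
        if pending:
            text = f"<{pending}>{text}</{pending}>"
            pending = None
        out.append(text)
    set_font(None)  # close any font tag left open at end of input
    return out
```

Keeping the font state in one variable means a later .ft always closes the previous tag before opening the next, so the emitted markup stays balanced.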