Converting LaTeX to HTML
Tags: software
Since I write almost everything in LaTeX these days, be it personal stuff (letters, essays, documentation) or academic things (papers, reports, my thesis), I was interested in how to publish these documents in other formats. While PDF and PostScript files are great for storage and printing, nothing beats HTML in its simplicity and ubiquity.
So far I have tried two different programs, latex2html and tth. They follow the same premise (converting a LaTeX document into a series of HTML files) but differ in their approach.
latex2html
latex2html is a rather old tool and apparently is not updated anymore. Consequently, its output is rather peculiar and it supports HTML 4.0 only: no XHTML and definitely no strict variant. Here is a loose list of my experiences while trying to convert several LaTeX documents:
- I ran into some problems concerning German umlauts (which I specified as "a, for example). latex2html expects these to be specified as \"{a}.
- If latex2html encounters an unknown environment, it falls back to the LaTeX interpreter and generates a picture instead. This is also done when mathematical formulas are involved.
- There is no possibility (to my knowledge) for generating the body of the document only. This makes inclusion of LaTeX content for an existing website harder.
- Theming the generated files is possible, albeit cumbersome.
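
The umlaut problem from the first item could be worked around with a small preprocessing step before running latex2html. The following is only a minimal sketch in Python, not a feature of latex2html itself. The function name expand_umlaut_shorthands is hypothetical, and the sketch assumes that every unescaped "a, "o, "u (and the uppercase forms) in the source is meant as an umlaut. It does not parse LaTeX, so it would also rewrite such sequences inside verbatim blocks.

```python
import re

# Matches a double quote followed by a vowel that can carry an umlaut.
# The negative lookbehind skips sequences that are already escaped as \"a.
_UMLAUT_SHORTHAND = re.compile(r'(?<!\\)"([aouAOU])')


def expand_umlaut_shorthands(tex: str) -> str:
    """Rewrite shorthands such as "a into the explicit form \\"{a}."""
    # In the replacement template, \\ is a literal backslash and \1 is the vowel.
    return _UMLAUT_SHORTHAND.sub(r'\\"{\1}', tex)


if __name__ == "__main__":
    print(expand_umlaut_shorthands('M"adchen und "Ubung'))
```

The converted text can then be written to a temporary file and passed to latex2html in place of the original source.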
All in all, latex2html was not enough for my purposes. So the search continued and I eventually arrived at tth.
tth
tth takes a rather novel approach: instead of generating images for mathematical formulas, it tries to generate HTML code that resembles the formula as closely as possible. All in all, this works quite well, even with complex documents. Here are my notes:
- By default, a single HTML page is generated. This is done quite fast, even for a larger document.
- tth is very resilient concerning unknown commands. It tries to parse the whole document and simply ignores erroneous sections.
- The layout is represented very well: tables, sections, it is all there.
- Short formulas are easily readable. For longer formulas, I find the output of tth tedious to read.
- tth is very tunable: there is even an option for generating the body of the document only.
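
To illustrate how the body-only mode from the last item might be scripted, here is a minimal sketch in Python. tth works as a filter, reading LaTeX on standard input and writing HTML to standard output. The -r flag used below for raw output without the surrounding page is an assumption based on my recollection of the tth manual and should be checked there before relying on it. The helper names tth_command and convert are hypothetical.

```python
import subprocess


def tth_command(body_only=False):
    """Build the argument list for a tth call."""
    cmd = ["tth"]
    if body_only:
        # Assumption: -r asks tth for raw HTML without preamble and postlude.
        cmd.append("-r")
    return cmd


def convert(tex_path, html_path, body_only=False):
    """Pipe a LaTeX file through tth and store the resulting HTML."""
    with open(tex_path, "rb") as src, open(html_path, "wb") as dst:
        # check=True raises CalledProcessError if tth exits with an error.
        subprocess.run(tth_command(body_only), stdin=src, stdout=dst, check=True)
```

Calling convert("thesis.tex", "thesis-body.html", body_only=True) would then produce a fragment that can be pasted into an existing page template; the file names here are placeholders.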
Conclusion
All in all, tth proved to be sufficient for my purposes. Yet, there seems to be a lack of publishing software for LaTeX sources. This is a pity, as publishing documents in several formats at once without many adjustments would be very useful. Another possibility would be to render PDF files inside the browser (the currently available plugins are rather disappointing, in my opinion), although I do not like this option as it makes the browser even more bloated. Furthermore, compared to PDF, HTML still offers some advantages in readability (especially for readers with disabilities).