Reasonable automatic HTML to PDF conversion (in a UNIX/Linux environment)

Posted by Alex Balashov on Server Fault
Published on 2009-08-01T18:40:15Z Indexed on 2010/04/01 22:13 UTC

Is there a way to generate PDF documents from HTML files automatically on Linux, such that the PDF bears a reasonable resemblance to the input file?

A command-line tool - as opposed to an interactive GUI of some kind - is key.

I have tried htmldoc and some of its cousins, of course, but these tools are hopelessly stone-age; htmldoc doesn't support CSS at all, and few HTML documents these days get by without at least some CSS styling. I don't really care about fancy effects or minor embellishments. The issue is that CSS is at the core of most layouts now; not many folks are still using six layers of nested tables. So if the conversion tool has no grasp of CSS whatsoever, it's not just a matter of "the document doesn't look quite right"; the output is likely to fall short of the minimum standard of usability altogether.
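To make the requirement concrete, here is a minimal sketch of the kind of unattended batch run in question, wrapped in Python. It assumes htmldoc is installed and on the PATH. The names `htmldoc_command` and `convert_dir` and the directory layout are hypothetical, made up for illustration. `--webpage` and `-f` are htmldoc's usual options for converting a plain page to a named output file, but check them against your installed version.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def htmldoc_command(src: Path, dst: Path) -> List[str]:
    """Build an htmldoc invocation for a single HTML file."""
    # --webpage: treat the input as a plain page, not a structured book.
    # -f: name of the output file.
    return ["htmldoc", "--webpage", "-f", str(dst), str(src)]


def convert_dir(
    src_dir: Path,
    out_dir: Path,
    build: Callable[[Path, Path], List[str]] = htmldoc_command,
    run=subprocess.run,
) -> List[Path]:
    """Run the converter once per *.html file in src_dir (non-recursive).

    Returns the target PDF paths in sorted order of the input names.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    targets = []
    for src in sorted(src_dir.glob("*.html")):
        dst = out_dir / (src.stem + ".pdf")
        # check=True raises CalledProcessError if the converter exits non-zero.
        run(build(src, dst), check=True)
        targets.append(dst)
    return targets
```

Calling `convert_dir(Path("site"), Path("pdf"))` would convert every HTML file in a hypothetical `site` directory with no interaction. Since the command line is built by the `build` argument, a CSS-aware converter could be swapped in later by passing a different function, without touching the loop.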

Some folks have suggested using the Gecko rendering engine to generate images that can then be converted to PDFs, but I have no idea how one would go about doing this, let alone easily.

I have no trouble believing that there are good commercial tools that do this, but I'm really looking for an open-source package if possible, as the endeavour itself is an open-source one and doesn't pay.

Thanks in advance!

© Server Fault or respective owner
