HTML to 'pretty' text conversion for printing on a text-only (dot matrix) printer

Posted by Gala101 on Stack Overflow
Published on 2010-05-14T15:40:19Z

Hi,

I have a website that generates some simple tabular data as HTML tables. Many of my users print the web page on a laser or inkjet printer; however, some like to print on legacy dot matrix printers (text only), and therein lies the problem.

When printing from a web browser to a dot matrix printer, the printer treats the data as a graphic/image and prints it dot by dot.

i.e., when printing a character 'C', the printer slices it horizontally and prints it in 3-4 passes.

The same printer prints text from an ASCII file (say, from Notepad) as complete characters in a single pass, making it about 5 times faster and much quieter than when printing a web page.

(I even tried the 'generic text-only driver', but Mozilla Firefox has a known bug: since 2.0+ it does not print anything through this particular driver.)

So is there some clean way of formatting already generated HTML (say, a method that takes the entire HTML table as a string) and generating a corresponding text file with properly aligned columns?

I have tried stripping the HTML tags, but the major issue there is wrapping a cell's data well while maintaining the integrity of the other cells' data (from the same row).

e.g. ('|' and '_' are not really required):

Col1    |  Col2      |  Colum_Name3  |  Col4   |
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
1       |  this cell | this column   | smaller |
        | is in three| spans 2 rows  |         |
        | rows       |               |         |
- - - - - - - - - - - - - - - - - - - - - - - - 
2       | smaller now| this also     | but this|
        |            |               | cell's  |
        |            |               | data is |
        |            |               | now     |
        |            |               | bigger  |
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
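
To make the requirement concrete, here is a minimal Python sketch of the kind of method meant above, using only the standard library (`html.parser` and `textwrap`). The names `TableExtractor` and `html_table_to_text` and the `widths` parameter are hypothetical, chosen for illustration. It assumes one simple table with fixed column widths supplied by the caller, and it ignores `rowspan`/`colspan` and nested tables.

```python
from html.parser import HTMLParser
from itertools import zip_longest
import textwrap


class TableExtractor(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace into single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_text(html, widths, sep=" | "):
    """Render an HTML table as plain text with wrapped, aligned columns.

    `widths` (an assumption of this sketch) gives each column's width
    in characters.
    """
    parser = TableExtractor()
    parser.feed(html)
    parser.close()

    rule = "-" * (sum(widths) + len(sep) * (len(widths) - 1))
    out = []
    for row in parser.rows:
        # Wrap every cell to its column width; an empty cell is one blank line.
        wrapped = [textwrap.wrap(cell, w) or [""] for cell, w in zip(row, widths)]
        # zip_longest pads shorter cells with blanks so the row stays aligned.
        for parts in zip_longest(*wrapped, fillvalue=""):
            line = sep.join(p.ljust(w) for p, w in zip(parts, widths))
            out.append(line.rstrip())
        out.append(rule)
    return "\n".join(out)
```

Calling `html_table_to_text(table_html, [8, 11, 14, 8])` on a table like the one above would return a string that can be written to a plain text file and sent to the printer. Each row becomes as tall as its tallest wrapped cell, and the other cells in that row are padded with blanks.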

Could you please suggest a preferred approach?

I've thought of using XSLT and somehow outputting text (instead of the more prevalent PDF), but Apache FOP's text renderer is really broken and perhaps forgotten in the development path. Commercial ones are way too costly.
