basic operations for modifying a source document with XSLT

Posted by SpliFF on Stack Overflow
Published on 2010-03-09T03:14:34Z

All the tutorials and examples of XSLT processing I've found seem to assume that the destination will have a significantly different format or structure from the source, and that you know the structure of the source in advance. I'm struggling to find out how to perform simple "in-place" modifications to an HTML document without knowing anything else about its existing structure.

Could somebody show me a clear example that, given an arbitrary, unknown HTML source, will:

1.) delete the class name 'foo' from all divs
2.) delete a node if it's empty (i.e. <p></p>)
3.) delete a <p> node if its first child is <br>
4.) add newattr="newvalue" to all <h1> elements
5.) replace 'heading' in text nodes with 'title'
6.) wrap all <u> tags in <b> tags (i.e. <u>foo</u> -> <b><u>foo</u></b>)
7.) output the transformed document without changing anything else

The above examples are the primary types of transform I wish to accomplish. Understanding how to do the above will go a long way towards helping me build more complex transforms.

To help clarify and test the examples, here is a sample source and its expected output. I must reiterate, however, that I want to work with arbitrary documents without rewriting the XSLT for each source:

<!doctype html>
<html>
<body>
  <h1>heading</h1>
  <p></p>
  <p><br>line</p>
  <div class="foo bar"><u>baz</u></div>
  <p>untouched</p>
</body>
</html>

output:

<!doctype html>
<html>
<body>
  <h1 newattr="newvalue">title</h1>
  <div class="bar"><b><u>baz</u></b></div>
  <p>untouched</p>
</body>
</html>
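
For reference, a common starting point for this kind of edit is the identity transform: one template copies every node and attribute as-is, and more specific templates override it only where something should change. The stylesheet below is a minimal sketch of the seven points in that style, not a definitive answer. It assumes XSLT 1.0, input that has already been made well-formed XML (so <br/> rather than <br>), lowercase element names, and no XHTML namespace on the elements. The template name replace-heading is made up for this illustration. Rule 2 is limited to <p> on purpose, because matching every empty element would also delete <br/> and <img/>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="no"/>

  <!-- 7. Identity template: copy everything unchanged unless a more
       specific template below matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- 1. Drop the first 'foo' token from a div's class attribute;
       omit the attribute entirely if nothing else is left. -->
  <xsl:template match="div/@class">
    <xsl:variable name="padded" select="concat(' ', normalize-space(.), ' ')"/>
    <xsl:choose>
      <xsl:when test="contains($padded, ' foo ')">
        <xsl:variable name="rest" select="normalize-space(concat(
            substring-before($padded, ' foo '), ' ',
            substring-after($padded, ' foo ')))"/>
        <xsl:if test="$rest != ''">
          <xsl:attribute name="class">
            <xsl:value-of select="$rest"/>
          </xsl:attribute>
        </xsl:if>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- 2. Delete <p> elements that have no child nodes at all. -->
  <xsl:template match="p[not(node())]"/>

  <!-- 3. Delete <p> elements whose first child node is a <br>. -->
  <xsl:template match="p[node()[1][self::br]]"/>

  <!-- 4. Copy <h1>, adding newattr="newvalue" after its own attributes. -->
  <xsl:template match="h1">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="newattr">newvalue</xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- 5. Replace 'heading' with 'title' in text nodes. XSLT 1.0 has no
       replace() function, so a recursive named template does the work. -->
  <xsl:template match="text()[contains(., 'heading')]">
    <xsl:call-template name="replace-heading">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: emits $text with each 'heading' replaced. -->
  <xsl:template name="replace-heading">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, 'heading')">
        <xsl:value-of select="substring-before($text, 'heading')"/>
        <xsl:text>title</xsl:text>
        <xsl:call-template name="replace-heading">
          <xsl:with-param name="text"
              select="substring-after($text, 'heading')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- 6. Wrap every <u> in a <b>, keeping its attributes and content. -->
  <xsl:template match="u">
    <b>
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </b>
  </xsl:template>

</xsl:stylesheet>
```

The overrides win over the identity template because their match patterns have a higher default priority than @*|node(). Two caveats apply to this sketch. The whitespace text nodes around a deleted <p> are still copied, so the result can contain extra blank lines compared with the expected output above. The doctype is also not reproduced, since it is not part of the node tree that the templates see.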

© Stack Overflow or respective owner