Replace HTML tags within XML content with WordML formatting tags

Posted by Josh on Stack Overflow
Published on 2010-06-06T21:25:30Z

I am taking an XML document and creating a Word document using XSLT and OpenXML. The problem is that when I create the Word document, the HTML inside the CDATA sections is not rendered and shows up as literal markup, like this:

GET /recipe/recipe/cat.php/>"><script>alert(document.domain)</script>

I have tried defining "cdata-section-elements" in my xsl:output; however, I receive an error stating that the p tag doesn't match the w:t tag (the p tag is part of the CDATA HTML).
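
For reference, a minimal sketch of the attribute mentioned above. It assumes the `w` prefix is bound to the WordprocessingML main namespace, since the original stylesheet header is not shown. As the XSLT 1.0 specification describes it, `cdata-section-elements` only changes how text nodes are serialized in the output. It does not parse markup held in an input CDATA section, so the `<P>` markers would most likely still reach Word as literal text.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

  <!-- Text children of every w:t element are written as CDATA sections
       rather than as escaped character data. The parsed content is the same. -->
  <xsl:output method="xml" indent="no" cdata-section-elements="w:t"/>

</xsl:stylesheet>
```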

Here is what one of my xsl templates looks like:

  <xsl:template match="SECTION">
    <w:p w:rsidR="00272D24" w:rsidRPr="00272D24" w:rsidRDefault="00272D24">
      <w:pPr>
        <w:rPr>
          <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
        </w:rPr>
      </w:pPr>
      <w:r w:rsidRPr="00272D24">
        <w:rPr>
          <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
        </w:rPr>
        <w:t>
          <xsl:value-of select="INFORMATION"/>
        </w:t>
      </w:r>
    </w:p>
  </xsl:template>

Here is what the XML looks like:

        <INFORMATION>
          <![CDATA[
              <P> line 1 of information
              <P> line 2 of information.......]]>
        </INFORMATION>

Here is what the Word output looks like (extra white space and poor formatting):

DIAGNOSIS:
<P>          line 1 of information.            <P>           line 2 of information 

I need to either render the HTML or strip it out. If I strip it out, it seems I would have to search for every possible HTML element, which is madness! Any help at all would be appreciated.
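
One possible approach, sketched here in XSLT 1.0 under several assumptions, avoids listing HTML elements at all. It treats each literal `<P>` marker as a paragraph break and drops anything else that sits between a `<` and the next `>`. The template names `paragraphs` and `strip-tags` are hypothetical helpers, the `w` namespace URI is assumed because the original stylesheet header is not shown, and the `INFORMATION` template is assumed to run where `w:p` elements are allowed (not inside an existing `w:p`).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

  <!-- Assumption: called from a context where w:p children are valid. -->
  <xsl:template match="INFORMATION">
    <xsl:call-template name="paragraphs">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: emits one w:p per chunk of text between literal
       <P> markers. XSLT 1.0 has no split function, so it recurses. -->
  <xsl:template name="paragraphs">
    <xsl:param name="text"/>
    <xsl:variable name="marker" select="'&lt;P&gt;'"/>
    <!-- Text up to the next marker, or all of it when no marker is left. -->
    <xsl:variable name="head">
      <xsl:choose>
        <xsl:when test="contains($text, $marker)">
          <xsl:value-of select="substring-before($text, $marker)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$text"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <!-- Remove any other tags from this chunk. -->
    <xsl:variable name="plain">
      <xsl:call-template name="strip-tags">
        <xsl:with-param name="text" select="string($head)"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Skip chunks that are empty once whitespace is collapsed. -->
    <xsl:if test="normalize-space($plain) != ''">
      <w:p>
        <w:r>
          <w:rPr>
            <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
          </w:rPr>
          <w:t><xsl:value-of select="normalize-space($plain)"/></w:t>
        </w:r>
      </w:p>
    </xsl:if>
    <!-- Continue with whatever follows the marker. -->
    <xsl:if test="contains($text, $marker)">
      <xsl:call-template name="paragraphs">
        <xsl:with-param name="text" select="substring-after($text, $marker)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Hypothetical helper: copies $text but skips everything from a '<'
       through the next '>'. -->
  <xsl:template name="strip-tags">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&lt;') and contains(substring-after($text, '&lt;'), '&gt;')">
        <xsl:value-of select="substring-before($text, '&lt;')"/>
        <xsl:call-template name="strip-tags">
          <xsl:with-param name="text"
              select="substring-after(substring-after($text, '&lt;'), '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The sketch is meant to be merged into an existing stylesheet rather than run on its own. It has known limits: the marker match is case-sensitive, so a lowercase `<p>` is stripped but does not start a new paragraph. A bare `<` in ordinary text that is followed later by a `>` is treated as a tag. HTML entities inside the CDATA text are passed through unchanged. `normalize-space` collapses the indentation that caused the stray white space in the Word output.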

Thanks.

© Stack Overflow or respective owner
