Ignoring "Content is not allowed in trailing section" SAXException

Posted by Paul J. Lucas on Stack Overflow
Published on 2010-05-11T23:17:52Z Indexed on 2010/05/11 23:34 UTC

I'm using Java's DocumentBuilder.parse(InputStream) to parse an XML document. Occasionally I get malformed documents that have extra junk after the final >, which causes a SAXException: Content is not allowed in trailing section. (In the cases I've seen, the junk is simply one or more null bytes.)

I don't care what's after the final >. Is there an easy way to parse an entire XML document in Java and have it ignore any trailing junk?

Note that by "ignore" I don't simply mean catching and swallowing the exception: I mean ignoring the trailing junk, throwing no exception, and returning the Document object, since the XML up to and including the final > is valid.
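
To make the desired behavior concrete, here is a minimal sketch of one possible approach, not a built-in parser option: buffer the stream, drop everything after the last > byte, and hand the trimmed bytes to DocumentBuilder. The class name TrailingJunkTolerantParser is hypothetical. The sketch assumes the document fits in memory, that the encoding represents > as the single byte 0x3E (true for UTF-8 and other ASCII-compatible encodings, not for UTF-16), and that the trailing junk itself contains no > byte.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper: parses XML while ignoring bytes after the final '>'.
final class TrailingJunkTolerantParser {

    static Document parse(InputStream in)
            throws IOException, ParserConfigurationException, SAXException {
        // Read the whole stream into memory (assumes the document is small enough).
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        byte[] bytes = buffer.toByteArray();

        // Walk backwards until the last '>' byte; everything after it is dropped.
        // Assumption: '>' is encoded as the single byte 0x3E.
        int end = bytes.length;
        while (end > 0 && bytes[end - 1] != '>') {
            end--;
        }

        // If no '>' was found, end is 0 and the parser reports its usual error.
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(bytes, 0, end));
    }
}
```

With this in place, a call such as TrailingJunkTolerantParser.parse(inputStream) would stand in for the direct DocumentBuilder.parse(InputStream) call. Genuinely malformed XML before the final > still raises a SAXException, which is usually what you want.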

© Stack Overflow or respective owner
