How do you remove invalid hexadecimal characters from an XML-based data source prior to constructing

Posted by Oppositional on Stack Overflow
Published on 2008-08-21T18:47:49Z Indexed on 2010/03/31 9:53 UTC

Filed under: Xml, validation, encoding

Is there an easy, general way to clean an XML-based data source before using it in an XmlReader, so that I can gracefully consume XML data that does not conform to the hexadecimal character restrictions placed on XML?

Note:

The solution needs to handle XML data sources that use character encodings other than UTF-8, e.g. by specifying the character encoding in the XML document declaration. Not mangling the character encoding of the source while stripping invalid hexadecimal characters has been a major sticking point.
The removal of invalid hexadecimal characters should only remove hexadecimal-encoded values, since data often contains href values that happen to include a string that would match a hexadecimal character pattern.
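
To illustrate the second note, here is a minimal sketch in Python (chosen only for brevity; the question itself is about .NET). It assumes that "hexadecimal encoded values" means numeric character references such as `&#x1F;` or `&#31;`, and it drops only those whose code point falls outside the `Char` production of XML 1.0. The names `is_valid_xml_char` and `strip_invalid_char_refs` are hypothetical, not part of any library.

```python
import re


def is_valid_xml_char(cp: int) -> bool:
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Matches complete numeric character references only:
# group 1 = hex digits of "&#x...;", group 2 = decimal digits of "&#...;".
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def strip_invalid_char_refs(text: str) -> str:
    """Remove numeric character references that point at invalid code points."""

    def keep_or_drop(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Valid references are returned untouched; invalid ones are dropped.
        return match.group(0) if is_valid_xml_char(cp) else ""

    return _CHAR_REF.sub(keep_or_drop, text)
```

Because the pattern requires the full `&#...;` form, an href such as `/item?id=0x1F` passes through unchanged. One limitation of this sketch: a reference-like string inside a CDATA section is literal text there, and it would be removed as well.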

Background:

I need to consume an XML-based data source that conforms to a specific format (think Atom or RSS feeds), but I want to be able to consume published data sources that contain invalid hexadecimal characters per the XML specification.

In .NET, if you have a Stream that represents the XML data source and then attempt to parse it using an XmlReader and/or XPathDocument, an exception is raised due to the inclusion of invalid hexadecimal characters in the XML data. My current attempt to resolve this issue is to read the Stream into a string and use a regular expression to remove and/or replace the invalid hexadecimal characters, but I am looking for a more performant solution.
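
One alternative to loading the whole document into a string is to filter the data in chunks as it is decoded. The sketch below is a language-neutral illustration written in Python, not a .NET implementation. It assumes the encoding can be found from a byte order mark or from an ASCII-compatible XML declaration within the first 1024 bytes, and falls back to UTF-8 otherwise. The names `sniff_encoding` and `clean_xml_text` are hypothetical.

```python
import codecs
import io
import re

# Raw characters outside the XML 1.0 Char production.
_INVALID_RAW = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# encoding="..." inside an ASCII-compatible XML declaration.
_DECL_ENCODING = re.compile(
    rb"""<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_encoding(head: bytes) -> str:
    """Guess the encoding from a BOM or the declaration (assumed heuristic)."""
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    match = _DECL_ENCODING.match(head)
    return match.group(1).decode("ascii") if match else "utf-8"


def clean_xml_text(stream, chunk_size=64 * 1024):
    """Yield decoded text chunks with invalid raw characters removed."""
    # Assumption: 1024 bytes is enough to cover a BOM and the declaration.
    chunk = stream.read(1024)
    decoder = codecs.getincrementaldecoder(sniff_encoding(chunk))(errors="replace")
    while chunk:
        # The incremental decoder buffers multi-byte sequences that are
        # split across chunk boundaries.
        yield _INVALID_RAW.sub("", decoder.decode(chunk))
        chunk = stream.read(chunk_size)
    tail = decoder.decode(b"", final=True)
    if tail:
        yield _INVALID_RAW.sub("", tail)
```

A caller could join the chunks or feed them to a parser one at a time, for example `"".join(clean_xml_text(io.BytesIO(data)))`. The sketch removes raw invalid characters only. Numeric character references such as `&#x1F;` can straddle a chunk boundary, so handling them in the same pass would need a small carry-over buffer.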
