How do I match complete XML objects in a string?

Posted by cyclotis04 on Stack Overflow
Published on 2010-05-27T17:55:35Z

I'm attempting to find complete XML objects in a string. They were written into the string by an XmlSerializer, but any given object may or may not be complete. I've toyed with the idea of using a regular expression, since pulling patterns out of text is the kind of thing regexes were built for, but they are a poor fit for parsing XML.

I'm trying to find complete objects in the form:

<?xml version="1.0"?>
<type>
    <field>value</field>
    ...
</type>

My first thought was a regex that finds <?xml version="1.0"?><type> and the matching </type>, but if a field has the same name as the type, the match ends at the field's closing tag rather than the object's, so it won't work.
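To make that failure concrete, here is a minimal sketch. It is written in Python rather than C# purely for brevity, and the sample string (an object whose field is also named type) is a made-up illustration, not real serializer output.

```python
import re

# Hypothetical sample: a <type> object whose only field is also named <type>.
text = 'junk<?xml version="1.0"?><type><type>value</type></type>more junk'

# Non-greedy match from the declaration plus opening tag to the nearest closing tag.
pattern = re.compile(r'<\?xml version="1\.0"\?><type>.*?</type>', re.DOTALL)

candidate = pattern.search(text).group(0)

# The match stops at the inner </type>, so the outer element is never closed.
opening_tags = candidate.count('<type>')
closing_tags = candidate.count('</type>')
```

The candidate holds two opening tags but only one closing tag, so it is not a complete object. Making the quantifier greedy does not help either: with several objects in the string, it would run from the first declaration to the last closing tag and swallow the junk in between.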

There's plenty of documentation on XML parsers, but they all seem to need a complete, well-formed document to parse. My XML objects can be in a string surrounded by pretty much anything else (including other complete objects).

hw<e>reR@lot$0fr@ndm&nchrs%<?xml version="1.0"?><type><field>...</field>...</type>@ndH#r$omOre!!>nuT6erjc?y!<?xml version="1.0"?><type><field>...</field>...</type>ty!=]

A regex could skip over the random characters, but it can't reliably tell where a complete XML object ends. I'd like some way to extract an object, parse it with a serializer, then repeat until the string contains no more valid objects.
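One way the extract, parse, repeat loop could look is sketched below. This is an illustration in Python, not C#, and it makes several assumptions: every object starts with exactly the declaration shown above, an object never contains another declaration, and a brute-force search is acceptable. The function name extract_objects and the sample string are hypothetical. The idea is to start at each declaration and let a real XML parser, rather than a regex, decide where the object ends, by trying each following '>' as a possible end point until the candidate parses.

```python
import xml.etree.ElementTree as ET

DECL = '<?xml version="1.0"?>'  # assumed to open every serialized object


def extract_objects(text):
    """Return the parsed root element of each complete object found in text."""
    found = []
    start = text.find(DECL)
    while start != -1:
        # Never search past the next declaration: that belongs to another object.
        next_start = text.find(DECL, start + len(DECL))
        limit = len(text) if next_start == -1 else next_start
        end = text.find('>', start + len(DECL))
        while end != -1 and end < limit:
            try:
                # Succeeds only when the candidate is a well-formed document.
                found.append(ET.fromstring(text[start:end + 1]))
                break
            except ET.ParseError:
                end = text.find('>', end + 1)  # extend to the next possible end
        start = next_start
    return found


# Hypothetical input: one complete object, one truncated object, and one
# complete object whose field shares the name of its type.
sample = (
    'hw<e>reR@lot$0f%<?xml version="1.0"?><type><field>a</field></type>'
    '@ndH#r$!!>nuT6<?xml version="1.0"?><type><field>b</field>ty!=]'
    '<?xml version="1.0"?><type><type>c</type></type>xx'
)
objects = extract_objects(sample)
```

The truncated object in the middle is skipped because no prefix of it parses, while the object with the same-named field is found because the parser tracks nesting. The search is quadratic in the worst case, so this is a sketch for modest inputs rather than a tuned solution.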

© Stack Overflow or respective owner
