Regex to delete HTML within <table> tags

Posted by johnv on Stack Overflow
Published on 2010-12-21T17:51:24Z Indexed on 2010/12/21 17:54 UTC

Filed under: regex | vb.net

I have an HTML document saved as a .txt file. It contains multiple tables along with other text, and I am trying to delete any HTML tags (anything enclosed in "<>") that appear inside a table (between <table> and </table>). For example:

===================

other text

<other HTML>

<table>

<b><u><i>bold underlined italic text</b></u></i>

</table>

other text

<other HTML>

==============

The final output should look like the following. Note that only the HTML between <table> and </table> is removed.

==============

other text

<other HTML>

<table>

bold underlined italic text

</table>

other text

<other HTML>

=============
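To make the intended behavior concrete, here is a minimal sketch of the transformation, written in Python purely for illustration. It is one possible two-pass approach, not a definitive solution: an outer pattern finds each table, and an inner pattern removes tags only from that table's body. The names `strip_tags_in_tables`, `TABLE_RE`, and `TAG_RE` are hypothetical, and the sketch assumes that tables are not nested and that no tag contains a literal ">" inside an attribute value.

```python
import re

# Outer pattern: one whole table, from its opening tag to the nearest closing tag.
# Assumption: tables are not nested, so the lazy ".*?" stops at the right </table>.
TABLE_RE = re.compile(r"(<table\b[^>]*>)(.*?)(</table>)", re.IGNORECASE | re.DOTALL)

# Inner pattern: anything enclosed in "<" and ">".
# Assumption: no ">" appears inside an attribute value.
TAG_RE = re.compile(r"<[^>]*>")


def strip_tags_in_tables(text: str) -> str:
    """Remove tags found between <table> and </table>, keeping the table tags."""

    def clean(match: re.Match) -> str:
        opening, body, closing = match.groups()
        # Only the body is stripped; the opening and closing tags are kept as-is.
        return opening + TAG_RE.sub("", body) + closing

    return TABLE_RE.sub(clean, text)


sample = (
    "other text\n"
    "<other HTML>\n"
    "<table>\n"
    "<b><u><i>bold underlined italic text</b></u></i>\n"
    "</table>\n"
    "other text\n"
    "<other HTML>\n"
)

print(strip_tags_in_tables(sample))
```

Text and tags outside the tables pass through unchanged because the outer pattern never matches them. A single regex that tries to do both jobs at once tends to be fragile, which is why the sketch replaces with a callback instead.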

Any help is greatly appreciated!
