Extract parts of html using regex

Posted by Fred Yang on Stack Overflow
Published on 2010-04-22T19:07:09Z Indexed on 2010/04/23 0:03 UTC

I have a simple requirement: extract parts of an HTML string. Suppose the HTML is:

<h1>hello</h1> ... <img moduleType="calendar" /> ...<h2>bye</h2> 

I want to split it into three parts:

<h1>hello</h1> 
<img moduleType="calendar" />
<h2>bye</h2> 

The aim is to split the text into two categories: plain HTML, and special tags of the form <img moduleType="calendar" />.
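
One way to approach this, as a minimal sketch rather than a definitive answer, is to split on the special tag with a capturing group so that the tag itself is kept in the result. The pattern, the function name split_html, and the "html"/"special" labels are assumptions made for illustration. The sketch also assumes the attribute value is double-quoted and that no attribute value contains a ">" character. For arbitrary HTML, a real parser is likely to be more robust than a regex.

```python
import re

# Assumed pattern: an <img> tag carrying a double-quoted moduleType attribute.
# The outer capturing group makes re.split keep the matched tag in its output.
SPECIAL_TAG = re.compile(
    r'(<img\b[^>]*\bmoduleType\s*=\s*"[^"]*"[^>]*>)',
    re.IGNORECASE,
)


def split_html(html):
    """Split html into (kind, text) pairs, where kind is "html" or "special"."""
    parts = []
    for chunk in SPECIAL_TAG.split(html):
        chunk = chunk.strip()
        if not chunk:
            # Skip empty pieces, e.g. when the input starts or ends with a special tag.
            continue
        kind = "special" if SPECIAL_TAG.fullmatch(chunk) else "html"
        parts.append((kind, chunk))
    return parts


# The "..." in the sample stands for elided markup, exactly as in the question.
sample = '<h1>hello</h1> ... <img moduleType="calendar" /> ...<h2>bye</h2>'
for kind, text in split_html(sample):
    print(kind, text)
```

Each pair holds the category first and the matching text second, so the caller can route plain HTML and moduleType tags to different handlers.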
