Regular expression to match HTML table row (<tr>) NOT containing a specific value

Posted by user1821136 on Stack Overflow
Published on 2012-11-13T15:59:02Z Indexed on 2012/11/13 17:01 UTC

I'm using Notepad++ to clean up a long and messy HTML table. I'm trying to use regular expressions, even though I'm a total noob. :)

I need to remove all the table rows that don't contain a specific value (a substring, if that's the right term).

After unwrapping all the file contents, I've been able to use the following regular expression to select, one by one, every table row with all its contents:

<tr>.+?</tr>
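
To illustrate what that expression selects, here is a minimal sketch in Python. Python's `re` module is only a stand-in here; Notepad++'s own regex engine may differ in details, and the sample string `html` is made up for the example.

```python
import re

# Hypothetical sample: two table rows unwrapped onto a single line.
html = "<tr><td>a</td></tr><tr><td>b</td></tr>"

# The lazy quantifier +? stops at the first </tr> that follows each <tr>,
# so every row is matched on its own instead of the whole table at once.
row_pattern = re.compile(r"<tr>.+?</tr>")

rows = row_pattern.findall(html)
for row in rows:
    print(row)
```

Note that `.` does not cross line breaks by default in Python, which is why the sample is on one line, like the unwrapped file.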

How can I improve the regular expression so that it selects, for replacement, only the table rows that do not contain that defined substring anywhere inside them?

I don't know whether this matters, but every table row has the following structure (I've included every HTML tag; the dots stand for standard content/values):

<tr>
    <td> ... </td>
    <td> ... </td>
    <td> <a sfref="..." href="...">!! SUBSTRING I HAVE TO MATCH HERE !!</a> </td>
    <td> <img /> </td>
    <td> ... </td>
    <td> ... </td>
    <td> ... </td>
    <td> ... </td>
</tr>
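
One possible approach, given the row structure above, is a negative lookahead that refuses to step over either the closing `</tr>` or the substring, so only rows without the substring can match. The sketch below uses Python's `re` module as a stand-in (Notepad++'s engine may behave differently). `NEEDLE`, the sample rows, and the variable names are all hypothetical, and it assumes a bare `<tr>` with no attributes, as in the structure shown.

```python
import re

NEEDLE = "KEEP-ME"  # hypothetical substring that marks rows to keep

# <tr>, then any characters as long as neither </tr> nor NEEDLE starts at
# the current position, then </tr>. A row containing NEEDLE can never
# reach its closing tag, so it is not matched.
row_without_needle = re.compile(
    r"<tr>(?:(?!</tr>|" + re.escape(NEEDLE) + r").)*</tr>",
    re.DOTALL,  # let . cross line breaks in case the file is not unwrapped
)

# Made-up sample data: rows 1 and 3 contain NEEDLE, row 2 does not.
html = (
    '<tr><td>1</td><td><a href="#">KEEP-ME one</a></td></tr>'
    '<tr><td>2</td><td><a href="#">other</a></td></tr>'
    '<tr><td>3</td><td><a href="#">KEEP-ME two</a></td></tr>'
)

# Replace every row that lacks NEEDLE with nothing.
cleaned = row_without_needle.sub("", html)
print(cleaned)
```

In an editor, the equivalent would be to put the same pattern in the search field and leave the replacement empty, assuming the engine supports lookaheads.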

Thanks in advance for your help!

© Stack Overflow or respective owner