Regex: Strip HTML attributes except SRC

Posted by Ian Silber on Stack Overflow
Published on 2010-06-08T02:33:04Z Indexed on 2010/06/08 4:22 UTC


Hi,

I'm trying to write a regular expression that will strip all tag attributes except for the SRC attribute. For example:

<p id="paragraph" class="green">This is a paragraph with an image <img src="/path/to/image.jpg" width="50" height="75"/></p>

should be returned as:

<p>This is a paragraph with an image <img src="/path/to/image.jpg" /></p>

I have a regular expression to strip all attributes, but I'm trying to tweak it to leave in src. Here's what I have so far:

<?php echo preg_replace('/<([A-Z][A-Z0-9]*)(\b[^>]*)>/i', '<$1>', '<html><goes><here>');

I'm using PHP's preg_replace() for this.
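One possible way to tweak this, as a rough sketch rather than a definitive answer: switch to preg_replace_callback() so that each opening tag can be rebuilt with only its src attribute. The function name strip_attributes_except_src is hypothetical (it is not part of PHP). The sketch assumes that attribute values are quoted and contain no ">" character. It is not a full HTML parser and will not cope with malformed markup.

```php
<?php
// Hypothetical helper (not a PHP built-in): removes every attribute except
// src from each opening tag. Assumes quoted values that contain no '>'.
function strip_attributes_except_src(string $html): string
{
    $result = preg_replace_callback(
        // 1: tag name, 2: raw attribute text, 3: optional self-closing slash
        '/<([A-Z][A-Z0-9]*)\b([^>]*?)(\s*\/?)>/i',
        function (array $m): string {
            // Keep the src attribute only if the tag actually has one.
            $src = '';
            if (preg_match('/\ssrc\s*=\s*("[^"]*"|\'[^\']*\')/i', $m[2], $attr)) {
                $src = ' src=' . $attr[1];
            }
            // Preserve the self-closing slash when the tag had one.
            $close = (strpos($m[3], '/') !== false) ? ' /' : '';
            return '<' . $m[1] . $src . $close . '>';
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $html;
}

$input = '<p id="paragraph" class="green">This is a paragraph with an image '
       . '<img src="/path/to/image.jpg" width="50" height="75"/></p>';
echo strip_attributes_except_src($input), "\n";
```

Closing tags such as </p> are left untouched because the pattern only matches tags whose name starts right after the "<". For anything beyond simple, well-formed snippets, a DOM-based approach is likely to be more robust than a regular expression.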

Thanks! Ian

© Stack Overflow or respective owner
