JavaScript - Efficiently find all elements containing one of a large set of strings

Posted by noah on Stack Overflow
Published on 2010-04-23T16:08:47Z Indexed on 2010/04/23 19:43 UTC

I have a set of strings and I need to find all of their occurrences in an HTML document. Where a string occurs is important because I need to handle each case differently:

  • String is all or part of an attribute. e.g., the string is foo: <input value="foo"> -> Add class ATTR to the element.

  • String is the full text of an element. e.g., <button>foo</button> -> Add class TEXT to the element.

  • String is inline in the text of an element. e.g., <p>I love foo</p> -> Wrap the text in a span tag with class TEXT.

Also, I need to match the longest string first. e.g., if I have foo and foobar, then <p>I love foobar</p> should become <p>I love <span class="TEXT">foobar</span></p>, not <p>I love <span class="TEXT">foo</span>bar</p>.

The inline text is easy enough: sort the strings by length, descending, then find and replace each one in document.body.innerHTML with <span class="TEXT">$1</span>, although I'm not sure that is the most efficient way to go.
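One way to fold the longest-first rule and the replacement into a single pass is to join all the strings into one alternation, sorted longest first, so the regex engine tries foobar before foo at every position. This is only a rough sketch: buildMatcher and wrapMatches are made-up names, the body of escapeRegExChars is my guess at what such a helper does, and it assumes a non-empty list and plain text input (a text node's value, say) rather than raw HTML.

```javascript
// Escape characters that have special meaning in a regular expression.
// (Assumed implementation of an escapeRegExChars-style helper.)
function escapeRegExChars(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one global regex that alternates over all strings, longest first.
// Assumes `strings` is a non-empty array of non-empty strings.
function buildMatcher(strings) {
  var sorted = strings.slice().sort(function (a, b) {
    return b.length - a.length;
  });
  return new RegExp(sorted.map(escapeRegExChars).join('|'), 'g');
}

// Wrap every match in a piece of plain text (not HTML) in a TEXT span.
function wrapMatches(text, matcher) {
  return text.replace(matcher, function (m) {
    return '<span class="TEXT">' + m + '</span>';
  });
}

var matcher = buildMatcher(['foo', 'foobar']);
var wrapped = wrapMatches('I love foobar and foo', matcher);
// wrapped is:
// 'I love <span class="TEXT">foobar</span> and <span class="TEXT">foo</span>'
```

With this, the document text is scanned once in total instead of once per string, which is where most of the saving should come from when the set is large.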

For the attributes, I can do something like this:

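sortedStrings.forEach(function(it) {
     // Find every attribute whose value contains the string, then tag the matching elements.
     // Note: \S must be written \\S inside a string literal.
     document.body.innerHTML.replace(new RegExp('(\\S+?)="[^"]*'+escapeRegExChars(it)+'[^"]*"','g'),function(s,attr) {
        // Quote the value so the selector stays valid (assumes `it` contains no double quotes).
        $('['+attr+'*="'+it+'"]').addClass('ATTR');
     });
});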

Again, that seems inefficient.
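A possible alternative for the attribute case is to skip innerHTML entirely: walk the elements once and test each attribute value against a single combined regex. The sketch below is untested against a real page; markAttributeMatches and buildTester are hypothetical names, the escapeRegExChars body is assumed, and classList is used in place of jQuery's addClass.

```javascript
// Escape characters that have special meaning in a regular expression.
// (Assumed implementation of an escapeRegExChars-style helper.)
function escapeRegExChars(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// One non-global regex for all strings. A global regex would carry
// lastIndex state between successive test() calls.
function buildTester(strings) {
  return new RegExp(strings.map(escapeRegExChars).join('|'));
}

// Add class ATTR to every element that has at least one attribute
// whose value contains one of the strings. Returns the marked elements.
function markAttributeMatches(elements, tester) {
  var marked = [];
  for (var i = 0; i < elements.length; i++) {
    var attrs = elements[i].attributes;
    for (var j = 0; j < attrs.length; j++) {
      if (tester.test(attrs[j].value)) {
        elements[i].classList.add('ATTR');
        marked.push(elements[i]);
        break; // one matching attribute is enough for this element
      }
    }
  }
  return marked;
}

// In a browser, run it over every element under <body>.
if (typeof document !== 'undefined') {
  markAttributeMatches(
    document.body.getElementsByTagName('*'),
    buildTester(['foo', 'foobar'])
  );
}
```

Each element and attribute is visited once no matter how many strings there are, and no selector has to be built from the search string.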

Lastly, for the full-text elements, a depth-first search of the document that compares each element's innerHTML to each string will work, but with a large number of strings that means one comparison per element per string, which seems very inefficient.
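For the full-text case, one idea is to put the strings in a Set so each element costs a single lookup rather than one comparison per string. This is a sketch under a few assumptions: markFullTextMatches is a made-up name, it compares textContent instead of innerHTML (so entities such as &amp; are compared in decoded form), and it only considers elements with no child elements, which is my reading of "the full text of an element".

```javascript
// Add class TEXT to every leaf element whose text is exactly one of
// the strings. Returns the marked elements.
function markFullTextMatches(elements, strings) {
  var wanted = new Set(strings);
  var marked = [];
  for (var i = 0; i < elements.length; i++) {
    var el = elements[i];
    // Only leaf elements: otherwise a parent that wraps a single
    // matching child would have the same text and match as well.
    if (el.children.length === 0 && wanted.has(el.textContent)) {
      el.classList.add('TEXT');
      marked.push(el);
    }
  }
  return marked;
}

// In a browser, run it over every element under <body>.
if (typeof document !== 'undefined') {
  markFullTextMatches(
    document.body.getElementsByTagName('*'),
    ['foo', 'foobar']
  );
}
```

The comparison is exact, so <button> foo </button> would not match unless the text is trimmed first.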

Any answer that offers performance improvements gets an upvote :)

© Stack Overflow or respective owner
