Efficient mass string search problem.
        Posted  
        
            by Monomer
        on Stack Overflow
        
        See other posts from Stack Overflow
        
            or by Monomer
        
        
        
        Published on 2010-03-28T08:10:36Z
        Indexed on 
            2010/03/28
            8:13 UTC
        
        
        Read the original article
        Hit count: 363
        
The Problem: A large static list of strings is provided. A pattern string comprised of data and wildcard elements (* and ?). The idea is to return all the strings that match the pattern - simple enough.
Current Solution: I'm currently using a linear approach of scanning the large list and globbing each entry against the pattern.
My Question: Are there any suitable data structures that I can store the large list into such that the search's complexity is less than O(n)?
Perhaps something akin to a suffix-trie? I've also considered using bi- and tri-grams in a hashtable, but the logic required in evaluating a match based on a merge of the list of words returned and the pattern is a nightmare, and I'm not convinced its the correct approach.
© Stack Overflow or respective owner