Search Results

Search found 51 results on 3 pages for 'trie'.

Page 1/3 | 1 2 3  | Next Page >

  • Implementing a Patricia Trie for use as a dictionary

    - by Regis Frey
    I'm attempting to implement a Patricia Trie with the methods addWord(), isWord(), and isPrefix() as a means to store a large dictionary of words for quick retrieval (including prefix search). I've read up on the concepts but they just aren't clarifying into an implementation. I want to know (in Java or Python code) how to implement the Trie, particularly the nodes (or should I implement it recursively). I saw one person who implemented it with an array of 26 child nodes set to null/None. Is there a better strategy (such as treating the letters as bits) and how would you implement it?

    Read the article

  • All words in a trie data-structure

    - by John Smith
    I'm trying to put all words in a trie in a string, a word is detonated by the eow field being true for a certain character in the trie data structure, hence a trie can could have letters than lead up to no word, for ex "abc" is in the trie but "c"'s eow field is false so "abc" is not a word Here is my Data structure struct Trie { bool eow; //when a Trie field isWord = true, hence there is a word char letter; Trie *letters[27]; }; and here is my attemped print-all function, basically trying to return all words in one string seperated by spaces for words string printAll( string word, Trie& data) { if (data.eow == 1) return word + " "; for (int i = 0; i < 26; i++) { if (data.letters[i] != NULL) printAll( word + data.letters[i]->letter, *(data.letters[i])); } return ""; } Its not outputting what i want, any suggestions?

    Read the article

  • Persisting a trie to a file - C

    - by Appu
    I have a trie which I am using to do some string processing. I have a simple compiler which generates trie from some data. Once generated, my trie won't change at run time. I am looking for an approach where I can persist the trie in a file and load it effectively. I have looked at sqllite to understand how they are persisting b-treebut their file format looks bit advanced and I may not need all of those. It'd be helpful if someone can provide some ideas to persist and read the trie. I am programming using C.

    Read the article

  • Adapting pseudocode to java implementation for finding the longest word in a trie

    - by user1766888
    Referring to this question I asked: How to find the longest word in a trie? I'm having trouble implementing the pseudocode given in the answer. findLongest(trie): //first do a BFS and find the "last node" queue <- [] queue.add(trie.root) last <- nil map <- empty map while (not queue.empty()): curr <- queue.pop() for each son of curr: queue.add(son) map.put(son,curr) //marking curr as the parent of son last <- curr //in here, last indicate the leaf of the longest word //Now, go up the trie and find the actual path/string curr <- last str = "" while (curr != nil): str = curr + str //we go from end to start curr = map.get(curr) return str This is what I have for my method public static String longestWord (DTN d) { Queue<DTN> holding = new ArrayQueue<DTN>(); holding.add(d); DTN last = null; Map<DTN,DTN> test = new ArrayMap<DTN,DTN>(); DTN curr; while (!holding.isEmpty()) { curr = holding.remove(); for (Map.Entry<String, DTN> e : curr.children.entries()) { holding.add(curr.children.get(e)); test.put(curr.children.get(e), curr); } last = curr; } curr = last; String str = ""; while (curr != null) { str = curr + str; curr = test.get(curr); } return str; } I'm getting a NullPointerException at: for (Map.Entry<String, DTN> e : curr.children.entries()) How can I find and fix the cause of the NullPointerException of the method so that it returns the longest word in a trie?

    Read the article

  • Trie Backtracking in Recursion

    - by Darksky
    I am building a tree for a spell checker with suggestions. Each node contains a key (a letter) and a value (array of letters down that path). So assume the following sub-trie in my big trie: W / \ a e | | k k | | is word--> e e | ... This is just a subpath of a sub-trie. W is a node and a and e are two nodes in its value array etc... At each node, I check if the next letter in the word is a value of the node. I am trying to support mistyped vowels for now. So 'weke' will yield 'wake' as a suggestion. Here's my searchWord function in my trie: def searchWord(self, word, path=""): if len(word) > 0: key = word[0] word = word[1:] if self.values.has_key(key): path = path + key nextNode = self.values[key] return nextNode.searchWord(word, path) else: # check here if key is a vowel. If it is, check for other vowel substitutes else: if self.isWord: return path # this is the word found else: return None Given 'weke', at the end when word is of length zero and path is 'weke', my code will hit the second big else block. weke is not marked as a word and so it will return with None. This will return out of searchWord with None. To avoid this, at each stack unwind or recursion backtrack, I need to check if a letter is a vowel and if it is, do the checking again. I changed the if self.values.has_key(key) loop to the following: if self.values.has_key(key): path = path + key nextNode = self.values[key] ret = nextNode.searchWord(word, path) if ret == None: # check if key == vowel and replace path # return nextNode.searchWord(... return ret What am I doing wrong here? What can I do when backtracking to achieve what I'm trying to do?

    Read the article

  • Efficient Trie implementation for unicode strings

    - by U Mad
    I have been looking for an efficient String trie implementation. Mostly I have found code like this: Referential implementation in Java (per wikipedia) I dislike these implementations for mostly two reasons: They support only 256 ASCII characters. I need to cover things like cyrillic. They are extremely memory inefficient. Each node contains an array of 256 references, which is 4096 bytes on a 64 bit machine in Java. Each of these nodes can have up to 256 subnodes with 4096 bytes of references each. So a full Trie for every ASCII 2 character string would require a bit over 1MB. Three character strings? 256MB just for arrays in nodes. And so on. Of course I don't intend to have all of 16 million three character strings in my Trie, so a lot of space is just wasted. Most of these arrays are just null references as their capacity far exceeds the actual number of inserted keys. And if I add unicode, the arrays get even larger (char has 64k values instead of 256 in Java). Is there any hope of making an efficient trie for strings? I have considered a couple of improvements over these types of implementations: Instead of using array of references, I could use an array of primitive integer type, which indexes into an array of references to nodes whose size is close to the number of actual nodes. I could break strings into 4 bit parts which would allow for node arrays of size 16 at the cost of a deeper tree.

    Read the article

  • Trie VS Suffix Tree VS Suffix Array

    - by ukrania
    Hello everyone, Which one is the structure that provides best performance results? Trie, Suffix Tree or Suffix Array? There are other equivalent structures? What are good Java implementations of these structures? Thanks for your answers. Best Regards, ukrania

    Read the article

  • java Getting a list of words from a Trie

    - by adam08
    I'm looking to use the following code to not check whether there is a word matching in the Trie but to return a list all words beginning with the prefix inputted by the user. Can someone point me in the right direction? I can't get it working at all..... public boolean search(String s) { Node current = root; System.out.println("\nSearching for string: "+s); while(current != null) { for(int i=0;i<s.length();i++) { if(current.child[(int)(s.charAt(i)-'a')] == null) { System.out.println("Cannot find string: "+s); return false; } else { current = current.child[(int)(s.charAt(i)-'a')]; System.out.println("Found character: "+ current.content); } } // If we are here, the string exists. // But to ensure unwanted substrings are not found: if (current.marker == true) { System.out.println("Found string: "+s); return true; } else { System.out.println("Cannot find string: "+s +"(only present as a substring)"); return false; } } return false; } }

    Read the article

  • trie reg exp parse step over char and continue

    - by forest.peterson
    Setup: 1) a string trie database formed from linked nodes and a vector array linking to the next node terminating in a leaf, 2) a recursive regular expression function that if A) char '*' continues down all paths until string length limit is reached, then continues down remaining string paths if valid, and B) char '?' continues down all paths for 1 char and then continues down remaining string paths if valid. 3) after reg expression the candidate strings are measured for edit distance against the 'try' string. Problem: the reg expression works fine for adding chars or swapping ? for a char but if the remaining string has an error then there is not a valid path to a terminating leaf; making the matching function redundant. I tried adding a 'step-over' ? char if the end of the node vector was reached and then followed every path of that node - allowing this step-over only once; resulted in a memory exception; I cannot find logically why it is accessing the vector out of range - bactracking? Questions: 1) how can the regular expression step over an invalid char and continue with the path? 2) why is swapping the 'sticking' char for '?' resulting in an overflow? Function: void Ontology::matchRegExpHelper(nodeT *w, string inWild, Set<string> &matchSet, string out, int level, int pos, int stepover) { if (inWild=="") { matchSet.add(out); } else { if (w->alpha.size() == pos) { int testLength = out.length() + inWild.length(); if (stepover == 0 && matchSet.size() == 0 && out.length() > 8 && testLength == tokenLength) {//candidate generator inWild[0] = '?'; matchRegExpHelper(w, inWild, matchSet, out, level, 0, stepover+1); } else return; //giveup on this path } if (inWild[0] == '?' || (inWild[0] == '*' && (out.length() + inWild.length() ) == level ) ) { //wild matchRegExpHelper(w->alpha[pos].next, inWild.substr(1), matchSet, out+w->alpha[pos].letter, level, 0, stepover);//follow path -> if ontology is full, treat '*' like a '?' } else if (inWild[0] == '*') matchRegExpHelper(w->alpha[pos].next, '*'+inWild.substr(1), matchSet, out+w->alpha[pos].letter, level, 0, stepover); //keep adding chars if (inWild[0] == w->alpha[pos].letter) //follow self matchRegExpHelper(w->alpha[pos].next, inWild.substr(1), matchSet, out+w->alpha[pos].letter, level, 0, stepover); //follow char matchRegExpHelper(w, inWild, matchSet, out, level, pos+1, stepover);//check next path } } Error Message: +str "Attempt to access index 1 in a vector of size 1." std::basic_string<char,std::char_traits<char>,std::allocator<char> > +err {msg="Attempt to access index 1 in a vector of size 1." } ErrorException Note: this function works fine for hundreds of test strings with '*' wilds if the extra stepover gate is not used Semi-Solved: I place a pos < w->alpha.size() condition on each path that calls w->alpha[pos]... - this prevented the backtrack calls from attempting to access the vector with an out of bounds index value. Still have other issues to work out - it loops infinitely adding the ? and backtracking to remove it, then repeat. But, moving forward now. Revised question: why during backtracking is the position index accumulating and/or not deincrementing - so at somepoint it calls w->alpha[pos]... with an invalid position that is either remaining from the next node or somehow incremented pos+1 when passing upward?

    Read the article

  • How can I store data in a table as a trie? (SQL Server)

    - by Matt
    Hi, To make things easier, the table contains all the words in the English dictionary. What I would like to do is be able to store the data as a trie. This way I can traverse the different branches of the trie and return the most relevant result. First, how do I store the data in the table as a trie? Second, how do I traverse the tree? If it helps at all, the suggestion in this previous question is where this question was sparked from. Please make sure it's SQL we're talking about. I understood the Mike Dunlavey's C implementation because of pointers but can't see how this part (The trie itself) works in SQL. Thanks, Matt

    Read the article

  • Iterating through nested dictionaries

    - by Framester
    I want to write an iterator for my 'toy' Trie implementation. Adding already works like this: class Trie: def __init__(self): self.root = dict() pass def add(self, string, value): global nops current_dict = self.root for letter in s: nops += 1 current_dict = current_dict.setdefault(letter, {}) current_dict = current_dict.setdefault('value', value) pass The output of the adding looks like that: trie = Trie() trie.add("hello",1) trie.add("world",2) trie.add("worlds",12) print trie.root {'h': {'e': {'l': {'l': {'o': {'value': 1}}}}}, 'w': {'o': {'r': {'l': {'d': {'s': {'value': 2}, 'value': 2}}}}}} I know, that I need a __iter__ and next method. def __iter__(self): self.root.__iter__() pass def next(self): print self.root.next() But AttributeError: 'dict' object has no attribute 'next'. How should I do it? [Update] In the perfect world I would like the output to be one dict with all the words/entries with their corresponding values.

    Read the article

  • Auto-Suggest via &lsquo;Trie&rsquo; (Pre-fix Tree)

    - by Strenium
    Auto-Suggest (Auto-Complete) “thing” has been around for a few years. Here’s my little snippet on the subject. For one of my projects, I had to deal with a non-trivial set of items to be pulled via auto-suggest used by multiple concurrent users. Simple, dumb iteration through a list in local cache or back-end access didn’t quite cut it. Enter a nifty little structure, perfectly suited for storing and matching verbal data: “Trie” (http://tinyurl.com/db56g) also known as a Pre-fix Tree: “Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Values are normally not associated with every node, only with leaves and some inner nodes that correspond to keys of interest.” This is a very scalable, performing structure. Though, as usual, something ‘fast’ comes at a cost of ‘size’; fortunately RAM is more plentiful today so I can live with that. I won’t bore you with the detailed algorithmic performance here - Google can do a better job of such. So, here’s C# implementation of all this. Let’s start with individual node: Trie Node /// <summary> /// Contains datum of a single trie node. /// </summary> public class AutoSuggestTrieNode {     public char Value { get; set; }       /// <summary>     /// Gets a value indicating whether this instance is leaf node.     /// </summary>     /// <value>     ///     <c>true</c> if this instance is leaf node; otherwise, a prefix node <c>false</c>.     /// </value>     public bool IsLeafNode { get; private set; }       public List<AutoSuggestTrieNode> DescendantNodes { get; private set; }         /// <summary>     /// Initializes a new instance of the <see cref="AutoSuggestTrieNode"/> class.     /// </summary>     /// <param name="value">The phonetic value.</param>     /// <param name="isLeafNode">if set to <c>true</c> [is leaf node].</param>     public AutoSuggestTrieNode(char value = ' ', bool isLeafNode = false)     {         Value = value;         IsLeafNode = isLeafNode;           DescendantNodes = new List<AutoSuggestTrieNode>();     }       /// <summary>     /// Gets the descendants of the pre-fix node, if any.     /// </summary>     /// <param name="descendantValue">The descendant value.</param>     /// <returns></returns>     public AutoSuggestTrieNode GetDescendant(char descendantValue)     {         return DescendantNodes.FirstOrDefault(descendant => descendant.Value == descendantValue);     } }   Quite self-explanatory, imho. A node is either a “Pre-fix” or a “Leaf” node. “Leaf” contains the full “word”, while the “Pre-fix” nodes act as indices used for matching the results.   Ok, now the Trie: Trie Structure /// <summary> /// Contains structure and functionality of an AutoSuggest Trie (Pre-fix Tree) /// </summary> public class AutoSuggestTrie {     private readonly AutoSuggestTrieNode _root = new AutoSuggestTrieNode();       /// <summary>     /// Adds the word to the trie by breaking it up to pre-fix nodes + leaf node.     /// </summary>     /// <param name="word">Phonetic value.</param>     public void AddWord(string word)     {         var currentNode = _root;         word = word.Trim().ToLower();           for (int i = 0; i < word.Length; i++)         {             var child = currentNode.GetDescendant(word[i]);               if (child == null) /* this character hasn't yet been indexed in the trie */             {                 var newNode = new AutoSuggestTrieNode(word[i], word.Count() - 1 == i);                   currentNode.DescendantNodes.Add(newNode);                 currentNode = newNode;             }             else                 currentNode = child; /* this character is already indexed, move down the trie */         }     }         /// <summary>     /// Gets the suggested matches.     /// </summary>     /// <param name="word">The phonetic search value.</param>     /// <returns></returns>     public List<string> GetSuggestedMatches(string word)     {         var currentNode = _root;         word = word.Trim().ToLower();           var indexedNodesValues = new StringBuilder();         var resultBag = new ConcurrentBag<string>();           for (int i = 0; i < word.Trim().Length; i++)  /* traverse the trie collecting closest indexed parent (parent can't be leaf, obviously) */         {             var child = currentNode.GetDescendant(word[i]);               if (child == null || word.Count() - 1 == i)                 break; /* done looking, the rest of the characters aren't indexed in the trie */               indexedNodesValues.Append(word[i]);             currentNode = child;         }           Action<AutoSuggestTrieNode, string> collectAllMatches = null;         collectAllMatches = (node, aggregatedValue) => /* traverse the trie collecting matching leafNodes (i.e. "full words") */             {                 if (node.IsLeafNode) /* full word */                     resultBag.Add(aggregatedValue); /* thread-safe write */                   Parallel.ForEach(node.DescendantNodes, descendandNode => /* asynchronous recursive traversal */                 {                     collectAllMatches(descendandNode, String.Format("{0}{1}", aggregatedValue, descendandNode.Value));                 });             };           collectAllMatches(currentNode, indexedNodesValues.ToString());           return resultBag.OrderBy(o => o).ToList();     }         /// <summary>     /// Gets the total words (leafs) in the trie. Recursive traversal.     /// </summary>     public int TotalWords     {         get         {             int runningCount = 0;               Action<AutoSuggestTrieNode> traverseAllDecendants = null;             traverseAllDecendants = n => { runningCount += n.DescendantNodes.Count(o => o.IsLeafNode); n.DescendantNodes.ForEach(traverseAllDecendants); };             traverseAllDecendants(this._root);               return runningCount;         }     } }   Matching operations and Inserts involve traversing the nodes before the right “spot” is found. Inserts need be synchronous since ordering of data matters here. However, matching can be done in parallel traversal using recursion (line 64). Here’s sample usage:   [TestMethod] public void AutoSuggestTest() {     var autoSuggestCache = new AutoSuggestTrie();       var testInput = @"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer nec odio. Praesent libero.                 Sed cursus ante dapibus diam. Sed nisi. Nulla quis sem at nibh elementum imperdiet. Duis sagittis ipsum. Praesent mauris.                 Fusce nec tellus sed augue semper porta. Mauris massa. Vestibulum lacinia arcu eget nulla. Class aptent taciti sociosqu ad                 litora torquent per conubia nostra, per inceptos himenaeos. Curabitur sodales ligula in libero. Sed dignissim lacinia nunc.                 Curabitur tortor. Pellentesque nibh. Aenean quam. In scelerisque sem at dolor. Maecenas mattis. Sed convallis tristique sem.                 Proin ut ligula vel nunc egestas porttitor. Morbi lectus risus, iaculis vel, suscipit quis, luctus non, massa. Fusce ac                 turpis quis ligula lacinia aliquet. Mauris ipsum. Nulla metus metus, ullamcorper vel, tincidunt sed, euismod in, nibh. Quisque                 volutpat condimentum velit. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Nam                 nec ante. Sed lacinia, urna non tincidunt mattis, tortor neque adipiscing diam, a cursus ipsum ante quis turpis. Nulla                 facilisi. Ut fringilla. Suspendisse potenti. Nunc feugiat mi a tellus consequat imperdiet. Vestibulum sapien. Proin quam. Etiam                 ultrices. Suspendisse in justo eu magna luctus suscipit. Sed lectus. Integer euismod lacus luctus magna. Quisque cursus, metus                 vitae pharetra auctor, sem massa mattis sem, at interdum magna augue eget diam. Vestibulum ante ipsum primis in faucibus orci                 luctus et ultrices posuere cubilia Curae; Morbi lacinia molestie dui. Praesent blandit dolor. Sed non quam. In vel mi sit amet                 augue congue elementum. Morbi in ipsum sit amet pede facilisis laoreet. Donec lacus nunc, viverra nec.";       testInput.Split(' ').ToList().ForEach(word => autoSuggestCache.AddWord(word));       var testMatches = autoSuggestCache.GetSuggestedMatches("le"); }   ..and the result: That’s it!

    Read the article

  • Const-correctness semantics in C++

    - by thirtythreeforty
    For fun and profit™, I'm writing a trie class in C++ (using the C++11 standard.) My trie<T> has an iterator, trie<T>::iterator. (They're all actually functionally const_iterators, because you cannot modify a trie's value_type.) The iterator's class declaration looks partially like this: template<typename T> class trie<T>::iterator : public std::iterator<std::bidirectional_iterator_tag, T> { friend class trie<T>; struct state { state(const trie<T>* const node, const typename std::vector<std::pair<typename T::value_type, std::unique_ptr<trie<T>>>>::const_iterator& node_map_it ) : node{node}, node_map_it{node_map_it} {} // This pointer is to const data: const trie<T>* node; typename std::vector<std::pair<typename T::value_type, std::unique_ptr<trie<T>>>>::const_iterator node_map_it; }; public: typedef const T value_type; iterator() =default; iterator(const trie<T>* node) { parents.emplace(node, node->children.cbegin()); // ... } // ... private: std::stack<state> parents; // ... }; Notice that the node pointer is declared const. This is because (in my mind) the iterator should not be modifying the node that it points to; it is just an iterator. Now, elsewhere in my main trie<T> class, I have an erase function that has a common STL signature--it takes an iterator to data to erase (and returns an iterator to the next object). template<typename T> typename trie<T>::iterator trie<T>::erase(const_iterator it) { // ... // Cannot modify a const object! it.parents.top().node->is_leaf = false; // ... } The compiler complains because the node pointer is read-only! The erase function definitely should modify the trie that the iterator points to, even though the iterator shouldn't. So, I have two questions: Should iterator's constructors be public? trie<T> has the necessary begin() and end() members, and of course trie<T>::iterator and trie<T> are mutual friends, but I don't know what the convention is. Making them private would solve a lot of the angst I'm having about removing the const "promise" from the iterator's constructor. What are the correct const semantics/conventions regarding the iterator and its node pointer here? Nobody has ever explained this to me, and I can't find any tutorials or articles on the Web. This is probably the more important question, but it does require a good deal of planning and proper implementation. I suppose it could be circumvented by just implementing 1, but it's the principle of the thing!

    Read the article

  • Trie vs B+ tree

    - by Fakrudeen
    How does Trie and B+ tree compare for indexing lexicographically sorted strings [on the order some billions]? It should support range queries as well. From perf. as well as implementation complexity point of view.

    Read the article

  • Complexity in using Binary search and Trie

    - by user121196
    given a large list of alphabetically sorted words in a file,I need to write a program that, given a word x, determines if x is in the list. Preprocessing is ok since I will be calling this function many times over different inputs. priorties: 1. speed. 2. memory I already know I can use (n is number of words, m is average length of the words) 1. a trie, time is O(log(n)), space(best case) is O(log(n*m)), space(worst case) is O(n*m). 2. load the complete list into memory, then binary search, time is O(log(n)), space is O(n*m) I'm not sure about the complexity on tri, please correct me if they are wrong. Also are there other good approaches?

    Read the article

  • Need dictionary database

    - by Bragaadeesh
    Hi, I am planning to work in TRIE data structure for which I need a dictionary database or a text or word file containing the entire list of english words. It doesnt matter if the size is huge. Larger the better.

    Read the article

  • Are there any radix/patricia/critbit trees for Python?

    - by Andrew Dalke
    I have about 10,000 words used as a set of inverted indices to about 500,000 documents. Both are normalized so the index is a mapping of integers (word id) to a set of integers (ids of documents which contain the word). My prototype uses Python's set as the obvious data type. When I do a search for a document I find the list of N search words and their corresponding N sets. I want to return the set of documents in the intersection of those N sets. Python's "intersect" method is implemented as a pairwise reduction. I think I can do better with a parallel search of sorted sets, so long as the library offers a fast way to get the next entry after i. I've been looking for something like that for some time. Years ago I wrote PyJudy but I no longer maintain it and I know how much work it would take to get it to a stage where I'm comfortable with it again. I would rather use someone else's well-tested code, and I would like one which supports fast serialization/deserialization. I can't find any, or at least not any with Python bindings. There is avltree which does what I want, but since even the pair-wise set merge take longer than I want, I suspect I want to have all my operations done in C/C++. Do you know of any radix/patricia/critbit tree libraries written as C/C++ extensions for Python? Failing that, what is the most appropriate library which I should wrap? The Judy Array site hasn't been updated in 6 years, with 1.0.5 released in May 2007. (Although it does build cleanly so perhaps It Just Works.)

    Read the article

  • Identifier is undefined

    - by hawk
    I wrote the following code in C++ using VS2012 Express. void ac_search( uint num_patterns, uint pattern_length, const char *patterns, uint num_records, uint record_length, const char *records, int *matches, Node* trie) { // Irrelevant code omitted. } vector<int> ac_benchmark_search( uint num_patterns, uint pattern_length, const char *patterns, uint num_records, uint record_length, const char *records, double &time) { // Prepare the container for the results vector<int> matches(num_records * num_patterns); Trie T; Node* trie = T.addWord(records, num_records, record_length); // error line ac_search(num_patterns, pattern_length, patterns, num_records, record_length, records, matches.data(), trie); // Irrelevant code omitted. return matches; } I get the error identifier "ac_search" is undefined at the function invoking line. I am a bit confused here. because the function ac_search is declared as a global (not inside any container). Why can't I call it at this place? Am I missing something? Update I tried ignore irrelevant code and then included it gradually and found that everything is fine until I include the outer loop of ac_search I get the aforementioned error. here is updated code of the function ac_search: void ac_cpu_string_search(uint num_patterns, uint pattern_length, const char *patterns, uint num_records, uint record_length, const char *records, int *matches, Node* trie) { // Loop over all records //for (uint record_number = 0; record_number < num_records; ++record_number) //{ // // Loop over all patterns for (uint pattern_number = 0; pattern_number < num_patterns; ++pattern_number) { // Execute string search const char *ptr_record = &records[record_number * record_length]; const char *ptr_match = std::strstr(ptr_record, &patterns[pattern_number * pattern_length]); // If pattern was found, then calculate offset, otherwise result is -1 if (ptr_match) { matches[record_number * num_patterns + pattern_number] = static_cast<int>(std::distance(ptr_record, ptr_match)); } else { matches[record_number * num_patterns + pattern_number] = -1; } // } //} } Update 2 I think the error has something to do with the function addWord which belongs to the class Trie. When I commented out this function, I did not get the error anymore. Node* Trie::addWord(const char *records, uint num_records, uint record_length) { // Loop over all records for (uint record_number = 0; record_number < num_records; ++record_number) { const char *ptr_record = &records[record_number * record_length]; string s = ptr_record; Node* current = root; if ( s.length() == 0 ) { current->setWordMarker(); // an empty word return; } for ( int i = 0; i < s.length(); i++ ) { Node* child = current->findChild(s[i]); if ( child != NULL ) { current = child; } else { Node* tmp = new Node(); tmp->setContent(s[i]); current->appendChild(tmp); current = tmp; } if ( i == s.length() - 1 ) current->setWordMarker(); } return current; } void ac_search( uint num_patterns, uint pattern_length, const char *patterns, uint num_records, uint record_length, const char *records, int *matches, Node* trie) { // Irrelevant code omitted. } vector<int> ac_benchmark_search( uint num_patterns, uint pattern_length, const char *patterns, uint num_records, uint record_length, const char *records, double &time) { // Prepare the container for the results vector<int> matches(num_records * num_patterns); Trie T; Node* trie = T.addWord(records, num_records, record_length); // error line ac_search(num_patterns, pattern_length, patterns, num_records, record_length, records, matches.data(), trie); // Irrelevant code omitted. return matches; }

    Read the article

  • Population count of rightmost n integers

    - by Jason Baker
    I'm implementing Bagwell's Ideal Hash Trie in Haskell. To find an element in a sub-trie, he says to do the following: Finding the arc for a symbol s, requires ?nding its corresponding bit in the bit map and then counting the one bits below it in the map to compute an index into the ordered sub-trie. What is the best way to do this? It sounds like the most straightforward way of doing this is to select the bits below that bit and do a population count on the resulting number. Is there a faster or better way to do this?

    Read the article

  • Algorithm for autocomplete?

    - by StackUnderflow
    I am referring to the algorithm that is used to give query suggestions when a user type a search term in google. I am mainly interested in how google algorithm is able to show: 1. Most important results (most likely queries rather than anything that matches) 2. Match substrings 3. Fuzzy matches I know you could use Trie or generalized trie to find matches but it wouldn't meet the above requirements... Similar questions asked earlier here Thanks

    Read the article

  • finding long repeated substrings in a massive string

    - by Will
    I naively imagined that I could build a suffix trie where I keep a visit-count for each node, and then the deepest nodes with counts greater than one are the result set I'm looking for. I have a really really long string (hundreds of megabytes). I have about 1 GB of RAM. This is why building a suffix trie with counting data is too inefficient space-wise to work for me. To quote Wikipedia's Suffix tree: storing a string's suffix tree typically requires significantly more space than storing the string itself. The large amount of information in each edge and node makes the suffix tree very expensive, consuming about ten to twenty times the memory size of the source text in good implementations. The suffix array reduces this requirement to a factor of four, and researchers have continued to find smaller indexing structures. And that was wikipedia's comments on the tree, not trie. How can I find long repeated sequences in such a large amount of data, and in a reasonable amount of time (e.g. less than an hour on a modern desktop machine)? (Some wikipedia links to avoid people posting them as the 'answer': Algorithms on strings and especially Longest repeated substring problem ;-) )

    Read the article

  • Replace occurance of character with all letters in the alphabet

    - by McAvoy
    I have created a scrabble game with a computer opponent. If a blank tile is found in the computer's rack during the word generation if needs to be swapped out for every letter in the alphabet. I have my current solution to solve this problem below, but was wondering if there is a better more efficient way to accomplish this task. if (str.Contains("*")) { char c = 'A'; String made = ""; while(c < 'Z') { made = str.ReplaceFirst("*", c.ToString()); if (!made.Contains("*")) { wordsMade.Add(made); if (theGame.theTrie.Search(made) == Trie.SearchResults.Found) { validWords.Add(made); } } else { char ch = 'A'; String made2 = ""; while (ch < 'Z') { made2 = made.ReplaceFirst("*", c.ToString()); wordsMade.Add(made2); if (theGame.theTrie.Search(made2) == Trie.SearchResults.Found) { validWords.Add(made2); } ch++; } } c++; }

    Read the article

1 2 3  | Next Page >