Search Results

Search found 1 results on 1 pages for 'c14ppy'.

Page 1/1 | 1 

  • Are there any working implementations of the rolling hash function used in the Rabin-Karp string sea

    - by c14ppy
    I'm looking to use a rolling hash function so I can take hashes of n-grams of a very large string. For example: "stackoverflow", broken up into 5 grams would be: "stack", "tacko", "ackov", "ckove", "kover", "overf", "verfl", "erflo", "rflow" This is ideal for a rolling hash function because after I calculate the first n-gram hash, the following ones are relatively cheap to calculate because I simply have to drop the first letter of the first hash and add the new last letter of the second hash. I know that in general this hash function is generated as: H = c1ak - 1 + c2ak - 2 + c3ak - 3 + ... + cka0 where a is a constant and c1,...,ck are the input characters. If you follow this link on the Rabin-Karp string search algorithm , it states that "a" is usually some large prime. I want my hashes to be stored in 32 bit integers, so how large of a prime should "a" be, such that I don't overflow my integer? Does there exist an existing implementation of this hash function somewhere that I could already use? Here is an implementation I created: public class hash2 { public int prime = 101; public int hash(String text) { int hash = 0; for(int i = 0; i < text.length(); i++) { char c = text.charAt(i); hash += c * (int) (Math.pow(prime, text.length() - 1 - i)); } return hash; } public int rollHash(int previousHash, String previousText, String currentText) { char firstChar = previousText.charAt(0); char lastChar = currentText.charAt(currentText.length() - 1); int firstCharHash = firstChar * (int) (Math.pow(prime, previousText.length() - 1)); int hash = (previousHash - firstCharHash) * prime + lastChar; return hash; } public static void main(String[] args) { hash2 hashify = new hash2(); int firstHash = hashify.hash("mydog"); System.out.println(firstHash); System.out.println(hashify.hash("ydogr")); System.out.println(hashify.rollHash(firstHash, "mydog", "ydogr")); } } I'm using 101 as my prime. Does it matter if my hashes will overflow? I think this is desirable but I'm not sure. Does this seem like the right way to go about this?

    Read the article

1