On counting pairs of words that differ by one letter

Posted by Quintofron on Stack Overflow See other posts from Stack Overflow or by Quintofron
Published on 2012-11-07T01:42:35Z Indexed on 2012/11/07 5:00 UTC
Read the original article Hit count: 133

Filed under:
|

Let us consider n words, each of length k. Those words consist of letters over an alphabet (whose cardinality is n) with defined order. The task is to derive an O(nk) algorithm to count the number of pairs of words that differ by one position (no matter which one exactly, as long as it's only a single position).

For instance, in the following set of words (n = 5, k = 4):

abcd, abdd, adcb, adcd, aecd

there are 5 such pairs: (abcd, abdd), (abcd, adcd), (abcd, aecd), (adcb, adcd), (adcd, aecd).

So far I've managed to find an algorithm that solves a slightly easier problem: counting the number of pairs of words that differ by one GIVEN position (i-th). In order to do this I swap the letter at the ith position with the last letter within each word, perform a Radix sort (ignoring the last position in each word - formerly the ith position), linearly detect words whose letters at the first 1 to k-1 positions are the same, eventually count the number of occurrences of each letter at the last (originally ith) position within each set of duplicates and calculate the desired pairs (the last part is simple).

However, the algorithm above doesn't seem to be applicable to the main problem (under the O(nk) constraint) - at least not without some modifications. Any idea how to solve this?

© Stack Overflow or respective owner

Related posts about algorithm

Related posts about sorting