Calculating probability that a string has been randomized? - Python

Posted by RadiantHex on Stack Overflow
Published on 2010-05-14T10:51:04Z Indexed on 2010/05/14 10:54 UTC

Hi folks,

this is related to a question I asked earlier.

I have a list of manually created strings such as:

lucy87

gordan_king

fancy_unicorn77

joplucky_kanga90

base_belong_to_narwhals

and a list of randomized strings:

johnkdf

pancake90kgjd

fancy_jagookfk

manhattanljg


What gives away that the strings in the second list are randomized is that they contain sequences such as 'kjg', 'jgf', 'lkd', ...

Is there a clever way to separate strings that contain these apparently random sequences from the rest?

I suspect this relies on the fact that certain characters are more likely to appear next to each other (e.g. 'co', 'ka', 'ja', ...).
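A minimal sketch of that adjacent-character idea in Python: count character bigrams in the manually created strings, then score a new string by its average smoothed log-probability per bigram. The function names (`train_bigram_model`, `avg_log_prob`) and the add-one smoothing are illustrative assumptions, and the five sample strings are far too few for a usable model; a large list of real usernames or dictionary words would be needed in practice.

```python
import math
from collections import Counter


def train_bigram_model(samples):
    """Count adjacent character pairs in the known-good samples."""
    pair_counts = Counter()   # (a, b) -> how often b follows a
    first_counts = Counter()  # a -> how often a starts a pair
    alphabet = set()
    for s in samples:
        s = s.lower()
        alphabet.update(s)
        for a, b in zip(s, s[1:]):
            pair_counts[a, b] += 1
            first_counts[a] += 1
    return pair_counts, first_counts, len(alphabet)


def avg_log_prob(s, model):
    """Average log P(next char | current char) over the string's bigrams."""
    pair_counts, first_counts, vocab_size = model
    s = s.lower()
    pairs = list(zip(s, s[1:]))
    if not pairs:
        return 0.0  # nothing to score for strings shorter than 2 characters
    total = 0.0
    for a, b in pairs:
        # Add-one smoothing (an assumption) so unseen pairs get a small,
        # non-zero probability instead of breaking the logarithm.
        p = (pair_counts[a, b] + 1) / (first_counts[a] + vocab_size)
        total += math.log(p)
    return total / len(pairs)


# Toy training data: the manually created strings from the question.
manual = ["lucy87", "gordan_king", "fancy_unicorn77",
          "joplucky_kanga90", "base_belong_to_narwhals"]
model = train_bigram_model(manual)

for candidate in ["johnkdf", "pancake90kgjd", "fancy_jagookfk", "manhattanljg"]:
    print(candidate, round(avg_log_prob(candidate, model), 3))
```

Lower (more negative) scores mean the string contains more unfamiliar character pairs. Where to put the cut-off is something to tune on labelled examples.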


Any ideas on this one? Kylotan mentioned Reverend, but I am not sure if it can be used for such a purpose.
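Reverend is a naive Bayes classifier, so one way to try that route is to feed it character n-grams instead of words. The sketch below is not Reverend's API; it is a small standard-library stand-in for the same kind of classifier, using character trigrams as tokens. The class name `NaiveBayes`, the `trigrams` helper and the labels `manual`/`random` are all made up for illustration.

```python
import math
from collections import Counter


def trigrams(s):
    """Split a string into overlapping 3-character tokens."""
    s = s.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]


class NaiveBayes:
    """Tiny multinomial naive Bayes over character trigrams (illustrative)."""

    def __init__(self):
        self.token_counts = {}       # label -> Counter of trigram counts
        self.doc_counts = Counter()  # label -> number of training strings
        self.vocab = set()           # every trigram seen during training

    def train(self, label, text):
        tokens = trigrams(text)
        self.token_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.token_counts.items():
            # Start from the log prior of the label.
            score = math.log(self.doc_counts[label] / total_docs)
            # Add-one smoothing; the extra +1 reserves mass for unseen trigrams.
            denom = sum(counts.values()) + len(self.vocab) + 1
            for tok in trigrams(text):
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = NaiveBayes()
for s in ["lucy87", "gordan_king", "fancy_unicorn77",
          "joplucky_kanga90", "base_belong_to_narwhals"]:
    clf.train("manual", s)
for s in ["johnkdf", "pancake90kgjd", "fancy_jagookfk", "manhattanljg"]:
    clf.train("random", s)

print(clf.classify("lucy87"))
print(clf.classify("johnkdf"))
```

With only nine training strings this mostly memorises its input, so it shows the mechanics rather than real accuracy. A much larger labelled set would be needed before trusting it on unseen names.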

Help would be much appreciated!

© Stack Overflow or respective owner
