Calculating probability that a string has been randomized? - Python
- by RadiantHex
Hi folks,
this is related to a question I asked earlier (question)
I have a list of manually created strings such as:
  lucy87
  
  gordan_king
  
  fancy_unicorn77
  
  joplucky_kanga90
  
  base_belong_to_narwhals
and a list of randomized strings:
  johnkdf
  
  pancake90kgjd
  
  fancy_jagookfk
  
  manhattanljg
What gives away that the strings in the last set are randomized is the presence of sequences such as 'kjg', 'jgf', 'lkd', ...
Is there any clever way I could separate strings that contain these apparently randomized sequences from the rest?
I guess that this relies heavily on the fact that certain characters are more likely to appear next to each other (e.g. 'co', 'ka', 'ja', ...).
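To make that idea concrete, here is a rough sketch of what I have in mind, not a tested solution: count how often each character follows another in the manually created strings, then score a new string by the average log-probability of its character pairs. The names train_bigrams, avg_log_prob and ALPHABET are made up for illustration, the add-one smoothing is just an assumption to keep unseen pairs from scoring zero, and five training strings are obviously far too few for real use.

```python
import math
from collections import Counter

# Assumed character set, used only to size the add-one smoothing term.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789_"


def train_bigrams(samples):
    """Count adjacent character pairs in strings believed to be hand-made."""
    pair_counts = Counter()   # (a, b) -> times b directly follows a
    first_counts = Counter()  # a -> times a is followed by any character
    for s in samples:
        s = s.lower()
        for a, b in zip(s, s[1:]):
            pair_counts[a, b] += 1
            first_counts[a] += 1
    return pair_counts, first_counts


def avg_log_prob(s, model):
    """Average log-probability per character pair; lower looks more random."""
    pair_counts, first_counts = model
    s = s.lower()
    pairs = list(zip(s, s[1:]))
    if not pairs:
        return 0.0  # fewer than two characters: no evidence either way
    v = len(ALPHABET)
    total = 0.0
    for a, b in pairs:
        # Add-one smoothing so unseen pairs get a small non-zero probability.
        p = (pair_counts[a, b] + 1) / (first_counts[a] + v)
        total += math.log(p)
    return total / len(pairs)


manual = ["lucy87", "gordan_king", "fancy_unicorn77",
          "joplucky_kanga90", "base_belong_to_narwhals"]
model = train_bigrams(manual)

# Print candidates from most suspicious (lowest score) to least suspicious.
candidates = ["johnkdf", "pancake90kgjd", "fancy_jagookfk", "manhattanljg"]
for name in sorted(candidates, key=lambda c: avg_log_prob(c, model)):
    print(name, round(avg_log_prob(name, model), 3))
```

Averaging over the whole string probably dilutes a short random tail after a real word, so scoring a sliding window of a few characters and taking the minimum might work better, but I have not tried that.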
Any ideas on this one? Kylotan mentioned Reverend, but I am not sure if it can be used for such a purpose.
Help would be much appreciated!