Searching a large list of words in another large list
- by Christian
I have a list of 1,000,000 strings containing protein names, each with a maximum length of 256. Every string has an associated ID.
I have another list of 4,000,000,000 strings, also with a maximum length of 256, containing words taken from articles. Every word has an ID as well.
I want to find all matches between the list of protein names and the list of words of the articles.
Which algorithm should I use? Should I use some prebuilt API?
Ideally, the algorithm should run on a normal PC without special hardware.
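
To make the task concrete, here is a minimal sketch of what I mean by a "match", assuming exact string equality (no fuzzy or substring matching). It is only a baseline, not a claim about the best algorithm: the smaller list is loaded into an in-memory hash index and the larger list is streamed past it once. The function names `build_index` and `find_matches` and the `(id, string)` pair layout are my own assumptions for illustration. Assuming one byte per character, the protein list is at most about 256 MB of raw text, while the word list could be up to about 1 TB, so it would have to be read from disk in a stream rather than held in memory.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def build_index(proteins: Iterable[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Map each protein name to the list of IDs that carry that name."""
    index: Dict[str, List[int]] = defaultdict(list)
    for protein_id, name in proteins:
        index[name].append(protein_id)
    return index


def find_matches(
    index: Dict[str, List[int]],
    words: Iterable[Tuple[int, str]],
) -> Iterator[Tuple[int, int]]:
    """Stream the words and yield (protein_id, word_id) for exact matches."""
    for word_id, word in words:
        # .get() avoids inserting empty entries into the defaultdict
        for protein_id in index.get(word, ()):
            yield protein_id, word_id


# Tiny made-up data set, only to show the intended input and output shape.
proteins = [(1, "insulin"), (2, "actin"), (3, "insulin")]
words = [(10, "the"), (11, "insulin"), (12, "binds"), (13, "actin")]

matches = list(find_matches(build_index(proteins), words))
print(matches)
```

In this sketch `words` can be any iterable, for example a generator that reads the word list line by line from a file, so only the protein index has to fit in RAM.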