mysql - filtering a list against keywords, both list and keywords > 20 million records

Posted by threecheeseopera on Stack Overflow See other posts from Stack Overflow or by threecheeseopera
Published on 2010-04-30T03:24:42Z Indexed on 2010/04/30 3:27 UTC
Read the original article Hit count: 253

Filed under:

I have two tables, both having more than 20 million records; table1 is a list of terms, and table2 is a list of keywords that may or may not appear in those terms. I need to identify the terms that contain a keyword. My current strategy is:

SELECT table1.term, table2.keyword FROM table1 INNER JOIN table2 ON table1.term 
LIKE CONCAT('%', table2.keyword, '%');

This is not working, it takes f o r e v e r.
It's not the server (see notes).

How might I rewrite this so that it runs in under a day?

Notes: As for server optimization: both tables are myisam and have unique indexes on the matching fields; the myisam key buffer is greater than the sum of both index file sizes, and it is not even being fully taxed (key_blocks_unused is ... large); the server is a dual-xeon 2U beast with fast sas drives and 8G of ram, fine-tuned for the mysql workload.

© Stack Overflow or respective owner

Related posts about mysql