mysql - filtering a list against keywords, both list and keywords > 20 million records
Posted
by threecheeseopera
on Stack Overflow
See other posts from Stack Overflow
or by threecheeseopera
Published on 2010-04-30T03:24:42Z
Indexed on
2010/04/30
3:27 UTC
Read the original article
Hit count: 253
mysql
I have two tables, both having more than 20 million records; table1 is a list of terms, and table2 is a list of keywords that may or may not appear in those terms. I need to identify the terms that contain a keyword. My current strategy is:
SELECT table1.term, table2.keyword FROM table1 INNER JOIN table2 ON table1.term
LIKE CONCAT('%', table2.keyword, '%');
This is not working, it takes f o r e v e r.
It's not the server (see notes).
How might I rewrite this so that it runs in under a day?
Notes:
As for server optimization: both tables are myisam and have unique indexes on the matching fields; the myisam key buffer is greater than the sum of both index file sizes, and it is not even being fully taxed (key_blocks_unused
is ... large); the server is a dual-xeon 2U beast with fast sas drives and 8G of ram, fine-tuned for the mysql workload.
© Stack Overflow or respective owner