Optimize MySQL query (ngrams, COUNT(), GROUP BY, ORDER BY)

Posted by Gerardo on Stack Overflow
Published on 2010-06-02T04:50:19Z Indexed on 2010/06/02 4:53 UTC

I have a database with thousands of companies and their locations. I have implemented n-grams to optimize search. I run one query to retrieve all the companies that match the search query, and another to get a list of their locations with the number of companies in each location.

The query I am trying to optimize is the latter. Maybe the problem is this: every company ('anunciante') has a field ('estado') used for logical deletes, so a company should be retrieved only if 'estado' equals 1. When I run EXPLAIN, it shows that the query goes through almost 40k rows, while the actual result (the companies that really match) is only 80 rows.

How can I optimize this?

This is my query (XXX represents the n-grams for the search query):

SELECT provincias.provincia AS provincia, provincias.id, COUNT(*) AS cantidad
FROM anunciantes 
JOIN anunciante_invertido AS a_i0 ON anunciantes.id = a_i0.id_anunciante 
JOIN indice_invertido AS indice0 ON a_i0.id_invertido = indice0.id 
LEFT OUTER JOIN domicilios ON anunciantes.id = domicilios.id_anunciante 
LEFT OUTER JOIN localidades ON domicilios.id_localidad = localidades.id 
LEFT OUTER JOIN provincias ON provincias.id = localidades.id_provincia 
WHERE anunciantes.estado = 1 
AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') 
AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') 
AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') 
AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') 
AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') 
GROUP BY provincias.id 
ORDER BY cantidad DESC

And this is the EXPLAIN output for the query (I hope it can be read in this format):

id  select_type         table             type    possible_keys                       key            key_len  ref                                           rows   Extra
1   PRIMARY             anunciantes       ref     PRIMARY,estado                      estado         1        const                                         36669  Using index; Using temporary; Using filesort
1   PRIMARY             domicilios        ref     id_anunciante                       id_anunciante  4        db84771_viaempresas.anunciantes.id            1
1   PRIMARY             localidades       eq_ref  PRIMARY                             PRIMARY        4        db84771_viaempresas.domicilios.id_localidad   1
1   PRIMARY             provincias        eq_ref  PRIMARY                             PRIMARY        4        db84771_viaempresas.localidades.id_provincia  1
1   PRIMARY             a_i0              ref     PRIMARY,id_anunciante,id_invertido  PRIMARY        4        db84771_viaempresas.anunciantes.id            1      Using where; Using index
1   PRIMARY             indice0           eq_ref  PRIMARY                             PRIMARY        4        db84771_viaempresas.a_i0.id_invertido         1      Using index
6   DEPENDENT SUBQUERY  ngrama            const   PRIMARY,ngrama                      ngrama         5        const                                         1      Using index
6   DEPENDENT SUBQUERY  invertido_ngrama  eq_ref  PRIMARY,id_palabra,id_ngrama        PRIMARY        8        func,const                                    1      Using index
5   DEPENDENT SUBQUERY  ngrama            const   PRIMARY,ngrama                      ngrama         5        const                                         1      Using index
5   DEPENDENT SUBQUERY  invertido_ngrama  eq_ref  PRIMARY,id_palabra,id_ngrama        PRIMARY        8        func,const                                    1      Using index
4   DEPENDENT SUBQUERY  ngrama            const   PRIMARY,ngrama                      ngrama         5        const                                         1      Using index
4   DEPENDENT SUBQUERY  invertido_ngrama  eq_ref  PRIMARY,id_palabra,id_ngrama        PRIMARY        8        func,const                                    1      Using index
3   DEPENDENT SUBQUERY  ngrama            const   PRIMARY,ngrama                      ngrama         5        const                                         1      Using index
3   DEPENDENT SUBQUERY  invertido_ngrama  eq_ref  PRIMARY,id_palabra,id_ngrama        PRIMARY        8        func,const                                    1      Using index
2   DEPENDENT SUBQUERY  ngrama            const   PRIMARY,ngrama                      ngrama         5        const                                         1      Using index
2   DEPENDENT SUBQUERY  invertido_ngrama  eq_ref  PRIMARY,id_palabra,id_ngrama        PRIMARY        8        func,const                                    1      Using index
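
One rewrite that may be worth testing, offered as an unbenchmarked sketch: in the EXPLAIN output above, each of the five IN (...) filters appears as a DEPENDENT SUBQUERY, and the plan starts from the estado index on anunciantes (36669 rows). The version below resolves the n-grams once in a derived table and joins that table to indice0 instead. The placeholders 'ng1' to 'ng5' and the alias palabras are hypothetical; they stand in for the five n-grams written as XXX in the query above. The sketch assumes those five n-grams are distinct; if some repeat, the 5 in HAVING should be the number of distinct n-grams.

```sql
SELECT provincias.provincia AS provincia, provincias.id, COUNT(*) AS cantidad
FROM anunciantes
JOIN anunciante_invertido AS a_i0 ON anunciantes.id = a_i0.id_anunciante
JOIN indice_invertido AS indice0 ON a_i0.id_invertido = indice0.id
-- Derived table: ids of indexed words that contain all five n-grams.
-- GROUP BY makes id_palabra unique, so this join does not multiply rows.
JOIN (
    SELECT invertido_ngrama.id_palabra
    FROM invertido_ngrama
    JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama
    WHERE ngrama.ngrama IN ('ng1', 'ng2', 'ng3', 'ng4', 'ng5')  -- placeholders for the XXX n-grams
    GROUP BY invertido_ngrama.id_palabra
    HAVING COUNT(DISTINCT ngrama.ngrama) = 5  -- assumes five distinct n-grams
) AS palabras ON palabras.id_palabra = indice0.id
LEFT OUTER JOIN domicilios ON anunciantes.id = domicilios.id_anunciante
LEFT OUTER JOIN localidades ON domicilios.id_localidad = localidades.id
LEFT OUTER JOIN provincias ON provincias.id = localidades.id_provincia
WHERE anunciantes.estado = 1
GROUP BY provincias.id
ORDER BY cantidad DESC;
```

Whether this changes the join order or the row estimate depends on the MySQL version and the available indexes, so it is best checked by running EXPLAIN again on the rewritten query.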

© Stack Overflow or respective owner