Eliminate full table scan due to BETWEEN (and GROUP BY)

Posted by Dave Jarvis on Stack Overflow See other posts from Stack Overflow or by Dave Jarvis
Published on 2010-05-12T01:47:41Z Indexed on 2010/05/12 1:54 UTC
Read the original article Hit count: 201

Filed under:
|
|

Description

According to the explain command, there is a range that is causing a query to perform a full table scan (160k rows). How do I keep the range condition and reduce the scanning? I expect the culprit to be:

Y.YEAR BETWEEN 1900 AND 2009 AND

Code

Here is the code that has the range condition (the STATION_DISTRICT is likely superfluous).

SELECT                                                                  
  COUNT(1) as MEASUREMENTS,                                             
  AVG(D.AMOUNT) as AMOUNT,                                              
  Y.YEAR as YEAR,                                                       
  MAKEDATE(Y.YEAR,1) as AMOUNT_DATE                                     
FROM                                                                    
  CITY C,                                                               
  STATION S,                                                            
  STATION_DISTRICT SD,                                                  
  YEAR_REF Y FORCE INDEX(YEAR_IDX),                                     
  MONTH_REF M,                                                          
  DAILY D                                                               
WHERE                                                                   
  -- For a specific city ...                                            
  --                                                                    
  C.ID = 10663 AND                                                      

  -- Find all the stations within a specific unit radius ...
  --                                                        
  6371.009 *                                                
  SQRT(                                                     
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
     POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND

  -- Get the station district identification for the matching station.
  --                                                                  
  S.STATION_DISTRICT_ID = SD.ID AND                                   

  -- Gather all known years for that station ...
  --                                            
  Y.STATION_DISTRICT_ID = SD.ID AND             

  -- The data before 1900 is shaky; insufficient after 2009.
  --                                                        
  Y.YEAR BETWEEN 1900 AND 2009 AND                          

  -- Filtered by all known months ...
  --                                 
  M.YEAR_REF_ID = Y.ID AND           

  -- Whittled down by category ...
  --
  M.CATEGORY_ID = '003' AND

  -- Into the valid daily climate data.
  --
  M.ID = D.MONTH_REF_ID AND
  D.DAILY_FLAG_ID <> 'M'
GROUP BY
  Y.YEAR

Update

The SQL is performing a full table scan, which results in MySQL performing a "copy to tmp table", as shown here:

+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
| id | select_type | table | type   | possible_keys                     | key          | key_len | ref                           | rows   | Extra       |
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
|  1 | SIMPLE      | C     | const  | PRIMARY                           | PRIMARY      | 4       | const                         |      1 |             |
|  1 | SIMPLE      | Y     | range  | YEAR_IDX                          | YEAR_IDX     | 4       | NULL                          | 160422 | Using where |
|  1 | SIMPLE      | SD    | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.Y.STATION_DISTRICT_ID |      1 | Using index |
|  1 | SIMPLE      | S     | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.SD.ID                 |      1 | Using where |
|  1 | SIMPLE      | M     | ref    | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8       | climate.Y.ID                  |     54 | Using where |
|  1 | SIMPLE      | D     | ref    | INDEX                             | INDEX        | 8       | climate.M.ID                  |     11 | Using where |
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+

Related

Thank you!

© Stack Overflow or respective owner

Related posts about mysql

Related posts about query-optimization