Slow queries in Rails- not sure if my indexes are being used.
- by Max Williams
I'm doing a quite complicated find with lots of includes, which rails is splitting into a sequence of discrete queries rather than do a single big join.  The queries are really slow - my dataset isn't massive, with none of the tables having more than a few thousand records. 
I have indexed all of the fields which are examined in the queries but i'm worried that the indexes aren't helping for some reason: i installed a plugin called "query_reviewer" which looks at the queries used to build a page, and lists problems with them.  This states that indexes AREN'T being used, and it features the results of calling 'explain' on the query, which lists various problems.  Here's an example find call:
Question.paginate(:all, {:page=>1, :include=>[:answers, :quizzes, :subject, {:taggings=>:tag}, {:gradings=>[:age_group, :difficulty]}], :conditions=>["((questions.subject_id = ?) or (questions.subject_id = ? and tags.name = ?))", "1", 19, "English"], :order=>"subjects.name, (gradings.difficulty_id is null), gradings.age_group_id, gradings.difficulty_id", :per_page=>30})
And here are the generated sql queries:
SELECT DISTINCT `questions`.id 
  FROM `questions` 
  LEFT OUTER JOIN `taggings` ON `taggings`.taggable_id = `questions`.id 
    AND `taggings`.taggable_type = 'Question' 
  LEFT OUTER JOIN `tags` ON `tags`.id = `taggings`.tag_id 
  LEFT OUTER JOIN `subjects` ON `subjects`.id = `questions`.subject_id 
  LEFT OUTER JOIN `gradings` ON gradings.question_id = questions.id 
  WHERE (((questions.subject_id = '1') or (questions.subject_id = 19 and tags.name = 'English'))) 
  ORDER BY subjects.name, (gradings.difficulty_id is null), gradings.age_group_id, gradings.difficulty_id 
  LIMIT 0, 30
SELECT `questions`.`id` AS t0_r0 <..etc...> 
  FROM `questions` 
  LEFT OUTER JOIN `answers` ON answers.question_id = questions.id 
  LEFT OUTER JOIN `quiz_questions` ON (`questions`.`id` = `quiz_questions`.`question_id`) 
  LEFT OUTER JOIN `quizzes` ON (`quizzes`.`id` = `quiz_questions`.`quiz_id`) 
  LEFT OUTER JOIN `subjects` ON `subjects`.id = `questions`.subject_id 
  LEFT OUTER JOIN `taggings` ON `taggings`.taggable_id = `questions`.id 
    AND `taggings`.taggable_type = 'Question' 
  LEFT OUTER JOIN `tags` ON `tags`.id = `taggings`.tag_id 
  LEFT OUTER JOIN `gradings` ON gradings.question_id = questions.id 
  LEFT OUTER JOIN `age_groups` ON `age_groups`.id = `gradings`.age_group_id 
  LEFT OUTER JOIN `difficulties` ON `difficulties`.id = `gradings`.difficulty_id 
  WHERE (((questions.subject_id = '1') or (questions.subject_id = 19 and tags.name = 'English'))) 
    AND `questions`.id IN (602, 634, 666, 698, 730, 762, 613, 645, 677, 709, 741, 592, 624, 656, 688, 720, 752, 603, 635, 667, 699, 731, 763, 614, 646, 678, 710, 742, 593, 625) 
  ORDER BY subjects.name, (gradings.difficulty_id is null), gradings.age_group_id, gradings.difficulty_id
SELECT count(DISTINCT `questions`.id) AS count_all FROM `questions` 
  LEFT OUTER JOIN `answers` ON answers.question_id = questions.id 
  LEFT OUTER JOIN `quiz_questions` ON (`questions`.`id` = `quiz_questions`.`question_id`) 
  LEFT OUTER JOIN `quizzes` ON (`quizzes`.`id` = `quiz_questions`.`quiz_id`) 
  LEFT OUTER JOIN `subjects` ON `subjects`.id = `questions`.subject_id 
  LEFT OUTER JOIN `taggings` ON `taggings`.taggable_id = `questions`.id 
    AND `taggings`.taggable_type = 'Question' 
  LEFT OUTER JOIN `tags` ON `tags`.id = `taggings`.tag_id 
  LEFT OUTER JOIN `gradings` ON gradings.question_id = questions.id 
  LEFT OUTER JOIN `age_groups` ON `age_groups`.id = `gradings`.age_group_id 
  LEFT OUTER JOIN `difficulties` ON `difficulties`.id = `gradings`.difficulty_id 
  WHERE (((questions.subject_id = '1') or (questions.subject_id = 19 and tags.name = 'English')))
Actually, looking at these all nicely formatted here, there's a crazy amount of joining going on here.  This can't be optimal surely.  Anyway, it looks like i have two questions.
1) I have an index on each of the ids and foreign key fields referred to here.   The second of the above queries is the slowest, and calling explain on it (doing it directly in mysql) gives me the following:
+----+-------------+----------------+--------+---------------------------------------------------------------------------------+-------------------------------------------------+---------+------------------------------------------------+------+----------------------------------------------+
| id | select_type | table          | type   | possible_keys                                                                   | key                                             | key_len | ref                                            | rows | Extra                                        |
+----+-------------+----------------+--------+---------------------------------------------------------------------------------+-------------------------------------------------+---------+------------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | questions      | range  | PRIMARY,index_questions_on_subject_id                                           | PRIMARY                                         | 4       | NULL                                           |   30 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | answers        | ref    | index_answers_on_question_id                                                    | index_answers_on_question_id                    | 5       | millionaire_development.questions.id           |    2 |                                              | 
|  1 | SIMPLE      | quiz_questions | ref    | index_quiz_questions_on_question_id                                             | index_quiz_questions_on_question_id             | 5       | millionaire_development.questions.id           |    1 |                                              | 
|  1 | SIMPLE      | quizzes        | eq_ref | PRIMARY                                                                         | PRIMARY                                         | 4       | millionaire_development.quiz_questions.quiz_id |    1 |                                              | 
|  1 | SIMPLE      | subjects       | eq_ref | PRIMARY                                                                         | PRIMARY                                         | 4       | millionaire_development.questions.subject_id   |    1 |                                              | 
|  1 | SIMPLE      | taggings       | ref    | index_taggings_on_taggable_id_and_taggable_type,index_taggings_on_taggable_type | index_taggings_on_taggable_id_and_taggable_type | 263     | millionaire_development.questions.id,const     |    1 |                                              | 
|  1 | SIMPLE      | tags           | eq_ref | PRIMARY                                                                         | PRIMARY                                         | 4       | millionaire_development.taggings.tag_id        |    1 | Using where                                  | 
|  1 | SIMPLE      | gradings       | ref    | index_gradings_on_question_id                                                   | index_gradings_on_question_id                   | 5       | millionaire_development.questions.id           |    2 |                                              | 
|  1 | SIMPLE      | age_groups     | eq_ref | PRIMARY                                                                         | PRIMARY                                         | 4       | millionaire_development.gradings.age_group_id  |    1 |                                              | 
|  1 | SIMPLE      | difficulties   | eq_ref | PRIMARY                                                                         | PRIMARY                                         | 4       | millionaire_development.gradings.difficulty_id |    1 |                                              | 
+----+-------------+----------------+--------+---------------------------------------------------------------------------------+-------------------------------------------------+---------+------------------------------------------------+------+----------------------------------------------+
The query_reviewer plugin has this to say about it - it lists several problems:
Table questions:  Using temporary table, Long key length (263), Using filesort 
MySQL must do an extra pass to find out how to retrieve the rows in sorted order.
To resolve the query, MySQL needs to create a temporary table to hold the result.
The key used for the index was rather long, potentially affecting indices in memory
2) It looks like rails isn't splitting this find up in a very optimal way.  Is it, do you think?  Am i better off doing several find queries manually rather than one big combined one?
Grateful for any advice, max