Why is SQLite3 using covering indices instead of the indices I created?

Posted by Geoff on Stack Overflow See other posts from Stack Overflow or by Geoff
Published on 2012-10-05T03:34:40Z Indexed on 2012/10/05 3:37 UTC
Read the original article Hit count: 227

Filed under:

I have an extremely large database (contacts has ~3 billion entries, people has ~280 million entries, and the other tables have a negligible number of entries). Most other queries I've run are really fast. However, I've encountered a more complicated query that's really slow. I'm wondering if there's any way to speed this up.

First of all, here is my schema:

CREATE TABLE activities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE contacts (
        id INTEGER PRIMARY KEY,
        person1_id INTEGER NOT NULL,
        person2_id INTEGER NOT NULL,
        duration REAL NOT NULL, -- hours
        activity_id INTEGER NOT NULL
    --  FOREIGN_KEY(person1_id) REFERENCES people(id),
    --  FOREIGN_KEY(person2_id) REFERENCES people(id)
    );
CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        state_id INTEGER NOT NULL,
        county_id INTEGER NOT NULL,
        age INTEGER NOT NULL,
        gender TEXT NOT NULL, -- M or F
        income INTEGER NOT NULL
    --  FOREIGN_KEY(state_id) REFERENCES states(id)
    );
CREATE TABLE states (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        abbreviation TEXT NOT NULL
    );
CREATE INDEX activities_name_index on activities(name);
CREATE INDEX contacts_activity_id_index on contacts(activity_id);
CREATE INDEX contacts_duration_index on contacts(duration);
CREATE INDEX contacts_person1_id_index on contacts(person1_id);
CREATE INDEX contacts_person2_id_index on contacts(person2_id);
CREATE INDEX people_age_index on people(age);
CREATE INDEX people_county_id_index on people(county_id);
CREATE INDEX people_gender_index on people(gender);
CREATE INDEX people_income_index on people(income);
CREATE INDEX people_state_id_index on people(state_id);
CREATE INDEX states_abbreviation_index on states(abbreviation);
CREATE INDEX states_name_index on states(name);

Note that I've created an index on every column in the database. I don't care about the size of the database; speed is all I care about.

Here's an example of a query that, as expected, runs almost instantly:

SELECT count(*) FROM people, states WHERE people.state_id=states.id and states.abbreviation='IA';

Here's the troublesome query:

SELECT * FROM contacts WHERE rowid IN
    (SELECT contacts.rowid FROM contacts, people, states
        WHERE contacts.person1_id=people.id AND people.state_id=states.id AND states.name='Kansas'
            INTERSECT
    SELECT contacts.rowid FROM contacts, people, states
        WHERE contacts.person2_id=people.id AND people.state_id=states.id AND states.name='Missouri');

Now, what I think would happen is that each subquery would use each relevant index I've created to speed this up. However, when I show the query plan, I see this:

sqlite> EXPLAIN QUERY PLAN SELECT * FROM contacts WHERE rowid IN (SELECT contacts.rowid FROM contacts, people, states WHERE contacts.person1_id=people.id AND people.state_id=states.id AND states.name='Kansas' INTERSECT SELECT contacts.rowid FROM contacts, people, states WHERE contacts.person2_id=people.id AND people.state_id=states.id AND states.name='Missouri');
0|0|0|SEARCH TABLE contacts USING INTEGER PRIMARY KEY (rowid=?) (~25 rows)
0|0|0|EXECUTE LIST SUBQUERY 1
2|0|2|SEARCH TABLE states USING COVERING INDEX states_name_index (name=?) (~1 rows)
2|1|1|SEARCH TABLE people USING COVERING INDEX people_state_id_index (state_id=?) (~5569556 rows)
2|2|0|SEARCH TABLE contacts USING COVERING INDEX contacts_person1_id_index (person1_id=?) (~12 rows)
3|0|2|SEARCH TABLE states USING COVERING INDEX states_name_index (name=?) (~1 rows)
3|1|1|SEARCH TABLE people USING COVERING INDEX people_state_id_index (state_id=?) (~5569556 rows)
3|2|0|SEARCH TABLE contacts USING COVERING INDEX contacts_person2_id_index (person2_id=?) (~12 rows)
1|0|0|COMPOUND SUBQUERIES 2 AND 3 USING TEMP B-TREE (INTERSECT)

In fact, if I show the query plan for the first query I posted, I get this:

sqlite> EXPLAIN QUERY PLAN SELECT count(*) FROM people, states WHERE people.state_id=states.id and states.abbreviation='IA';
0|0|1|SEARCH TABLE states USING COVERING INDEX states_abbreviation_index (abbreviation=?) (~1 rows)
0|1|0|SEARCH TABLE people USING COVERING INDEX people_state_id_index (state_id=?) (~5569556 rows)

Why is SQLite using covering indices instead of the indices I created? Shouldn't the search in the people table be able to happen in log(n) time given state_id which in turn is found in log(n) time?

Developer IT

Why is SQLite3 using covering indices instead of the indices I created? - Developer IT

Why is SQLite3 using covering indices instead of the indices I created?

sql

sqlite

sqlite3

Related posts about sql

SQL SERVER – Concat Strings in SQL Server using T-SQL – SQL in Sixty Seconds #035 – Video

SQL SERVER – Concat Function in SQL Server – SQL Concatenation

Error with SQL Server Setup 2012 on Windows 2012

How can I detect which version of SQL (eg SQL 2008 or SQL Azure)

Nested SQL Select statement fails on SQL Server 2000, ok on SQL Server 2005

Related posts about sqlite

How to add SQLite (SQLite.NET) to my C# project

dotConnect or "sqlite.phxsoftware.com" for using Entity Framework with SQLite

SQLite table does not exist exception for existing SQLite database (and table)

Android Jelly bean database is locked (code 5)

Android:accessing assets folder sqlite databse file with .sqlite extension

Categories cloud