Optimize INSERT / UPDATE / DELETE operation
- by clime
I wonder whether the following script can be optimized. It writes a lot to disk because it deletes every row for the content type, including rows that may already be up to date, and then reinserts them. I was thinking about applying something like "insert ... on duplicate key update" and found some approaches for single-row updates, but I don't know how to apply them to an INSERT INTO ... SELECT query.
CREATE OR REPLACE FUNCTION update_member_search_index() RETURNS VOID AS $$
        DECLARE
                member_content_type_id INTEGER;
        BEGIN
                member_content_type_id := (SELECT id FROM django_content_type WHERE app_label='web' AND model='member');
                DELETE FROM watson_searchentry WHERE content_type_id = member_content_type_id;
                INSERT INTO watson_searchentry (engine_slug, content_type_id, object_id, object_id_int, title, description, content, url, meta_encoded)
                SELECT 'default',
                        member_content_type_id,
                        web_member.id,
                        web_member.id,
                        web_member.name,
                        '',
                        web_user.email||' '||web_member.normalized_name||' '||web_country.name,
                        '',
                        '{}'
                FROM web_member
                        INNER JOIN web_user ON (web_member.user_id = web_user.id)
                        INNER JOIN web_country ON (web_member.country_id = web_country.id)
                WHERE web_user.is_active=TRUE;
        END;
$$ LANGUAGE plpgsql;
EDIT: Schemas of web_member, watson_searchentry, web_user, web_country: http://pastebin.com/3tRVPPVi.
The pair (content_type_id, object_id_int) is unique in watson_searchentry, but at the moment there is no index on it (nothing has needed one so far).
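If an upsert-style approach turns out to need that uniqueness to be declared, I assume the index could be added with something like the sketch below. The index name is made up for illustration, and the statement would fail if duplicate pairs already exist in the table.

```sql
-- Sketch only: declares the (content_type_id, object_id_int) pair as unique.
-- The index name is hypothetical. This fails if duplicate pairs already exist.
CREATE UNIQUE INDEX watson_searchentry_ct_obj_uniq
    ON watson_searchentry (content_type_id, object_id_int);
```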
This script is meant to run at most once a day, as a full rebuild of the search index.
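To make the goal concrete, here is a rough, untested sketch of the shape I have in mind. It rests on several assumptions: PostgreSQL 9.5 or later (for ON CONFLICT), a unique index on (content_type_id, object_id_int), and only title and content ever changing between runs. The function name is made up so it does not replace the original.

```sql
-- Sketch only: rewrites changed rows and removes stale ones instead of
-- deleting and reinserting everything. Assumes PostgreSQL 9.5+ and a unique
-- index on (content_type_id, object_id_int). Function name is hypothetical.
CREATE OR REPLACE FUNCTION update_member_search_index_upsert() RETURNS VOID AS $$
DECLARE
    member_content_type_id INTEGER;
BEGIN
    SELECT id INTO member_content_type_id
    FROM django_content_type
    WHERE app_label = 'web' AND model = 'member';

    -- Drop entries whose member would no longer be selected by the rebuild query.
    DELETE FROM watson_searchentry se
    WHERE se.content_type_id = member_content_type_id
      AND NOT EXISTS (
          SELECT 1
          FROM web_member m
          INNER JOIN web_user u ON (m.user_id = u.id)
          INNER JOIN web_country c ON (m.country_id = c.id)
          WHERE m.id = se.object_id_int
            AND u.is_active
      );

    -- Insert missing rows; update existing rows only when something differs.
    INSERT INTO watson_searchentry (engine_slug, content_type_id, object_id, object_id_int, title, description, content, url, meta_encoded)
    SELECT 'default',
           member_content_type_id,
           m.id,
           m.id,
           m.name,
           '',
           u.email||' '||m.normalized_name||' '||c.name,
           '',
           '{}'
    FROM web_member m
    INNER JOIN web_user u ON (m.user_id = u.id)
    INNER JOIN web_country c ON (m.country_id = c.id)
    WHERE u.is_active
    ON CONFLICT (content_type_id, object_id_int) DO UPDATE
        SET title = EXCLUDED.title,
            content = EXCLUDED.content
        -- Skip the write entirely when the stored values already match.
        WHERE (watson_searchentry.title, watson_searchentry.content)
              IS DISTINCT FROM (EXCLUDED.title, EXCLUDED.content);
END;
$$ LANGUAGE plpgsql;
```

The WHERE clause on DO UPDATE is the part meant to avoid rewriting rows that are already up to date. I have not measured whether this is actually cheaper than the delete-and-reinsert version.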