PostgreSQL - Why are some queries on large datasets so incredibly slow

Posted by Brad Mathews on 2010-05-19

Hello,

I have two types of queries that I run often on two large datasets, and both run much slower than I would expect.

The first type is a sequential scan updating all records:

Update rcra_sites Set street = regexp_replace(street,'/','','i')

rcra_sites has 700,000 records. The update takes 22 minutes from pgAdmin! I wrote a VB.NET function that loops through each record and sends a separate update query for each one (yes, 700,000 update queries!), and it runs in less than half the time. Hmmm....
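
For reference, one variant I may try: since PostgreSQL writes a new row version for every row an UPDATE touches, even when the value comes out unchanged, restricting the update to rows that actually contain a slash should cut the work. A sketch (note that without the 'g' flag regexp_replace only replaces the first match, so I've added it here on the assumption that every slash should go):

-- Skip rows with no slash so unchanged rows are not rewritten
UPDATE rcra_sites
SET street = regexp_replace(street, '/', '', 'gi')  -- 'g' = replace all matches
WHERE street LIKE '%/%';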

The second type is a simple update joined to a related table, which the planner again runs as a sequential scan:

Update rcra_sites as sites 
Set violations='No' 
From narcra_monitoring as v 
Where sites.agencyid=v.agencyid and v.found_violation_flag='N'

narcra_monitoring has 1,700,000 records. This takes 8 minutes. The query planner refuses to use my indexes. The query runs much faster if I first run set enable_seqscan = false;, but I would prefer the query planner to do its job on its own.
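
For reference, the workaround looks something like this, session-scoped; the EXPLAIN just shows which plan the planner picks without actually running the update (the composite index at the end is speculation on my part, not something I have tried):

SET enable_seqscan = false;

-- Shows the chosen plan without executing the update
EXPLAIN
UPDATE rcra_sites AS sites
SET violations = 'No'
FROM narcra_monitoring AS v
WHERE sites.agencyid = v.agencyid
  AND v.found_violation_flag = 'N';

RESET enable_seqscan;

-- Speculative: a composite index covering both the join and filter columns
-- CREATE INDEX narcra_monitoring_agency_flag_idx
--     ON narcra_monitoring (agencyid, found_violation_flag);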

I have appropriate indexes, and I have vacuumed and analyzed. I tuned shared_buffers and effective_cache_size as best I know how, so that more of my 4GB of memory gets used. My hardware is pretty darn good. I am running v8.4 on Windows 7.
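
For completeness, the relevant settings can be checked from any session like this; the values in the comments are just the ballpark I was aiming for on a 4GB machine, not recommendations:

SHOW shared_buffers;        -- e.g. 512MB; very large values reportedly help
                            -- less on Windows builds
SHOW effective_cache_size;  -- e.g. 2GB; a planner hint, not an allocation
SELECT version();           -- confirms the server is 8.4 on Windows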

Is PostgreSQL just this slow? Or am I still missing something?

Thanks! Brad
