Need help optimizing this Django aggregate query

Posted by Chris Lawlor on Stack Overflow
Published on 2010-06-06T22:14:17Z Indexed on 2010/06/06 22:22 UTC

I have the following model

from django.db import models

class Plugin(models.Model):
    name = models.CharField(max_length=50)
    # more fields

which represents a plugin that can be downloaded from my site. To track downloads, I have

class Download(models.Model):
    plugin = models.ForeignKey(Plugin)
    timestamp = models.DateTimeField(auto_now=True)

So to build a view showing plugins sorted by downloads, I have the following query:

from django.db.models import Count

# pbd = plugins by download (name shortened to avoid horizontal scrolling)
pbd = Plugin.objects.annotate(dl_total=Count('download')).order_by('-dl_total')

This works, but it is very slow. With only 1,000 plugins, the average response time is 3.6 to 3.9 seconds (devserver with a local PostgreSQL database), whereas a similar view with a much simpler query (sorting by plugin release date) takes about 160 ms.
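To make it clearer what the database is being asked to do, here is a rough sketch of the kind of SQL an annotate-with-Count query usually turns into, run against an in-memory SQLite database so it works without a Django project. The table and column names are simplified assumptions (Django would prefix them with the app label), and the exact SQL Django emits may differ. The real statement can be checked with `print(pbd.query)`.

```python
import sqlite3


def plugins_by_download_count():
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed table names standing in for the two models.
    conn.executescript("""
    CREATE TABLE plugin (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE download (
        id INTEGER PRIMARY KEY,
        plugin_id INTEGER NOT NULL REFERENCES plugin(id)
    );
    INSERT INTO plugin (id, name) VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO download (plugin_id) VALUES (2), (2), (2), (1);
    """)
    # Assumed shape of the aggregate: every plugin is joined to all of its
    # download rows, the joined rows are grouped and counted, and the sort
    # happens only after the whole count has been computed.
    sql = """
    SELECT plugin.id, plugin.name, COUNT(download.id) AS dl_total
    FROM plugin
    LEFT OUTER JOIN download ON download.plugin_id = plugin.id
    GROUP BY plugin.id, plugin.name
    ORDER BY dl_total DESC
    """
    return conn.execute(sql).fetchall()


rows = plugins_by_download_count()
```

The LEFT OUTER JOIN keeps plugins that have no downloads at all (they get a count of 0). The cost of the query grows with the number of download rows rather than with the number of plugins.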

I'm looking for suggestions on how to optimize this query. I'd really prefer that the query return Plugin objects (as opposed to using values) since I'm sharing the same template for the other views (Plugins by rating, Plugins by release date, etc.), so the template is expecting Plugin objects - plus I'm not sure how I would get things like the absolute_url without a reference to the plugin object.

Or, is my whole approach doomed to failure? Is there a better way to track downloads? I ultimately want to provide users some nice download statistics for the plugins they've uploaded - like downloads per day/week/month. Will I have to calculate and cache Downloads at some point?
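One possible form of the "calculate and cache" idea is a denormalized counter stored on the plugin row, with the detailed download rows kept for the per-day statistics. The sketch below uses plain `sqlite3` so it runs on its own. The `download_count` column, the index, and the function names are all hypothetical and not part of the models above.

```python
import sqlite3

# Hypothetical schema: mirrors Plugin and Download, plus an assumed
# download_count column that caches the per-plugin total.
SCHEMA = """
CREATE TABLE plugin (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    download_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE download (
    id INTEGER PRIMARY KEY,
    plugin_id INTEGER NOT NULL REFERENCES plugin(id),
    timestamp TEXT NOT NULL
);
CREATE INDEX download_plugin_ts ON download (plugin_id, timestamp);
"""


def record_download(conn, plugin_id, timestamp):
    """Store the download row and bump the cached counter in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO download (plugin_id, timestamp) VALUES (?, ?)",
            (plugin_id, timestamp),
        )
        conn.execute(
            "UPDATE plugin SET download_count = download_count + 1 WHERE id = ?",
            (plugin_id,),
        )


def plugins_by_downloads(conn):
    """Sort on the cached counter; no join or GROUP BY at read time."""
    return conn.execute(
        "SELECT name, download_count FROM plugin "
        "ORDER BY download_count DESC, name"
    ).fetchall()


def downloads_per_day(conn, plugin_id):
    """Per-day statistics still come from the detailed download rows."""
    return conn.execute(
        "SELECT date(timestamp) AS day, COUNT(*) FROM download "
        "WHERE plugin_id = ? GROUP BY day ORDER BY day",
        (plugin_id,),
    ).fetchall()


def build_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO plugin (id, name) VALUES (?, ?)",
        [(1, "alpha"), (2, "beta")],
    )
    sample = [
        (2, "2010-06-05 10:00:00"),
        (2, "2010-06-05 18:30:00"),
        (2, "2010-06-06 09:15:00"),
        (1, "2010-06-06 12:00:00"),
    ]
    for plugin_id, timestamp in sample:
        record_download(conn, plugin_id, timestamp)
    return conn
```

In Django terms, the counter bump could probably be written as `Plugin.objects.filter(pk=plugin.pk).update(download_count=F('download_count') + 1)`, again assuming a `download_count` field is added. Sorting with `order_by('-download_count')` would then still return Plugin objects for the shared template.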

EDIT: In my test dataset, there are between 10 and 20 Download instances per Plugin. In production, I expect this number to be much higher for many of the plugins.
