Complex SQL query help on aggregating values for nested subquery

Posted by François Beausoleil on Stack Overflow
Published on 2010-03-12T04:59:42Z

Hi!

I have people, companies, employees, events and event kinds. I'm making a report/followup sheet where people, companies and employees are the rows, and the columns are event kinds.

Event kinds are simple labels such as "Promised Donation", "Received Donation", "Phoned" and "Followed up". They are ordered by a position column:

CREATE TABLE event_kinds (
  id,
  name,
  position);

Events hold the actual reference to the event kind, along with the person and/or company concerned:

CREATE TABLE events (
  id,
  person_id,
  company_id,
  referrer_id,
  event_kind_id,
  created_at);

referrer_id is another reference to people: the person who passed the information/tip along. It is an optional field. For some event kinds I want to filter on a specific referrer, while for other event kinds I don't.

Notice that events has no employee ID column. The reference exists, but it is implied: application code validates that person_id and company_id together really reference an employee record. The other tables are pretty basic:

CREATE TABLE people (
  id, name);

CREATE TABLE companies (
  id, name);

CREATE TABLE employees (
  id, person_id, company_id);
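The table definitions above list column names only. As a minimal sketch of what runnable PostgreSQL DDL could look like, the block below fills in types and constraints. Every type, NOT NULL, foreign key and default here is an assumption, and the index name events_latest_idx is hypothetical; only the table and column names come from the post.

```sql
-- Assumed types and constraints; the post only gives column names.
CREATE TABLE people (
  id   serial PRIMARY KEY,
  name text NOT NULL
);

CREATE TABLE companies (
  id   serial PRIMARY KEY,
  name text NOT NULL
);

CREATE TABLE employees (
  id         serial PRIMARY KEY,
  person_id  integer NOT NULL REFERENCES people (id),
  company_id integer NOT NULL REFERENCES companies (id)
);

CREATE TABLE event_kinds (
  id       serial PRIMARY KEY,
  name     text NOT NULL,
  position integer NOT NULL          -- display order of the report columns
);

CREATE TABLE events (
  id            serial PRIMARY KEY,
  person_id     integer REFERENCES people (id),     -- NULL for company-only events
  company_id    integer REFERENCES companies (id),  -- NULL for person-only events
  referrer_id   integer REFERENCES people (id),     -- optional
  event_kind_id integer NOT NULL REFERENCES event_kinds (id),
  created_at    timestamp NOT NULL DEFAULT now()
);

-- Hypothetical index to support "latest event of a kind per row" lookups.
CREATE INDEX events_latest_idx
  ON events (event_kind_id, company_id, person_id, created_at DESC);
```

An event with both person_id and company_id set is the implied employee reference; nothing in this sketch enforces that the pair exists in employees, which matches the post's statement that the check lives in application code.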

I'm trying to achieve the following report:

                         Referrer       Phoned     Promised   Donated
    Francois                            Feb 16th   Feb 20th   Mar 1st
    Apple (Steve Jobs)   Steve Ballmer                        Mar 3rd
    IBM                  Bill Gates     Mar 7th

The first row is a people record, the 2nd is an employee, and the 3rd is a company. If I asked for referrer Bill Gates on Phoned event kinds, I'd only see the 3rd row, while asking for referrer Steve Ballmer on Phoned would return no rows.

Right now, I do 3 queries, one for companies, one for people and a last one for employees. I want the event kind columns to be ordered, but I do that in application code and show it properly there. Here's where I'm at so far:

SELECT companies.id,
       companies.name,
       (SELECT events.id FROM events WHERE events.referrer_id = 1470 AND events.company_id = companies.id AND events.person_id IS NULL AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9,
       (SELECT events.id FROM events WHERE events.company_id = companies.id AND events.person_id IS NULL AND events.event_kind_id = 10 ORDER BY created_at DESC LIMIT 1) event_kind_10,
       (SELECT events.created_at FROM events WHERE events.referrer_id = 1470 AND events.company_id = companies.id AND events.person_id IS NULL AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9_order
FROM "companies";

SELECT people.id,
       people.name,
       (SELECT events.id FROM events WHERE events.referrer_id = 1470 AND events.company_id IS NULL AND events.person_id = people.id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9,
       (SELECT events.id FROM events WHERE events.company_id IS NULL AND events.person_id = people.id AND events.event_kind_id = 10 ORDER BY created_at DESC LIMIT 1) event_kind_10,
       (SELECT events.created_at FROM events WHERE events.referrer_id = 1470 AND events.company_id IS NULL AND events.person_id = people.id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9_order
FROM "people";

SELECT employees.id,
       employees.company_id,
       employees.person_id,
       (SELECT events.id FROM events WHERE events.referrer_id = 1470 AND events.company_id = employees.company_id AND events.person_id = employees.person_id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9,
       (SELECT events.id FROM events WHERE events.company_id = employees.company_id AND events.person_id = employees.person_id AND events.event_kind_id = 10 ORDER BY created_at DESC LIMIT 1) event_kind_10,
       (SELECT events.created_at FROM events WHERE events.referrer_id = 1470 AND events.company_id = employees.company_id AND events.person_id = employees.person_id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9_order
FROM "employees";

I rather suspect I'm doing this wrong. There should be an "easier" way to do it.
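One possible simplification, shown here for the companies query only, is to fetch the latest event per company once per event kind with PostgreSQL's DISTINCT ON and join it back, so that the id and the created_at come from the same row instead of from two separate correlated subqueries. This is a sketch under the assumption that the tables look as described above; the aliases c, k9 and k10 are made up for the example, and the ids 9, 10 and 1470 are the ones used in the queries above.

```sql
SELECT c.id,
       c.name,
       k9.id         AS event_kind_9,
       k10.id        AS event_kind_10,
       k9.created_at AS event_kind_9_order
FROM companies c
-- Latest kind-9 event per company, restricted to referrer 1470.
LEFT JOIN (
  SELECT DISTINCT ON (company_id) company_id, id, created_at
  FROM events
  WHERE event_kind_id = 9
    AND referrer_id = 1470
    AND person_id IS NULL
    AND company_id IS NOT NULL
  ORDER BY company_id, created_at DESC
) k9 ON k9.company_id = c.id
-- Latest kind-10 event per company, with no referrer filter.
LEFT JOIN (
  SELECT DISTINCT ON (company_id) company_id, id, created_at
  FROM events
  WHERE event_kind_id = 10
    AND person_id IS NULL
    AND company_id IS NOT NULL
  ORDER BY company_id, created_at DESC
) k10 ON k10.company_id = c.id;
```

The LEFT JOINs keep companies that have no matching event, with NULL in the event columns, just as the scalar subqueries do. Whether this is actually faster than the correlated subqueries depends on the data and indexes, so it is worth comparing the plans with EXPLAIN.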

Another filter criterion would be the person/company name: WHERE LOWER(companies.name) LIKE '%apple%'.
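For reference, PostgreSQL also has the case-insensitive ILIKE operator, which expresses the same name filter without the LOWER() call. A minimal example against the companies table described above:

```sql
-- Case-insensitive substring match on the company name.
-- ILIKE is a PostgreSQL extension, not standard SQL.
SELECT companies.id,
       companies.name
FROM companies
WHERE companies.name ILIKE '%apple%';
```

Either form has a leading wildcard, so a plain b-tree index on name will not help with it.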

Note that I'm ordering by the date of the latest event_kind_9 event here (the event_kind_9_order column), with a secondary sort on person/company name.

To summarize: I want to paginate the result set, find the latest event for each cell, order the result set by the date of the latest event and then by company/person name, and filter by referrer for some event kinds but not for others.
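The requirements above could be sketched as a single statement that stacks the three row sources with UNION ALL, then sorts and paginates the combined set. This is only an illustration under stated assumptions: the column names row_kind, row_id and label, the aliases, the "latest first" sort direction and the page size of 25 are all hypothetical, and only the event_kind_9_order column is shown to keep the example short. The ids 9 and 1470 are the ones used in the queries above.

```sql
SELECT row_kind, row_id, label, event_kind_9_order
FROM (
  -- Company rows: events with no person attached.
  SELECT 'company'::text AS row_kind,
         c.id            AS row_id,
         c.name::text    AS label,
         (SELECT e.created_at
          FROM events e
          WHERE e.referrer_id = 1470
            AND e.company_id = c.id
            AND e.person_id IS NULL
            AND e.event_kind_id = 9
          ORDER BY e.created_at DESC
          LIMIT 1)       AS event_kind_9_order
  FROM companies c

  UNION ALL

  -- Person rows: events with no company attached.
  SELECT 'person'::text,
         p.id,
         p.name::text,
         (SELECT e.created_at
          FROM events e
          WHERE e.referrer_id = 1470
            AND e.company_id IS NULL
            AND e.person_id = p.id
            AND e.event_kind_id = 9
          ORDER BY e.created_at DESC
          LIMIT 1)
  FROM people p

  UNION ALL

  -- Employee rows: events carrying both the person and the company.
  SELECT 'employee'::text,
         emp.id,
         co.name || ' (' || pe.name || ')',
         (SELECT e.created_at
          FROM events e
          WHERE e.referrer_id = 1470
            AND e.company_id = emp.company_id
            AND e.person_id = emp.person_id
            AND e.event_kind_id = 9
          ORDER BY e.created_at DESC
          LIMIT 1)
  FROM employees emp
  JOIN companies co ON co.id = emp.company_id
  JOIN people pe    ON pe.id = emp.person_id
) AS report
-- Latest event first; rows without the event sort last.
-- row_kind and row_id break ties so that pages stay stable.
ORDER BY event_kind_9_order DESC NULLS LAST, label, row_kind, row_id
LIMIT 25 OFFSET 0;  -- page 1; page n would use OFFSET (n - 1) * 25
```

NULLS LAST matters here because PostgreSQL sorts NULLs first under DESC by default, which would otherwise put rows without any matching event at the top of the report.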

For reference, I'm using PostgreSQL, from Ruby, ActiveRecord/Rails. The solution is pure SQL though.

© Stack Overflow or respective owner
