merging two tables, while applying aggregates on the duplicates (max,min and sum)

Posted by cloudraven on Stack Overflow See other posts from Stack Overflow or by cloudraven
Published on 2014-06-05T02:10:48Z Indexed on 2014/06/05 3:26 UTC
Read the original article Hit count: 161

Filed under:
|
|

I have a table (let's call it log) with a few millions of records. Among the fields I have Id, Count, FirstHit, LastHit.

  • Id - The record id
  • Count - number of times this Id has been reported
  • FirstHit - earliest timestamp with which this Id was reported
  • LastHit - latest timestamp with which this Id was reported

This table only has one record for any given Id

Everyday I get into another table (let's call it feed) with around half a million records with these fields among many others:

  • Id
  • Timestamp - Entry date and time.

This table can have many records for the same id

What I want to do is to update log in the following way. Count - log count value, plus the count() of records for that id found in feed FirstHit - the earliest of the current value in log or the minimum value in feed for that id LastHit - the latest of the current value in log or the maximum value in feed for that id.

It should be noticed that many of the ids in feed are already in log.

The simple thing that worked is to create a temporary table and insert into it the union of both as in

Select Id, Min(Timestamp) As FirstHit, MAX(Timestamp) as LastHit, Count(*) as Count FROM feed GROUP BY Id
UNION ALL
Select Id, FirstHit,LastHit,Count FROM log;

From that temporary table I do a select that aggregates Min(firsthit), max(lasthit) and sum(Count)

Select Id, Min(FirstHit),Max(LastHit),Sum(Count) FROM @temp GROUP BY Id;

and that gives me the end result. I could then delete everything from log and replace it with everything with temp, or craft an update for the common records and insert the new ones. However, I think both are highly inefficient.

Is there a more efficient way of doing this. Perhaps doing the update in place in the log table?

© Stack Overflow or respective owner

Related posts about sql

Related posts about sql-server