Optimization of SQL query regarding pair comparisons
Posted by InfiniteSquirrel on Stack Overflow
Published on 2011-01-05T16:19:23Z
Hi, I'm working on a pair comparison site where a user loads a list of films and grades from another site. My site then picks two random movies and matches them against each other; the user selects the better of the two, and a new pair is loaded. Repeating this produces a complete list of the user's movies ranked from best to worst.
The database contains three tables:
fm_film_data - this contains all imported movies
fm_film_data(id int(11), 
             imdb_id varchar(10), 
             tmdb_id varchar(10), 
             title varchar(255),     
             original_title varchar(255),    
             year year(4),
             director text,
             description text,
             poster_url varchar(255))
fm_films - this contains all information related to a user, what movies the user has seen, what grades the user has given, as well as information about each film's wins/losses for that user.
fm_films(id int(11),
         user_id int(11),
         film_id int(11),
         grade int(11),  
         wins int(11),   
         losses int(11))
fm_log - this contains records of every duel that has occurred.
fm_log(id int(11),
       user_id int(11),
       winner int(11),
       loser int(11))
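To pick a pair to show the user, I've created a MySQL query that builds every possible pair of the user's films, excludes pairs that already appear in the log (in either order), and picks one of the remaining pairs at random.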
SELECT pair.id1, pair.id2 
FROM
    -- every ordered pair of distinct films belonging to this user
    (SELECT part1.id AS id1, part2.id AS id2 
    FROM fm_films AS part1, fm_films AS part2 
    WHERE part1.id <> part2.id 
        AND part1.user_id = [!!USERID!!] 
        AND part2.user_id = [!!USERID!!]) 
AS pair
LEFT JOIN
    -- every duel already played by this user, listed in both orders
    (SELECT winner AS id1, loser AS id2 
    FROM fm_log
    WHERE fm_log.user_id = [!!USERID!!]
    UNION
    SELECT loser AS id1, winner AS id2 
    FROM fm_log
    WHERE fm_log.user_id = [!!USERID!!])
AS log
ON pair.id1 = log.id1 AND pair.id2 = log.id2
-- keep only pairs with no matching log row, then pick one at random
WHERE log.id1 IS NULL
ORDER BY RAND()
LIMIT 1
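To make the query's logic easier to reason about, here is a minimal Python sketch of the same set computation: all ordered pairs of distinct films, minus any pair logged in either direction, then one random pick. The function names and sample data are hypothetical, and the sketch assumes (as the join condition in the query implies) that the winner and loser values in fm_log are comparable to fm_films.id.

```python
import random
from itertools import permutations

def unplayed_pairs(film_ids, log):
    """Return every ordered pair (id1, id2) with no duel logged in either direction."""
    # Mirrors the UNION subquery: each logged duel blocks both orderings.
    played = set()
    for winner, loser in log:
        played.add((winner, loser))
        played.add((loser, winner))
    # Mirrors the self-join (id1 <> id2) followed by the LEFT JOIN ... IS NULL filter.
    return [pair for pair in permutations(film_ids, 2) if pair not in played]

def pick_unplayed_pair(film_ids, log, rng=random):
    """Mirrors ORDER BY RAND() LIMIT 1: one random unplayed pair, or None."""
    candidates = unplayed_pairs(film_ids, log)
    return rng.choice(candidates) if candidates else None

# Hypothetical data: three films, one duel already played (film 1 beat film 2).
print(pick_unplayed_pair([1, 2, 3], [(1, 2)]))
```

Like the query, this builds the full candidate list before choosing a single row from it.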
This query is slow: it takes about 6 seconds in our tests with two users who have about 800 grades each.
I'm looking for a way to optimize it while still ensuring that each duel appears only once.
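A likely contributor to the running time is the size of the intermediate result: the self-join produces every ordered pair of a user's films, and ORDER BY RAND() then has to assign a random value to, and sort, every row that survives the log filter. The short calculation below shows the pair count for the figure mentioned in the post; the function name is hypothetical and 800 is the approximate number quoted above.

```python
def ordered_pairs(n):
    """Number of rows produced by a self-join of n rows with id1 <> id2."""
    return n * (n - 1)

films_per_user = 800  # "about 800 grades each", per the post
print(ordered_pairs(films_per_user))  # 639200
```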
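One possible direction, sketched here in Python rather than as a tested MySQL solution, is to draw a random pair first and only then check it against the log, retrying on a hit, instead of building every pair up front. All names are hypothetical. The sketch assumes most pairs are still unplayed; as the log fills up, retries become more frequent and a fallback to the exhaustive query would be needed.

```python
import random

def sample_unplayed_pair(film_ids, played, max_tries=100, rng=random):
    """Draw random pairs until one is not in `played`; return None if none is found.

    `played` is a set of frozensets {a, b}, so the order within a duel does not matter.
    """
    if len(film_ids) < 2:
        return None
    for _ in range(max_tries):
        a, b = rng.sample(film_ids, 2)  # two distinct films
        if frozenset((a, b)) not in played:
            return a, b
    return None  # caller could fall back to the exhaustive query here

# Hypothetical data: films 1 and 2 have already dueled.
played = {frozenset((1, 2))}
print(sample_unplayed_pair([1, 2, 3], played))
```

In database terms, each try would correspond to picking two random rows for the user and running one indexed lookup in fm_log, which is an assumption about how this could be mapped to SQL, not something taken from the post.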
The server runs MySQL version 5.0.90-community.