Search Results

Search found 2 results on 1 pages for 'sakura90'.

Page 1/1 | 1 

  • how to prune data set?

    - by sakura90
    The MovieLens data set provides a table with columns: userid | movieid | tag | timestamp I have trouble reproducing the way they pruned the MovieLens data set used in: http://www.cse.ust.hk/~yzhen/papers/tagicofi-recsys09-zhen.pdf In 4.1 Data Set of the above paper, it writes "For the tagging information, we only keep those tags which are added on at least 3 distinct movies. As for the users, we only keep those users who used at least 3 distinct tags in their tagging history. For movies, we only keep those movies that are annotated by at least 3 distinct tags." I tried to query the database: select TMP.userid, count(*) as tagnum from (select distinct T.userid as userid, T.tag as tag from tags T) AS TMP group by TMP.userid having tagnum = 3; I got a list of 1760 users who labeled 3 distinct tags. However, some of the tags are not added on at least 3 distinct movies. Any help is appreciated.

    Read the article

  • How to prune data set by frequency to conform to paper's description

    - by sakura90
    The MovieLens data set provides a table with columns: userid | movieid | tag | timestamp I have trouble reproducing the way they pruned the MovieLens data set used in: Tag Informed Collaborative Filtering, by Zhen, Li and Young In 4.1 Data Set of the above paper, it writes "For the tagging information, we only keep those tags which are added on at least 3 distinct movies. As for the users, we only keep those users who used at least 3 distinct tags in their tagging history. For movies, we only keep those movies that are annotated by at least 3 distinct tags." I tried to query the database: select TMP.userid, count(*) as tagnum from (select distinct T.userid as userid, T.tag as tag from tags T) AS TMP group by TMP.userid having tagnum >= 3; I got a list of 1760 users who labeled 3 distinct tags. However, some of the tags are not added on at least 3 distinct movies. Any help is appreciated.

    Read the article

1