HBase schema design -- to make sorting easy?

Posted by chen on Stack Overflow
Published on 2010-03-25T15:28:20Z Indexed on 2010/06/13 20:32 UTC

I have 1M words in my dictionary. Whenever a user issues a query on my website, I check whether the query contains any words from my dictionary and increment the counter for each matching word individually. For example, if a user types "Obama is a president" and both "Obama" and "president" are in my dictionary, then I should increment the counters for "Obama" and "president" by 1 each.
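To make the counting rule concrete, here is a minimal sketch in Python. It uses an in-memory `Counter` as a stand-in for the HBase table, and the function name `count_dictionary_words` is hypothetical. It assumes naive whitespace tokenisation and that a word is counted at most once per query; neither detail is specified above.

```python
from collections import Counter


def count_dictionary_words(query, dictionary, counters):
    """Increment the counter of every dictionary word that appears in the query."""
    # Assumption: split on whitespace and count each word once per query.
    for word in set(query.split()):
        if word in dictionary:
            counters[word] += 1
    return counters


# In-memory stand-ins for the 1M-word dictionary and the counter table.
dictionary = {"Obama", "president"}
counters = Counter()

count_dictionary_words("Obama is a president", dictionary, counters)
print(counters["Obama"], counters["president"])  # prints: 1 1
```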

From time to time, I want to see the top 100 words (the most queried words). If I use HBase to store the counters, what schema should I use? I have not come up with an efficient one yet.

If I use the dictionary word as the row key and "counter" as the column key, then updating (incrementing) a counter is very efficient. But it is very hard to sort the rows by count and return the top 100.
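The difficulty comes from HBase keeping rows sorted by row key (here, the word), not by counter value, so a top-100 query has to read every row. The sketch below illustrates that cost with a plain Python list standing in for the result of a full table scan; `top_n` and the sample rows are hypothetical, and this is not HBase client code. It keeps only the best `n` candidates in a bounded heap while passing over all rows once.

```python
import heapq


def top_n(counter_rows, n=100):
    """Return the n (word, count) pairs with the largest counts, highest first."""
    # counter_rows: iterable of (row_key, counter_value) pairs,
    # as a full scan of the word-keyed table would yield them.
    return heapq.nlargest(n, counter_rows, key=lambda row: row[1])


# Rows arrive ordered by word, which says nothing about their counts.
rows = [("hbase", 5), ("obama", 42), ("president", 17), ("weather", 99)]
print(top_n(rows, n=2))  # prints: [('weather', 99), ('obama', 42)]
```

With 1M words, each such report touches all 1M rows even though only 100 are returned, which is the inefficiency the question is about.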

Can anyone give some good advice? Thanks.

© Stack Overflow or respective owner
