select all values from a dimension for which there are facts in all other dimensions

Posted by ideasculptor on Stack Overflow
Published on 2010-04-06T17:43:12Z Indexed on 2010/04/06 18:43 UTC

I've tried to simplify for the purposes of asking this question. Hopefully, this will be comprehensible.

Basically, I have a fact table with a time dimension, another dimension, and a hierarchical dimension. For the purposes of the question, let's assume the hierarchical dimension is zip code and state. The other dimension is just descriptive. Let's call it 'customer'. Let's assume there are 50 customers.

I need to find the set of states for which there is at least one zip code in which EVERY customer has at least one fact row for each day in the time dimension. If a zip code has only 49 customers, I don't care about it. If even one of the 50 customers doesn't have a value for even 1 day in a zip code, I don't care about it. Finally, I also need to know which zip codes qualified the state for selection. Note, there is no requirement that every zip code have a full data set - only that at least one zip code does.

I don't mind making multiple queries and doing some processing on the client side. This is a dataset that only needs to be generated once per day and can be cached. I don't even see a particularly clean way to do it with multiple queries short of simply brute-force iteration, and there are a heck of a lot of 'zip codes' in the data set. (They are not actually zip codes, but there are approximately 100,000 entries in the lower level of the hierarchy and several hundred in the top level, so zip code -> state is a reasonable analogy.)
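To make the requirement concrete, here is a minimal sketch of one set-based way it could be expressed, rather than iterating over zip codes: count the distinct (customer, day) pairs that have facts in each zip code and keep the zip codes where that count equals the number of customers times the number of days. All table and column names (`fact`, `customer`, `time_dim`, `zip_dim`, `zip_id`, and so on) are hypothetical, SQLite is used only so the example runs on its own, and it assumes every fact row references an existing customer and day.

```python
import sqlite3

# A zip qualifies when its distinct (customer, day) pairs cover the full
# customer x day grid. Assumes fact rows only reference rows that exist
# in customer and time_dim (hypothetical schema, not the real one).
QUALIFYING_ZIPS_SQL = """
SELECT z.state, z.zip_id
FROM (
    SELECT zip_id
    FROM (SELECT DISTINCT zip_id, customer_id, day_id FROM fact) AS d
    GROUP BY zip_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM customer)
                    * (SELECT COUNT(*) FROM time_dim)
) AS q
JOIN zip_dim AS z ON z.zip_id = q.zip_id
ORDER BY z.state, z.zip_id
"""


def qualifying_zips(conn):
    """Return {state: [zip_id, ...]} for zips with complete coverage."""
    result = {}
    for state, zip_id in conn.execute(QUALIFYING_ZIPS_SQL):
        result.setdefault(state, []).append(zip_id)
    return result


def build_demo():
    """Build a tiny in-memory data set: 2 customers, 2 days, 3 zips."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
        CREATE TABLE time_dim (day_id INTEGER PRIMARY KEY);
        CREATE TABLE zip_dim (zip_id TEXT PRIMARY KEY, state TEXT);
        CREATE TABLE fact (day_id INTEGER, customer_id INTEGER, zip_id TEXT);
    """)
    conn.executemany("INSERT INTO customer VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO time_dim VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO zip_dim VALUES (?, ?)",
        [("10001", "NY"), ("10002", "NY"), ("94105", "CA")],
    )
    facts = [
        # 10001: every customer on every day (plus one duplicate row)
        (1, 1, "10001"), (1, 1, "10001"), (1, 2, "10001"),
        (2, 1, "10001"), (2, 2, "10001"),
        # 10002: customer 2 has nothing on day 2; the duplicate row
        # brings the raw row count to 4 but only 3 pairs are distinct
        (1, 1, "10002"), (1, 1, "10002"), (1, 2, "10002"), (2, 1, "10002"),
        # 94105: only customer 1 has facts
        (1, 1, "94105"), (2, 1, "94105"),
    ]
    conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", facts)
    return conn


if __name__ == "__main__":
    print(qualifying_zips(build_demo()))
```

In this toy data only zip 10001 is complete, so NY is selected and CA is not. The keys of the returned mapping are the states, and the values are the zip codes that qualified each one. The inner `DISTINCT` matters because a customer may have several fact rows on the same day, which would otherwise inflate the count.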
