Here is my problem: the business requires a measure that is the distinct count of a field, broken out by various columns in specific dimensions. The field is a store ID (Retailer_ID). I need to capture the distinct number of store IDs across the Distributor dimension, which contains the following fields:
- Business Unit
- GM
- MDM
Also this will be at the monthly grain.
So here is the SQL that I have:
SELECT bu_cd, bu_dsc, gm_cd, gm_dsc, mdm_cd, mdm_dsc,
       count(distinct case when date_id between 20130101 and 20131231 then f.retail_id else NULL END) acct_sold_mth_ty,
       count(distinct case when date_id between 20120101 and 20121231 then f.retail_id else NULL END) acct_sold_mth_ly,
       count(distinct case when date_id between 20110101 and 20111231 then f.retail_id else NULL END) acct_sold_mth_lly
FROM datamart03.dbo.tbl_asa_fact f
join datamart03.dbo.tbl_asa_dist d on d.dist_id = f.dist_id
GROUP BY bu_cd, bu_dsc, gm_cd, gm_dsc, mdm_cd, mdm_dsc
ORDER BY 1 ASC, 3 ASC
I understand that a distinct count is a killer when aggregating over large amounts of data, especially the way I have it written.
The various permutations required to get an accurate measure are where I am severely challenged. What I have shown above is only a small sample of what needs to be done: there is another dimension with 3 columns that would also need to be part of the GROUP BY!
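To show why I can't just compute the distinct count at the lowest level and sum it up, here is a toy sketch. The table name toy_fact and all of the data are made up for illustration, and it uses an in-memory SQLite database from Python only so that it is self-contained; it is not my actual database. One store that buys through two MDMs under the same GM gets counted twice if the MDM-level counts are summed.

```python
import sqlite3

# Toy data, entirely made up for illustration: (gm_cd, mdm_cd, retail_id).
# Store 101 appears under two different MDMs that share the same GM.
rows = [
    ("GM1", "MDM_A", 101),
    ("GM1", "MDM_A", 102),
    ("GM1", "MDM_B", 101),
    ("GM1", "MDM_B", 103),
]

conn = sqlite3.connect(":memory:")
# toy_fact is a hypothetical stand-in for the real fact/dimension join.
conn.execute("CREATE TABLE toy_fact (gm_cd TEXT, mdm_cd TEXT, retail_id INTEGER)")
conn.executemany("INSERT INTO toy_fact VALUES (?, ?, ?)", rows)

# Distinct stores at the (GM, MDM) level.
per_mdm = conn.execute(
    "SELECT gm_cd, mdm_cd, COUNT(DISTINCT retail_id) "
    "FROM toy_fact GROUP BY gm_cd, mdm_cd ORDER BY gm_cd, mdm_cd"
).fetchall()

# Distinct stores at the GM level, computed directly from the detail rows.
per_gm = conn.execute(
    "SELECT gm_cd, COUNT(DISTINCT retail_id) FROM toy_fact GROUP BY gm_cd"
).fetchall()

# Rolling up the MDM-level counts double-counts store 101.
summed_from_mdm = sum(cnt for _, _, cnt in per_mdm)
true_gm_count = per_gm[0][1]

print("per MDM:", per_mdm)
print("sum of MDM counts:", summed_from_mdm, "vs. true GM count:", true_gm_count)
```

This is why each level of each dimension seems to need its own pass over the detail rows, which is where the number of permutations blows up.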
I am hoping that someone out there has a more elegant way of handling this?
Thanks in advance!!