Distinct Count of Customers day by day...

  • Hi to all,

    I have to query Analysis services for obtain the distinct count of customer that bought a

    product. Simple! I know 2 way to do this, (see

    http://msdn.microsoft.com/library/en-us/dnolap/html/distinct2.asp?frame=true&_r=1 for

    references)

    The first way is to create an MDX expression that count the member of a set build by a

    crossjoin between Sales Misures and Customer Dimension, and the second way is to create a cube for each distinct count measures, that contains the same dimensions of "primary cube".

    The second method is unapplicable because, unfortunately I need the Distinct Count of customers as a "Running Total" Measures, So if I select the March 2 2003 in Time Dimension, I want to see the distinct count of Measures from January 1 2003 to March 2 2003.

    The first way hit unacceptable performance.

    Someone know other way to do this ?

    Thank's in advance to All,

    Marco.


    "...e il mio maestro mi insegnò com'è difficile trovare l'alba dentro l'imbrunire" F.Battiato - Prospettiva Nevsky

  • I would use a sub select.

    Like:

    select Customer, count(*) from CustomerTable where CustomerId in (select distinct(CustomerId) from Sales where CustomerHasSaleFl = 1)

    group by Customer with rollup

    with rollup gives you the grand total


    Darren

  • Thanks dgermundson,

    But How I can put your T-SQL query in Analysis server ?


    "...e il mio maestro mi insegnò com'è difficile trovare l'alba dentro l'imbrunire" F.Battiato - Prospettiva Nevsky

  • I was assuming you would fill in your table names where I guessed on the names. By query Analysis do you mean Query Analyzer or perhaps viewing the Execution Plan? If you give me the table layouts maybe I can be more specific.

    Darren


    Darren

  • Oh no, probably my question is written poor, I want to do a distinct count of customer, day by day, on Analysis Serivices, not Query Analyzer! I 'm talking about an Anlysis services cube, not a SQL database.

    Thank you for your quick response, and excuse me if my question is posted wrong.

    Marco.


    "...e il mio maestro mi insegnò com'è difficile trovare l'alba dentro l'imbrunire" F.Battiato - Prospettiva Nevsky

  • Marco,

    I haven't had a chance to dial-in and look yet, but from memory, for one of our clients we have gone the separate cube per distinct coutn measure way. We also create (say) YTD, MTD, WTD calculated members int he time dimension(s). And (again from memory) I thought that filtering to these filters the count(s) appropriately.

    So in your case, I would be assuming a Year-to-Date (YTD) would work for you, allowing you to filter you distinct customer count in the way you mentioned.

    If you need it, I can dial-in and get the MDX for the YTD member and send thru, let me know.

    Cheers,

    Steve.

    Steve.

  • Hi Steve,

    first of all thank you very much for your reply.

    At this moment I have a separate cube for each distinct measures, and it work well. Although, my customers want just a calculated measures YTD, and I made it both using the filter function and using the cross join function. This measures work well, but in conjunction with use of PivotTable Services it raise two problem: 1. The measures is very, very slow. 2. The client PC during calculate of this measure, work hard.

    I believe there are not simple solution to do this, because by using the YTD measures, I force the runtime calculation of measures and this is an hard operation.

    Are you agree ?

    Marco.


    "...e il mio maestro mi insegnò com'è difficile trovare l'alba dentro l'imbrunire" F.Battiato - Prospettiva Nevsky

  • Hi Marco,

    Yeah, perfromance with distinct counts can be an issue. I'm neck deep in conversations with Msft currently about the client Vs server processing of queries.

    What I can offer you from what we know so far, is if you are using Excel, then make sure you have SP3 on the server and also SP3 of PTS (either PTS full or lite) on the client. Ensure you have your Execution Location and Default Isolation Mode settings within the connection string set to appropriate values.

    Skip question re: using built in DC, i re-read original post.

    Steve.

    Edited by - stevefromOZ on 04/15/2003 07:20:59 AM

    Steve.

  • Marco,

    There's no way you can convince the client to use a calc member for YTD in the time dimension? It wouldmean they could use YTD for any given measure in the cube including the DC measure(s).

    Steve

    Steve.

  • Uhm... may be I'don't understand your last answer, what means a calc member in time dimension ?

    It would means that I can add a calc member and set the parent dimension to "MyTime Dimension", right ?

    If yes, when my customers click on this member it give the YTD value of any measures in schema ??? Uhm... very interesting topic!

    Can you give me some elucidation about this ?

    Thanks once again,

    Marco.


    "...e il mio maestro mi insegnò com'è difficile trovare l'alba dentro l'imbrunire" F.Battiato - Prospettiva Nevsky

  • Hi Marco,

    Yes, I do mean adding the calculated member to the time dimension. Creating this member can be hard or easy, it really depends on the format of the ley for your time dimension. Personally, I like to try for an integer, usually of the format YYYYMMDD, because it lets me query, sort and filter easily in SQL, plus let's me do the YTD easily in MDX 🙂

    Assuming you do have that as a format for your key, then you;ll want to create a calc member along the lines of

    Sum(PeriodsToDate([Time].[Year], StrToMember("[Time].[" + Format(now(), "yyyymmdd") + "]")), [Measures].CurrentMember)

    Explanation of above is as follows:

    1. I have a time dimension called "Time"

    2. My time dimension has several levels, one of which (the lowest) is named "Day"

    3. I have as my key for my time dimension, an integer that is human readable via the format YYYYMMDD (eg 19730522 = 22nd May 1973)

    4. I have specified my time dimension *to be* a time dimension in the advanced proeprties sheet of the cube/dimension editor. From memory, this allows direct use of 'PeriodsToDate' function, and probably the YTD function also, which I havent used as I may load ahead dates (usually when loading budget info, ie it goes intot he future).

    Ok, so basically, what I am wanting the function to do is get the current date (as per server date), get the periods to date, which will start at the start of the year and work to the date I ask for, in this case the current day (this is specified by using the [Time].[Year] level in the PTD function). And then sum the currently displayed measures using this periods to date.

    Hope this helps, or at least makes sense :-S

    Steve.

    Steve.

  • Once again I have to give my thanks. I perfectly understood, what to do.

    I think this is a very very useful techniques to get the YTD measures, and surely I have to use it instead to have more calculated member like "YtdQty", "YtdValue", "YtdAvg" etc...

    Really Very useful hint, thanks a lot.

    I have already tried it but I find that this members don't run well for distinct count measures, because it make the sum, not the distinct count.

    Although I think can use the Aggregate function instead the sum function and may be I have the right values for DC measures.

    I will inform you!

    Thank's a lot,

    Marco.


    "...e il mio maestro mi insegnò com'è difficile trovare l'alba dentro l'imbrunire" F.Battiato - Prospettiva Nevsky

  • quote:


    Hi Steve,

    first of all thank you very much for your reply.

    At this moment I have a separate cube for each distinct measures, and it work well. Although, my customers want just a calculated measures YTD, and I made it both using the filter function and using the cross join function. This measures work well, but in conjunction with use of PivotTable Services it raise two problem: 1. The measures is very, very slow. 2. The client PC during calculate of this measure, work hard.

    I believe there are not simple solution to do this, because by using the YTD measures, I force the runtime calculation of measures and this is an hard operation.

    Are you agree ?

    Marco.


  • Hi Steve,

    first of all thank you very much for your reply.

    At this moment I have a separate cube for each distinct measures, and it work well. Although, my customers want just a calculated measures YTD, and I made it both using the filter function and using the cross join function. This measures work well, but in conjunction with use of PivotTable Services it raise two problem: 1. The measures is very, very slow. 2. The client PC during calculate of this measure, work hard.

    I believe there are not simple solution to do this, because by using the YTD measures, I force the runtime calculation of measures and this is an hard operation.

    Are you agree ?

    Marco.

    SteveFromOz... With regards to your solution for YTD, MTD, WTD etc. I am utilizing the same technique but an looking for a way to dynamically change the current time that is used for the year to date function.

    One method that I have played with is to create a seperate time dimension with months and link that dimension to the time dimension in the cube editor through a dummy link field(value of one). What I am hoping to do is to allow the user to change the month that the currently reported YTD is based on. I am having trouble with getting the YTD function of the time dimension to be based on the the selected month from another time dimension. CAN A CALCULATED MEMBER REFERENCE MEMBERS OR PROPERTIES FROM ANOTHER DIMENSION? Have you ever tried anything like this? Do you have any other suggestions on how to let the user change which month is current when using calculated members of the time dimension for year to date functionality.

    I currently can only change the current date by changing a value in a SQL Server table.

    Thanks

  • Hi Chris,

    You *should* be able to have the calculated member reference another dimension. One gotcha I can think of here is that your calculated member formula must handle the case when the second dimension is not filtered to a particular month. You should be able to get around this by using an 'if' statement prior to obtaining the referenced dimensions value (and basically setting to a default).

    For ease of use I would make the second/reference time dimension basically exactly the same as the primary, this means you won't have to coerce the month/date values being passed to the calc member.

    The other way to solve this is to provide users with a client tool or mod your own front end, so that the users can create the calculated members themselves (ie session/query calc members).

    Both ways have their downsides - using the 2 time dimension method (where one references the other) I personally think is flawed as an inexperienced user can get confused with multiple time dims. Have often seen users wondering why there is no revenue figures for January when second time dim has (inadvertantly) been left in June (or any month other than Jan). The second method (let them do it themselves) causes me problems when looking for consistency in reporting across an organisation. If personA typos the calc member in some way, their figures will be wrong, or at least different, when compared with everyone elses.

    A third way, maybe use the calc member in your measures dimension (where we started ), and have it reference the one (and only one) time dimension for it's "current month". This would probably solve the two issues raised above. But I personally am not too fond of doing YTD's etc this way as it means that if I want YTD, LYTD, QTD, LQTD etc etc, I will end up with [CurrentMeasureCount] * [RequiredCalcMembers] as the count of my measures.

    Who said OLAP/analysis was easy?!?! Maybe we should go back to SQL

    Cheers,

    Steve.

    Steve.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply