March 12, 2006 at 4:40 pm
Hi
I'm doing some research for a project where I am the lead BA, and I hoped I might get some advice here.
On this project we have a large database of client records, each with a unique PersonID and a number of attributes, many of them location-based, e.g. credit score, credit demand score, geodemographic band, State, Postcode, Disposable Income Band, etc.
We are building a web portal that will allow registered web users to interactively specify pre-defined criteria to build a list of clients they are interested in, using some of the variables I have mentioned above.
The aim is to build a cube that will aggregate the counts for all the different combinations of variable values, so that users can build their criteria, submit them, and have the counts returned interactively in their web session within 15 seconds at most. Once they are happy with their final criteria, they submit the job for processing, and the client records that meet the criteria are extracted and output into a file they can then download.
There are two issues I am grappling with and would appreciate advice on:
1/ The users will also be able to upload what we call a suppression file: a list of client records they need excluded from the counts. They will upload this file before building their list filters. We need to match their client records to our DB and flag them for suppression, so that when the filters are submitted to extract counts, those records are excluded. What's the best way of doing this while maintaining the performance and speed objectives when the counts are submitted? Does it mean we will need to dynamically rebuild the cube (i.e. the available universe less suppressions) for each client that wants to apply a suppression file? What is the likely time to do this in a DB of this size, or is there another approach?
2/ The other issue is that a client could create two filters, for example:
a/ Gender = Males and DisposableIncomeIndicatorBand = 10
b/ Gender = Males and CreditRisk > 3
These are separated by an OR. The issue is that the same client record could satisfy both rules, so we need to dedupe on the unique PersonID so that the count returned excludes duplicates.
How do we best do this while keeping the cube structure optimal for speed and performance?
Thank you for your time and professional assistance
March 16, 2006 at 4:41 am
Filtering the list of accounts:
1) Create MDX to bring back the sample you are interested in, drill through to the underlying cube data, filter that by the exclusion list held in a second recordset or file stream, and return the result as the output of the query. A sketch of the drillthrough step is below.
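For the drillthrough step, something like this might work as a minimal sketch; the cube name [Clients], the [Client Count] measure, and the member keys are all hypothetical:

// Sketch only: drill through to the source rows behind one cell.
// The returned rowset would then be filtered against the uploaded
// suppression list (e.g. on PersonID) before being output.
DRILLTHROUGH MAXROWS 100000
SELECT { [Measures].[Client Count] } ON COLUMNS
FROM [Clients]
WHERE ( [Gender].[Gender].&[Male],
        [Income Band].[Income Band].&[10] )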
2) You can define a distinct count measure in the cube on the PersonID to get unique people, as in the sketch below.
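In AS 2005 you should also be able to slice on the union of the two rules and let the distinct count take care of the overlap; a rough sketch, with [Unique Persons] assumed to be a distinct count measure on PersonID and all other names placeholders:

// Sketch only: a person who matches both rules is counted once,
// because [Unique Persons] is a DISTINCT COUNT on PersonID.
// Padding each tuple with the All member keeps dimensionality equal.
SELECT
  { [Measures].[Unique Persons] } ON COLUMNS
FROM [Clients]
WHERE { ( [Gender].[Gender].&[Male],
          [Income Band].[Income Band].&[10],
          [Credit Risk].[Credit Risk].[All] ),    // rule a
        ( [Gender].[Gender].&[Male],
          [Income Band].[Income Band].[All],
          [Credit Risk].[Credit Risk].&[4] ) }    // rule b, one band shown

(CreditRisk > 3 would really be a range of members; a single band is shown for brevity.)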
My MDX is a bit rusty, but I think (assuming both values are measures) you can check for the case by defining a new calculated measure that returns null, then exclude the null values from the result set with NON EMPTY, or filter the axes by measure value.
i.e.
WITH MEMBER [Measures].[New] AS 'IIF([Measures].[x] = 10 OR [Measures].[y] < 3, NULL, [Measures].[x])'
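Putting that together, a rough sketch (cube, hierarchy, and measure names are placeholders, and I'm assuming a person-level attribute to put on rows):

// Sketch only: the calculated member nulls out unwanted cells,
// then NON EMPTY drops them from the rows axis.
WITH MEMBER [Measures].[New] AS
  'IIF([Measures].[x] = 10 OR [Measures].[y] < 3, NULL, [Measures].[x])'
SELECT
  { [Measures].[New] } ON COLUMNS,
  NON EMPTY [Person].[PersonID].[PersonID].Members ON ROWS
FROM [Clients]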
...I guess sizing/performance of AS 2005 is going to be a trial-and-error kind of thing!
Phil Nicholas
March 16, 2006 at 2:46 pm
Thanks Phil, this is a good start...
Most appreciated
Troy.G