July 10, 2012 at 2:21 pm
Hi,
I have a task to complete that involves running a series of tests (mostly text comparisons) against a table, then summarising the results at two grouping levels. The set of tests to be run is dynamic. I want to be able to easily select rows where a combination of tests is failed. Because of this, I'm using a bitwise comparison.
Very simplified setup is like this:
-- create test tables
create table #base_data (
PrimaryId int
, FirstGroupingId int
, SecondGroupingId int
, TextData varchar(100)
, BitwiseValue int
)
create table #check_data (
CheckId int
, CheckValue varchar(10)
, CheckBitwiseValue int
, FailLevel int
)
create table #group_2_summary (
FirstGroupingId int
, SecondGroupingId int
, CheckId int
, FailCount int
)
create table #group_1_summary (
FirstGroupingId int
, IsFailure bit
)
-- add test data
insert #base_data
select 1, 1, 1, 'Bob', 0 union all
select 2, 1, 2, 'Dave', 0 union all
select 3, 1, 2, 'Alan', 0 union all
select 4, 2, 1, 'Mary', 0 union all
select 5, 2, 1, 'Sarah', 0 union all
select 6, 2, 1, 'Charlie', 0 union all
select 7, 2, 2, 'Caroline', 0
insert #check_data
select 1, 'A', 1, 3 union all
select 2, 'B', 2, 3 union all
select 3, 'C', 4, 3 union all
select 4, 'D', 8, 3
Initially, I just populate the bitwise column in the base data using a case statement:
-- populate the individual bitwise values
-- edit: note that in real life this query is built dynamically and the case statements can be looking at any number of different fields within #base_data
-- but this is omitted for simplicity
; with cte as (
select PrimaryId
, BitwiseValue
, case when charindex('A',TextData) > 0 then 1 else 0 end as Check1
, case when charindex('B',TextData) > 0 then 2 else 0 end as Check2
, case when charindex('C',TextData) > 0 then 4 else 0 end as Check3
, case when charindex('D',TextData) > 0 then 8 else 0 end as Check4
from #base_data
)
update cte
set BitwiseValue = Check1 + Check2 + Check3 + Check4
select * from #base_data
go
So far so good. Now I need to create a summary of fail counts for each test at SecondGroupingId level. I'm struggling to come up with a set based solution for this, so have deployed a cursor. I would really like to avoid the use of the cursor if at all possible:
-- populate the second grouped summary first
-- this needs to have an individual row for each combination of
-- FirstGroupingId, SecondGroupingId and CheckId
-- hence use of cursor/loop
declare @fetch int
, @CheckId int
, @CheckBitwiseValue int
declare curChecks cursor for
select CheckId
, CheckBitwiseValue
from #check_data
open curChecks
fetch next from curChecks
into @CheckId, @CheckBitwiseValue
set @fetch = @@fetch_status
while @fetch = 0
begin
-- use cte to get round outer reference problem
; with cte as (
select FirstGroupingId
, SecondGroupingId
, count(*) as FailCount
from #base_data
where @CheckBitwiseValue & BitwiseValue = @CheckBitwiseValue
group by FirstGroupingId
, SecondGroupingId
)
insert #group_2_summary
select FirstGroupingId
, SecondGroupingId
, @CheckId
, FailCount
from cte
fetch next from curChecks
into @CheckId, @CheckBitwiseValue
set @fetch = @@fetch_status
end
close curChecks
deallocate curChecks
go
There are no crieria for failure at SecondGroupingId level individually. But, if the sum of failures for any check at FirstGroupingId level goes over the FailLevel for the check as defined in #check_data, then the FirstGroupingId set of data is deemed to have failed. This can be calculated relatively simply from the #group_2_summary table:
-- use the second group summary to populate the first group summary
; with cte as (
select FirstGroupingId
, cd.CheckId
, sum(FailCount) as TotalFailCount
, FailLevel
from #group_2_summary g2s
join #check_data cd
on g2s.CheckId = cd.CheckId
group by FirstGroupingId
, cd.CheckId
, FailLevel
)
, cte2 as (
select FirstGroupingId
, CheckId
, case when TotalFailCount >= FailLevel then 1 else 0 end as IsFailure
from cte
insert #group_1_summary
select FirstGroupingId, max(IsFailure)
from cte2
group by FirstGroupingId
So for this test data, I expect the results to be as follows:
select * from #group_1_summary
/*
FirstGroupingId IsFailure
--------------- ---------
1 0
2 1
*/
select * from #group_2_summary
order by FirstGroupingId, SecondGroupingId
/*
FirstGroupingId SecondGroupingId CheckId FailCount
--------------- ---------------- ----------- -----------
1 1 2 1
1 2 1 2
1 2 4 1
2 1 1 3
2 1 3 1
2 2 3 1
2 2 1 1
*/
Cleanup:
drop table #base_data
drop table #check_data
drop table #group_1_summary
drop table #group_2_summary
It all just feels a bit lumpy and inelegant. Perhaps someone can suggest a better way?
Thanks, Iain
Edit: clarification
July 10, 2012 at 2:36 pm
Incidentally, I'm eliminating any rows where BitwiseValue = 0.
I've also thought that I might be able to use a combination of case statements in the cte with an unpivot in the insert to do the SecondGroupingId summary. Maybe this is a better way?
I've been looking at this for far too long, my head hurts... :unsure:
July 10, 2012 at 3:31 pm
I'm getting lost in your requirements, but you definitely don't need the cursor.
; with basecte as (
select FirstGroupingId,CheckID,faillevel
, sum(case when charindex(CheckValue,TextData) > 0 then 1 end) status
from #base_data cross join #check_data
group by FirstGroupingId,CheckID,faillevel
)
select FirstGroupingId, max(case when status>=faillevel then 1 else 0 end) from baseCTE group by FirstGroupingId
; with basecte as (
select FirstGroupingId,SecondGroupingId,checkID
,sum( case when charindex(CheckValue,TextData) > 0 then 1 end) status
from #base_data cross join #check_data group by
FirstGroupingId,SecondGroupingId,checkID
)
select * from basecte where status is not null
Note: I am bypassing the bitwise operation, because it's not helping you. You've already got what you need to check a variable number of tests.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
July 10, 2012 at 11:53 pm
Hi Matt,
Thanks for the reply. My example is heavily simplified. The procedure is to be used with an application that allows a user to select and execute a set of tests of their choice against a sample of data. These checks can be run against any one of a number of different columns. The tests are not limited to simple string comparisons. Perhaps I should have made this clearer in my original post, I was trying simplify what is a bit of a complex problem!
I have used the bitwise field because I have a requirement to allow a user to select rows from the master data (i.e. #base_data) that have failed any combination of tests, e.g. return all rows that failed tests 2,3 & 7. So it's either add lots of bit flags or a bitwise column and I prefer the bitwise approach.
The bit I really want to get rid of is the loop in the middle section. I'm going to have a look at the case/unpivot approach and see if I can make something work.
I'm open to suggestions as to other approaches.
Thanks Iain
July 11, 2012 at 1:08 am
As I thought. The case with unpivot option does the job much better:
-- unpivot
-- again, note that this will be built dynamically, but not shown here for clarity
select FirstGroupingId
, SecondGroupingId
, CheckId
, FailCount
from (
select FirstGroupingId
, SecondGroupingId
, sum(case when 1 & BitwiseValue = 1 then 1 else 0 end) as [1]
, sum(case when 2 & BitwiseValue = 2 then 1 else 0 end) as [2]
, sum(case when 4 & BitwiseValue = 4 then 1 else 0 end) as [4]
, sum(case when 8 & BitwiseValue = 8 then 1 else 0 end) as [8]
from #base_data
group by FirstGroupingId
, SecondGroupingId
) src
unpivot (
FailCount for CheckId in (
[1],[2],[4],[8]
)
) unpvt
where FailCount > 0
order by FirstGroupingId, SecondGroupingId, CheckId
A clear demonstration of how it's often much better to step away from something than to attempt to plough through it. Well, that and the power of thinking about your problem in such a way as to be able to explain it to someone else.
I'm still open to suggestions as to approach though. If anyone would like to comment or improve on this feel free.
Thanks, Iain
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply