June 7, 2010 at 12:58 pm
I am working on a project where I have a table with several fields (involvedIIP1, engagedIPOA1, exploredS1, laughed1, and communicatedV1).
These fields are all tiny integers with values from 0 through 3. I need to be able to aggregate and calculate based on the following rules:
If all five fields are non null then add all five together and divide by 5. If any one of the values is null then add all the non null fields together and divide by four. If more than 1 field is null, then there are two many missing data elements to provide a reliable answer.
I also need to to the same thing with a group of six items but allow for two null items before throwing out the results.
This is not homework, it is actually a system which collects observed behaviors of very young children to help identify issues that affect the child's behavior, outside influences that cause changes to those behaviors, and compares the results over a period of time (up to five years of daily observed and recorded behavior data).
June 7, 2010 at 2:58 pm
with out seeing some sample data it would be hard to provide much of an answer but I would group the data by your primary key or possibly multiple fields depending on whats needed. then count the number of non null values. Only pull those records that are over the threshold. stick all that in a temp table with a count and sum of what ever field you are adding together. Then devide the total by the count.
Dan
If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.
June 7, 2010 at 3:14 pm
-- See how this starts off with a table and data in it?
-- If you had provided us the data in this format,
-- it would have made things easier for all of the
-- volunteers on this site to help you out.
DECLARE @test-2 TABLE (Col1 tinyint, Col2 tinyint, Col3 tinyint, Col4 tinyint, Col5 tinyint, Col6 tinyint)
INSERT INTO @test-2
SELECT 0,1,2,3,NULL,NULL UNION ALL
SELECT 1,2,3,2,1,NULL UNION ALL
SELECT NULL,2,3,2,NULL,NULL UNION ALL
SELECT 0,1,2,3,2,1
;WITH CTE AS
(
SELECT * ,
-- get a count of the number of fields that are null
NullCount = CASE WHEN Col1 IS NULL THEN 1 ELSE 0 END +
CASE WHEN Col2 IS NULL THEN 1 ELSE 0 END +
CASE WHEN Col3 IS NULL THEN 1 ELSE 0 END +
CASE WHEN Col4 IS NULL THEN 1 ELSE 0 END +
CASE WHEN Col5 IS NULL THEN 1 ELSE 0 END +
CASE WHEN Col6 IS NULL THEN 1 ELSE 0 END,
-- sum up all of the fields
Total = ISNULL(Col1,0) +
ISNULL(Col2,0) +
ISNULL(Col3,0) +
ISNULL(Col4,0) +
ISNULL(Col5,0) +
ISNULL(Col6,0)
FROM @test-2
)
SELECT Col1,
Col2,
Col3,
Col4,
Col5,
Col6,
[Avg] = Total / (6.0-NullCount)
FROM CTE
WHERE NullCount < 3
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
June 7, 2010 at 3:41 pm
Wayne already provided an excellent solution in the post above. Here is the alternative utilizing the filtering you need based on the having predicate. I will have to start with mocking up some data because none was provided in the question.
-- begin sample data
create table #t
(
record_id int not null identity(1, 1) primary key clustered,
involvedIIP1 tinyint,
engagedIPOA1 tinyint,
exploredS1 tinyint,
laughed1 tinyint,
communicatedV1 tinyint
);
insert into #t(involvedIIP1, engagedIPOA1, exploredS1, laughed1, communicatedV1)
values(2, 1, 0, 3, null);
insert into #t(involvedIIP1, engagedIPOA1, exploredS1, laughed1, communicatedV1)
values(1, 2, 2, 1, 3);
insert into #t(involvedIIP1, engagedIPOA1, exploredS1, laughed1, communicatedV1)
values(2, null, 0, 1, null);
insert into #t(involvedIIP1, engagedIPOA1, exploredS1, laughed1, communicatedV1)
values(2, null, 0, null, null);
insert into #t(involvedIIP1, engagedIPOA1, exploredS1, laughed1, communicatedV1)
values(2, null, null, null, 1);
-- end sample data
-- Here is the query:
select
record_id, min(involvedIIP1) involvedIIP1, min(engagedIPOA1) engagedIPOA1,
min(exploredS1) exploredS1, min(laughed1) laughed1, min(communicatedV1) communicatedV1,
1.0 * (isnull(min(involvedIIP1), 0) + isnull(min(engagedIPOA1), 0) +
isnull(min(exploredS1), 0) + isnull(min(laughed1), 0) +
isnull(min(communicatedV1), 0)) /
(count(involvedIIP1) + count(engagedIPOA1) + count(exploredS1) +
count(laughed1) + count(communicatedV1)) score
from #t
group by record_id
having count(involvedIIP1) + count(engagedIPOA1) +
count(exploredS1) + count(laughed1) + count(communicatedV1) > 3;
This will have results:
record_id involvedIIP1 engagedIPOA1 exploredS1 laughed1 communicatedV1 score
----------- ------------ ------------ ---------- -------- -------------- --------------
1 2 1 0 3 NULL 1.500000000000
2 1 2 2 1 3 1.800000000000
Oleg
June 7, 2010 at 4:05 pm
Thank you for the answers
June 8, 2010 at 4:42 pm
I urge you to make this a computed column on the main table if possible. That way the definition is coded only once and thus can easily be changed in just one place.
For example:
ALTER TABLE yourTable
ADD ColAvg AS --<<-- naturally chg this [column] name to what you want
CASE WHEN
CASE WHEN involvedIIP1 IS NULL THEN 1 ELSE 0 END +
CASE WHEN engagedIPOA1 IS NULL THEN 1 ELSE 0 END +
CASE WHEN exploredS1 IS NULL THEN 1 ELSE 0 END +
CASE WHEN laughed1 IS NULL THEN 1 ELSE 0 END +
CASE WHEN communicatedV1 IS NULL THEN 1 ELSE 0 END
>= 2 THEN NULL ELSE
CAST((ISNULL(involvedIIP1, 0) +
ISNULL(engagedIPOA1, 0) +
ISNULL(exploredS1, 0) +
ISNULL(laughed1, 0) +
ISNULL(communicatedV1, 0)) * 1.0 / CASE WHEN
involvedIIP1 IS NULL OR
engagedIPOA1 IS NULL OR
exploredS1 IS NULL OR
laughed1 IS NULL OR
communicatedV1 IS NULL THEN 4 ELSE 5 END AS DECIMAL(5, 2))
END
Scott Pletcher, SQL Server MVP 2008-2010
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply