Dataware Query .....

  • Hi pals,

    I need some help from you.

    This is data warehousing related.

    I have a source table "test" and a target table "trg".

    I need to extract the data in the required format, as per the loading instructions below, and then load it into the "trg" table.

    The sample data below covers only one zipcode. There can be several zipcodes.

    drop table test

    create table test

    (

    currentyear int,

    district varchar(10),

    school varchar(10),

    rollno int,

    zipcode varchar(10),

    flag1_handicapped char(1),

    flag2_disadvantaged char(1),

    status varchar(10),

    relation varchar(10)

    )

    /* inserted 11 rows */

    insert into test values(2005,'D1','S1',101,'530024','Y','Y','E','R')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','R')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','N','N','E','NR')

    insert into test values(2005,'D1','S1',101,'530024','Y','Y','E','NR')

    select * from test

    --- Structure of the target table

    create table trg

    (

    cyear int,

    district varchar(10),

    school varchar(10),

    RollNo int,

    zipcode varchar(10),

    type varchar(20), /* This is an extra column with hard-coded values, which we take to be Total, flag1_handicapped and flag2_Disadvantaged. For every unique zipcode I need to GROUP BY these 3 values.

    These values never come from the source table "test", but we can make use of the 2 source columns "flag1_handicapped" and "flag2_disadvantaged". */

    actaul_cnt int,

    empl_related int,

    empl_not_related int,

    modified_date datetime

    )

    -- The below table shows what values should get loaded into trg table

    ----------------------------------------------------------------------

    trg table column     value to be loaded / description
    -----------------------------------------------------------------------
    cyear                test.currentyear
    district             test.district
    school               test.school
    rollno               test.rollno
    zipcode              test.zipcode

    type                 /* Here we need to load 3 rows with 3 values.

    This table contains some calculated columns, such as "actual_cnt", "empl_related" and "empl_not_related".

    Every calculation should be grouped by this "type" column. For reference, see the expected output rows at the bottom.

    The 3 valid values for this type column are "Total", "flag1_handicapped" and "flag2_Disadvantaged".

    "Total" means all the records which satisfy the calculation.

    "flag1_handicapped" means all the records which satisfy the calculation and have test.flag1_handicapped = 'Y'.

    "flag2_Disadvantaged" means all the records which satisfy the calculation and have test.flag2_disadvantaged = 'Y'. */

    actaul_cnt           This is a calculated column. The calculation is as follows:

    count of records grouped by currentyear, district, school, zipcode, type (flag1_handicapped, flag2_Disadvantaged, Total)

    empl_related         This is also a calculated column. The calculation is as follows:

    count of records where status = 'E' and relation = 'R', grouped by currentyear, district, school, zipcode, type (flag1_handicapped, flag2_Disadvantaged, Total)

    empl_not_related     This is also a calculated column. The calculation is as follows:

    count of records where status = 'E' and relation = 'NR', grouped by currentyear, district, school, zipcode, type (flag1_handicapped, flag2_Disadvantaged, Total)

    modified_date getdate()

    -----------------------------------------------------------------------------------------------

    Here is the sample template which I would like to use to load the data. The query needs to be modified a little to follow the rules above.

    select

    currentyear as "CYear",

    district as "District",

    school as "School",

    rollno as "RollNo",

    zipcode as "zipcode",

    count(*) "actaul_count",

    sum(case when (status='E' and relation='R') then 1 else 0 end) "Emp_Related",

    sum(case when (status='E' and relation='NR') then 1 else 0 end) "Emp_Not_Related",

    getdate() "Date"

    from test

    group by currentyear,

    district,

    school,

    rollno,

    zipcode

    /* Using the above query we need to load 3 rows into the target table "trg"; the expected rows are shown below */

    ------------------------------------------------------------------------------------------------------------------------

    Expected Output Rows using above sample data

    ----------------------------------------------------

    CYEAR|DISTRICT|SCHOOL|ROLLNO|ZIPCDE| TYPE |ACTUALCOUNT| EMPL_RELATED |EMPL_NOT_RELATED |MODIFIED_DT

    -------------------------------------------------------------------------------------------------------------------------

    2005 | D1 | S1 | 101 | 530024 | Total | 11 | 2 | 9 | 2002-01-26

    2005 | D1 | S1 | 101 | 530024 | flag1_handicapped | 2 | 1 | 1 | 2002-01-26

    2005 | D1 | S1 | 101 | 530024 | flag2_Disadvantaged | 2 | 1 | 1 | 2002-01-26

    ------------------------------------------------------------------------------------------------------------------------

    But using the above SELECT, I get only one row as output, and even then I am not able to show the "type" column in the output:

    2005 | D1 | S1 | 101 | 530024 | 11 | 2 | 1 | 2002-01-26 12:57:53.420 |

    ------------------------------------------------------------------------------------------------------------------------

    Basically, I do not understand how to build the GROUP BY clause and display the type value using the above rules.

    Can anyone help me solve this problem?

    Do we need to perform a UNION ALL on the test.flag1_handicapped and test.flag2_disadvantaged columns?

    This seems totally out of the box to me.

    Any help would be greatly appreciated.

    Thanks in advance.

  • Assuming your data is correct, I believe this could be close:

    select   currentyear as CYear
            ,District
            ,School
            ,RollNo
            ,zipcode
            ,[type]
            ,count(*) as actual_count
            ,Emp_Related
                = sum(case when ([status] = 'E' and relation = 'R') then 1 else 0 end)
            ,Emp_Not_Related
                = sum(case when ([status] = 'E' and relation = 'NR') then 1 else 0 end)
            ,getdate() as [Date]
    from    (
            -- derive a single [type] label per source row from the two flags
            select   currentyear as currentyear
                    ,district as District
                    ,school as School
                    ,rollno as RollNo
                    ,zipcode as zipcode
                    ,[type]
                        = case  when (flag1_handicapped = 'Y')
                                     and (flag2_disadvantaged = 'Y')
                                then 'flag1 and flag2'
                                when (flag1_handicapped = 'Y')
                                then 'flag1_handicapped'
                                when (flag2_disadvantaged = 'Y')
                                then 'flag2_Disadvantaged'
                                else 'no flags'
                          end
                    ,[status] as [status]
                    ,relation as relation
            from    test
            ) Normalized
    group by currentyear
            ,district
            ,school
            ,rollno
            ,zipcode
            ,[type]

    Notice that in two rows the value of both flags is 'Y' which you haven't mentioned in your specs.
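
    Because a CASE expression puts each source row into exactly one bucket, rows with both flags set cannot be counted under both types that way. If the expected output in the question is taken literally (a row with both flags counts towards "flag1_handicapped" and towards "flag2_Disadvantaged", and every row counts towards "Total"), then the UNION ALL idea raised in the question might look roughly like the sketch below. It is only a sketch under that assumption; the alias Typed is made up for illustration, and the column list follows the trg definition as posted.

    ```sql
    -- Sketch only: one SELECT per type value, stacked with UNION ALL.
    -- Assumption: a row with both flags = 'Y' is counted under both flag types.
    insert into trg
          (cyear, district, school, RollNo, zipcode, [type],
           actaul_cnt, empl_related, empl_not_related, modified_date)
    select currentyear, district, school, rollno, zipcode, [type],
           count(*),
           sum(case when [status] = 'E' and relation = 'R'  then 1 else 0 end),
           sum(case when [status] = 'E' and relation = 'NR' then 1 else 0 end),
           getdate()
    from  (
           -- every source row contributes to 'Total'
           select currentyear, district, school, rollno, zipcode,
                  'Total' as [type], [status], relation
           from   test
           union all
           -- rows flagged as handicapped
           select currentyear, district, school, rollno, zipcode,
                  'flag1_handicapped', [status], relation
           from   test
           where  flag1_handicapped = 'Y'
           union all
           -- rows flagged as disadvantaged
           select currentyear, district, school, rollno, zipcode,
                  'flag2_Disadvantaged', [status], relation
           from   test
           where  flag2_disadvantaged = 'Y'
          ) Typed
    group by currentyear, district, school, rollno, zipcode, [type]
    ```

    With the 11 sample rows this should give Total = 11 / 2 / 9 and 2 / 1 / 1 for each of the two flag types, matching the expected output. One thing to be aware of: a group with no flagged rows gets no row at all for that type (rather than a row with zero counts), and the specs do not say which of the two is wanted.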

    ML

    ---
    Matija Lah, SQL Server MVP
    http://milambda.blogspot.com
