November 12, 2015 at 4:24 am
I have a table with school details whose columns are like below
Create table Test(schoolid int, ColumnID int, Value varchar(100))
if columnid=1 then it is Start Date
if columnid=2 then it is enddate
if columnid=3 then it is dollarspent
Test table has about 10 million records
Insert into test Values(1000,1,'2001-01-01')
Insert into test Values(1000,1,'2002-01-01')
Insert into test Values(1000,1,'2003-01-01')
Insert into test Values(1000,2,'2001-12-31')
Insert into test Values(1000,2,'2002-09-01')
Insert into test Values(1000,2,'2003-10-01')
Insert into test Values(1000,3,'2001.01')
Insert into test Values(1000,3,'10000.99')
Insert into test Values(2000,1,'2001-01-01')
Insert into test Values(2000,1,'2002-01-01')
Insert into test Values(2000,1,'2003-01-01')
Insert into test Values(2000,2,'2001-12-31')
Insert into test Values(2000,2,'2002-09-01')
Insert into test Values(2000,2,'2003-10-01')
Insert into test Values(2000,3,'2001.01')
Insert into test Values(2000,3,'10000.99')
Insert into test Values(2000,3,'553.22')
I have a reference table that keeps track of kind of school in terms of funding (private, public, semi-public,semi-private)
and kind of school in terms of areas of study (Eng-engineering,Arts,Music)
Create table RefInfo(schoolid int, ColumnID int, Value varchar(100))
if columnid=1 then it is funding
if columnid=2 then it is typeofschool
Table has about 25000 records
Insert into RefInfo Values(1000,1,'PRI')
Insert into RefInfo Values(2000,1,'PUB')
Insert into RefInfo Values(3000,1,'SEMI-PUB')
Insert into RefInfo Values(4000,1,'SEMI-PRI')
Insert into RefInfo Values(1000,2,'ENG')
Insert into RefInfo Values(2000,2,'MUSIC')
Insert into RefInfo Values(3000,2,'ARTS')
Insert into RefInfo Values(4000,2,'ENG')
I want to select schoolid,startdate,enddate,dollarspent,TypeOfSchool into a work table. I am not sure what's the best way to do this. I first thought of using Pivot for each column but not sure of implications of time taken to execute those queries.
Are there any better ways of doing this?
Thanks
Rash
November 12, 2015 at 7:52 am
Here are a couple of examples:
DECLARE @RefInfo TABLE (schoolid int, ColumnID int, Value varchar(100))
INSERT INTO @RefInfo VALUES (1000,1,'PRI'),(2000,1,'PUB'),(3000,1,'SEMI-PUB'),(4000,1,'SEMI-PRI'),(1000,2,'ENG'),(2000,2,'MUSIC'),(3000,2,'ARTS'),(4000,2,'ENG')
--USING CROSS TABS
SELECT
schoolid,
MAX(CASE WHEN columnID = 1 THEN value END) AS funding,
MAX(CASE WHEN columnID = 2 THEN value END) AS typeOfSchool
FROM
@RefInfo
GROUP BY
schoolid
--USING PIVOT
SELECT
schoolid,
[1] AS funding,
[2] AS typeOfSchool
FROM
@RefInfo
PIVOT (MAX(value) FOR ColumnID IN ([1], [2])) pvt
If you can be more specific about what you need I can come up with a complete solution. That being said if you are not familiar with these techniques it's worth giving them a try for practice.
EDIT: fixed typo, had columnid = 1 twice...copy and paste will kill you every time 😉
November 12, 2015 at 10:30 am
I completely messed up the table.. I am so sorry..
Create table Test(schoolid int, ColumnID int, Value varchar(100), insertedDate Date)
Inserteddate - date when it was collected
Insert into test values(1000,1,'2001-01-01','2005-12-31 00:00:00')
Insert into test values(1000,1,'2002-01-01','2006-12-31 00:00:00')
Insert into test values(1000,1,'2003-01-01','2007-12-31 00:00:00')
Insert into test values(1000,2,'2001-12-31','2005-12-31 00:00:00')
Insert into test values(1000,2,'2002-09-01','2006-12-31 00:00:00')
Insert into test values(1000,2,'2003-10-01','2007-12-31 00:00:00')
Insert into test values(1000,3,'2001.01','2005-12-31 00:00:00')
Insert into test values(1000,3,'10000.99','2006-12-31 00:00:00')
Insert into test values(2000,1,'2001-01-01','2006-12-31 00:00:00')
Insert into test values(2000,1,'2002-01-01','2007-12-31 00:00:00')
Insert into test values(2000,1,'2003-01-01','2008-12-31 00:00:00')
Insert into test values(2000,2,'2001-12-31','2006-12-31 00:00:00')
Insert into test values(2000,2,'2002-09-01','2007-12-31 00:00:00')
Insert into test values(2000,2,'2003-10-01','2008-12-31 00:00:00')
Insert into test values(2000,3,'2001.01','2006-12-31 00:00:00')
Insert into test values(2000,3,'10000.99','2007-12-31 00:00:00')
Insert into test values(2000,3,'553.22','2008-12-31 00:00:00')
Here's what my output should be
select Companyid,startdate,enddate,inserteddate,dollarspent,typeofschool,funding
--to be printed for each school
1000,'2001-01-01','2001-12-31','2001.01','2005-12-31 00:00:00','PRI','ENG'
1000,'2002-01-01','2002-09-01','10000.99','2006-12-31 00:00:00','PRI','ENG'
1000,'2003-01-01','2003-10-01',NULL,'2007-12-31 00:00:00','PRI','ENG'
2000,'2001-01-01','2001-12-31','2001.01','2006-12-31 00:00:00','PUB','MUSIC'
November 12, 2015 at 10:41 am
Please, check your sample data. It won't run.
November 12, 2015 at 10:49 am
oops..I updated my scripts
November 12, 2015 at 11:08 am
Ok, try something like this:
DECLARE @RefInfo TABLE (schoolid int, ColumnID int, Value varchar(100))
DECLARE @test-2TABLE (schoolid int, ColumnID int, Value varchar(100), insertedDate Date)
INSERT INTO @RefInfo VALUES (1000,1,'PRI'),(2000,1,'PUB'),(3000,1,'SEMI-PUB'),(4000,1,'SEMI-PRI'),(1000,2,'ENG'),(2000,2,'MUSIC'),(3000,2,'ARTS'),(4000,2,'ENG')
INSERT INTO @test-2 VALUES
(1000,1,'2001-01-01','2005-12-31 00:00:00'),
(1000,1,'2002-01-01','2006-12-31 00:00:00'),
(1000,1,'2003-01-01','2007-12-31 00:00:00'),
(1000,2,'2001-12-31','2005-12-31 00:00:00'),
(1000,2,'2002-09-01','2006-12-31 00:00:00'),
(1000,2,'2003-10-01','2007-12-31 00:00:00'),
(1000,3,'2001.01','2005-12-31 00:00:00'),
(1000,3,'10000.99','2006-12-31 00:00:00'),
(2000,1,'2001-01-01','2006-12-31 00:00:00'),
(2000,1,'2002-01-01','2007-12-31 00:00:00'),
(2000,1,'2003-01-01','2008-12-31 00:00:00'),
(2000,2,'2001-12-31','2006-12-31 00:00:00'),
(2000,2,'2002-09-01','2007-12-31 00:00:00'),
(2000,2,'2003-10-01','2008-12-31 00:00:00'),
(2000,3,'2001.01','2006-12-31 00:00:00'),
(2000,3,'10000.99','2007-12-31 00:00:00'),
(2000,3,'553.22','2008-12-31 00:00:00')
SELECT
t.schoolid,
t.Num,
MAX(CASE WHEN t.columnID = 1 THEN t.value END) AS StartDate,
MAX(CASE WHEN t.columnID = 2 THEN t.value END) AS EndDate,
MAX(CASE WHEN t.columnID = 3 THEN t.value END) AS DollarSpent,
MAX(CASE WHEN r.columnID = 2 THEN r.value END) AS typeOfSchool,
MAX(CASE WHEN r.columnID = 1 THEN r.value END) AS funding
FROM
(SELECT schoolid, ColumnID, Value, insertedDate, ROW_NUMBER() OVER (PARTITION BY schoolid, columnid ORDER BY value ASC) AS Num
FROM @test-2
) t
JOIN @RefInfo r ON t.schoolid = r.schoolid
GROUP BY
t.schoolid,
t.Num
FYI - That is a horrible way of storing your data.
November 14, 2015 at 4:46 am
wow! never thought of that approach. It took me a day :-(..and the only solution I thought was creating separate temp tables for each ColumnID with primary key as SchoolID and then Join them.
So I have to now figure which has better performance, temp tables or using CASE statements. This table has about 10 million records i.e. about 60,000 SchoolIDs with different year data. Any thoughts on that??
Thanks a lot
P.S.
This is how I get data from vendor. I need to create reports using this..
November 14, 2015 at 11:15 am
rash3554 (11/14/2015)
wow! never thought of that approach. It took me a day :-(..and the only solution I thought was creating separate temp tables for each ColumnID with primary key as SchoolID and then Join them.So I have to now figure which has better performance, temp tables or using CASE statements. This table has about 10 million records i.e. about 60,000 SchoolIDs with different year data. Any thoughts on that??
Thanks a lot
P.S.
This is how I get data from vendor. I need to create reports using this..
Yeah... since you're going to hit the whole table, it's going to be "slow" no matter what. Some thoughtful indexing may help with sorting but you're still going to hit the whole table because there's virtually no way to index by date because of the bad design that allows dates and dollar amounts to be mixed in the same column.
There's also no way to guarantee when the dollars were spent because the type 3 rows have no date on them and you cannot rely on storage order of rows in a table for this type of thing. Except for a grand total over the entire period for a given school, the data is pretty much useless for anything else. I'd introduce the vendor to the concept of providing reasonable data.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply