May 31, 2012 at 8:36 am
Ok, here goes.
I have a table, which sits within my CRM and stores "questionnaire" type information, in a very cumbersome way.
The table has 4 pertinent columns. CaseID, FormName, Name, Value.
CaseID is the PK which links back to the case. Formname is the name of the questionnaire, name provides the 'question' and value provides the answer.
Lets say a form has 10 questions, that means there are going to be 10 rows of information, all with the same caseid, and with the name and value combinations being different for each row. For example...
CaseID, Formname, Name, Value
1, Highways, roadname, darlington road
1, Highways, faulttype, Drain
1, Highways, addinfo, Cover stuck up cannot be moved
2, Police Report, roadname, Middleham Road
2, Polive Report, location, Nr Barton
I want to change the format of the information so that the names, or questions, are columns and the answers are the values. For example to look like this
CaseId, Formname, Roadname, faulttype, addinfo, location
1, Highways, Darlington Road, Drain, Coverstuck up cannot be moved, null, null
2, Police Report, Middleham Road, null, null, Nr Barton
I cannot seem to do this with pivot, as it want's to aggregate and the responses are text fields. Any clues?
May 31, 2012 at 10:23 am
I was bored but this should work:
IF OBJECT_ID('tempdb..#basetable') IS NOT NULL
DROP TABLE #basetable
CREATE TABLE #basetable
(
caseid INT,
formname VARCHAR(255),
name VARCHAR(255),
value VARCHAR(255)
)
DECLARE @sql VARCHAR(MAX)
DECLARE @cols VARCHAR(MAX)
DECLARE @joins VARCHAR(MAX)
INSERT #basetable(caseid,formname,name,value)
VALUES
(1, 'Highways', 'roadname', 'darlington road'),
(1, 'Highways', 'faulttype', 'Drain'),
(1, 'Highways', 'addinfo', 'Cover stuck up cannot be moved'),
(2, 'Police Report', 'roadname', 'Middleham Road'),
(2, 'Police Report', 'location', 'Nr Barton')
SET @cols =
(SELECT ','+ '[' + name + '] = t' + CAST(ROW_NUMBER() OVER (ORDER BY name) AS VARCHAR(3)) +'.value'
FROM
(SELECT DISTINCT name FROM #basetable) x
FOR XML PATH(''))
SET @joins =
(SELECT 'LEFT JOIN #basetable t' + CAST(ROW_NUMBER() OVER (ORDER BY name) AS VARCHAR(3)) + ' ON t.caseid = '
+ 't' + CAST(ROW_NUMBER() OVER (ORDER BY name) AS VARCHAR(3)) + '.caseid AND t.formname = ' +
't' + CAST(ROW_NUMBER() OVER (ORDER BY name) AS VARCHAR(3)) + '.formname AND t' + CAST(ROW_NUMBER() OVER (ORDER BY name) AS VARCHAR(3))
+ '.name = ''' + name + ''''+ CHAR(10)
FROM
(SELECT DISTINCT name FROM #basetable) x
FOR XML PATH(''))
SET @sql = 'SELECT DISTINCT t.caseid,t.formname'+@cols+' '+ CHAR(10) + 'FROM #basetable t' + CHAR(10) + @joins
EXEC(@sql)
May 31, 2012 at 11:17 am
That looks like it is just what was ordered. If it were me I would much rather have it the way it is to start with. Can you imagine what that would look like with hundreds of reports, each having different column names?
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
June 1, 2012 at 12:17 am
I do deplore dynamic SQL unless you really need it. Try this:
DECLARE @basetable TABLE
(caseid INT, formname VARCHAR(255), name VARCHAR(255), value VARCHAR(255))
INSERT @basetable(caseid,formname,name,value)
VALUES
(1, 'Highways', 'roadname', 'darlington road'),
(1, 'Highways', 'faulttype', 'Drain'),
(1, 'Highways', 'addinfo', 'Cover stuck up cannot be moved'),
(2, 'Police Report', 'roadname', 'Middleham Road'),
(2, 'Police Report', 'location', 'Nr Barton')
SELECT caseid, formname=MAX(formname)
,RoadName=MAX(CASE WHEN name = 'roadname' THEN value ELSE NULL END)
,faulttype=MAX(CASE WHEN name = 'faulttype' THEN value ELSE NULL END)
,addinfo=MAX(CASE WHEN name = 'addinfo' THEN value ELSE NULL END)
,location=MAX(CASE WHEN name = 'location' THEN value ELSE NULL END)
FROM @basetable
GROUP BY CaseID
You just need to add a CASE for each name attribute that you're interested in reporting on in a column (others will be ignored).
Check for an article by Jeff Moden on dynamic cross tabs.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
June 3, 2012 at 12:55 am
Almost. But the written request does not match the requested output (this is exactly why we always ask for example output) : the group by should be on caseid AND formname.
Using the DDL and example code that ProKelly forgot but luckily SQL Padawan stepped in:
print '---- cross tab solution -----'
set statistics io on;
select forms.CaseId,
forms.Formname,
max(case forms.name when 'roadname' then forms.value end) as Roadname,
max(case forms.name when 'faulttype' then forms.value end) as faulttype,
max(case forms.name when 'addinfo' then forms.value end) as addinfo,
max(case forms.name when 'location' then forms.value end) as location
from #basetable forms
group by forms.caseid,
forms.formname
set statistics io off;
print '---- left join solution -----'
set statistics io on;
SELECT DISTINCT t.caseid,t.formname,[addinfo] = t1.value,[faulttype] = t2.value,[location] = t3.value,[roadname] = t4.value FROM #basetable t LEFT JOIN #basetable t1 ON t.caseid = t1.caseid AND t.formname = t1.formname AND t1.name = 'addinfo' LEFT JOIN #basetable t2 ON t.caseid = t2.caseid AND t.formname = t2.formname AND t2.name = 'faulttype' LEFT JOIN #basetable t3 ON t.caseid = t3.caseid AND t.formname = t3.formname AND t3.name = 'location' LEFT JOIN #basetable t4 ON t.caseid = t4.caseid AND t.formname = t4.formname AND t4.name = 'roadname'
set statistics io off;
As you can see I've compared the both methods presented so far on their IO behavior: the left join method needs 5 scans at the cost of 21 logical reads, whereas the cross tab method only needs 1 scan and 1 logical read. Imagine what happens if the table contains not 5 rows, but several hundreds of thousands...
edit: order of the columns was not the same as requested.
June 3, 2012 at 10:59 am
ProKelly (5/31/2012)
I cannot seem to do this with pivot, as it want's to aggregate and the responses are text fields. Any clues?
What's wrong with using MAX as the aggregate to pivot the Text?
The real key here is do you want this code to dynamically handle any "Name" value for any given unique values of CaseID, Formname, and Name? To wit, is there some sort of guarantee that the combination of CaseID, Formname, and Name will be unique in this table?
--Jeff Moden
Change is inevitable... Change for the better is not.
June 3, 2012 at 11:31 am
Jeff Moden (6/3/2012)
ProKelly (5/31/2012)
I cannot seem to do this with pivot, as it want's to aggregate and the responses are text fields. Any clues?What's wrong with using MAX as the aggregate to pivot the Text?
The real key here is do you want this code to dynamically handle any "Name" value for any given unique values of CaseID, Formname, and Name? To wit, is there some sort of guarantee that the combination of CaseID, Formname, and Name will be unique in this table?
Aha! I haven't tried aggregating text, with the presumption you can't - but I could try that?
CaseID+Formname themselves are unique. The reason for this transformation is that so I can dimensionalise this ugly table within a reporting cube in analysis services. To do this, I need the full list of names (questions) by CaseID and Formname, which will link back to the fact table.
June 3, 2012 at 4:59 pm
ProKelly (6/3/2012)
Jeff Moden (6/3/2012)
ProKelly (5/31/2012)
I cannot seem to do this with pivot, as it want's to aggregate and the responses are text fields. Any clues?What's wrong with using MAX as the aggregate to pivot the Text?
The real key here is do you want this code to dynamically handle any "Name" value for any given unique values of CaseID, Formname, and Name? To wit, is there some sort of guarantee that the combination of CaseID, Formname, and Name will be unique in this table?
Aha! I haven't tried aggregating text, with the presumption you can't - but I could try that?
CaseID+Formname themselves are unique. The reason for this transformation is that so I can dimensionalise this ugly table within a reporting cube in analysis services. To do this, I need the full list of names (questions) by CaseID and Formname, which will link back to the fact table.
If he has all of the column names you want, R.P.Rozema's corrected version of what Dwaine wrote should do nicely for you, then. Notice that it does use "MAX" for the aggregate. That's why I needed to know if the CaseID, FormName, and Name columns formed something unique. If the Name within a give instance of CaseID and FormName isn't unique, you miss some data. Please confirm that Name is unique within any give combination of CaseID and FormName.
R.P.Rozema is also correct about using multiple joins to do this. That would be slower and more resource intensive. The idea of using Dynamic SQL, however, is not a bad idea if you want this code to be "self healing" if someone adds a new "Name" to the mix. I was intending to write some dynamic SQL to do that in an expeditious fashion but I need to know what I asked about uniqueness, first.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 4, 2012 at 8:13 am
Great point guys. Here is a better solution with dynamic sql.
IF OBJECT_ID('tempdb..#basetable') IS NOT NULL
DROP TABLE #basetable
CREATE TABLE #basetable
(
caseid INT,
formname VARCHAR(255),
name VARCHAR(255),
value VARCHAR(255)
)
DECLARE @sql VARCHAR(MAX)
DECLARE @cols VARCHAR(MAX)
DECLARE @joins VARCHAR(MAX)
INSERT #basetable(caseid,formname,name,value)
VALUES
(1, 'Highways', 'roadname', 'darlington road'),
(1, 'Highways', 'faulttype', 'Drain'),
(1, 'Highways', 'addinfo', 'Cover stuck up cannot be moved'),
(2, 'Police Report', 'roadname', 'Middleham Road'),
(2, 'Police Report', 'location', 'Nr Barton')
SET @cols =
STUFF(
(
SELECT ',' + QUOTENAME(name) + '= MAX(CASE WHEN name = '''+name+''' THEN name ELSE NULL END)' + CHAR(10)
FROM (SELECT DISTINCT id = RANK() OVER(ORDER BY name),name FROM #basetable) x
ORDER BY id FOR XML PATH('')
)
,1,1,'')
SET @sql =
'
SELECT
caseid,
formname,' + CHAR(10) + @cols + 'FROM #basetable GROUP BY caseid,formname'
EXEC (@sql)
June 4, 2012 at 9:57 am
CaseID + Formname + Name is indeed always unique. Thanks so much for the help here people, even though (after reading Jeff's ettiquette article) my question was poorly presented. Will try harder next time.
Thanks again, and thanks SQL Padawan for the corrected SQL
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply