August 8, 2019 at 10:40 am
Hi,
I have the next case with a challenge on grouping records on 120 pivotted columns. Objectnames are in Dutch but hopefully that is no problem. The records in our source table have questionaire outputs for patients in our hospital. Every record holds the answer of one question in a questionaire. The questionaire that we would like to pivot consists of 120 questions so we have 120 records per patient. These records are already isolated from other questionaires so our source table only hols the appropriate questionaire. We would like to dynamically pivot the source table having the patientnumber on the rows and the individual questions as columns. This pivotting is no problem, but we get 120 records per patient. What we would like to achieve is to group the records per patient so we end up with one row per patient with the corresponding answer on a question in the corresponding column. What we like to avoid is to hard code all the 120 columns individually in the group by statement. What we like to now is if there is a way to dynamically group by all 120 columns.
All the steps are below. We list them all to give a complete overview what is being executed, but the problem lies within the last steps.
Sorry that the code in not indented like it is in our real queries to make it more readable ...
if object_id('tempdb..##vragen_antwoorden_dialyse_pivot') is not null
drop table ##vragen_antwoorden_dialyse_pivot
if object_id('tempdb..#vragen_antwoorden_dialyse_distinct') is not null
drop table #vragen_antwoorden_dialyse_distinct
if object_id('tempdb..#vragen_antwoorden_dialyse') is not null
drop table #vragen_antwoorden_dialyse
2. Variables are declared
declare @sqlquery nvarchar(max)
declare @pivotkolommen nvarchar(max)
3. We take the unique questions from our source table and sort these questions based on the question number. The question holds the question number as the first x characters of that question.
select z.vraagcode,
z.subvraagcode,
z.vraagstelling
into #vragen_antwoorden_dialyse_distinct
from (
select top 1000 y.*
from (
select distinct x.vraagstelling,
substring(vraagcode, patindex('%[^0-9]%', vraagcode), 1) as subvraagcode,
case when patindex('%[^0-9]%', vraagcode) = 0 then vraagcode
else substring(vraagcode, 1, patindex('%[^0-9]%', vraagcode) - 1) end as vraagcode
from (
select substring(vraagstelling, 1, charindex('.', vraagstelling, 1) - 1) as vraagcode,
vraagstelling
from [ElkerliekBA].kwt.vragen_antwoorden_dialyse
) x
) y
order by cast(y.vraagcode as float)
) z
4. Source query for pivot
select patientnr,
substring(vraagstelling, 1, charindex('.', vraagstelling, 1) - 1) as vraagcode,
vraagstelling,
Antwoord
into #vragen_antwoorden_dialyse
from [ElkerliekBA].kwt.vragen_antwoorden_dialyse
5. Fill variable @PivotKolommen with all distinct questions
select @PivotKolommen = coalesce(@PivotKolommen + ',','') + quotename(vraagstelling)
from #vragen_antwoorden_dialyse_distinct
6. Making the dynamic query
First we used this code ...
N'select [patientnr],' + @PivotKolommen + '
into ##vragen_antwoorden_dialyse_pivot
from #vragen_antwoorden_dialyse
pivot(max([Antwoord])
for [vraagstelling] in (' + @PivotKolommen + ')) as Qry'
... but tried to change it in this one to have the grouping per patient (one row per patient) a step earlier than grouping the finale result (see step 8 below)
N'
with pivot as
(
select [patientnr],' + @PivotKolommen + '
from #vragen_antwoorden_dialyse
pivot (
max([Antwoord])
for [vraagstelling] in (' + @PivotKolommen + ')
) as Qry
)
select patientnr,
max(' + @PivotKolommen + ')
from pivot
group by patientnr'
... but to no avail ... the max function obviously can not handle the dynamic colunms in the variable PivotKolommen.
7. Executing the query
execute(@SQLQuery)
8. Selecting the final result ... where we are stuck because per patient 120 records are shown (with the first query in step 6)
select * from ##vragen_antwoorden_dialyse_pivot
order by patientnr
... So, is there a way to accomplish a dynamic group by in order to avoid writing out all the 120 columns individually in the group by statement.
August 8, 2019 at 12:01 pm
I'll be honest, I'm not even going to try to propose a solution to the query. Sorry.
Instead, let me ask a question. Have you looked at using a columnstore index? It's effectively a pivot on the data, from rows to columns. Further, aggregations are inherent, so you won't need to do this kind of group by processing. Just an idea. Hopefully someone with actual skill, and a ton of time, comes by shortly to propose a better answer. Good luck.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 8, 2019 at 1:03 pm
Without wading into the specifics of your situation you could use the stuff() function in combination with for xml path.
In this code the cte is called num_expand and it creates a list of numbers from 1 to 10. The select statement consolidates the list of numbers into a single comma delimited string. If you replace the cte definition with the unique list of question Id's, the resulting comma string could then be appended into your dynamic sql. I don't think you need pivot just group by.
with
num_expand(num) as (
select 1 num
union all
select
ne.num+1
from
num_expand ne
where
num<10)
select
stuff((select ',' + cast(num as varchar(3)) from num_expand for xml path ('')), 1, 1, '') string;
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
August 8, 2019 at 1:51 pm
There's actually a very easy, almost trivial method to solve this problem. Here's the article on how you could pull this off. And, please, tell us why they want to denormalize the data like this, especially for so many columns. It's not like it'll make a useful report and it'll make doing a computational analysis a whole lot more difficult.
Here's the article. You probably won't need "Pre-Aggregation" to speed it up because you don't have multiple rows to aggregate to use in a single "Cell" of the output.
https://www.sqlservercentral.com/articles/cross-tabs-and-pivots-part-2-dynamic-cross-tabs
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply