January 18, 2012 at 1:19 pm
So simple question on which is better: storing CSVs in a column vs. table with records.
**Doing some fine tuning of an old system, an have never ran across a db where CSVs were stored in a column**
**The CSVs are from questionnaires with like something like, "How much pain do you have? 1-10, 1 being 'none' and 10 'hurts bad'"**
**Each value in the CSV is a different question
having a table like so:
UserId CSValues
23 1,2,3,4,5
24 2,3,4,5,6
25 8,7,6,5,4
or using a one to many relationship (creating necessary tables):
User_Id Question_Id Answer
1 1 6
1 2 5
1 3 8
2 1 3
2 2 3
From my past experiences, I have always learned to break such a situation up into one-to-many and many-to-many relationships and creating the necessary tables. Just trying to figure out why a past dev would have done this, so that I better understand, and maybe learn something new 🙂
January 18, 2012 at 1:22 pm
The one-to-many is a more normalized view of the data and would be more 'correct'. But , you could argue that the CSV route could be right for your system , if you are treating it as an atomic unit of data within the database.
January 18, 2012 at 1:35 pm
Dave Ballantyne (1/18/2012)
The one-to-many is a more normalized view of the data and would be more 'correct'. But , you could argue that the CSV route could be right for your system , if you are treating it as an atomic unit of data within the database.
Currently the system does not do anything really with these questionnares (which are completed over and over again) results, i.e. no trending analysis, etc.
But, I would think that when that day comes, it would be more efficient (code-wise and maybe processing time) to have it in a more normalized solution (above). What do you think?
January 18, 2012 at 2:08 pm
Yes, the one-to-many is what you should be aiming for. Its always better to over-engineer than under-engineer.
January 19, 2012 at 7:08 am
stephen99999 (1/18/2012)
Just trying to figure out why a past dev would have done this, so that I better understand, and maybe learn something new 🙂
There are a couple of possibilities.
1) he was lazy or didn't know any better.
2) he was overwhelmed with emergencies and didn't have time to code it properly. Then he forgot to fix it.
3) he let the Business Users bully him into doing it their way.
Being polite, I'd go with the "didn't know any better" option.
January 20, 2012 at 9:22 am
I have seen way too much of this delimited data in on varchar() column crap. It is typically implemented by a programmer who knows about RBAR processing with VB or ASP instead of set based relational processing.
The one exception would be if the specification clearly stated that the data would always be an atomic entity and never used otherwise. Unfortunately the inevitable happens and the requirement to query this data (or even worse, update it) comes up. Then you have hideous performance & fragmentation issues and/or ugly code to deal with it.
The probability of survival is inversely proportional to the angle of arrival.
January 23, 2012 at 4:43 am
CELKO (1/20/2012)
Anyone who is this bad will also have left you with really bad code in other places; you need to clean out the whole damn thing. This is my experiences over the last three decades of doing SQL.
I would have to agree with this sentiment. This can't be the only horrible legacy you have from the previous coder, so keep your eyes peeled for other monsters lurking in the code shadows.
January 23, 2012 at 6:27 am
Yet, ironically, people let other people store XML in their database. Go figure. 😉
If you change the structure of the table now, be advised that you'll also have to change everything that touches it. Although I agree that the original design is terrible and leaving it as it is constitutes an "SQL sin", until you know what the cost of changing everything else is, leave it be. Although it's certainly not the most effecient thing in the world, it can be temporarily normalized as a working table using the high speed T-SQL splitter at the following URL or the CLR splitter found at the same URL.
http://www.sqlservercentral.com/articles/Tally+Table/72993/
And, no... I'm not advocating CSV columns at all. I'm simply saying that you don't know what the cost of changing it to be correct is and, considering the workaround, wait to change it until you know all of the ramifications.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply