January 19, 2012 at 2:24 pm
I need some help with a SQL Query if anyone has a moment.
I have two tables; Attributes & Products. They are connected via INNER JOIN. Each product can have multiple attributes (color, size, position, shape etc...). I have been trying to construct a query which will filter this properly via WHERE clause. It works properly when choosing only one type of attribute (ex... WHERE (Value = 'Blue' OR Value = 'Red') etc...), but it fails to filter properly when more than one attribute type is presented (think Color and Size).
Currently, if someone selects 'Red' as the Color, and 'Large' as the Size, it will show all products which are Red or Large, not products which are Red AND Large, as desired.
I have tried every combination of query I could think of, including...
...WHERE (AttributeValue = 'Red' OR AttributeValue = 'Large')
...WHERE (AttributeValue = 'Red') AND (AttributeValue = 'Large')
...WHERE (AttributeName = 'Color' AND AttributeValue = 'Red') AND (AttributeName = 'Size' AND AttributeValue = 'Large')
...WHERE (Attributes.Name = 'Color' AND Att.Value = 'Red' OR Att.Value = 'Blue) OR (Att.Name = 'Position' AND Att.Value = 'Top')
Each of the above either outputs nothing, or outputs data filtered improperly. The problem seems to be that each attribute is stored on a seperate row, not just in seperate columns.
======================================
Sample Query...
SELECT dbo.Products.ProductID, dbo.Products.Title, dbo.Products.Price, dbo.Products.Image
FROM dbo.Products INNER JOIN Attributes ON Products.ProductID = Attributes.ProductID
WHERE (??? NOT SURE)
ORDER BY dbo.Products.Title ASC
======================================
Attributes Table Structure...
AttributeID | ProductID | AttributeName | AttributeValue
1 | 625 | Color | Red
2 | 625 | Color | Blue
3 | 625 | Position | Top
4 | 625 | Position | Bottom
5 | 625 | Base | Round
6 | 1258 | Color | Green
7 | 1258 | Color | Orange
8 | 5674 | Position | Top
9 | 5674 | Position | Center
10 | 5674 | Base | Round
11 | 5674 | Base | Square
======================================
Any ideas?
January 19, 2012 at 4:00 pm
This is one of the challenges of the way your tables are designed. You are using a table as a catch all. In other words the table holds multiple types of information. Notice how I created tables and sample data to make it easier for others to help. This is something you should do on future questions.
I didn't really know what you were looking for so i decided to find Products that have "Large" and "Red".
create table #Products
(
ProductID int,
Name varchar(10)
)
create table #Attributes
(
AttributeID int identity,
ProductID int,
AttributeName varchar(10),
AttributeValue varchar(10)
)
insert #Products
select 625, 'Prod 1'
union all
select 1258, 'Prod 2'
insert #Attributes
select 625, 'Color', 'Red'
union all
select 625, 'Color', 'Blue'
union all
select 625, 'Size', 'Large'
union all
select 625, 'Size', 'Medium'
--Here is with multiple joins (one for each Attribute type)
select *
from #Products p
join #Attributes a on a.ProductID = p.ProductID and a.AttributeName = 'Color'
join #Attributes b on b.ProductID = p.ProductID and b.AttributeName = 'Size'
where a.AttributeValue = 'Red'
and b.AttributeValue = 'Large'
--Here is with cross apply
select *
from #Products p
cross apply(select * from #Attributes at where at.AttributeName = 'Color' and at.AttributeValue = 'Red' and at.ProductID = p.ProductID) a
cross apply(select * from #Attributes at where at.AttributeName = 'Size' and at.AttributeValue='Medium' and at.ProductID = p.ProductID) b
drop table #Products
drop table #Attributes
That get you close to an answer?
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
January 19, 2012 at 9:28 pm
Thanks for the reply! I have noted the way to provide table / data structure. That will be most helpful in the future! 🙂
Your suggestion helps greatly, but it is still problematic to my scenario since the names and values of the variables are not 'known'. It is user supplied input and changes per category / product.
With that said, I have a lot of flexibility in redesigning the table. The table is currently not assigned to a live production and its associations are easily modified.
I am thinking of doing the items in a delimited value. So each product that has an attribute (not all do) will have an entry in the Attributes table with a foreign key to the Products table for the relationship. There will only be 1 entry in the Atts. table per product.
I will store the data in 2 fields in the Atts. table; AttName & AttValue. AttName would contain data similar to: |Color||Style||Size|. AttValue would store the associated values like: |Red||Retro|Large|.
With this, I could call simple SELECT .. WHERE (AttName LIKE '%|Color|%' AND AttValue LIKE '%|Red|%') AND (.......)
Does this sound like a logical solution, or more like a hack? If the latter, what would be the best hierarchy for the situation, assuming we are starting with a blank canvas?
Thanks again!
January 19, 2012 at 11:58 pm
TimeToShine (1/19/2012)
Does this sound like a logical solution, or more like a hack? If the latter, what would be the best hierarchy for the situation, assuming we are starting with a blank canvas?
Hi ,
This design is known as EAV ( Entity , Attribute, Value) as is renowned as poorly performing and awkward to work with.
So , assuming a blank canvas , you are painting the wrong picture 🙂
Dave
January 20, 2012 at 2:03 am
TimeToShine (1/19/2012)
I will store the data in 2 fields in the Atts. table; AttName & AttValue. AttName would contain data similar to: |Color||Style||Size|. AttValue would store the associated values like: |Red||Retro|Large|.
Please, please, please read up on nomalisation. What you've got there is a nightmare waiting to happen. Think about how you'd go about adding or removing an attribute from that pipe-delimited set (I'll wait...)
With this, I could call simple SELECT .. WHERE (AttName LIKE '%|Color|%' AND AttValue LIKE '%|Red|%') AND (.......)
Does this sound like a logical solution, or more like a hack? If the latter, what would be the best hierarchy for the situation, assuming we are starting with a blank canvas?
It sounds like something that a consultant would bill a huge amount of money to fix in a few months time. That query, as written would scan the table every time. Couple hundred rows, you might get away with it. Few thousand, not so much.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 20, 2012 at 9:09 am
Please, please, please read up on nomalisation. What you've got there is a nightmare waiting to happen. Think about how you'd go about adding or removing an attribute from that pipe-delimited set (I'll wait...)
(...Hope you weren't holding your breath) I am familiar with normalization. Hence the reason I did not put the attributes directly in the Products table. There's tons of ways to format the data in the pipe delimited format. That's the least of my concerns at this point.
Per the normalization that was suggested. Can you provide an example if you have a moment? I do believe my original approach was fairly normalized. Each attribute was assigned to a ProductID. Each attribute had its own row. The only way I could split that data anymore would be to create separate tables for each AttValue & AttName. That, IMO, would seem awfully redundant and unnecessary.
This design is known as EAV ( Entity , Attribute, Value) as is renowned as poorly performing and awkward to work with.
So , assuming a blank canvas , you are painting the wrong picture 🙂
Congrads., you finally got to be funny on the inernet. Do you possibly have any ideas on how to paint the right picture? Dont just come 'point and laugh', give some advice if there is a better way! 😀
January 20, 2012 at 9:32 am
TimeToShine (1/19/2012)
I am thinking of doing the items in a delimited value. So each product that has an attribute (not all do) will have an entry in the Attributes table with a foreign key to the Products table for the relationship. There will only be 1 entry in the Atts. table per product.
I will store the data in 2 fields in the Atts. table; AttName & AttValue. AttName would contain data similar to: |Color||Style||Size|. AttValue would store the associated values like: |Red||Retro|Large|.
With this, I could call simple SELECT .. WHERE (AttName LIKE '%|Color|%' AND AttValue LIKE '%|Red|%') AND (.......)
I would advise strongly against this approach. You will face some real challenges with updating records and you will suffer unbelievable performance issues. You queries as you presented using like means that any indexing on those columns are lost. The best you could do at that point would be to split your values and suddenly you now using a query to get back to where you are now. Additionally the logic is pretty tough. What happens if you two color attributes? Let's say you had Color and BorderColor. Your above query would return records that it shouldn't because the Color is Blue and the BorderColor is Red. See what I am saying?
It is somewhat difficult to offer much advice on proper structure because I am not familiar with your project as intimately as you are. Depending on how many different attributes you could make a table for each one. Color, Size, Position, Location, etc. Each table could have almost identical columns (ColorID, ProductID, Value). This way you could build your query with a left join to each attribute table and build your where clause based on user input to filter the results. If you have 100 distinct attributes this might be as good an approach because the table count would be rather ridiculous.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
January 20, 2012 at 9:35 am
TimeToShine (1/20/2012)
Please, please, please read up on nomalisation. What you've got there is a nightmare waiting to happen. Think about how you'd go about adding or removing an attribute from that pipe-delimited set (I'll wait...)
(...Hope you weren't holding your breath) I am familiar with normalization. Hence the reason I did not put the attributes directly in the Products table.
Which normal form requires that attributes of an entity must be stored separately from the entity?
I do believe my original approach was fairly normalized. Each attribute was assigned to a ProductID. Each attribute had its own row.
Well, I suppose you could call that normalised. It is, however, a very poorly performing design. EAV tables look sooooo tempting but are an absolute mess to query (as you've discovered). They're also terrible for data consistency (nothing prevents "Red" been stored for a size or "Large" for a colour) (and don't say that the app enforces the consistency, that works as long as no one ever accesses the database directly and no other applications are ever written and your app is bug-free)
Typically you would approach this in one of two ways:
If most of the attributes are consistent across all the product types, then they are stored as columns in the product table
If there are vastly different attributes for the different product types, then either the product table can be horizontally partitioned into multiple tables for each product type or the products can be kept in one table and have a 1-1 relationship with an appropriate attribute tables. I prefer horizontally partitioning the products if going this route.
Finally I've seen people use XML for the cases where the attributes are vague and unknown at design time (though that's usually not the case unless there's a very, very, very, very fast changing business, it's more normally an excuse for poor business requirements gathering). So the product table would gain a column called Attributes of type XML. This is generally not a good design, it is a violation of 1st normal form, but if it's really necessary it's better than EAV because you can use XQuery and shred the nodes into a relational structure for reasonably easy querying and filtering.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply