Suppose you had a list of product sales and were curious about which one was sold the most? It’s a simple query, and one that I used to ask people in interviews so I thought it would make a nice easy blog post.
If I look at AdventureWorks2008, I have a Sales.SalesOrderDetail table that roughly looks like this:
There’s a productID in there, and if I want to see how products were sold, I can do this:
SELECT COUNT(DISTINCT ProductID) FROM Sales.SalesOrderDetail
However that doesn’t help me with the actual products. I could instead do this:
SELECT productid , COUNT(*) FROM Sales.SalesOrderDetail GROUP BY ProductID
That isn’t ordered, so I can easily add an ORDER BY to see the products sold and the counts.
SELECT productid , COUNT(*) FROM Sales.SalesOrderDetail GROUP BY ProductID ORDER BY COUNT(*) DESC
That gives me a list, but I really want to just get the top seller. That’s easy as well.
SELECT TOP 1 productid , COUNT(*) FROM Sales.SalesOrderDetail GROUP BY ProductID ORDER BY COUNT(*) DESC
Which returns:
productid
———– ———–
870 4688
However I don’t know which product this is, so I’d really want a join here.
SELECT TOP 1 p.Name , COUNT(*) FROM Sales.SalesOrderDetail sod INNER JOIN Production.Product p ON sod.ProductID = p.ProductID GROUP BY p.Name ORDER BY COUNT(*) DESC
Which lets me know that a water bottle was the most sold item.
Name
————————————————– ———–
Water Bottle – 30 oz. 4688
However I wasn’t asked for the count of sales, just the most sold item. I don’t really need the count in order for the query to work. I can do this:
SELECT TOP 1 p.Name FROM Sales.SalesOrderDetail sod INNER JOIN Production.Product p ON sod.ProductID = p.ProductID GROUP BY p.Name ORDER BY COUNT(*) DESC
Which just returns
Name
————————————————–
Water Bottle – 30 oz.
What if there were two items that had the same sales? I’d really want to include WITH TIES to be complete.
SELECT TOP 1 WITH TIES p.Name FROM Sales.SalesOrderDetail sod INNER JOIN Production.Product p ON sod.ProductID = p.ProductID GROUP BY p.Name ORDER BY COUNT(*) DESC
It’s simple, easy, but it’s a query that trips a lot of people up in the intervening steps. I’d prefer you quickly returned the last query to me (wrote it, told me, etc.), but if you had to work through it, I’d hope that you talked about the steps I did as you went through deriving the query so I can understand how you think, and how you improve. If you have me the first query, I’d probably hint around with you about the items forgotten.
You could use a CTE, subquery, or other methods, which are more complex, less efficient, and unnecessary, but could be valid answers. It’s best to stick with something simple, but go with your first instinct and work through it. If you get stuck, backtrack and try something else, explaining along the way. Or tell the interviewer you don’t know if you don’t.
It’s best to be honest, and to show that you can learn, correct yourself, or even admit you don’t know.
Filed under: Blog Tagged: syndicated, T-SQL