February 11, 2012 at 10:00 am
Jeff Moden (2/10/2012)
wurkenmomz (2/10/2012)
Also, he states that using the Distinct will not get the recommended results and can be very unreliable, in these cases. He does this for a living.Perfect. The instructor must be correct especially since he does this for a living. The rest of us all do this in our spare time just for fun and only every other weekend and only if it's not nice outside. 😛
Just remember that your instructor is the one that wrote the "Triangular Join" and then lookup "DISTINCT" in Books Online and see if it's really as "unreliable" as he says. 😉
"Doing it for a living"...doesn't mean that your instructor is always correct.
When learning a subject, you should always explore other methods / answers and question why the results are different from what you have been "taught".......you may well learn a lot more this way 🙂
now, please if you will, consider the following code.....cut and paste into SSMS and run it
what are your conclusions?
--+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
--PLEASE NOTE THAT THIS WILL DELETE ANY EXG TABLES WITH SAME NAME IN database TEMPDB....please amend if required
--+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
use [tempdb]--==== start in safe place!!!
GO
--====Conditionally delete table
IF OBJECT_ID('tempdb..Vendors', 'U') IS NOT NULL DROP TABLE tempdb..Vendors ;
--==== create a table with 10 million record ... tskes around 20 secs on a desktop
SELECT TOP 10000000 ---- NOTE 10 MILLION rows
VendorName = 1 + CAST(Abs(Checksum(Newid()) % 1000 ) AS INT),
VendorCity = 1 + CAST(Abs(Checksum(Newid()) % 100 ) AS INT),
VendorState = 1 + CAST(Abs(Checksum(Newid()) % 10 ) AS INT)
INTO Vendors
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
CROSS JOIN sys.all_columns ac3
GO
--==== insert a new record that we know will be unique
INSERT INTO [dbo].[Vendors]([VendorName], [VendorCity], [VendorState])
SELECT 99999, 9999, 999
GO
-- your instructor code
SET statistics TIME ON
SET statistics IO ON
SELECT VendorName
,VendorCity
,VendorState
FROM Vendors
WHERE VendorCity NOT IN
(SELECT v1.VendorCity
FROM Vendors AS v1
JOIN Vendors AS v2
ON (v1.VendorName <> v2.VendorName) AND
(v1.VendorCity = v2.VendorCity AND
v1.VendorState = v2.VendorState));
SET statistics TIME OFF
SET statistics IO OFF
--==== alternative code by Jeff Moden
SET statistics TIME ON
SET statistics IO ON
SELECT addr1.VendorName, addr1.VendorCity, addr1.VendorState
FROM dbo.Vendors addr1
INNER JOIN
(
SELECT VendorCity, VendorState
FROM dbo.Vendors
GROUP BY VendorCity, VendorState
HAVING COUNT(DISTINCT VendorName) = 1
) unqaddr
ON addr1.VendorCity = unqaddr.VendorCity
AND addr1.VendorState = unqaddr.VendorState
SET statistics TIME OFF
SET statistics IO OFF
I would now suggest that you (and your instructor) start to improve upon these results by creating some judicious index(s) that further improve the performance...
(I appreciate that you may consider the code to be "contrived"....but hopefully it does represent a more efficient method)
ps
The rest of us all do this in our spare time just for fun and only every other weekend and only if it's not nice outside. 😛
I would add that:
if there is beer in the fridge
or pork chops for lunch (apologies to JM)
or the kids need playing with......yadda yadda
...then we just post the first piece of code that springs to mind and hope that it works
😀
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
February 11, 2012 at 10:11 pm
Mmmmmm.... Porkchops marinated in beer. 😀
Hey, Graham... here's a little tip for you to try...
SET STATISTICS IO, TIME ON;
...
some code to be measured
...
SET STATISTICS IO, TIME OFF;
I know, I know... I'm lazy. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 2 posts - 16 through 16 (of 16 total)
You must be logged in to reply to this topic. Login to reply