May 31, 2007 at 9:55 pm
Probably, Serqiy... I just think they did it at a machine-language level, which would be a lot faster than at the T-SQL level.
Heh... I agree about the OO hints... kinda like what one fellow says as part of his sig... something like "DTS and Cursors are for those who don't understand T-SQL". Still, I haven't been able to beat a properly written CTE yet... I was hoping it was because my home machine is a bit slower than most servers, which is why I keep asking folks to give us timings on a million-row example.
I finally got a copy of the Developer's Edition of 2K5... will install it sometime this weekend... right after I back up my machine
--Jeff Moden
Change is inevitable... Change for the better is not.
May 31, 2007 at 10:01 pm
Jeff, I finally got to understand your code, but I found a bug in it. Using your short test table, if you change the following:
Thanks Antonio... was pretty sure that was covered even without the extra variable but I'll double check...
--Jeff Moden
Change is inevitable... Change for the better is not.
May 31, 2007 at 10:23 pm
Antonio,
Heh... had to do a double take... but no... it's working correctly because of the DISTINCT in the transfer and the GROUP BY... here are the results with the original data and with your data modification...
Results BEFORE Data Change          Results AFTER Data Change

EmployeeID  Freight  Items          EmployeeID  Freight  Items
1           1.35     2              1           1.35     2
1           4.98     2              1           4.98     2
1           8.00     2              1           8.00     2
1           9.01     4              1           9.01     4
2           4.98     3              2           4.98     3
2           8.00     3              2           8.00     3
2           9.01     4              2           9.01     4
3           9.07     2
Because the 3,9.07 was changed to 3,9.08 and there already was a 3,9.08, then according to the rules the OP posted, the two 3,9.08's are disqualified and the listing for EmployeeID #3 simply goes away...
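For anyone following along without a SQL instance handy, here is a quick Python sketch (my own, not part of the thread's T-SQL) of the rules as I read them: collapse duplicate Freight values per employee, then report each run of values exactly .01 apart that is at least two values long. It reproduces the disappearing-EmployeeID-3 behavior described above.

```python
from collections import defaultdict

def freight_runs(rows, step_cents=1):
    """Collapse duplicate Freight values per employee, then report every
    run of values exactly one cent apart that is at least two values
    long, as (EmployeeID, starting Freight, run length).  The duplicated
    (3, 9.08) rows collapse to a single value, which is no longer .01
    from anything, so EmployeeID 3 drops out entirely."""
    by_emp = defaultdict(set)
    for emp, freight in rows:
        by_emp[emp].add(round(freight * 100))   # work in cents, not floats
    results = []
    for emp in sorted(by_emp):
        vals = sorted(by_emp[emp])
        start = prev = vals[0]
        for v in vals[1:] + [None]:             # None flushes the last run
            if v is not None and v == prev + step_cents:
                prev = v
                continue
            if prev > start:                    # run of two or more values
                results.append((emp, start / 100, prev - start + 1))
            if v is not None:
                start = prev = v
    return results
```

With (3, 9.07) and (3, 9.08) it reports one run; with two copies of (3, 9.08) it reports nothing, matching the table above.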
Way to keep me on my toes, though
--Jeff Moden
Change is inevitable... Change for the better is not.
June 1, 2007 at 9:02 am
Jeff, thanks for your patience. I am learning a LOT on this thread.
Here is a variation on your code that will run faster than the original:
--===== If the table that holds the test data already exists drop it
IF OBJECT_ID('TempDB..#MyHead') IS NOT NULL
DROP TABLE #MyHead
--===== Recreate the test table and populate it with test data
CREATE TABLE #MyHead (EmployeeID INT,Freight DECIMAL(6,2))
INSERT INTO #MyHead (EmployeeID,Freight)
SELECT 1,0.21 UNION ALL
SELECT 1,0.45 UNION ALL
SELECT 1,0.93 UNION ALL
SELECT 1,1.27 UNION ALL
SELECT 1,1.35 UNION ALL --should be a "hit" with count of "2"
SELECT 1,1.36 UNION ALL
SELECT 1,1.51 UNION ALL
SELECT 1,1.66 UNION ALL
SELECT 1,2.50 UNION ALL
SELECT 1,3.94 UNION ALL
SELECT 1,4.03 UNION ALL
SELECT 1,4.27 UNION ALL
SELECT 1,4.41 UNION ALL
SELECT 1,4.98 UNION ALL --should be a "hit" with count of "2"
SELECT 1,4.99 UNION ALL
SELECT 1,4.99 UNION ALL --dupe (not included according to David's specs)
SELECT 1,7.46 UNION ALL
SELECT 1,8.00 UNION ALL --should be a "hit" with count of "2"
SELECT 1,8.01 UNION ALL
SELECT 1,9.01 UNION ALL --should be a "hit" with count of "4"
SELECT 1,9.02 UNION ALL
SELECT 1,9.03 UNION ALL
SELECT 1,9.04 UNION ALL
SELECT 2,4.27 UNION ALL --different employee id
SELECT 2,4.41 UNION ALL
SELECT 2,4.98 UNION ALL --should be a "hit" with count of "3"
SELECT 2,4.99 UNION ALL
SELECT 2,4.99 UNION ALL --dupe (not included according to David's specs)
SELECT 2,5.00 UNION ALL
SELECT 2,7.46 UNION ALL
SELECT 2,8.00 UNION ALL --should be a "hit" with count of "3"
SELECT 2,8.01 UNION ALL
SELECT 2,8.02 UNION ALL
SELECT 2,9.01 UNION ALL --should be a "hit" with count of "4"
SELECT 2,9.02 UNION ALL
SELECT 2,9.02 UNION ALL --dupe (not included according to David's specs)
SELECT 2,9.02 UNION ALL --dupe (not included according to David's specs)
SELECT 2,9.02 UNION ALL --dupe (not included according to David's specs)
SELECT 2,9.03 UNION ALL
SELECT 2,9.04 UNION ALL
SELECT 2,9.07 UNION ALL
SELECT 3,9.07 UNION ALL --just to show one EmployeeID won't bleed into the next
SELECT 3,9.08
--===== If the table that holds the isolated data already exists drop it
IF OBJECT_ID('TempDB..#Work') IS NOT NULL
DROP TABLE #Work
--===== Create and populate the work table with a clustered primary key and a "special" column
CREATE TABLE #Work
        (
        EmployeeID INT NOT NULL,
        Freight    DECIMAL(6,2) NOT NULL,
        MyGroup    INT, --This is the "special" column
        PRIMARY KEY CLUSTERED (EmployeeID,Freight)
        )
INSERT INTO #Work (EmployeeID,Freight)
SELECT DISTINCT t1.EmployeeID, t1.Freight
  FROM #MyHead t1,
       #MyHead t2
 WHERE t1.Freight + .01 = t2.Freight
   AND t1.EmployeeID = t2.EmployeeID
--***************************************************
--*** Here I removed the second part of the union ***
--***************************************************
--=======================================================================================
-- This bit of "trick" code relies on the order of the data which is
-- forced by the Clustered Primary Key
--===== Declare a couple of local variables to support the "trick" code
DECLARE @MyGroup     INT
SET     @MyGroup     = 0
DECLARE @LastFreight DECIMAL(6,2)
SET     @LastFreight = -1
--===== Here's the "trick"... this "groups" rows that are .01 apart in Freight using the
-- proprietary UPDATE clause of SQL Server... magic... and blows cursor speed away!!!
UPDATE w
   SET @MyGroup = MyGroup = CASE
                               WHEN Freight = @LastFreight + .01
                               THEN @MyGroup
                               ELSE @MyGroup + 1
                            END,
       @LastFreight = Freight
  FROM #Work w --Order of processing is forced by the Clustered Primary Key
--=======================================================================================
-- So, with that having been done, it's now easy to get the results you want.
--=======================================================================================
--*********************************************************************************
--*** Here I add 1 to the count to compensate for the "missing" trailing record ***
--*********************************************************************************
SELECT EmployeeID, MIN(Freight) AS Freight, COUNT(*) + 1 AS Items
  FROM #Work
 GROUP BY EmployeeID, MyGroup
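If it helps to see the running-variable pass outside of SQL, here is a rough Python equivalent (my own simplified sketch, not part of the post: it walks rows in clustered-key order and reproduces only the CASE logic, not the #Work filtering or the COUNT(*) + 1 adjustment). Like the T-SQL CASE, it looks only at Freight; the final GROUP BY EmployeeID, MyGroup is what keeps employees separate.

```python
def assign_groups(work_rows):
    """Rough analog of the 'trick' UPDATE: walk the rows in
    (EmployeeID, Freight) order -- the order the clustered primary key
    forces -- keeping a row in the current group when its Freight is
    exactly one cent above the previous row's, and opening a new group
    otherwise (the CASE expression).  @LastFreight becomes last_cents."""
    my_group = 0
    last_cents = None   # the T-SQL seeds @LastFreight with -1 instead
    out = []
    for emp, freight in sorted(work_rows):
        cents = round(freight * 100)
        if last_cents is None or cents != last_cents + 1:
            my_group += 1                       # ELSE @MyGroup + 1
        out.append((emp, freight, my_group))    # SET ... MyGroup = ...
        last_cents = cents                      # @LastFreight = Freight
    return out
```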
Antonio Macedo
June 1, 2007 at 4:47 pm
Very cool... nice job, Antonio... takes only 28 seconds on the million row test...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 4, 2007 at 3:09 am
Here is a solution that only takes about 12 seconds to do the calculation and present the data...
-- Create sample data
SELECT TOP 1000000
       CAST(RAND(CAST(NEWID() AS VARBINARY)) * 1000 + 1 AS INT) AS EmployeeID,
       CAST(RAND(CAST(NEWID() AS VARBINARY)) * 10 AS DECIMAL(6,2)) AS Freight
INTO   #MyHead
FROM   Master.dbo.SysColumns AS sc1 WITH (NOLOCK),
       Master.dbo.SysColumns AS sc2 WITH (NOLOCK)

-- Create a clustered index over EmployeeID and Freight
CREATE CLUSTERED INDEX IX_Freight ON #MyHead (EmployeeID, Freight)

-- Add an auxiliary column
ALTER TABLE #MyHead ADD Grp INT

-- Do the grouping thingy
DECLARE @Freight DECIMAL(6, 2),
        @Grp     INT,
        @Now     DATETIME

-- Start the clock
SELECT @Freight = MIN(Freight) - 1,
       @Grp     = 0,
       @Now     = CURRENT_TIMESTAMP
FROM   #MyHead

UPDATE #MyHead
SET    @Grp     = Grp = CASE WHEN Freight > @Freight + 0.01 THEN @Grp + 1 ELSE @Grp END,
       @Freight = Freight

-- Show the result
SELECT   EmployeeID,
         MIN(Freight) AS Freight,
         COUNT(DISTINCT Freight) AS [Count]
FROM     #MyHead
GROUP BY Grp, EmployeeID
HAVING   COUNT(DISTINCT Freight) > 1
ORDER BY EmployeeID, MIN(Freight)

SELECT DATEDIFF(MILLISECOND, @Now, CURRENT_TIMESTAMP) AS Milliseconds

-- Clean up
DROP TABLE #MyHead
N 56°04'39.16"
E 12°55'05.25"
June 4, 2007 at 6:12 am
You know what they say... "Tuning makes perfect" Keeps getting better and better...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 4, 2007 at 8:31 am
<proprietary UPDATE clause of SQL Server... magic... and blows cursor speed away>
This solution is really cute, and timely too - there are several current threads looking for a sequential cumulative calculation. Unfortunately, this statement type doesn't work in a SELECT ("A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations."). Does it work with 2k5?
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 4, 2007 at 9:14 am
Here is a set-based method that will work on both SQL 2000 and SQL 2005.
-- Prepare sample data
DECLARE @Sample TABLE (EmployeeID INT, Freight SMALLMONEY)

INSERT @Sample
SELECT 1, 4.99 UNION ALL
SELECT 1, 0.21 UNION ALL
SELECT 1, 4.03 UNION ALL
SELECT 1, 1.36 UNION ALL
SELECT 1, 0.93 UNION ALL
SELECT 1, 4.98 UNION ALL
SELECT 1, 1.35 UNION ALL
SELECT 1, 0.45 UNION ALL
SELECT 1, 1.51 UNION ALL
SELECT 1, 4.99 UNION ALL
SELECT 1, 1.66 UNION ALL
SELECT 1, 2.50 UNION ALL
SELECT 1, 4.27 UNION ALL
SELECT 1, 3.94 UNION ALL
SELECT 1, 4.41 UNION ALL
SELECT 1, 1.27 UNION ALL
SELECT 1, 7.46

-- Show the expected output
SELECT d1.EmployeeID,
       d1.Freight,
       CAST(1 + 100 * MIN(d2.Freight) - 100 * d1.Freight AS INT) AS [Count]
FROM   (
           SELECT EmployeeID, Freight
           FROM   @Sample AS t1
           WHERE  NOT EXISTS (
                      SELECT *
                      FROM   @Sample AS t2
                      WHERE  t1.Freight - t2.Freight = 0.01
                        AND  t1.EmployeeID = t2.EmployeeID
                  )
       ) AS d1
INNER JOIN (
           SELECT EmployeeID, Freight
           FROM   @Sample AS t1
           WHERE  NOT EXISTS (
                      SELECT *
                      FROM   @Sample AS t2
                      WHERE  t2.Freight - t1.Freight = 0.01
                        AND  t2.EmployeeID = t1.EmployeeID
                  )
       ) AS d2 ON d1.Freight <= d2.Freight AND d1.EmployeeID = d2.EmployeeID
GROUP BY d1.EmployeeID, d1.Freight
HAVING   MIN(d2.Freight) > d1.Freight
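A way to convince yourself the endpoint join is right, in Python terms (my own sketch, not part of the post): d1 is the set of run starts (values with nothing .01 below them), d2 the set of run ends (nothing .01 above them), and pairing each start with the smallest end at or above it recovers every maximal run.

```python
from collections import defaultdict

def runs_via_endpoints(rows):
    """A start value has no value one cent below it (d1); an end value
    has no value one cent above it (d2).  Pairing each start with the
    smallest end at or above it yields each maximal run, whose length is
    end - start + 1 in cents -- the CAST(1 + 100 * ...) expression.
    Length-1 runs are discarded, mirroring HAVING MIN(d2.Freight) > d1.Freight."""
    by_emp = defaultdict(set)
    for emp, freight in rows:
        by_emp[emp].add(round(freight * 100))   # work in cents
    results = []
    for emp in sorted(by_emp):
        vals = by_emp[emp]
        ends = [v for v in vals if v + 1 not in vals]
        for s in sorted(v for v in vals if v - 1 not in vals):
            e = min(v for v in ends if v >= s)  # ON d1.Freight <= d2.Freight
            if e > s:                           # HAVING MIN(...) > d1.Freight
                results.append((emp, s / 100, e - s + 1))
    return results
```

On a slice of the sample data it finds the 1.35-1.36 and 4.98-4.99 runs and nothing else, which matches the expected output of the set-based query.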
N 56°04'39.16"
E 12°55'05.25"
July 3, 2007 at 5:42 am
Sorry, Chris... lost track of this thread...
Specifically, what statement gave you that error?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 4, 2007 at 3:30 am
Yeah me too!
I saw a use for ...
UPDATE #MyHead
SET    @Grp     = Grp = CASE WHEN Freight > @Freight + 0.01 THEN @Grp + 1 ELSE @Grp END,
       @Freight = Freight
... in the work I'm currently involved with and tried this syntax in a SELECT, but it only works with an UPDATE.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 4, 2007 at 11:33 am
Yeah... that's the bugger about it... @variable=column=formula ONLY works for Update and only with certain other "hooks" present....
--Jeff Moden
Change is inevitable... Change for the better is not.
July 5, 2007 at 12:03 pm
and
@variable = column = formula
is equivalent to:
SET @variable = formula, column = formula
That is why it works for "Update" statements only.
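If a non-SQL analogy helps (purely illustrative, my own example): the behavior is close to a chained assignment, where the right-hand side is evaluated once per row and the result lands in both targets.

```python
# Illustrative only: Python's chained assignment mirrors the T-SQL
# shorthand SET @Grp = Grp = <formula> -- the formula is evaluated once
# and the result is stored in both the "column" and the "variable".
row = {"Grp": None}   # stand-in for the table column
grp = 0               # stand-in for the T-SQL variable @Grp
grp = row["Grp"] = grp + 1   # like SET @Grp = Grp = @Grp + 1
```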
Cheers,
* Noel
April 10, 2012 at 8:11 am
My friend gave me an interesting problem yesterday. He works for a hospital and needs to count patient visits; if a patient visits the doctor on consecutive days, it needs to be counted as one visit.
Here is how I solved it. This is just one of many ways to do it.
DECLARE @tempTable TABLE(row_id INT IDENTITY(1,1),
patient_id INT,
er_date DATETIME)
-- Test Data
INSERT INTO @tempTable(patient_id,er_date)
SELECT 1,'1/1/2012'
UNION ALL
SELECT 1,'1/2/2012'
UNION ALL
SELECT 1,'1/3/2012'
UNION ALL
SELECT 2,'1/5/2012'
UNION ALL
SELECT 2,'1/6/2012'
UNION ALL
SELECT 3,'1/8/2012'
UNION ALL
SELECT 3,'1/9/2012'
UNION ALL
SELECT 1,'2/1/2012'
UNION ALL
SELECT 1,'2/2/2012'
UNION ALL
SELECT 2,'2/3/2012'
UNION ALL
SELECT 2,'3/3/2012'
UNION ALL
SELECT 3,'1/10/2012'
UNION ALL
SELECT 3,'1/13/2012'
UNION ALL
SELECT 5,'1/5/2012'
--If the datediff is only 1, it is treated as the same visit
SELECT
tbl1.patient_id,
SUM( CASE WHEN tbl1.er_date IS NOT NULL AND tbl2.er_date IS NULL THEN 1 ELSE 0 END ) [number_of_visits]
FROM
@tempTable tbl1
LEFT JOIN @tempTable tbl2
ON tbl1.patient_id = tbl2.patient_id
AND tbl1.er_date = DATEADD(dd,1,tbl2.er_date)
GROUP BY
tbl1.patient_id;
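The same counting rule is easy to sanity-check outside SQL. Here is a small Python sketch (mine, with made-up names) of what the LEFT JOIN computes: a day counts as a new visit only when there is no same-patient visit exactly one day earlier.

```python
from datetime import date, timedelta

def count_visits(visits):
    """visits: iterable of (patient_id, date) pairs, one per ER day.
    A day counts as a new visit only when the same patient has no row
    for the previous day -- the condition the LEFT JOIN's NULL check
    expresses -- so runs of consecutive days collapse into one visit."""
    seen = set(visits)                  # deduped; assumes one row per day
    counts = {}
    for patient, day in seen:
        if (patient, day - timedelta(days=1)) not in seen:
            counts[patient] = counts.get(patient, 0) + 1
    return counts
```

For example, patient 1 visiting on Jan 1-3 of 2012 collapses to a single visit, while a gap of more than one day starts a new visit, matching the query's output on the test data.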
April 11, 2012 at 7:11 am
Some people are entertained by fast cars...others by fast code...GO GEEKS, I say. 🙂