May 28, 2007 at 7:07 am
This query will return the consequitve row ,Try this and let me know u r looking for this resutl
select a.employeeid , a.freight ,2 countno
from test a, test b
where a.freight =b.freight -.01 and a.employeeid =b.employeeid
and a.slno =b.slno-1
May 28, 2007 at 7:04 pm
Why are you hard coding the nubmer 2? Also, What is the "slno" column? If it's the sequential column that everyone has been listing, that column is there just so we can conveniently talk about the same rows... according to the OP, the sequential column is not sequential in his table.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 28, 2007 at 7:59 pm
And to finalize it:
#Work must permanent table populated/updated when a new row is saved in "Header" table.
Then you'll not gonna have any problems with building SELECTs and providing best performance.
_____________
Code for TallyGenerator
May 30, 2007 at 4:59 am
Hi
Here's another way using Jeff's data (thanks Jeff)
-- collect only the rows we're interested in
SELECT IDENTITY (int, 1, 1) AS RowID, *, CAST(0 AS int) AS BandID
INTO #temp
FROM (
SELECT t1.*, t2.Freight as t2, t3.Freight AS t3
FROM (SELECT DISTINCT * FROM #MyHead) t1
LEFT JOIN (SELECT DISTINCT * FROM #MyHead) t2
ON t2.EmployeeID = t1.EmployeeID AND t2.Freight = t1.Freight + 0.01
LEFT JOIN (SELECT DISTINCT * FROM #MyHead) t3
ON t3.EmployeeID = t1.EmployeeID AND t3.Freight = t1.Freight - 0.01
) q WHERE t2 IS NOT NULL OR t3 IS NOT NULL
ORDER BY EmployeeID, Freight
-- explicitly mark the first of each freight band
DECLARE @BandID INT
SET @BandID = 1
WHILE (SELECT COUNT(*) FROM #temp WHERE BandID = 0 AND t2 IS NOT NULL AND t3 IS NULL) > 0
BEGIN
UPDATE #temp SET BandID = @BandID
WHERE RowID IN (SELECT MIN(RowID) FROM #temp WHERE BandID = 0 AND t2 IS NOT NULL AND t3 IS NULL GROUP BY EmployeeID)
SET @BandID = @BandID + 1
END
-- mark the remaining rows of each band
WHILE (SELECT COUNT(*) FROM #temp WHERE BandID = 0) > 0
BEGIN
UPDATE #temp SET BandID = b.BandID
FROM #temp a
INNER JOIN (SELECT * FROM #temp WHERE BandID > 0) b
ON b.EmployeeID = a.EmployeeID AND b.t2 = a.Freight
END
-- extract
SELECT EmployeeID, MIN(Freight), COUNT(BandID) FROM #temp GROUP BY EmployeeID, BandID ORDER BY EmployeeID
DROP TABLE #temp
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 30, 2007 at 8:28 am
Heh... you're welcome, Chris...
But that's just step one... (make it work)
Here's the code to generate a million row test table for step two... (make it work fast)
--=======================================================================================
-- Create some test data. This is NOT part of the solution... it's just to give us
-- all something common to work with for testing.
--=======================================================================================
--===== If the table that holds the test data already exists drop it
IF OBJECT_ID('TempDB..#MyHead') IS NOT NULL
DROP TABLE #MyHead
--===== Recreate the test table and populate it with test data
SELECT TOP 1000000
RowNum = IDENTITY(INT,1,1),
EmployeeID = CAST(RAND(CAST(NEWID() AS VARBINARY))*1000+1 AS INT),
Freight = CAST(RAND(CAST(NEWID() AS VARBINARY))*10 AS DECIMAL(6,2))
INTO #MyHead
FROM Master.dbo.SysColumns sc1 WITH (NOLOCK),
Master.dbo.SysColumns sc2 WITH (NOLOCK)
ALTER TABLE #MyHead
ADD PRIMARY KEY CLUSTERED (RowNum)
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2007 at 2:18 pm
Using the MyHead table that Jeff provided, I came up with this solution.
I am using SQL 2005 so it might need a few changes to run in older versions.
-- Get Distinct values and create a row number for each record
with
Test as(
SELECT
D
.[EmployeeID]
,D.[Freight]
,row_number() over (partition by [EmployeeID] order by Freight) RowNum
FROM (
select distinct * from [TestDB].[dbo].[MyHead]
) D
)
-- Here we go
select
t3
.[EmployeeID],
-- Get the row number where the group starts
min(t3.RowNum) RowStart,
-- Get the count of items in the group using the row number difference
t3
.MaxRow - min(t3.RowNum)+1 RowCnt
from
(
Select
t1
.[EmployeeID],
-- Temporary group start
T1
.RowNum,
-- Group End
Max(t2.RowNum) MaxRow
from
-- Get the data using an intentional cross join for all records for the same employee
test t1
inner join
test t2
on t1.[EmployeeID] = t2.[EmployeeID]
where
-- Check for freight difference equal to the difference between row numbers * 0.01
t2
.freight = t1.freight + (0.01 * (t2.RowNum - t1.RowNum))
-- Check for diffent row numbers
and t1.RowNum < t2.RowNum
group by
t1
.[EmployeeID],
t1
.RowNum
) as t3
group
by
t3
.[EmployeeID],
t3
.MaxRow
order
by
t3
.[EmployeeID],
t3
.MaxRow
-- That was FUN
-- Antonio Macedo
May 30, 2007 at 8:06 pm
Dang... I still don't have access to SQL Server 2005 ... I love the CTE's that use that wonderful Rownumber Over Partition...
Antonio... can you tell us how long that took for the million row example? Thanks...
--Jeff Moden
Change is inevitable... Change for the better is not.
May 31, 2007 at 2:32 am
-- Prepare sample data
DECLARE @ TABLE (EmployeeID INT, Freight SMALLMONEY)
INSERT @
SELECT 1, 4.99 UNION ALL
SELECT 1, 0.21 UNION ALL
SELECT 1, 4.03 UNION ALL
SELECT 1, 1.36 UNION ALL
SELECT 1, 0.93 UNION ALL
SELECT 1, 4.98 UNION ALL
SELECT 1, 1.35 UNION ALL
SELECT 1, 0.45 UNION ALL
SELECT 1, 1.51 UNION ALL
SELECT 1, 4.99 UNION ALL
SELECT 1, 1.66 UNION ALL
SELECT 1, 2.50 UNION ALL
SELECT 1, 4.27 UNION ALL
SELECT 1, 3.94 UNION ALL
SELECT 1, 4.41 UNION ALL
SELECT 1, 1.27 UNION ALL
SELECT 1, 7.46
-- Initialize CTE
;
WITH cte
AS (
SELECT DISTINCT EmployeeID,
Freight,
DENSE_RANK() OVER (PARTITION BY EmployeeID ORDER BY Freight) AS RecordNumber
FROM @
)
-- Show the data
select
c1.EmployeeID,
c1.Freight
from cte as c1
inner join cte as c2 on c2.freight - c1.freight = 0.01
order by c1.EmployeeID,
c1.Freight
N 56°04'39.16"
E 12°55'05.25"
May 31, 2007 at 5:44 am
Can't test it, but that's SO easy! Way to go, Peter... can't wait to get my hands on 2k5!
--Jeff Moden
Change is inevitable... Change for the better is not.
May 31, 2007 at 12:55 pm
Jeff,
My code took 11 minutes to process the 1000000 rows table.
That was on a single Pentium 4, 3Ghz, 1.5Gb RAM, single processor laptop.
Peter, nice use of the dense_rank function, I snoozed on that. But you forgot to return the count of consecutive distinct items that was required by the original post.
Antonio Macedo
May 31, 2007 at 5:03 pm
Ah... how rude of me... forgot to post how long my code took... Using SQL Server Developer's Edition (same as Enterprise Edition), sp4, 1.8 Ghz single cpu, 1 Gig RAM, twin 80 Gig IDE HD's... LDF files on same drive as the MDF's. Duration for the million row test (Grid mode of Query Analyzer) NOT including the time to build the million row test was about 1 minute and 31 seconds... returned 147,209 rows (will vary slightly for you because the test data is made randomly).
If you're using SQL 2K5, I'll just bet that Peter's DENSE_RANK solution will beat the tar out of that...
Peter, any chance of you doing a test run on the million rows? Thanks...
--Jeff Moden
Change is inevitable... Change for the better is not.
May 31, 2007 at 7:16 pm
Ouch... I obviously did not play enough with Jeff's solution before posting, my bad...
Jeff, your solution is outstanding.
Thanks
Antonio Macedo
May 31, 2007 at 8:01 pm
Jeff, there are no miracles.
CTE with ranking is nothing else but #table with IDENTITY column. Don't you think?
CTE just makes it looking familiar to OO programmers who quickly catch known syntax and can easily use it.
It does not bring any value to SQL programmers who don't need OO-related hints.
_____________
Code for TallyGenerator
May 31, 2007 at 8:43 pm
Sergiy, I avoided using temporary tables for no reason. I looks like they might offer better performance.
CTE just places the block that runs first at the top of the code, so it's easier to read.
Jeff, I got to understand your code, but found a bug in it.
Using your short test table, if you change the following:
>SELECT 3,9.07 UNION ALL --just to show...
To
>SELECT 3,9.08 UNION ALL --just to show...
The grouping would leak from one employee to the next.
I think that the "trick" part needs an extra variable for the LastEmpolyeeID.
Antonio Macedo
May 31, 2007 at 9:10 pm
Antonio, you don't use #table explicitly in code, but it does not mean they are not used in your query.
CTE just hides it behind the scene, yes, I agree, it's a nice wrap made to please your eyes. But it comes with it's cost. Cost of performance losses.
Not a problem if your system never gonna get to million rows scale.
_____________
Code for TallyGenerator
Viewing 15 posts - 16 through 30 (of 45 total)
You must be logged in to reply to this topic. Login to reply