T-sql question

Question

Post reply

T-sql question

Mh-397891

SSCertifiable

Points: 6704
More actions
June 25, 2008 at 11:30 am

#70869

The follow is the data I have in a single table:
createdateIDcolumnCustnumber
12/1/20052100
5/1/2007 1100
5/29/20072100
5/30/20073100
6/1/120074100
6/28/20075100
2/1/2006 2200
6/1/200071200
6/5/2007 2200
6/7/2007 3200
2/5/2005 4300
6/10/20081300
For the above data, if the first row's createdate is less than second row's createdate for the same custnumber 100,
then the IDcolumn of the first row should be updated to 1, and accordingly if the second row's createdate is less than
3rd row, then second row's IDcolumn should be 2 etc., the following is the output I am expecting:
createdateIDcolumnCustnumber
12/1/20051100
5/1/2007 2100
5/29/20073100
5/30/20074100
6/1/120075100
6/28/20076100
2/1/2006 1200
6/1/200072200
6/5/2007 3200
6/7/2007 4200
2/5/2005 1300
6/10/20082300
Any ideas are greatly appreciated, Thanks!!

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Wildcat SSCarpal Tunnel Points: 4774 More actions · Answer 1

I happened to have one: :w00t:

declare @temp table

(

Createdate varchar(10),

IDColumn smallint,

CustNumber int

)

-- INSERT DATA for testing

insert into @temp values('12/1/2005', 2, 100)

insert into @temp values('5/1/2007', 1, 100)

insert into @temp values('5/29/2007', 2, 100)

insert into @temp values('5/30/2007', 3, 100)

insert into @temp values('6/1/12007', 4, 100)

insert into @temp values('6/28/2007', 5, 100)

insert into @temp values('2/1/2006', 2, 200)

insert into @temp values('6/1/20007', 1, 200)

insert into @temp values('6/5/2007', 2, 200)

insert into @temp values('6/7/2007', 3, 200)

insert into @temp values('2/5/2005', 4, 300)

insert into @temp values('6/10/2008', 1, 300)

-- CREATE A TEMP TABLE holding the sorted data

create table #temp

(

ID int identity(1, 1),

Createdate varchar(10),

IDColumn smallint,

CustNumber int

)

INSERT INTO #temp(CREATEDATE, IDColumn, CustNumber)

SELECT *

FROM @temp

ORDER BY CustNumber, CreateDate, IDColumn

UPDATE A -- UPDATE every first record of CustNumber

SET IDColumn = 1

FROM #temp A

INNER JOIN

(SELECT CustNumber, MIN(CreateDate) AS CreateDAte

FROM @temp

GROUP BY CustNumber

) B

ON A.CustNumber = B.CustNumber AND A.CreateDate = B.CreateDate

declare @row int

SELECT @row = MAX(X.CT)

FROM (SELECT CustNumber, COUNT(IDColumn) AS CT

FROM @temp

GROUP BY CustNumber) X

WHILE @row > 0

BEGIN

UPDATE A

SET A.IDColumn = B.IDColumn + 1

FROM #temp A

INNER JOIN #temp B

ON A.ID = B.ID + 1 AND A.CustNumber = B.CustNumber

SET @row = @row - 1

END

UPDATE A

SET A.IDColumn = B.IDColumn

FROM @temp A

INNER JOIN #temp B

ON A.CustNumber = B.CustNumber AND A.CreateDate = B.CreateDate

---- CHECKING

select * from @temp

drop table #temp

go

--******** RESULT*********--

Createdate IDColumn CustNumber

---------- -------- -----------

12/1/2005 1 100

5/1/2007 2 100

5/29/2007 3 100

5/30/2007 4 100

6/1/12007 5 100

6/28/2007 6 100

2/1/2006 1 200

6/1/20007 2 200

6/5/2007 3 200

6/7/2007 4 200

2/5/2005 1 300

6/10/2008 2 300

(12 row(s) affected)

Carl Federl One Orange Chip Points: 25384 More actions · Answer 2

Try this:

UPDATE#temp

setIDColumn

= (Select count(*)

from #temp as CustPrior

whereCustPrior.CustNumber = #temp.CustNumber

and (CustPrior.Createdate < #temp.Createdate

or ( CustPrior.Createdate = #temp.Createdate

andCustPrior.IDColumn<= #temp.IDColumn

)))

SQL = Scarcely Qualifies as a Language

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 3

Good code... but be careful when you get a lot of rows or the groups of rows become large. Take a look at the following article...

http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Wildcat SSCarpal Tunnel Points: 4774 More actions · Answer 4

Wildcat

SSCarpal Tunnel

Points: 4774

June 27, 2008 at 3:28 pm

#835649

Nice article, Jeff.:D

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 5

Aye. Thanks for the feedback... I sometimes wonder if posts like that do any good. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Carl Federl One Orange Chip Points: 25384 More actions · Answer 6

Jeff is quite correct on the performance affect of a triangular join so I decided to compare using the RANK window function available under SQL Server 2005 and 2008.

For 100 Customers and 1,000 rows for each customer, the resoures used are:

[font="System"]

TableScanslogical reads

TriangleRankTriangleRank

#temp1000012 805064 574

WorkTable1 2203266405932

MillisecondsCPU msElapsed

Triangle35,95136,734

rank 1,462 1,508

[/font]

In short, using RANK is 36 times better than using a Triangle Join.

set statistics io off

set statistics time off

go

IF object_id('tempdb..#temp') is not null drop table #temp

create table #temp

(CustNumberintegernot null

,CreateDatedatetimenot null

,IDColumnsmallintnot null

, constraint temp_p primary key (CustNumber,Createdate))

INSERT into #temp

(CustNumber,CreateDate, IDColumn )

selectCustomer.N

,CAST( '2000-01-01' as datetime) + DayAdd.N

,0 as IDColumn

frommaster.dbo.TALLY as Customer

cross join

master.dbo.TALLY as DayAdd

whereCustomer.N between 1 and 100

andDayAdd.N between 1 and 1000

update statistics #temp

go

set statistics io on

set statistics time on

-- SQL Server 2000

UPDATE #temp

set IDColumn

= (Select count(*)

from #temp as CustPrior

where CustPrior.CustNumber = #temp.CustNumber

and ( CustPrior.Createdate < #temp.Createdate

or ( CustPrior.Createdate = #temp.Createdate

and CustPrior.IDColumn <= #temp.IDColumn

)))

-- SQL Server 2005 or 2008

UPDATE#temp

SETIDColumn = New.NewRank

from#tempas Old

JOIN(select Ranker.CustNumber

,Ranker.CreateDate

,RANK ( ) OVER (PARTITION BY Ranker.CustNumber order by Ranker.CreateDate) as NewRank

from#temp as Ranker

) as New

onNew.CustNumber= Old.CustNumber

andNew.CreateDate= Old.CreateDate

andNew.NewRank<> Old.IDColumn

SQL = Scarcely Qualifies as a Language

Carl Federl One Orange Chip Points: 25384 More actions · Answer 7

Just as a follow-up, here are the resource usages when performing a SELECT instead of an UPDATE

[font="System"]

ResourceScanlogical readsCPU MSElapsed MS

Triangle Join:100001 455,648 23,103 24,798

Rank 1 287 100 899

[/font]

set statistics io on

set statistics time on

-- Triangle Join

SelectCustomer.CustNumber

,Customer.Createdate

,count(*) as NewRank

from#temp as Customer

join#temp as CustomerPrior

on CustomerPrior.CustNumber = Customer.CustNumber

and CustomerPrior.Createdate <= Customer.Createdate

group by Customer.CustNumber

,Customer.Createdate

-- Rank Window for SQL Server 2005 or 2008

selectCustomer.CustNumber

,Customer.CreateDate

,RANK ( ) OVER (PARTITION BY Customer.CustNumber order by Customer.CreateDate) as NewRank

from#temp as Customer

SQL = Scarcely Qualifies as a Language

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 8

If you change Carl's code to 2000 per customer, you see where Triangular Joins really start to fall apart... time goes up by a factor of 4 for the Triangular join... time for the RANK stays just about the same. And that's the way it is with Triangular joins... double the number of rows, the amount of time goes up by a factor of about 4.

If you need to do ranking type functions in SQL Server 2000, please see the following article. The neat thing is that the code in the article is nearly as fast as that of the Rank function and is directly portable to SQL Server 2005...

http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)