July 2, 2009 at 10:47 am
GilaMonster (7/2/2009)
ifila (7/2/2009)
I will grab the first record - in this case 'livelink, java'
The one with the lowest ID?
yes
July 2, 2009 at 10:51 am
ifila, we want to help, really. But the sample data doesn't match the table definition. The desired output doesn't match the table definition. Based on that, I cannot write a tested query.
If I've interpreted what you want correctly, this query should get the data you're after.
SELECT OriginalEmail, OriginalResume
FROM (SELECT OriginalEmail, OriginalResume, ROW_NUMBER() OVER (PARTITION BY OriginalEmail ORDER BY ID) AS RowNo
FROM dbo.resumein) sub
WHERE RowNo = 1
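As an illustration only, here is a self-contained sketch of how the ROW_NUMBER() deduplication behaves. The table shape below is an assumption (the real definition of dbo.resumein was never posted, so the column types are guesses):

```sql
-- Hypothetical shape for dbo.resumein; the actual definition was not posted.
CREATE TABLE dbo.resumein (
    ID INT IDENTITY(1,1) PRIMARY KEY,
    OriginalEmail VARCHAR(255),
    OriginalResume VARCHAR(MAX)
);

INSERT INTO dbo.resumein (OriginalEmail, OriginalResume)
VALUES ('a@example.com', 'livelink, java'),
       ('a@example.com', 'livelink, java (duplicate)'),
       ('b@example.com', 'sql, asp');

-- For each email address, keep only the row with the lowest ID:
SELECT OriginalEmail, OriginalResume
FROM (SELECT OriginalEmail, OriginalResume,
             ROW_NUMBER() OVER (PARTITION BY OriginalEmail ORDER BY ID) AS RowNo
      FROM dbo.resumein) sub
WHERE RowNo = 1;
-- Returns 'livelink, java' for a@example.com and 'sql, asp' for b@example.com
```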
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 2, 2009 at 11:10 am
I added a screen shot of my SQL DB.
All resumes are initially imported into dbo.emailaddress which has one big field 'emailaddresstxt'.
READ desktop text folder - WRITE to dbo.emailaddress
At this point the emails are parsed out and written, with a copy of the resume to dbo.outputresume3
READ dbo.emailaddress - WRITE parsed data to dbo.outputresume3
Apologies for my lack of clarity!
July 2, 2009 at 11:23 am
All we're asking for is the definitions of all tables involved (as CREATE TABLE statements), sample data for those tables (as INSERT statements), an example of the results you want based on the sample data, and an explanation of the rules involved.
Do the queries we provided do what you want? If not, please provide what I've requested above.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 2, 2009 at 11:54 am
I just tested Jeffrey's solution and it works! 🙂
Many thanks to both of you for your assistance!
July 2, 2009 at 12:37 pm
I do have a couple of questions. While testing Jeffrey's solution, is there any way to make it run quicker?
It took 35 minutes to process 600,000 records!
How do I run this query so that it uses resources on a different drive?
It was rapidly using ALL my C: drive space!
Thanks
July 2, 2009 at 12:49 pm
ifila (7/2/2009)
I do have a couple of questions. While testing Jeffrey's solution, is there any way to make it run quicker? It took 35 minutes to process 600,000 records!
How do I run this query so that it uses resources on a different drive?
It was rapidly using ALL my C: drive space!
Thanks
Is that returning 600,000 rows - or does the table contain 600,000 rows?
As for running out of space on your C: drive, that is going to happen because SSMS has to put the output results somewhere. This ends up being your temp directory - and if you want it redirected, you need to move your temp directory to another drive.
The query that I provided is not filtered in any way other than eliminating those rows you said you didn't want. If that equates to more than 600,000 rows - then it is going to take that long to return the results.
Oh, one more thing - if you had bothered to review the article we mentioned and posted the table definitions (with indexes), we could offer better advice. As it is, we have no idea whether or not you even have any indexes on this table. If you don't have a clustered index on your ID and a non-clustered index on your email1 column, I can't see how this query could be improved.
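As a sketch only, the two indexes described above could be created as follows. The table and column names (dbo.resumein, ID, email1) are taken from the thread and may not match the actual schema:

```sql
-- Clustered index on ID (often this is simply the primary key):
CREATE CLUSTERED INDEX CIX_resumein_ID
    ON dbo.resumein (ID);

-- Non-clustered index on email1 to support partitioning/grouping by email:
CREATE NONCLUSTERED INDEX IX_resumein_email1
    ON dbo.resumein (email1);
```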
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
July 2, 2009 at 1:09 pm
Jeffrey Williams (7/2/2009)
ifila (7/2/2009)
I do have a couple of questions. While testing Jeffrey's solution, is there any way to make it run quicker? It took 35 minutes to process 600,000 records!
How do I run this query so that it uses resources on a different drive?
It was rapidly using ALL my C: drive space!
Thanks
Is that returning 600,000 rows - or does the table contain 600,000 rows?
I had to stop after 600k; the DB has 1.4 million records.
As for running out of space on your C: drive, that is going to happen because SSMS has to put the output results somewhere. This ends up being your temp directory - and if you want it redirected, you need to move your temp directory to another drive.
Thanks
The query that I provided is not filtered in any way other than eliminating those rows you said you didn't want. If that equates to more than 600,000 rows - then it is going to take that long to return the results.
Oh, one more thing - if you had bothered to review the article we mentioned and posted the table definitions (with indexes), we could offer better advice. As it is, we have no idea whether or not you even have any indexes on this table. If you don't have a clustered index on your ID and a non-clustered index on your email1 column, I can't see how this query could be improved.
I did review the article and believed what I provided in my attachment was a table definition. It shows the only index field I have. I also mentioned I was using a full-text index (FTI) on the resume field.
Thanks
July 2, 2009 at 1:15 pm
ifila (7/2/2009)
I do have a couple of questions. While testing Jeffrey's solution, is there any way to make it run quicker? It took 35 minutes to process 600,000 records!
How do I run this query so that it uses resources on a different drive?
It was rapidly using ALL my C: drive space!
Thanks
Are you saying that your database files (mdf/ldf) reside on the C: drive and are growing to fill the entire disk? What sort of disk I/O system do you have?
July 2, 2009 at 1:27 pm
No, I have:
SQL Server program files on - C drive
.ldf on an array on E drive
.mdf on an array on F drive
tempdb on an array on G drive
The query was filling up my C: drive. It used 10 GB in 30 minutes; I only had 10 GB more available, and my DB is 35 GB, so I did not want to use all my C: drive space.
Thanks
July 2, 2009 at 1:49 pm
I would imagine that putting an index on the email column would do nothing but help performance since, internally, it will already have an ordered list to base the ROW_NUMBER on. As for it filling up your local disk: why do you want or need all the results in the SSMS results tab? Why not put them in a #table, where they will be much easier to process for whatever your next step is.
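A minimal sketch of the #table approach, reusing the ROW_NUMBER() query from earlier in the thread (the dbo.resumein table and its columns are as assumed there):

```sql
-- Materialize the deduplicated rows into a temp table (stored in tempdb,
-- which in this setup lives on the G: drive) instead of streaming all
-- 1.4 million rows to the SSMS results grid on the local machine.
SELECT OriginalEmail, OriginalResume
INTO #dedup
FROM (SELECT OriginalEmail, OriginalResume,
             ROW_NUMBER() OVER (PARTITION BY OriginalEmail ORDER BY ID) AS RowNo
      FROM dbo.resumein) sub
WHERE RowNo = 1;

-- Spot-check a handful of rows rather than returning the whole result set:
SELECT TOP (100) * FROM #dedup;
```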
July 2, 2009 at 2:07 pm
I am still trying to figure out the best solution !
It is a work in progress 🙂
The only reason I did not use a temp table is that I wanted to verify the results.
Since it is taking longer than I would like, I may have to wait until the user selects certain records and run this code on a much smaller record group, or create the equivalent code in ASP.
Thanks