Trouble with Distinct in SQL2000

  • Im trying to get 2005 but the expense is a little high for me. I want to make it a concatenated PK with ClientNumber and ClientAcctNumber for the TMP9999 table. That would be the only guaranteed unique data in this file.

  • Also I will be running this query on a monthly basis for a report. I wanted to start with this and see how I could achieve unique data first.

  • select 6035IN,101244488,13,6/9/2009,711,0,,,0,0,DUP,6/12/2009,711,0,5/26/2008,9999,,UNION ALL

    Select 6035BD,101244488,,10/24/2008,711,0,,,0,0,HRT,3/2/2009,711,0,5/26/2008,9999,,UNION ALL

    I think the string and Date values need a ' ' ...:-)


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • blambu56 (11/11/2009)


    I want to make it a concatenated PK with ClientNumber and ClientAcctNumber for the TMP9999 table. That would be the only guaranteed unique data in this file.

    But the combination of ClientNumber and ClientAcctNumber currently is not unique. If it was, we wouldn't be trying to get rid of duplicates. So as the data currently stands, is there no combination of columns that is unique?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ahhhhh - you are correct sir. Sorry I missed that. I need to add an INT column then to make a unique identifier for this then? If so, how would I? I have never done that before.

  • I hope i can redeem myself with this final post and hopefully find a solution to this query.

    Here is my code reworked and now the create table works fine and I figured out how to script so you all can easily copy and paste the code and build the table. What I want to do is select the records that are distinct at the ClientNumber and ClientAcctNumber level. My PK field is just that PrimKey. If you look at the table you will notice there are instances of three duplicate acct numbers:

    6062IN,5591018

    6062BD,5591018

    6062BD,5591018

    I want to make those three records into two unique records:

    6062IN,5591018

    6062BD,5591018

    CREATE TABLE tmp9999

    (PrimKey int PRIMARY KEY IDENTITY,

    ClientNumber VARCHAR(10) null,

    ClientAcctNumber VARCHAR(20) null,

    Period VARCHAR (8) null,

    FinancialClass VARCHAR(10) null,

    Listdate VARCHAR(10) null,

    InitialBalance VARCHAR(25) null,

    CurrentBalance VARCHAR(25) null,

    PaymentDate VARCHAR(10) null,

    PayType VARCHAR(12) null,

    AmtApplied VARCHAR(25) null,

    NumberOfPayments INT null,

    Cancel VARCHAR(13) null,

    DateCancelled VARCHAR(10) null,

    AmtCancelled VARCHAR(25) null,

    BalanceAfterTrans VARCHAR(25) null,

    ServiceDate VARCHAR(10) null,

    Disposition VARCHAR(20) null,

    InsuranceName VARCHAR(255) null

    );

    Set Identity_Insert tmp9999 on

    Insert into tmp9999

    (PrimKey,ClientNumber, ClientAcctNumber, Period, FinancialClass,

    ListDate, InitialBalance, CurrentBalance, PaymentDate, PayType,

    AmtApplied, NumberOfPayments, Cancel, DateCancelled, AmtCancelled,

    BalanceAfterTrans,ServiceDate, Disposition, InsuranceName)

    select '1','6062BD','5050896','','99','6/8/2009','744.45','0','','','0','0','HRT','9/15/2009','744.45','0','5/4/2008','9999','' UNION ALL

    select '2','6062IN','5050896','','','5/27/2008','744.45','0','','','0','0','CPC','9/24/2008','744.45','0','5/4/2008','9999','' UNION ALL

    select '3','6062IN','5591018','','13','10/19/2009','271.2','271.2','','','0','0','','','0','0','10/9/2008','3NEW','' UNION ALL

    select '4','6062BD','5591018','','','1/12/2009','921.91','0','7/10/2009','DP','484.77','1','HRT','7/27/2009','271.2','437.14','10/9/2008','9999','' UNION ALL

    select '5','6062BD','5591018','','','1/12/2009','921.91','0','7/10/2009','CRJ','165.94','1','HRT','7/27/2009','271.2','271.2','10/9/2008','9999','' UNION ALL

    select '6','6062MC','5628031','','12','10/6/2009','50','50','','','0','0','','','0','0','10/20/2008','3LTR','' UNION ALL

    select '7','6062BD','5628031','','','3/24/2009','1926.1','0','6/17/2009','DP','485.13','1','HRT','8/31/2009','50','1440.97','10/29/2008','9999','' UNION ALL

    select '8','6062BD','5628031','','','3/24/2009','1926.1','0','6/17/2009','CRJ','1321.14','1','HRT','8/31/2009','50','119.83','10/29/2008','9999','' UNION ALL

    select '9','6062BD','5628031','','','3/24/2009','1926.1','0','6/24/2009','CRJ','69.83','1','HRT','8/31/2009','50','50','10/29/2008','9999','' UNION ALL

    select '10','6062IN','5632977','','13','10/20/2009','75','0','','','0','0','DUP','10/29/2009','75','0','10/21/2008','9999','' UNION ALL

    select '11','6062BD','5632977','','99','4/14/2009','8361.92','0','10/9/2009','CRJ','1505.15','1','HRT','10/26/2009','75','6856.77','10/21/2008','9999','' UNION ALL

    select '12','6062BD','5632977','','99','4/14/2009','8361.92','0','10/9/2009','DP','6781.77','1','HRT','10/26/2009','75','75','10/21/2008','9999','' UNION ALL

    select '13','6035IN','101148117','','13','8/21/2009','4149.75','4149.75','','','0','0','','','0','0','4/9/2008','3LN1','' UNION ALL

    select '14','6035MC','101148117','','1','10/9/2008','279.46','0','12/3/2008','DBJ','3253.69','0','HRT','5/21/2009','4149.75','3533.15','4/9/2008','9999','MEDICARE' UNION ALL

    select '15','6035MC','101148117','','1','10/9/2008','279.46','0','12/3/2008','DBJ','616.6','0','HRT','5/21/2009','4149.75','4149.75','4/9/2008','9999','MEDICARE' UNION ALL

    select '16','6035IN','101244488','','13','6/9/2009','711','0','','','0','0','DUP','6/12/2009','711','0','5/26/2008','9999','' UNION ALL

    select '17','6035BD','101244488','','','10/24/2008','711','0','','','0','0','HRT','3/2/2009','711','0','5/26/2008','9999','' UNION ALL

    select '18','6035IN','101422296','','13','7/30/2009','360.75','0','','','0','0','DUP','8/17/2009','360.75','0','8/27/2008','9999','' UNION ALL

    select '19','6035BD','101422296','','99','5/8/2009','360.75','150','8/10/2009','DP','37.59','1','','','0','323.16','8/27/2008','3FUP','' UNION ALL

    select '20','6035BD','101422296','','99','5/8/2009','360.75','150','8/10/2009','CRJ','173.16','1','','','0','150','8/27/2008','3FUP','' UNION ALL

    select '21','6035IN','101500850','','13','10/15/2009','277','0','','','0','0','DUP','10/29/2009','277','0','10/7/2008','9999','' UNION ALL

    select '22','6035BD','101500850','','','3/10/2009','277','277','','','0','0','','','0','0','10/7/2008','3700',''

    Set Identity_Insert tmp9999 off

    Please let me know if this code does not work. It works on my side.

  • Agreed, the script works fine and good work for making it to here.

    I ran the script, and got 22 Records. That's all I have as as of now.....

    regarding your post mentioning this

    6062IN,5591018

    6062BD,5591018

    6062BD,5591018

    I want to make those three records into two unique records:

    6062IN,5591018

    6062BD,5591018

    It is simple to get that with a distinct statement, the following will give you that and you must have figured it out by yourself.

    SELECT DISTINCT ClientNumber, ClientAcctNumber FROM tmp9999

    So the question comes down to this, what is that you want to achieve here. So went back from the first post of yours in this thread and tried to understand. So I did some research but still not able to figure out.

    So I narrowed it down to this query.

    Select * from tmp9999 WHERE ClientNumber = '6062BD' AND ClientAcctNumber = '5628031'

    It will display 3 records, in that recordset, my question is what do you want to do on other columns?

    In your previous replies you replied that you would apply SUM on the columns, so it can be applied on only the columns InitialBalance, AmtApplied, AmtCancelled, BalanceAfterTransfer.

    So please run the query mentioned above and let me know what is that you want by indicating the corresponding column values.


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • Thank you sir.

    I want all the records in the table for those distinct records. Here is the query that I tried to run but only received duplicates again.

    select distinct ClientNumber, ClientAcctNumber, ListDate, InitialBalance, CurrentBalance, PaymentDate, PayType, AmtApplied, Cancel, DateCancelled, AmtCancelled, BalanceAfterTrans, ServiceDate, Disposition

    from TMP9999

    For example for Clientnumber and Clientacctnumber: 6062BD - 5628031 :Here is the output I would prefer

    6062BD,5628031,3/24/2009,1926.1,0,6/24/2009,CRJ,69.83,HRT,8/31/2009,50,50,10/29/2008,9999

    The fields above are as follows: ClientNumber, ClientAcctNumber, ListDate, InitialBalance, CurrentBalance, PaymentDate, PayType, AmtApplied, Cancel, DateCancelled, AmtCancelled, BalanceAfterTrans, ServiceDate, Disposition.

    I am going to sum these later on but want to get the distinct records in their own table first. There are about 15 different sums that I will be performing per ClientNumber so it would be nice to have them in their own separate table.

  • From my understanding, you are planning to move the distinct from this table to a new table (Step1) and then apply 15 different Summations / Aggregation on top of that table (Step2)

    What you need to know is, finding distinct with almost all the columns will result into another table which is of 75-80 % of the size of the original table, because the more columns combination you include to make it distinct, the more distinct records it give you, as I have seen that atleast one or the other values in each row is changed resulting for that record to be a distinct one.

    Instead of that I would prefer to doing your aggregations on this table rightaway.


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • I understand your concern with that but the columns that I truly need for the summations do not change when one duplicate record is deleted. Is it more efficient in SQL to do it this way then save the summations in their own individual tables? I can give you the constraints but the data set I provided is only 22 records out of 250,000 records that I am working with. My plan was to reduce that 250,000 records from what we call out staging table to another smaller table in terms of records to make the processing faster.

  • Your concern is can be understandable, but if it is a table of Quarter of a million records it should be ok to run on top of it rightaway, when you run this distinct query on the staging table, how many records are you getting?


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • Here you go. You can use this as a subquery for your aggregations or insert the results into another table.

    SELECT tmp9999.*

    FROM tmp9999 INNER JOIN

    (SELECT MAX(PrimKey) AS MaxKey, ClientNumber, ClientAcctNumber

    FROM tmp9999

    GROUP BY ClientNumber, ClientAcctNumber) sub ON tmp9999.PrimKey = sub.MaxKey

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks, I am going to dissect that query to figure out why it works. Sorry for the rookie post when I started this thread. I will do better on my next thread. Thanks again guys for all the help.

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply