Getting row duplications based on conditions

  • Hi all, please i need some help, i have this table that has a sample data like this:

    insert into tbl_radio (station, addate, adtime, Duration) values (

    select 'b104', '12-12-2010', '09:08:10', '45' union all

    select 'b104', '12-12-2010', '09:08:24', '45' union all

    select 'b10', '12-12-2010', '09:08:08', '45' union all

    select 'b10', '12-12-2010', '09:08:10', '45' )

    the problem is that i'm not supposed to have another record in the database, unless the 'Duration' has been exceeded for the adtime. So what i intend to do is to get out the duplicates by using this query:

    SELECT AdDate, AdTime, station, duration

    FROM tbl_Radio

    WHERE (FK_ProductId = 'cge006') AND (FK_BrandId = '0005') AND (AdDate BETWEEN '1 dec 2010' AND '31 dec 2010') and fk_stateid = 'la'

    group by AdDate, AdTime, station, duration

    having count(substring(adtime,1,5)) > 1

    order by station, addate, adtime, duration

    but its not bringing out the results where the adtime are the same e.g '09:08' should have given me four records, i.e two for each station.

    Please how can i get the duplicates to show up.

    Thanks

  • Can you post your expected result of that sample data ? i am finding it quite hard to understand your requirement..

  • may be i did not give you the full structure of the table, let me put it like this:

    insert into tbl_radio (station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid) values

    select 'b104', '12-12-2010', '09:08:10', '45', 'la', 'cge006', '0005' union all

    select 'b104', '12-12-2010', '09:08:24', '45', 'la', 'cge006', '0005' union all

    select 'b10', '12-12-2010', '09:08:08', '45', 'kw', 'cge006', '0005' union all

    select 'b10', '12-12-2010', '09:08:10', '45', 'kw', 'cge006', '0005'

    the query is supposed to return something like this:

    station, addate, adtime, duration

    'b104', '12-12-2010', '09:08:10', '45'

    i.e. its supposed to display the duplicate record that has the condition of substring(adtime,1,5) which equals to '09:08'

    for a particular fk_stateid, fk_productid, and fk_brandid, which means one single record for each duplicate.

    Thanks

  • This?

    ; WITH CTE AS

    (

    SELECT station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid,

    rn = ROW_NUMBER() OVER(PARTITION BY station,substring(adtime,1,5) ORDER BY (SELECT 0))

    FROM tbl_radio

    )

    SELECT station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid

    FROM CTE

    WHERE rn = 1

  • thanks so much for your reply, it actually brought some of the records that has duplicate time, but it did not bring all of them, because i have so much data with that condition.

    if only the query could check if the adtime fall within the same minute, for that state, station and addate, then the problem will be fully solved.

    please look into the query and please suggest other ways i could achieve this. I did some modifications, please look into this:

    ; WITH CTE AS

    (

    SELECT pkid, station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid,

    rn = ROW_NUMBER() OVER(PARTITION BY station,substring(adtime,1,5) ORDER BY (SELECT 0))

    FROM tbl_radio

    )

    SELECT pkid, station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid

    FROM CTE

    WHERE rn = 1 and addate between '1 dec 2010' and '31 dec 2010' and (FK_ProductId = 'cge006')

    AND (FK_BrandId = '0005')

  • You have to cast a date data type to a character data type before you can use SUBSTRING on it. I took the liberty of using DATEPART instead:

    ; WITH CTE AS

    (

    SELECT pkid, station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid,

    rn = ROW_NUMBER() OVER(PARTITION BY station,DATEPART(hour,adtime)*60 + DATEPART(minute,adtime)

    ORDER BY (SELECT 0))

    FROM tbl_radio

    )

    SELECT pkid, station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid

    FROM CTE

    WHERE rn = 1 and addate between '1 dec 2010' and '31 dec 2010' and (FK_ProductId = 'cge006')

    AND (FK_BrandId = '0005')

    If the results of that aren't what you're looking for, please give the results you are looking for.

    John

  • Thanks John, but it brought this error: Arithmetic overflow error converting expression to data type datetime.

    The adtime is nvarchar(8). please what can i do?

    Thanks

  • I see. This is why you should always provide table DDL as well as sample data. It also explains why your code didn't return an error. I created my test table with adtime as time, not nvarchar.

    If you have any influence over database design, the first thing I would do is change the data type of the column to time (in your test environment, of course). This will enable time-related arithmetical operations (such as DATEPART) to be carried out more easily on it.

    Assuming you can't do that, does your code return the same as mine?

    [font="Courier New"]

    pkidstationaddateadtimedurationfk_stateidfk_productidfk_brandid

    3b102010-12-1209:08:08.000000045kwcge0060005

    1b1042010-12-1209:08:10.000000045lacge0060005

    [/font]

    If not, what results does it return? Are my results above correct? If not, why not?

    John

  • Verify that the adtime can be converted to a datetimeformat as you use it to extract the hour & minutes.

    select datepart(hour,convert(datetime,adtime,8))

  • Thanks John, that's exactly the kind of result i'm looking for. I tried using Jo's convertion in the code, but it still gives me the same error. This is how i did it:

    ; WITH CTE AS

    (

    SELECT pkid, station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid,

    rn = ROW_NUMBER() OVER(PARTITION BY station,DATEPART(hour, CONVERT(datetime, AdTime, 8))*60 + DATEPART(minute,adtime)

    ORDER BY (SELECT 0))

    FROM tbl_radio

    )

    SELECT pkid, station, addate, adtime, duration, fk_stateid, fk_productid, fk_brandid

    FROM CTE

    WHERE rn = 1 and addate between '1 dec 2010' and '31 dec 2010' and (FK_ProductId = 'cge006')

    AND (FK_BrandId = '0005')

    I think the problem is the nvarchar(8) field, but its not posible to change the datatype, because i have over a million records, and i don't think sql server will be able to do that convertion from nvarchar(8) to datetime, except we can do in the query then use.

    Thanks Guys

  • Use the appropriate function for the data type you have. Your code is doing a conversion and then applying a function - this will probably be a performance nightmare given that you have a million rows. If you're staying with nvarchar, use SUBSTRING or LEFT or RIGHT.

    However, I don't think the reason you gave is a good reason for not changing the data type. Try changing it, and if it doesn't let you, copy the data to a new table with the correct data types, delete the old table and then rename the new one. Job done... although it'll be a bit more complicated if you have foreign keys and such like. Remember, test first.

    John

  • Hi guys, thanks for all your help, i met this guy called Nate, who provided a solution to the problem.

    CREATE TABLE #tbl_radio (

    AdDate CHAR(12)

    , AdTime CHAR(8)

    , fk_StationId CHAR(4)

    , Duration TINYINT

    )

    INSERT INTO #tbl_radio (AdDate, AdTime, fk_StationId, Duration) VALUES ('12-12-2010', '09:08:05', 'b104', '45')

    INSERT INTO #tbl_radio (AdDate, AdTime, fk_StationId, Duration) VALUES ('12-12-2010', '09:08:06', 'b104', '45') -- FAIL CASE

    INSERT INTO #tbl_radio (AdDate, AdTime, fk_StationId, Duration) VALUES ('12-12-2010', '09:08:07', 'b104', '45') -- FAIL CASE

    INSERT INTO #tbl_radio (AdDate, AdTime, fk_StationId, Duration) VALUES ('12-12-2010', '10:08:05', 'b104', '45')

    INSERT INTO #tbl_radio (AdDate, AdTime, fk_StationId, Duration) VALUES ('12-12-2010', '10:08:06', 'b104', '45') -- FAIL CASE

    INSERT INTO #tbl_radio (AdDate, AdTime, fk_StationId, Duration) VALUES ('12-13-2010', '09:08:05', 'b104', '45')

    INSERT INTO #tbl_radio (AdDate, AdTime, fk_StationId, Duration) VALUES ('12-13-2010', '09:08:06', 'b104', '45') -- FAIL CASE

    GO

    WITH tbl_radio (AdDate, AdTime, fk_StationId, Duration, DateRank, DateDuration)

    AS (

    SELECT AdDate, AdTime, fk_StationId, Duration

    , DENSE_RANK() OVER (PARTITION BY AdDate ORDER BY AdTime)

    , DATEADD(SS,Duration,CONVERT(DATETIME,AdTime))

    FROM #tbl_radio

    )

    SELECT DISTINCT tmp.AdDate, tmp.AdTime, tmp.fk_StationId, tmp.Duration

    FROM tbl_radio tmp

    LEFT OUTER JOIN (

    SELECT t1.fk_StationId

    , t1.AdDate

    , t1.DateRank

    , MIN(t2.DateRank) AS t2DateRank

    FROM tbl_radio t1

    LEFT OUTER JOIN tbl_radio t2

    ON t1.fk_StationId = t2.fk_StationId

    AND t1.AdDate = t2.AdDate

    AND t1.DateRank < t2.DateRank

    AND t2.AdTime NOT BETWEEN CONVERT(DATETIME,t1.AdTime) AND t1.DateDuration

    GROUP BY t1.fk_StationId

    , t1.AdDate

    , t1.DateRank

    ) x ON tmp.fk_StationId = x.fk_StationId

    AND tmp.AdDate = x.AdDate

    AND tmp.DateRank = x.t2DateRank

    WHERE tmp.DateRank = 1

    OR x.t2DateRank IS NOT NULL

    DROP TABLE #tbl_radio

    It solved all my problems.

    Thanks

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply