Matching one set with another

  • Here's a little corker of a problem that has got me.  I can't come up with a solution, although I'm sure it is possible, so I'm cheating by posting it here.

    We have an app that manages its permissions by storing rights in a table. (In an Oracle db, but I've exported it to SQL).

    /****** Object:  Table [dbo].[user_access]    Script Date: 12/01/2006 11:35:04 ******/
    if not exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[user_access]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
     BEGIN
    CREATE TABLE [user_access] (
     [INGRES_USERNAME] [varchar] (8000) COLLATE Latin1_General_CI_AS NULL ,
     [FUNCTION_CODE] [varchar] (8000) COLLATE Latin1_General_CI_AS NULL ,
     [QUERY_YN] [varchar] (8000) COLLATE Latin1_General_CI_AS NULL ,
     [CREATE_YN] [varchar] (8000) COLLATE Latin1_General_CI_AS NULL ,
     [UPDATE_YN] [varchar] (8000) COLLATE Latin1_General_CI_AS NULL ,
     [DELETE_YN] [varchar] (8000) COLLATE Latin1_General_CI_AS NULL ,
     [UPDATE_COUNT] [varchar] (8000) COLLATE Latin1_General_CI_AS NULL 
    ) ON [PRIMARY]
    END
    GO

    There are 'Template' users that have a number of rows in this table, on average 600 each!  (I know, it's not good).

    When a new user is created, one of the 'template users is selected and the rows are copied to the new user.

    All user names are 5 letters long, template users are named like TBCCU, TBCDA, , TTRAI, TBCCA whereas the 'real' users have initals plus numerics in the name, AA003, DJ002, YK004 etc.

    The challange is to go through the table, matching each user to the template that created it.  i.e, for one set, find the matching set that it was copied from. I want to do this with no cursors, & preferably no loops, but I've run out of talent.

    This is what I've got thus far:

    SELECT distinct z.INGRES_USERNAME, y.INGRES_USERNAME
    FROM [dbo].[user_access] z
    inner Join
     (SELECT distinct a.*
     FROM [zDBA].[dbo].[user_access] a
     WHERE EXISTS (SELECT *
     FROM [zDBA].[dbo].[user_access] b
     WHERE a.INGRES_USERNAME = b.INGRES_USERNAME
      AND a.FUNCTION_CODE = b.FUNCTION_CODE 
      and a.QUERY_YN = b.QUERY_YN 
      and a.CREATE_YN = b.CREATE_YN 
      and a.UPDATE_YN = b.UPDATE_YN 
      and a.DELETE_YN = b.DELETE_YN 
      and a.UPDATE_COUNT = b.UPDATE_COUNT
      and IsNumeric(Right(a.INGRES_USERNAME,3)) = 0 )) y
    on z.FUNCTION_CODE = y.FUNCTION_CODE 
    and z.QUERY_YN = y.QUERY_YN 
    and z.CREATE_YN = y.CREATE_YN 
    and z.UPDATE_YN = y.UPDATE_YN 
    and z.DELETE_YN = y.DELETE_YN 
    and z.UPDATE_COUNT = y.UPDATE_COUNT
    

    The table has about 350,000 rows and has a clustered index across all columns, so the cost is 9.4, but it takes forever to run & the CPU usage goes through the roof.  Any body with a better understanding of set theory got a better solution?

    Dave J

     


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • If I understand you correctly, I have seen a similar problem before... Does the following work?

     

     

    select dt1.username, dt2.template from

    (

    select INGRES_USERNAME as username, count(*) as cnt from user_access

    where

    IsNumeric(Right(INGRES_USERNAME,3)) = 1

    group by INGRES_USERNAME

    )

    dt1

    cross join

    (

    select INGRES_USERNAME as template, count(*) as cnt from user_access

    where

    IsNumeric(Right(INGRES_USERNAME,3)) = 0

    group by INGRES_USERNAME

    )

    dt2

    inner join

    (

    select a.INGRES_USERNAME as username, b.INGRES_USERNAME as template, count(*) as cnt

    from user_access a cross join user_access b

    where

    IsNumeric(Right(a.INGRES_USERNAME,3)) = 1

    and IsNumeric(Right(b.INGRES_USERNAME,3)) = 0

    and a.FUNCTION_CODE = b.FUNCTION_CODE

    and a.QUERY_YN = b.QUERY_YN

    and a.CREATE_YN = b.CREATE_YN

    and a.UPDATE_YN = b.UPDATE_YN

    and a.DELETE_YN = b.DELETE_YN

    and a.UPDATE_COUNT = b.UPDATE_COUNT

    group by a.INGRES_USERNAME, b.INGRES_USERNAME

    )

    dt3

    on dt3.username = dt1.username and dt3.template = dt2.template

    where dt3.cnt = dt1.cnt and dt3.cnt = dt2.cnt

  • It certainly seems to.  Thank you

    And it runs in 21 seconds too

    Of course they have checked the results and shifted the goalposts

    They wouild like to know which users do not match any template!  That is, users who have had additional rights added. 

    I've wrapped your query up in brackets and Anti joined it, but that was a silly thing to do after thinking about it while staring at the screen waiting for it to return.

    I'll try & work out how yours works and see where I need to flip an equality operator, but be warned I may be asking again...

    Thanks again.

    Dave J


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • Right, Ive used your query as the basis for a NOT in Set. 

    No where near as elegant as yours, but it runs in 25 seconds & meets the new requirement perfectly.

    Once again many thanks

    Dave J

    --Find users that do not match any tempate
    if object_id('tempdb..#temp') is not null
        drop table #temp
    select dt1.username into #temp
    from
    (
    select INGRES_USERNAME as username, count(*) as cnt from user_access
    where
    IsNumeric(Right(INGRES_USERNAME,3)) = 1
    group by INGRES_USERNAME
    )
    dt1
    cross join
    (
    select INGRES_USERNAME as template, count(*) as cnt from user_access
    where
    IsNumeric(Right(INGRES_USERNAME,3)) = 0
    group by INGRES_USERNAME
    )
    dt2
    inner join
    (
    select a.INGRES_USERNAME as username, b.INGRES_USERNAME as template, count(*) as cnt
    from user_access a cross join user_access b
    where
    IsNumeric(Right(a.INGRES_USERNAME,3)) = 1
    and IsNumeric(Right(b.INGRES_USERNAME,3)) = 0
    and a.FUNCTION_CODE = b.FUNCTION_CODE
    and a.QUERY_YN = b.QUERY_YN
    and a.CREATE_YN = b.CREATE_YN
    and a.UPDATE_YN = b.UPDATE_YN
    and a.DELETE_YN = b.DELETE_YN
    and a.UPDATE_COUNT = b.UPDATE_COUNT
    group by a.INGRES_USERNAME, b.INGRES_USERNAME
    )
    dt3
    on dt3.username = dt1.username and dt3.template = dt2.template
    where dt3.cnt = dt1.cnt and dt3.cnt = dt2.cnt
    select INGRES_USERNAME from user_access u 
    left outer join #temp t 
    on u.INGRES_USERNAME = t.username 
    where
    IsNumeric(Right(INGRES_USERNAME,3)) = 1
    and t.username is Null
    group by INGRES_USERNAME


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • Glad you worked it out , it's more complicated than the first problem. I would probably have attempted a similar solution...You might however want to write select distinct dt1.username into #temp, that could be a bit faster....

     

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply