Help with a Select - can''t see the trees for the splinters at the moment!

  • Firstly I have to apologise for a really stupid question - I just know I'm going to be kicking myself when I see the replies but I've been doing so much development work over the past few weeks that I think I've gone SQL blind (and I'm only just learning it so please bare with me).

    Anyway, I've got three tables that I'm querying:

    1) CandidateDetails (contains a CandidateID, name etc)

    2) Qualifications (contains a QualificationID, and Qualification name etc)

    3) Candidate_Qualifications (contains a CandidateID and QualificationID which obviously stores details of what qualifications a candidate has).

    What I'm trying to get is a list of candidates that have qualifications A, B, and C (i.e they MUST have ALL of those qualifications) and who preferably have qualifications D and E.

    At the moment all I've been able to get is either a list of candidates who have any of the 'must-have' qualifications, but not necessarily all of them, or nothing - please help me (and excuse my ignorance - as I say, I'm still learning)

    Thanks in advance

    Martin

  • Please bare with me as this script is quite long. But the sp at the end is extremely versatile and this is what you need in this case :

    IF Object_id('fnSplit_Set') > 0

    DROP FUNCTION dbo.fnSplit_Set

    GO

    IF Object_id('Numbers') > 0

    DROP TABLE dbo.Numbers

    GO

    CREATE TABLE dbo.Numbers (PkNumber int identity(1,1), dude bit null, CONSTRAINT Pk_Number PRIMARY KEY CLUSTERED (PkNumber))

    GO

    INSERT INTO dbo.Numbers (dude)

    SELECT NULL FROM (SELECT TOP 100 NULL AS A FROM master.dbo.spt_values) dt100 cross join (SELECT TOP 80 null AS A FROM master.dbo.spt_values) dt80

    GO

    ALTER TABLE dbo.Numbers

    DROP COLUMN dude

    GO

    --Select min(PkNumber) as MinA, max(PkNumber) as MaxA from dbo.Numbers

    --1, 8000

    GO

    CREATE FUNCTION [dbo].[fnSplit_Set] (@IDS as varchar(8000), @vcDelimiter varchar(3))

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    Return

    Select dtSplitted.EachID, dtSplitted.Rank from (

    SELECT SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter),

    CHARINDEX(@vcDelimiter, @vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter)) - N.PkNumber - len(@vcDelimiter)) as EachID

    , (LEN(SUBSTRING(@IDs, 1, N.PkNumber)) - LEN (REPLACE (SUBSTRING(@IDs, 1, N.PkNumber), ',', ''))) + 1 AS Rank

    FROM dbo.Numbers N

    WHERE SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber, len(@vcDelimiter)) = @vcDelimiter

    AND PkNumber 0

    GO

    DECLARE @Ids as varchar(8000)

    SET @IDS = ''

    Select @IDS = @IDS + CAST(id as varchar(10)) + ',' from (Select TOP 10 id from dbo.SysObjects order by NEWID()) dtIds

    SET @IDS = left(@IDS, ABS(len(@IDS) - 1))

    PRINT @IDS

    Select O.id, dtIDS.EachId, O.name, dtIDS.Rank from dbo.SysObjects O inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set (@IDS, ',')) dtIDS on O.id = dtIDS.EachID order by O.Name

    Select O.id, dtIDS.EachId, O.name, dtIDS.Rank from dbo.SysObjects O inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set (@IDS, ',')) dtIDS on O.id = dtIDS.EachID order by dtIDS.Rank

    --Dont't ever forget to cast the eachid column to the same datatype as in the join to avoid table scans.

    GO

    IF NOT Object_id('SearchColumns') IS NULL

    DROP PROCEDURE SearchColumns

    GO

    CREATE PROCEDURE dbo.SearchColumns @Items as varchar(8000), @ItemsCount as smallint --avoid calling split twice

    AS

    SET NOCOUNT ON

    SELECT

    O.id

    , O.Name

    FROMdbo.SysObjects O

    WHERE Exists(

    SELECT 1

    FROMdbo.SysColumns C

    WHERE C.id = O.id and C.Name in (Select EachID from dbo.fnSplit_Set(@Items, ','))

    GROUP BY id having count(*) >= @ItemsCount

    )

    ORDER BYName

    SET NOCOUNT OFF

    GO

    --find all items

    Exec dbo.SearchColumns 'name,id', 2-- 4 row(s) affected

    Exec dbo.SearchColumns 'name,id,indid', 3-- 1 row(s) affected

    Exec dbo.SearchColumns 'id', 1-- 12 row(s) affected

    --find at least 2

    Exec dbo.SearchColumns 'name,id,indid', 2-- 5 row(s) affected (instead of 1)

    --find at least 1

    Exec dbo.SearchColumns 'name', 1-- 10 row(s) affected

    Exec dbo.SearchColumns 'name,indid', 1-- 11 row(s) affected

    GO

    DROP PROCEDURE SearchColumns

    --DROP FUNCTION fnSplit_Set

    GO

    As for qualifications D and E I guess you could make a final left join (in) on candidate_qualifications and get the count of those and sort descending on that count (left join to the candidates that already qualify with A,B,C, that info being kept in a derived table).

    Still think it was simple ?

  • Whoah hos! I take it back about kicking myself as I'd never had come up with anything like that (or even knew where to begin with something like that!!)

    A BIG mighty thanks for that - I'll print the query/UDF out and try to get my head around it (easier said than done!) - must admit that I'd seen the UDF in these forums before over the last couple of days when I've been searching for something else but didn't get how I could apply this to what I need - many, many thanks again for taking the time to answer - really appreciate it.  Now all I gotta do is try to understand what's going on(!).

    I might post a question or two once I've read through it to help me understand how it works if that's ok but I'll try to figure out what's going on first.

    Thanks again

    Martin

  • not as versatile

    select MustHave.candidateid,NiceToHave.Extras

    from

    (

    select candidateid

    from  dbo.Candidate_Qualifications Candidate_Qualifications

    where qualificationid = A

    OR qualificationid=B

    OR qualificationid=C

    group by candidateid

    having count(QualificationID)=3 /*COUNT 3 = have A,B,C*/

    ) AS MustHave

    left join /*All of musthave, Nice to have if possible*/

    (

    SELECT candidateid,count(*) AS Extras

    from  dbo.Candidate_Qualifications Candidate_Qualifications

    where qualificationid =D or qualificationid=E

    group by candidateid

    ) AS NiceToHave

    on MustHave.candidateid=NiceToHave.candidateid

  • Hi Jo,

    Thanks for that - must say that at least I understand yor version but unfortunately the number of 'must have' and 'nice to have' qualifications is going to change on each query so looks like I'm going to have to get my head around the beast supplied by Remi - me thinks my sql knowledge needs to increase somewhat!!

    Thanks again - much appreciated

    Martin

  • Just use Jo's version combined with my split function. That's all there is to it. It's pretty straight forward from there.

  • I would be leery of using massive string handling in a function against a large dataset.  How about a table-driven solution?

    create table Candidate_Evaluation (

       QualificationID   int not null primary key clustered,

       Required          tinyint not null )

    go

    insert into Candidate_Evaluation values (1, 1)   -- Qualification A

    insert into Candidate_Evaluation values (2, 1)   -- Qualification B

    insert into Candidate_Evaluation values (3, 1)   -- Qualification C

    insert into Candidate_Evaluation values (4, 0)   -- Qualification D

    insert into Candidate_Evaluation values (5, 0)   -- Qualification E

    select cd.*

    from CandidateDetails cd

    inner join (

       select CandidateID, COUNT(*) as Matches

       from Candidate_Qualifications q

       inner join Candidate_Evaluation e on e.QualificationID = q.QualificationID

       group by CandidateID

       having sum(e.Required) = (select sum(Required) from Candidate_Evaluation)

    ) m on cd.CandidateID = m.CandidateID

    order by m.Matches desc

  • Even with 2 strigns of 8K, the execution is still in the ms for my split function... No problems there.

  • In case you can store some profiles (either permanent or temporary) to store all the requirements

    profile_qualifications:

    profileid

    qualificationid

    required  /*required might be weighted like 0 or -1 (@MyconstantforRequired

    ) for required, >(=)0 for non-required*/

    /*Variables : @MyconstantforRequired = value for required

    @MyProfileId = profile to match

    */

    select MustHave.candidateid,NiceToHave.Extras

    from

    (

    select candidateid

    from  dbo.Candidate_Qualifications Candidate_Qualifications

    inner join dbo.profile_qualifications profiles

    /*matches all required qualifications*/

    on Candidate_Qualifications.qualificationid = profiles.qualificationid

    and profiles.required = @MyconstantforRequired

    and profiles.profileid=@MyProfileId

    group by candidateid

    /*calculates the amount of necessary qualifications*/

    /*count >= is having them all*/

    having  count(Candidate_Qualifications.qualification)=(select count(*) from dbo.profile_qualifications where required=@MyconstantforRequired and profileid=@MyProfileId)

    ) as  MustHave

    left join /*All of musthave, Nice to have if possible*/

    (

    SELECT candidateid,count(*) AS Extras /* sum(required) possible to sum required*/

    from  dbo.Candidate_Qualifications Candidate_Qualifications

    left join /*0 or more*/

    dbo.profile_qualifications profiles

    on Candidate_Qualifications.qualificationid = profiles.qualificationid

    and profiles.required<>@MyconstantforRequired /* or >@MyconstantforRequired */

    and profiles.profileid=@MyProfileId

    group by candidateid

    ) AS NiceToHave

    on MustHave.candidateid=NiceToHave.candidateid

     

  • I can't imagine this working in a multi-user environement where they all need to get results based on a list of criterias. I think my solution would be more adaptable to such a case, but I don't know your system so I might be typing for nothing here .

  • I can't imagine why anyone capable of coming up with your overengineered, impenetrable solution couldn't figure out how to use a simpler solution.  The Candidate_Evaluation table in my solution could be a temp table, therefore unique to each user.  Jo's variation with multiple sets of criteria distinguished by a profileid works nicely for multiple users.

  • I can.

    What's the cost of creating/droping a temp table for each use, then populate it with 50 round trips on the server to do then a join??

    What's the cost of sending 2 parameters and using existing code on the server to split 'em up??

    The simple fact is that all we know is that the number of needed/usefull number of parameters are variable. I now also assume that if he was using profiles, he would have mentioned it by now.

    As I already said, Jo's solution was exactly was I had described in the first place so I couldn't agree more with him.

  • Comparing the cost of alternative solutions is not the same as saying you can't imagine how an alternative solution could possibly work.  And how you get the idea that Jo's solution (with a notable lack of fuction calls) is exactly what you described is beyond me.

    Your solution would probably work, they all would probably work.  Someone who says "Whoah hos!" when he sees your version might be able to implement a simpler solution more easily.

  • Sure, if calling a split function twice is too hard to understand, I certainly see your point. Sorry if I rubbed you the wrong way, wasn't my intention.

    IIRC, I said Jo's solution was the missing code that I had explained, not demonstred how to do. So even if it's not the same solution they certainly go in the same direction and greatly complete each other's explaination.

  • Hi

    Just wanted to say thanks to everyone that has posted a solution/suggestion on this - I appreciate it. Apologies for not doing this earlier but my mother has been taken into hospital and so pretty much everything else has had to take a back seat for the next few days. I've printed off all of the replies so I can look through them all and work out which one(s) to work with but just wanted to say thanks for taking the time to reply.

    Martin

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply