Create list of random, mutually exclusive selection(s) from the same table

  • The TEMP table '#project_types' has 3 columns: 1) a pk, 2) the 'project_type' unique name, and 3) 'is_combinable' which is a 0/1 boolean (INT because legacy...).  The job spec for this part (of the larger project specification) says: "randomly select a representative distribution of 1 value (if not combinable) or 1-3 values (if combinable) and return as a single comma separated value".  Here's some data

    drop table if exists #project_types;
    go
    create table #project_types(
    pt_id bigint identity(1,1) constraint pk_project_types_id primary key not null,
    project_type nvarchar(128) unique not null,
    is_combinable int not null);
    go
    insert #project_types(project_type, is_combinable) values
    ('Greenfield Project (Standalone)', 0),
    ('Systems Integration', 1),
    ('Custom Data Model', 1),
    ('Legacy System Replacement', 0),
    ('WordPress Replacement', 0),
    ('Drupal Replacement', 0),
    ('Cloud Migration', 1),
    ('Mobile Application Development', 1),
    ('Business Process Automation', 1),
    ('Ecommerce Development', 1),
    ('API Development', 1),
    ('Machine Learning Implementation', 1),
    ('IoT (Internet of Things) Project', 1),
    ('Cybersecurity Enhancement', 1),
    ('Custom Content Management System (CMS)', 1),
    ('User Experience (UX) Improvement', 1),
    ('Data Visualization and Reporting', 1),
    ('Regulatory Compliance', 1),
    ('Customer Relationship Management (CRM) Integration', 1),
    ('Enterprise Resource Planning (ERP) Integration', 1),
    ('IT Systems Consolidation', 1);

    Here's my query which makes a variable declaration to store the randomly chosen 'is_combinable' value

    declare @combo int=(select top(1) is_combinable
    from #project_types
    order by newid());

    with
    true_cte as (
    select top(abs(checksum(newid()))%3+1) pt.project_type --, pt.is_combinable
    from #project_types pt
    where pt.is_combinable=1
    and @combo=1
    order by newid()),
    false_cte as (
    select top(1) pt.project_type --, pt.is_combinable
    from #project_types pt
    where pt.is_combinable=0
    and @combo=0
    order by newid()),
    list_cte as (
    select * from true_cte
    union all
    select * from false_cte)
    select string_agg(project_type, N', ') project_types_
    from list_cte;

    I'm wondering if there's a way to do this without declaring a variable.  The SQL statements are within a stored procedure so it's possible to leave it the way it is.  The order within the final string doesn't matter so I didn't include WITHIN GROUP (ORDER BY ...) It seems like maybe there's a better way.  What have I tried?  I tried using a CTE to randomly set the 'combo' value and then access the value by CROSS JOIN'ing to the CTE twice, once in each of the true_cte/false_cte CTE's.  Didn't work tho because the CTE is apparently accessed 2 times with 2 possible outcomes, i.e. sometimes the list_cte CTE returned 0 rows and sometimes it returned 4 rows.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • Well maybe the allocation is unavoidable.  I was hoping for pixie dust or a magic trick 🙂

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • The correct term is "PFM".  🙂

       WITH cteCombo(Combo) AS (SELECT TOP 1 is_combinable FROM #project_types ORDER BY NEWID())
    SELECT project_types_ = string_agg(ca.project_type, N', ')
    ,ComboType = MAX(cmb.Combo) --Comment this column out if not desired.
    FROM cteCombo cmb
    CROSS APPLY ((--==== The cmb.Combo is 0 or 1 and so causes a random 0,1,2 to be returned or just a 0
    -- and then a 1 gets added to that. 0 says return just 1 type 0. 1 says return 1 to 3
    -- type 1 returns. It's also used in the WHERE clause to control the type returned.
    -- "type" means "Combo type" or value of is_combinable, in this case.
    SELECT TOP (cmb.Combo*ABS(CHECKSUM(NEWID())%3)+1)
    project_type
    FROM #project_types pt1
    WHERE pt1.is_combinable = cmb.Combo
    ORDER BY NEWID()
    ))ca(project_type)
    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Excellent query Jeff.  Thank you, thank you!  Multiplying by 'combo' within the TOP clause is very nice

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    Excellent query Jeff.  Thank you, thank you!  Multiplying by 'combo' within the TOP clause is very nice

    Thanks for the feedback, Steve.  The hard part was to make "Combo" available for top.  That's where the Cross Apply came it.  It wouldn't work with a simple CROSS JOIN or any kind of JOIN.  It was also the only way to refer to "Combo" more than once without it recalculating during each reference because of NEWID().

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply