Need To Use joins in accompalishing this.

  • Hi all,

    I have this wierd challenge to accompalish this but somehow not able to figure it out.

    I have a table having 2 columns - Let's say Col1 and Col2.

    Column 1 is like an ID and contains repeating values. Column 2 contains Values and contains Unique Values. Now I would like to accompalish the the resultset in the following form in the best way possible:

    Desired Result

    col1 col2(varchar)

    --------------------------

    1 Red | Blue

    2 Green | Yellow | Red

    My Problem is, I do not know how many times the values exist for each Col1.

    As of now, I have accomplaished it using the following way:

    Schema of table:

    CREATE TABLE Test (

    col1 INT,

    col2 VARCHAR(10))

    INSERT Test (col1, col2) VALUES(1, '1.1')

    INSERT Test (col1, col2) VALUES(1, '1.2')

    INSERT Test (col1, col2) VALUES(2, '2.1')

    INSERT Test (col1, col2) VALUES(2, '2.2')

    INSERT Test (col1, col2) VALUES(2, '2.3')

    GO

    CREATE FUNCTION Concatenate (@col1 INT)

    RETURNS VARCHAR(8000)

    BEGIN

    DECLARE @sReturn VARCHAR(8000)

    DECLARE @sDelimiter CHAR(3)

    SET @sDelimiter = ' | '

    SELECT @sReturn = COALESCE(@sReturn + @sDelimiter, '') + col2 FROM Test

    WHERE col1 = @col1 ORDER BY col2

    RETURN @sReturn

    END

    ========================================

    Final Query:

    SELECT col1,

    dbo.Concatenate(col1) as col2

    FROM Test

    GROUP BY col1

    --------------------------------------------

    The biggest issue to above approach is if the table size is huge, it will mess up the performance of the system . Any suggestions are appreciated.

  • The best you can do here is to set up clustered index on (Col1, Col2)

    And don't load the whole table. If you will select limited number of ID's to present it will not create performance problems.

    If you need it to generate some kind of report then you are not in hurry, reports not to be generated every 10 seconds.

    _____________
    Code for TallyGenerator

  • Thanks Sergiy for your response, but the question here is I need the result in the following format:

    Desired Result

    col1 col2(varchar)

    --------------------------

    1 Red | Blue

    2 Green | Yellow | Red

    Where as col1 contains repeated values. Now I don't want to use the crappy approach that I have currently. I want to use joins. can anyone guide as to how can I use them? Now if at all I have to use Self joins, I will need to know how?

    Has anyone experienced it anytime? I was also asked this question by one of my friends who faced an interview with one of the companies.

    Please suggest.

    thanks

  • Did you actually try this approach?

    I mean including right indexing.

    How many ID's you need to include in report to make it over 1s to run?

    What makes you think that this aproach is crappy?

    I played with different join options, but they require to add at least 1 column into your table, and performance was even worse than for this "crappy" approach.

    _____________
    Code for TallyGenerator

  • Other than trying Sergiy's approach. If the dataset is large then performance will always be an issue whatever solution you try.

    To avoid the use of a function you could try a single pass update on a temp extract of your data, eg

    DECLARE @table TABLE ([id] int identity(1,1),col1 INT,col2 VARCHAR(10),col3 VARCHAR(7000))

    INSERT INTO @table (col1,col2,col3) SELECT col1,col2,'' FROM @test-2 ORDER BY col1,col2

    DECLARE @col3 varchar(7000),@col1 int

    SET @col1 = 0

    SET @col3 = ''

    UPDATE @table

    SET @col3 = col3 = (CASE WHEN @col1=col1 THEN @col3 + ' | ' + col2 ELSE col2 END),

    @col1 = col1

    SELECT a.col1,a.col3 AS [col2]

    FROM @table a

    INNER JOIN (SELECT MAX(id) AS [id] FROM @table GROUP BY col1) b

    ON b.[id] = a.[id]

    a bit quirky I admit but it may do as you want, the only problem may be the sequence sql reads the heap (not tried it on large dataset)

    p.s. I used table variable for testing, substitute for temp table or perm table.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • I once saw someone use the clustered index in the indexhint.  I can't confirm this but it should guarante the order of the update.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply