Best method to virtually combine tables

  • I have 3 tables with about 250 fields and the fields are unique(mistake - not unique) in each table.  They have the exact number of records in each.  Each record of each table has an ID number that matches all 3 tables.  In other words, it is like a table of 750 fields that was divided into 3 seperate tables of 250 fields - and, that each record has a corresponding ID(wrong - corresponding row - see example in later post below) between tables.  I must keep these tables seperate.  When I do a single record query on the Join - It gives me 3 records in the results.  I assume for each join - Should I use UNION?  Is there a way to get just one virtual record returned.  Any help is appreciated . . . I am new at this, thanks.  (can row position be used in a join without creating a new ID?)

  • declare @tbl1 table (ID int, data1 varchar(10),data2 varchar(10))

    declare @tbl2 table (ID int, data1 varchar(10),data2 varchar(10))

    declare @tbl3 table (ID int, data1 varchar(10),data2 varchar(10))

    INSERT INTO @tbl1

    SELECT 0,'0','0' UNION ALL

    SELECT 1,'1','1' UNION ALL

    SELECT 2,'2','2'

    INSERT INTO @tbl2

    SELECT 3,'3','3' UNION ALL

    SELECT 2,'2','2'

    INSERT INTO @tbl3

    SELECT 1,'1','1' UNION ALL

    SELECT 2,'2','2' UNION ALL

    SELECT 4,'4','4'

    SELECT * FROM @tbl1

    SELECT * FROM @tbl2

    SELECT * FROM @tbl3

    select a.*,b.*,c.*

    from @tbl1 a full outer join @tbl2 b on a.id=b.id full outer join @tbl3 c on a.id=c.id or b.id=c.id

    select coalesce (a.id,b.id,c.id) as gID,a.*,b.*,c.*

    from @tbl1 a full outer join @tbl2 b on a.id=b.id full outer join @tbl3 c on a.id=c.id or b.id=c.id

    order by gID

     

    or you can use inner join if each ID has a corespondent ID in ALL tables

     

    you can create a view to contain the general select and run your queries against that view that will contain all columns


    Kindest Regards,

    Vasc

  • use the inner join - from your description you have some pkey that is the same in each table... So an inner join such as

    select A.pkey, A.field1, A.field2, B.field1, B.field2, C.field1, C.field2
    from A
      inner join B on A.pkey = B.pkey
      inner join C on (A.pkey = C.pkey) and (B.pkey = C.pkey)

    should do the trick.  Note that your join clause for joining to C needn't have both conditions - this just lets the query optimiser definitely know that it could choose A's pkey column or B's pkey column when joining to table C...  Not necessary at all but may help performance on large tables if the indices are different in some way.

    Cheers

  • Thanks but . . . I made a mistake.  The pkey's are not unique.  It is actually a many to many relationship.  But the rows are identical(in order top down & ID's match across the 3 tables)  Like this:

    a.pkey  b.pkey  c.pkey

    1            1            1

    3            3            3

    3            3            3

    2            2            2

    Is there a way to reference the row as the join?

    This is not typical and may not be possible - but, the data must work this way.  Any help is appreciated.

  • Please explain what you mean by the statement "the pkeys are not unique" - they have to be unique to be primary keys!?  That's the whole point of them.

    Are you using the term "primary key" meaning "this is the column I'll use for joining"?  If so, then yes, you can have duplicate keys although this is still an odd requirement/design.

    When you say "in order top down" - is there some sort of identity column in each table so that the 1st row in table A with pkey = 3 can match the first row in tables B & C with pkey = 3 and the 2nd row in table A with pkey=3 matches the 2nd row in tables B & C with pkey = 3.  If this is what you mean, I also assume that your identity column doesn't necessarily have the same value in each table (otherwise you'd join on it).

    Can you please confirm the above assumptions about your design?  If all correct, then I guess you'll have a rather inefficient joining process.  I would, apart from redesigning the tables (!), create an index on your columns you have labelled pkey and your identity column (so that's one index over 2 columns - in that order).

    Then you could have in your join condition some logic about mins, maxs and counts of records in each table to match the Xth row with pkey=P with the Xth row with Pkey=P in the other tables.  Very messy   This could be improved by calculating the order upon inserting records into the table - this would make a composite key such as pkey+position with position starting at 1 for each pkey.  When you insert another record into the table with the same "pkey" it has position=2.  Note that I use the term "pkey" here merely as a column header as the true pkey is the combination of the pkey column + the position column (so no flaming please )

    Sorry for long post - wanted to spell out the assumptions I made from your brief post and then a possible solution.  Hope it helps - let us know how you go

  • You are correct - there are no pkey's.  This is NOT the column that I will use for joining.  This column that may be used in the query - but not for joining.  What is common between the tables is that each row - matches each row of the 3 tables.   For example:  table1 row 27's data lines up by row with table2's row 27's data and the same with table3.  These tables are essentially read-only and will never be modified, inserted or edited.  I would only like to display in a query - the fields of table1's - fields of table3's - fields of table3's adjacent to each other when I query on table one's "non-unique" identity column.

    Is there some machine reference to the row number - that can be used as a join?

    Like join where table1.row# = table2.row#?  I don't want to create a pkey or any new fields.

    thanks, this may not be possible, tho.

  • For row #27 is there a field with the value 27 in it?  (this would be like an identity).  If not, how do you know which row is row 27?  I assume you don't mean "the 27th row inserted" as you couldn't have any duplicates.

    Thus I assume that you mean the 27th row based on some ordering of the rows using an order by clause - the duplicate row numbers would arise due to two rows being completely the same on the ordering columns....

    In this case, you could create a temp table.  In this table you have it contain all fields from all 3 tables.  Additionally you create 3 temp tables - one for each of your tables PLUS an identity column in each temp table.  Then insert the data as desired into the temp tables following your order by using an order by clause.  This will give you a start with a method to start joining them.

    If you need to access it from a view then you could create a view that selects from a user-defined function.  Have the function contain the code using the temp tables (would need to be table variables inside the function), etc and the function returns the VERY wide table from which the view selects.

    If you are using SQL 2005 then you can use the row_number function which will even account for ties in your fields/row numbers.  This would be the best way.

  • Yes, the SELECT ROW_NUMBER() function would work - that is what I've been looking for.  It's just too bad I don't have 2005 or I'm gonna have to solve this with temp tables - like you said.  At least I'm not the only one that wanted a fuction like that.  Thanks for all your time and help.  I can figure it out from here.

  • It's a shame when something is so straight forward but you need the new version - i'm in a similar situation myself for a small part of our proposed Infopath integration

    Glad to help you though!  If you come up with a fancy solution, plz post it

  • If I misunderstood the problem, then I apologize. But if the rows numbers all match each other anyway, wouldn't it be easiest to just throw an identity(1,1) column on each of the three tables and use INNER JOIN to combine them together in a view?

    I've only been working with relational databases for a couple years now, so just for my own curiosity, what requirements led to three tables with 250 columns each where the rows match perfectly?

  • Yes, I could make an identity column - but, I wanted to know if it could be done without one.  Given that row numbers are identical between the tables.  Why make something if you don't have to?  The "requirement" was a limitation of the software that accesses SQLserver tables and to potentially reduce processing time required for making identity columns, every night.  The client applications can only display 250 fields - but, some clients can display and utilize more than a thousand fields.  Nightly, the data is dumped out of proprietary databases and loaded as readonly into sqlserver.  There are actually more than three tables that are divided from a 750 column table.  I have 108 of them, with about 80,000 fields total - and some have 360,000 records . . . so, just for efficiency, simplicity and processing time - I wondered if a join could be performed by simply referencing the row position as the join.  I thought this could be a better method.  Since, sqlserver knows the row number of every record, and it, itself has to reference row numbers to perform it's own operations(at the core level) why can't I?

  • Because SQL Server is set based, not row based. For instance, "SELECT colName FROM tblName" might return rows in the same order every time, SQL does not guarantee that it will. I haven't personally noticed much of a(any, actually) performance hit from having system generated primary keys, and I've dealt with plenty of extremely large tables.

  • David's correct - set based systems return a bunch of rows that form a set - a set has no order.  To maintain row numbers, etc you'll need to add one to the set (ie, an identity col) or use something like DBase (yuk!) 

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply