inner join give me huge data when package id equal package id from another table

Question

Post reply

inner join give me huge data when package id equal package id from another table

ahmed_elbarbary.2010

Hall of Fame

Points: 3118

November 29, 2021 at 11:46 pm

Go to Answer

#3958461

I work on sql server 2014 i face issue when join two tables as inner join

from table package it have 100000 rows

and table package2 100000 rows

so it will give result as

100000 * 100000

and it take too much time

so how to handle that issue of big data

are there are any way to get data quickly

create table #package
 (
 packageId int,
 PartId int
 )
 insert into #package(packageId,PartId)
 values
 (1121,2311),
 (1121,9911),
 (1121,2020),
 (1121,5052),
 (1194,4311),
 (1194,9812),
 (1194,2391),
 (1194,5541)
    
    
 create table #package2
 (
 packageId int,
 PartId int
 )
 insert into #package2(packageId,PartId)
 values
 (1121,2377),
 (1121,2111),
 (1121,3420),
 (1121,5057),
 (1194,4388),
 (1194,9912),
 (1194,1091),
 (1194,6441)
    
 select p.partid as partc,p2.partid as partx  from #package p
 inner join #package2 p2 on p.packageId=p2.packageId

Mr. Brian Gale

SSC-Insane

Points: 24800
More actions
November 30, 2021 at 5:34 pm
Answer
#3958745

To add to what everyone else has said, network performance could be a bottleneck too. Those tables don't look like they hold that much data in terms of MB or GB, so I doubt that network would be a bottleneck, but if you have a slow network between your workstation and the SQL server, getting a faster network connection MAY help.
I do agree with everyone though - better hardware will help more than indexes. Indexes on the joined columns should offer a benefit as well though if they don't already exist.
It is not shown here, but just in case - doing cross database queries or using a linked server can cause slowness due to the query optimizer guessing the wrong number of rows. Cross database queries and linked servers don't share their statistics with the optimizer, so the optimizer guesses there will be only 1 row which can result in a poor plan. Pulling all of the cross database/linked server data into a temp table (or table variable) can help with performance.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63501 More actions · Answer 1

"It takes too much time" -- Got any indexes on the tables? like on the Primary/Foreign Keys that you're doing your joins on? Did you look at the query's execution plan? You might want to download Grant Fritchey's book on reading execution plans... it should help a lot.

ahmed_elbarbary.2010 Hall of Fame Points: 3118 More actions · Answer 2

after create indexes

are there are any thing remaining can enhance performance

Johan Bijnens SSC Guru Points: 135369 More actions · Answer 3

Add RAM & SSD ( especially for tempdb )

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

Grant Fritchey SSC Guru Points: 398882 More actions · Answer 4

The example query doesn't have a WHERE clause, so you're moving every single bit of the data. Indexes will help a little bit, but not much. A query that moves simply everything is only assisted by hardware. More, bigger, faster CPU. More, faster, disk. More, more, more, faster, memory. When you're simply moving all the data, all the time, you have no other choice for performance.

Now, toss a WHERE clause or some other filtering mechanism in there, we can talk.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

ahmed_elbarbary.2010 Hall of Fame Points: 3118 More actions · Answer 5

can i do while loop to divide data

then insert to table

or this solution is bad

Jeffrey Williams SSC Guru Points: 90303 More actions · Answer 6

Why are you only joining on packageId - when both tables have packageId and partId? If you join only on one part of the key - then you will get this type of issue.

Based on your example though - you wouldn't get anything because you don't have matching package and parts in both tables.

In your example, for 1121 packageId - you will get 4 rows from #package2 for each row.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

DesNorton SSC-Insane Points: 24384 More actions · Answer 7

I suspect that the amount of time that you are experiencing is due to the time for the data to be shown on the screen, not the time for SQL to find it.

What are you trying to do with all the data?

If it is for display, NOBODY is going to read 10mil rows of data.

How to post data/code on a forum to get the best help.