August 20, 2018 at 2:20 am
Dear All,
I have following Query in my production. My Table_B has 8mil + rows. If I have 4 role in Table_C, my query result will return 32mil + rows.
Current execution time taken is about 1 day. I am trying to reduce the execution time. When I create the Non-cluster Index with the columns from my where clause, it was a bit faster but not significant. Is there any way to reduce the execution time please? I tried to create NonCluster ColumnStore index in my local with 2.5mil rows but it was failed with memory error. Any suggestion please?
Select (CASE WHEN Table_A.[Column A] = 'YES' Table_B.Col_A atSE Table_A.Column_A END) AS Col_A,
.
.
.
From table_B
Cross Join Table_A
inner join TABLE_C on Table_A.colc=Table_C.colc
where
(Table_A.[Ag_CD] is NULL
OR
Table_AAg_CD]=Table_B.[Ag_CD]
OR
Table_A.[Ag_CD]=Table_B.[Ag_CD_New]
)
AND
(Table_A.[Dept] is NULL
OR
Table_A.[Dept]=Table_B.[Dept]
OR
Table_A.[Dept]=Table_B.[Dept_New]
)
AND
(Table_A.[Org1] is NULL
OR
Table_A.[Org1]=Table_B.[Org1]
OR
Table_A.[Org1]=Table_B.[Org1_New]
)
AND
(Table_A.[Org2] is NULL
OR
Table_A.[Org2]=Table_B.[Org2]
OR
Table_A.[CoOrg2]=Table_B.[Org2_New]
)
AND
(Table_A.[Org3] is NULL
OR
Table_A.[Org3]=Table_B.[Org3]
OR
Table_A.[Org3]=Table_B.[Org3_New]
)
Table structure as follows:
Table_A | ||||||||||||||||
UserID | Ag_CD | Dept | Org1 | Org2 | Org3 | Org4 | Org5 | Role | ||||||||
001|002 | R_All | |||||||||||||||
001 | A1 | R_A | ||||||||||||||
R_All | ||||||||||||||||
002 | A1 | Dept1 | ||||||||||||||
003 | A1 | Dept2 | OrgA2 | R_A | ||||||||||||
004 | A1 | Dept2 | OrgA2 | Org2 | R_A | |||||||||||
005 | R_B_ALL | |||||||||||||||
005 | B1 | R_B | ||||||||||||||
006 | B1 | R_B | ||||||||||||||
007 | B1 | R_B | ||||||||||||||
Table_C | ||||||||||||||||
Role | Leave | Age | AgeGrp | Allowance | Sal | Branch | ||||||||||
R_All | Y | Y | Y | Y | N | Y | ||||||||||
R_A | Y | Y | Y | N | N | N | ||||||||||
R_B_ALL | N | N | Y | Y | Y | Y | ||||||||||
R_B | N | N | N | Y | Y | N | ||||||||||
Table_B | ||||||||||||||||
ID | Name | Ag_CD | Dept | Org1 | Org2 | Org3 | Org4 | Org5 | Leave | Age | AgeGrp | Allowance | Sal | Branch | S_Date | P_Date |
1 | UserA | A1 | Dept1 | OrgA2 | NULL | NULL | NULL | NULL | 10 | 40 | 40-45 | 1000 | 10000 | A | 01/08/2018 | 08/08/2018 |
1 | UserA | A1 | Dept1 | OrgA1 | NULL | NULL | NULL | NULL | 11 | 39 | 35-39 | 700 | 8000 | B | 01/08/2017 | 12/08/2017 |
2 | UserB | A1 | Dept1 | OrgA2 | NULL | NULL | NULL | NULL | 10 | 40 | 40-45 | 1000 | 10000 | B | 01/08/2018 | 08/08/2018 |
3 | UserC | B1 | Dept1 | OrgA1 | NULL | NULL | NULL | NULL | 10 | 40 | 40-45 | 1000 | 10000 | A | 01/08/2018 | 08/08/2018 |
4 | UserD | B1 | Dept2 | OrgA2 | Org2 | NULL | NULL | NULL | 10 | 40 | 40-45 | 1000 | 10000 | C | 01/08/2018 | 08/08/2018 |
Thank you for your time!
Best Regards
August 20, 2018 at 5:59 am
Firstly this doesn't look like a CROSS JOIN is needed. I think the query would produce the same results with an INNER JOIN:SELECT *
FROM table_B
INNER JOIN Table_A
ON(Table_A.[Ag_CD] IS NULL
OR Table_A.[Ag_CD] = Table_B.[Ag_CD]
OR Table_A.[Ag_CD] = Table_B.[Ag_CD_New])
AND (Table_A.[Dept] IS NULL
OR Table_A.[Dept] = Table_B.[Dept]
OR Table_A.[Dept] = Table_B.[Dept_New])
AND (Table_A.[Org1] IS NULL
OR Table_A.[Org1] = Table_B.[Org1]
OR Table_A.[Org1] = Table_B.[Org1_New])
AND (Table_A.[Org2] IS NULL
OR Table_A.[Org2] = Table_B.[Org2]
OR Table_A.[CoOrg2] = Table_B.[Org2_New])
AND (Table_A.[Org3] IS NULL
OR Table_A.[Org3] = Table_B.[Org3]
OR Table_A.[Org3] = Table_B.[Org3_New])
INNER JOIN TABLE_C
ON Table_A.colc = Table_C.colc;
Have you got an index on TABLE_C(colc)? It might help
What indexes have you got on tables: Table_A and Table_B?
Can you post the DDL and some code to create some data?
August 20, 2018 at 8:15 pm
Dear Jonathan,
Yes. It is actually inner join already since we have join in where clause.
My indexes are as follows.
Table_A=>
Create Clustered Index on
(UserID, Ag_CD, Dept, Org1, Org2, Org3, Org4, Org5, Role)
Table_C =>
Create Clustered Index on (Role)
Table_B =>
Create Index on Table_B
(Ag_CD,
Ag_CD_New,
Dept,
Dept_New,
Org1,
Org1_New
Org2,
Org2_New,
Org3,
Org3_New
)
Please replace column name =Colc with Role.
Please find table structure as follows:
create table Table_A(
UserID VARCHAR(30),
Ag_CD VARCHAR(30),
Dept VARCHAR(30),
Org1 VARCHAR(30),
Org2 VARCHAR(30),
Role VARCHAR(30)
);
Create table Table_C(
Role VARCHAR(10),
Leave VARCHAR(10),
Age VARCHAR(10),
AgeGrp VARCHAR(10),
Allowance VARCHAR(10),
Sal VARCHAR(10),
Branch VARCHAR(10)
);
create Table Table_B(
ID VARCHAR(30),
Name VARCHAR(30),
Ag_CD VARCHAR(30),
S_Date Datetime NOT NULL,
P_Date Datetime NOT NULL,
Dept VARCHAR(30),
Org1 VARCHAR(30),
Org2 VARCHAR(30),
Org3 VARCHAR(30),
Leave INT,
Age INT,
AgeGrp VARCHAR(30),
Allowance VARCHAR(30),
Sal INT,
Branch VARCHAR(30),
Ag_CD_New VARCHAR(30),
Dept_New VARCHAR(30),
Org1_New VARCHAR(30),
Org2_New VARCHAR(30),
Org3_New VARCHAR(30)
);
You may use sample data from my 1st post.
Problem here is we cannot reduce the data amount.
Currently we have 8 mil + rows in Table_B. If we have 4 roles in Table_C, our result set become 32mil + rows. Do you have any idea to reduce the execution time? How to make faster the query?
Thank you so much for your response.
Best Regards,
August 21, 2018 at 6:37 am
SELECT A.Role, -- Only the columns you need
...
B.ColumnX, ...
INTO #AandB
FROM TableA A
INNER JOIN TableB B ON
(A.[Ag_CD] IS NULL
OR A.[Ag_CD] = B.[Ag_CD]
OR A.[Ag_CD] = B.[Ag_CD_New])
AND (A.[Dept] IS NULL
OR A.[Dept] = B.[Dept]
OR A.[Dept] = B.[Dept_New])
AND (A.[Org1] IS NULL
OR A.[Org1] = B.[Org1]
OR A.[Org1] = B.[Org1_New])
AND (A.[Org2] IS NULL
OR A.[Org2] = B.[Org2]
OR A.[CoOrg2] = B.[Org2_New])
AND (A.[Org3] IS NULL
OR A.[Org3] = B.[Org3]
OR A.[Org3] = B.[Org3_New])
Then create any helpful indexes on the temporary table that will help with the next join to TableC:CREATE INDEX IX_#AandC_1 ON #AandB(col....)
Then do the select joining #AandB to TableC
There will be a bit of trial and error on index creation (it's not obvious to me what indexes can help with that join of TableA to TableB) and which tables are best to join but you need to do a few experiments and time the results.
August 21, 2018 at 6:53 am
Cool2018 - Monday, August 20, 2018 2:20 AMDear All,I have following Query in my production. My Table_B has 8mil + rows. If I have 4 role in Table_C, my query result will return 32mil + rows.
Current execution time taken is about 1 day. I am trying to reduce the execution time. When I create the Non-cluster Index with the columns from my where clause, it was a bit faster but not significant. Is there any way to reduce the execution time please? I tried to create NonCluster ColumnStore index in my local with 2.5mil rows but it was failed with memory error. Any suggestion please?...
You mentioned that you were unable to create a nonclustered columnstore index due to a memory error, and you were only looking at 2.5 million rows. That may suggest you have some kind of RAM constraint. How much RAM does your server have available to SQL Server, and how much space do all of your databases use?
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
August 21, 2018 at 11:59 am
Do you really want every match from Table_B to Table_A or only the most relevant match? If you want to list every role, then this join will always be very big.
For a single "best match" to Table_A, you could add a unique column to Table_A (if it doesn't already have one) and store that column in Table_B. A trigger of Table_B could set the value whenever a row is INSERTed or UPDATEd.
Assuming the values in Table_B are relatively static, if you need every match, you should consider a separate "junction" table that stores the joins for Table_A to Table_B. You'd want to add a unique column to both tables -- if they didn't already have it -- to use in the junction table. You could perhaps use a materialized view or, again, use triggers to add the matching junction table row(s) when either table is INSERTed to or UPDATEd.
Btw:
-- The clus index for Table_A should not have UserID at the first column, rather it should be Ag_CD. If you need UserID in the clus index, move it to the end.
-- I'm somewhat confused by the duplicate "ID" values in Table_B. Typically an "ID" is unique.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 23, 2018 at 2:34 am
Dear ScottPletcher,
Yes. We need every match from Table_B to Table_A. We need to list every role whenever there is no value for Ag_CD or Dept or .... As such if we have 8 mil rows in Table_B and we have 6 roles in Table_A, it become 48mil+ rows in our output dataset.
Also we cannot add unique number from Table_A to Table_B since it has more than one role for each ID from Table_B.
Please assume our Table_A is unique. We already make it unique by concatenating the UserID into one for same group.
Is there any idea we can tune to be faster please?
Thank you,
Best Regards,
August 23, 2018 at 3:14 am
Dear sgmumson,
Please find below for my ans for your question.
SELECT object_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Total Server Memory (KB)', 'Target Server Memory (KB)');
SQLServer:Memory Manager |
SELECT physical_memory_kb FROM sys.dm_os_sys_info;
16657508 |
For DB spaces:
Database Name | Database_Size | Unallocated space | |
DB_NAME | 77518.00 MB | 9548.98 MB | |
reserved | data | index_size | unused |
43842448 KB | 43161632 KB | 453040 KB | 227776 KB |
Thank you,
Best Regards,
August 23, 2018 at 8:01 am
Cool2018 - Thursday, August 23, 2018 2:34 AMDear ScottPletcher,Yes. We need every match from Table_B to Table_A. We need to list every role whenever there is no value for Ag_CD or Dept or .... As such if we have 8 mil rows in Table_B and we have 6 roles in Table_A, it become 48mil+ rows in our output dataset.
Also we cannot add unique number from Table_A to Table_B since it has more than one role for each ID from Table_B.
Please assume our Table_A is unique. We already make it unique by concatenating the UserID into one for same group.
Is there any idea we can tune to be faster please?Thank you,
Best Regards,
The unique number is not "from" one table to another. It's unique only for the table it's on, such as an identity column or other unique #.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 23, 2018 at 9:42 pm
Dear ScottPletcher,
Noted. Thanks.
According to your suggestion, what should I put in junction table? Is it table_A.uid & Table_B.uID? If I create materialize view, what would be the advantage for my current problem? Currently we use cluster column index in our PROD.it As per my knowledge, cluster column index is very suit for DW. It will give fastest performance but still have performance issue.
Our prod table has used partition as well. Any more idea you have to resolve this performance issue?
Thank you,
Regards,
August 24, 2018 at 12:31 pm
Please find below for my ans for your question.
SELECT object_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Total Server Memory (KB)', 'Target Server Memory (KB)');
SQLServer:Memory Manager | 3645440 |
SQLServer:Memory Manager | 3625208 |
SELECT physical_memory_kb FROM sys.dm_os_sys_info;
16657508 |
For DB spaces:
Database Name | Database_Size | Unallocated space | |
DB_NAME | 77518.00 MB | 9548.98 MB | |
reserved | data | index_size | unused |
43842448 KB | 43161632 KB | 453040 KB | 227776 KB |
Thank you,
Best Regards,
Well, you have just 3.6 GB available memory to SQL Server on a 16 GB RAM server, and are expecting a 77 GB database to run faster with that kind of limitation on RAM ???
Hate to say this, but you are seriously RAM constrained. Assuming you have at least 4 cores on this box, you need a serious RAM upgrade. Other than leaving 2 GB out of
the 16 GB total for the OS, what else needs RAM on this box? You could probably speed things up at least some just by allocating more of the RAM for SQL Server. If you
have apps and web services all on this box, you might also be cpu constrained. You definitely need more RAM. I'd jump right up to 64 GB without a lot of hesitation. And if
you measure how much cpu you are using and find a high usage percentage, you might want to add more cores if possible, and if that server can't accommodate same, it's time
for a new server.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
August 24, 2018 at 8:10 pm
Dear All,
Do you think Clustered columnstored index is suit for my case? I read Internet which said it suit for DW but I saw there is calculation included. Our case is just to join and extract. It's no need to calculate or no need group by. What I mean is our query is not like sql from BI. However our table has 8mil+ rows and about 350 columns. Is there any idea please? I still cannot create that index in my local computer. There is memory error. I cannot test it out. Is there any idea to resolve that memory issue?
Thank you,
Best Regards,
August 26, 2018 at 9:55 am
Cool2018 - Friday, August 24, 2018 8:10 PMDear All,Do you think Clustered columnstored index is suit for my case? I read Internet which said it suit for DW but I saw there is calculation included. Our case is just to join and extract. It's no need to calculate or no need group by. What I mean is our query is not like sql from BI. However our table has 8mil+ rows and about 350 columns. Is there any idea please? I still cannot create that index in my local computer. There is memory error. I cannot test it out. Is there any idea to resolve that memory issue?
Thank you,
Best Regards,
What is the purpose of producing this "data"? Is this for some kind of Data Warehouse?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply