How to optimise the table performanace (90 lacs records)

Question

How to optimise the table performanace (90 lacs records)

sethumail2000

SSC Enthusiast

Points: 127
More actions
June 16, 2020 at 11:49 am

#3763416

Hi,
I have a table with 5000 users along with 6 years of attendance data. If I want to retrieve user data from the table it takes time to load.
Table Structure
Usercode, Rosterdate, Shift code, AttendanceCode
Case 1
If i want to retrieve the shift information from 1-jun-2020 to 10-jun-2020 for 5000 users,
would take time to retrieve the data in SQL server.
what kind of index should i update in above table?
Regards
Sethu

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 399381 More actions · Answer 1

This is a very general question, so I'm going to give a very general answer.

If you intend to filter your information based on a particular column, one presumes Rosterdate from the description, then the index key goes on that column.

However, there is more to it than that. Is this a clustered table or a heap? What's the clustered index key? With just an index on Rosterdate, you're looking at either an key lookup or an RID lookup, which could negatively affect performance. So, you might want to consider adding INCLUDE columns to your index on Rosterdate to cover the ShiftCode column (since you said that's what you want). However, guessing, you also want to see the Usercode and AttendeanceCode? Then those should also be included in the INCLUDE. At this point, you've just recreated a clustered index.

So, is the table a heap? If so, then, create a clustered index on the Rosterdate column and your done.

However, there's so much more to picking indexes and defining data storage. I'd strongly suggest doing a lot more studying on the topic.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

sethumail2000 SSC Enthusiast Points: 127 More actions · Answer 2

Hi Grant,

Thanks for the reply. My table is not a heap and we created a index on rosterdate, usercode and rosterdate, usercode & Shiftcode.

Most of the reports based on these two combinations. For example , clients wants to show the list of employees with shift and attendance information between last 2 months and shift belongs to "Working day". My query like "select <<list of fields>> from LeaveRoster where rosterdate between <<from>> and <<to>> and shift = 'workingday'

Total No employees: 5000

Total No.Records : 80Lac approximately(both active & inactive employee attendance records)

This reply was modified 5 years, 9 months ago by Grant Fritchey. Reason: Fixed some typos

sterling3721 SSChasing Mays Points: 658 More actions · Answer 3

sterling3721

SSChasing Mays

Points: 658

June 16, 2020 at 1:38 pm

#3763450

what's "lac"?

Ant-Green SSC Guru Points: 113567 More actions · Answer 4

@sterling3721 - lac is (forgive the dialect - I believe Indian/Gujarati) for thousand

@sethumail2000-2 - those two indexes, are essentially left key subsets, I would very much recommend removing the index with 2 columns if these are to stay, duplicated indexes like this actually hurt rather then improve as they require more storage, IO, CPU, RAM etc to keep up to date with changes, so no point having two indexes where the key columns are in the same order.

rosterdate, usercode

rosterdate, usercode & Shiftcode

Without seeing statistics and execution plans its tricky to see what is the best indexing strategy.

But if your query is on ROSTERDATE and SHIFTCODE, then neither of the above indexes fully satisfy the need as it can seek on ROSTERDATE but then cant use the rest of the index to get to SHIFTCODE as your missing USERCODE.

It they also comes down to distribution and which is more selective, 90,000 records but only 10 unique SHIFTCODES then put ROSTERDATE on the left of the index.

But for the query

select <<list of fields>> from LeaveRoster where rosterdate between <<from>> and <<to>> and shift = 'workingday'

I would say there should be an index on RosterDate & ShiftCode

pietlinden SSC Guru Points: 63642 More actions · Answer 5

pietlinden

SSC Guru

Points: 63642

June 16, 2020 at 2:51 pm

#3763478

1 lakh = 100,000

Grant Fritchey SSC Guru Points: 399381 More actions · Answer 6

Anthony has nailed it. Those indexes are basically duplicates. Potentially, in the event of scans, you may, sometimes, see that smaller indexed used because scanning fewer pages is better. However, except for that, you'll see that the second index will work for queries that only refer to the two columns.

Are you looking at the execution plans to understand how your indexes are being used?

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

drew.allen SSC Guru Points: 77014 More actions · Answer 7

Lakh is derived from the Indo-European word for "salmon" and is related to the words "lox" and 'lacquer". It should be obvious how "lox" is related to "salmon". "Lacquer" was derived from the similarity in red color to salmon. "Lakh" has a lengthier derivation. When the Indic people moved away from the rivers where salmon were plentiful, the meaning became abstracted from specifically salmon to an attribute of salmon - their great numbers. Eventually, this became a specific great number - 100,000.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA