September 25, 2015 at 5:04 am
Hello Guys,
I got a requirement from the business that they need pagination for an application.
The stored procedure takes ~10 seconds per call. The execution plan shows index seeks or clustered index scans almost everywhere, with no table spools, lazy spools, or key lookups. I cannot share the code or execution plan due to company rules and regulations. Any help is appreciated.
I need help with changing the business logic of my T-SQL code.
The issue is that I am dealing with a result set of around 50,000 records. Out of these I have to return 20 records at a time (the page size is also customizable, i.e. 40/60/150 records per page). The application cannot handle all 50k records, so I have to return 20 records for every stored procedure call.
The result set changes as per the start date and end date, which I receive as parameters.
In the application there are a few column filters, namely Country (around 50 countries) and Outcome (around 6 to 10 values). These filters show values in a drop-down (as in Excel) depending on the distinct values in those columns. The filters are populated on every page if no filter value is selected.
The issue is that whenever the user sorts or filters the records, this stored procedure is called, and every time I have to deal with ~50,000 records.
Current code:
Step 1) Get the required result set into a temp table.
Step 2) Compute the results based on some business rules (Outcome and SharesAvailable calculation - see attachment).
Step 3) Populate the filter columns (Country, Outcome); these values will be comma separated.
Step 4) Dynamic query to get the required result set, i.e. if the user wants only 10 records on a single page then TOP 10. Sorting can be applied on any column mentioned in the screenshot.
Please let me know if any other information is needed.
Thanks,
Myzus
September 25, 2015 at 7:19 am
One possible solution that springs to mind is to use the ROW_NUMBER function in a CTE, with parameters passed in for RowNumberFrom and RowNumberTo, and your output SQL using these in a WHERE clause. It all depends on the app being able to pass the parameters. Alternative parameters might be PageNumber and PageLength, from which you could derive RowNumberFrom and RowNumberTo.
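A minimal sketch of that approach, using a hypothetical dbo.Meetings table and made-up parameter values (on SQL Server 2012 and later, OFFSET ... FETCH is the built-in equivalent):

DECLARE @StartDate DATE = '20150101', @EndDate DATE = '20151231';
DECLARE @PageNumber INT = 3, @PageLength INT = 20;
DECLARE @RowNumberFrom INT = (@PageNumber - 1) * @PageLength + 1,
        @RowNumberTo   INT = @PageNumber * @PageLength;

WITH Numbered AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY MeetingDate, MeetingID) AS rn,
           MeetingID, CompanyName, MeetingDate
    FROM dbo.Meetings  -- hypothetical table, for illustration only
    WHERE MeetingDate BETWEEN @StartDate AND @EndDate
)
SELECT MeetingID, CompanyName, MeetingDate
FROM Numbered
WHERE rn BETWEEN @RowNumberFrom AND @RowNumberTo
ORDER BY rn;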
September 25, 2015 at 7:45 am
myzus2004 (9/25/2015)
Hello guys, I got a requirement from the business that they need pagination for an application.
The stored procedure takes ~10 seconds per call. The execution plan shows index seeks or clustered index scans almost everywhere, with no table spools, lazy spools, or key lookups. I cannot share the code or execution plan due to company rules and regulations. Any help is appreciated.
I need help with changing the business logic of my T-SQL code.
The issue is that I am dealing with a result set of around 50,000 records. Out of these I have to return 20 records at a time (the page size is also customizable, i.e. 40/60/150 records per page). The application cannot handle all 50k records, so I have to return 20 records for every stored procedure call.
The result set changes as per the start date and end date, which I receive as parameters.
In the application there are a few column filters, namely Country (around 50 countries) and Outcome (around 6 to 10 values). These filters show values in a drop-down (as in Excel) depending on the distinct values in those columns. The filters are populated on every page if no filter value is selected.
The issue is that whenever the user sorts or filters the records, this stored procedure is called, and every time I have to deal with ~50,000 records.
Current code:
Step 1) Get the required result set into a temp table.
Step 2) Compute the results based on some business rules (Outcome and SharesAvailable calculation - see attachment).
Step 3) Populate the filter columns (Country, Outcome); these values will be comma separated.
Step 4) Dynamic query to get the required result set, i.e. if the user wants only 10 records on a single page then TOP 10. Sorting can be applied on any column mentioned in the screenshot.
Please let me know if any other information is needed.
Thanks,
Myzus
So you have a performance problem with a query and you need help, but you can't provide any details about the query that would be useful in making it faster. How exactly do you anticipate anybody can actually help?
From what you posted it sounds like you have some potential issues. Index scans are NOT good in general; a scan indicates that every row of the index is being read when it might be possible to seek. Also, you mention comma-delimited values. Are you parsing or splitting those values? That could be another performance hog.
It could be literally anything, since you can't share any details. SARGability? Stale statistics? Fragmented indexes? Parameter sniffing? Who knows? I hope you are able to figure this out, but nobody can really help much from here.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 25, 2015 at 8:21 am
Thanks, Sean, for the reply. I appreciate you taking the time to respond.
I understand it's difficult to judge performance from what I have described, but my company doesn't allow code to be published online. I will try to get approval to post the code.
Input parameters: StartDate, EndDate, PageNumber, NoofRows, SortByColumn, CountryList (multiple values - can be comma separated), Outcome (multiple values - can be comma separated).
I just want better logic - do I always have to select all 50k records? Also, the values for some columns need to be comma separated.
For example, with input:
MeetingID   Country
---------   ---------
1123        Japan
1225        Australia
15526       Japan
the output will be:
MeetingID   Country
---------   ----------------
1123        Japan, Australia
1225        Japan, Australia
15526       Japan, Australia
I am doing this using FOR XML PATH. Is there any other alternative?
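(For reference, a minimal sketch of that FOR XML PATH pattern, reproducing the example above; #Meeting is a hypothetical stand-in, and note that every row receives the full distinct list, exactly as shown.)

CREATE TABLE #Meeting (MeetingID INT, Country VARCHAR(100));
INSERT INTO #Meeting VALUES (1123, 'Japan'), (1225, 'Australia'), (15526, 'Japan');

SELECT m.MeetingID,
       STUFF((SELECT ', ' + t.Country
              FROM (SELECT DISTINCT Country FROM #Meeting) t
              FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'),
             1, 2, '') AS Country  -- STUFF strips the leading ', '
FROM #Meeting m;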
The stats and indexes are up to date.
Thanks,
Myzus
September 25, 2015 at 8:27 am
One thing you might try is using ROW_NUMBER or some other counter based on the rows to retrieve only the PKs needed for that set of data, then join that back to get the specific rows from that set.
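A rough sketch of the keys-first idea, borrowing table and column names that appear later in this thread (treat them, and the assumption that MeetingID identifies a row, as guesses):

DECLARE @StartDate DATE = '20150101', @EndDate DATE = '20151231';
DECLARE @PageNumber INT = 1, @PageLength INT = 20;

WITH Keys AS
(
    -- narrow pass: number only the keys in the requested date window
    SELECT MeetingID,
           ROW_NUMBER() OVER (ORDER BY MeetingDate, MeetingID) AS rn
    FROM dbo.VDSMeetingInformation
    WHERE MeetingDate BETWEEN @StartDate AND @EndDate
)
SELECT mi.MeetingID, mi.CompanyName, mi.MeetingDate  -- wide columns fetched for one page only
FROM Keys k
INNER JOIN dbo.VDSMeetingInformation mi ON mi.MeetingID = k.MeetingID
WHERE k.rn BETWEEN (@PageNumber - 1) * @PageLength + 1 AND @PageNumber * @PageLength
ORDER BY k.rn;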
This really requires some good indexing. We can help, as Sean pointed out, but we need information on what you've got set up for tables and data.
September 25, 2015 at 8:31 am
myzus2004 (9/25/2015)
CountryList (multiple values - can be comma separated), Outcome (multiple values - can be comma separated)
Let's start with this... how do you deal with delimited values? Do you have a function that does this? Does it have a cursor or a while loop? If it is a table-valued function, does it have more than one statement?
September 25, 2015 at 5:03 pm
myzus2004 (9/25/2015)
The execution plan shows index seeks or clustered index scans almost everywhere
Note that "clustered index scan" is a fancy name for "table scan".
--Jeff Moden
Change is inevitable... Change for the better is not.
September 26, 2015 at 5:23 am
Thanks, all, for sharing your ideas and opinions.
I have attached a few costly queries. The query number is the same as the query number in the execution plan.
The time statistics of these costly queries are mentioned at the start of each SQL file. The execution plan is also attached.
The DDL of the table and its index is in the DDL script file. This table has close to 5 million records for only 2 customers: one customer has 250,000 records, and the remaining 4,750,000 belong to the other. The query is run for one specific customer at a time. (Would partitioning this table on CustomerID help?)
The script of the split function used in the queries is also attached. I will try replacing my split function with Jeff Moden's splitter.
The result set of this stored procedure (SP) averages around 50,000 rows, depending on the parameters (StartDate, EndDate, FundIds, CompanyName). By default only the start date and end date have values; the rest are NULL.
The SP takes around 15 seconds for the 1st customer and more than 90 seconds for the other customer (4.7 million records).
I would like the timings to come down to 2-4 seconds and 20-25 seconds respectively.
MAXDOP is set to 4 on the Prod server, and the DBAs are considering at least 4 data files for tempdb.
Let me know if any other information is needed.
Thanks,
Mihir
September 28, 2015 at 8:05 am
myzus2004 (9/26/2015)
Thanks, all, for sharing your ideas and opinions. I have attached a few costly queries. The query number is the same as the query number in the execution plan.
The time statistics of these costly queries are mentioned at the start of each SQL file. The execution plan is also attached.
The DDL of the table and its index is in the DDL script file. This table has close to 5 million records for only 2 customers: one customer has 250,000 records, and the remaining 4,750,000 belong to the other. The query is run for one specific customer at a time. (Would partitioning this table on CustomerID help?)
The script of the split function used in the queries is also attached. I will try replacing my split function with Jeff Moden's splitter.
The result set of this stored procedure (SP) averages around 50,000 rows, depending on the parameters (StartDate, EndDate, FundIds, CompanyName). By default only the start date and end date have values; the rest are NULL.
The SP takes around 15 seconds for the 1st customer and more than 90 seconds for the other customer (4.7 million records).
I would like the timings to come down to 2-4 seconds and 20-25 seconds respectively.
MAXDOP is set to 4 on the Prod server, and the DBAs are considering at least 4 data files for tempdb.
Let me know if any other information is needed.
Thanks,
Mihir
Thanks for sharing your information. You should start by throwing that splitter away; with 4.7 million rows it is going to suck the wind out of your sails. What you have is a multi-statement table-valued function. Those are usually even worse for performance than a scalar function. Add to that the loop inside it, and you have a recipe for a performance catastrophe. I would take a look at the splitter from Jeff Moden found at the link in my signature about splitting strings. If you really need to handle more than 8,000 characters of input, let me know; there are other splitters out there that can handle that, but they are not as performant as the one Jeff built.
For indexes: on your VDSMeetingInformation table, do you really need a single index with 24 included columns? That seems a bit of overkill to me.
It also seems that several of your queries have non-SARGable predicates in the joins and WHERE clauses. Once you start wrapping columns in functions like that, you force a scan instead of a seek.
I would start by swapping out that splitter and checking your performance again. There are certainly other things you can do, but that is probably the biggest issue.
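For what it's worth, a small sketch of what the swap might look like. DelimitedSplit8K is the inline table-valued function built in the article linked below (it returns ItemNumber and Item); the join shape here is an assumption, since only fragments of the real procedure have been posted:

SELECT mv.MeetingID, mv.FundID
FROM dbo.VDSMeetingInformation mv
INNER JOIN dbo.DelimitedSplit8K(@fund_id, ',') s
    ON mv.FundID = CAST(s.Item AS INT)  -- Item is VARCHAR; FundID appears to be INT
WHERE mv.MeetingDate BETWEEN @Start_date AND @End_date;
-- When @fund_id is NULL (no fund filter), skip the join entirely via IF/ELSE or
-- dynamic SQL rather than a CASE WHEN ... IS NULL trick in the join predicate,
-- which tends to defeat an index seek.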
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 28, 2015 at 12:36 pm
Unfortunately I can't get the files from my web link at work due to internal issues (seemingly never-ending!).
Rather than actual partitioning, try clustering the table on CustomerID, start date and end date.
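A sketch of what that might look like; the column names are guesses (the DDL attachment wasn't readable here), the posted fragments reach CustomerID through VDSFund so the table may not carry it directly, and if the current clustered index backs the primary key it would have to be dropped and recreated rather than simply replaced:

CREATE CLUSTERED INDEX CIX_VDSMeetingInformation
    ON dbo.VDSMeetingInformation (CustomerID, MeetingDate);  -- assumed columns; substitute the real date columns from the DDL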
If you could post the code as separate, unzipped files it would help me give you more details. I'm sorry I can't read it as is.
SQL DBA, SQL Server MVP (07, 08, 09). "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." - "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies."
October 1, 2015 at 12:34 am
Hi Sean, can you please point out the name of the function to use from the Jeff Moden splitter attachment?
All, can someone give me an alternative to the queries below?
Query 1:
Time stats for this query: CPU time = 5036 ms, elapsed time = 6029 ms.
The start date and end date of meetings are always available. The remaining parameters - CompanyName, Ticker, SecurityID (@Value1) - are optional.
Can we make this into a SARGable search?
To see the execution plan of this query, please refer to query no. 1 in the attached execution plan.
FROM dbo.VDSMeetingInformation mv
INNER JOIN vdsfn_SplitString(@fund_id, ',') fnsplt
    ON mv.FundID = COALESCE(fnsplt.splitdata, mv.FundID)
INNER JOIN dbo.VDSFund Vf
    ON mv.FundID = Vf.FundID
    AND Vf.NPxOnly = 0
    AND Vf.ActiveYN = 1
    AND Vf.CustomerID = @customer_id
INNER JOIN dbo.VDSCustomer Vc
    ON Vf.CustomerID = Vc.CustomerID
    AND Vc.CustomerID = @customer_id
LEFT OUTER JOIN dbo.VDSCustomerMeetingOverride cmo
    ON cmo.MeetingID = mv.MeetingID
LEFT OUTER JOIN dbo.VDSCustomerVoteOverride vcv
    ON vcv.MeetingID = mv.MeetingID
    AND vcv.FundID = Vf.FundID
    AND vcv.ItemOnAgendaID = mv.ItemOnAgendaID
WHERE mv.MeetingDate BETWEEN @Start_date AND @End_date
AND (
        COALESCE(cmo.CompanyName, mv.CompanyName)
            LIKE COALESCE(@Value1, cmo.CompanyName, mv.CompanyName) + '%'
    OR  COALESCE(NULLIF(cmo.Ticker, ''), NULLIF(mv.Ticker, ''), NULLIF(Vc.NullTickerOverride, ''))
            LIKE COALESCE(@Value1, NULLIF(cmo.Ticker, ''), NULLIF(mv.Ticker, ''), NULLIF(Vc.NullTickerOverride, '')) + '%'
    OR  mv.SecurityID LIKE COALESCE(@Value1, mv.SecurityID) + '%'
    )
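A hedged sketch of one common rewrite for this kind of optional-parameter ("catch-all") predicate: test the parameter for NULL instead of wrapping the columns in COALESCE, and add OPTION (RECOMPILE) so the optimizer compiles for the values actually supplied. Only the shape of the WHERE clause is shown, with the Ticker branch elided (it would follow the same pattern):

WHERE mv.MeetingDate BETWEEN @Start_date AND @End_date
AND (
        @Value1 IS NULL
    OR  COALESCE(cmo.CompanyName, mv.CompanyName) LIKE @Value1 + '%'
    OR  mv.SecurityID LIKE @Value1 + '%'
    )
OPTION (RECOMPILE);  -- plan is compiled for the parameter values actually passed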
Query 2:
To see the execution plan of this query, please refer to query no. 10 in the attached execution plan.
Time stats: CPU time = 905 ms, elapsed time = 1445 ms.
CREATE TABLE #FinalList
( RowNumber INT, MeetingID INT, CompanyName VARCHAR(80), Ticker VARCHAR(20),FundName VARCHAR(200),MeetingDate DATETIME,
Country VARCHAR(100),MeetingType VARCHAR(60),SecurityId VARCHAR(14),VoteFlag VARCHAR(60),PageNumber INT,TotalRows INT,
CountryList VARCHAR(MAX),MeetingTypeList VARCHAR(MAX),VotedList VARCHAR(MAX),SixDigitSectorType VARCHAR(50),FundId INT
)
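-- (the code that populates #FinalList between these two steps is not included in the excerpt)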
UPDATE pm
SET pm.MeetingTypeList = STUFF((SELECT ' || ' + CAST(temp.MeetingType AS VARCHAR(200)) [text()]
FROM (SELECT DISTINCT MeetingType FROM #FinalList) temp
FOR XML PATH(''), TYPE)
.value('.','VARCHAR(MAX)'),2,2,' ')
,pm.CountryList = STUFF((SELECT ' || ' + CAST(temp.Country AS VARCHAR(200)) [text()]
FROM (SELECT DISTINCT Country FROM #FinalList) temp
FOR XML PATH(''), TYPE)
.value('.','VARCHAR(MAX)'),2,2,' ')
,pm.VotedList = STUFF((SELECT ' || ' + CAST(temp.VoteFlag AS VARCHAR(200)) [text()]
FROM (SELECT DISTINCT VoteFlag FROM #FinalList) temp
FOR XML PATH(''), TYPE)
.value('.','VARCHAR(MAX)'),2,2,' ')
FROM #FinalList pm
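None of the three subqueries depend on the row being updated - every row receives the same list - so a possible alternative (a sketch, untested against the real data) is to build each list once into a variable and apply all three in a single UPDATE. As a side benefit, STUFF(..., 1, 4, '') cleanly strips the entire leading ' || ' delimiter:

DECLARE @MeetingTypes VARCHAR(MAX), @Countries VARCHAR(MAX), @Votes VARCHAR(MAX);

SELECT @MeetingTypes = STUFF((SELECT ' || ' + t.MeetingType
                              FROM (SELECT DISTINCT MeetingType FROM #FinalList) t
                              FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 4, '');

SELECT @Countries = STUFF((SELECT ' || ' + t.Country
                           FROM (SELECT DISTINCT Country FROM #FinalList) t
                           FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 4, '');

SELECT @Votes = STUFF((SELECT ' || ' + t.VoteFlag
                       FROM (SELECT DISTINCT VoteFlag FROM #FinalList) t
                       FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 4, '');

UPDATE #FinalList
SET MeetingTypeList = @MeetingTypes,
    CountryList     = @Countries,
    VotedList       = @Votes;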
Thank you all for any help you can offer. Bless you.
October 1, 2015 at 7:18 am
myzus2004 (10/1/2015)
Hi Sean, can you please point out the name of the function to use from the Jeff Moden splitter attachment?
Did you actually read the article or did you just want me to hand you the code?
All, can someone give me an alternative to the queries below?
To be honest, it seems some major overhaul of the entire process is required. There appear to be lots of temp tables, with everything being done one step at a time, and the query you are concerned with is the culmination of several less-than-optimal steps before it. It would take quite a bit of time to piece through all the DDL and queries here to understand what the procedure is doing before I could recommend where to improve performance. It might be money well spent to hire a consultant who specializes in T-SQL performance. I just don't have the time right now to dig through all that and figure it out.
October 2, 2015 at 2:43 am
Thanks for the reply, Sean. I did read the article, but after analyzing it, there is little chance of getting more than 10 comma-separated values, so the split function might be of limited benefit here. Jeff Moden's function will surely help in the future, though. Thanks for the guidance.
I got an alternative idea, which is a workaround for the whole stored procedure; the T-SQL code will remain the same.
The time for the website application to fetch this data has come down from 65 seconds to 4 seconds.
I am currently testing with different parameters and will share the idea once it succeeds and is approved by the testing team.
In the meantime, please feel free to contribute ideas about optimization - any suggestion is welcome.