Query on a large database

Question

Query on a large database

Viewing 15 posts - 46 through 60 (of 105 total)

You must be logged in to reply to this topic. Login to reply

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 1

Hi Gail

The tempdb example was not meant to indicate that there is some kind of partitioning involved but to point out that indexes are not the only way to improve read performance and that proper allocation and placement of files od disk also provide a improvement in performance.

With regard to clustered index and placement of rows on disk , I wanted to point out how we are able to move files from one file group to another ( and thus from one file to another ) by simply creating a clustered index on the partition scheme , this way clustered index does have a role to play in moving data on the disk and where it gets placed.

Mainly trying to highlight the difference between an allocation scan and traversing the b tree.

Jayanth Kurup[/url]

Sergiy SSC Guru Points: 110209 More actions · Answer 2

Jayanth_Kurup (10/23/2015)
Mainly trying to highlight the difference between an allocation scan and traversing the b tree.

Once again:

Can you show how a query for an age group for ages between @m and @N should look like?

_____________
Code for TallyGenerator

xsevensinzx One Orange Chip Points: 25560 More actions · Answer 3

Yeah, a covered index in my case would be highly expensive on a table with billion+ rows. But, I get the point.

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 4

Sergiy (10/23/2015)
Jayanth_Kurup (10/23/2015)
Mainly trying to highlight the difference between an allocation scan and traversing the b tree.
Once again:
Can you show how a query for an age group for ages between @m and @N should look like?

Here you go Sergiy

Screenshot for table with clustered index on ref_age vs unindexed heap that is just partitioned by ref_age

for a range of 5 ages

Let me know your thought, I would be happy to test out any other scenarios if you can provide me the scripts with details on how you want me to test.

EDIT :- to take things further aggregated some data over the indexed and the partitioned table to measure performance , when summarizing , partitioned table provides a minor improvement. screenshot attached.

Check out the screenshot for aggregated over non partitioned or indexed column. here I am not even using the partition key or index key column to filter the data , the the difference in performance is almost double.

Jayanth Kurup[/url]

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 5

Jayanth_Kurup (10/23/2015)
Hi Sergiy
I can demonstrate what I am explaining in my video , no gimmicks just plain tables and rows.If you have a few minutes , please add me on skype and I can demo the concept.
my skypeid is Jayanth (dot) Kurup at hotmail dot com
More than happy to demo the concept if it helps , Also the sample script and data are shared on google drive have you tried it?
PS:Big table resides in a single file , it can be made to sit in multiple files but without any control over which row goes into which file ( proportinal filling) ,
if you are going to quote me please do so completely.

What's wrong with building a test data generator and posting the code for that and for what you've done?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 6

Jayanth_Kurup (10/23/2015)
Here are the scripts , I usually create the video to show the steps to reproduce the issue, mostly for those cases where it more important to explain the concept than how i got there , the video is not meant to explain how partitioning works just that if you compare a table with and without partitioning the latter does provide benefits.
. As on linkedin I provide the code when requested it help keep things focused on the main point.
https://drive.google.com/folderview?id=0ByGed3QzAHjfZl9OTEdZRHZBdnc&usp=sharing

Ah... was reading the posts bottom up. Thanks. I'll check it out.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 7

Jayanth_Kurup (10/23/2015)
Hi Sergiy
Here is a screen shot with clustered index on the monolithic table on REF_AGE column as requested. Notice that it took 17 sec , just the same amount of time my partitioned heap table took without any indexes ( as shown in the video).

It looks like you returned a ton of rows to the screen, which would be the great equalizer in duration.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 8

Jeff Moden (10/23/2015)
Jayanth_Kurup (10/23/2015)
Hi Sergiy
Here is a screen shot with clustered index on the monolithic table on REF_AGE column as requested. Notice that it took 17 sec , just the same amount of time my partitioned heap table took without any indexes ( as shown in the video).
It looks like you returned a ton of rows to the screen, which would be the great equalizer in duration.

There are screenshot with aggregated results as well which use just two columns and return around 16 rows.

Jayanth Kurup[/url]

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 9

Took a quick peek at the code you posted.

EXEC sys.sp_db_vardecimal_storage_format N'partitioned', N'ON'

I could be misreading something but you do know that VarDecimal has been deprecated, yes?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 10

i havent changed the storage formats , its all at defaults , the only thing done extra is trace 1117

Jayanth Kurup[/url]

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 11

Jayanth_Kurup (10/23/2015)
Jeff Moden (10/23/2015)
Jayanth_Kurup (10/23/2015)
Hi Sergiy
Here is a screen shot with clustered index on the monolithic table on REF_AGE column as requested. Notice that it took 17 sec , just the same amount of time my partitioned heap table took without any indexes ( as shown in the video).
It looks like you returned a ton of rows to the screen, which would be the great equalizer in duration.
There are screenshot with aggregated results as well which use just two columns and return around 16 rows.

This is the screen shot I'm looking at and it contains a wee bit more than 16 rows.

You might also want to do an upgrade. According to that screen shot, you're still using 2012 RTM.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 12

Yes I am using DB engine version 2014 RTM

Jayanth Kurup[/url]

Eric M Russell SSC Guru Points: 125623 More actions · Answer 13

You mentioned the report should be accessible by multiple customers and should run within 15 seconds. Does each customer see the exact same report, or does each customer see a subset of records pertaining only to them? Some reporting tools like SSRS have a feature to cache reports for reuse, if for example you need to produce only one report each month and then make it accessible to multiple users.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

ScottPletcher SSC Guru Points: 101236 More actions · Answer 14

You must cluster the table properly to get good performance from large tables. You can instead try creating gazillions of covering indexes, to cover each (major) query, but that is a lot more maintenance effort. INSERTs than also take much longer, which can lead to blocking/deadlocking in other tasks.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jayanth_Kurup SSC-Insane Points: 22967 More actions · Answer 15

Just wondering if anybody was able to finish testing the scripts and if there are any new results ?

EDIT : BUMP, hoping for some independently verified results , also created a complete video with more scenarios in case any body is looking for a deep dive.

Jayanth Kurup[/url]