Partitioned Indexes

Question

Post reply

Partitioned Indexes

christian_t

Hall of Fame

Points: 3588
More actions
December 23, 2010 at 4:00 am

#229880

hi guys,
I have a DWH were I partitioned by customerno. (blocks like this 10000, 20000, 30000 etc.)
Now I have a lot of queries that are using historizied table, so they check the valid_from, valid_to dates.
So you usually built Indexes that sort the data in the right way
My e.g. valid_from asc, valid_to asc, customerno asc (include a few attributes)
As I want to align the indexes with my table I have to use the same partition_function and _scheme.
Lets make an example, where we just look at one partition:
customerno valid_from valid_to
110101 2010-12-01 2010-12-05
110101 2010-12-06 2010-12-07
110101 2010-12-08 2010-12-09
110102 2010-12-01 2010-12-04
110102 2010-12-05 2010-12-09
110102 2010-12-10 2010-12-12
My Index should sort it like that:
110102 2010-12-01 2010-12-04
110101 2010-12-01 2010-12-05
110102 2010-12-05 2010-12-09
110101 2010-12-06 2010-12-07
110101 2010-12-08 2010-12-09
110102 2010-12-10 2010-12-12
Now as my partition-key is the customerno it could implies a certain sort. So I was wondering, if the partitioned index is doing the sort over my selected index colums (valid_from, valid_to) and just uses the partition-scheme as "hint" in which partition to put the data, or if he first uses the partitionfunction "sort over customerno" and than my index-sort?
This would have quite an impact as I should re-think to use a more "general" partion-column that only gives hints which partition to use but doesn't implies a sort in depth. So that I could built indexes that are really of use
Thanx for your advice

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

christian_t Hall of Fame Points: 3588 More actions · Answer 1

So Many Hits no answer... Is that a Problem that no One has or no One knows an answer to? :w00t:

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 2

mitch.fh (12/23/2010)
So Many Hits no answer... Is that a Problem that no One has or no One knows an answer to? :w00t:

I think people may be waiting for you to look at the actual execution plan and determine the answer yourself.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

christian_t Hall of Fame Points: 3588 More actions · Answer 3

Thanks for your advice.

I would like to read sth out of the execution plan, but I do not see anything out of the index creation index plan 🙁 or which execution plan do you mean?

<?xml version="1.0" encoding="utf-16"?>

<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.1" Build="10.0.4000.0" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">

<BatchSequence>

<Batch>

<Statements>

<StmtSimple StatementCompId="1" StatementId="1" StatementText="

CREATE NONCLUSTERED INDEX [IX_ST_address_valid_from_valid_to_user_id] ON [dbo].[ST_address]

(

[valid_from] ASC,

[valid_to] ASC,

[user_id] ASC

)

INCLUDE ( [country_id],

[state_id],

[zip_code],

[addressline1],

[city], [part_key_id]) WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = ON, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90, DATA_COMPRESSION = PAGE)

ON [ST_address_index_partition_scheme]([part_key_id])

" StatementType="CREATE INDEX" />

</Statements>

</Batch>

</BatchSequence>

</ShowPlanXML>

PaulB-TheOneAndOnly SSC Guru Points: 66772 More actions · Answer 4

mitch.fh (12/23/2010)
So Many Hits no answer... Is that a Problem that no One has or no One knows an answer to? :w00t:

You know the old saying... "if you have nothing good to say better say nothing" but, if you insist I would say that there is an underlying - more basic - design issue. What would be the reason to partition DWH tables by customer blocks? I bet that partition strategy doesn't help during archiving and purging and doesn't help during reporting either - those are the two primary reasons to focus on when choosing a partition key.

Well... that's the reason why I'm saying nothing 😀

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

christian_t Hall of Fame Points: 3588 More actions · Answer 5

Thanks Paul, I am open for constructive criticism though.

The idea for partitioning is that I have about 100 million users from different portals united in one dwh.

Load processes are seperated by portal and in the beginning most queries where filtered by a portal. So partitioning made sense in my head.

But I more and more have queries that are portal overlapping. (60% now)

In my example I simplyfied my partitioning rule. I have an partitioning-key that is combinding portal and customerno

portal_id * 10,000,000,000 + customerno. And the function is using the portal to create the blocks. so e.g.

right from

portal1 10,000,000,000

portal2 20,000,000,000

portal3 30,000,000,000

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 6

mitch.fh (12/27/2010)
Thanks for your advice.
I would like to read sth out of the execution plan, but I do not see anything out of the index creation index plan 🙁 or which execution plan do you mean?
<?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.1" Build="10.0.4000.0" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
<BatchSequence>
<Batch>
<Statements>
<StmtSimple StatementCompId="1" StatementId="1" StatementText="
CREATE NONCLUSTERED INDEX [IX_ST_address_valid_from_valid_to_user_id] ON [dbo].[ST_address]
(
[valid_from] ASC,
[valid_to] ASC,
[user_id] ASC
)
INCLUDE ( [country_id],
[state_id],
[zip_code],
[addressline1],
[city], [part_key_id]) WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = ON, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90, DATA_COMPRESSION = PAGE)
ON [ST_address_index_partition_scheme]([part_key_id])
" StatementType="CREATE INDEX" />
</Statements>
</Batch>
</BatchSequence>
</ShowPlanXML>

Right idea but I was looking for the execution plan of the query against the data that caused you to ask the following question...

Now as my partition-key is the customerno it could implies a certain sort. So I was wondering, if the partitioned index is doing the sort over my selected index colums (valid_from, valid_to) and just uses the partition-scheme as "hint" in which partition to put the data, or if he first uses the partitionfunction "sort over customerno" and than my index-sort?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

christian_t Hall of Fame Points: 3588 More actions · Answer 7

Hi Jeff,

I created an index to test it. I realised that I had to put the part_key_id in to the "sorted" columns, so it looks like that:

CREATE NONCLUSTERED INDEX [IX_ST_activity_valid_from_valid_to_customerno] ON [dbo].[ST_activity]

(

[valid_from] ASC,

[valid_to] ASC,

[portal_id] ASC

[customerno] ASC,

[part_key_id] ASC <-- combination of portal and customer as discribed above

)

INCLUDE ( [act_status_id])

WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = ON, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)

ON [ST_activity_index_partition_scheme]([part_key_id])

This is my example-query:

DECLARE @data_date date = '2008-06-15';

SELECT sa.customerno, sa.act_status_id

FROM dbo.ST_activity AS sa

WHERE sa.valid_from <= @data_date AND sa.valid_to >= @data_date

The Execution Plan of the example-Querie looks like that

<?xml version="1.0" encoding="utf-16"?>

<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.1" Build="10.0.4000.0" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">

<BatchSequence>

<Batch>

<Statements>

<StmtSimple StatementCompId="1" StatementId="1" StatementText="DECLARE @data_date date = '2010-06-15'; " StatementType="ASSIGN" />

</Statements>

<Statements>

<StmtSimple StatementCompId="2" StatementEstRows="17418000" StatementId="2" StatementOptmLevel="FULL" StatementSubTreeCost="235.169" StatementText=" SELECT sa.customerno, sa.act_status_id FROM dbo.ST_activity AS sa WHERE sa.valid_from <= @data_date AND sa.valid_to >= @data_date " StatementType="SELECT" QueryHash="0xE4B0A2E72089E207" QueryPlanHash="0xA03C41B788CD8D5B">

<StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />

<QueryPlan CachedPlanSize="16" CompileTime="2" CompileCPU="2" CompileMemory="240">

<RelOp AvgRowSize="17" EstimateCPU="6.06058" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="17418000" LogicalOp="Gather Streams" NodeId="0" Parallel="true" PhysicalOp="Parallelism" EstimatedTotalSubtreeCost="235.169">

<OutputList>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="customerno" />

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="act_status_id" />

</OutputList>

<Parallelism>

<RelOp AvgRowSize="20" EstimateCPU="7.98406" EstimateIO="217.641" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="17418000" LogicalOp="Index Seek" NodeId="1" Parallel="true" Partitioned="true" PhysicalOp="Index Seek" EstimatedTotalSubtreeCost="225.625" TableCardinality="193534000">

<OutputList>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="customerno" />

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="act_status_id" />

</OutputList>

<IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" NoExpandHint="false">

<DefinedValues>

<DefinedValue>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="customerno" />

</DefinedValue>

<DefinedValue>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="act_status_id" />

</DefinedValue>

</DefinedValues>

<Object Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Index="[IX_ST_activity_valid_from_valid_to_portal_id_customerno]" Alias="[sa]" IndexKind="NonClustered" />

<SeekPredicates>

<SeekPredicateNew>

<SeekKeys>

<StartRange ScanType="GE">

<RangeColumns>

<ColumnReference Column="PtnId1000" />

</RangeColumns>

<RangeExpressions>

<ScalarOperator ScalarString="(1)">

<Const ConstValue="(1)" />

</ScalarOperator>

</RangeExpressions>

</StartRange>

<EndRange ScanType="LE">

<RangeColumns>

<ColumnReference Column="PtnId1000" />

</RangeColumns>

<RangeExpressions>

<ScalarOperator ScalarString="(6)">

<Const ConstValue="(6)" />

</ScalarOperator>

</RangeExpressions>

</EndRange>

</SeekKeys>

<SeekKeys>

<EndRange ScanType="LE">

<RangeColumns>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="valid_from" />

</RangeColumns>

<RangeExpressions>

<ScalarOperator ScalarString="[@data_date]">

<Identifier>

<ColumnReference Column="@data_date" />

</Identifier>

</ScalarOperator>

</RangeExpressions>

</EndRange>

</SeekKeys>

</SeekPredicateNew>

</SeekPredicates>

<Predicate>

<ScalarOperator ScalarString="[DWH_DB].[dbo].[ST_activity].[valid_to] as [sa].[valid_to]>=[@data_date]">

<Compare CompareOp="GE">

<ScalarOperator>

<Identifier>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="valid_to" />

</Identifier>

</ScalarOperator>

<ScalarOperator>

<Identifier>

<ColumnReference Column="@data_date" />

</Identifier>

</ScalarOperator>

</Compare>

</ScalarOperator>

</Predicate>

</IndexScan>

</RelOp>

</Parallelism>

</RelOp>

</QueryPlan>

</StmtSimple>

</Statements>

</Batch>

</BatchSequence>

</ShowPlanXML>

Then I created a index that is just using the portal_id for partitioning (with another function and scheme!)

CREATE NONCLUSTERED INDEX [IX_ST_activity_valid_from_valid_to_customerno] ON [dbo].[ST_activity]

(

[valid_from] ASC,

[valid_to] ASC,

[portal_id] ASC

[customerno] ASC

)

INCLUDE ( [act_status_id])

WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = ON, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)

ON [ST_activity_index_with_portal_id_partition_scheme]([portal_id])

Using this index the execution plan looks like that:

<?xml version="1.0" encoding="utf-16"?>

<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.1" Build="10.0.4000.0" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">

<BatchSequence>

<Batch>

<Statements>

<StmtSimple StatementCompId="1" StatementId="1" StatementText="DECLARE @data_date date = '2010-06-15'; " StatementType="ASSIGN" />

</Statements>

<Statements>

<StmtSimple StatementCompId="2" StatementEstRows="17418000" StatementId="2" StatementOptmLevel="FULL" StatementSubTreeCost="235.169" StatementText=" SELECT sa.customerno, sa.act_status_id FROM dbo.ST_activity AS sa WHERE sa.valid_from <= @data_date AND sa.valid_to >= @data_date " StatementType="SELECT" QueryHash="0xE4B0A2E72089E207" QueryPlanHash="0x958222DFD826C9A1">

<StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />

<QueryPlan CachedPlanSize="16" CompileTime="5" CompileCPU="5" CompileMemory="240">

<RelOp AvgRowSize="17" EstimateCPU="6.06058" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="17418000" LogicalOp="Gather Streams" NodeId="0" Parallel="true" PhysicalOp="Parallelism" EstimatedTotalSubtreeCost="235.169">

<OutputList>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="customerno" />

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="act_status_id" />

</OutputList>

<Parallelism>

<RelOp AvgRowSize="20" EstimateCPU="7.98406" EstimateIO="217.641" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="17418000" LogicalOp="Index Seek" NodeId="1" Parallel="true" Partitioned="true" PhysicalOp="Index Seek" EstimatedTotalSubtreeCost="225.625" TableCardinality="193534000">

<OutputList>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="customerno" />

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="act_status_id" />

</OutputList>

<IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" NoExpandHint="false">

<DefinedValues>

<DefinedValue>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="customerno" />

</DefinedValue>

<DefinedValue>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="act_status_id" />

</DefinedValue>

</DefinedValues>

<Object Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Index="[IX_ST_activity_valid_from_valid_to_portal_id_customerno2]" Alias="[sa]" IndexKind="NonClustered" />

<SeekPredicates>

<SeekPredicateNew>

<SeekKeys>

<StartRange ScanType="GE">

<RangeColumns>

<ColumnReference Column="PtnId1004" />

</RangeColumns>

<RangeExpressions>

<ScalarOperator ScalarString="(1)">

<Const ConstValue="(1)" />

</ScalarOperator>

</RangeExpressions>

</StartRange>

<EndRange ScanType="LE">

<RangeColumns>

<ColumnReference Column="PtnId1004" />

</RangeColumns>

<RangeExpressions>

<ScalarOperator ScalarString="(6)">

<Const ConstValue="(6)" />

</ScalarOperator>

</RangeExpressions>

</EndRange>

</SeekKeys>

<SeekKeys>

<EndRange ScanType="LE">

<RangeColumns>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="valid_from" />

</RangeColumns>

<RangeExpressions>

<ScalarOperator ScalarString="[@data_date]">

<Identifier>

<ColumnReference Column="@data_date" />

</Identifier>

</ScalarOperator>

</RangeExpressions>

</EndRange>

</SeekKeys>

</SeekPredicateNew>

</SeekPredicates>

<Predicate>

<ScalarOperator ScalarString="[DWH_DB].[dbo].[ST_activity].[valid_to] as [sa].[valid_to]>=[@data_date]">

<Compare CompareOp="GE">

<ScalarOperator>

<Identifier>

<ColumnReference Database="[DWH_DB]" Schema="[dbo]" Table="[ST_activity]" Alias="[sa]" Column="valid_to" />

</Identifier>

</ScalarOperator>

<ScalarOperator>

<Identifier>

<ColumnReference Column="@data_date" />

</Identifier>

</ScalarOperator>

</Compare>

</ScalarOperator>

</Predicate>

</IndexScan>

</RelOp>

</Parallelism>

</RelOp>

</QueryPlan>

</StmtSimple>

</Statements>

</Batch>

</BatchSequence>

</ShowPlanXML>

Could you tell me, where I can read sth out of it?

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 8

I was hoping that someone more skilled at reading XML execution plans than I would stop by to answer your question.

I see that the partitions are coming into play after the index but I don't believe it's working to any advantage because the query is necessarily doing an IndexScan.

I'm not sure that answers your original question, though.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

PaulB-TheOneAndOnly SSC Guru Points: 66772 More actions · Answer 9

mitch.fh (12/27/2010)
Thanks Paul, I am open for constructive criticism though.
The idea for partitioning is that I have about 100 million users from different portals united in one dwh.
Load processes are seperated by portal and in the beginning most queries where filtered by a portal. So partitioning made sense in my head.
But I more and more have queries that are portal overlapping. (60% now)
In my example I simplyfied my partitioning rule. I have an partitioning-key that is combinding portal and customerno
portal_id * 10,000,000,000 + customerno. And the function is using the portal to create the blocks. so e.g.
right from
portal1 10,000,000,000
portal2 20,000,000,000
portal3 30,000,000,000

That's a very different story Mitch - Thank you for the clarification.

I can see current partitioning strategy being helpful during ETL - particularly if ETL for different portals are running concurrently; provided different partitions are sitting in different spins.

Question. Do you rely in partition switching or any feature that forces you to have indexes aligned with base table partitioning strategy?

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 10

There is one VERY important point you are missing about table partitioning: it is NOT primarily about performance, but rather for MAINTENANCE/MANAGEMENT. If it also happens to allow for partition elimination for some queries, that is great.

If you want best query performance - partition by date. If you want best loading and managment, partition by customerno. You really cannot have it both ways unless you make a separate copy of the data after loading into another table partitioned by date.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

PaulB-TheOneAndOnly SSC Guru Points: 66772 More actions · Answer 11

TheSQLGuru (12/28/2010)
If you want best query performance - partition by date. If you want best loading and managment, partition by customerno. You really cannot have it both ways unless you make a separate copy of the data after loading into another table partitioned by date.

I agree you can't have it both ways - as I already pointed out - but, isn't proposed strategy a self-defeating one?

Lets see... ETL against first table to take advantage of ETL-friendly partitioning strategy then, to maximize reporting performance ETL against second table - which has a reporting-friendly partitioning strategy.

mmhhh... isn't first ETL process 100% overhead? 😀

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

christian_t Hall of Fame Points: 3588 More actions · Answer 12

PaulB-TheOneAndOnly (12/28/2010)
Question. Do you rely in partition switching or any feature that forces you to have indexes aligned with base table partitioning strategy?

There is just one table that needs to be aligned as I use a dynamic partition switching (big table 1 billion rows) and this one is partitioned by portal + date. Every day I through out one date and put a new one in, so I have a "classic" sliding window there.

The other tables don't need alignment in the sliding-window-way as I don't have to delete whole partitions. All normal tables have separate filegroups for the indexes.

There is another question poping up though: Isn't it better to use alignment for performance reasons?

I agree you can't have it both ways - as I already pointed out - but, isn't proposed strategy a self-defeating one?

But I want BOTH 😀

Now seriously: I cannot effort to copies of the data as it would take to much space. So I need a solution that serves the Load-ETLs and the Report-ETLs.

My original question was aiming on the partitioning rule.

The partition-key I constructed had to goals: 1. Partitioning by Portal 2. Sorting the data in the clustered index by customerno because new customers get always a higher cus_no and it is my main join connection.

If the delicate partitionkey is causing a sort problem for indexes that sort by valid_from/valid_to than I would use another partitionkey - maybe just the portal_id. The data won't be sorted by customerno anymore in the clustered index but the non clustered index will take care sorting.

If the delicate partitionkey is not creating any sort problem then I will keep my partitioning key.

A compromise between both solutions would be - as I have no necessity for alignment - to use a different partition function/scheme combination for indexes. In this function I would just use the portal for partitioning and the index will be sorted by the columns needed.

Johan Bijnens SSC Guru Points: 135369 More actions · Answer 13

IMO aligned indexes provide both performance and recovery advantages:

- performance: because the are physically split like their corresponding table, so the engine will only use the part needed by the query, split per partition if possible. Eliminating IO.

parallelism will also take advantage of this.

- recovery: if you perform a partial restore, all aligned indexes might be used with the restored partition if you can direct the query to use only that partition.

a non-aligned index would need the full table to be restored to be usable.

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 14

TheSQLGuru (12/28/2010)
There is one VERY important point you are missing about table partitioning: it is NOT primarily about performance, but rather for MAINTENANCE/MANAGEMENT. If it also happens to allow for partition elimination for some queries, that is great.
If you want best query performance - partition by date. If you want best loading and managment, partition by customerno. You really cannot have it both ways unless you make a separate copy of the data after loading into another table partitioned by date.

Hmmm... I wonder if there IS a way to have it both ways without duplicating the data. I've gotta try a couple of things when I get some time because this actually comes up a lot in the world of heavy metal.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)