Counting Records

  • Is EntityCollection a partitioned table by any chance?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • its not partitioned, but i managed to sort this problem out.

    i added a non-clustered index on a smaller column and voila... i now have great performance

    thanks for all the advice guys... im loving this forum

  • sql_prodigy (5/13/2009)


    its not partitioned, but i managed to sort this problem out.

    i added a non-clustered index on a smaller column and voila... i now have great performance

    thanks for all the advice guys... im loving this forum

    That is nice, but it looks like a workaround to me. I'm still really interested to know if you still get 9 table scans even with that nonclustered index?

    If yes, can you post the .sqlplan as a zip file?

    Thanks,

    Cheers,

    J-F

  • I'm also curious to know if it does 9 scans on the nonclustered index. Can you post those stats up too?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • ill post that in a bit, just busy on something else....

  • As mentioned please could you post the query plan, and could you let us know how many rows are in the table or am I right in saying it's just one?

    Could you also post a create table sample for us so we can see what the table looks like. e.g any text fields etc, and of coarse what the index construction looks like.

    It seems that your work around might work because the non-clustered index is smaller than the clustered index and hence runs faster but that is guess...

    I'm also very interested in the scan count here

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • ok i was also interested to know how many scans were taking place since i added the non clustered index

    (1 row(s) affected)

    Table 'EntityCollection'. Scan count 1, logical reads 2351, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 39 ms.

  • Just whenever you get a chance. Thanks, prodigy.

    The great thing about this is we all get a deeper understanding of what's going on under the hood. That's one of the best things about SSC.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • @christopher, yes i the non clustered index was created on a smaller column:cool:

  • cool well when you get a chance, it would be nice to see the table and index definitions along with the query plan.

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • i get a count of 315000.

    this is the create table script.

    CREATE TABLE [dbo].[EntityCollection](

    [EntityIdentifierContext] [varchar](32) NOT NULL,

    [EntityIdentifier] [varchar](64) NOT NULL,

    [EntityIdentifierSequence] [varchar](32) NULL,

    [EntityIdentifierComposite] [varchar](128) NOT NULL,

    [Data] [ntext] NULL,

    [XML] [xml] NULL,

    CONSTRAINT [PK_Entity] PRIMARY KEY CLUSTERED

    (

    [EntityIdentifierComposite] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    this is the clustered index

    ALTER TABLE [dbo].[EntityCollection] ADD CONSTRAINT [PK_Entity] PRIMARY KEY CLUSTERED

    (

    [EntityIdentifierComposite] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    this is the non clustered index i added

    CREATE NONCLUSTERED INDEX [idx_entity_identifier] ON [dbo].[EntityCollection]

    (

    [EntityIdentifier] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

  • this is the query plan without the non clustered index which produced 9 table scans

    |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[globalagg1005],0)))

    |--Stream Aggregate(DEFINE:([globalagg1005]=SUM([partialagg1004])))

    |--Parallelism(Gather Streams)

    |--Stream Aggregate(DEFINE:([partialagg1004]=Count(*)))

    |--Clustered Index Scan(OBJECT:([Wallet0000].[dbo].[EntityCollection].[PK_Entity]))

    this is with the index i added

    |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[Expr1004],0)))

    |--Stream Aggregate(DEFINE:([Expr1004]=Count(*)))

    |--Index Scan(OBJECT:([Wallet0000].[dbo].[EntityCollection].[idx_entity_identifier]))

  • this is the non clustered index i added

    CREATE NONCLUSTERED INDEX [idx_entity_identifier] ON [dbo].[EntityCollection]

    (

    [EntityIdentifier] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    Yes, it's obvious the query time improves, now that you have a nonclustered index that is twice as small as your Clustered index. It would be even better if you used "EntityIdentifierSequence" as the nonclustered, since it's also twice as small as your new nonclustered index.

    Now that you get a real good time running this query, that is because the query is cached in the sql server, if you want to have a real portrait of the time the query will take on the first run, you should use this:

    dbcc dropcleanbuffers;

    dbcc freeproccache;

    select Count(*) from TheTable;

    It will give you the exact time the query will take on the first run, or when the results are not cached, and please post the .sqlplan for this (.sqlplan zipped, so we get a graphical view), I'm sorry to ask again, but I'm really curious to see what could be the issue on this, going from 9 scans, to 1.. that is weird in my book!

    Thanks,

    Cheers,

    J-F

  • Just a side not for anyone that is interested.

    I've created a table on my test system with two columsn

    VARCHAR(900) --Max size for a clustered index

    VARCHAR(32)

    even when I create two index one clustered on the first col and one nonclustered on the second column, my query always uses parallelism and always uses the first index(the bigger one) and always runs slow.

    I'm going to trying to find out how the optimizer decides which index/column etc to use for a count 🙂

    wish me luck

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Christopher Stobbs (5/13/2009)


    Just a side not for anyone that is interested.

    I've created a table on my test system with two columsn

    VARCHAR(900) --Max size for a clustered index

    VARCHAR(32)

    even when I create two index one clustered on the first col and one nonclustered on the second column, my query always uses parallelism and always uses the first index(the bigger one) and always runs slow.

    I'm going to trying to find out how the optimizer decides which index/column etc to use for a count 🙂

    wish me luck

    I'm interested to know what will be the results of those queries, and how the optimiser will behave!

    I'm guessing a

    select count(*) from tableX

    should take the most updated statistics or the index with the smaller size. Well, that would make sense, unless all indexes were updated at the same time.

    You might as well want to try

    select count(smallerIndexedField) from tableX

    to see if it changes from the Select count(*), and if it forces the optimiser to choose the index, or at least, help it take the decision?

    Anywayz, update this thread if you get a chance, I'm interested in having a conclusion on this, 😉

    Good luck,

    Cheers,

    J-F

Viewing 15 posts - 16 through 30 (of 32 total)

You must be logged in to reply to this topic. Login to reply