July 14, 2015 at 7:54 pm
This is my table structure,
There is a table defined as:
CREATE TABLE [Positions](
[load_id] [int] NOT NULL,
[acct_cd] [varchar](20) NOT NULL,
[acct_num] [varchar](255) NULL,
[sec_id] [varchar](50) NOT NULL,
[long_sht_cd] [varchar](3) NOT NULL,
[sedol] [varchar](15) NULL,
[isin] [varchar](15) NULL,
[cusip] [varchar](9) NULL,
[sec_type] [varchar](8) NULL,
[sec_name] [varchar](100) NULL,
[currency_cd] [varchar](3) NULL,
[total_holding] [decimal](18, 4) NULL,
[mkt_price] [float] NULL,
[datetime_stamp] [datetime] NULL,
CONSTRAINT [pk_Positions] PRIMARY KEY CLUSTERED (
[load_id] ASC,
[acct_cd] ASC,
[sec_id] ASC,
[long_sht_cd] ASC )
)
This table holds account positions data that are appended to multiple times a day
There are currently some 24 million rows in the table. Every time we append additional positions we add approximately 32,000 entries to this table, and all 32,000 entries will have the same load_id. The load_id is incremented by one each time we load a batch of 32,000 entries (i.e. the first 32K entries have load_id=1, the next 32K has load_id=2, etc...). The datetime_stamp field shows the time at which the entries were loaded and is the same for all 32K entries in a single load
what is the efficient way to retrieve the first set of positions for the current day given the above table definition?
I can understand that it's hard to give the query without the sample data. but at this moment i couldn't get the data. Any sample query based on the table structure please
July 14, 2015 at 10:23 pm
Quick question, any indices on the table?
😎
July 15, 2015 at 4:40 am
Hi Eirik,
this is the structure i could see. I don't see any indices. any query please
July 15, 2015 at 6:56 am
-- Determine the most recent load_id, use it as a filter.
-- It will be about as fast as it can get without adding
-- additional indexes because load_id is the leading edge
-- of the clustered index
SELECT p.*
FROM Positions p
INNER JOIN (
SELECT MAX_load_id = MAX(load_id)
FROM Positions
) m ON m.MAX_load_id = p.load_id
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 15, 2015 at 7:02 am
Could you be a little clearer as to what you want? If 32k entries all have the same datetime stamp and same load_id, how do you determine the 'first' row?
Or do you want the first load_id for each day?
Or all the rows for the first load_id of each day?
Or something else?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 15, 2015 at 7:12 am
Or rather this:
-- This table holds account positions data that are appended to multiple times a day
-- what is the efficient way to retrieve the first set of positions
-- for the current day given the above table definition?
SELECT p.*
FROM Positions p
INNER JOIN (
SELECT FirstLoadOfToday = MIN(load_id)
FROM Positions
WHERE datetime_stamp > CAST(GETDATE() AS DATE)
) m ON m.FirstLoadOfToday = p.load_id
-- an ordinary index on [datetime_stamp] INCLUDEing [load_id] would help.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 15, 2015 at 7:28 am
Hi Gail,
When i see the table structure and question, i am also confused with the same questions. i still didn't get authorization to see the data. I am sorry that at this moment my hands are tied . Thanks for your precious time on this post.
Hi Chris,
You query is more understandable. Below one that i thought of.
SELECT TOP (1) WITH TIES *
FROM Positions
WHERE [datetime_stamp] >= CAST(GETDATE() AS DATE);
thanks everyone for your time on this post and i am closing this chain now. once i get the authorization to see the data, i will initiate the thread again.
Thanks once again.
July 15, 2015 at 8:14 am
KGJ-Dev (7/15/2015)
Hi Gail,When i see the table structure and question, i am also confused with the same questions. i still didn't get authorization to see the data. I am sorry that at this moment my hands are tied . Thanks for your precious time on this post.
Hi Chris,
You query is more understandable. Below one that i thought of.
SELECT TOP (1) WITH TIES *
FROM Positions
WHERE [datetime_stamp] >= CAST(GETDATE() AS DATE);
thanks everyone for your time on this post and i am closing this chain now. once i get the authorization to see the data, i will initiate the thread again.
Thanks once again.
You forgot the ORDER BY.
July 15, 2015 at 8:18 am
KGJ-Dev (7/15/2015)
Hi Gail,When i see the table structure and question, i am also confused with the same questions.
If you don't know what you want, there's not much chance we're going to be able to figure it out. When you get clarification on exactly what's actually needed, please let us know.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 15, 2015 at 9:13 am
Sure. thanks Gail and Luis
July 15, 2015 at 10:57 am
If you typically query by datetime, cluster the table by datetime first. No need to use kludges just to keep load_id as the lead key column.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply