February 26, 2018 at 9:54 am
I need to find the most recent post date for all the invoices in my table. There are millions of records. Any thoughts?
SELECT INV_NUM,
       POST_DT
FROM (
    SELECT INV_NUM,
           POST_DT,
           ROW_NUMBER() OVER (PARTITION BY INV_NUM ORDER BY POST_DT DESC) AS RowNum
    FROM IDX_INCOME
    WHERE GRP__2 = '7'
) b
WHERE b.RowNum = 1
  AND INV_NUM = '0'
February 26, 2018 at 10:19 am
Without you providing a CREATE TABLE statement and sample data in the form of INSERTs, very generically:
SELECT non_aggregated_column, MAX(date_column) AS MaxDate
FROM tbl
GROUP BY non_aggregated_column;
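For a question like this, a consumable repro need only be a small CREATE TABLE plus a handful of INSERTs, something along these lines (the table name suffix, column types and values here are made up for illustration; the nvarchar(255) types simply mirror what the original poster reveals later in the thread):
-- Hypothetical stand-in for the real table, for repro purposes only
CREATE TABLE dbo.IDX_INCOME_sample
(
    INV_NUM nvarchar(255),
    POST_DT nvarchar(255),
    GRP__2  nvarchar(255)
);

INSERT INTO dbo.IDX_INCOME_sample (INV_NUM, POST_DT, GRP__2)
VALUES (N'1001', N'2018-01-15', N'7'),
       (N'1001', N'2018-02-20', N'7'),
       (N'1002', N'2018-02-01', N'7');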
February 26, 2018 at 10:30 am
I know how to find the most recent records, see my query, but it's wicked slow because of the number of records. I would have to send a few million rows of sample data in order to get the same impact.
February 26, 2018 at 11:19 am
NineIron - Monday, February 26, 2018 10:30 AM: I know how to find the most recent records, see my query, but it's wicked slow because of the number of records. I would have to send a few million rows of sample data in order to get the same impact.
Provide an actual execution plan, please.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
February 26, 2018 at 11:24 am
Pardon my ignorance but, how do I copy then paste the execution plan?
February 26, 2018 at 11:27 am
NineIron - Monday, February 26, 2018 11:24 AMPardon my ignorance but, how do I copy then paste the execution plan?
Right click / Save Execution Plan As ... pick your filename & then attach.
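If the menu route is awkward, the actual plan can also be captured from T-SQL; a minimal sketch (the query here is just a stand-in for the real one):
SET STATISTICS XML ON;   -- returns the actual execution plan as XML alongside the results

SELECT MAX(POST_DT)
FROM IDX_INCOME
WHERE GRP__2 = '7';

SET STATISTICS XML OFF;
The XML that comes back can be saved as a .sqlplan file and attached in the same way.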
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
February 26, 2018 at 11:41 am
The optimiser will do this anyway, but if you simplify your query, you also clarify the index requirements:
SELECT MAX(POST_DT)
FROM IDX_INCOME
WHERE GRP__2 = '7'
AND INV_NUM = '0'
If you don't already have an index on GRP__2 and INV_NUM which also includes POST_DT in the KEY or INCLUDE part, then you might need one.
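As a sketch only (the index name and key order are assumptions; the right choice depends on the actual data), that index might look like:
CREATE NONCLUSTERED INDEX IX_IDX_INCOME_GRP2_INV
    ON IDX_INCOME (GRP__2, INV_NUM)
    INCLUDE (POST_DT);
With both WHERE columns in the key and POST_DT carried in the INCLUDE, the MAX can be answered from the index alone instead of touching the base table; making POST_DT a trailing key column instead of an INCLUDE would let the MAX come straight off the end of the seek range.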
For better assistance in answering your questions, please read this.
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden
February 26, 2018 at 12:00 pm
See attached.
February 26, 2018 at 12:30 pm
NineIron - Monday, February 26, 2018 12:00 PM: See attached.
The query in that execution plan is quite a bit different than what you posted.
In the execution plan, you're doing a convert on the invoice date. What is the datatype of that date column?
You're also searching for an invoice balance of '0', which is a string rather than a numeric, so please identify the datatype of the invoice column as well.
The only thing that may make this faster is an index on the WHERE criteria and, even then, it may result in an index scan simply because it needs to do a scan to enumerate the rows.
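To illustrate the convert point (this is a guess at what the plan is doing, not the query the poster ran): once the stored value has to be converted before it can be aggregated, the server can no longer read the answer straight off an ordered index and has to touch every qualifying row.
-- If POST_DT is stored as text, this forces a conversion of every qualifying row,
-- because text order and datetime order are not guaranteed to agree
SELECT MAX(CONVERT(datetime, POST_DT))
FROM IDX_INCOME
WHERE GRP__2 = '7';

-- Whereas MAX over the raw column can use an index ordered on POST_DT directly
SELECT MAX(POST_DT)
FROM IDX_INCOME
WHERE GRP__2 = '7';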
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2018 at 12:40 pm
NineIron - Monday, February 26, 2018 12:00 PM: See attached.
"idx_income" is an odd name for a heap! Why don't you have a clustered index? What's the purpose of this table? What's its daily/weekly cycle of changes?
And what Jeff said too - this is wildly different from your trivial original query.
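A quick, generic way to confirm that the table really is a heap (nothing here is specific to this system):
SELECT i.index_id, i.name, i.type_desc
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.IDX_INCOME');
-- index_id = 0 with type_desc = 'HEAP' means there is no clustered index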
For better assistance in answering your questions, please read this.
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden
February 27, 2018 at 4:05 am
I apologize for the confusion. I'm trying to get some financial data to tie out and I can't get out of this rat hole. The data is indexed on MRN, INV_NUM, and POST_DT. The data types on all of the columns are nvarchar(255). It's a pain to work with this table, but that's what I'm stuck with.
Thanx for your help. I'm going to schedule this stuff to run off hours so the time it takes won't impact the user.
February 27, 2018 at 6:52 am
I'm a bit of a noob, but couldn't he just create computed columns that convert the data to proper types, then create an index on those computed columns and query them?
Technet article: https://technet.microsoft.com/en-us/library/ms191250(v=sql.105).aspx
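That idea might look roughly like the following; the column name, the datetime target and the style 120 are all assumptions about how the text dates are stored (TRY_CONVERT also assumes SQL Server 2012 or later):
-- Persisted computed column holding a typed copy of the date
-- (style 120 assumes text like 'yyyy-mm-dd hh:mi:ss'; a deterministic style is required for indexing)
ALTER TABLE dbo.IDX_INCOME
    ADD POST_DT_typed AS TRY_CONVERT(datetime, POST_DT, 120) PERSISTED;

CREATE NONCLUSTERED INDEX IX_IDX_INCOME_POST_DT_typed
    ON dbo.IDX_INCOME (GRP__2, INV_NUM, POST_DT_typed);
Queries that filter or aggregate on POST_DT_typed can then seek the index rather than converting the nvarchar value row by row.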
February 28, 2018 at 3:39 pm
Any chance that in your script you can insert the contents of that generic table into a temp table of your creation, with proper data types and indexes that you create? If your data is not huge and you do it on the same machine, that may be a good approach.
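A rough sketch of that approach, where every name, type and the conversion style are assumptions about the poster's data:
-- Typed temp table (names and types are guesses, not the real schema)
CREATE TABLE #income
(
    INV_NUM varchar(50),
    POST_DT datetime,
    GRP__2  varchar(10)
);

INSERT INTO #income (INV_NUM, POST_DT, GRP__2)
SELECT INV_NUM,
       TRY_CONVERT(datetime, POST_DT, 120),   -- style 120 is a guess at the stored text format
       GRP__2
FROM dbo.IDX_INCOME
WHERE GRP__2 = '7';

CREATE CLUSTERED INDEX CIX_income ON #income (INV_NUM, POST_DT);
The MAX / ROW_NUMBER query then runs against the small, properly typed #income instead of the big nvarchar table.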
March 1, 2018 at 4:07 am
I got some more information from the owner of the table and was able to reduce the number of records and stick them in a temporary table. Then, join the temp table to the other stuff. Now, seconds instead of 3 minutes. Thanx for all the input.
March 1, 2018 at 7:40 am
NineIron - Thursday, March 1, 2018 4:07 AM: I got some more information from the owner of the table and was able to reduce the number of records and stick them in a temporary table. Then, join the temp table to the other stuff. Now, seconds instead of 3 minutes. Thanx for all the input.
Thanks for the feedback. "Pre-aggregation" and "Divide'n'Conquer" are frequently all that's needed to make monsters behave. People make the mistake of thinking that "Set Based" means "All in one query" and nothing could be further from the truth.
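As an illustration of that pre-aggregation pattern (the second table and its column are hypothetical stand-ins for "the other stuff"):
-- Step 1: pre-aggregate the big table down to one row per invoice
SELECT INV_NUM,
       MAX(POST_DT) AS LastPostDt
INTO   #last_post
FROM   dbo.IDX_INCOME
WHERE  GRP__2 = '7'
GROUP BY INV_NUM;

-- Step 2: join the small summary set to the rest of the query
SELECT o.INV_NUM,
       o.SomeColumn,            -- hypothetical column
       lp.LastPostDt
FROM   dbo.OtherStuff AS o      -- hypothetical table
JOIN   #last_post     AS lp
  ON   lp.INV_NUM = o.INV_NUM;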
--Jeff Moden
Change is inevitable... Change for the better is not.