August 2, 2011 at 7:54 pm
SET Twisted mind on
I've had another inspiration about the execution plan. I think this is deliberately left out, simply because some of the greatest DBAs have all had a whack at it. Let's assume at least Brad and Steve, if not 5-6 more of the same caliber. My best guess at this point is that this info is considered completely useless, or they simply want us to consider other options because that path's been beaten to death already. Missing indexes are fine, but there's no way in h*ll they have not considered that before. So while this seems like a viable option, I'm pretty sure they expect something completely different from that.
With that in mind >> Do you guys really need to track those clicks in real time? Are you open to creating a logging table with an identity bigint and article_key, just inserting into that table, and running a single update against the base table every 15 minutes? Or, God forbid, tracking that in aspx session variables and flushing to the db once a minute (or whatever makes sense for your needs)?
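Roughly what I have in mind, with every name here being a placeholder since I don't know the real schema:

-- Hypothetical staging table: one narrow row per click, so the hot base table is never touched per click
CREATE TABLE dbo.ClickLog
(
    ClickID     bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    article_key int                  NOT NULL,
    ClickTime   datetime             NOT NULL DEFAULT GETDATE()
);

-- Job run every 15 minutes (or whatever interval makes sense): roll the clicks up, then clear only what was rolled up
DECLARE @MaxID bigint = (SELECT MAX(ClickID) FROM dbo.ClickLog);

BEGIN TRAN;

UPDATE a
SET    a.[Count] = a.[Count] + c.Clicks
FROM   dbo.AnalysisLog AS a
JOIN  (SELECT article_key, COUNT(*) AS Clicks
       FROM   dbo.ClickLog
       WHERE  ClickID <= @MaxID
       GROUP BY article_key) AS c
       -- join column is a guess; the real AnalysisLog keys on Data plus a date bucket, so this would need adapting
       ON c.article_key = a.ArticleID;

DELETE dbo.ClickLog WHERE ClickID <= @MaxID;

COMMIT;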
AnalysisLog.Data is varchar(50). That is a hell of a long key for something that gets this much activity. Any openness to changing that key to an identity int (4 billion articles should get you a long way)? I have no idea of the real cost of doing this, as I don't know how much code change it would entail, nor the actual cost of adding FKs to point to it and possibly new indexes to support that operation.
Same thing for PostName nvarchar(256).
Are you guys using compression, and if so, page or row? I checked on SM but I couldn't find the SQL version info anywhere (sorry if I missed it, and also if it's just not available). The usual trade-off is more CPU for less disk activity. You seem to have a lot of headroom on the CPUs, so maybe that's a nice hit to take there to save on disks / RAM (even if the cache hit ratio is 100%). All in all maybe a 0 to 5% gain, but still something worth trying.
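For what it's worth, checking and testing that would look something like this (AnalysisLog is just used as an example here, and page/row compression is Enterprise-only in 2008):

-- What compression, if any, is already in place
SELECT OBJECT_NAME(object_id) AS table_name, index_id, data_compression_desc
FROM   sys.partitions
WHERE  OBJECT_NAME(object_id) = 'AnalysisLog';

-- Estimate the savings before committing to anything
EXEC sp_estimate_data_compression_savings
     @schema_name = 'dbo', @object_name = 'AnalysisLog',
     @index_id = NULL, @partition_number = NULL, @data_compression = 'PAGE';

-- If the numbers look good, rebuild with page compression
ALTER TABLE dbo.AnalysisLog REBUILD WITH (DATA_COMPRESSION = PAGE);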
Do you have a job that runs every hour around H:20-23? You get a 5X spike in page splits around that time. I would guess it's something like the top posters and other leaderboards being updated, but I can't tell for sure. Log shipping? Replication?
Do you do anything special around 11:00 to 17:00 on the sqlservercentral db? It gets a hell of a spike in log flushes and transactions... or is that just your "normal" business hours?
Any way we can get a trace for replay so we can prove our ideas make a real impact, or are we supposed to guess all the way to the end of the contest?
August 2, 2011 at 7:56 pm
bradmcgehee@hotmail.com (8/2/2011)
Ninja's_RGR'us (8/2/2011)
If we don't want to "win" SM, can we pick another tool?
Don't know this answer, but I will ask.
P.S. I'd like a tool-belt upgrade. If that's too much, I can accept just a SQL Prompt upgrade. Support is nice, but considering all the new suggestions I make for the products I use, I think you should at least take that out of the "value" of the gift...
Not to mention all the times I recommend your stuff on these forums. Must have been 100+ times over the last few months :Whistling:
August 2, 2011 at 8:03 pm
--Windows 2008 (64-bit)
--SQL Server 2008 (64-bit)
What editions? And which SP, for that matter?
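In case it's easier than digging through SM, this is all it takes to pull that info:

SELECT SERVERPROPERTY('Edition')        AS Edition,
       SERVERPROPERTY('ProductLevel')   AS ServicePack,
       SERVERPROPERTY('ProductVersion') AS Build;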
August 2, 2011 at 8:10 pm
Re merge:
My first thought was "nice idea". But then I realized that you do "all that" extra work for an event that happens during less than 0.01% of the lifetime of the article.
Same thing for ISNULL(). I didn't have time to test and prove this, but I think that separating those out is a better path to take.
Also, a 3-part update exists (or OUTPUT), so there's no need for an update + select there. I know all article pages display "viewed n times" in the title, so maybe you can remove an extra select or join (excluded from the top 4) by doing something like this:
UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...
maybe add the "missing" index... depends on the datatype option.
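For the record, the OUTPUT flavour of the same idea would look something like this (the WHERE clause and variable names are assumed from the Time/Data lookup discussed in this thread):

DECLARE @Counts TABLE ([Count] int);

UPDATE dbo.AnalysisLog
SET    [Count] = [Count] + 1
OUTPUT inserted.[Count] INTO @Counts ([Count])
WHERE  [Time] = @RoundedTime
  AND  [Data] = @data;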
August 3, 2011 at 7:42 am
Ninja's_RGR'us (8/2/2011)
UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...
Never mind that part; I had missed the fact that there's 1 row per date on which there's activity on the page.
August 3, 2011 at 10:02 am
Contest Entry:
For query 1
I would put an index on Time and Data on the AnalysisLog table.
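Something like this, though the exact key column order would need testing:

CREATE NONCLUSTERED INDEX IX_AnalysisLog_Time_Data
    ON dbo.AnalysisLog ([Time], [Data]);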
For query 2
The query 1 item would take care of this query as well.
For queries 1 and 2, the interesting thing would be to cache the day's counts in server memory or another temporary table, then run a nightly or hourly job to write it all down to the database or copy it to the analysis log. One other architectural change I would make is to not use a varchar(50) if possible: can these data events be keyed by an int or bigint? That would improve the indexing considerably. Maybe provide a lookup table to find the description, as sketched below.
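A rough sketch of that lookup-table idea (all names made up; the real schema may differ):

CREATE TABLE dbo.AnalysisData
(
    DataID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Data   varchar(50)       NOT NULL UNIQUE   -- current descriptive key, kept here for lookups only
);

-- AnalysisLog.Data (varchar(50)) would then become a 4-byte DataID foreign key,
-- which keeps every index on AnalysisLog much narrower.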
For query 3, remove the ORDER BY if possible; the sort might be better done when selecting out of the temp table later, if it's needed at all, so add a PostDate column there. Alternatively, add an index on cs_Posts covering ThreadID, IsApproved, PostDate, PostID, PostLevel, ApplicationPostType (though this is wide).
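One way to read that index suggestion, pushing the trailing columns into INCLUDE to keep the key narrower (column names taken from the post, untested):

CREATE NONCLUSTERED INDEX IX_cs_Posts_ThreadID_IsApproved_PostDate
    ON dbo.cs_Posts (ThreadID, IsApproved, PostDate)
    INCLUDE (PostID, PostLevel, ApplicationPostType);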
Query 4:
Looks like an index is needed on cs_weblog_Weblogs on SectionID, but selectivity might just be poor. I would qualify ApplicationType with its table alias in the WHERE clause. Also, is W.* really needed? Expand the W.* items out. And write the comma join out in ANSI join style, just as a best practice.
August 4, 2011 at 8:57 am
Thanks to everyone who has contributed to the contest so far. The goal of the contest was to try an interactive article format with reader feedback on "SQLServerCentral.com Best Practices Clinic: Part 5", and to have a little fun trying to solve a "puzzle of sorts" with the information provided. I feel that there is enough information in the article to provide a "good enough" answer. Sure, with more information you might be able to provide a "better" answer, but the puzzle was designed this way so as not to place too large a demand on our readers' time. Please keep the answers coming!
Thanks!
Brad M. McGehee
DBA
August 4, 2011 at 11:46 am
Ninja's_RGR'us (8/3/2011)
Ninja's_RGR'us (8/2/2011)
UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...
Never mind that part; I had missed the fact that there's 1 row per date on which there's activity on the page.
We both did. I was led to believe it was a single row, first generated when the first click on the page was recorded, primarily because of this quote from the article:
When a web page is accessed, Query #2 runs first, looking to see if that particular article has ever been viewed before or not. If not, then another query (which is not in the top 10 list) is run to create a new record where clicks are tracked in a table;
Brad, can you clarify please?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
August 4, 2011 at 12:49 pm
Craig Farrell (8/4/2011)
Ninja's_RGR'us (8/3/2011)
Ninja's_RGR'us (8/2/2011)
UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...
Never mind that part; I had missed the fact that there's 1 row per date on which there's activity on the page.
We both did. I was led to believe it was a single row, first generated when the first click on the page was recorded, primarily because of this quote from the article:
When a web page is accessed, Query #2 runs first, looking to see if that particular article has ever been viewed before or not. If not, then another query (which is not in the top 10 list) is run to create a new record where clicks are tracked in a table;
Brad, can you clarify please?
Based on a Profiler trace I took, here is what appears to happen. Query 1 runs first, and if it doesn't find a record for the page that is viewed, then an INSERT is performed into the table so that its views can be tracked. This INSERT query is not in the top 10, so I did not list it. If Query 1 does find a record, then Query 2 runs, updating the record for the page that is viewed. The developer of this code was a contractor and did not document exactly what happens or why, so I have had to infer what happens based on what I see in the trace.
Brad M. McGehee
DBA
August 4, 2011 at 11:34 pm
Contest Entry
--------------------------------------------------------------
Recommendation For Query #1 and Query #2
--------------------------------------------------------------
Understanding: It seems that the AnalysisLog table is used to keep a count of clicks on each article (Data), per date.
Logical change recommended: first execute the UPDATE query to increase the count (query #1).
If no row is affected by the UPDATE query, then execute the INSERT query to create the first record. This eliminates the execution of query #2.
Solution:
UPDATE dbo.AnalysisLog SET [Count] = [Count] + 1 WHERE [Time] = @RoundedTime AND [Data] = @data
IF @@ROWCOUNT = 0
    -- INSERT query here (not shown in the article)
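For illustration only, assuming the table needs nothing beyond those three columns (the real column list is not in the article), the placeholder above could be:

INSERT INTO dbo.AnalysisLog ([Time], [Data], [Count])
VALUES (@RoundedTime, @data, 1)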
--------------------------------------------------------------
Recommendation For Query #3
--------------------------------------------------------------
Recommendation #1: use NOLOCK on the first SELECT query, which is used to get ThreadID and PostID.
Recommendation #2: create an index on [cs_Posts].[PostDate]; this saves sorting time, because in the execution plan of the INSERT INTO #PageIndex query the sort costs 36%.
Solution:
CREATE NONCLUSTERED INDEX [IX_cs_Posts_PostDate] ON [dbo].[cs_Posts] ([PostDate])
GO
Select
@ThreadID = p.ThreadID, @PostID = p.PostID
From
cs_Posts p with (nolock)
Where
p.SectionID = @SectionID and p.PostName = @PostName
GO
--------------------------------------------------------------
Recommendation For Query #4
--------------------------------------------------------------
1. We can use an INNER JOIN between cs_Sections F and cs_weblog_Weblogs W on F.SectionID = W.SectionID.
2. We can use NOLOCK on both tables.
3. Create nonclustered indexes on F.SectionID and W.SectionID.
Solution:
CREATE NONCLUSTERED INDEX [IX_cs_Sections_SectionID] ON [dbo].[cs_Sections] ([SectionID])
GO
CREATE NONCLUSTERED INDEX [IX_cs_weblog_Weblogs_SectionID] ON [dbo].[cs_weblog_Weblogs] ([SectionID])
GO
SELECT
F.[SectionID], F.[SettingsID], F.[IsActive], F.[ParentID], F.[GroupID], F.[Name],
F.[NewsgroupName], F.[Description], F.[DateCreated], F.[Url], F.[IsModerated],
F.[DaysToView], F.[SortOrder], F.[TotalPosts], F.[TotalThreads], F.[DisplayMask],
F.[EnablePostStatistics], F.[EnablePostPoints], F.[EnableAutoDelete],
F.[EnableAnonymousPosting], F.[AutoDeleteThreshold], F.[MostRecentThreadID],
F.[MostRecentThreadReplies], F.[PostsToModerate], F.[ForumType], F.[IsSearchable],
F.[ApplicationType], F.[ApplicationKey], F.PropertyNames as SectionPropertyNames,
F.PropertyValues as SectionPropertyValues, F.[DefaultLanguage], F.[DiskUsage], W.*,
null as LastUserActivity
FROM
cs_Sections F with (nolock)
inner join cs_weblog_Weblogs W with (nolock) on F.SectionID = W.SectionID
WHERE
F.SettingsID = @SettingsID AND
ApplicationType = 1
GO
--------------------------------------------------------------
Pradeep Chaurasia
P. I. Softek Ltd
Sec-62, Noida, UP
August 23, 2011 at 12:10 pm
Sorry for the delay in announcing the winner of this contest, but I was out traveling last week. After reviewing the entries and discussing them with several other DBAs, we selected WilliamD's response as the winning entry. We all thought his answer was the most complete. Thanks to everyone who participated in this contest.
Brad M. McGehee
DBA