August 2, 2007 at 8:25 am
John,
I think the name is a bit misleading. The purpose of this SP is actually to get the Current / Acquisition segment for Customers who have only 1 Product in their last order. So if someone has 1 product (say French QandS) in their last order, the procedure executes and returns CurSegment = QandS.
Similarly, the 3rd SP [SP_GetNextLevel_Multiple]: its purpose is to get the Current / Acquisition segment for Customers who have more than 1 Product in their last order. So if someone has one product (say French QandS) and another product (say German QandS) in their last order, the procedure executes and returns CurSegment = QandS and CurLanguage = French, as French has higher priority.
If you can show me an example of how the 2nd SP can be written using a Set based approach, that would be great, and then I can apply the same to the rest of the code.
Best regards,
JB
August 2, 2007 at 12:19 pm
Well, rewriting your work Set based is a bit of a chore without having your core tables, views, and data, so please take any work that I post here with a grain of salt. Take it as an example of how to convert procedural code to Set based code, and make sure you test, test, test before using it.
The key here is for you to understand the difference between procedural code and working with sets of data. Many folks new to SQL Server struggle with this concept, as virtually all other programming is based on procedural logic. Making the jump from procedural to Set based thinking will quickly separate you from the crowd and put you a cut above in terms of being able to write effective, efficient code. So for starters, let's throw away the idea of using loops/cursors to step through recordsets one row at a time and start working with groups or sets of data. A set of data can be a table/view, a subset of a table/view, or a recordset created by JOINing tables/views together. If you can write a query to return the set of rows you need to work on, then 99.9% of the time you can write a Set based method to do the work.
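To make the contrast concrete, here is a minimal sketch of the same change written both ways (the table and column names here are invented purely for illustration):
-- Procedural: a cursor touches one row at a time (slow)
DECLARE @OrderID int
DECLARE cur CURSOR FOR
SELECT OrderID FROM Orders WHERE Status = 'New'
OPEN cur
FETCH NEXT FROM cur INTO @OrderID
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE Orders SET Status = 'Processed' WHERE OrderID = @OrderID
FETCH NEXT FROM cur INTO @OrderID
END
CLOSE cur
DEALLOCATE cur
-- Set based: one statement operates on the whole set (fast)
UPDATE Orders SET Status = 'Processed' WHERE Status = 'New'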
So in your case here, I see 2 major groups or sets of data that need to be operated on: single product orders and multiple product orders. Given the number of different tasks that we need to perform on the data, I would start by pulling all of the data that we need into one temp table and working off of that. Your current use of temp tables in your SPs is unnecessary. For the most part, you have copied entire tables/views into temporary tables and then used them in place of the core tables/views. This is inefficient because you are essentially creating non-indexed copies of your data and then looping through those copies. Use temp tables to temporarily store data that needs further manipulation, not to duplicate whole tables.
So, let's start by defining a temp table that will hold all of the data that we will be working with.
CREATE TABLE #tmpCustomer_Profiles (
CustID int PRIMARY KEY CLUSTERED,
CRID varchar(50),
FirstName varchar(50),
LastName varchar(50),
AcqSegment varchar(50),
AcqDate datetime,
AcqLangCode varchar(50),
AcqLanguageName varchar(50),
AcqSource varchar(50),
CurLastPurchaseDate datetime,
CurLangCode varchar(50),
CurLanguageName varchar(50),
CurSegment varchar(50),
RecID int,
TestCell varchar(50),
EmailAddress varchar(50),
NumberOfProducts int
)
Next, let's populate the table with all of the data that you need from Customer, Product_Quantity, and Acq_Information_View.
--Get customer info
INSERT INTO #tmpCustomer_Profiles (
CustID,
FirstName,
LastName,
AcqDate,
AcqSource,
CurLastPurchaseDate,
RecID,
TestCell,
EmailAddress,
NumberOfProducts
)
SELECT C.CustomerNumber as CustID,
C.FirstName,
C.LastName,
AIV.MinOfOrderDate as AcqDate,
NULL as AcqSource,
AIV.MaxOfOrderDate as CurLastPurchaseDate,
CASE WHEN DATEDIFF(dd, AIV.MaxOfOrderDate, GETDATE()) < 0 THEN 0 ELSE DATEDIFF(dd, AIV.MaxOfOrderDate, GETDATE()) END as RecID,
NULL as TestCell,
C.Email,
PQ.NumberOfProducts
FROM Customer C
LEFT JOIN Acq_Information_View AIV
ON C.CustomerNumber = AIV.CustomerNumber
LEFT JOIN Product_Quantity PQ
ON PQ.CustomerNumber = C.CustomerNumber AND PQ.OrderDate = AIV.MaxOfOrderDate
-- assign current and acquisition initial values
UPDATE CP
SET CP.CurLangCode = COI.LanguageCode,
CP.AcqLangCode = COI.LanguageCode,
CP.CurLanguageName = CONVERT(varchar(255), COI.Language),
CP.AcqLanguageName = CONVERT(varchar(255), COI.Language),
CP.CurSegment = CONVERT(varchar(16), COI.Stk_Contents),
CP.AcqSegment = CONVERT(varchar(16), COI.Stk_Contents)
FROM #tmpCustomer_Profiles CP
INNER JOIN Customer_Order_Info_View COI
ON COI.CustomerNumber = CP.CustID AND COI.OrderDate = CP.CurLastPurchaseDate
Your temp table now holds all of the customer data that you were looping through before. Now that it is all together, we can work on the 2 major Sets of data: single item orders and multiple item orders. Here is your entire sp_getnextlevel_single procedure broken down into one query that updates the temp table rows. Notice that instead of looping through and looking at each row, we run this logic against all rows that are single item orders.
--sp_getnextlevel_single updates
UPDATE CP
SET AcqSegment =
CASE COALESCE(F.STK_CODE, '')
WHEN '' THEN 'EOC'
ELSE
CASE CP.CurSegment
WHEN 'QandS' THEN 'Level 1'
WHEN 'Level 1' THEN 'Level 2'
WHEN 'Level 2' THEN 'Level 3'
END
END,
AcqLangCode = CP.CurLangCode,
AcqLanguageName = CP.CurLanguageName
FROM #tmpCustomer_Profiles CP
INNER JOIN FSTOCK_VIEW F
ON F.STK_CustomText1 = CP.CurLangCode
WHERE F.STK_Medium LIKE 'CD'
AND F.STK_Contents LIKE CASE CP.CurSegment WHEN 'QandS' THEN 'Level 1' WHEN 'Level 1' THEN 'Level 2' WHEN 'Level 2' THEN 'Level 3' END
AND CP.CurSegment NOT IN ('Level 3', 'Compact')
AND CP.NumberOfProducts = 1 --all single item orders
OK, now for the multiple item orders. I’ll leave this one to you. You’ll want to write an update that will be run against all rows in the temp table where NumberOfProducts > 1.
<Your Set based code here>
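If it helps, here is one possible shape for that update. Treat it purely as a starting point: the language priority ranking, and the idea of picking the top-priority line from the last order, are my assumptions, since that logic isn't shown here.
-- Sketch only: for multi-item orders, keep the highest-priority language
-- from the last order. The French/German ranking below is assumed.
UPDATE CP
SET CP.CurLangCode = COI.LanguageCode,
CP.CurLanguageName = CONVERT(varchar(255), COI.Language),
CP.CurSegment = CONVERT(varchar(16), COI.Stk_Contents)
FROM #tmpCustomer_Profiles CP
INNER JOIN Customer_Order_Info_View COI
ON COI.CustomerNumber = CP.CustID
AND COI.OrderDate = CP.CurLastPurchaseDate
WHERE CP.NumberOfProducts > 1 --all multiple item orders
AND COI.LanguageCode =
(SELECT TOP 1 X.LanguageCode
FROM Customer_Order_Info_View X
WHERE X.CustomerNumber = CP.CustID
AND X.OrderDate = CP.CurLastPurchaseDate
ORDER BY CASE X.Language WHEN 'French' THEN 1 WHEN 'German' THEN 2 ELSE 99 END)
-- The same segment-advance CASE from the single item update would then
-- apply to these rows as well.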
Now we want to truncate the Customer_Profiles table and re-populate it from the temp table data.
-- Delete the Customer Profile Table to start with an empty table
TRUNCATE TABLE Customer_Profiles
INSERT INTO Customer_Profiles (
CustID,
FirstName,
LastName,
AcqLangCode,
AcqLanguageName,
AcqSegment,
AcqDate,
AcqSource,
CurLastPurchaseDate,
CurLangCode,
CurLanguageName,
CurSegment,
RecID,
TestCell,
EmailAddress
)
SELECT CustID,
FirstName,
LastName,
AcqLangCode,
AcqLanguageName,
AcqSegment,
AcqDate,
AcqSource,
CurLastPurchaseDate,
CurLangCode,
CurLanguageName,
CurSegment,
RecID,
TestCell,
EmailAddress
FROM #tmpCustomer_Profiles
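One more thing you'll want to add is error handling around that reload. A rough sketch (SQL Server 2005 and up), wrapping the truncate/re-populate in a transaction so a failed insert doesn't leave Customer_Profiles empty:
BEGIN TRY
BEGIN TRANSACTION
TRUNCATE TABLE Customer_Profiles --rolls back cleanly inside a transaction
--(same INSERT INTO Customer_Profiles ... SELECT FROM #tmpCustomer_Profiles as above)
COMMIT TRANSACTION
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
RAISERROR('Customer_Profiles reload failed', 16, 1) --surface the failure to the caller
END CATCH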
This should be enough to get you started. You'll need to tweak and test this, and fold in whatever standard error processing logic you normally use. Have fun...
August 3, 2007 at 9:53 am
Hey John,
This is amazing... you are the MAN
This looks great; I'm in the process of sprinkling some salt. This is a very different way of approaching the problem than I'm used to, but it makes a lot more sense.
Once I'm done, I'll get back to you with an update on how it's doing and how much processing time has been cut because of this.
Thanks a lot,
JB
August 3, 2007 at 11:18 am
You're welcome. As for the different approach, Set based thinking will seem different until you are used to it, at which point procedural code will stick out like a sore thumb. Please do post your results/improvements, as others reading this thread will benefit. I would expect you to see significant improvements.
August 9, 2007 at 10:59 am
This is kind of trivial, but....
If you change your date fields to smalldatetime, you save 8 bytes per record (4 bytes of storage for smalldatetime vs 8 bytes for datetime). A page holds 8096 bytes of data (8192 bytes minus header overhead), and your record width is 624 bytes with datetime or 616 bytes with smalldatetime. That gives you 12 or 13 records per page (assuming 100% fill, and not counting any adjustment for the varchars holding less than their declared size). So at 240,000 records, you've got 20,000 pages with datetime and 18,461 with smalldatetime. Eliminating roughly 1,500 pages means slightly faster I/O because of fewer pages read, and that savings will continue to accumulate as the number of records increases.
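If you want to sanity-check that arithmetic, here's a quick throwaway query using the same figures (integer division, so it matches the rounded-down rows-per-page):
-- Rough page math from the figures above
SELECT 8096 / 624 AS RowsPerPage_datetime, -- 12
8096 / 616 AS RowsPerPage_smalldatetime, -- 13
240000 / 12 AS Pages_datetime, -- 20,000
240000 / 13 AS Pages_smalldatetime -- 18,461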
This also discounts data being pulled from indexes without having to read the actual table pages. Good indexing is extremely important to achieving high speed and improving performance.
It's possible that my math is a little fuzzy as I've got a couple of drugs in my system fighting a respiratory and sinus infection, thus my brain is a little fuzzy at the moment.
I almost never use datetime because most business apps don't need that level of time precision.
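For example (a sketch only; double-check first that nothing downstream needs sub-minute precision or dates outside smalldatetime's 1900 to 2079 range), the two date columns could be converted like this:
-- smalldatetime: 4 bytes, accurate to the minute
ALTER TABLE Customer_Profiles ALTER COLUMN AcqDate smalldatetime NULL
ALTER TABLE Customer_Profiles ALTER COLUMN CurLastPurchaseDate smalldatetime NULL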
-----
Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson