December 13, 2011 at 3:51 pm
I need to create a query that can partition a data set and the do some updates to the partitions and finally produce an output.
In my data set are rows that pertain to individual trade transactions. A package is made up of several transactions. Each transaction contains a broker, portfolio, commission and gross amount. Within each transaction there can be placeholder brokers and place older portfolios. i.e.values that are used as dummies to fill fields in the source entry system. For each package there can only be on real broker
My query needs to pull out all the rows(transactions) per package which have a real portfolio and then for those lines update the broker field with correct/real broker for that package. The way to determine if a broker or portfolio is real is to check if it exists in the portfolio and broker table. I have provided some sample data for clarity.
CREATE TABLE #TradeDetail
(
Package varchar(12)
, [Broker] varchar(50)
, Portfolio VARCHAR(20)
, GrossAmount_EUR numeric(18,6)
, Commission_EUR numeric(18,6)
)
CREATE TABLE #Broker
(
ID INT IDENTITY (1,1)
, [Broker] varchar(50)
)
CREATE TABLE #Portfolio
(
ID INT IDENTITY (1,1)
, [Portfolio] varchar(50)
)
INSERT INTO #TradeDetail
--drop table #TradeDetail
SELECT '123456','Good_Broker','Investment','120','130' UNION ALL
SELECT '123456','Bad_Broker1','Investment_false','100','130' UNION ALL
SELECT '123456','Bad_Broker2','InvestCo','120','100' UNION ALL
SELECT '123456','Bad_Broker3','Investment_false','100','130' UNION ALL
SELECT '123456','Bad_Broker4','Investment_Cash','100','150' UNION ALL
SELECT '123456','Bad_Broker5','PortfolioCO','150','100' UNION ALL
SELECT '789110','Good_Broker2','InvestmentCO2','150','100' UNION ALL
SELECT '789110','Bad_Broker6','Investment_false','150','100' UNION ALL
SELECT '789110','Bad_Broker7','PortfolioCO2','150','100' UNION ALL
SELECT '789110','Bad_Broker8','Investment_Cash','150','100' UNION ALL
SELECT '789110','Bad_Broker9','PortfolioCO','150','100' UNION ALL
SELECT '453211','Good_Broker3','InvestCo','150','100' UNION ALL
SELECT '453211','Bad_Broker10','InvestmentCO1','150','100' UNION ALL
SELECT '453211','Bad_Broker11','Investment_false2','150','143' UNION ALL
SELECT '453211','Bad_Broker12','InvestmentCO2','105','200' UNION ALL
SELECT '453211','Bad_Broker13','Investment_false','100','200' UNION ALL
SELECT '453211','Bad_Broker14','Investment','100','220'
INSERT INTO #Broker
SELECT 'Good_Broker' UNION ALL
SELECT 'Good_Broker2' UNION ALL
SELECT 'Good_Broker3'
INSERT INTO #Portfolio
SELECT 'Investment' UNION ALL
SELECT 'InvestCo' UNION ALL
SELECT 'PortfolioCO' UNION ALL
SELECT 'PortfolioCO2' UNION ALL
SELECT 'PortfolioCO' UNION ALL
SELECT 'PortfolioCO' UNION ALL
SELECT 'InvestmentCO1' UNION ALL
SELECT 'InvestmentCO2'
With this data the output should be:
Package Broker Portfolio GrossAmount_EUR Commission_EUR
123456 Good_Broker Investment 120 130
123456 Good_Broker InvestCo 120 100
123456 Good_Broker PortfolioCO 150 100
789110 Good_Broker2 InvestmentCO2 150 100
789110 Good_Broker2 PortfolioCO2 150 100
789110 Good_Broker2 PortfolioCO 150 100
453211 Good_Broker3 InvestCo 150 100
453211 Good_Broker3 InvestmentCO1 150 100
453211 Good_Broker3 InvestmentCO2 105 200
453211 Good_Broker3 Investment 100 220
December 14, 2011 at 10:03 am
I felt you needed a reply, as you've provided us with DDL and sample data so I've had a go.
There's a possibility I haven't understood your problem correctly, so test this thoroughly 😉
SELECT Package, b.Broker, Portfolio, GrossAmount_EUR, Commission_EUR
FROM (SELECT a.Package, b.Portfolio, a.GrossAmount_EUR, a.Commission_EUR,
DENSE_RANK() OVER(ORDER BY a.Package) AS joiner
FROM #TradeDetail a
INNER JOIN #Portfolio b ON a.Portfolio = b.Portfolio
GROUP BY a.Package, b.Portfolio, a.GrossAmount_EUR, a.Commission_EUR) a
INNER JOIN #Broker b ON a.joiner = b.ID
December 14, 2011 at 10:22 am
Cadavre (12/14/2011)
I felt you needed a reply, as you've provided us with DDL and sample data so I've had a go.There's a possibility I haven't understood your problem correctly, so test this thoroughly 😉
SELECT Package, b.Broker, Portfolio, GrossAmount_EUR, Commission_EUR
FROM (SELECT a.Package, b.Portfolio, a.GrossAmount_EUR, a.Commission_EUR,
DENSE_RANK() OVER(ORDER BY a.Package) AS joiner
FROM #TradeDetail a
INNER JOIN #Portfolio b ON a.Portfolio = b.Portfolio
GROUP BY a.Package, b.Portfolio, a.GrossAmount_EUR, a.Commission_EUR) a
INNER JOIN #Broker b ON a.joiner = b.ID
Your join depends on an artifact of the sample data that you simply cannot guarantee in your production data. To see the problem, use the following for your broker table:
INSERT INTO #Broker
SELECT 'Good_Broker' UNION ALL
SELECT 'Good_Broker1' UNION ALL
SELECT 'Good_Broker2' UNION ALL
SELECT 'Good_Broker3'
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
December 14, 2011 at 10:24 am
Cheers.
I actually managed to get the result with this query 😛
SELECT
td.Package
,MAX(B.[Broker]) OVER (PARTITION BY td.Package) AS [Broker]
,td.Portfolio
,td.GrossAmount_EUR
,td.Commission_EUR
FROM #TradeDetail td
LEFT JOIN #Broker b
ON B.[Broker] = td.[Broker]
WHERE EXISTS(SELECT 1 FROM #Portfolio WHERE Portfolio=td.Portfolio)
thanks for the response
December 14, 2011 at 10:46 am
I was about to post almost exactly the same query. The only difference was that I used an INNER JOIN rather than an EXISTS for the portfolio. Depending on your data and indices, either one might perform better.
The join did bring out a problem with your sample data. You had duplicate portfolio records, because of the UNION ALL. I assume that this is just a problem with the way that the sample was generated. If you have duplicates in your actual data, then you should take some measures to clean up the data and prevent this from happening.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply