July 14, 2014 at 2:12 pm
Hello Everyone
I hope that you all had a wonderful weekend.
I am working with some data that is not so clean. I have some rows that are Very close to being a duplicate row. I need to take all the rows with the same SKU number and fill in the missing data for the Price, Cost or the SalePrice, what ever may be missing. The data is set like @DiamondsMissingData and I need the row to look like @DiamondsWithData table. I can remove the duplicate values once I get the missing data filled in. I am not sure how the best way to go about coding for this situation.
DECLARE @DiamondsMissingData TABLE
(
SKU varchar(20)
,VendorNumber varchar(20)
,VendorSKU varchar(20)
,GradingReportType varchar(20)
,GradingReportNumber bigint
,Price money
,Cost money
,SalePrice money
)
DECLARE @DiamondsWithData TABLE
(
SKU varchar(20)
,VendorNumber varchar(20)
,VendorSKU varchar(20)
,GradingReportType varchar(20)
,GradingReportNumber bigint
,Price money
,Cost money
,SalePrice money
)
INSERT INTO @DiamondsMissingData
SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,177940,NULL UNION ALL
SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,NULL,27375500 UNION ALL
SELECT '6157560941C','Crown','CR50135','GIA',6157560941,NULL,177940,27375500
INSERT INTO @DiamondsWithData
SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,177940,27375500
SELECT * FROM @DiamondsMissingData
SELECT * FROM @DiamondsWithData
Thank You in advance for your assistance, time, and code sample.
Andrew SQLDBA
July 14, 2014 at 2:44 pm
It seems like you're wanting to copy the non-NULL values for any given SKU into the NULL values, correct? If so, this should do it:
;WITH NullFill(SKU,FillPrice,FillCost,FillSalePrice) AS(
SELECT VendorSKU,MAX(Price)AS Price, MAX(Cost) AS Cost, MAX(SalePrice) AS SalePrice
FROM @DiamondsMissingData
GROUP BY VendorSKU
)
UPDATE @DiamondsMissingData
SET Price = CASE WHEN Price IS NULL THEN FillPrice ELSE Price END,
Cost = CASE WHEN Cost IS NULL THEN FillCost ELSE Cost END,
SalePrice = CASE WHEN SalePrice IS NULL THEN FillSalePrice ELSE SalePrice END
FROM @DiamondsMissingData A
INNER JOIN NullFill B
ON A.VendorSKU = B.SKU
That should take the non-NULL value for each of the three potentially-NULL columns, and fill it in where the value is NULL, ignoring it if it's not NULL.
However, this won't work if the same SKU can have multiple values for any of the three nullable columns; if there's a variation in price, for example, this will make all price values the same. In your provided data, this isn't a problem, but if it could happen in your real data, this will be problematic.
If that's the case, a better solution can certainly be worked out; please provide an example of a case where that might happen, and adjustments will be made! 🙂
- 😀
July 14, 2014 at 2:52 pm
This will deduplicate the rows that have missing values. However, if the duplicate rows are the majority, I might use a different approach.
INSERT INTO @DiamondsMissingData
SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,177940,NULL UNION ALL
SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,NULL,27375500 UNION ALL
SELECT '6157560941C','Crown','CR50135','GIA',6157560941,NULL,177940,27375500
INSERT INTO @DiamondsWithData
SELECT SKU
,VendorNumber
,VendorSKU
,GradingReportType
,GradingReportNumber
,MAX( Price)
,MAX( Cost)
,MAX( SalePrice)
FROM @DiamondsMissingData
WHERE Price IS NULL
OR Cost IS NULL
OR SalePrice IS NULL
GROUP BY SKU
,VendorNumber
,VendorSKU
,GradingReportType
,GradingReportNumber
DELETE FROM @DiamondsMissingData
WHERE Price IS NULL
OR Cost IS NULL
OR SalePrice IS NULL
INSERT INTO @DiamondsMissingData
SELECT *
FROM @DiamondsWithData
July 15, 2014 at 6:54 am
Thank You to you both. That code worked very nicely. I had to change it just slightly, but you got me going down the correct path
Thank you again
Andrew SQLDBA
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply