April 8, 2013 at 5:00 am
Hi,
I have a complex problem that I'm pretty sure should have a very simple solution. If I could just find it.
I am trying to import data from a flat file aka Raw_Build_List:
Contract Description varchar(30)
Section Description varchar(25)
Item Description varchar(30)
Quantity integer
I am trying to write a script that will pull it into a relational database to use with other data. The script will have to run every night. The Raw data may contain records processed before.
I have 4 tables:
Contracts (ID, Contract_Description, etc),
Sections (ID, Section, etc),
Items (ID, Description, Category, etc) and
Build_list (ContractID, SectionID, ItemID, Qty, StartDate, etc)
To import the data I do multiple steps.
1. Add 3 extra fields to Raw_Build_List: ContractID, SectionID and ItemID.
2. I run a conditional insert into each table (Contracts, Sections and Items) using a group by eg.
INSERT INTO Contracts (Contract_Description)
SELECT ContractDesctiption
GROUP BY ContractDesctiption
WHERE ContractDescription NOT IN (SELECT Contract_Description FROM Contracts)
3. Then using the IDs generated by the Identity Fields I update the new fields in the Raw_Build_List table. eg.
UPDATE Raw_Build_List
SET ContractID = ID
FROM Contracts LEFT OUTER JOIN Raw_Build_List
ON Contracts.Contract_Description = Raw_Build_List.ContractDesctiption
4. Now in the Raw_Build_List I have all the key fields against all the correct descriptions and I want to put all those keys into the Build_list table and not create duplicates.
Is there a non-RBAR way of doing this?
The one way I thought may work is to string all the IDs together and compare them to the string of all the IDs in the other table and it does work but as expected is really slow. RIGHT('00000000'+LTRIM(RTRIM(ContractID))+RIGHT('00000000'+LTRIM(RTRIM(SectionID))+RIGHT('00000000'+LTRIM(RTRIM(ItemID))
NOTE: I had to pad it as Contract=1, Section=24 and Item=45 (joined = 12445) will match up with Contract=12, Section=44 and Item 5 (joined = 12445).
There has to be a better way of doing this...:blink:
April 8, 2013 at 6:26 am
Does this help?
-- set up some sample data
DROP TABLE #Contracts
CREATE TABLE #Contracts (ID INT IDENTITY, Contract_Description VARCHAR(50), StartDate DATETIME)
DROP TABLE #Sections
CREATE TABLE #Sections (ID INT IDENTITY, Section VARCHAR(50))
DROP TABLE #Items
CREATE TABLE #Items (ID INT IDENTITY, [Description] VARCHAR(50), Qty INT)
DROP TABLE #Build_list
CREATE TABLE #Build_list (
ContractID INT,
SectionID INT,
ItemID INT,
Contract_Description varchar(30),
Section_Description varchar(25),
Item_Description varchar(30),
Qty INT,
StartDate DATETIME)
INSERT INTO #Build_list (Contract_Description, Section_Description, Item_Description, Qty, StartDate)
SELECT 'Contract 1', 'Section 1', 'Item 1', 10, GETDATE()-22 UNION ALL
SELECT 'Contract 2', 'Section 2', 'Item 2', 20, GETDATE()-21 UNION ALL
SELECT 'Contract 3', 'Section 3', 'Item 3', 30, GETDATE()-20 UNION ALL
SELECT 'Contract 3', 'Section 3', 'Item 3', 40, GETDATE()-19 UNION ALL
SELECT 'Contract 4', 'Section 4', 'Item 4', 50, GETDATE()-18 UNION ALL
SELECT 'Contract 5', 'Section 5', 'Item 5', 60, GETDATE()-17
-- create a table to receive OUTPUT rows
DROP TABLE #InsertedContracts
CREATE TABLE #InsertedContracts (ID INT, Contract_Description VARCHAR(50), StartDate DATETIME)
-- insert distinct new values from source table into contract table,
-- output new ID's etc into temp table
-- Contract_Description, StartDate are keys for markback
INSERT INTO #Contracts (Contract_Description, StartDate)
OUTPUT inserted.ID, inserted.Contract_Description, inserted.StartDate INTO #InsertedContracts
SELECT
Contract_Description,
StartDate
FROM (
SELECT
Contract_Description,
StartDate,
rn = ROW_NUMBER() OVER(PARTITION BY Contract_Description ORDER BY StartDate)
FROM #Build_list b
WHERE NOT EXISTS(SELECT 1 FROM #Contracts c WHERE c.Contract_Description = b.Contract_Description)
) s
WHERE rn = 1
-- markback source table
UPDATE b SET ContractID = c.ID
FROM #Build_list b
INNER JOIN #InsertedContracts c
ON c.Contract_Description = b.Contract_Description AND c.StartDate = b.StartDate
-- check results
SELECT * FROM #Build_list
SELECT * FROM #Contracts
/* a future release of SQL Server will allow you to do this in one step using composable dml:
UPDATE b SET ContractID = c.ID
FROM #Build_list b
INNER JOIN (
INSERT INTO #Contracts (Contract_Description, StartDate)
OUTPUT inserted.ID, inserted.Contract_Description, inserted.StartDate INTO #InsertedContracts
SELECT
Contract_Description,
StartDate
FROM (
SELECT
Contract_Description,
StartDate,
rn = ROW_NUMBER() OVER(PARTITION BY Contract_Description ORDER BY StartDate)
FROM #Build_list b
WHERE NOT EXISTS(SELECT 1 FROM #Contracts c WHERE c.Contract_Description = b.Contract_Description)
) s
WHERE rn = 1
) c (ID, Contract_Description, StartDate)
ON c.Contract_Description = b.Contract_Description AND c.StartDate = b.StartDate
*/
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
April 8, 2013 at 8:21 am
Not really let me put a full copy of the example:
--Okay Slightly different sample data
DROP TABLE Contracts
CREATE TABLE Contracts (ID INT IDENTITY, Contract_Description VARCHAR(50), StartDate DATETIME)
DROP TABLE Sections
CREATE TABLE Sections (ID INT IDENTITY, Section VARCHAR(50))
DROP TABLE Items
CREATE TABLE Items (ID INT IDENTITY, [Description] VARCHAR(50), Category VARCHAR(15))
DROP TABLE Build_list
CREATE TABLE Build_list (
ContractID INT,
SectionID INT,
ItemID INT,
Qty INT,
FlagCollect BIT)
DROP TABLE Raw_Build_list
CREATE TABLE Raw_Build_list (
Contract_Description varchar(30),
Section_Description varchar(25),
Item_Description varchar(30),
Qty INT)
INSERT INTO Raw_Build_list (Contract_Description, Section_Description, Item_Description, Qty)
SELECT 'Build Car 1', 'Trunk', 'Trunk Latch', 2
UNION ALL SELECT 'Build Car 1', 'Trunk', 'Cover', 1
UNION ALL SELECT 'Build Car 1', 'Drivers Side', 'Front Door', 1
UNION ALL SELECT 'Build Car 1', 'Drivers Side', 'Indicator', 2
UNION ALL SELECT 'Build Car 1', 'Drivers Side', 'Handle', 2
UNION ALL SELECT 'Build Car 1', 'Passenger Side', 'Front Door', 1
UNION ALL SELECT 'Build Car 2', 'Trunk', 'Latch', 2
UNION ALL SELECT 'Build Car 2', 'Trunk', 'Cover', 1
UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Front Door', 1
UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Back Slide Door', 1
UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Indicator', 2
UNION ALL SELECT 'Build Car 2', 'Drivers Side', 'Handle', 2
UNION ALL SELECT 'Build Car 2', 'Passenger Side', 'Front Door',1
INSERT INTO Contracts (Contract_Description, StartDate)
SELECT 'Build Car 1', GETDATE()- 12
INSERT INTO Sections (Section)
SELECT 'Trunk' UNION ALL
SELECT 'Drivers Side' UNION ALL
SELECT 'Passenger Side'
INSERT INTO Items (Description, Category)
SELECT 'Front Door', 'BODY WORK' UNION ALL
SELECT 'Trunk Latch', 'SMALLS' UNION ALL
SELECT 'Cover', 'BODY WORK' UNION ALL
SELECT 'Indicator', 'LIGHTS' UNION ALL
SELECT 'Handle', 'SMALLS'
INSERT INTO Build_list (ContractID, SectionID, ItemID, Qty)
SELECT 1, 1, 2, 2
UNION ALL SELECT 1, 1, 3, 1
UNION ALL SELECT 1, 2, 1, 1
UNION ALL SELECT 1, 2, 4, 2
UNION ALL SELECT 1, 2, 5, 2
--SELECT * FROM Contracts
--SELECT * FROM Sections
--SELECT * FROM Items
--SELECT * FROM Build_list
--SELECT * FROM Raw_Build_list
--HERE IS WHERE THE WORK STARTS
ALTER TABLE Raw_Build_List ADD ContractID bigint
ALTER TABLE Raw_Build_List ADD SectionID bigint
ALTER TABLE Raw_Build_List ADD ItemID bigint
--INSERT ALL THE CONTRACTS
INSERT INTO Contracts (Contract_Description)
SELECT Contract_Description
FROM Raw_Build_list
GROUP BY Contract_Description
HAVING (NOT (Contract_Description IN
(SELECT Contract_Description
FROM Contracts)))
--UPDATE ID FIELD FOR CONTRACTS
UPDATE Raw_Build_list
SET ContractID = Contracts.ID
FROM Contracts LEFT OUTER JOIN
Raw_Build_list ON Contracts.Contract_Description = Raw_Build_list.Contract_Description
--INSERT ALL THE SECTIONS
INSERT INTO Sections (Section)
SELECT Section_Description
FROM Raw_Build_list
GROUP BY Section_Description
HAVING (NOT (Section_Description IN
(SELECT Section
FROM Sections)))
--UPDATE ID FIELD FOR SECTIONS
UPDATE Raw_Build_list
SET SectionID = Sections.ID
FROM Sections LEFT OUTER JOIN
Raw_Build_list ON Sections.Section = Raw_Build_list.Section_Description
--INSERT ALL THE ITEMS
INSERT INTO Items (Description)
SELECT Item_Description
FROM Raw_Build_list
GROUP BY Item_Description
HAVING (NOT (Item_Description IN
(SELECT Description
FROM Items)))
--UPDATE ID FIELD FOR ITEMS
UPDATE Raw_Build_list
SET ItemID = Items.ID
FROM Items LEFT OUTER JOIN
Raw_Build_list ON Items.Description = Raw_Build_list.Item_Description
--NOW THIS WORKS BUT IS VERY, VERY SLOW __THERE HAS TO BE A BETTER WAY TO DO THIS
INSERT INTO Build_list
(ContractID, SectionID, ItemID, Qty)
SELECT ContractID, SectionID, ItemID, Qty
FROM Raw_Build_list
WHERE ((RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)) NOT IN
(SELECT RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)
AS Joinid
FROM Build_list))
--The results of these two queries should be identical
SELECT Contract_Description, Section_Description, Item_Description, Qty
FROM Raw_Build_list
SELECT Contracts.Contract_Description, Sections.Section, Items.Description, Build_list.Qty
FROM Build_list LEFT OUTER JOIN
Items ON Build_list.ItemID = Items.ID LEFT OUTER JOIN
Sections ON Build_list.SectionID = Sections.ID LEFT OUTER JOIN
Contracts ON Build_list.ContractID = Contracts.ID
April 8, 2013 at 8:27 am
Try changing
INSERT INTO Build_list
(ContractID, SectionID, ItemID, Qty)
SELECT ContractID, SectionID, ItemID, Qty
FROM Raw_Build_list
WHERE ((RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)) NOT IN
(SELECT RIGHT('000000' + LTRIM(RTRIM(ContractID)), 6) + RIGHT('000000' + LTRIM(RTRIM(SectionID)), 6) + RIGHT('000000' + LTRIM(RTRIM(ItemID)), 6)
AS Joinid
FROM Build_list))
to
INSERT INTO Build_list
(ContractID, SectionID, ItemID, Qty)
SELECT ContractID, SectionID, ItemID, Qty
FROM Raw_Build_list
EXCEPT
SELECT ContractID, SectionID, ItemID, Qty
FROM Build_list
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537April 9, 2013 at 2:49 am
Thanks Mark that is exactly what I'm looking for!
Simple and fast!
๐
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply