Add Column to Staging table

Question

Add Column to Staging table

JJR333

Default port

Points: 1469
More actions
November 29, 2017 at 11:22 am

#341040

Hi,
The code below used to work fine.
It is part of a stored procedure which bulk inserts data from a csv file.
Suddenly, after editing, I keep on getting an error stating that RowNr does not exist.
Any ideas?
Cheers,
Julian
Netherlands
/*##############################################   ###   ### FIND DUPLICATE ROWS AND DELETE   ### THESE ROWS EXEPT ONE      ###      ############################################## */
  -- ADD COLUMN TO STORE ROW NUMBER   --ALTER TABLE Bewoners_STAGING   --ADD RowNr INT;      SET @MYSQL = 'ALTER TABLE Bewoners_STAGING ADD RowNr INT;'   EXEC (@MYSQL);
  -- ADD NUMBER OF ROWS TO FIELD RowNr   UPDATE Bewoners_STAGING   SET RowNr =  R.RowNr           FROM(           SELECT ROW_NUMBER () OVER (PARTITION BY zzpclient_id ORDER BY [zzpclient_id]) AS RowNr,               zzpclient_id           FROM [Bewoners_STAGING]) R   WHERE R.zzpclient_id = Bewoners_STAGING.zzpclient_id      -- DELETE ROWS WHERE ROW NUMBER > 1   DELETE FROM Bewoners_STAGING WHERE RowNr > 1

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101126 More actions · Answer 1

SQL must parse the code before it runs. At parse time, the column "RowNr" indeed does not exist because the code that adds the column to the table hasn't run yet.

You could use dynamic SQL to run the UPDATE as well. I believe that would work.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

drew.allen SSC Guru Points: 76997 More actions · Answer 2

You're making this much more complicated than it needs to be. You're 1) adding a column, 2) updating the column, 3) deleting rows based on the column values, and presumably 4) dropping the column. You only need one step: delete duplicates.

WITH dupes_cte AS ( SELECT ROW_NUMBER () OVER (PARTITION BY zzpclient_id ORDER BY [zzpclient_id]) AS RowNr, zzpclient_id FROM [Bewoners_STAGING] )

DELETE dupes_cte WHERE RowNr > 1

Also notice that your RowNr definition is non-deterministic. This may cause issues down the line.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

JJR333 Default port Points: 1469 More actions · Answer 3

Drew and Scott,
Thank you.
Cheers,
Julian

USE xxx; GO WITH MYCTE AS ( SELECT COUNT (*) OVER (PARTITION BY zzpclient_id) AS [Rows], ROW_NUMBER () OVER (PARTITION BY zzpclient_id ORDER BY [zzpclient_id], [startdate] desc) AS [RowNr], * FROM CLIENTENOVERZICHT )

SELECT * FROM MYCTE WHERE RowNr = 1 ORDER BY ROWS DESC, RowNr ASC

JJR333 Default port Points: 1469 More actions · Answer 4

drew.allen - Wednesday, November 29, 2017 1:31 PM
You're making this much more complicated than it needs to be. You're 1) adding a column, 2) updating the column, 3) deleting rows based on the column values, and presumably 4) dropping the column. You only need one step: delete duplicates.
WITH dupes_cte AS (     SELECT ROW_NUMBER () OVER (PARTITION BY zzpclient_id ORDER BY [zzpclient_id]) AS RowNr,         zzpclient_id     FROM [Bewoners_STAGING] )
DELETE dupes_cte WHERE RowNr > 1
Also notice that your RowNr definition is non-deterministic. This may cause issues down the line.
Drew
--------------------------------------------
USE XXX; GO WITH MYCTE AS ( SELECT         --COUNT (*) OVER (PARTITION BY zzpclient_id) AS [Rows],         ROW_NUMBER () OVER (PARTITION BY zzpclient_id ORDER BY [zzpclient_id], [startdate] desc) AS [RowNrS],         * FROM    testBewonersSTAGING )
DELETE FROM MYCTE WHERE     RowNrS > 1

Hi Drew,
Now I see what you mean. Great!
Q: why does the delete statement "do its work" on the Staging table?
Cheers,
Julian

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

JJR333 - Thursday, November 30, 2017 2:01 PM
Hi Drew,
Now I see what you mean. Great!
Q: why does the delete statement "do its work" on the Staging table?
Cheers,
Julian

This follows the rules of "updatable views" as either a view or cte don't have any data stores and simply reference a table (or group of tables), any DML will affect those underlying tables.

Also, ROW_NUMBER is deterministic when the PARTITION and ORDER clauses refer to unique rows and not duplicates.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

JJR333 Default port Points: 1469 More actions · Answer 6

JJR333

Default port

Points: 1469

November 30, 2017 at 2:30 pm

#1970182

Thank you Luis!

Jeff Moden SSC Guru Points: 1004414 More actions · Answer 7

Deletes are pretty expensive and they're fully logged even in the SIMPLE recovery model. Why not just use the generated "dupe number" (row number) to process only the data you want?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

JJR333 Default port Points: 1469 More actions · Answer 8

Thank you Jeff, have to look into that.
Cheers,
Julian