April 19, 2018 at 11:10 am
richardmgreen1 - Thursday, April 19, 2018 8:00 AM
I'm aware you suggested running it against a single table, but this is on our development server (with backups) and we have a lot of tables with this issue. It looks like we'll need to find another way of loading this data.
I'm still confused as to why it loads into our staging area without an issue but chokes when we move it to the "proper" table for reporting purposes.
How exactly are you moving the data? You mentioned you were doing a MERGE as well as INSERT/UPDATE. Are you modifying the data during that process?
We've tried doing a MERGE as well as INSERT/UPDATE separately.
Both give the same error. There are no changes happening to the data at the point of the move.

Jeff Moden - Thursday, April 19, 2018 8:09 AM
richardmgreen1 - Wednesday, April 18, 2018 6:32 AM
The tables are, for the most part, identical.
I know it seems too obvious, but I suspect that if you can identify where they are different, you'll have your answer.
I've posted both table create statements in my OP (I think).
There's one extra field in the "actual" table (SYSDateLoaded) and, apart from that and a few minor field name changes to keep to our naming conventions (pk for primary key, fk for foreign key, etc.), I can't see any differences.
Happy for a second (or even third) pair of eyes, though.
Yes. I know you've posted the DDL for both tables. I just don't have the time to do a comparison. Just wanted to make sure that you did.
Yeah, I put both create statements into Excel and did a "does this equal that".
The ones that didn't match, I could explain by the slight change of name (or the inclusion of a primary key in the actual table, which gave extra statements).
I'll try it in SQL Compare (now we've managed to get work to pay for the licence).
::edit::
Just used SQL Compare and the tables are identical apart from 4 minor differences:-
1) Small name changes (as mentioned before, this is for our naming conventions)
2) Differences in NULL allowances (some fields allow NULLs on the staging table but not on the actual table)
3) Actual table has a trigger and a clustered primary key on it.
4) There are two indexes on the actual table. These are nonclustered indexes on fkAccountNumber and fkPatientLocalID.
I'm not sure how much difference the trigger will make, as it only updates the SYSDateLastUpdated field to today's date so we can track when the data changes. I know the indexes will add a little overhead, but I don't really want to get rid of them.
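For reference, the trigger is roughly this shape (a simplified sketch, not the exact production code; the table name here is made up):

```sql
-- Simplified sketch of the audit trigger on the actual table.
-- dbo.DocumentActual is a placeholder name; key columns are from my DDL.
CREATE TRIGGER trg_Document_SetLastUpdated
ON dbo.DocumentActual
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE d
    SET    SYSDateLastUpdated = GETDATE()
    FROM   dbo.DocumentActual AS d
    JOIN   inserted AS i
      ON   i.pkDocumentID      = d.pkDocumentID
     AND   i.pkEditedByStaffID = d.pkEditedByStaffID
     AND   i.pkEditedDateTime  = d.pkEditedDateTime;
END;
```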
Any chance that the one additional column is enough to push you over the edge? Alternatively, any chance one of the indexes ends up too wide?
Steve (aka sgmunson)
Rent Servers for Income (picks and shovels strategy)
April 20, 2018 at 2:31 am
It's possible; the extra column is a DATETIME column (which would add 8 bytes).
The primary key columns add up to a maximum of 67 bytes. Those columns are:-
[pkDocumentID] [VARCHAR](30)
[pkEditedByStaffID] [VARCHAR](25)
[pkEditedDateTime] [DATETIME]
The current maximum data lengths in the above VARCHAR columns are 24 and 10 characters respectively.
I'm not sure about the non-clustered indexes but they are on columns that are:-
[fkAccountNumber] [VARCHAR](12)
[fkPatientLocalID] [VARCHAR](10)
Those columns are fixed-length in practice; the values are always the full declared length.
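If it helps anyone checking their own tables, a query along these lines reports the declared maximum in-row size per table. This is only a sketch; it ignores the row header, null bitmap and variable-length overhead, so treat the figure as a lower bound against the 8,060-byte row limit:

```sql
-- Approximate declared maximum in-row bytes per table.
-- Ignores row header, null bitmap and variable-length column overhead,
-- so the real limit is reached a little earlier than this suggests.
SELECT t.name            AS table_name,
       SUM(c.max_length) AS declared_max_bytes,
       COUNT(*)          AS column_count
FROM   sys.tables  AS t
JOIN   sys.columns AS c ON c.object_id = t.object_id
WHERE  c.max_length > 0   -- exclude (MAX) types, reported as -1
GROUP  BY t.name
ORDER  BY declared_max_bytes DESC;
```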
April 20, 2018 at 3:58 am
Jeff Moden - Thursday, April 19, 2018 8:09 AM
richardmgreen1 - Wednesday, April 18, 2018 6:32 AM
The tables are, for the most part, identical.
I know it seems too obvious, but I suspect that if you can identify where they are different, you'll have your answer.
They are different - the target table has an extra column
[SYSDateLoaded] [DATETIME] NOT NULL
I wonder if it's just this which is tipping it over the edge?
More to the point, as others have pointed out this degree of denormalisation isn't a good idea. Normalising the data would eliminate the problem, take up far less storage overall and make some queries far simpler.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
April 20, 2018 at 5:01 am
I'd spotted that (see my previous reply) but I'm not sure if that would tip me over the edge (figuratively speaking).
We could normalise the data but it would mean departing from our standard of adding the documents in their current form.
I'm also not sure how we would split columns to become rows (outer apply perhaps?).
As previously mentioned, we've asked for the document to be changed to repeating sections so we only get one copy of each field but we need to load this data regardless.
I'm going to try leaving the SYSDateLoaded field blank (i.e. NULL) to see if that helps.
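On the columns-to-rows question: the usual trick is CROSS APPLY (VALUES ...) rather than OUTER APPLY. A sketch of the idea, with made-up table and column names standing in for the repeated field sets:

```sql
-- Sketch: turning repeated column sets into rows with CROSS APPLY (VALUES).
-- dbo.StagingTable and the *_1 / *_2 column names are illustrative only.
SELECT s.pkDocumentID,
       v.SectionNo,
       v.MedicationStopped,
       v.MedicationName
FROM   dbo.StagingTable AS s
CROSS APPLY (VALUES
    (1, s.medicationStoppedAndReason_1, s.medicationName_1),
    (2, s.medicationStoppedAndReason_2, s.medicationName_2)
) AS v (SectionNo, MedicationStopped, MedicationName)
WHERE  v.MedicationStopped IS NOT NULL;   -- drop empty repeats
```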
April 20, 2018 at 5:13 am
Despite the fact that this may sound like a lot of work, I would recommend it. People here would be happy to help with
a) The normalisation process
b) Any queries required to support usage of the normalised table(s)
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
April 20, 2018 at 6:52 am
Thanks all
Unfortunately, I can't make that decision arbitrarily.
I'll have a chat with my boss and see what he says.
I think we'll have to go that way, but he likes to have uniformity in code, etc.
April 20, 2018 at 7:13 am
richardmgreen1 - Friday, April 20, 2018 6:52 AM
Thanks all
Unfortunately, I can't make that decision arbitrarily.
I'll have a chat with my boss and see what he says.
I think we'll have to go that way, but he likes to have uniformity in code, etc.
I wouldn't blame this table on "uniformity".
--Jeff Moden
Change is inevitable... Change for the better is not.
April 20, 2018 at 7:25 am
richardmgreen1 - Friday, April 20, 2018 7:19 AM
Me neither, but the load process is uniform.
I'll fully agree the document is cr@p.
So the reporting system, the final resting place of this data, has the same denormalised structure as the two tables you've shown?
I was about to post something similar to Jeff's post but with less restraint - however I don't think it's totally clear yet what the final structure is, which is why I'm asking. And of course, if you already have reports running off this structure, you're not going to change it on a whim. Perhaps nudging a SQL Server limit is sufficient motivation to revisit your table structures.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
April 20, 2018 at 7:49 am
At that stage, yes.
Personally, I don't think we'll ever use this particular document for reporting purposes, but we've been told to load it "just in case".
We do need to use some of the tables that have the same problem so it looks like a rethink is in order.
Other documents we load are a bit more sensible, with repeating sections. In those, we have one set of fields, e.g. (just a couple of fields from this problem table):-
medicationStoppedAndReason_MedicationStoppedAndReason
medicationStoppedAndReason_Name
They only appear once, with a numeric marker to say which set of medication (in this case) people are talking about.
April 20, 2018 at 8:14 am
Again, the easiest solution is to simply split into two tables.
You can even keep the original table name if you:
1) Create the new table, with all the key columns and roughly half the other data columns, just adding a '2' to the original table name
2) Copy the data from existing table to new table.
3) Remove the data columns that are now in table 2 from original table.
4) Rename original table to add a '1' suffix to it.
5) Create a view that combines the two tables to produce the complete original table structure.
I believe you could use an INSTEAD OF INSERT trigger on the view to allow you to INSERT to the view name but have it properly inserted into the split tables underneath.
Overall, I think you'd need very few, if any, changes to existing code.
Whereas trying to normalize the structure would almost certainly require fairly extensive changes to code, along with being a completely new approach to existing users and developers.
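In outline, the steps above might look like this (table, view and column names are made up for illustration; the real tables have hundreds of columns where this shows one or two):

```sql
-- 1) and 2): build the new table with the key columns plus roughly
--            half the data columns (names here are placeholders).
SELECT pkDocumentID, pkEditedByStaffID, pkEditedDateTime,
       DataCol151 /* ...through the last data column... */
INTO   dbo.WideDoc2
FROM   dbo.WideDoc;

-- 3) and 4): drop the moved columns from the original and rename it.
ALTER TABLE dbo.WideDoc DROP COLUMN DataCol151; -- repeat per moved column
EXEC sp_rename 'dbo.WideDoc', 'WideDoc1';

-- 5): a view presenting the original shape under the original name.
GO
CREATE VIEW dbo.WideDoc
AS
SELECT t1.pkDocumentID, t1.pkEditedByStaffID, t1.pkEditedDateTime,
       t1.DataCol1 /* ... */, t2.DataCol151 /* ... */
FROM   dbo.WideDoc1 AS t1
JOIN   dbo.WideDoc2 AS t2
  ON   t2.pkDocumentID      = t1.pkDocumentID
 AND   t2.pkEditedByStaffID = t1.pkEditedByStaffID
 AND   t2.pkEditedDateTime  = t1.pkEditedDateTime;
GO

-- INSTEAD OF INSERT trigger so existing loads can still target the view.
CREATE TRIGGER trg_WideDoc_Insert ON dbo.WideDoc
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.WideDoc1 (pkDocumentID, pkEditedByStaffID, pkEditedDateTime, DataCol1 /* ... */)
    SELECT pkDocumentID, pkEditedByStaffID, pkEditedDateTime, DataCol1 /* ... */
    FROM   inserted;

    INSERT INTO dbo.WideDoc2 (pkDocumentID, pkEditedByStaffID, pkEditedDateTime, DataCol151 /* ... */)
    SELECT pkDocumentID, pkEditedByStaffID, pkEditedDateTime, DataCol151 /* ... */
    FROM   inserted;
END;
```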
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
April 25, 2018 at 3:42 am
We've decided to split the table into sections (with _1, _2, etc. where necessary).
We kept the first few columns static in each table so we can join them together if we need to.
Seems to be working for now.
I guess we'll be writing views to stitch everything back together for a while.
Thanks all.
April 25, 2018 at 7:38 am
richardmgreen1 - Wednesday, April 25, 2018 3:42 AM
We've decided to split the table into sections (with _1, _2, etc. where necessary).
We kept the first few columns static in each table so we can join them together if we need to.
Seems to be working for now.
I guess we'll be writing views to stitch everything back together for a while.
Thanks all.
You've done the naming correctly... don't write the views... write some code that automatically detects the table sections and sews them together.
Also, I'm curious to know what you decided to base the split on. If you did it right, you should put a constraint on that column for each table section and then your views will actually be partitioned views, which have some advantages.
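For anyone following along: a partitioned view in that sense relies on a CHECK constraint on the partitioning column of each member table, so the optimizer can skip member tables whose constraint excludes the query's predicate. A minimal sketch with made-up names (note this applies to a horizontal split):

```sql
-- Sketch of a local partitioned view. Each member table carries a
-- CHECK constraint on the partitioning column (all names are made up).
CREATE TABLE dbo.Docs2017
( DocID   INT NOT NULL,
  DocYear INT NOT NULL CHECK (DocYear = 2017),
  CONSTRAINT PK_Docs2017 PRIMARY KEY (DocID, DocYear) );

CREATE TABLE dbo.Docs2018
( DocID   INT NOT NULL,
  DocYear INT NOT NULL CHECK (DocYear = 2018),
  CONSTRAINT PK_Docs2018 PRIMARY KEY (DocID, DocYear) );
GO

CREATE VIEW dbo.Docs
AS
SELECT DocID, DocYear FROM dbo.Docs2017
UNION ALL
SELECT DocID, DocYear FROM dbo.Docs2018;
GO

-- A query with WHERE DocYear = 2018 should touch only dbo.Docs2018.
```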
--Jeff Moden
Change is inevitable... Change for the better is not.
April 25, 2018 at 7:51 am
To be honest, it wasn't very scientific.
We split them into 300 column "lumps".
We've quite a few of these stupidly wide tables to load but didn't notice at the time.
From my original post, we kept the following fields in each of the new tables:-
[DocumentID]
[EditedByStaffID]
[EditedDateTime]
[AccountNumber]
[PatientLocalID]
[CreatedByStaffID]
[CreatedDateTime]
[DocumentStatus]
I'm sorry, I didn't quite understand this bit:-
If you did it right, you should put a constraint on that column for each table section and then your views will actually be partitioned views, which have some advantages.
April 25, 2018 at 7:58 am
richardmgreen1 - Wednesday, April 25, 2018 7:51 AM
To be honest, it wasn't very scientific.
We split them into 300 column "lumps".
We've quite a few of these stupidly wide tables to load but didn't notice at the time.
From my original post, we kept the following fields in each of the new tables:-
[DocumentID]
[EditedByStaffID]
[EditedDateTime]
[AccountNumber]
[PatientLocalID]
[CreatedByStaffID]
[CreatedDateTime]
[DocumentStatus]
I'm sorry, I didn't quite understand this bit:-
If you did it right, you should put a constraint on that column for each table section and then your views will actually be partitioned views, which have some advantages.
Ah... my bad. I forgot you were doing vertical splits instead of horizontal splits.
--Jeff Moden
Change is inevitable... Change for the better is not.