April 29, 2004 at 10:00 am
I'd like to get a sanity check on this, and any comments regarding 'been there done that watch out for x'.
Objective:
Must support legacy business processes. I need to be able to report ALL constraint violations within a repetitive mass insert/update at once, so that they can be submitted in a report to multiple PoCs (points of contact) for correction.
Where I'm at now:
I don't want to force maintenance of TWO sets of constraints, foreign keys, etc. (one set in the schema, another in an import validation procedure). I don't want to use triggers for validation instead of constraints.
dbo.fnHelpConstraint(@TableName)
a table function built from a mutilated version of sp_helpconstraint that returns customized data
the returned data includes a 'ValidationQuery' field containing a pre-built SELECT statement
UTIL_ReportConstraintViolations sp
@ConstraintTable nvarchar(128)
,@ErrorTableToCreate_WithOverwrite nvarchar(128)
,@ViolationsOccured int output
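To make the moving parts concrete, here's a rough sketch of what the iteration inside UTIL_ReportConstraintViolations might look like. The column names coming back from fnHelpConstraint and the shape of the error table are my assumptions about the design described above, not actual code:

-- Hypothetical sketch only: walk fnHelpConstraint's pre-built validation
-- queries and collect the violating rows into the report table.
DECLARE @ConstraintName sysname
DECLARE @ValidationQuery nvarchar(4000)
DECLARE @Sql nvarchar(4000)

SET @ViolationsOccured = 0

DECLARE curChecks CURSOR LOCAL FAST_FORWARD FOR
    SELECT ConstraintName, ValidationQuery      -- assumed column names
    FROM dbo.fnHelpConstraint(@ConstraintTable)

OPEN curChecks
FETCH NEXT FROM curChecks INTO @ConstraintName, @ValidationQuery

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Each ValidationQuery is assumed to SELECT the offending rows;
    -- tag them with the constraint name and stash them for the report.
    SET @Sql = N'INSERT INTO ' + @ErrorTableToCreate_WithOverwrite
             + N' SELECT N''' + @ConstraintName + N''' AS ConstraintName, v.*'
             + N' FROM (' + @ValidationQuery + N') AS v'
    EXEC (@Sql)
    SET @ViolationsOccured = @ViolationsOccured + @@ROWCOUNT

    FETCH NEXT FROM curChecks INTO @ConstraintName, @ValidationQuery
END

CLOSE curChecks
DEALLOCATE curChecks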
I intend to put an INSTEAD OF trigger (SQL Server's closest thing to a 'before' trigger) on the insert/update into the products table. Inside it, I'll need to copy the entire contents of 'inserted' into a table that can be seen outside the trigger, then call UTIL_ReportConstraintViolations to generate a 'static' dataset of all constraint violations for reporting.
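As a sketch of that trigger idea (all names here are made up):

-- Hypothetical: park the pending rows in a permanent table that the
-- validation/report procedure can see, since 'inserted' is trigger-scoped.
CREATE TRIGGER trProducts_ValidateInsert ON dbo.Products
INSTEAD OF INSERT
AS
BEGIN
    DELETE dbo.Products_Pending                     -- assumed work table
    INSERT dbo.Products_Pending SELECT * FROM inserted
    -- the validation and conditional insert happen per the outline below
END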
I am still a bit shaky on the trigger/error/begin tran/commit tran/rollback stuff, but here's the outline (a sketch of the transaction handling follows it):
import process
    wipe report table
    set @TranCount = @@TRANCOUNT
    begin tran
    insert new records (INSTEAD OF trigger fires)
        build an out-of-scope visible table from 'inserted'
        using fnHelpConstraint, iterate through the schema-driven validation queries and fill the report table with violations
        if no violations occurred
            insert into products table from 'inserted'
    update existing records (INSTEAD OF trigger fires)
        build an out-of-scope visible table from 'inserted'
        using fnHelpConstraint, iterate through the schema-driven validation queries and fill the report table with violations
        if no violations occurred
            update products table from 'inserted'
    if @@TRANCOUNT = @TranCount + 1
        commit tran
    if @@TRANCOUNT <> @TranCount
        warn of error
    if @ViolationsOccured
        warn of insert and update issues separately
        advise calling the report procedure for correction of issues
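And a minimal T-SQL sketch of just the transaction bookkeeping above, assuming the proc and parameter names from earlier (the warn/report plumbing is left out, and the error-table name is invented):

DECLARE @TranCount int, @ViolationsOccured int
SET @TranCount = @@TRANCOUNT

BEGIN TRAN

-- ...the inserts/updates and trigger-driven validation happen here...
EXEC dbo.UTIL_ReportConstraintViolations
     @ConstraintTable = N'Products',
     @ErrorTableToCreate_WithOverwrite = N'ImportViolations',  -- assumed name
     @ViolationsOccured = @ViolationsOccured OUTPUT

IF @ViolationsOccured = 0 AND @@TRANCOUNT = @TranCount + 1
    COMMIT TRAN          -- everything passed: make the batch permanent
ELSE IF @@TRANCOUNT > @TranCount
    ROLLBACK TRAN        -- violations found: undo the whole batch

-- ROLLBACK pops ALL open transactions, so compare against the saved count
IF @@TRANCOUNT <> @TranCount
    RAISERROR('Transaction count mismatch after import.', 16, 1)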
Thoughts?
Background:
I've worked for a company for 8 years that has horrible business processes (but they do work). There are no business process analysts, per se. Any time a business process change (even one that's approved, correct, and well thought out) becomes part of an IT project, the solution ends up in either disuse or misuse. I am in the process of migrating a FoxPro table solution that is being used over a WAN to SQL Server.
It's a complex line to walk, but typically it seems that as long as the legacy-replacement project supports the current business processes, and meets the required functionality, it'll get used. The trick is to not create another monster... whee.
There are two products tables that have evolved on their own (i.e. they're similar but not identical). The pricing, etc. for ~20 country 'categories' (legal considerations, etc.) is created by an ever-changing conglomerate of folks scattered across this office, the UK office, and the NZ office (which handles Japanese orders only). They then dump the imports and change requests on an old-timer here, who just manually goes out and does most of the imports and changes. He does, however, have a specific format he accepts data in for import/replacement when they do a full published price-list update. There is also an app where non-IT people can update, add, and delete individual items. My goal is to support the current business process of price changes.
If I make the mass imports too difficult, the house of cards that centers on the old-timer doing his updates will collapse, and I'll get sucked into the business process. I do NOT, however, want a SQL schema that has no constraints or a bad design - it'll make any front ends using the products data error-prone.
April 29, 2004 at 10:04 pm
My thoughts are that if you are trying to import data that is delimited, fixed-field, or a combo of both, you should probably take the time to use BCP. Yeah, I know... a bit cryptic, but well worth the effort. It will create runtime log files and error log files, and will put "bad records" in a holding file. And of the three methods of importing data (the other two being BULK INSERT and DTS), it is the hands-down winner.
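For reference, a typical call looks something like this (server, database, table, and paths are all made up; -m caps the number of rejected rows, -e collects the "bad records", and -o keeps the run log):

-- Runs the command-line BCP utility from T-SQL; every name and
-- file location below is a placeholder.
EXEC master..xp_cmdshell
    'bcp MyDB.dbo.ProductStaging in C:\imports\prices.txt -SMyServer -T -m20 -fC:\imports\prices.fmt -eC:\imports\prices.err -oC:\imports\prices.log'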
--Jeff Moden
Change is inevitable... Change for the better is not.
April 30, 2004 at 7:53 am
Won't BCP only record single constraint violations? I've only used it once, and just did a simple dump OUT. I should read up on it.
Thanks
May 1, 2004 at 7:41 pm
It won't actually report which constraint was violated, but it will reject to the "error log" any record that violates a constraint, key, foreign key, data type, or column width. Using the -m parameter, you can set it to allow, say, 20 such errors (actually, any number of records that have any error). If more records than the number you set have errors, the BCP run will shut down.
Provided that the header contains the same number of delimiters as are identified in the format file, you can even tell it to skip the header without writing a separate program. If a record has the wrong number of delimiters or is missing the "end of record" mark, BCP may take out the bad record plus the record that follows, until it gets "back in sync" (which would be true of nearly any import except one that reads a whole record first).
The format file can be a real pain to get right, but it's well worth it. You can even tell BCP to skip unwanted fields by setting the format file up correctly. I've even done double-quoted comma-separated values with it.
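For what it's worth, a SQL Server 2000-style (version 8.0) format file that skips an unwanted second field by mapping it to server column 0 might look like this (field widths and names are invented):

8.0
3
1   SQLCHAR   0   12   ","       1   ProductCode    ""
2   SQLCHAR   0   50   ","       0   IgnoredField   ""
3   SQLCHAR   0   19   "\r\n"    2   Price          ""

The columns are: host field number, file data type, prefix length, maximum field length, terminator, target column number (0 = skip), target column name, and collation.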
P.S. I totally agree with your decision not to use triggers. About the only time I use them is to guarantee referential integrity across remote servers. Even then, if other forms of "RI" worked there, I wouldn't use triggers.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 4, 2004 at 8:03 am
After testing, BCP is fast (no surprise there), and it does give the error log nicely. I'll likely use it over DTS (DTS has its uses, but oof, I can't stand those weird no-info errors it gives) for importing history data when it's needed.
This time, however, because there is an explicit ongoing small-import process to update price and VAT changes for the UK office, I gave it a no-constraint mirror of the target table, and I'm going to use that function I wrote to spit out the constraint-violation report. The mirror table can also be used to store the data once validated, so that it can be pushed live on a schedule (another need). Using this method, I don't have to tinker with the INSTEAD OF triggers, etc.
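For anyone following along, the quick way I know to get a constraint-free mirror of a table (names assumed) is:

-- Copies the column layout of the live table but none of its
-- constraints, keys, indexes, or triggers (IDENTITY does carry over).
SELECT *
INTO dbo.Products_Staging
FROM dbo.Products
WHERE 1 = 0        -- no rows, just the structure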
The UK office publishes price sheets, and the amounts must match them exactly. Once they're printed, it costs a lot of money to throw a bunch away due to a small error. If it weren't for past failed projects by others who attempted to implement business process changes (i.e. the published price sheet should in fact be a report that gets printed out, not the other way around), I'd approach it differently.
May 4, 2004 at 7:21 pm
Cmore,
Here's one more little "secret" of mine... I never use BCP to import into a "live" table. I always use it to populate a "staging" table and then do any extra validation I need in the staging table before I move the data (usually just an INSERT). The staging table can have all sorts of extra columns to store pertinent "values" for repeated testing on each batch of records imported. Cleanup is a snap... I just TRUNCATE the staging table, which starts the IDs at "1" again (that can be a huge help!) and also keeps a bazillion transactions out of the LDF (log) file of the database.
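A stripped-down sketch of that flow, with made-up names and a hypothetical validation column:

-- Reset the staging table: TRUNCATE reseeds the IDENTITY column and is
-- minimally logged, so the LDF stays small.
TRUNCATE TABLE dbo.ProductStaging

-- (BCP the file into dbo.ProductStaging here)

-- Move only the rows that passed the extra validation into the live table.
INSERT INTO dbo.Products (ProductCode, Price)
SELECT ProductCode, Price
FROM dbo.ProductStaging
WHERE ValidationStatus = 'OK'    -- extra working column, staging-only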
--Jeff Moden
Change is inevitable... Change for the better is not.