Idempotent

  • Just as a general call: if anyone is interested in sharing how they do this for specific code (perhaps an example of how you handle data, config, table changes, etc. in an idempotent way), we'd love to have examples people can use to better understand how they can transition to this in practice.

  • Steve Jones - SSC Editor (5/6/2015)


    Bradley Deem (5/6/2015)


    I've been writing idempotent scripts for years. However, even that's not the best defense against a script run twice. I've found it's more economical to write a guard at the top of the script to check database/app version or existence of a change already being performed.

    In other words, stop the script immediately at start instead of reading through 20k lines to make sure everything was idempotent.

    Any examples here? What do you use to check for db version?

    Unfortunately, I don't have a SQL Server example, but here's the code in Oracle using SQL*Plus. The key to this solution is to carry the upgrade version in a script variable that survives DDL and DML changes.

    -- Use '~' as the SQL*Plus substitution character (the default is '&').
    SET DEFINE '~'

    -- Required database version for upgrade. This is the upgrade "from" version.
    DEF db_version_expected = '6'

    -- New database version after the upgrade scripts. This is the upgrade "to" version.
    DEF db_version_new = '7'

    PROMPT
    PROMPT -- Validate Database Version --
    PROMPT
    WHENEVER SQLERROR EXIT SQL.SQLCODE
    DECLARE
        db_version_actual VARCHAR2(200);
    BEGIN
        SELECT VALUE
          INTO db_version_actual
          FROM schema.config
         WHERE KEY = 'DB_VERSION';

        -- Halt on invalid database version
        IF db_version_actual <> '~db_version_expected' THEN
            RAISE_APPLICATION_ERROR(-20000, 'Database version is incorrect; expected '''
                || '~db_version_expected' || ''', found ''' || db_version_actual || '''.');
        END IF;

        -- Expected database version found.
        -- Set the database version to an upgrade state.
        UPDATE schema.config
           SET VALUE = '~db_version_new.U' -- Version is set to #U, i.e. '7U', for the upgrade.
         WHERE KEY = 'DB_VERSION';
        COMMIT;
    END;
    /

    -- Halt on failure
    WHENEVER SQLERROR EXIT -1
    WHENEVER OSERROR EXIT -2

    PROMPT
    PROMPT -- Begin Upgrade Scripts --
    PROMPT

    -- DO WORK HERE

    PROMPT
    PROMPT -- End Upgrade Scripts --
    PROMPT

    -- Upgrade the database version
    UPDATE schema.config
       SET VALUE = '~db_version_new'
     WHERE KEY = 'DB_VERSION';
    COMMIT;

  • Bradley Deem (5/6/2015)


    Unfortunately, I don't have a SQL Server example, but here's the code in Oracle using SQL*Plus. The key to this solution is to carry the upgrade version in a script variable that survives DDL and DML changes.

    Interesting. Certainly you can track the db version in the database; quite a few vendors do this, and it works well as long as you can keep control of the db and ensure no drift (unexpected changes) to the database. Otherwise your version scripts still need idempotent code to account for things.

    However I'm curious, if your script fails halfway through, you have a version at 7U, not 6 or 7 and now your upgrade script won't run. Do you manually reset to 6 and your script skips the sections that worked? Do things roll back?

  • I learnt the term idempotent a few years ago, even though I had ensured my code was re-runnable long before that.

    Personally, I don't trust version numbers in a database. They are a useful first glance when comparing systems, but unless security is locked down (something I have yet to see properly implemented), you cannot guarantee someone hasn't gone into the system and changed something manually following a deployment.

    As requested Steve, here's how I write most of my scripts:

    -- Always include the database that the script is expected to be run against
    USE MyDatabase;
    GO

    -- Inserting new data.
    -- I use MERGE as a preference so my solution is always consistent,
    -- rather than using IF NOT EXISTS ... INSERT sometimes and MERGE at other times
    MERGE dbo.MyTable t
    USING (
        VALUES (1, 'Value1')
             , (2, 'Value2')
    ) s (Id, Value) ON s.Id = t.Id
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (Id, Value)
        VALUES (s.Id, s.Value);
    GO

    -- Altering/creating stored procedures/functions/views/triggers
    IF (OBJECT_ID('dbo.MyProc') IS NULL)
        EXEC ('CREATE PROCEDURE dbo.MyProc AS SELECT 1;');
    GO

    ALTER PROCEDURE dbo.MyProc
    AS
    BEGIN
        SET NOCOUNT ON;
    END;
    GO

    -- New table: it depends on the situation; sometimes I drop the table if it exists, sometimes I leave it
    IF (OBJECT_ID('dbo.MyTable') IS NOT NULL)
        DROP TABLE dbo.MyTable;
    GO

    CREATE TABLE dbo.MyTable (
        Id INT NOT NULL CONSTRAINT PK_dbo_MyTable PRIMARY KEY CLUSTERED
        , Value VARCHAR(10) NOT NULL
    );
    GO

    -- OR...
    IF (OBJECT_ID('dbo.MyTable') IS NULL)
        CREATE TABLE dbo.MyTable (
            Id INT NOT NULL CONSTRAINT PK_dbo_MyTable PRIMARY KEY CLUSTERED
            , Value VARCHAR(10) NOT NULL
        );
    GO

    -- New column
    IF (NOT EXISTS (SELECT 1 FROM sys.columns WHERE name = 'NewColumn' AND [object_id] = OBJECT_ID('dbo.MyTable')))
        ALTER TABLE dbo.MyTable ADD NewColumn BIT NOT NULL CONSTRAINT DF_dbo_MyTable_NewColumn DEFAULT (1);
    GO

    -- I may drop the constraint if it is not required, always checking first
    IF EXISTS (SELECT 1 FROM sys.default_constraints WHERE name = 'DF_dbo_MyTable_NewColumn' AND parent_object_id = OBJECT_ID('dbo.MyTable'))
        ALTER TABLE dbo.MyTable DROP CONSTRAINT DF_dbo_MyTable_NewColumn;

    Writing deployment scripts like this is so important. The pain of one command in one script generating an error, and then picking through it all to find the bits you need to re-run (that command may have had a knock-on effect on following commands), is something I really hate.

  • Steve Jones - SSC Editor (5/6/2015)


    Gary Varga (5/6/2015)


    This also applies to application code. Not just scripts. I use this technique in stored procedures too.

    Examples? Want to share in an article, Gary?

    I'll add it to my list. It should only be a brief thing, so whilst I could question the value, you, Mr Jones, could question any excuse I could come up with for not writing it 😉

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Interesting. Certainly you can track the db version in the database; quite a few vendors do this, and it works well as long as you can keep control of the db and ensure no drift (unexpected changes) to the database. Otherwise your version scripts still need idempotent code to account for things.

    Correct. You must have the discipline to deploy the next version without drift. If you experience drift, you should integrate it back into your code base before deploying the next version. If tracking a version is not feasible, for example in a chaotic environment where anybody can make changes whenever they like, then you must pick something else to verify. For example, you could check that the first change performed by your script has not already been applied.
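    As a sketch of that fallback in T-SQL (the table and column names here are invented for illustration), the guard might check whether the first change the script makes is already in place and stop immediately if it is:

    -- Hypothetical guard: abort if the first change in this script
    -- (adding dbo.Orders.IsArchived) has already been applied.
    IF EXISTS (SELECT 1 FROM sys.columns
               WHERE [object_id] = OBJECT_ID('dbo.Orders')
                 AND name = 'IsArchived')
    BEGIN
        RAISERROR('Script appears to have run already; aborting.', 16, 1);
        SET NOEXEC ON; -- stop executing the remaining batches
    END;
    GO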

    However I'm curious, if your script fails halfway through, you have a version at 7U, not 6 or 7 and now your upgrade script won't run. Do you manually reset to 6 and your script skips the sections that worked? Do things roll back?

    So it's assumed that the deployment process has gone through the necessary quality control gates, meaning this will rarely happen. If it does happen, then the database would be in an unexpected state, which would require manual intervention to correct and to resume the script from the point of failure. Alternatively, a rollback to before the deployment could be performed via Flashback (Oracle) / Snapshot (SQL Server) or a restore. Admittedly, it's even rarer to perform a flashback or restore.

    The key to a successful deployment despite a failure is to write idempotent scripts when practical, to save on headache time in the event of failure. Also, multiple idempotent scripts are not necessarily idempotent when performed together. This integration side effect is even more difficult to spot when multiple developers are involved.
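    A contrived T-SQL illustration of that integration side effect (the object names are invented): each script below is idempotent on its own, but the pair is not.

    -- Script A: archive the old table if it exists (idempotent alone).
    IF OBJECT_ID('dbo.Prices') IS NOT NULL
        EXEC sp_rename 'dbo.Prices', 'Prices_Old';
    GO
    -- Script B: create the replacement table if missing (idempotent alone).
    IF OBJECT_ID('dbo.Prices') IS NULL
        CREATE TABLE dbo.Prices (Id INT NOT NULL PRIMARY KEY, Amount DECIMAL(10,2) NOT NULL);
    GO
    -- Run A then B a second time: A now tries to archive the empty table B just
    -- created. At best sp_rename fails because Prices_Old already exists; at worst
    -- (if the archive was cleaned up) the real archive is replaced by an empty table.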

  • 'Flashback'. What is that?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil Parkin (5/7/2015)


    'Flashback'. What is that?

    Sorry, Oracle terminology. In SQL Server this would be a Snapshot. Regardless, use technology that allows you to roll back to the point in time before deployment without the need for a restore operation. I've edited my previous post for clarification.

  • Bradley Deem (5/7/2015)


    Phil Parkin (5/7/2015)


    'Flashback'. What is that?

    Sorry, Oracle terminology. In SQL Server this would be a Snapshot. Regardless, use technology that allows you to roll back to the point in time before deployment without the need for a restore operation. I've edited my previous post for clarification.

    Has anyone here actually used Database Snapshots to roll back a deployment? Will it roll back only the DML and DDL modifications made by your session, or is it a point-in-time rollback for the entire database?

    Oracle's Flashback feature set goes well beyond SQL Server snapshots. For example, it has some interesting SELECT statement extensions for querying previous versions of table rows.

    -- The following query retrieves the state of the employee record for JOHN at 9:30 AM, April 4, 2003:
    SELECT * FROM employee AS OF TIMESTAMP
        TO_TIMESTAMP('2003-04-04 09:30:00', 'YYYY-MM-DD HH:MI:SS')
    WHERE name = 'JOHN';

    -- This insert then restores John's information to the employee table:
    INSERT INTO employee
        (SELECT * FROM employee AS OF TIMESTAMP
            TO_TIMESTAMP('2003-04-04 09:30:00', 'YYYY-MM-DD HH:MI:SS')
         WHERE name = 'JOHN');

    http://docs.oracle.com/cd/B13789_01/appdev.101/b10795/adfns_fl.htm

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (5/8/2015)


    Bradley Deem (5/7/2015)


    Phil Parkin (5/7/2015)


    'Flashback'. What is that?

    Sorry, Oracle terminology. In SQL Server this would be Snapshot. Regardless, use technology to allow you to rollback to the point in time before deployment without the need of a restore operation. I've edited my previous post for clarification.

    Has anyone here actually used Database Snapshots to roll back a deployment? Will it roll back only the DML and DDL modifications made by your session, or is it a point-in-time rollback for the entire database?

    Oracle's Flashback feature set goes well beyond SQL Server snapshots. For example, it has some interesting SELECT statement extensions for querying previous versions of table rows.

    -- The following query retrieves the state of the employee record for JOHN at 9:30 AM, April 4, 2003:
    SELECT * FROM employee AS OF TIMESTAMP
        TO_TIMESTAMP('2003-04-04 09:30:00', 'YYYY-MM-DD HH:MI:SS')
    WHERE name = 'JOHN';

    -- This insert then restores John's information to the employee table:
    INSERT INTO employee
        (SELECT * FROM employee AS OF TIMESTAMP
            TO_TIMESTAMP('2003-04-04 09:30:00', 'YYYY-MM-DD HH:MI:SS')
         WHERE name = 'JOHN');

    http://docs.oracle.com/cd/B13789_01/appdev.101/b10795/adfns_fl.htm

    We've used snapshots. It's like rolling back your database to the state it was in at the time the snapshot was taken (in effect, a point-in-time restore).
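    For reference, a minimal sketch of that in T-SQL (the database, logical file, and path names are placeholders):

    -- Take a snapshot immediately before the deployment.
    CREATE DATABASE MyDatabase_PreDeploy ON
        (NAME = MyDatabase_Data, FILENAME = 'C:\Snapshots\MyDatabase_PreDeploy.ss')
    AS SNAPSHOT OF MyDatabase;

    -- If the deployment fails, revert the whole database to that point in time.
    -- Note this discards ALL changes made since the snapshot, not just the
    -- deployment's, and it requires that this be the only snapshot on the database.
    USE master;
    RESTORE DATABASE MyDatabase FROM DATABASE_SNAPSHOT = 'MyDatabase_PreDeploy';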

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Maybe this article https://www.simple-talk.com/sql/database-administration/using-migration-scripts-in-database-deployments/ will be of use. It formed the basis for our scripts.

    We use one script to update the data models of two versions of the same application (though we split those scripts every quarter).

    The versions differ quite a bit, so each code fragment basically starts with a version check. We store the version in a global temp table, which is dropped at the end of the script.

    SET NOCOUNT ON;

    DECLARE @Version TINYINT = 0;
    /* Specify the correct version... */ /* This has to be changed before deployment */

    IF COALESCE(@Version, 0) = 0
    BEGIN
        PRINT 'Unspecified version, process is terminated.';
        SET NOEXEC ON; /* ! */
        RETURN;
    END;

    IF OBJECT_ID('tempdb..##Version') IS NOT NULL
    BEGIN
        DROP TABLE ##Version;
    END;

    CREATE TABLE ##Version (version TINYINT);
    INSERT INTO ##Version (version) SELECT @Version;

    PRINT 'Selected version: ' + CAST(@Version AS VARCHAR);
    GO

    Because of the combination of DDL and DML, we had to use dynamic SQL, and we use batches to separate modifications.

    One made up example could be:

    /* [<Version>] [<Date>] [<Change request info>] */
    DECLARE @StatementExecuted BIT = 0;

    IF EXISTS (SELECT 'Version' FROM ##Version WHERE version = 2) /* only for a certain version */
       AND NOT EXISTS (SELECT 'CR' FROM information_schema.columns c
                       WHERE c.table_name = 'OrderDetail'
                         AND c.column_name = 'IsConfirmed') /* do we still have to create the column? */
       AND EXISTS (SELECT 'CR' FROM information_schema.tables t
                   WHERE t.table_name = 'OrderDetail') /* and does the table exist? */
    BEGIN
        PRINT 'Modifying TABLE OrderDetail could take some time...';
        BEGIN TRY
            ALTER TABLE OrderDetail ADD IsConfirmed BIT NOT NULL CONSTRAINT df_OrderDetail_IsConfirmed DEFAULT 0;
            PRINT 'Modifying TABLE OrderDetail succeeded';
            SELECT @StatementExecuted = 1; /* Alter historical data */
        END TRY
        BEGIN CATCH
            PRINT 'Modifying TABLE OrderDetail: ' + ERROR_MESSAGE();
        END CATCH;
    END;

    IF @StatementExecuted = 1 /* so this will not execute when the script runs a second time */
    BEGIN
        BEGIN TRY
            EXEC sp_executesql N'
                UPDATE OrderDetail SET
                    OrderDetail.IsConfirmed = 1
                FROM OrderHeader
                JOIN OrderDetail ON OrderHeader.OrderHeader_id = OrderDetail.OrderHeader_id
                WHERE OrderHeader.IsInvoiced = 1;';
            PRINT 'Modifying data OrderDetail succeeded';
        END TRY
        BEGIN CATCH
            PRINT 'Modifying data OrderDetail: ' + ERROR_MESSAGE();
        END CATCH;
    END;
    GO

    So far this way of scripting has served our needs. Most of the changes are trivial.

    Scripting a UDTT modification can be challenging. 🙂


    Dutch Anti-RBAR League

  • I hadn't heard the term, but I have been doing this for some time as well. It's saved my biscuits MULTIPLE times, because a lot of operational work, even inside the apps, needs changes "to go live on x date" - we DO that. And we don't muck about with it. We make the change, we write the new stuff, we put it in, and it "goes live" on its own. The first time I did it, some heads spun, but that's the beauty of the design. Inversely, things can be strategically deprecated in the same way - which made some heads spin THIS year with the elegant simplicity. Hilarious!

  • If constraints are added, as many as possible, the data will be safer than without constraints. If there is error handling, either in external code or in SQL procedures, error messages can be intercepted and sent back to the caller, wherever it is. With constraints in place, developers quickly learn what works and what does not.

    There are two ways of protecting code from common constraint violations. When we worry about duplicates (with constraints in place), we can do one of two things: 1) check whether the data we are passing violates, say, uniqueness, or 2) check nothing, and let the constraints and error code work together to determine the problem when it happens, then display a message to the users and let them decide what to do.

    Option 1 is OK in theory, for occasional inserts. It is not practical when 50 users are scanning barcodes on the shipping/receiving line. We may have hundreds of inserts at the same time, by many users, and checking each entry for constraint violations before inserting takes too much time; the system appears blocked most of the time, or at least very often.

    In most situations, constraint problems arise only occasionally, meaning most of the time all is well. That is when option 2 comes into play. Boldly assume that all will be well and do not check before the insert. Most of the time it is OK; keep churning. However, once a constraint violation is attempted, the system must stop and display the message to the user, who decides what to do. For duplicates, we may have a real duplicate: an item with the same barcode was scanned hours ago and now appears again. That situation requires a specific approach, looking for the first instance. It could be that the first instance was a bad scan, in which case it has to be dealt with. Often, though, there is a keying mistake at the moment of entry (barcode scanners are super-fast key entries). Scanners are not perfect; once in a few thousand reads the data is misread. With manual key entry we do not need to wait a few hours; it happens much faster. If the code which handles insertion (data entry) has error handling, we simply let the errors happen, and the error handler determines which constraint was violated (or rather, which violation was attempted). When the message is shown to the user, they will know what is going on and how to quickly remedy the situation.
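    The "let the error happen and interpret it" approach can be sketched in T-SQL like this (the table, variable, and messages are invented for illustration; THROW assumes SQL Server 2012 or later):

    BEGIN TRY
        -- Boldly insert; the unique constraint on Barcode is the real gatekeeper.
        INSERT INTO dbo.ScanLog (Barcode, ScannedAt)
        VALUES (@Barcode, SYSDATETIME());
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() IN (2601, 2627) -- duplicate key via unique index/constraint
            PRINT 'Duplicate barcode ' + @Barcode + '; find the earlier scan and decide what to do.';
        ELSE
            THROW; -- anything else is unexpected, so surface it
    END CATCH;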

    With error handling and constraints, we can expect two outcomes: 1) all is well, or 2) a constraint fires and the suspicious entry is prevented. The problem is that number 2 has many faces and appearances. In other words, the outcome of performing some scripted operation is seldom binary. There is always a chance that all is well; however, when all is not well, the task of solving the problem is open-ended.

    It follows that there cannot be a solution that always leads to a clear resolution. Most of the time a solution and correction can be found, but in odd cases a new problem will pop up. The only thing that can be done is to keep track of issues and handle the known ones when they happen. For the odd, new issues: investigate, and somehow include them in the list of things that can go wrong. We may need a new constraint, a new code module to handle the situation, or some way of preventing bad data from reaching the point of entry, which is beyond programming and data modeling.

    All the solutions given in the responses are valid, if only for particular situations. Writing code (error-handling code included) and designing databases are kinds of programming an AI system, and AI is not smart in specific, unexpected, new situations most of the time. There is no unique solution for the problem of re-running scripts, but we can make the system robust enough to survive the expected and the unexpected: exceptions. Then we, not the system, must be able to solve unexpected new problems and, with some luck, include the solutions in our tool kit, or even automate them by updating the database design and/or code.

    Zidar's Theorem: The best code is no code at all...

