How to go abouit ensuring a GUID is unique

  • I am new to SSIS so any ideas directing me in the correct direction is appreciated. I have a large flat file that I am normalizing into several files. I am creating a GUID to be used as the primary key value in say "Table A". I have other junction tables to fill which I use the GUID and the PK from another table. I'm using lookups to get these PK fields from the reference tables and unioning them with the GUID created so when I import this some 17 million records I won't have to use any joins to update the junction tables which would take entirely too long to run. I can just insert Table A first then import into the junction tables afterwards to avoid any FK contraint violations. Am I going about this in the best possible way? I know sooner or later I could end up creating a GUID that is already in the database and I won't be able to insert it. Is that just one of the breaks where I'll have to redirect records that don't import and rerun them to create other GUID IDs?

  • well guids are supposed to be unique; although the algorythm got a little weeker when they disconnected it from the MAC address of a NIC card for the seed, it's still good.

    I think you are worried about something that is not going to happen.

    for fun, i created a table with 10 million rows, just to see if SQL Server would create a duplicate.

    i ran it, checked for dupes, and then did the insert again, for 20 million rows in two passes.

    you said you are creating your own GUIDS, I'm using SQL server for that.

    no dupes. none. in 20 million rows. I could repeat till I'm out of disk space, but I don't foresee any dupes occurring.

    here's my code :

    CREATE TABLE #MyHead

    ( SomeId int identity(1,1) primary key,

    SomeValue varchar(36)

    )

    --this makes tempdb grow like crazy: 4.7 gig,

    --took 6:36 seconds on a single proc 1g ram 2.6GHtz machine

    INSERT INTO #MyHead

    SELECT TOP 10000000 --ten million

    NEWID()

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --any dupes? ugly with no index but here goes

    select SomeValue

    from #MyHead

    group by SomeValue

    having count(*) > 1

    --3:10 seconds no dupes, temp expanded to 5.25 gig

    --repeating to get 10Mill more rows

    INSERT INTO #MyHead

    SELECT TOP 10000000 --ten million

    NEWID()

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --took 4:03 seconds to add the second 10 mill

    --tempdb is now at 10.1 gig

    select SomeValue

    from #MyHead

    group by SomeValue

    having count(*) > 1

    3:41 seconds: no dupes.

    ==

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply