DateTime Stamp data as unique identifier

  • I am getting a data feed from a sql database that is provided to me in access.

    It is 7500 rows of data with no unique identifer (I have asked and they do not have a unique identifer in the table). They do have a date added field, which they returned to me as a date time 7/7/2008 11:22:17 AM field that has over 1/2 the records non-unique time stamps due to a mass upload of data.

    I know SQL stores datetime to 00:00:000. What is the likelihood that I could get a unique time stamp if I requested that they return the full time stamp data?

    What is the best way to instruct them on how to extract the data so that the full timestamp field is represeted in excel?

    Getting them to add a unique identifer in the time frame needed is unlikely.

    Thank,

    Jennifer

  • jennifer_l_hogan (10/24/2008)


    I am getting a data feed from a sql database that is provided to me in access.

    It is 7500 rows of data with no unique identifer (I have asked and they do not have a unique identifer in the table). They do have a date added field, which they returned to me as a date time 7/7/2008 11:22:17 AM field that has over 1/2 the records non-unique time stamps due to a mass upload of data.

    I know SQL stores datetime to 00:00:000. What is the likelihood that I could get a unique time stamp if I requested that they return the full time stamp data?

    What is the best way to instruct them on how to extract the data so that the full timestamp field is represeted in excel?

    Getting them to add a unique identifer in the time frame needed is unlikely.

    Thank,

    Jennifer

    The time portion of a datetime data type has a resolution of 3 milliseconds. The values are rounded to 0, 3 and 7 milliseconds, so if you have 2 records with the exact same time stamp (up to the millisecond) or the difference between the time stamps might be less or equal to 4 milliseconds, you’ll have more then one record with the same time stamp (by the way I used the term time stamp and not timestamp, because timestamp is a different datatype that has nothing to do with date nor time).

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Looking at the data they provided we have 6000 of the 7000 records added with the same date time. They returned the data as dd/mm/yyyy hh:mm:ss AM formatting. I don't know how SQL handles mass uploads - do all records return the same time stamp or is it the actual time of the upload? Will probably still return some duplication if but fewer than 6000 duplicate records.

    I know this is a scratch in the dark, but I will need to evaluate changes of the data on a monthly basis and don't know what else could be used to create a unique identifier as all other fields are updateable.

  • jennifer_l_hogan (10/24/2008)


    Looking at the data they provided we have 6000 of the 7000 records added with the same date time. They returned the data as dd/mm/yyyy hh:mm:ss AM formatting. I don't know how SQL handles mass uploads - do all records return the same time stamp or is it the actual time of the upload? Will probably still return some duplication if but fewer than 6000 duplicate records.

    I know this is a scratch in the dark, but I will need to evaluate changes of the data on a monthly basis and don't know what else could be used to create a unique identifier as all other fields are updateable.

    You are confusing datetime data type and timestamp data type. When you use datetime data type, you need to set its value, and the value doesn’t have to be connected to the time that the row was inserted into the table (for example I can store my birthday in a datetime column no matter when I inserted the row). The timestamp on the other hand is a binary value that you have no control on its value. SQL Server sets and modifies the value of a timestamp column. If you have a timestamp column in a row, each time that someone updates the row, SQL Server also modifies the value of the timestamp column. The value of the timestamp is promised to be unique and despite its name it has nothing to do with time at all (as I wrote before it is a binary value)

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I really don't think that there is much that you can do here with this approach. You need to get the customer to tag them appropriately in the table.

    Otherwise, sort them in Excel and assign them ID's on that basis.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • It looks like I am back to the drawing board. Don't understand why anyone would create a data file with no way to identify the records and all updateable fields.

    I'll try to get them to create an identifier, but usually I am stuck with what I get and will have to figure out something to weed out changes that are relevant to my data extract and those that are not. Just a lot more programming time involved than a simple record comparison.

    Thanks for the insight.

    Jennifer

  • Another way to look at this would be:

    INSERT INTO dbo.Table (col1, datecolumn)

    SELECT 'my column 1'

    ,getdate()

    FROM dbo.TableWith10000Rows;

    In the above statement, what will be the value assigned for getdate()? You might assume that it would be different for all rows - and, you would be wrong. In the above, SQL Server should evaluate 'getdate()' one time and use that value for all rows.

    If SQL Server evaluates the call to getdate() for each row - then the question becomes: how many of the 100,000 rows would be inserted in 3-4 milliseconds? 10, 20, 100, 1000? And, is the function evaluated when the query selects the data (reads it?), or when it is filed (written?).

    And, if you use a default value for your datecolumn - and you use the default value for the data column, the same rules apply.

    And finally, move the call to getdate() outside of the select and put it in a variable and you will guarantee that the value does not change.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • They could expand the DateAdded field to char(17) and use

    Select replace(convert(varchar, getdate(),111),'/','') + replace(convert(varchar, getdate(),114),':','')

    to return 20081102115655013

    If you need unique fields and the .013 seconds does not suffice then use or add an identity column.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply