What datatype to use for primary key if the table is going to have a billion rows?

  • Hi guys,

    This is my first post here. But I will appreciate responses.

    I was wondering if which datatype would be best suited for creating primary key for a table that is expected to grow to billions of rows?

    thanks

     

    Nan Ku

  • thats depends on many other factors. Please let us know more about the table

     

     


    Kindest Regards,

    Amit Lohia

  • Use bigint:    bigID bigint IDENTITY(1,1) PRIMARY KEY

    Here is an example where the seed value starts at 1,000,000,000

    to show it works for large numbers:

    CREATE TABLE bigtable

    (

      bigID bigint IDENTITY(1000000000,1) PRIMARY KEY

    , myData varchar(20)

    )

    GO

    SET NOCOUNT ON

    INSERT bigTable (myData) VALUES ('AAA')

    INSERT bigTable (myData) VALUES ('BBB')

    INSERT bigTable (myData) VALUES ('CCC')

    INSERT bigTable (myData) VALUES ('DDD')

    INSERT bigTable (myData) VALUES ('EEE')

    SET NOCOUNT OFF

    SELECT *

      FROM bigTable

     

  • Thanks once again, Joe, for your brow-beating and demeaning response that was of no help!

    Add to that the insult you just gave so many on this forum by referring to us as kludges.  It is much appreciated!

     

     

    ----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011

  • Thanks once again, Joe, for your brow-beating and demeaning response that was of no help!

    Add to that the insult you just gave so many on this forum by referring to us as kludges. It is much appreciated!

    Before I post my opinion in support of Mr. Celko, look up the definition of "kludge' in the Jargon File. You apparently don't understand what it means.

    Now. Celko has a point: Is there a natural key that will uniquely identify this table? (Any relational data modeler would say "yes" there is, you just need to find it.)

    Second, is the table properly normalized? If not, then normalization might make the table much smaller. I am working from the assumption it is below.

    Third, if he feels that a surrogate key is necessary, why? What is missing? Is a composite key too unwieldy? (probably)

    We know that on most SQL Server systems, an integer datatype would probably suffice. Bigint is even better.

    Those're just three questions I can come up with using the above question as a reference.

    The original poster just asked a question without giving any specifics about the data to be represented.

    Most likely, he is asking: What datatypes can have billions of values and of those what is fastest?

    But he didn't and we are left wondering what he meant. Granted Celko went a different way with his assumptions of what is asked, but his point is still valid: If this gent is asking this question in this forum, he might be better served in the formation of this database by finding someone where he is to get answers. This is probably a much bigger issue than just what his one line question asks and is too big for him to handle alone (and still have the possiblility of solving the problem adequately).

  • Hold on...I think there's a difference between disagreeing with the content of someone's message and disputing the way they expressed it.

    Joe could have just as easily said "I can't help you without sufficient information, and without knowing what the candidate keys are, there is no way to answer your question."  Which would have been just as valid a way to answer the question without implying the person was too stupid to breathe.

    Perhaps when you are at the level of expertise of Mr. Celko is you can afford to be arrogant (I know more than you do, and you will never ever catch up), but frankly I don't know anyone who couldn't stand a bit of humility when dealing with others.

    That being said, to the orginal poster:

    We need a lot more information to begin to answer your question.  Even if you do decide to use a bigint as a key, you will need to identify at least one or more fields that truly identify each unique record.  Without opening the whole natural vs. surrogate key argument can of worms, The answer to this question should help you identify how many likely values or combinations of values you have.  After that you can start thinking through what data type to use.

    HTH,

    Larry

  • I understand where Joe is coming from. I even own a few of his books. However, the reality is that there is a full range of experience levels that visit this forum, and it's not practical (nor do most of have the time) to impart several years of computer science knowledge through a single forum thread.

    Rather than judge the question, I prefer to answer as directly as possible, even if the original question or premise is flawed.

     

  • Actually the whole question of surrogate and natural keys appears to be unanswerable to anyone's satisfaction, despite how much people quote the founder of relational theory.

     

    Which I suppose makes it the piece of Codd which passeth all understanding....

    <commence throwing rotten fruit now>

  • mkeast, I do understand "kludge" (or "kluge") and it basically says that we on this forum may present solutions but those solutions are not the best solutions.  They may work but they shouldn't be done that way.

    My original post was more at Mr. Celko's response and not his intellect.  I am the first to admit that he far more intelligent than me but after reading many of his posts (and listening to him at a conference), I hold the belief that if a solution does not meet to his standards, it is no good, and anyone offering a solution that does not meet with his approval is stupid/inferior/(choose your own adjective) and is guilty of promoting stupidity/inferiorority/(again, choose your own description).

    I also believe that his latest post to this thread just helps prove my point that many of his posts are of little, or no, value in presenting a solution.  Rather, he simply must stroke his ego by identifying the inferiority of everyone else's skills.

    I did not mean for this thread to take off in this manner but I just got fed up with what I was repeatedly seeing in many of these forums and on that particular morning, I just had to respond.

    This post, however, will be my last, concerning Mr. Celko.

    ----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply