Sort already comma separated list

Jeff Moden SSC Guru Points: 1004356 · Answer 1

ScottPletcher - Friday, February 8, 2019 3:47 PM
But none of that is relevant to me. They're not "attributes" to me since they have no business value to me. Besides, VINs actually have different formats depending on year (before some year -- I forget which one -- is a different format).

I didn't say that it would be relevant to anyone. I didn't say that they weren't useful and proper as a PK. I said that they don't actually follow the rules, despite the fact that I'd use them without hesitation.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

x SSC-Insane Points: 23660 · Answer 2

paul s-306273 - Tuesday, February 5, 2019 1:21 AM
Just an observation - Jeff and Lynn frequently respond to Joe's posts.
Anybody else think they 'follow' Joe?

I admit to searching out Joe's posts. I'm sure I do stuff he wouldn't approve of, heck I do stuff I don't approve of, all in the interest of getting paid.

Still, it's important for me anyways to at least consider what he says. There have been occasions when I've been able to learn important, foundational lessons from him that go to the very heart of what I want to do in this business. This sort of information is pretty important to me. Maybe I'm being selfish, but whatever.

Jeff Moden SSC Guru Points: 1004356 · Answer 3

paul s-306273 - Tuesday, February 5, 2019 1:21 AM
Just an observation - Jeff and Lynn frequently respond to Joe's posts.
Anybody else think they 'follow' Joe?

I do agree with Patrick that he does come up with a gem every once in a while and he does have some incredible history lessons but I don't follow Joe nor do I search his posts out.

--Jeff Moden


jcelko212 32090 SSCrazy Eights Points: 9303 · Answer 4

ScottPletcher - Friday, February 8, 2019 3:47 PM

But none of that is relevant to me. They're not "attributes" to me since they have no business value to me. Besides, VINs actually have different formats depending on year (before some year -- I forget which one -- is a different format).

The VIN had to be expanded because when it was first designed, nobody predicted just how many cars there would be on earth. I usually stress that a key should have:

Validation = a dependable method of determining if the encoding is correct. This could be a regular expression, check digits, a range rule, etc.

Verification = okay, the code's correct, but does it actually identify a real entity in the model?

For example, there are five digit potential ZIP Codes that were really never issued. It's valid but it's not real.
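The validation/verification split can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the small set of issued codes is a hypothetical stand-in for a lookup against a real postal data source.

```python
import re

# Validation: does the string have the right shape for a five-digit ZIP Code?
ZIP_PATTERN = re.compile(r"[0-9]{5}")

# Verification needs a trusted source. This tiny set is a hypothetical
# stand-in for a lookup against real postal data.
ISSUED_ZIP_CODES = {"90210", "10001", "60601"}


def is_valid_zip(code: str) -> bool:
    """True if the encoding is well formed (five ASCII digits)."""
    return ZIP_PATTERN.fullmatch(code) is not None


def is_verified_zip(code: str) -> bool:
    """True if the code is well formed AND known to the trusted source."""
    return is_valid_zip(code) and code in ISSUED_ZIP_CODES
```

A code such as "00000" passes `is_valid_zip` but fails `is_verified_zip`: it is valid, but it is not real.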

Some people like to add another attribute: stability. At one extreme, the identifier can be incredibly stable because it's immutable. For example, a (longitude, latitude) pair might have something different at that location from time to time, but the location itself never changes. At the other extreme, I had a friend who worked for a New York brokerage firm that required him to use a dongle on his computer. This piece of hardware and software constantly changed his passwords and identifiers every few minutes for security. Most of us work with things somewhere in between.

If you had worked in retail, you would've gone from the 10 digit UPC codes to 13 digits to the current GTIN standards. If you're in healthcare, you have the ICD code upgrades. Etc.

The real trick for good design is that when an encoding changes, you have a migration path. It's really nice if that migration path can be done with an algorithm (as in the book trade's migration from ISBN-10 to ISBN-13).
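As an illustration of an algorithmic migration path, here is a minimal Python sketch of the ISBN-10 to ISBN-13 conversion: prefix the first nine digits with 978 and recompute the check digit using alternating weights of 1 and 3. The function name is an assumption of this sketch, and it does not re-validate the old ISBN-10 check character.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to its ISBN-13 form (978 prefix, new check digit)."""
    text = isbn10.replace("-", "").replace(" ", "")
    first_nine = text[:9]
    if len(text) != 10 or any(ch not in "0123456789" for ch in first_nine):
        raise ValueError(f"not an ISBN-10: {isbn10!r}")

    # The old check character (position 10) is dropped, not converted.
    body = "978" + first_nine

    # ISBN-13 weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(body))
    check_digit = (10 - total % 10) % 10
    return body + str(check_digit)
```

For example, `isbn10_to_isbn13("0-306-40615-2")` returns `"9780306406157"`.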

Please post DDL and follow ANSI/ISO standards when asking for help.

Jeff Moden SSC Guru Points: 1004356 · Answer 5

jcelko212 32090 - Saturday, February 9, 2019 4:40 AM


Heh... yes... I get all that. But we've gotten way off the subject, and you've still not identified something reasonable to use as a PK for a Customer table other than the verboten Tax ID and the not 100% common DUNS.

--Jeff Moden


GaryV Hall of Fame Points: 3307 · Answer 6

A VIN could be repeated after 30 years. That's because the Year character repeats on a 30-year cycle, using the digits 1-9 and 21 alphabetic characters (excluding those that look like numbers, I, O, etc.).

While it is unlikely that the first 8 characters representing values would exactly match 30 years apart when the last 8 characters repeat, it is a possibility. One that neither the government nor the auto companies worry about.
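The 30-year cycle can be sketched in Python. This assumes the North American model-year sequence as I understand it: 21 letters (A through Y, skipping I, O, Q and U) followed by the digits 1 through 9, with A standing for 1980. The names are illustrative only.

```python
# 21 letters followed by 9 digits: 30 codes in total.
YEAR_CODES = "ABCDEFGHJKLMNPRSTVWXY123456789"
BASE_YEAR = 1980  # assumption of this sketch: 'A' was first used for 1980


def model_year_code(year: int) -> str:
    """Return the VIN model-year character (position 10) for a model year."""
    if year < BASE_YEAR:
        raise ValueError("this sketch only covers model years from 1980 on")
    return YEAR_CODES[(year - BASE_YEAR) % len(YEAR_CODES)]
```

Because of the modulo, `model_year_code(1989)` and `model_year_code(2019)` both return `"K"`, which is exactly the repeat described above.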

Jeff Moden SSC Guru Points: 1004356 · Answer 7

gvoshol 73146 - Monday, February 11, 2019 6:01 AM
A VIN could be repeated after 30 years. That's because the Year character repeats on a 30-year cycle, using the digits 1-9 and 21 alphabetic characters (excluding those that look like numbers, I, O, etc.).
While it is unlikely that the first 8 characters representing values would exactly match 30 years apart when the last 8 characters repeat, it is a possibility. One that neither the government nor the auto companies worry about.

And even if it didn't have a 30 year cycle, it still wouldn't do squat for a customer table. 😉

--Jeff Moden


Alan Burstein SSC Guru Points: 61141 · Answer 8

jcelko212 32090 - Friday, February 8, 2019 2:04 PM
>> But to create a primary key with an int or bigint all you need to do is define the table like this:
CREATE TABLE My_Table (my_table_id INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED, col1 NVARCHAR(20) NOT NULL, col2 NVARCHAR(20) NOT NULL );
Then you can forget about it, no need to invent some text column which needs a complicated algorithm to work out the value. <<
Your solution of using an IDENTITY property (not a column!) has all kinds of problems. It is a table property that belongs to only one table, and only one implementation of one particular SQL product on only this machine. It has nothing whatsoever to do with the entities being modeled. It's a physical locator that Sybase implemented because their first products were on UNIX based machines. The file system was based on a magnetic tape which was mapped over to a sequential file on the early disk drives.
Think about a VIN number. I can validate it with a regular expression. I can verify it by going to my local DMV, Carmax, the manufacturer, and any number of other trusted sources. That's why my auto insurance company uses it. This is why the same car can be referenced in so many different databases. If worse comes to worst, I can go over and physically take the VIN number off the dashboard manually.
What you're doing is using a physical locator and building pointer chains. Essentially, your IDENTITY is like the parking space number in a single garage building. Oh, and since it is a table-level property, it cannot be a valid column, and it is not an attribute of whatever the table is modeling. Therefore, by definition, it can never be part of a key.
>> You also don't usually need to display the primary key anywhere, you just use it for joins and uniqueness. <<
Actually, I found it to be quite the opposite. You're not displaying your locator because it has nothing to do with the data model. I'm constantly displaying the keys because keys are part of the data. A very important attribute of each entity. Can you really imagine not having a VIN in an automobile database, or failing to show it?
>> If you had a database with 400 tables in it each one having a text column for the primary key it would become a total nightmare to maintain. <<
First of all, after 35 years of doing databases, I have seldom had any schema with that many tables. It's usually a sign of a bad design (lots of attribute splitting, mimicking individual tape files with tables, etc.). And no, it isn't.
>> There also is some maths done on these columns: 1 is added to it to get the next value.<<
The next value of what? It's an identifier, which is on a nominal scale by definition. Please tell me you don't believe that the square root of your credit card number has some meaning. Oh, wait a minute! You're still thinking of pointers and sequential file allocations on magnetic tape, not RDBMS.

Jeff Moden - Friday, February 8, 2019 2:09 PM
Heh... VIN numbers are actually a violation of first normal form.

Alan.B - Friday, February 8, 2019 2:49 PM
Why so? If I were to use a VIN number as an identifier because it's unique, is it automatically a violation of 1NF? Or does it only become a violation of 1NF when I treat it as a concatenated series of values?
I would not use a VIN for this, I'm asking out of curiosity. This is something I haven't thought about much.

Jeff Moden - Friday, February 8, 2019 3:27 PM
First, and to be clear, there is no question that VINs make a great PK, for all the reasons and rules that a PK must satisfy, and I wouldn't hesitate to use one. But, technically, they violate the rules of columns, which must contain an attribute and should only contain one type of attribute. VINs actually contain many attributes, including what Joe Celko refers to as a "God" number (sequence). So, yes... it contains many concatenated attributes.

Picture source: https://www.google.com/imgres?imgurl=https://cfx-wp-images.s3.amazonaws.com/2017/11/VIN-Decode-1.jpg&imgrefurl=https://www.carfax.com/blog/vin-decoding&h=334&w=816&tbnid=a9PlZkci-y8zuM:&q=parts+of+a+vin+number&tbnh=87&tbnw=214&usg=AI4_-kTTE2zZyBld7XuFBai1FkvW62G0NA&vet=12ahUKEwizyL_Klq3gAhXtzVkKHfi1BdwQ9QEwAHoECAAQBg..i&docid=6QqOyLnYddKqyM&sa=X&ved=2ahUKEwizyL_Klq3gAhXtzVkKHfi1BdwQ9QEwAHoECAAQBg

Thanks Jeff! This made more sense when I read the comments going back to the OP. Now I know what "God Numbers" are too - I was always curious about what that was.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

jcelko212 32090 SSCrazy Eights Points: 9303 · Answer 9

Jeff Moden - Friday, February 8, 2019 2:09 PM
Heh... VIN numbers are actually a violation of first normal form.

Alan.B - Friday, February 8, 2019 2:49 PM
Why so? If I were to use a VIN number as an identifier because it's unique, is it automatically a violation of 1NF? Or does it only become a violation of 1NF when I treat it as a concatenated series of values?

I would not use a VIN for this, I'm asking out of curiosity. This is something I haven't thought about much.

It is a scalar value because it cannot be decomposed in a meaningful manner. It identifies one and only one entity.


Lynn Pettis SSC Guru Points: 442467 · Answer 10

jcelko212 32090 - Tuesday, February 12, 2019 1:55 PM

It is a scalar value because it cannot be decomposed in a meaningful manner. It identifies one and only one entity.

Sorry but the VIN is an intelligent value that provides specific information regarding a vehicle. For North America there is even an algorithm that you can use to validate the VIN.
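For readers who have not seen it, the North American check is a weighted sum over the 17 characters, taken modulo 11, with the result stored in position 9. A minimal Python sketch follows; the transliteration table and weights are the commonly published ones as I understand them, and the function names are illustrative rather than from any library.

```python
# Letter-to-number transliteration; I, O and Q never appear in a VIN.
TRANSLITERATION = {
    **{str(d): d for d in range(10)},
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
}

# One weight per position; position 9 (the check digit itself) has weight 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def vin_check_digit(vin: str) -> str:
    """Compute the expected check character for a 17-character VIN."""
    vin = vin.upper()
    if len(vin) != 17 or any(ch not in TRANSLITERATION for ch in vin):
        raise ValueError(f"not a well-formed VIN: {vin!r}")
    remainder = sum(TRANSLITERATION[ch] * w for ch, w in zip(vin, WEIGHTS)) % 11
    # A remainder of 10 is written as the letter X.
    return "X" if remainder == 10 else str(remainder)


def is_valid_vin(vin: str) -> bool:
    """True if position 9 matches the computed check character."""
    try:
        return vin_check_digit(vin) == vin.upper()[8]
    except ValueError:
        return False
```

Passing this check only validates the encoding. It says nothing about whether the vehicle exists, which is the verification step discussed earlier in the thread.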

jcelko212 32090 SSCrazy Eights Points: 9303 · Answer 11

ScottPletcher - Friday, February 8, 2019 2:44 PM

jcelko212 32090 - Friday, February 8, 2019 2:04 PM

[IDENTITY] is a physical locator

>> No, it's not. "1" is meaningless as any type of physical locator. <<

So you also feel that the ticket the valet in the parking garage gave you is totally unrelated to where your car is? I disagree.

>> Just because physical inserts are used as an easy mechanism to trigger a number increment doesn't mean that the number is necessarily a "physical locator". Why can't you accept that? <<

Have you ever had a course in basic data modeling? We try to separate the logical data model from the physical implementation(s). Let's take a look at automobiles, since we're working on VINs. We know that the VIN is an identifier for vehicles. We know we can physically determine what it is by going out and looking on our automobile, motorcycle or whatever. We do know that the VIN will be the same everywhere we use it: our auto insurance, title and registration, dealer warranty, etc.

Now let's consider the IDENTITY property for a table. I have no way of finding it before I put the entity into my schema. That's because it's not a property or attribute of any entity. It belongs to the physical storage, on only one machine. It is created at insertion! It is based on the disk, and a file system with sequential records. I would have the same objections if you were using a hashing algorithm and exposed it.

>> In general, we just want a simple way to get a guaranteed unique, never-changing internal key value. Sure, the natural key, say an SSN, will still be stored, but I'll never make it the actual key. That would violate security concerns, and besides which a SSN can change, that fact alone rendering it an invalid key choice for me. SSN can be verified externally, as you insist for your keys, but that's not enough for me to prefer it as a key. <<

When you talk about internals, you're talking about physical storage and using things to get physical locations inside that storage. You're not talking about a valid data model. Also, I believe that keys can change. In fact, I've actually watched it happen! I was in the book business many decades ago, and I watched the ISBN-10 be replaced by the current ISBN-13. The same thing is happening with GTIN in retail (also, as of this year, a GTIN can no longer be reused, which gives you your never-changing value).

>> As to VINs, they're too long and complex to use internally to IDENTITY vehicles. Again, I'll absolutely store the VIN, and it's certainly a candidate key, but it's just impractical for most common business uses of it. I canThat said, a car dealership, for example, might have some tables keyed on the actual VIN itself, but most companies won't need that. <<

Actually, any company that deals with automobiles or other vehicles not only has to have it, but finds no problem using it. You are basically trying to go back to pointer chains! Now, what routine do you have to make sure that your IDENTITY value always matches the particular car? Oh, and how do you guarantee that the other databases in which your car appears also have exactly the same IDENTITY value? If it really was a key and identified your particular automobile, then it would be the same everywhere in the universe.
Isn't that the definition of a key?
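The narrow claim about cross-database agreement can be seen in a small Python sketch. SQLite's AUTOINCREMENT stands in for SQL Server's IDENTITY here, which is an assumption of the sketch, and the table and function names are hypothetical. It only shows that generated values depend on insertion order; it does not settle the wider surrogate-versus-natural key argument in this thread.

```python
import sqlite3


def load_vehicles(vins):
    """Insert VINs into a fresh in-memory database; return {vin: generated id}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Vehicles ("
        " vehicle_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " vin TEXT NOT NULL UNIQUE)"
    )
    conn.executemany("INSERT INTO Vehicles (vin) VALUES (?)", [(v,) for v in vins])
    mapping = dict(conn.execute("SELECT vin, vehicle_id FROM Vehicles"))
    conn.close()
    return mapping


# Two independent databases receive the same two cars in a different order.
dealer = load_vehicles(["1M8GDM9AXKP042788", "11111111111111111"])
insurer = load_vehicles(["11111111111111111", "1M8GDM9AXKP042788"])

# The VIN is the same in both places; the generated number is not.
same_vin_everywhere = set(dealer) == set(insurer)
same_id_everywhere = dealer == insurer
```

Here `same_vin_everywhere` is true and `same_id_everywhere` is false, so any join between the two systems has to go through the VIN.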
>> Neither does the square of an item_price have any meaning, does it, but it clearly must be numeric. <<

So you feel that the expression (item_price * inventory_quantity) is meaningless? Numerics are for computations, and the valid computations depend on what kind of scale they represent. Also, I guess you've never had a system where you had to do reorder points, depreciation, and other fancy computations; they do use squares and square roots.

>> Again, the practical overhead of using char for long digit-only values, such as credit card numbers, are just too severe to be ignored. The overhead of the massive new numbers of CHECK constraints alone could significantly damage system performance. <<

Credit card numbers are a bad example for you. Most of them are 16 digits, broken into four groups of four. When you store it all as one god-awful number, you then have to break it apart to validate each of the four sets of four digits. Oh, and of course, if you ever want to display it, you have to translate it from internal binary into a string anyway. I hate to tell you, but in the 21st century, storage is not the consideration it was when you were writing for punch card systems. However, accurate data is.
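A common sanity check for card numbers is the Luhn algorithm, which walks the digits by position across the whole number and is therefore most naturally written against a string. A minimal Python sketch follows; the function name is illustrative. Passing Luhn only validates the encoding and does not verify that the account exists.

```python
def luhn_is_valid(card_number: str) -> bool:
    """Luhn check over a card number given as text (spaces/hyphens allowed)."""
    digits = card_number.replace(" ", "").replace("-", "")
    if not digits or any(ch not in "0123456789" for ch in digits):
        return False

    total = 0
    # Work from the rightmost digit; double every second digit.
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as adding the two digits of the product
        total += value
    return total % 10 == 0
```

For example, the widely used test number "4111 1111 1111 1111" passes, while changing its last digit to 2 fails.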
>> Customer numbers can also have check digits, but they're still most practically stored as a number, not a string. <<

How many check digit routines have you written? The check digit is part of the identifier, usually in the last position (the rightmost). This is because the string was originally scanned from left to right in old machinery. The algorithms depend not on the value of the digits, but on the weight assigned to their position within the string and the string has to be of a known length. The check digit is stored as a character, not as a number! I don't know if you can get a copy of it, but there was a good book on check digit algorithms from the Mathematics Centre in the Netherlands.
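One concrete case of a position-weighted check character is ISBN-10, sketched below in Python. The weights run from 10 down to 2 by position, the string has a fixed length, and the check character can be the letter X, so it cannot be stored as a numeric digit. The function names are illustrative only.

```python
def isbn10_check_character(first_nine: str) -> str:
    """Compute the ISBN-10 check character from the first nine digits."""
    if len(first_nine) != 9 or any(ch not in "0123456789" for ch in first_nine):
        raise ValueError(f"expected nine digits, got {first_nine!r}")
    # The weight depends on position: 10 for the first digit, down to 2.
    total = sum(int(ch) * weight for ch, weight in zip(first_nine, range(10, 1, -1)))
    check_value = (11 - total % 11) % 11
    # A check value of 10 is written as the character X.
    return "X" if check_value == 10 else str(check_value)


def is_valid_isbn10(isbn10: str) -> bool:
    """True if the last character matches the computed check character."""
    text = isbn10.replace("-", "").upper()
    if len(text) != 10:
        return False
    try:
        return isbn10_check_character(text[:9]) == text[9]
    except ValueError:
        return False
```

For example, `isbn10_check_character("080442957")` returns `"X"`, so the full identifier is "080442957X".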


ScottPletcher SSC Guru Points: 101099 · Answer 12

jcelko212 32090 - Friday, February 15, 2019 2:56 PM

ScottPletcher - Friday, February 8, 2019 2:44 PM
jcelko212 32090 - Friday, February 8, 2019 2:04 PM