October 19, 2016 at 10:41 am
We have an ID that is 8 characters long with leading zeros if the ID is less than 8 characters long. Currently, it is stored as a VARCHAR(8), with the developer having to specifically add the leading zeros during insertion. I would like to change that MemberId field to an INT field, since letters are never used for this field, only numbers. The only issue is we still need to display the leading zeros for reports, websites, applications, etc.
My solution was to use a computed column on the INT field, using for the presentation the function FORMAT(MemberId, '00000000'). While this does work, I noticed an anomaly with the computed column's data type; it returns an NVARCHAR(4000), which is the max size for NVARCHAR.
I also added another computed column, this time using REPLICATE and LEN to get the desired result. This returned a data type of VARCHAR(8000), which is the max size for VARCHAR.
Here's the code I used to set up this quick example:
CREATE TABLE test.MemberTrial
(
MemberId INT,
ReplicatedMemberId AS REPLICATE('0', 8 - LEN(MemberID)) + CAST(MemberID AS VARCHAR(8)),
FormattedMemberId AS FORMAT(MemberId, '00000000'),
CastMemberId AS CAST(FORMAT(MemberId, '00000000') AS CHAR(8))
)
INSERT INTO test.MemberTrial (MemberId)
VALUES
(123),
(1123),
(11235),
(112358),
(11235813),
(112358132),
(9999999),
(321321),
(3),
(73)
SELECT
MemberId,
ReplicatedMemberId,
FormattedMemberId,
CastMemberId
FROM
test.MemberTrial
My questions are:
October 19, 2016 at 12:54 pm
A computed column is not stored, therefor there is no need to be concerned about the size. HOWEVER, SQL has to recompute the value every time that it is read.
If you make the computed column PERSISTED, then it is persisted to disk. You also have the added benefit of being able to index the value, if necessary/desired.
Both of these options will result in a column of type varchar(8)
ReplicatedMemberId AS RIGHT(REPLICATE('0', 8) + CAST(MemberID AS VARCHAR(8)), 8) PERSISTED,
ReplicatedMemberId2 AS RIGHT('00000000' + CAST(MemberID AS VARCHAR(8)), 8) PERSISTED,
October 19, 2016 at 1:37 pm
Exactly what DesNorton mentioned. I just wanted to add something about the performance.
The FORMAT function has been proven to be very slow. It's up to 40 times slower that previous formatting methods. What seems to be the most effective option is to add the zeros before casting.
CREATE TABLE MemberTrial
(
MemberId INT,
RightMemberId AS RIGHT(MemberId + 100000000, 8)
);
October 19, 2016 at 1:54 pm
DesNorton (10/19/2016)
A computed column is not stored, therefor there is no need to be concerned about the size. HOWEVER, SQL has to recompute the value every time that it is read.If you make the computed column PERSISTED, then it is persisted to disk. You also have the added benefit of being able to index the value, if necessary/desired.
Both of these options will result in a column of type varchar(8)
ReplicatedMemberId AS RIGHT(REPLICATE('0', 8) + CAST(MemberID AS VARCHAR(8)), 8) PERSISTED,
ReplicatedMemberId2 AS RIGHT('00000000' + CAST(MemberID AS VARCHAR(8)), 8) PERSISTED,
I realize that if you persist them they will go to disk. But wouldn't that also increase storage? If you're already storing the INT value (4 bytes), then persist the VARCHAR(8) value (10 bytes, correct? Eight for the characters, 2 for overhead?), you're basically storing the value twice, taking up 14 bytes as opposed to 4. Is that correct?
And wouldn't it be best just to store and index the original MemberId INT column and use the computed column for display only? Then you're indexing an INT instead of a VARCHAR. Isn't that better for speed and performance, too?
October 19, 2016 at 2:04 pm
CferMN (10/19/2016)
DesNorton (10/19/2016)
A computed column is not stored, therefor there is no need to be concerned about the size. HOWEVER, SQL has to recompute the value every time that it is read.If you make the computed column PERSISTED, then it is persisted to disk. You also have the added benefit of being able to index the value, if necessary/desired.
Both of these options will result in a column of type varchar(8)
ReplicatedMemberId AS RIGHT(REPLICATE('0', 8) + CAST(MemberID AS VARCHAR(8)), 8) PERSISTED,
ReplicatedMemberId2 AS RIGHT('00000000' + CAST(MemberID AS VARCHAR(8)), 8) PERSISTED,
I realize that if you persist them they will go to disk. But wouldn't that also increase storage? If you're already storing the INT value (4 bytes), then persist the VARCHAR(8) value (10 bytes, correct? Eight for the characters, 2 for overhead?), you're basically storing the value twice, taking up 14 bytes as opposed to 4. Is that correct?
And wouldn't it be best just to store and index the original MemberId INT column and use the computed column for display only? Then you're indexing an INT instead of a VARCHAR. Isn't that better for speed and performance, too?
If you ever need to search for MemberId like '000012%' then you will have to do a full table scan for each search. In this case, you are better off adding the persisted column and indexing it.
If that never happens, then there are 2 options
1 - Store the extra 10Bytes, and use drive space - With a max of 10mil rows, that transaltes to a max 100MB (or 95.3 if 1KB = 1024B).
2 - Take the CPU perf hit of having to do the calc on every read.
You need to make the call of which is the lesser of 2 evils for your situation.
October 19, 2016 at 2:37 pm
Persisted computed columns only make sense when the computation is expensive or when an index is needed. If that's not the case, persisting the computed column makes little sense.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply