September 25, 2017 at 10:50 pm
Thanks Eric. Yeah, the idea of not consuming the space is clear, as per the other posts too. My concern was the overhead of SQL Server spending resources allocating and de-allocating the space when going from, say, 300 to 30 as in your example... otherwise, wouldn't SQL Server have greyed out the length box used when defining a column's length? That's my understanding... please correct me if I'm wrong.
Thank you.
September 26, 2017 at 8:12 am
Arsh - Monday, September 25, 2017 10:50 PM
If the question is whether (for example) altering the length of a VARCHAR column from (300) to (30) on a large table would consume considerable time, then the answer is generally no. I would expect that operation to take only a few seconds, because it's a meta-data change. SQL Server keeps the column definitions and constraints in the table's meta-data, and it simply needs to update that meta-data, not re-write every row in the table. It's basically the same as adding a check constraint to a column and then subsequently dropping the constraint; the data itself doesn't change as a result of changing the maximum column length definition.
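As a rough sketch of the statement in question (the table and column names here are just made up for illustration):
ALTER TABLE dbo.Customer
    ALTER COLUMN LastName VARCHAR(30) NOT NULL; -- keep whatever nullability the column already has
-- Note: the ALTER will fail if any existing value is longer than 30 characters.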
If you choose not to re-define the length of the VARCHAR columns, then when selecting a resultset into an ETL tool like SSIS or Informatica, you may want to CAST the column as having a shorter maximum width at runtime, so it won't consume more allocated space in buffers with fixed-width records.
SELECT CAST( LastName AS VARCHAR(30) ) AS LastName, ...
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
September 27, 2017 at 12:33 am
Eric M Russell - Tuesday, September 26, 2017 8:12 AM
Thanks so much, Eric. In that case, in my effort to make the table design efficient, would converting from NVARCHAR to VARCHAR (the data is not multilingual) be of benefit, since SQL Server wouldn't have to spend resources allocating the extra byte for each character?
September 27, 2017 at 1:28 am
Eric M Russell - Monday, September 25, 2017 9:55 AM
If we're talking about how the definition of a column impacts storage or performance, a column like LastName VARCHAR(300) does not consume more space than LastName VARCHAR(30), and neither does it impact performance internally within SQL Server. The maximum length defined for a variable width character column is essentially just a meta-data constraint.
Eric, there's some elaboration on the question. I understand that it's a meta-data change done in the header area when NVARCHAR(300) is changed to NVARCHAR(30). My question is in regard to the allocation/deallocation at each row level while the data is inserted. For example, if there are 100 rows being inserted, each with a different width for this column... say 25, 20, 5, 10 (with 30 as the max above), doesn't SQL Server need to allocate 30 initially and then trim the row content to 25, 20, 5, 10 as the records come in? Thank you.
September 27, 2017 at 11:33 am
Arsh - Wednesday, September 27, 2017 1:28 AM
No, the CHAR(N) and NCHAR(N) datatypes have a fixed allocation width, but the VARCHAR(N) and NVARCHAR(N) datatypes allocate only the number of bytes needed to contain each individual value being inserted. If you insert the value "Smith" into a column defined as LastName VARCHAR(30), then it basically allocates just 5 bytes.
If this value is later changed to "Young-Smith", then the data for that row and other rows within the page is shuffled around to accommodate the change in data length of that row. Depending on the page Fill Factor specified when the table/index was created, a "page split" may occur, which means that a new page is allocated and rows are split between the original and new page. The same row shuffling and page splitting can occur if LastName is inserted as NULL and then subsequently populated. So, if you have a table where VARCHAR columns may get updated, then you want a Fill Factor of something like 80, meaning that pages are initially filled to 80% of capacity, leaving room for updates while reducing the need for page splits.
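For example, here's a rough sketch of applying that Fill Factor to an existing index (the index and table names are hypothetical):
ALTER INDEX IX_Customer_LastName ON dbo.Customer
REBUILD WITH (FILLFACTOR = 80); -- leave roughly 20% free space in leaf pages for in-place updates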
If you're trying to reduce the storage of a table containing many VARCHAR columns, then tweaking the max definition of the VARCHAR columns buys you nothing. What you probably want to consider instead is enabling ROW or PAGE compression.
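As a sketch (the table name is hypothetical, and data compression must be supported by your edition), you can estimate the savings first and then rebuild with compression:
-- Estimate how much space PAGE compression would save for this table
EXEC sys.sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'Customer',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';
-- Rebuild the table with PAGE compression
ALTER TABLE dbo.Customer REBUILD WITH (DATA_COMPRESSION = PAGE);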
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
September 27, 2017 at 11:35 am
Arsh - Wednesday, September 27, 2017 12:33 AM
Thanks so much, Eric. In that case, in my effort to make the table design efficient, would converting from NVARCHAR to VARCHAR (the data is not multilingual) be of benefit, since SQL Server wouldn't have to spend resources allocating the extra byte for each character?
We're talking about converting NVARCHAR columns to VARCHAR now? That's a different issue.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
September 27, 2017 at 1:09 pm
Thank you, Eric, for the insights; very helpful. So, for me to buy any space and performance gain, apart from row and page compression, can I only convert the NVARCHAR columns to VARCHAR to reduce them to half the size, given that I don't need to store Unicode characters?
Shouldn't this help performance, since the number of pages will decrease significantly?
Please also shed some light on TINYINT, SMALLINT, INT and BIGINT. Are they fixed length? In that case it might help to trim their usage too. My objective is to reduce IO and gain performance. Thank you so much.
September 27, 2017 at 1:33 pm
Arsh - Wednesday, September 27, 2017 1:09 PM
Yes, the integer types are fixed-width data types. However, you can't just go changing data types in the search for better performance. You really need to look at the data and what may be planned for the future. Maybe right now you aren't storing Unicode data in your database, but that doesn't mean it won't become a requirement later.
The best question we can ask is what problem are you trying to solve and why do you believe it is a problem.
September 27, 2017 at 1:45 pm
Arsh - Wednesday, September 27, 2017 1:09 PM
Yes, NVARCHAR stores 2 bytes for each character, versus 1 byte for VARCHAR. Integers are fixed width: tinyint (1 byte), smallint (2 bytes), int (4 bytes), bigint (8 bytes).
However, PAGE compression will mitigate this, because it uses a dictionary compression technique to tokenize repeated values.
You can use the DATALENGTH() function to determine how many bytes of storage a specific data type and value require.
For example:
DECLARE @NLastName NVARCHAR(30) = 'John Smith'
, @LastName VARCHAR(30) = 'John Smith';
SELECT DATALENGTH(@NLastName) AS LengthNLastName
, DATALENGTH(@LastName) AS LengthLastName;
LengthNLastName LengthLastName
20 10
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
September 27, 2017 at 7:44 pm
Thank you very much, Eric, for clarifying it so well!
September 27, 2017 at 11:41 pm
ScottPletcher - Friday, September 22, 2017 9:02 AM
Column lengths have another effect on memory allocation as well. SQL Server reserves memory prior to processing a table, and it determines the amount of memory to reserve partly based on column lengths, as would typically make sense. Thus, if you have a lot of columns defined as, say, VARCHAR(500) or VARCHAR(1000) even though they only ever use 10 bytes, you would cause SQL Server to pre-allocate a lot more memory than it really needs.
The following article by Aaron Bertrand illustrates the increased memory grants based on the defined data size, rather than the actual data size.
Performance Myths : Oversizing string columns
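If you want to see the effect for yourself, one rough way (just a sketch, not a tuning recipe) is to run the same query once against the wide column definition and once with the column CAST down to a realistic width, using an ORDER BY or similar so that a memory grant is actually requested, and compare the grants reported by the DMV while each query is running:
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb
FROM   sys.dm_exec_query_memory_grants;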
September 29, 2017 at 5:49 am
Lynn Pettis - Wednesday, September 27, 2017 1:33 PM
The best question we can ask is what problem are you trying to solve and why do you believe it is a problem.
Thanks Lynn. I'm looking at the design of a database to see what areas I can modify or tweak as a re-engineering objective for performance gain, while the DB development team works on tuning the SPs. This is a product already in production, with a 1.2 TB database, and it raises so many questions because of its not-so-good performance. I learned that no care was taken over design considerations for the database, though I could see some normalization. Most of the table structure 'evolved' on the fly.
September 29, 2017 at 5:59 am
DesNorton - Wednesday, September 27, 2017 11:41 PM
Thank you, Des. It helps clear the air about the impact of the length/type definitions considered while designing a table, which many architects (or shall I call them DB developers?) seem to ignore.
November 23, 2017 at 3:13 am
Arsh - Friday, September 29, 2017 5:59 AM
Hi Des, is the statement above (that SQL Server has to allocate memory based on half of the defined column length prior to processing the table rows) true for all versions of SQL Server, or was that behaviour changed at some point? Thank you.
November 23, 2017 at 4:20 am
Arsh - Thursday, November 23, 2017 3:13 AM
That statement was made by Scott Pletcher.