June 23, 2011 at 5:19 pm
Hi all,
What are the potential implications (if any) for a database having a different collation than a server for the query engine itself? For instance, I know if *I* want to use tempdb for a temp table then I need to use COLLATE statements. But what about a query, which only joins tables from on one database that the engine is executing? Are there performance implications for the server itself when tempdb is a different collation than the database it's executing queries on?
Any input is appreciated.
June 24, 2011 at 2:49 am
June 24, 2011 at 4:01 am
Jayanth_Kurup (6/24/2011)
Collation , will impact sort operations and as a result performance . Additionally collation will impact the results being returned and as a result the join and type of join operations being performed.
Can you provide anymore details specific to my example?
For example if tempdb uses collation A and database uses collation B, and I run the following query: "SELECT a.col1, a.col2 from table1 a INNER JOIN table 2 b on a.col3 = b.col3 ORDER BY a.col1" I assume my results are still sorted as per collation B? How is performance affected for queries.
The two collations in question here are SQL_Latin1_General_CI_AS and the windows Latin1_General_CI_AS (not at work right now, those may not be exact), I assume they are close enough that sort order won't be an issue?
June 24, 2011 at 4:25 am
Hi,
Sorting done based on the character. e.g. if you run the sort "order by " using the two different datatype column then you get different result. So insert some date in one column using the US date format and UK date format and run the order by column. You get the different sorting.
So you data value and position of the character differs in different Collation.
June 24, 2011 at 12:37 pm
Does anyone else have any input?
June 24, 2011 at 12:45 pm
http://msdn.microsoft.com/en-us/library/aa214297%28v=sql.80%29.aspx
Read this for a better understanding on collation
June 24, 2011 at 2:48 pm
Jayanth_Kurup (6/24/2011)
http://msdn.microsoft.com/en-us/library/aa214297%28v=sql.80%29.aspxRead this for a better understanding on collation
Thank you, but I think there is a misunderstanding on what I am asking, perhaps this is my fault. It is not that I don't understand collations, it is that I don't understand the performance implications for the SQL query engine when it is processing a query that uses (only) tables from database A but internally uses tembdb during processing which uses the different collation, and specifically for SQLServer 2008. My original question (reworded) was:
But what about a query, which only joins/processes tables from one database? Are there performance implications for the server itself when tempdb is a different collation than the database it's executing queries from?
Hopefully that makes what I am seeking information on clearer.
June 25, 2011 at 12:18 am
New temp tables will use the default collation of tempdb unless otherwise specified.
Here are a few options (there may be others) for you in terms of creating your temp tables so you do not get the default collation of tempdb, and rather get the collation that matches your particular columns.
Option 1 uses the "database_default" keyword:
CREATE TABLE #tmp
(
a VARCHAR(100) COLLATE database_default,
b NVARCHAR(100) COLLATE database_default
) ;
Option 2 explicitly specifies the collation:
CREATE TABLE #tmp2
(
a VARCHAR(100) COLLATE Chinese_Traditional_Bopomofo_100_CS_AI_WS,
b NVARCHAR(100) COLLATE Chinese_Traditional_Bopomofo_100_CS_AI_WS
) ;
Option 3 makes use of the SELECT...INTO technique of creating a new temp table which will instruct SQL Server to derive the temp table definition from the selected columns. Collations are part of a table's definition.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
June 25, 2011 at 1:37 am
Thanks, but really not what I am asking about.
For anyone else not sure what I am talking about take a look here: http://msdn.microsoft.com/en-us/library/ms190768.aspx
I'm talking about potential performance issues of Internal objects that are created by the SQL Server Database Engine, for example, work tables to store intermediate results for spools or sorting. And specifically for 2008.
I'm not creating any temp tables myself or manually interacting with tempdb in anyway. I'm asking about the performance hit (if any) of the server collation being SQL_Latin1_General_CI_AS and the db collation being the windows Latin1_General_CI_AS while the engine is resolving a query from only the single database in question where it decides to use tempdb (assuming it still would given the different collations). There is no explicit tempdb usage by the person executing the query.
I really don't think I can make the question any clearer at this point.
June 25, 2011 at 12:52 pm
I think I am the same page as you now, sorry, I missed your *I* inference in your OP.
The intertubes did not turn much up for me, as I suspect it did not for you which is why you posted. I am forced to abstract a bit here but only for lack of documentation. Just thinking out loud really, someone may come along and tell me to shut it, or school me, or both, but I am thinking about how schema is derived when an "internal" temp table needs to be created...and what happens when you ask for a table to be created using SELECT...INTO. I suspect the same techniques would apply for internal tables that you do not explicitly create. If you explicitly compare two columns with different collations that's on you, but I can't see the engine swapping out collations like that on your behalf, after all, collations are part of the schema definition of the column.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
June 25, 2011 at 1:06 pm
worktables use the collation as specified for the database in which the operation is being performed. So I dont think you would see any additional performance impact other than what you would see if the table was created outside of tempdb.
the example I tired was to change the collation of my user database and create a table variable into which it inserted a million rows . On querying the tempdb temptables columns I found collation name to be the same as that of my user database instead of the database default for temp db. Hope this answers your question.
June 25, 2011 at 1:25 pm
Jayanth_Kurup (6/25/2011)
worktables use the collation as specified for the database in which the operation is being performed
Can you cite an article?
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
June 25, 2011 at 3:36 pm
No, the Internet didn't really turn up much for me either unfortunately. Thanks for your comments, I'll bump this up on Monday and see if anyone else has any input then.
As a thought, I wonder if the engine is smart enough to say realize there is a difference in collations between tempdb and the database and create any internal structures to use the db collation? Wouldn't really be a hard thing for it to check for and do. -- heh, just saw the second page of comments with this same thing.
March 15, 2019 at 11:13 am
raistlinx - Saturday, June 25, 2011 1:37 AMI'm talking about potential performance issues of Internal objects that are created by the SQL Server Database Engine, for example, work tables to store intermediate results for spools or sorting. And specifically for 2008.I'm not creating any temp tables myself or manually interacting with tempdb in anyway. I'm asking about the performance hit (if any) of the server collation being SQL_Latin1_General_CI_AS and the db collation being the windows Latin1_General_CI_AS while the engine is resolving a query from only the single database in question where it decides to use tempdb (assuming it still would given the different collations). There is no explicit tempdb usage by the person executing the query.
Hi there. I am going to have to say "no" to the possibility of performance issues in this scenario. Database-level collation is used for:
-- DB collation is Latin1_General_100_BIN2 ; tempdb collation is SQL_Latin1_General_CP1_CI_AS
CREATE TABLE dbo.ColumnNames(col1 INT, Col1 INT, COL1 INT);
DROP TABLE dbo.ColumnNames;
DECLARE @Table TABLE (col1 INT, Col1 INT, COL1 INT);
/*
Msg 2705, Level 16, State 3, Line XXXXX
Column names in each table must be unique.
Column name 'Col1' in table '@Table' is specified more than once.
*/
CREATE TABLE #ColumnNames(col1 INT, Col1 INT, COL1 INT);
/*
Msg 2705, Level 16, State 3, Line XXXXX
Column names in each table must be unique.
Column name 'Col1' in table '#ColumnNames' is specified more than once.
*/
GO
and:
-- DB collation is Latin1_General_100_BIN2 ; tempdb collation is SQL_Latin1_General_CP1_CI_AS
DECLARE @ComputedColumnCollation TABLE
(
[ID] INT NOT NULL,
[SomeChar] AS (NCHAR([ID])),
[String] VARCHAR(10)
);
SELECT *
INTO dbo._comp
FROM @ComputedColumnCollation;
SELECT col.[name], col.[user_type_id], col.[collation_name], col.[is_nullable]
FROM sys.columns col
WHERE col.[object_id] = OBJECT_ID(N'dbo._comp')
ORDER BY col.[column_id];
/*
name user_type_id collation_name is_nullable
ID 56 NULL 0
SomeChar 239 SQL_Latin1_General_CP1_CI_AS 1
String 167 Latin1_General_100_BIN2 1
*/
DROP TABLE dbo._comp;
Both of those cases are functional issues, not performance issues.
What this comes down to is: whatever collation is being used by any string column (minus XML) is the collation that must be used in any work table, version store, trigger / OUTPUT clause table (i.e. INSERTED and DELETED), etc. And this has nothing to do with the collation of either DB or of the instance since every field of a result set can use a different collation. If the collation used for fields in these system-created work tables was different than the collation of the field the column is intended to capture, then you could end up with data loss for VARCHAR data if the code page was different. At the very least you would get the beloved collation mismatch error.
For more details on what areas are impacted by which level of collation, please see:
Changing the Collation of the Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong?
For info on working with collations in general, please visit: Collations Info
Take care,
Solomon...
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply