Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Understanding the COLLATE DATABASE_DEFAULT clause in SQL Server


By:   |   Last Updated: 2016-08-03   |   Comments (5)   |   Related Tips: More > Database Configurations

Problem

During a recent database code review, we noticed the COLLATE DATABASE_DEFAULT clause in many of the stored procedures which join tables to temporary tables and linked servers.  What does the COLLATE clause do and how can I learn about the behavior of COLLATE DATABASE_DEFAULT?

Solution

COLLATE is a clause applied to character string expression or column for textual data types such as char, varchar, text, nchar, nvarchar, and ntext to cast the string or column collation into a specified collation.

COLLATE can be specified with a specific collation name or use the COLLATE DATABASE_DEFAULT clause which will cast the character string expression or column collation into the collation of the database context where the command is executed.

All the T-SQL below was written and executed on a 64-bit version of SQL Server 2014 Enterprise Edition SP1. All the queries are executed in the same Query session due to the utilization of temporary tables.

Get the SQL Server instance collation

Below is the SQL Server instance collation. All the system databases such as master, tempdb, model and msdb will have the same collation as the SQL Server instance collation.

SET NOCOUNT ON
GO
SELECT SERVERPROPERTY('collation') SQLServerCollation
,DATABASEPROPERTYEX('master', 'Collation') AS MasterDBCollation
GO

The above returns the following output:

SQLServerCollation           MasterDBCollation
---------------------------- -----------------------------
Latin1_General_CI_AS         Latin1_General_CI_AS

Create a sample database, tables and data

Let's create a user database which has a case sensitive collation as follows:

CREATE DATABASE [CaseSensitiveDB]
COLLATE Latin1_General_CS_AS_KS
GO

Create two permanent tables and two temporary tables and populate with sample records. The table CaseSensitiveDB.dbo.NonCSCollation will be created in the context of the [master] database with COLLATE DATABASE_DEFAULT. The other table PermTable and two temporary tables will be created in the context of the CaseSensitiveDB database.

Note that the first row inserted into PermTable is in uppercase, but otherwise all tables contain the same values.

USE [master]
GO

CREATE TABLE CaseSensitiveDB.dbo.NonCSCollation
( Value VARCHAR(100) COLLATE DATABASE_DEFAULT )
GO

USE [CaseSensitiveDB]
GO

CREATE TABLE dbo.PermTable
( Value VARCHAR(100) )
GO

CREATE TABLE #TableWithoutCollateDB
( Value VARCHAR(100) )
GO

CREATE TABLE #TableWithCollateDB
( Value VARCHAR(100) COLLATE DATABASE_DEFAULT )
GO

INSERT INTO dbo.PermTable (Value) VALUES ('RECORD 1'),('Record 2')
INSERT INTO #TableWithoutCollateDB (Value) VALUES ('Record 1'),('Record 2')
INSERT INTO #TableWithCollateDB (Value) VALUES ('Record 1'),('Record 2')

Checking the column collation in NonCSCollation table, it will inherit the server collation even though the table is created in CaseSensitiveDB. The COLLATE DATABASE_DEFAULT clause will cast the table collation to the server collation which is Latin1_General_CI_AS

USE CaseSensitiveDB
GO

sp_help NonCSCollation
GO

SQL Server Database Collation

Examples to see differences with Collation

Query 1 below which joins to #TableWithoutCollateDB will fail with a collation conflict. This is because this temporary table is created using the Tempdb collation which is the same as the server collation while PermTable table uses the database collation.

-- Query 1
USE [CaseSensitiveDB]
GO
SET NOCOUNT ON
GO
SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value = TA.Value
GO

We get this error:

Msg 468, Level 16, State 9, Line 5 Cannot resolve the collation conflict between "Latin1_General_CI_AS" and "Latin1_General_CS_AS_KS" in the equal to operation.

Query 2 will execute successfully because the column Value in temporary table #TableWithCollateDB will inherit the database collation and not the SQL Server instance collation. Both join columns are case sensitive hence the count would return a value of one.

-- Query 2
USE [CaseSensitiveDB]
GO

SET NOCOUNT ON
GO

SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithCollateDB TB
ON PT.Value = TB.Value
GO

Results:

-----------
1

Assuming the query is written to join over a linked server or cross databases which have tables with different collation. Using COLLATE DATABASE_DEFAULT in the join on either or both sides of the column will set both columns to inherit the database context collation.

Using Query 1 which failed with a collation conflict as an example, there are three updated version of the query which will now execute successfully without collation conflict errors. All three queries will return row counts of one, because the database collation is case sensitive.

USE [CaseSensitiveDB]
GO

SET NOCOUNT ON
GO

-- Query 3
SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value COLLATE DATABASE_DEFAULT = TA.Value
GO

-- Query 4
SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value = TA.Value COLLATE DATABASE_DEFAULT 
GO

-- Query 5
SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value COLLATE DATABASE_DEFAULT = TA.Value COLLATE DATABASE_DEFAULT 
GO

Results:

-----------
1

-----------
1

-----------
1

Executing the same query in Step 5, but in the context of the master database. All queries now will inherit the server collation instead and the count will return two even though the PermTable table has case sensitive collation on the Value column.

USE [master]
GO

SET NOCOUNT ON
GO

-- Query 6
SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value COLLATE DATABASE_DEFAULT = TA.Value
GO

-- Query 7
SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value = TA.Value COLLATE DATABASE_DEFAULT 
GO

-- Query 8
SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value COLLATE DATABASE_DEFAULT = TA.Value COLLATE DATABASE_DEFAULT 
GO

Results:

-----------
2

-----------
2

-----------
2

The query below would also cause a collation conflict because of the differing collation.

-- Query 9
SELECT [Value] FROM [CaseSensitiveDB].dbo.PermTable
UNION 
SELECT [Value] FROM #TableWithoutCollateDB

And we get this error:

Msg 468, Level 16, State 9, Line 126 Cannot resolve the collation conflict between "Latin1_General_CI_AS" and "Latin1_General_CS_AS_KS" in the UNION operation.

The COLLATE clause can also be used for columns of a SELECT query. The query would produce differing result depending on the context of the database.

-- Query 10
USE master
GO
SELECT [Value] AS TwoRows FROM [CaseSensitiveDB].dbo.PermTable
UNION 
SELECT [Value] COLLATE DATABASE_DEFAULT FROM #TableWithoutCollateDB

-- Query 11
USE CaseSensitiveDB
GO
SELECT [Value] ThreeRows FROM [CaseSensitiveDB].dbo.PermTable
UNION 
SELECT [Value] COLLATE DATABASE_DEFAULT FROM #TableWithoutCollateDB

Results:

TwoRows
--------------------
RECORD 1
Record 2

ThreeRows
--------------------
Record 1
RECORD 1
Record 2

Summary

The COLLATE DATABASE_DEFAULT clause provides flexibility in database development to resolve collation conflict issues assuming it is used appropriately. Writing an ad-hoc query with this clause may produce unexpected results if not well understood.

MSDN states COLLATE DATABASE_DEFAULT clause casts the collation of an expression, column definition, or database definition to inherit the collation of the "current database". To complement MSDN, the "current database" is the context of the database where the query is executed.

Next Steps


Last Updated: 2016-08-03


next webcast button


next tip button



About the author
MSSQLTips author Simon Liew Simon Liew is an independent SQL Server Consultant in Sydney, Australia. He is a Microsoft Certified Master for SQL Server 2008 and holds a Masterís Degree in Distributed Computing.

View all my tips
Related Resources




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Monday, October 22, 2018 - 7:11:27 AM - Simon Liew Back To Top

Hi Awais,

The COLLATE DATABASE_DEFAULT uses the collation of the database context where the the query is executing. In Query 7, the context of the user database is master. Hence it is case insensitive and returns value of 2.

If you execute Query 7 in the context of database CaseSensitiveDB, then you will get value of 1.


Saturday, October 20, 2018 - 3:27:50 PM - Awais Back To Top

I wonder How Query #7 is returning count of 2 rows. In this query Collation of temporray table has been changed in Join condition and [CaseSensitiveDB].dbo.PermTable is already in Case sensitive collation. My question is How following query return count of 2 instead of 1

Use Master;

GO

SELECT COUNT(*)
FROM
[CaseSensitiveDB].dbo.PermTable PT
JOIN #TableWithoutCollateDB TA
ON PT.Value = TA.Value COLLATE DATABASE_DEFAULT
GO

 

Kind regards,


Wednesday, April 04, 2018 - 11:20:18 AM - Mick Woods Back To Top

Expanding a bit on Alex's comment - I had major performance issues using COLLATE DATABAS_DEFAULT where a query that should have executed almost instantly took around 10 seconds. Switching to the actual collation of the remote database (eg COLLATE Latin1_General_BIN) removed the overhead and the query was virtually instant again.

The collation error message helps by telling you what the actual DB collations are. 

 


Wednesday, August 03, 2016 - 9:29:24 AM - Simon Liew Back To Top

Thank you for the comment Alex. 

But sometimes database collation is not something that you can control. For example, Microsoft Dynamics, Sharepoint, SSRS databases, SAP, etc have different collation and it is not something that you can rebuild the database collation if you want to produce a report out of all these databases.

This tip is also clarifying the behaviour which msdn only provides a single sentence.


Wednesday, August 03, 2016 - 8:37:23 AM - Alex Friedman Back To Top

 

Beware of performance issues due to collation differences, especially when using the "collate both sides to default" method -- you'll need to convert according to the flow of data in the query in order to use indexes properly.

The best case would be to not have collation differences in the first place.


Learn more about SQL Server tools