The basics: collation is a set of rules governing the use for characters in a language or in an alhabet.
In SQL Server we have Collation on several different levels:
- SQL Server level – every time an instance is installed, a collation setting must be specified; the default one is selected from the Windows OS collation setting, but it can be changed on SQL Server instance level
- Database level – each database can have its own collation setting (same or different from the instance setting); this also means that the tempdb database has a collation setting. Tempdb inherits the collation setting from the model database.
- column level – the collation setting for char, varchar, text, nchar, nvarchar, and ntext data types can be overridden with a specific collation (same or different from the database setting);
Collation settings are important, since the sorts and searches are highly dependent on the collation settings.
Also, it is important to know what collations are in use, since problems may occur if tables are joined on columns with different collations.
For example, a statement like this:
SELECT a.name, b.name FROM a RIGHT OUTER JOIN b ON a.name = b.name
may fail with the following error:
Msg 468, Level 16, State 9, Line 1
Cannot resolve the collation conflict between “SQL_Latin1_General_CP1_CI_AS” and “Finnish_Swedish_CI_AS” in the equal to operation.
The shortcut to fix this error would be to rewrite the query like this:
SELECT a.name, b.name FROM a RIGHT OUTER JOIN b ON a.name collate database_default = b.name collate database_default
Even though, this is not necessarily the best way to solve the problem.
Changing the collation on a column:
A column’s collation cannot be changed if it is referenced by the following:
- A computed column
- An index
- Distribution statistics, either generated automatically or by the CREATE STATISTICS statement
- A CHECK constraint
- A FOREIGN KEY constraint
Know your database design and think about the ideas behind the different collations in your database before writing ‘easy fixes’.
Here is a script which shows all column collations in a database:
;with CTE as ( select object_name(object_id) as tablename ,name as columnname ,collation_name from sys.columns where collation_name is not null) select * from CTE where tablename not like 'sys%' order by tablename,columnname