SQL Collations mismatch after Upgrade / Consolidation February 14th, 2013

Vinod Kumar

Recently I was contacted by one of the customers with an interesting query. They upgraded from previous version of SQL Server and the DBA didn’t take  necessary precaution for collation in their new server. Now they need to deal with a mismatch between SQL_Latin1_General_CP1_CI_AS and Latin1_General_CI_AS collations in their new database instance versus their old databases. This was a typical rare scenario and I had to help them.

Most DBA’s are oblivious to select an appropriate collation at the point of installation. This is a typical recipe for disaster and I thought to write about this in this blog post for people to use as reference.

What can go wrong here?

  1. Usage of Temp Tables is causing wrong results or worst is breaking now.
  2. In consolidation scenario, now 100’s of database will be in one instance there needs to be application change that needs to be enforced.
  3. If the collation was wrongly selected as case-sensitive then the stored procedure and application can go down easily as soon as the switch happens.

In my close to over a decade of working with SQL Server, I have never ever seen application developers use specific collation on their code. Under rare scenario with multiple localized content, I have seen people use COLLATE keyword in their code. That as I said is one in million sort of scenario. Would love to hear if you have used such code in your application for sure.

Do look at the bigger checklist for SQL Server 2012 if you are doing the same. The same list can be useful when moving from other versions too.

What went wrong? Why?

Most DBAs don’t pay attention to collation until it is too late. The crux of the issue is that SQL Server metadata collation follows the database collation unlike every other database – meaning the metadata or system tables are collated based on the database collation. So in a database with case-sensitive collation, the metadata is case-sensitive (object identifiers, names of tables, sorting of object names & so on).

In SQL Server 2012, we introduced the concept of a fixed catalog collation with partially contained database (CDB). So within the context of a partial CDB, the metadata collation is always fixed to a specific collation irrespective of the database collation or how the database was created on an instance with different defaults.

If you haven’t read about Contained DB, I highly recommend reading the same as it has multiple advantages like:

  1. Have case-sensitive (or other) collation behavior for user data without having to endure a case-sensitive catalog.
  2. Consolidate databases from servers with differing collations without having to rework the code to add explicit collation comparisons.
  3. And of course move a database from one server to another and have the TempDb collation behavior the same.

Know your collation

In SQL Server 2012, you can use the catalog_default keyword when defining your columns for temporary tables. This will prevent the collation conflict issue when joining temporary table with user table data or catalog view data. In older versions, you can use database_default keyword.

See example below (run below in plain database and partially contained to see difference in behavior):

DECLARE @myVar SQL_VARIANT = COLLATE catalog_default;

SELECT SQL_VARIANT_PROPERTY(@myVar, ‘collation’);

SET @myVar = COLLATE database_default;

SELECT SQL_VARIANT_PROPERTY(@myVar, ‘collation’);

Basically, in a partially contained database the catalog_default will return the fixed metadata collation and in plain database it will return database collation (COLLATE property of database). The database_default keyword always returns the database collation. If you use these keywords then you don’t need to specify the collation name explicitly in your code for the temporary table column definitions.

Using COLLATE database_default for column definitions in temporary tables will avoid the issue in older versions of mismatch. Or you can also use table variables instead where the column definition defaults to database collation if collate clause is absent.

Final words

Though we discuss the workarounds here. This is a postmortem work. I highly recommend that if you are an DBA, please ask what the collation should be. Ask what the application design requirements are. Ask if contained DB is being used. Worst case, if you have made the mistake of defaults – it is never too late to reinstall SQL Server afresh as an worst case scenario to avoid all these heart burns :). Please feel free to share your stories with me if you have anything interesting like this that did happen in your organization.

Share this article

Tags: , , , , , , ,

This entry was posted on Thursday, February 14th, 2013 at 08:30 and is filed under Technology. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.


2 Responses to “SQL Collations mismatch after Upgrade / Consolidation”

  1. ramesh says:

    Even i faced the same problem at the time of migration as a Immediate answer i have rebuild the system databases with new collation name check my blog for rebuilding system databases

    http://sqlservr.blog.com/2012/12/20/rebuild-system-database/

Leave a Reply



 

Email
Print