Monday, September 29, 2008

SQL Collation and SharePoint 2007

I will not forget to write this down... Latin1_General_CI_AS_KS_WS

I will not forget to write this down... Latin1_General_CI_AS_KS_WS

Now what was it I needed to write again?

Oh Yeah, I Remember...

Every time I build a SharePoint Farm, be it WSS 3.0- or MOSS 2007-based, I always forget the collation setting for SQL databases being used for SharePoint. There are several places on the web that this information is posted, yet I can never seem to find it in a hurry.

Here is the proper SQL collation for a WSS/MOSS installation:

  • Latin1_General Character Set
  • Case Insensitive - A and a are treated as the same character
  • Accent Sensitive - a and á are NOT treated as the same character
  • Kana Sensitive - Japanese Hirakana and Katakana characters which look the same are NOT treated as the same character
  • Width Sensitive - Single-Byte and Double-Byte versions of the same character are NOT treated as the same character
What effect does this have in my SharePoint Farm?

Although I do not (yet) know all of the items an improperly selected collation can affect, here is what I do know:

  1. Search relies on collation when crawling information in a SharePoint site.
  2. There have been reports of improperly configured collation which result in inconsistent behavior such as services not starting correctly.
So what, I can always change it later, right?

Yes, but I can almost guarantee you won't like it - the main databases in a SQL instance reflect the chosen collation. So, in order to change the collation for an instance that will be used for SharePoint, you have to change it on those databases as well.

This involves:

  • Backing up the databases
  • Documenting all user accounts/permissions
  • Dropping the content databases
  • Rebuilding the MASTER database
  • Restoring the content databases
  • Recreating the logins

In short, set this up right before you build your SharePoint Farm. If you are installing SharePoint against an existing SQL backend, consider placing it in its own SQL instance.

Sources

There is a good Blog Entry about it by Areej Abukar on the Jordan SharePoint User Group Blog. Gotta love the Internet.

More detailed information about SQL collation can be found here.

3 comments:

satyam said...

Hi Troy Lanphier,
Greetings.
Sir I have a question which my DBA team has asked.

We noticed that the SQL Server instance for SharePoint databases has different collation settings for System databases and Share point databases.



Collation settings for System databases (Default collation settings while installing SQL Server): SQL_Latin1_General_CP1_CI_AS

Collation settings for SharePoint databases (SharePoint will create automatically): Latin1_General_CI_AS_KS_WS.



In general if we have different collation settings for system databases and user databases will we will get sorting issues. i would like to know what will be the impact of having different collation settings for SQL server and SharePoint databases.

The DBA Team is asking any type of documentation from Microsoft about the settings

Thank you for all the support, help and guidelines

do we really need to change the collation settings of system databases..to match the sharepoint databases?

Satyam.

Troy Lanphier said...

Hi Satyam,

You did not specify what version of SQL you are using, but the problem still remains. Building any new databases from within SharePoint (Content DBs) may result in their having the wrong collation when they are created, which would create problems in the long run.

For this response, I am assuming that there are no other databases in this SQL instance (the one for SharePoint). I would follow the steps shown in this blog entry - http://blogs.digineer.com/blogs/jasons/archive/2007/10/18/change-sql-server-collation.aspx to rebuild the system databases to match the desired collation. Details on SharePoint and collation can be found at http://support.microsoft.com/kb/843578

Gyros.dk said...

Hi Troy. Does this mean that SharePoint has no support of collations other than general Latin1? This is not good when your main language calls for a different collation in sorting items. I'm from Denmark where the last three letters are "æ", "ø" and "å" - the collations DANISH_NORWEGIAN_XX_YY respect this while the Latin1 collation sorts the letters as variations of a, o and a respectively.