You have a mature SharePoint 2007 installation, but let's say you have not been able to update it... And then one day, you get "the call". (And the sky is falling, of course!)...
...The Helpdesk has just informed you that all users are getting the following error when attempting to access one or more site collections.
"Attempt to release mutex not owned by caller. (Exception from HRESULT: 0x00000120)
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.ApplicationException: Attempt to release mutex not owned by caller. (Exception from HRESULT: 0x00000120)"
There could be many reasons for this type of exception, and since we are dealing with SharePoint, all bets are off. :) So you log in and take a look. And sure enough, one or more of your larger site collections is giving this error. So you, being the super-sleuth SharePoint admin, pull out trick number one! You access the site collection settings page (/_layouts/settings.aspx). And oddly enough, it appears: no problem, no error. You can access most settings, and even some pages in the Pages library. (This rules out my initial "guess" that this might be SQL related.) Everything else seems to work, but wait, the master page settings page is throwing an error too! It claims the master page is invalid...? Huh? How would that happen all of a sudden? So what's the common denominator?
What you will most likely find is that it's the cached pages giving the error... the Output Cache, to be exact. It has what I would consider an undocumented limit, one not posted on the software boundaries page on TechNet. Go figure... So OK, what is it then? Why this site? Why now? More importantly, how do I fix it?
The real root cause is that the Output Cache will only support 10,000 Access Control Lists (ACLs). And how do you end up with 10k ACLs? You most likely have a large site collection with a lot of unique permissions, with users breaking inheritance at the item or folder level. And then there's the ability to add "All Authenticated Users" to a SharePoint group, or even to grant it direct permission on the site, web, or item. You also have to consider users that have been disabled in your profile import source but have not been removed from your site collection. Add that all up, and you can hit 10k very quickly. And I cannot emphasize enough that 10k is the limit. I have seen the query I am going to show you return as few as 10,042 on an afflicted site collection. So it's pretty cut and dried: you have to clean up permissions, or turn off the Output Cache. In most cases, you will simply turn off the Output Cache while the content/site-collection administrators clean up the permissions, and then re-enable it.
Now for the icing on the cake: the April 2009 Cumulative Update for SharePoint 2007 does take a step to address the issue, albeit a superficial one... What the aforementioned update does is simply avoid throwing the exception. When the number of ACLs goes over 10,000, the Output Cache is still disabled; it just does not produce the error. So how then would you know about the problem? Well, you may start getting complaints about performance... But you will probably only know if you check. Yup, no one else is going to do it, are they? Luckily, there's an easy way to see if your Output Cache is working. In a future post, I will walk through some of the ways you can test the Output Cache and the BLOB cache.
The Output Cache is not on by default, but most SharePoint administrators, particularly at the enterprise level, are keenly aware of it, and most of them recommend its use. And I have to say, this is the first time I have seen it go afoul. What the Output Cache is best at, and best used for, is loading items into memory that are frequently "read" but infrequently "modified". Portal landing pages are typically a good example: lots of people view them, but changes are usually not all that frequent.
For more information about the Output Cache, see the Microsoft documentation: http://technet.microsoft.com/en-us/library/cc298466(office.12).aspx. You will find a note about the issue here, but not in the software boundaries guide: http://office.microsoft.com/en-us/sharepoint-server-help/configure-page-output-cache-settings-HA010120686.aspx
If you have the needed permissions and expertise, you can run this query on the content database that contains the affected site collection. If you have been using best practices, you would have each site collection in its own database. So finding it should be a snap.
SELECT SiteID, Count(ScopeID) as 'Estimate ACLs' FROM Perms GROUP BY SiteID
If not, have your DBA run it and give you the results.
Analyzing the Results:
A sample screenshot of the SQL results:
The highlighted column is the estimated number of ACLs in the site collection with the given ID. Since this is an estimate, you want to stay as far away from 10,000 as possible. I would even go so far as to make this part of an administrative checklist, so that you don't get caught by surprise.
If you have your DBA run the query, he will most likely give you a CSV file, with the same information.
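If you do end up with a CSV, the checklist item above can be done with a few lines of script instead of eyeballing the numbers. Here is a minimal Python sketch, not a finished tool. It assumes the CSV has a header row with the same column names the query produces (SiteID and Estimate ACLs); the function name, the 80% warning threshold, and the choice to treat exactly 10,000 as over the limit are my own assumptions for illustration, not anything SharePoint defines.

```python
import csv
import io

ACL_LIMIT = 10_000   # the Output Cache limit described in this post
WARN_RATIO = 0.8     # assumption: start warning at 80% of the limit


def flag_site_collections(csv_text, limit=ACL_LIMIT, warn_ratio=WARN_RATIO):
    """Return (site_id, acl_count, status) tuples, highest count first.

    Assumes a header row with 'SiteID' and 'Estimate ACLs' columns,
    matching the aliases used in the query above.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        count = int(record["Estimate ACLs"])
        if count >= limit:
            # Assumption: treat the limit itself as already over.
            status = "OVER LIMIT"
        elif count >= limit * warn_ratio:
            status = "WARNING"
        else:
            status = "ok"
        rows.append((record["SiteID"], count, status))
    # Worst offenders first, so they are the first thing you see.
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data; real SiteID values are GUIDs.
    sample = "SiteID,Estimate ACLs\nsite-a,10042\nsite-b,8500\nsite-c,120\n"
    for site_id, count, status in flag_site_collections(sample):
        print(f"{site_id}\t{count}\t{status}")
```

To use it on a real file, read the file's contents into a string and pass that to flag_site_collections. Since the count is only an estimate, you may want to lower the warning ratio rather than raise it.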
- The next step is to automate this process, with PowerShell!! But that is for another time and another post.
- Setting up PowerShell for SharePoint
- Connecting to SQL Server via Windows PowerShell with SQL Server authentication
- Generating SQL Scripts using Windows PowerShell
- Powershell Series #1- Setting up Powershell
- Using PowerShell with SQL Server Management Objects (SMO)
Last Update: 2011-05-18