What CDN Usage Does for SharePoint Online (SPO) Performance

If you need the what’s what on CDNs (content delivery networks), this is a bit of quick reading that will get you up to speed with what a CDN is, how to configure your SPO tenant to use a CDN, and the benefits that CDNs can bring.

The (Not Entirely Obvious) TL;DR Answer

Since I’m taking the time to write about the topic, you can safely guess that yes, CDNs make a difference with SPO page operations. In many cases, proper CDN configuration will make a substantial difference in SPO page performance. So enable CDN use NOW!

The Basis For That Answer: Introduction

Knowing that some folks simply want the answer up-front, I hope that I’ve satisfied their curiosity. The rest of this post is dedicated to explaining content delivery networks (CDNs), how they operate, and how you can easily enable them for use within your SharePoint Online (SPO) sites.

Let me first address a misconception that I’ve sometimes encountered among SPO administrators and developers (including some MVPs): that CDNs don’t really “do a whole lot” to help site and/or page performance. Sure, usage of a CDN is recommended … but a common misunderstanding is that a CDN is really more of a “nice-to-have” than a “need-to-have” element for SPO sites. That judgment often comes without any real research, knowledge, or testing behind it. Skeptics typically haven’t read the documentation (the “non-RTFM crowd”) and haven’t actually spent any time profiling and troubleshooting the performance of SPO sites. Since I enjoy addressing performance problems and challenges, I’ve been fortunate to experience firsthand the benefits that CDNs can bring. By the end of this post, I hope I’ll have made converts of a CDN skeptic or two.

What Is A CDN?

A CDN is a Content Delivery Network. There are a lot of (good) web resources that describe and illustrate what CDNs are and how they generally operate (like this one and this one), so I’m not going to attempt to “add value” with my own spin. I will simply call attention to a couple of the key characteristics that we really care about in our use of CDNs with SPO.

  1. A CDN, at its core, can be thought of as a system of distributed (typically geographically so) servers for caching and offloading of SPO content. Rather than needing to go to the Microsoft network and data center where your tenant is located in order to fetch certain files from SPO, your browser can instead go to a (geographically) closer CDN server to get those same files.
  2. By virtue of going to a closer CDN instead of the Microsoft network, the chances that you’ll have a “bigger pipe” with more bandwidth – and less latency/delay – are greater. This usually translates directly to an improvement in performance.
  3. In addition to giving us the opportunity to download certain SPO files faster and with less delay, CDNs can do other things to improve the experience for the SPO files they serve. For instance, CDN servers can pass files back to the browser with cache-control headers that allow browsers to re-serve downloaded files to other users (i.e., to users who haven’t actually downloaded the files), store downloaded files locally (to avoid having to download them again for a period of time), and more.
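To make the cache-control point a little more concrete, here is a minimal sketch of how a max-age directive translates into “how long can this file be reused.” The function name and the sample header value are assumptions for illustration only; they aren’t part of any SharePoint or CDN tooling.

```powershell
# Hypothetical helper (not part of any SharePoint module): extracts the
# max-age value, in seconds, from a Cache-Control header string.
function Get-CacheMaxAge {
    param([string]$CacheControl)

    # Look for a "max-age=<digits>" directive anywhere in the header value.
    if ($CacheControl -match '\bmax-age=(\d+)') {
        return [int]$Matches[1]
    }

    # No max-age directive present.
    return $null
}

# Example header value of the kind a CDN might return (illustrative only).
$header  = 'public, max-age=2592000'
$seconds = Get-CacheMaxAge -CacheControl $header
$days    = $seconds / 86400
Write-Host "The file may be reused for $days day(s) without being downloaded again."
```

In practice you would read the real header from your browser’s developer tools rather than hard-coding it as done here.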

If you didn’t know about CDNs prior to this post, or didn’t understand how they could help you, I hope you’re beginning to see the possibilities!

The Arrival Of The Office 365 CDN

It wasn’t all that long ago that Microsoft was a bit more “modest” in its use of CDNs. Microsoft certainly made use of them, but prior to the implementation of its own content delivery networks, Microsoft frequently turned to a company called Akamai for CDN support.

When I first started presenting on SharePoint and its built-in caching mechanisms, I often spoke about Akamai and their edge network when talking about BLOB caching and how the max-age cache-control header could be configured and misconfigured. Back then, “Akamai” was basically synonymous with “CDN,” and that’s how many of us thought about the company. They were certainly leading the pack in the CDN service space.

Back then, if you were attempting to download a large file from Microsoft (think DVD images, ISO files, etc.), there was a good chance that the download link your browser would receive (from Microsoft’s servers) would actually point to an Akamai edge node geographically near you instead of a Microsoft destination.

Fast forward to today. In addition to utilizing third-party CDNs like those deployed by Akamai, Microsoft has built (and is improving) their own first-party CDNs. There are a couple of benefits to this. First, many data regulations you may be subject to that prevent third-party housing of your data (yes, even in temporary locations like a CDN) can be largely avoided. In the case of CDNs that Microsoft is running, there is no hand-off to a third party and thus much less practical concern regarding who is housing your data.

Second, with their own CDNs, Microsoft has a lot more latitude and ability to extend the specifics of CDN configuration and operation to its customers. And that’s what they’ve done with the Office 365 CDN.

Set Up The O365 CDN For Your Tenant’s Use

Now we’re talking! This next part is particularly important, and it’s what drove the creation of this post. It’s also the one bit of information that I promised Scott Stewart at Microsoft that I would try to get “out in the wild” as quickly and as visibly as possible.

So, if you remember nothing else from this post, please remember this:

Set-SPOTenantCdnEnabled -CdnType Public -Enable $true

That is the line of PowerShell that needs to be executed (against your SPO tenant, so you need to have a connection to your tenant established first) to enable transparent CDN support for public files. Run that, and non-sensitive files of public origin from SPO will begin getting cached in a CDN and served from there.

The line of PowerShell I shared goes through the SharePoint Online Management Shell – something most organizations using SPO (and their admins in particular) have installed somewhere.
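For anyone who hasn’t connected to a tenant from the SharePoint Online Management Shell before, the full sequence might look something like the sketch below. The admin URL is a placeholder (substitute your own tenant name), and the two Get- cmdlets at the end are simply one way to confirm the result. You may also be asked to confirm the change when the Set- cmdlet runs.

```powershell
# Assumes the SharePoint Online Management Shell module is installed and that
# you hold a SharePoint admin role. The admin URL below is a placeholder.
$adminUrl = 'https://contoso-admin.sharepoint.com'

# Establish the tenant connection (prompts for credentials).
Connect-SPOService -Url $adminUrl

# Enable the public CDN, as described above.
Set-SPOTenantCdnEnabled -CdnType Public -Enable $true

# Confirm the setting and list the origins the public CDN will serve from.
Get-SPOTenantCdnEnabled -CdnType Public
Get-SPOTenantCdnOrigins -CdnType Public
```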

It is also possible to enable CDN support if you’re using the PnP PowerShell module, if that’s your preference, by executing the following PowerShell:

Set-PnPTenantCdnEnabled -CdnType Public -Enable $true

No matter how you enable the CDN, it should be noted that the PowerShell I’ve elected to share (above) enables CDN usage for files of public origin only. It is easy enough to alter the parameters being passed in our PowerShell command so as to cover all files, public and private, by switching -CdnType to Both (with the SPO management shell) or executing another line of PowerShell after the first that swaps -CdnType Public with -CdnType Private (in the case of the SharePointPnP PowerShell module).
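Spelled out, those two variations might look like the following. This is a sketch that assumes you already have a tenant connection in place (Connect-SPOService for the first option, Connect-PnPOnline for the second) and that your organization’s policies permit CDN use for private files.

```powershell
# Option 1 (SharePoint Online Management Shell): public and private together.
Set-SPOTenantCdnEnabled -CdnType Both -Enable $true

# Option 2 (PnP PowerShell): one call per CDN type.
Set-PnPTenantCdnEnabled -CdnType Public -Enable $true
Set-PnPTenantCdnEnabled -CdnType Private -Enable $true
```

Pick one option or the other; there is no need to run both.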

The reason I chose only public enablement is because your organization may be bound by restrictions or policies that prohibit or limit CDN use with private files. This is discussed a bit in the O365 CDN post originally cited, but it’s best to do your own research.

Enabling CDN support for public files, however, is considered to be safe in general.

What Sort Of Improvements Can I Potentially See?

I’ve got a series of images that I use to illustrate performance improvements when files are served via CDN instead of an SPO list/library, and those images are from Microsoft. Thankfully, MS makes the images I tend to use (and a discussion of them) freely available, and they are presented at this link for your reading and reference.

The example that is called out in the link I just shared involves offloading of the jQuery JavaScript library from SPO to CDN. The real world numbers that were captured reduced fetch-and-load time from just over 1.5 seconds to less than half a second (<500ms). That is no small change … and that’s for just one file!

The Other (Secret) Benefit Of CDNs

I guess “Secret” is technically the wrong choice of term here. A more accurate description would be to say that I seldom hear or see anyone talking about another CDN benefit I consider to be very important and significant. That benefit, quite simply, involves improving file fetching and retrieval parallelism when a web page and associated assets (CSS, JS, images, etc.) are requested for download by your browser. In plain English: CDNs typically improve file downloading by allowing the browser to issue a greater number of concurrent file requests.

To help with this concept and its explanation, I’ve created a couple of diagrams that I’ll share with you. The first one appears below, and it is meant to represent the series of steps a browser might execute when retrieving everything needed to show a (SharePoint/SPO) page. As we’ve talked about, what is commonly thought of as a single page in a SharePoint site is, more accurately, a page containing all sorts of dependent assets: image files, JavaScript files, cascading style sheets, and a whole bunch more.

A request for a SharePoint page housed at http://www.thesite.com might start out with one request, but your browser is going to need all of the files referenced within the context of that page (default.aspx, in our case) to render correctly. See below:

To get what’s needed to successfully render the example SharePoint page without CDN support, we follow the numbers:

  1. Your browser issues an HTTP request for the page you want to load – http://www.thesite.com/default.aspx in the case of example above.
  2. That page request goes to (and is served by) the web server/front-end that can return the page.
  3. Our page needs other files to render properly, like styling.css, logo.png, functions.js, and more. These get queued-up and returned according to some rules – more on this in a minute.
  4. In step four (4), files get returned to the browser. Notice I say “no more than six at a time” in the illustration. That’s important and will come into play once we start introducing CDN support to the page/site.

You might be wondering, “Only six files at a time? Really? Why the limitation?” Well, I should start by saying the limit is probably six … maybe a bit more, perhaps a bit less. The specific number depends on the browser you’re using. There was a good summary answer on StackOverflow to a related (but slightly different) question that provides some additional discussion.

Section eight (8) of the HTTP specification (RFC 2616) specifically addresses HTTP connections, how they should be handled, how proxies should be negotiated, etc. For our purposes, the practical implementation of the HTTP specification by modern browsers generally limits the number of concurrent/active connections a browser can have to any given host or URL to six (6).

Notice how I worded that last sentence. Since you folks are smart cookies, I’ll bet you’re already thinking “Wait a minute. CDNs typically have different URLs/hosts from the sites they cache” and you’re imagining what happens (or can happen) when a new source (i.e., different host/URL) is introduced.

This illustration roughly outlines the fetch process when a CDN is involved:

Steps one (1) through four (4) of the fetch process with a CDN are basically still the same as was illustrated without a CDN a bit earlier. When the page is served-up in step three (3) and returned in step four (4), though, there are some differences and additional activity taking place:

  1. Since at least one CDN is in use for the SPO environment, some of the resource links within the page that is returned will have different URLs. For instance, whereas styling.css was previously served from the SPO environment in the non-CDN example, it might now be referenced through the CDN host shown as http://cdn.source.com/styling.css.
  2. The requested file is retrieved, and …
  3. Files come back to the client browser from the CDN at the same time they’re being passed-back from the SPO environment.

Since we’re dealing with two different URLs/hosts in our CDN example (http://www.thesite.com and cdn.source.com), our original six (6) file concurrent download limitation transforms into a 12 file limitation (two hosts serving six files at a time, 2 x 6 = 12).
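The arithmetic can be sketched with a deliberately simplified model: assume every file takes one “round” to download and each host allows six concurrent connections. The function name and the 24-file page are hypothetical, and real browsers (and protocols like HTTP/2) behave in more nuanced ways, so treat this as an illustration of the idea rather than a measurement.

```powershell
# Simplified model: every file takes one "round" to download, and each host
# allows a fixed number of concurrent connections (six is an assumption).
function Get-DownloadRounds {
    param(
        [int[]]$FilesPerHost,          # number of files requested from each host
        [int]$ConnectionsPerHost = 6   # assumed per-host browser limit
    )

    # Hosts download in parallel, so the slowest host sets the total.
    $rounds = 0
    foreach ($count in $FilesPerHost) {
        $needed = [int][math]::Ceiling([double]$count / $ConnectionsPerHost)
        if ($needed -gt $rounds) { $rounds = $needed }
    }
    return $rounds
}

# 24 assets, all from the SPO host: 4 rounds of six.
Get-DownloadRounds -FilesPerHost 24

# The same 24 assets split evenly between the SPO host and a CDN host: 2 rounds.
Get-DownloadRounds -FilesPerHost 12, 12
```

Under these assumptions, adding the second host halves the number of rounds without any single file downloading faster.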

Whether or not the CDN-based process is ultimately faster than without a CDN depends on a great many factors: your Internet bandwidth, the performance of your computer, the complexity/structure of the page being served-up, and more. In the majority of cases, though, at least some performance improvement is observed. In many cases, the improvement can be quite substantial (as referenced and discussed earlier).

Additional Note: 8/24/2020

In a bit of laziness on my part, I didn’t do a prior article search before writing this post. As fate would have it, Bob German (a friend and fellow MVP – well, he was an MVP prior to joining Microsoft a couple of years back) wrote a great post at the end of 2017 that I became aware of this morning with a series of tweets. Bob’s post is called “Choosing a CDN for SharePoint Client Solutions” and is a bit more developer-oriented. That being said, it’s a fantastic post with good information that is a great additional read if you’re looking for more material and/or a slightly different perspective. Nice work, Bob!

Post Update: 8/26/2020

Anders Rask was kind enough to point out that the PnP PowerShell line I originally had listed wasn’t, in fact, PnP PowerShell. That specific line of PowerShell has since been updated to reflect the correct way of altering a tenant’s CDN with the PnP PowerShell cmdlets. Many thanks for the catch, Anders!

Conclusion

So, to sum-up: enable CDN use within your SPO tenant. The benefits are compelling!

References

  1. Microsoft Docs: Use The Office 365 Content Delivery Network (CDN) With SharePoint Online
  2. Imperva: What Is A CDN?
  3. Akamai: What Does CDN Stand For?
  4. MDN Web Docs: Cache-Control
  5. Company: Akamai
  6. Presentations: Caching-In For SharePoint Performance
  7. Akamai: Download Delivery
  8. Microsoft Docs: Configure Cache Settings For A Web Application In SharePoint Server
  9. Blog Post: Do You Know What’s Going To Happen When You Enable The SharePoint BLOB Cache?
  10. LinkedIn: Scott Stewart
  11. Microsoft Docs: Enabling O365 CDN support for public origin files.
  12. Microsoft Docs: Get Started With SharePoint Online Management Shell
  13. Microsoft Docs: PnP PowerShell Overview
  14. Microsoft Docs: Set Up And Configure The Office 365 CDN By Using PnP PowerShell
  15. Microsoft Docs: What Performance Gains Does A CDN Provide?
  16. Push Technologies: Browser Connection Limitations
  17. StackOverflow: How many maximum number of simultaneous Chrome connections/threads I can start through Selenium WebDriver?
  18. W3.org: RFC 2616, Section 8: Connection

Quick Tips for Managing the SharePoint 2010 Office Web Applications Cache

I presented remotely to the Boston Area SharePoint User Group (BASPUG) tonight (7/13/2016), and I referenced an article that I had written that is no longer available online. This post originally appeared as a “SharePoint Smarts” article from Idera. Idera is out of the SharePoint business nowadays, but the information I shared in that article is still relevant to those who use SharePoint 2010. So if you have a SharePoint 2010 environment and use the Office Web Apps, this post (and more specifically, the scripts contained within) is for you.

One of the hotly anticipated items in SharePoint 2010’s feature set is the introduction of the Microsoft Office Web Applications, or “Office Web Apps” for short. The release of the Office Web Apps opens up new possibilities for those who work with documents and files that are tied to Microsoft Word and other applications in the Microsoft Office Family.

What Are the Office Web Apps?

In prior versions of SharePoint, viewing and editing Office documents that existed in SharePoint document libraries normally required a client computer possessing the Microsoft Office suite of applications. If you wanted to view or edit a Word document that existed in SharePoint, for example, you needed Microsoft Word (or an equivalent application) installed on your computer.

That situation changes with the arrival of the Office Web Apps. When a SharePoint 2010 farm is properly set up and configured with the Office Web Apps, it becomes possible to view and edit several different Office document types directly from within a browser as shown in Figure 1 below.

Figure 1: Browser-based editing of a Microsoft Word document

The Office Web Apps provide browser-based viewing and editing support for Microsoft Excel, OneNote, PowerPoint, and Word document types, and this support extends to more than just Internet Explorer. Firefox 3.x, Safari 4.x, and Google Chrome browser types are also supported for viewing and editing – making the Office Web Apps an enabler of cross-platform collaboration that centers on Office documents.

A Word about the Plumbing

As you might imagine, browser-based rendering and editing of Office documents involves a number of complex processes that engage a variety of front-end, middle-tier, and back-end components. The front-end and middle-tier tasks that are tied to document viewing and editing are handled primarily by a new set of service applications that appear when the Office Web Apps are installed. These service applications (and their associated pages, handlers, and worker processes) take care of the business of document conversion, load-balancing, and rendering for browser consumption.

Document conversion and rendering typically generate a combination of images, HTML, JavaScript, and XAML (or eXtensible Application Markup Language) that are sent to consuming browsers. The creation of these document resources is an expensive process, both in terms of CPU cycles and storage. To improve performance levels, it makes sense to generate these document resources only as needed and reuse them whenever possible. That’s where the Office Web Apps cache comes in.

The Office Web Apps cache is the back-end store that is responsible for housing images, HTML, JavaScript, and XAML resources once they have been created for a document. Each time a document is converted into a set of these resources, the resources are stored in the Office Web Apps cache. When a request for a document comes into SharePoint, the cache is checked to see if the document had been previously requested and rendered. If it had, and the cached document resources are up-to-date for the document, then the document request is served from the cache instead of engaging the Office Web Apps to convert and re-render it. Serving document resources from the Office Web Apps cache can yield significant performance improvements over scenarios where no cache is employed.
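The check-then-convert flow just described can be modeled in a few lines. This is a conceptual sketch only: the function, the hashtable standing in for the cache site collection, and the version number standing in for “is the cached copy up-to-date?” are all hypothetical and are not how the Office Web Apps are actually implemented.

```powershell
# Conceptual model only; the names below are hypothetical and are not part of
# the Office Web Apps or SharePoint APIs.
function Get-DocumentResources {
    param(
        [hashtable]$Cache,        # stands in for the cache site collection
        [string]$DocumentUrl,
        [int]$DocumentVersion,    # stands in for "is the cached copy current?"
        [scriptblock]$Convert     # stands in for the expensive conversion step
    )

    $entry = $Cache[$DocumentUrl]
    if ($null -ne $entry -and $entry.Version -eq $DocumentVersion) {
        # Cache hit: resources exist and match the current document.
        return $entry.Resources
    }

    # Cache miss or stale entry: convert, then store the result for next time.
    $resources = & $Convert $DocumentUrl
    $Cache[$DocumentUrl] = @{ Version = $DocumentVersion; Resources = $resources }
    return $resources
}

$cache   = @{}
$stats   = @{ Conversions = 0 }
$convert = { param($url) $stats.Conversions++; "rendered:$url" }

Get-DocumentResources $cache '/docs/plan.docx' 1 $convert | Out-Null  # converts
Get-DocumentResources $cache '/docs/plan.docx' 1 $convert | Out-Null  # served from cache
Get-DocumentResources $cache '/docs/plan.docx' 2 $convert | Out-Null  # document changed, converts again
Write-Host "Conversions performed: $($stats.Conversions)"
```

Three requests result in only two conversions; the second request is the one the cache saves.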

Quick side note before going too far: the Office Web Apps cache is only employed for Word and PowerPoint document types. It is not used for OneNote or Excel documents.

Inside the Office Web Apps Cache

The Office Web Apps cache takes the form of a single site collection for each Web application within a SharePoint farm. When the Office Web Apps are installed and configured in a SharePoint environment, a couple of new timer jobs are installed and run regularly within the farm. One of those timer jobs, the Office Web Apps Cache Creation timer job, ensures that each Web application where the Office Web Apps are running has a site collection like the one shown below in Figure 2.

Figure 2: The Office_Viewing_Service_Cache site collection

The Office_Viewing_Service_Cache site collection is a standard Team Site, and it is the location where resources are stored following the conversion and rendering of either a Word or PowerPoint document by the Office Web Apps.

The Team Site can be accessed just like any other SharePoint Team Site, and a glimpse inside the All Documents library (showing a number of document resources) appears below in Figure 3.

Figure 3: All Documents library in an Office Web Apps cache site collection

Managing the Cache

For such a complex system, the Office Web Apps components do a pretty good job of maintaining themselves without external intervention. This extends to the site collections that are used by Office Web Apps for caching purposes, as well. For example, the Office Web Apps Expiration timer job that is installed with the Office Web Apps removes old document resources from cache site collections once they’ve hit a certain age. The timer job also ensures that each of the site collections responsible for caching has adequate space to serve its purpose.

This doesn’t mean that there aren’t opportunities for tuning and maintenance, though. In fact, there are a couple of things that every administrator should do and review when it comes to the Office Web Apps cache.

Tip #1: Relocate the Cache to a New Database

By default, the Office Web Apps Cache Creation timer job creates an Office_Viewing_Service_Cache site collection in a content database that is collocated with one or more of the “real” site collections within each of your content Web applications. Since the cache site collection is allowed to grow to a beefy 100GB by default, it makes sense to relocate the cache site collection to its own (new) content database. By relocating the cache site collection to its own content database, it becomes easy to exclude it from other maintenance such as backups.

Relocating the cache site collection is pretty straightforward, and it can be accomplished easily with the following RelocateOwaCache.ps1 PowerShell script. Simply save the script, execute it, and supply the URL of a Web application within your farm where the Office Web Apps are running. The script will take care of creating a new content database within the Web application, and it will then move the Web application’s Office Web Apps cache site collection to the newly created content database.

[code language="powershell"]
<#
.SYNOPSIS
RelocateOwaCache.ps1
.DESCRIPTION
Relocates the Office Web Apps cache for a specified Web application to a new content database that is created by the script
.NOTES
Author: Sean McDonough
Last Revision: 07-June-2011
.PARAMETER targetUrl
A Web application where Office Web Apps are in use
.EXAMPLE
RelocateOwaCache.ps1 http://www.TargetWebApplication.com
#>
param
(
    [string]$targetUrl = "$(Read-Host 'Target Web application URL [e.g. http://hostname]')"
)

function RelocateCache($targetUrl)
{
    # Ensure that the SharePoint cmdlets are loaded before continuing
    $spCmdlets = Get-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
    if ($null -eq $spCmdlets)
    { Add-PSSnapin Microsoft.SharePoint.PowerShell }

    # Get the name of the current database where the cache is located; it
    # will serve as the basis for a new content database name.
    $cacheSite = Get-SPOfficeWebAppsCache -WebApplication $targetUrl -ErrorAction Stop
    $newDbName = $cacheSite.ContentDatabase.Name + "_OWACache"

    # Create a new content database and relocate the cache to it. Make sure the
    # user knows what's happening each step of the way.
    Write-Host "- creating a new content database ..."
    $cacheDb = New-SPContentDatabase -Name $newDbName -WebApplication $targetUrl -ErrorAction Stop
    Write-Host "- moving the Office Web Apps cache ..."
    Move-SPSite $cacheSite -DestinationDatabase $cacheDb -Confirm:$false -ErrorAction Stop
    Write-Host "- performing required IISRESET ..."
    iisreset | Out-Null

    # Let the user know where the cache is now located
    Write-Host "Cache successfully relocated to the '$newDbName' database."

    # Abort script processing in the event an exception occurs. A trap applies
    # to the whole function scope regardless of where it is declared.
    trap
    {
        Write-Warning "`n*** Script execution aborting. See below for problem encountered during execution. ***"
        # $_ is an ErrorRecord; the message text lives on its Exception property.
        $_.Exception.Message
        break
    }
}

# Launch script
RelocateCache $targetUrl
[/code]

Tip #2: Review Size and Expiration Settings

When an Office_Viewing_Service_Cache site collection is provisioned within a Web application by the Office Web Apps Cache Creation timer job, it is initially configured to hold cached document resources for 30 days. As mentioned in Tip #1, a cache site collection can also grow to a maximum of 100GB by default.

Whether or not these default settings are appropriate for a Web application depends primarily upon the nature of the site collections housed within the Web application. When site collections contain primarily static documents or content that changes infrequently, it makes sense to allow the cache to grow larger and expire content less often than normal. This maximizes the benefit obtained from caching since document content turns over less frequently.

On the other hand, site collections that experience frequent document turnover and heavy collaboration traffic tend to benefit very little from large cache sizes and long expiration periods. In site collections of this nature, cached content tends to become stale quickly. Little benefit is derived from holding onto document resources that may only be good for days or even hours, so it makes sense to reduce the maximum cache size and shorten the expiration period.

Tip #3: Give Yourself Some Warning

Since each Office Web Apps cache is a Team Site like any other site collection, you can leverage standard SharePoint site collection features and capabilities to help you out. One such mechanism that can be of assistance is the ability to have an e-mail warning sent to site collection owners once a site collection’s size hits a predefined threshold. In the case of the Office Web Apps cache, such a warning could be a cue to increase the maximum size of the cache site collection or perhaps lower the expiration period for document resources housed within the site collection.

Like the maximum cache size setting described in Tip #2, the ability to send e-mail warnings once the cache reaches a threshold is actually tied to SharePoint’s site collection quota capabilities. The maximum size of the cache site collection is handled as a storage quota, and the warning threshold maps directly to the quota’s warning threshold as shown below in Figure 4. In the case of Figure 4, a maximum cache size of 50GB is in effect for the cache site collection, and the e-mail warning threshold is set for 25GB.

Figure 4: Quota settings for an Office Web Apps cache site collection

Knobs and Dials

Tips #2 and #3 discussed some of the more straightforward Office Web Apps cache settings that are available to you, but you might be wondering how you actually go about changing them.

The AdjustOwaCache.ps1 PowerShell script that appears below provides you with an easy way to review and change the settings discussed. Simply save the script, execute it, and supply the URL of the Web application containing the Office Web Apps cache you’d like to adjust. The script will show you the cache’s current settings and give you the opportunity to modify them.

[code language="powershell"]
<#
.SYNOPSIS
AdjustOwaCache.ps1
.DESCRIPTION
Dumps several common OWA cache settings to the console for a selected Web application and provides a mechanism for altering those values
.NOTES
Author: Sean McDonough
Last Revision: 08-June-2011
.PARAMETER targetUrl
A Web application where Office Web Apps are in use
.EXAMPLE
AdjustOwaCache.ps1 http://www.TargetWebApplication.com
#>
param
(
    [string]$targetUrl = "$(Read-Host 'Target Web application URL [e.g. http://hostname]')"
)

function AdjustCache($targetUrl)
{
    # Ensure that the SharePoint cmdlets are loaded before continuing
    $spCmdlets = Get-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
    if ($null -eq $spCmdlets)
    { Add-PSSnapin Microsoft.SharePoint.PowerShell }

    # Create an easy converter for GB to bytes
    $GBtoBytes = 1024 * 1024 * 1024

    # Get a reference to the cache site collection and extract the values we'll be
    # working with and (potentially) altering.
    $cacheSite = Get-SPOfficeWebAppsCache -WebApplication $targetUrl -ErrorAction Stop
    $wacSize = $cacheSite.Quota.StorageMaximumLevel / $GBtoBytes
    $wacWarn = $cacheSite.Quota.StorageWarningLevel / $GBtoBytes
    $wacExpire = 30
    if ($cacheSite.RootWeb.Properties.ContainsKey("waccacheexpirationperiod"))
    { $wacExpire = $cacheSite.RootWeb.Properties["waccacheexpirationperiod"] }
    Write-Host "Current OWA cache values for '$targetUrl'"
    Write-Host "- Maximum Cache Size (GB): $wacSize"
    Write-Host "- Warning Threshold (GB): $wacWarn"
    Write-Host "- Expiration Period (Days): $wacExpire"

    # Give the user the option to make changes.
    $yesOrNo = Read-Host "Would you like to change one or more values? [y/n]"
    if ($yesOrNo -eq "y")
    {
        [Int64]$newWacSize = Read-Host "- Maximum Cache Size (GB)"
        Write-Host "- Warning Threshold (GB)"
        [Int64]$newWacWarn = Read-Host " (supply 0 for no warning)"
        [int]$newWacExpire = Read-Host "- Expiration Period (Days)"

        # Convert GB values to bytes and set the cache
        $newWacSize = ($newWacSize * $GBtoBytes)
        $newWacWarn = ($newWacWarn * $GBtoBytes)
        Set-SPOfficeWebAppsCache -WebApplication $targetUrl -ExpirationPeriodInDays $newWacExpire -MaxSizeInBytes $newWacSize -WarningSizeInBytes $newWacWarn -ErrorAction Stop
    }

    # Abort script processing in the event an exception occurs. A trap applies
    # to the whole function scope regardless of where it is declared.
    trap
    {
        Write-Warning "`n*** Script execution aborting. See below for problem encountered during execution. ***"
        # $_ is an ErrorRecord; the message text lives on its Exception property.
        $_.Exception.Message
        break
    }
}

# Launch script
AdjustCache $targetUrl
[/code]
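If you already know the values you want and would rather skip the interactive prompts, the same cmdlet can be called directly. The sketch below applies the quota values shown in Figure 4 (50GB maximum, 25GB warning) and assumes it is run from the SharePoint 2010 Management Shell on a farm where the Office Web Apps are installed. The URL is a placeholder.

```powershell
# Assumes a SharePoint 2010 farm with the Office Web Apps installed, run from
# the SharePoint 2010 Management Shell. The URL is a placeholder.
$targetUrl = 'http://www.TargetWebApplication.com'

# PowerShell expands the GB suffix to bytes (50GB = 53687091200).
Set-SPOfficeWebAppsCache -WebApplication $targetUrl -MaxSizeInBytes 50GB -WarningSizeInBytes 25GB
```

Adding -ExpirationPeriodInDays to the same call would adjust the expiration period as well.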

Conclusion

The Office Web Apps are a powerful addition to SharePoint 2010 and pave the way for greater collaboration on Office documents without the need for the Microsoft Office suite of client applications. The Office Web Apps cache is an important part of the larger Office Web Apps equation, and the cache is generally pretty good about taking care of itself. As shown in this article, though, it is still a good idea to relocate the cache from its default location. At the same time, a little bit of tuning and e-mail alerting can go a long way towards ensuring that the cache operates optimally for you in your environment.

Caching, You Ain’t No Friend Of Mine

I love caching and all that it can do to boost performance, but caching for SharePoint in the cloud isn’t the same as it is on-premises. In this post, I explore why that is for Object Caching – and what you can do about it.

I’m a big fan of leveraging caching to improve performance. If you look over my blog, you’ll find quite a few articles that cover things like implementing BLOB caching within SharePoint, working with the Object Cache, extending your own code with caching options, and more. And most of those posts were written in a time when the on-premises SharePoint farm was king.

The “caching picture” began shifting when we started moving to the cloud. SharePoint Online and hosted SharePoint services aren’t the same as SharePoint on-premises, and the things we rely upon for performance improvements on-premises don’t necessarily have our backs when we move out to the cloud.

Yeah, I’m talking about caching here. And as much as it breaks my heart to say it, caching – you ain’t no friend of mine out in SharePoint Online.

Why the heartbreak?

To understand why a couple of SharePoint’s traditional caching mechanisms aren’t doing you any favors in a multi-tenant service like SharePoint Online (with or without Office 365), it helps to first understand how memory-based caching features – like SharePoint’s Object Cache – work in an on-premises environment.

On-Premises

The typical on-premises environment has a small number of web front-ends (WFEs) serving content to users, and the number of site collections being served-up is relatively limited. For purposes of illustration, consider the following series of user requests to an environment possessing two WFEs behind a load balancer:

On-Premises Request Results

Assuming the WFEs have just been rebooted (or the application pools backing the web applications for the target site collection have just been recycled) – a worst-case scenario – the user in Request #1 is going to hit a server (either #1 or #2) that does not have cached content in its Object Cache. For this example, we’ll say that the user is directed to WFE #1. Responses from WFE #1 will be slower as SharePoint works to generate the content for the user and populate its Object Cache. The WFE will then return the user’s response, and as a result of the request, its Object Cache will contain site collection-specific content such as navigational sitemaps, Content Query Web Part (CQWP) query results, common site property values, any publishing page layouts referenced by the request, and more.

The next time the farm receives a request for the same site collection (Request #2), there’s a 50/50 shot that the user will be directed to a WFE that has cached content (WFE #1, shown in green) or doesn’t yet have any cached content (WFE #2). If the user is directed to WFE #1, bingo – a better experience should result. Let’s say the user gets unlucky, though, and hits WFE #2. The same process as described earlier (for WFE #1) ensues, resulting in a slower response to the user but a populated Object Cache on WFE #2.

By the time we get to Request #3, both WFEs have at least some cached content for the site collection being visited and should thus return responses more quickly. Assuming memory pressure remains low, these WFEs will continue to serve cached content for subsequent requests – until content expires out of the cache (forcing a re-fetch and fill) or gets forced out for some reason (again, memory pressure or perhaps an application pool recycle).

Another thing worth noting with on-premises WFEs is that many SharePoint administrators use warm-up scripts and services in their environments to make the initial requests that are described (in this example) by Request #1 and Request #2. So, it’s possible in these environments that end-users never have to start with a completely “cold” WFE and make the requests that come back more slowly (but ultimately populate the Object Caches on each server).

SharePoint Online

Let’s look at the same initial series of interactions again. Instead of considering the typical on-premises environment, though, let’s look at SharePoint Online.

Cloud

The first thing you may have noticed in the diagrams above is that we’re no longer dealing with just two WFEs. In a SharePoint Online tenant, the actual number of WFEs is a variable number that depends on factors such as load. In this example, I set the number of WFEs to 50; in reality, it could be lower or (in all likelihood) higher.

Request #1 proceeds pretty much the same way as it did in the on-premises example. None of the WFEs have any cached content for the target site collection, so the WFE needs to do extra work to fetch everything needed for a response, return that information, and then place the results in its Object Cache.

In Request #2, one server has cached content – the one that’s highlighted in green. The remaining 49 servers don’t have cached content. So, in all likelihood (49 out of 50, or 98%), the next request for the same site collection is going to go to a different WFE.
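To put rough numbers on that comparison, here is a minimal Python sketch. It assumes the load balancer picks a WFE uniformly at random and that nothing is ever evicted from a cache; both are simplifications, and the function names are my own illustrations rather than anything SharePoint exposes.

```python
import random


def cold_hit_probability(num_wfes: int, warm_wfes: int) -> float:
    """Chance that a uniformly load-balanced request lands on a WFE
    that has no cached content for the target site collection."""
    return (num_wfes - warm_wfes) / num_wfes


def simulate_cold_requests(num_wfes: int, num_requests: int, seed: int = 0) -> int:
    """Count how many requests land on a cold WFE.

    Assumption: a WFE stays warm forever once it has served one request
    (no expiration, no memory pressure).
    """
    rng = random.Random(seed)
    warm = set()
    cold_requests = 0
    for _ in range(num_requests):
        wfe = rng.randrange(num_wfes)  # hypothetical uniform load balancer
        if wfe not in warm:
            cold_requests += 1
            warm.add(wfe)
    return cold_requests
```

Under these assumptions, `cold_hit_probability(2, 1)` gives the 50/50 shot from the on-premises example, while `cold_hit_probability(50, 1)` gives the 98% figure for the 50-WFE example.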

By the time we get to Request #3, we see that another WFE has gone through the fetch-and-fill operation (again, highlighted in green). But, there’s something else worth noting that we didn’t see in the on-premises environment; specifically, the previous server which had been visited (in Request #1) is now red, not green. What does this mean? Well, in a multi-tenant environment like SharePoint Online, WFEs are serving-up hundreds and perhaps thousands of different site collections for each of the residents in the SharePoint environment. Object Caches do not have infinite memory, and so memory pressure is likely to be a much greater factor than it is on-premises – meaning that Object Caches are probably going to be ejecting content pretty frequently.

If the Object Cache on a WFE is forced to eject content relevant to the site collection a user is trying to access, then that WFE is going to have to do a re-fetch and re-fill just as if it had never cached content for the target site collection. The net effect, as you might expect, is longer response times and potentially sub-par performance.
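The ejection behavior can be illustrated with a toy cache. This is only a sketch: I am assuming a least-recently-used (LRU) eviction policy and a capacity measured in whole site collections, neither of which is a documented detail of the real Object Cache, and the tenant names are made up.

```python
from collections import OrderedDict


class TinyObjectCache:
    """Toy LRU stand-in for one WFE's Object Cache (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity          # how many site collections fit
        self._entries = OrderedDict()     # oldest entry first

    def request(self, site_collection: str) -> bool:
        """Return True on a cache hit, False when a fetch-and-fill was needed."""
        if site_collection in self._entries:
            self._entries.move_to_end(site_collection)  # mark as recently used
            return True
        if len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)            # eject the oldest entry
        self._entries[site_collection] = "sitemaps, CQWP results, ..."
        return False


cache = TinyObjectCache(capacity=3)
visits = ["contoso", "contoso", "tenant-a", "tenant-b", "tenant-c", "contoso"]
results = [cache.request(site) for site in visits]
```

In this run, the second `contoso` request is a hit, but by the time `contoso` is requested again, three other tenants have pushed it out, so the final request is a miss and the WFE has to re-fetch and re-fill.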

The Take-Away

If there’s one point I’m trying to make in all of this, it’s this: you can’t assume that the way a SharePoint farm operates on-premises is going to translate to the way a SharePoint Online farm (or any other multi-tenant farm) is going to operate “out in the cloud.”

Is there anything you can do? Sure – there’s plenty. As I’ve tried to illustrate thus far, the first thing you can do is challenge any assumptions you might have about performance that are based on how on-premises environments operate. The example I’ve chosen here is the Object Cache and how it factors into the performance equation – again, in the typical on-premises environment. If you assume that the Object Cache might instead be working against you in a multi-tenant environment, then there are two particular areas where you should immediately turn your focus.

Navigation

By default, SharePoint site collections use structural navigation mechanisms. Structural navigation works like this: when SharePoint needs to render a navigational menu or link structure of some sort, it walks through the site collection noting the various sites and sub-sites that the site collection contains. That information gets built into a sitemap, and that sitemap is cached in the Object Cache for faster retrieval on subsequent requests that require it.

Without the Object Cache helping out, structural navigation becomes an increasingly less desirable choice as site hierarchies get larger and larger. Better options include alternatives like managed navigation or search-driven navigation; each option has its pros and cons, so be sure to read-up a bit before selecting an option.

Content Query Web Parts

When data needs to be rolled-up in SharePoint, particularly across lists or sites, savvy end-users turn to the CQWP. Since cross-list and cross-site queries are expensive operations, SharePoint will cache the results of such a query using – you guessed it – the Object Cache. Query results are then re-used from the Object Cache for a period of time to improve performance for subsequent requests. Eventually, the results expire and the query needs to be run again.
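The "cache for a while, then re-run" pattern looks roughly like the following Python sketch. It is a generic time-to-live cache, not SharePoint's implementation; the class name, the `run_query` callback, and the injectable clock are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ExpiringQueryCache:
    """Generic time-to-live cache for expensive query results (illustrative)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, query_key: str, run_query: Callable[[], Any]) -> Any:
        """Return cached results while they are fresh; otherwise re-run the query."""
        now = self._clock()
        entry = self._entries.get(query_key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]                    # fresh: skip the expensive query
        result = run_query()                   # missing or expired: run it again
        self._entries[query_key] = (now, result)
        return result
```

The benefit only materializes when repeated requests reach the same in-memory cache before the entry expires or is ejected, which is exactly what can't be counted on in a multi-tenant environment.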

So, what are users to do when they can’t rely on the Object Cache? A common theme in SharePoint Online and other multi-tenant environments is to leverage Search whenever possible. This was called out in the previous section on Navigation, and it applies in this instance, as well.

An alternative to the CQWP is the Content Search Web Part (CSWP). The CSWP operates somewhat differently than the CQWP, so it’s not a one-to-one direct replacement … but it is very powerful and suitable in most cases. Since the CSWP pulls its query results directly from SharePoint’s search index, it’s exceptionally fast – making it just what the doctor ordered in a multi-tenant environment.

Quick note (2/1/2016): Thanks to Cory Williams for reminding me that the CSWP is currently only available to SharePoint Online Plan 2 and other “Plan 3” (e.g., E3, G3) users. Many enterprise customers fall into this bucket, but if you’re not one of them, then you won’t find the CSWP for use in your tenant :-(

There are plenty of good resources online for the CSWP, and I regularly speak on it myself; feel free to peruse resources I have compiled on the topic (and on other topics).

Wrapping-Up

In this article, I’ve tried to explain how on-premises and multi-tenant operations are different for just one area in particular; i.e., the Object Cache. In the future, I plan to cover some performance watch-outs and work-arounds for other areas … so stay tuned!

Additional Reading and References

  1. MSDN: Navigation options for SharePoint Online
  2. MSDN: Using Content Search Web Part instead of Content Query Web Part to improve performance in SharePoint Online
  3. SharePoint Interface: Presentations and Materials

Do You Know What’s Going to Happen When You Enable the SharePoint BLOB Cache?

The SharePoint BLOB Cache can be a very powerful tool for use in improving farm performance and scalability, but some planning should take place before the BLOB Cache is enabled. In this post, I explain how end users can suffer if BLOB Cache planning isn’t performed. I also make some recommendations on how to configure the BLOB Cache to provide administrators with performance benefits that don’t come at the cost of a negative end user experience.

The topic of the SharePoint BLOB Cache and how it operates jumped back into the front of my brain recently given some conversations I’ve had and things I’ve seen (e.g., a promising CodePlex project called the SharePoint 2010 BlobCache Manager).

SharePoint PSA

"Just Do It" Post-It Note

This post is my way of doing something akin to a SharePoint public service announcement. I’ve recently seen some caching-related functionality and topics – especially the BLOB Cache – getting some real traction in different circles, and I think that the attention and love is generally a good thing. I am somewhat concerned, though, by the fact that the discussions and projects that have been surfacing don’t seem to say much beyond the Post-It on the right.

What do I mean by “Just do it”? Well, here’s the high-level summary of what I’ve been seeing people say, post, and practice with the SharePoint BLOB Cache:

  • The SharePoint BLOB Cache can lighten the load on your SQL Servers by caching BLOB (binary large object) data such as images, video, audio, CSS, etc., on your web front-ends (WFEs)
  • BLOB assets are then served directly from the WFEs. This prevents regular round trips from the WFEs to SQL Servers for every BLOB item needed, and this conserves network bandwidth and reduces SQL Server load.
  • To realize the benefits of the BLOB Cache, simply turn it on and you’re good to go. Nothing to it!

To be fair, I think that I’ve done a disservice by contributing to the perception that all you need to do to kick-start BLOB caching is change this web.config line …

[sourcecode language=”xml”]
<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$" maxSize="10" enabled="false" />
[/sourcecode]

… to this:

[sourcecode language=”xml”]
<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$" maxSize="10" enabled="true" />
[/sourcecode]

If you look closely, you’ll see that the only difference between the two XML elements is that the enabled attribute is changed from false to true in the second example.

As you might have guessed, I wouldn’t be writing this blog post if simply changing the BlobCache element’s enabled attribute to true didn’t cause potential problems.

The Small Print

Disclaimer text that includes some BLOB cache usage warnings

At the recent SPTechCon in San Francisco, I gave a five-minute lightning talk called Pushing SharePoint’s ‘Go Faster’ Button. It was a lighthearted look at SharePoint performance, and it focused on a couple of caching changes that could be easily implemented to improve SharePoint performance. One of the recommended changes was (surprise surprise) to simply “turn on” SharePoint’s BLOB Cache.

I only had five minutes to deliver the lightning talk, so I had to cram all of the disclaimers for what I was recommending into the legal style slide that appears on the left. Although the slide got a chuckle from the crowd (the print did look pretty small on-screen), I actually did invest some time in its warnings and watch-outs for anyone who wanted to go and dig them up later.

Of the two tips I delivered in the lightning talk, Tip #2 dealt with the SharePoint BLOB cache. I included a very specific warning in the “Disclaimer of Liability” aimed at those who sought to simply “set it and forget it.” The text of that warning read:

Failure to specify a max-age attribute in the BlobCache element of the web.config will result in the default value of 86,400 seconds (24 hours) being used. Use of a non-zero max-age attribute will result in the attachment of client-side cacheability headers to assets that are being BLOB cached, and such headers can result in BLOB assets being cached on the client beyond the duration of the current user session; such caching can easily result in "stale" BLOB resources being used from the client rather than newer ones being fetched from the WFE, so adjust max-age values carefully.

Put another way: if you simply enable the BLOB cache and do nothing else, your users may be getting a SharePoint behavior change that you hadn’t intended for them to have.
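As a quick way to see what a given BlobCache element implies, here is a small Python sketch that parses the element and reports the max-age that would be in effect. The 86,400-second default comes from the warning text above; the function name and the returned dictionary keys are my own invention for illustration.

```python
import xml.etree.ElementTree as ET

# Default taken from the disclaimer text: 86,400 seconds (24 hours).
DEFAULT_MAX_AGE_SECONDS = 86400


def effective_blob_cache_settings(element_xml: str) -> dict:
    """Report what a BlobCache element implies for client-side caching."""
    element = ET.fromstring(element_xml)
    enabled = element.get("enabled", "false").lower() == "true"
    raw_max_age = element.get("max-age")       # None when the attribute is absent
    max_age = DEFAULT_MAX_AGE_SECONDS if raw_max_age is None else int(raw_max_age)
    return {
        "enabled": enabled,
        "max_age": max_age,
        # Persistent client-side caching only matters when the cache is on
        # and the effective max-age is non-zero.
        "persistent_client_caching": enabled and max_age > 0,
    }
```

Feeding it an element with enabled="true" and no max-age attribute reports a max_age of 86400 with persistent client caching turned on, which is the "set it and forget it" outcome the warning describes.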

Why Did You Have To Bring Age Into This?

The sticking point with SharePoint’s default BlobCache element and attribute settings is that a max-age of 24 hours is assumed and used when the max-age attribute isn’t explicitly specified or set. What does that mean? I wrote a separate post a while back titled Client-Server Interactions and the max-age Attribute with SharePoint BLOB Caching, and that post addressed the effect that explicit and implicit max-age attribute value specifications have on BLOB Caching. I recommend checking out the post for the full background; for anyone who needs a quick summary, though, I can distill it down to two bullet points:

  • Enabling the BLOB Cache without specifying a max-age attribute means that BLOBs will be cached on both the WFEs in your farm and within users’ browser caches (through the use of Cache-Control HTTP headers).
  • In collaboration environments and anyplace else where BLOB assets may be edited or turn over frequently (within the course of a day), the default client-side caching behavior can mess with the UI/UX of your SharePoint site in all sorts of interesting ways.

What does this mean for the average user of SharePoint? Well, let me walk through a fictitious scenario with supporting detail – as told from the perspective of a SharePoint end user. If you already understand the problem, you’re short on time, and you want to get right to what I recommend, jump down to the “Recommendations Before You Enable the BLOB Cache” section.

Acme Online Goes Live!

Welcome to the Acme Corporation! The Acme Corporation recently completed a “webification” of its entire product catalog, and the end result is a publishing site collection that is implemented in SharePoint 2010. The site collection houses all of Acme’s products, and those products are available for the public to browse and order. Acme’s web content management team is responsible for maintaining the product catalog as it appears on the site, and that team is led by a crafty old fellow named Wile E. Coyote (who we’ll simply refer to as “Wiley” from here on out).

Wiley has many years of experience with Acme’s products and has tried nearly all of them personally; he’s something of a legend. He and his team worked diligently to get Acme’s products into SharePoint before the launch. Not all of the products made it into SharePoint before the launch, though, so a phased approach was taken to rolling out the entire catalog.

The Launch

A SharePoint article page featuring a bundle of dynamite

The first products that Wiley and his team worked to get into SharePoint were Acme’s line of explosives. To prepare for the launch of the new online catalog, Wiley wrote up an article on Acme’s top-selling “Bundle o’ Dynamite” product. The article featured a picture of the Bundle o’ Dynamite, along with some descriptive text about the product, how it operates, a few safety warnings, and a couple of other informational points. When Wiley finished, a mockup of the article page looked like the screenshot seen on the left.

A Fiddler trace of the first request for the dynamite article page

Unbeknownst to Wiley, the Acme product catalog site collection is served-up by one Web application through one zone (the Default zone) on one WFE. This means that all product catalog requests, whether they come from customers or Wiley’s team, go to one IIS site on one server. The first time that someone (or more specifically, someone’s browser) requests the article page that Wiley put together, a series of web requests are kicked-off to pull down the page content, images, scripts, CSS, and everything else needed to render the page in a browser. This series of interactions (captured using Fiddler) is shown on the top right.

A Fiddler trace of the second request for the dynamite article page

Subsequent requests for the same article page (within the context of a single browser session) will follow the series of interactions seen directly to the right. One thing that you may notice upon inspecting the Fiddler trace is that subsequent page requests result in fewer calls back to the server. This is because SharePoint applies per session caching to many of the items it passes back to the browser, and this caching (which is not the same as BLOB caching) removes the need for constant re-fetching of items that haven’t changed.

In both of the Fiddler traces above, the focus is on the newsarticleimage.jpg file – the file which houses a picture of the Bundle o’ Dynamite. The first time the browser requests the image within a session, a successful HTTP 200 response is returned to the browser along with the image. Also important to note is the Cache-Control header that comes back with the image:

[sourcecode language=”text”]
Cache-Control: private,max-age=0
[/sourcecode]

The private part of the Cache-Control header tells caches that the response is meant for a single user, so only the browser’s own cache (and not a shared cache such as a proxy) may store it. The max-age=0 portion says, in effect, that subsequent uses of the image by the browser (from its cache) should be validated with a call back to the WFE to ensure that the image hasn’t changed.

And that’s what is shown happening in the second Fiddler trace. When subsequent page requests attempt to use the image, a GET request from the browser is answered by the WFE with

[sourcecode language=”text”]
HTTP/1.1 304 NOT MODIFIED
[/sourcecode]

This response code tells the browser that the image hasn’t changed and that it’s safe to use the locally cached copy. If the image were to change, then an HTTP 200 would be returned instead and the new/updated version of the image would be sent to the browser.
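The server side of that validation exchange can be sketched in a few lines of Python. This is a simplification: I am assuming the browser sends back some version tag for its cached copy (in real HTTP this would be a validator such as an ETag or a last-modified date; I haven't verified which one SharePoint uses here), and the function and parameter names are hypothetical.

```python
from typing import Optional, Tuple


def answer_image_request(current_version: str,
                         cached_version: Optional[str]) -> Tuple[int, bool]:
    """Return (status_code, body_sent) for a GET that may carry a validator.

    cached_version is None when the browser has no cached copy to validate.
    """
    if cached_version is not None and cached_version == current_version:
        return 304, False   # unchanged: browser may reuse its cached copy
    return 200, True        # new or changed: send the full image
```

A first request (no cached copy) and a request made after the image changes both yield a 200 with the image body; only a request whose cached version still matches gets the lightweight 304.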

When the browser is closed, the locally cached copy of the image is flushed and the process begins anew the next time the browser opens.

Meep Meep

Not long after the launch of Acme’s online product catalog, customers began complaining that browsing the catalog was simply too slow. After some discussion, Management decided to bring in Roadrunner Consulting to assess the site and make suggestions that would improve performance.

Roadrunner’s team raced around (as they are wont to do), ran some tests, made some observations, and provided a list of suggestions. At the top of the list was “Implement SharePoint BLOB Caching.”

So, Acme’s SharePoint administrators jumped right in and turned on BLOB caching. Since the site is served up through a single IIS site (SharePoint zone), the admins set enabled=“true” in the BlobCache element of the site’s web.config file. No other changes were made to the BlobCache element.

So, what happened? Well, things got snappier! The administrators watching their back-end performance noticed that the file system on the WFE started to cache BLOBs that were being requested by users. Each request to the WFE for one of those BLOBs resulted in the BLOB being served back directly from the WFE without a round-trip to the SQL Server. Internal network bandwidth utilization dropped significantly, and the SQL Servers started breathing a bit easier. The administrators were most definitely happy with the change they’d made … and it was as easy as setting enabled=”true” in the BlobCache element of the web.config file. Talk about the greatest thing since sliced bread! Everyone exchanged a round of high-fives after the change was made, and talks of how the geeks would rise up to dominate the world resumed.

Dynamite Article Page - First Request with BLOB Caching enabled

So, how do things look on the client side after enabling the BLOB Cache? Well, when someone goes to retrieve Wiley’s article for the first time, the first browser request series for the page looks much like it did without the BLOB Cache enabled. See the Fiddler trace on the right.

There is one very important difference when retrieving items with the BLOB Cache enabled, though, and you have to look closely to see it. Do you see the Cache-Control HTTP header that is returned with the request for the newsarticleimage.jpg image? It’s different than it was before the BLOB Cache was enabled. Now it says

[sourcecode language=”text”]
Cache-Control: public, max-age=86400
[/sourcecode]

Whoa … what does this mean? Well, it means two important things. First, the public designation means that any cache, including shared caches such as proxies, may store the image; it is no longer restricted to the user’s own browser cache. Together with the max-age value described next, the cached copy can be re-used across sessions, so it won’t necessarily “go away” when the browser is closed.

Second, the max-age=86400 means that the image will continue to “live” in the browser’s cache for 86400 seconds, or 24 hours. For that period of time, the browser won’t even attempt to contact the WFE to see if the image has changed; it will just use the copy that it holds onto. Nothing short of a browser cache flush (which is manual intervention by the user) will change this behavior.

Dynamite Article Page - Subsequent page requests with BLOB Caching enabled

And that’s what we see with the Fiddler trace on the right. This trace represents what subsequent page requests look like for the next 24 hours. Notice that the newsarticleimage.jpg image doesn’t get re-requested or checked. There are no HTTP 304 response codes coming back, because the browser simply isn’t requesting the image; it’s using its cached copy.

Admittedly, the Fiddler trace will look a little different when the browser is closed and re-opened … but a re-fetch of the newsarticleimage.jpg file will not take place for a full 24 hours unless a user clears the browser cache.
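The difference between the two Cache-Control headers seen in the traces comes down to a freshness check in the browser. The Python sketch below models only that check (real browsers apply more rules than this), and the function names are mine.

```python
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Extract the max-age value (in seconds) from a Cache-Control header value."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            return int(value)
    return None


def browser_action(cache_control: str, age_seconds: int) -> str:
    """Decide what a browser does with a cached copy of the given age."""
    max_age = parse_max_age(cache_control)
    if max_age is not None and age_seconds < max_age:
        return "use cached copy"          # still fresh: no request is sent
    return "revalidate with server"       # stale: ask the WFE (200 or 304)
```

With `private,max-age=0`, every use of the image triggers a revalidation; with `public, max-age=86400`, a copy that is an hour old is used without any contact with the WFE, and that remains true until the copy is a full 24 hours old.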

What does this change in behavior mean for actual users of the site? Read on to find out …

Running Off the Edge of the Cliff

The corrected article page showing the TNT barrel

Shortly after the BLOB Cache changes were made, Wiley got an (unrelated) call from the Fulfillment Department. They were furious because they’d been getting all sorts of returns for the Bundle o’ Dynamite. The reason for the returns? It’s because Wiley put the wrong image in his article page!

Even though Acme sells a product called the “Bundle o’ Dynamite,” the actual product that ships is a barrel of TNT. Since the product image was wrong, customers were incorrectly concluding that they’d get several sticks of dynamite instead of a barrel, and this was rubbing many of them the wrong way. Who knew?

Wiley went out to SharePoint, checked the article that he wrote, and saw that he did indeed use a series of dynamite sticks for an image. The page should have actually appeared as it does in the screenshot that is above and to the left. After a quick facepalm, Wiley realized that he needed to make a change – and fast.

Wiley went out to the Publishing Images library for the site collection and uploaded a new version of the newsarticleimage.jpg image file – one that contained a barrel of TNT instead of a bundle of dynamite. He then browsed to the article page and did a refresh.

Nothing changed.

Wiley hit F5 in his browser. Still nothing changed.

Over the course of the hour that followed, Wiley grew increasingly bewildered and panicked as he tried in vain to get the new TNT barrel to show up on the article page. He uploaded the image several more times, closed and re-opened his browser, deleted and then reloaded the image, re-published and re-approved the actual article page, and even got the administrators to flush the SharePoint BLOB Cache. None of the actions made a difference.

The Coyote Never Wins

Why didn’t any of Wiley’s efforts make a difference? Because what Wiley didn’t understand was that there was nothing he could do short of flushing his cache that would prompt the browser to re-request the updated image. The browser started using the cached copy of the image after the first request Wiley made in the morning; i.e., the request to verify that the image on the page was incorrect as Fulfillment indicated. For another 24 hours (86400 seconds), the browser would continue to use the cached image.

Wiley’s image problem was just one of the potential issues that might surface as a result of the BLOB Cache change. It was also one of the more visible problems. In looking at the path attribute of the BlobCache element, you might have noticed some of the other file types that got cached by default – file types with js (JavaScript) and css (Cascading Style Sheets) extensions, for example. Any of those file types which were served from site collection lists and libraries would also be impacted by the “fetch once and use for 24 hours” behavior.

Recommendations Before You Enable the BLOB Cache

A frustrated end user

I hope the example featuring Wiley did an adequate job of explaining why I think that blindly turning on the BLOB Cache can be a bad thing for end users. Having seen first-hand what an improperly configured BLOB Cache can do to the user experience, I’d like to offer up a handful of suggestions based on my own experience.

1. Don’t just “enable” the BLOB Cache with its out-of-the-box (OOTB) default settings. There are a couple of OOTB settings that you should really think hard about changing. I mentioned the default max-age value you get if you don’t actually specify the attribute value. I’m going to talk more about that one in a bit. Also: do you really want the BLOB Cache using your system drive (C:) as its target location for cached files? Most admins I know aren’t particularly friendly with that idea, so relocate the BLOB Cache to another drive.

2. If your Web application has only one zone (i.e., the Default zone), strongly consider specifying a max-age attribute value of zero (max-age=”0”). Why do I say this? Because it avoids the situation I described with Wiley above, and it’s a compromise that gives administrators some of the performance boosts they seek without completely shafting users in the process.

Dynamite Article Page - max-age = 0 in effect

When the BLOB Cache is enabled and a max-age attribute value of 0 is explicitly specified, things change a bit. BLOB caching and offloading still happens on the WFEs, so administrators get the internal performance boosts they were probably seeking in the first place. On the other side of the equation (i.e., the “user side”), persistent client side caching ceases as shown on the left. Although the Cache-Control header still specifies public cacheability, the max-age=0 ensures that the browser will round-trip to the server each time it intends to use a locally cached resource to ensure that the most up-to-date copy of the resource is in the cache. This will keep users like Wiley from going off the deep end due to the wonky and inconsistent user experience that afflicts users who need to edit and proof a site that employs persistent client-side caching.

3. If you have a Web application that is extended to two or more zones, apply BLOB Cache settings that are appropriate for each zone. This is relatively common in public-facing SharePoint site collections and Web applications where anonymous access is in-use. In these particular scenarios, there are usually at least two SharePoint zones per Web application: an internal zone (typically the Default zone) through which editors and other users may authenticate to carry out content work, and an external zone (e.g., the Internet zone) which is set up for anonymous access and “external consumption.”

In this dual-zone scenario, it makes sense to configure each zone (IIS site) differently since usage patterns differ between zones. The BlobCache element in the web.config for the internal (Default) zone, for example, should probably be configured according to #2 (above – the one zone scenario with a max-age attribute value of zero). For the web.config that is used in the external zone, though, it may make sense to apply a non-zero max-age value for use with the BLOB Cache – especially since anonymous users aren’t (normally) content editors. A non-zero max-age means fewer trips (overall) to your WFEs from outside the LAN environment, and this helps to keep bandwidth utilization down on your Internet connection. There is still a risk that external users may see “stale” content, but the impact is generally more acceptable for straight viewers since they aren’t actively working on content.
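To make the per-zone idea concrete, here is a rough Python sketch that builds a BlobCache element for each zone. Everything specific in it is an assumption for illustration: the D: drive location, the shortened path expression, and the 86400-second external value are placeholders, not recommendations for your farm.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-zone choices: editors use the Default zone, so validate
# every time; anonymous visitors use the Internet zone, so allow a day.
ZONE_MAX_AGE_SECONDS = {"Default": 0, "Internet": 86400}


def blob_cache_element(location: str, path: str, max_age_seconds: int) -> str:
    """Build a BlobCache element string with an explicit max-age attribute."""
    element = ET.Element("BlobCache", {
        "location": location,
        "path": path,
        "maxSize": "10",
        "max-age": str(max_age_seconds),
        "enabled": "true",
    })
    return ET.tostring(element, encoding="unicode")


elements_by_zone = {
    zone: blob_cache_element(r"D:\BlobCache\14", r"\.(gif|jpg|png|css|js)$", max_age)
    for zone, max_age in ZONE_MAX_AGE_SECONDS.items()
}
```

Each generated string would still need to be placed into the web.config of the matching IIS site by hand (or by whatever deployment tooling you already trust); the sketch only shows that the two zones end up with different max-age values.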

4. Consider changing the path expression to restrict what goes into the BLOB Cache. The default path expression for SharePoint 2010’s BlobCache element looks like this:

[sourcecode language=”text”]
\.(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$
[/sourcecode]

Most administrators are savvy enough to add and remove file extensions from this expression as needed; for example, taking |wmv out of the path expression means that the BLOB Cache will no longer store and serve files with a .wmv extension. Adding and removing extensions really only scratches the surface of what can be done, though. The path attribute value is actually a regular expression, so the full power of regular expressions can be applied to select and exclude files for use with the BLOB Cache.

Suppose you want to explicitly control which images, videos, and other files (that match the list of extensions) end up in the BLOB Cache? Maybe you want to specially name files you intend to cache with an additional .cache extension before the actual file type extension (e.g., .gif). To accomplish this, you could change the path expression to this:

[sourcecode language=”text”]
\.cache\.(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$
[/sourcecode]

With this path expression, filenames like these would be included in the BLOB Cache:

  • SampleImage.cache.jpg
  • MyVideo.cache.wmv

… but anything without the additional .cache qualifier would get omitted, such as:

  • AnotherImage.jpg
  • ExcludeThisVideo.wmv

This is just a simple example, but hopefully it gives you an idea of what you could do with the path regular expression to control the contents of the BLOB Cache.
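If you want to try a path expression against sample file names before touching a web.config, a few lines of Python will do. Treat this as an approximation: SharePoint evaluates the expression with its own regular expression engine against request URLs, and I am assuming case-sensitive matching here, so the results are a sanity check rather than a guarantee.

```python
import re

# Extension list copied from the default SharePoint 2010 path expression.
EXTENSIONS = ("gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|"
              "asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv")

DEFAULT_PATH = re.compile(r"\.(" + EXTENSIONS + r")$")
CACHE_ONLY_PATH = re.compile(r"\.cache\.(" + EXTENSIONS + r")$")


def is_blob_cached(pattern: "re.Pattern[str]", filename: str) -> bool:
    """Return True when the file name matches the given path expression."""
    return pattern.search(filename) is not None


samples = ["SampleImage.cache.jpg", "MyVideo.cache.wmv",
           "AnotherImage.jpg", "ExcludeThisVideo.wmv"]
cache_only_results = {name: is_blob_cached(CACHE_ONLY_PATH, name) for name in samples}
```

With the .cache expression, only the first two sample names match; under the default expression all four would, because each ends in one of the listed extensions.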

Summing It Up

The SharePoint BLOB Cache is a powerful mechanism to improve farm performance and scalability, but it shouldn’t be turned on without some forethought and a couple of changes to the default BlobCache element attribute values.

If you are an administrator and have enabled the BLOB Cache with its default values, check with your users. They might have some feedback for you …

Additional Reading and Resources

  1. CodePlex: SharePoint 2010 BlobCache Manager
  2. Event: SPTechCon San Francisco 2012
  3. Prezi: Pushing SharePoint’s ‘Go Faster’ Button
  4. Blog Post: Client-Server Interactions and the max-age Attribute with SharePoint BLOB Caching
  5. Tool: Fiddler Web Debugging Proxy

SharePoint Summer Fun

This post covers my summer SharePoint activities, including a number of appearances at SharePoint Saturday events and SPUGs. I also talk about a few other tidbits, including an appearance on Microsoft’s Talk TechNet broadcast.

My family recently relocated from the west side of Cincinnati to the east side, and it’s been a major undertaking – as anyone who’s familiar with Jim Borgman’s comic series on the east and west sides of Cincinnati can appreciate. Between the move and some other issues, I had planned on taking it easy with SharePoint activities for a while.

Despite that goal, it seems I still have a handful of SharePoint-related things planned this summer. Here’s what’s going on.

Office Web Apps’ Cache Article

Idera SharePoint Smarts

As a product manager for Idera, I occasionally author articles for the company’s SharePoint Smarts e-newsletter. A couple of weeks back, I wrote an article titled Quick Tips for Managing the SharePoint 2010 Office Web Apps’ Cache. The article basically provides an overview of the Office Web Apps’ cache and how it can be maintained for optimal performance.

The main reason I’m calling the article out here (in my blog) is because I put together a couple of PowerShell scripts that I included in the article. The first script relocates the Office Web Apps’ cache site collection to a different content database for any given Web application. The second script displays current values for some common cache settings and gives you the opportunity to change them directly.

The scripts (and article contents) are helpful for anyone trying to manage the Office Web Apps in SharePoint 2010. Check them out!

Talk TechNet Appearance

On Wednesday, July 6th (tomorrow!), I’ll be on Talk TechNet with Keith Combs and Matt Hester. I’m going to be talking with Keith and Matt about SharePoint, disaster recovery, and anything else that they want to shoot the breeze about. 60 minutes seems like a long time, but I know how quickly it can pass once my mouth starts going …

Here’s the fun part (for you): the episode is presented live, and anyone who registers for the event can “call in” with questions, comments, etc. Feel free to call in and throw me a softball question … or heckle me, if that’s your style! Although I don’t know Keith personally (yet), I do know Matt – and knowing Matt, things will be lighthearted and lively.

Evansville SPUG

On Thursday the 7th (yeah, this is a busy week), I’ll be heading down to Evansville, Indiana, to speak at the Evansville user group. This is something that Rob Wilson and I have been discussing for quite some time, and I’m glad that it’s finally coming to fruition!

I’ll be presenting my SharePoint 2010 and Your DR Plan: New Capabilities, New Possibilities! session. The abstract reads as follows:

Disaster recovery planning for a SharePoint 2010 environment is something that must be performed to insure your data and the continuity of business operations. Microsoft made significant enhancements to the disaster recovery landscape with SharePoint 2010, and we’ll be taking a good look at how the platform has evolved in this session. We’ll dive inside the improvements to the native backup and restore capabilities that are present in the SharePoint 2007 platform to see what has been changed and enhanced. We’ll also look at the array of exciting new capabilities that have been integrated into the SharePoint 2010 platform, such as unattended content database recovery, SQL Server snapshot integration, and configuration-only backup and restore. By the time we’re done, you will possess a solid understanding of how the disaster recovery landscape has changed with SharePoint 2010.

It’ll be a bit of a drive from here to Evansville and back, but I’m really looking forward to talking shop with Rob and his crew on Thursday!

SharePoint Saturday New York City (SPSNYC)

I’ll be heading up to New York City at the end of the month to present at SharePoint Saturday New York City on July 30th. I’ll be presenting my SharePoint 2010 and Your DR Plan: New Capabilities, New Possibilities! session, and it should be a lot of fun.

Amazingly enough, the primary registration (400 seats) for the event “sold out” in a little over three days. Holy smokes – that’s fast! The event is now wait-listed, so if you haven’t yet signed up … you probably won’t get a spot :-(

CincySPUG

On August 4th, I’ll be heading back up to Mason, Ohio, to present for my friends at the Cincinnati SharePoint User Group. My presentation topic this time around will be “Caching-In” for SharePoint Performance. Here’s the abstract:

Caching is a critical variable in the SharePoint scalability and performance equation, but it’s one that’s oftentimes misunderstood or dismissed as being needed only in Internet-facing scenarios. In this session, we’ll build an understanding of the caching options that exist within the SharePoint platform and how they can be leveraged to inject some pep into most SharePoint sites. We’ll also cover some sample scenarios, caching pitfalls, and watch-outs that every administrator should know.

Like most of my presentations, this one started as a PowerPoint. I converted it over to Prezi format some time ago, and I’ve been having a lot of fun with it since. I hope the CincySPUG folks enjoy it, as well!

SharePoint Saturday The Conference (SPSTC)

If you haven’t heard of SharePoint Saturday The Conference yet, then the easiest way for me to describe it is this: it’s a SharePoint Saturday event on steroids. Instead of being just one Saturday, the event is three days long. Expected attendance is 2500 to 3000 people. It’s going to be huge.

I submitted a handful of abstracts for consideration, and I know that I’ll be speaking at the event. I just don’t know what I’ll be talking about at this point.  If you’re going to be in the Washington, DC area on August 11th through 13th, though, consider signing up for the conference!

SharePoint Saturday Columbus (SPSColumbus)

The 2nd SharePoint Saturday Columbus event will be held on August 20th, 2011, at the OCLC Conference Center in Columbus, Ohio. Registration is now open, and session submissions are being accepted through the end of the day tomorrow (7/6).

Along with Brian Jackett, Jennifer Mason, and Nicola Young, I’m helping to plan and execute the event on the 20th. I’m handling speaker coordination again this year – a role that I do enjoy! We’ve had a number of great submissions thus far; in the next week or so, we (the organizing committee) will be putting our heads together to make selections for the event. Once those selections have been made, I’ll be communicating with everyone who submitted a session.

If you live in Ohio and don’t find Columbus to be an exceptionally long drive, I encourage you to head out to the SharePoint Saturday site and sign up for the event. It’s free, and the training you’ll get will be well-worth the Saturday you spend!

Additional Reading and References

  1. Jim Borgman: East Side/West Side of Cincinnati comic series
  2. Company: Idera
  3. Article: Quick Tips for Managing the SharePoint 2010 Office Web Apps’ Cache
  4. Event: Talk TechNet Webcast, Episode 43
  5. Blog: Keith Combs
  6. Blog: Matt Hester
  7. User Group: Evansville SPUG site
  8. Blog: Rob Wilson
  9. Event: SharePoint Saturday New York City
  10. User Group: CincySPUG site
  11. Software/Service: Prezi
  12. Event: SharePoint Saturday The Conference
  13. Event: SharePoint Saturday Columbus
  14. Blog: Brian Jackett
  15. Blog: Jennifer Mason
  16. Twitter: Nicola Young

Client-Server Interactions and the max-age Attribute with SharePoint BLOB Caching

This post discusses how client-side caching and the max-age attribute work with SharePoint BLOB caching. Client-server request/response interactions are covered, and some max-age watch-outs are also detailed.

I first presented (in some organized capacity) on SharePoint’s platform caching capabilities at SharePoint Saturday Ozarks in June of 2010, and since that time I’ve consistently received a growing number of questions on the topic of SharePoint BLOB caching.  When I start talking about BLOB caching itself, the area that seems to draw the greatest number of questions and “really?!?!” responses is the use of the max-age attribute and how it can profoundly impact client-server interactions.

I’d been promising a number of people (including Todd Klindt and Becky Bertram) that I would write a post about the topic sometime soon, and recently decided that I had lollygagged around long enough.

Before I go too far, though, I should probably explain why the max-age attribute is so special … and even before I do that, we need to agree on what “caching” is and does.

Caching 101

Why does SharePoint supply caching mechanisms?  Better yet, why does any application or hardware device employ caching?  Generally speaking, caching is utilized to improve performance by taking frequently accessed data and placing it in a state or location that facilitates faster access.  Faster access is commonly achieved through one or both of the following mechanisms:

  • By placing the data that is to be accessed on a faster storage medium; for example, taking frequently accessed data from a hard drive and placing it into memory.
  • By placing the data that is to be accessed closer to the point of usage; for example, offloading files from a server that is halfway around the world to one that is local to the point of consumption to reduce round-trip latency and bandwidth concerns.  For Internet traffic, this scenario can be addressed with edge caching through a content delivery network such as that which is offered by Akamai’s EdgePlatform.

Oftentimes, data that is cached is expensive to fetch or computationally calculate.  Take the digits in pi (3.1415926535 …) for example.  Computing pi to 100 decimals requires a series of mathematical operations, and those operations take time.  If the digits of pi are regularly requested or used by an application, it is probably better to compute those digits once and cache the sequence in memory than to calculate it on-demand each time the value is needed.

Caching usually improves performance and scalability, and these ultimately tend to translate into a better user experience.
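The pi example can be sketched in a few lines of Python. This is only an illustration of the compute-once-and-cache idea, not anything SharePoint does internally; the function names are my own, and I'm assuming Machin's formula with integer arithmetic as the (deliberately expensive) computation and `functools.lru_cache` as the in-memory cache.

```python
from functools import lru_cache


def arctan_inv(x: int, unity: int) -> int:
    """Approximate arctan(1/x) scaled by `unity`, using the Taylor series with integers."""
    total = term = unity // x
    x_squared = x * x
    divisor = 3
    sign = -1
    while term:
        term //= x_squared
        total += sign * (term // divisor)
        divisor += 2
        sign = -sign
    return total


@lru_cache(maxsize=None)
def pi_digits(decimals: int) -> str:
    """Return pi truncated to `decimals` decimal places; results are cached in memory."""
    guard = 10  # extra digits to absorb rounding error in the series
    unity = 10 ** (decimals + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    digits = str(pi_scaled // 10 ** guard)
    return digits[0] + "." + digits[1:]


first = pi_digits(100)    # computed the hard way
second = pi_digits(100)   # served straight from the in-memory cache
print(first[:12])
print(pi_digits.cache_info())
```

The second call never touches the series computation at all, which is the whole point: pay the cost once, then serve the stored result.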

SharePoint and caching

Through its publishing infrastructure, SharePoint provides a number of different platform caching capabilities that can work wonders to improve performance and scalability.  Note that yes, I did say “publishing infrastructure” – sorry, I’m not talking about Windows SharePoint Services 3 or SharePoint Foundation 2010 here.

With any paid version of SharePoint, you get object caching, page output caching, and BLOB caching.  With SharePoint 2010 and the Office Web Applications, you also get the Office Web Applications Cache (for which I highly recommend this blog post written by Bill Baer).

Each of these caching mechanisms and options works to improve performance within a SharePoint farm by using a combination of the two mechanisms I described earlier.  Object caching stores frequently accessed property, query, and navigational data in memory on WFEs.  Basic BLOB caching copies images, CSS, and similar resource data from content databases to the file system of WFEs.  Page output caching piggybacks on ASP.NET page caching and holds SharePoint pages (which are expensive to render) in memory and serves them back to users.  The Office Web Applications Cache stores the output of Word documents and PowerPoint presentations (which is expensive to render in web-accessible form) in a special site collection for subsequent re-use.

Public-facing SharePoint

Each of the aforementioned caching mechanisms yields some form of performance improvement within the SharePoint farm by reducing load or processing burden, and that’s all well and good … but do any of them improve performance outside of the SharePoint farm?

What do I even mean by “outside of the SharePoint farm?”  Well, consider a SharePoint farm that serves up content to external consumers – a standard/typical Internet presence web site.  Most of us in the SharePoint universe have seen (or held up) the Hawaiian Airlines and Ferrari websites as examples of what SharePoint can do in a public-facing capacity.  These are exactly the type of sites I am focused on when I ask about what caching can do outside of the SharePoint farm.

For companies that host public-facing SharePoint sites, there is almost always a desire to reduce load and traffic into the web front-ends (WFEs) that serve up those sites.  These companies are concerned with many of the same performance issues that concern SharePoint intranet sites, but public-facing sites have one additional concern that intranet sites typically don’t: Internet bandwidth.

Even though Internet bandwidth is much easier to come by these days than it used to be, it’s not unlimited.  In the age of gigabit Ethernet to the desktop, most intranet users don’t think about (nor do they have to concern themselves with) the actual bandwidth to their SharePoint sites.  I can tell you from experience that such is not the case when serving up SharePoint sites to the general public.

So … for all the platform caching options that SharePoint has, is there anything it can actually do to assist with the Internet bandwidth issue?

Enter BLOB caching and the max-age attribute

As it turns out, the answer to that question is “yes” … and of course, it centers around BLOB caching and the max-age attribute specifically.  Let’s start by looking at the <BlobCache /> element that is present in every SharePoint Server 2010 web.config file.

BLOB caching disabled

[sourcecode language="xml"]
<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$" maxSize="10" enabled="false" />
[/sourcecode]

This is the default <BlobCache /> element that is present in all starting SharePoint Server 2010 web.config files, and astute readers will notice that the enabled attribute has a value of false.  In this configuration, BLOB caching is turned off and every request for BLOB resources follows a particular sequence of steps.  The first request in a browser session looks like this:

[Image: first request for a BLOB resource with BLOB caching disabled]

In this series of steps:

  1. A request for a BLOB resource is made to a WFE
  2. The WFE fetches the BLOB resource from the appropriate content database
  3. The BLOB is returned to the WFE
  4. The WFE returns an HTTP 200 status code and the BLOB to the requester

Here’s a section of the actual HTTP response from the server (step #4 above):

[sourcecode highlight="2"]
HTTP/1.1 200 OK
Cache-Control: private,max-age=0
Content-Length: 1241304
Content-Type: image/jpeg
Expires: Tue, 09 Nov 2010 14:59:39 GMT
Last-Modified: Wed, 24 Nov 2010 14:59:40 GMT
ETag: "{9EE83B76-50AC-4280-9270-9FC7B540A2E3},7"
Server: Microsoft-IIS/7.5
SPRequestGuid: 45874590-475f-41fc-adf6-d67713cbdc85
[/sourcecode]

You’ll notice that I highlighted the Cache-Control header line.  This line gives the requesting browser guidance on what it should and shouldn’t do with regard to caching the BLOB resource (typically an image, CSS file, etc.) it has requested.  This particular combination basically tells the browser that it’s okay to cache the resource for the current user, but the resource shouldn’t be shared with other users or outside the current session.

Since the browser knows that it’s okay to privately cache the requested resource, subsequent requests for the resource by the same user (and within the same browser session) follow a different pattern:

[Image: subsequent request for the same resource with BLOB caching disabled]

When the browser makes subsequent requests like this for the resource, the HTTP response (in step #2) looks different than it did on the first request:

[sourcecode]
HTTP/1.1 304 NOT MODIFIED
Cache-Control: private,max-age=0
Content-Length: 0
Expires: Tue, 09 Nov 2010 14:59:59 GMT
[/sourcecode]

A request is made and a response is returned, but the HTTP 304 status code indicates that the requested resource wasn’t updated on the server; as a result, the browser can re-use its cached copy.  Being able to re-use the cached copy is certainly an improvement over re-fetching it, but again: the cached copy is only used for the duration of the browser session – and only for the user who originally fetched it.  The requester also has to contact the WFE to determine that the cached copy is still valid, so there’s the overhead of an additional round-trip to the WFE for each requested resource anytime a page is refreshed or re-rendered.
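The validation round-trip can be modeled with a small Python sketch. To be clear about the assumptions: the `respond` function is hypothetical, I'm assuming the browser sends the ETag of its cached copy back in an If-None-Match request header (standard HTTP behavior, though the request headers aren't shown in my captures above), and real servers handle more cases than this (If-Modified-Since, weak validators, lists of ETags).

```python
from typing import Optional, Tuple


def respond(current_etag: str, body: bytes,
            if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Return (status code, payload) for a request that may carry a cached copy's ETag."""
    if if_none_match is not None and if_none_match == current_etag:
        # The cached copy is still valid: no payload, just the go-ahead to re-use it.
        return 304, b""
    # No cached copy (or a stale one): send the full resource.
    return 200, body


etag = '"{9EE83B76-50AC-4280-9270-9FC7B540A2E3},7"'
image = b"\xff\xd8 ...image bytes..."

first_status, first_payload = respond(etag, image, None)     # first request
second_status, second_payload = respond(etag, image, etag)   # revalidation

print(first_status, len(first_payload))
print(second_status, len(second_payload))
```

Notice that the second exchange saves the payload but not the trip: the request still has to reach the WFE before the browser learns that its copy is good.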

BLOB caching enabled

Even if you’re not a SharePoint administrator and generally don’t poke around web.config files, you can probably guess at how BLOB caching is enabled after reading the previous section.  That’s right: it’s enabled by setting the enabled attribute to true as follows:

[sourcecode language="xml"]
<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$" maxSize="10" enabled="true" />
[/sourcecode]

When BLOB caching is enabled in this fashion, the request pattern for BLOB resources changes quite a bit.  The first request during a browser session looks like this:

[Image: first request for a BLOB resource with BLOB caching enabled]

In this series of steps:

  1. A request for a BLOB resource is made to a WFE
  2. The WFE returns the BLOB resource from a file system cache

The gray arrow that is shown indicates that at some point, an initial fetch of the BLOB resource is needed to populate the BLOB cache in the file system of the WFE.  After that point, the resource is served directly from the WFE so that subsequent requests are handled locally for the duration of the browser session.

As you might imagine based on the interaction patterns described thus far, simply enabling the BLOB cache can work wonders to reduce the load on your SQL Servers (where content databases are housed) and reduce back-end network traffic.  Where things get really interesting, though, is on the client side of the equation (that is, the Requester’s machine) once a resource has been fetched.

What about the max-age attribute?

You probably noticed that a max-age attribute wasn’t specified in the default (previous) <BlobCache /> element.  That’s because the max-age is actually an optional attribute.  It can be added to the <BlobCache /> element in the following fashion:

[sourcecode language="xml"]
<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$" maxSize="10" enabled="true" max-age="43200" />
[/sourcecode]

Before explaining exactly what the max-age attribute does, I think it’s important to first address what it doesn’t do and dispel a misconception that I’ve seen a number of times.  The max-age attribute has nothing to do with how long items stay within the BLOB cache on the WFE’s file system.  max-age is not an expiration period or window of viability for content on the WFE.  Unlike many other caches, the server-side BLOB cache doesn’t expire items out based on their age.  New assets will replace old ones via a maintenance thread that regularly checks associated site collections for changes, but there’s no regular removal of BLOB items from the WFE’s file system BLOB cache simply because of age.  max-age has nothing to do with server-side operations.

So, what does the max-age attribute actually do then?  Answer: it controls information that is sent to requesters for purposes of specifying how BLOB items should be cached by the requester.  In short: max-age controls client-side cacheability.

The effect of the max-age attribute

max-age values are specified in seconds; in the case above, 43200 seconds translates into 12 hours.  When a max-age value is specified for BLOB caching, something magical happens with BLOB requests that are made from client browsers.  After a BLOB cache resource is initially fetched by a requester according to the previous “BLOB caching enabled” series of steps, subsequent requests for the fetched resource look like this for a period of time equal to the max-age:

[Image: subsequent request for a BLOB resource within the max-age window]

You might be saying, “hey, wait a minute … there’s only one step there.  The request doesn’t even go to the WFE?”  That’s right: the request doesn’t go to the WFE.  It gets served directly from local browser cache – assuming such a cache is in use, of course, which it typically is.

Why does this happen?  Let’s take a look at the HTTP response that is sent back with the payload on the initial resource request when BLOB caching is enabled:

[sourcecode highlight="2"]
HTTP/1.1 200 OK
Cache-Control: public, max-age=43200
Content-Length: 1241304
Content-Type: image/jpeg
Last-Modified: Thu, 22 May 2008 21:26:03 GMT
Accept-Ranges: bytes
ETag: "{F60C28AA-1868-4FF5-A950-8AA2B4F3E161},8pub"
Server: Microsoft-IIS/7.5
SPRequestGuid: 45874590-475f-41fc-adf6-d67713cbdc85
[/sourcecode]

The Cache-Control header line in this case differs quite a bit from the one that was specified when BLOB caching was disabled.  First, the use of public instead of private tells the receiving browser or application that the response payload can be cached and made available across users and sessions.  The response header max-age attribute maps directly to the value specified in the web.config, and in this case it basically indicates that the payload is valid for 12 hours (43,200 seconds) in the cache.  During that 12 hour window, any request for the payload/resource will be served directly from the cache without a trip to the SharePoint WFE.
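The browser's side of that decision can be approximated in Python. This is a simplified sketch under stated assumptions: `parse_cache_control` and `is_fresh` are hypothetical helpers, and I only look at max-age, whereas real browsers also weigh Expires headers, heuristics, and user-initiated refreshes.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a {directive: argument} dictionary."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip().lower()] = argument.strip() or None
    return directives


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """Return True if a cached copy of this age can be used without contacting the WFE."""
    max_age = parse_cache_control(cache_control).get("max-age")
    if max_age is None:
        return False  # simplification: without max-age, always go back to the server
    return age_seconds < int(max_age)


print(43200 / 3600)                                      # max-age expressed in hours
print(is_fresh("public, max-age=43200", 60 * 60))        # one hour after the fetch
print(is_fresh("public, max-age=43200", 13 * 60 * 60))   # thirteen hours after the fetch
print(is_fresh("private,max-age=0", 0))                  # the BLOB-caching-disabled header
```

With max-age=0, no copy is ever fresh, which is why the disabled case always costs a round-trip; with max-age=43200, the first twelve hours cost nothing.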

Implications that come with max-age

On the plus side, serving resources directly out of the client-side cache for a period of time can dramatically reduce requests and overall traffic to WFEs.  This can be a tremendous bandwidth saver, especially when you consider that assets which are BLOB cached tend to be larger in nature – images, media files, etc.  At the same time, serving resources directly out of the cache is much quicker than round-tripping to a WFE – even if the round trip involves nothing more than an HTTP 304 response to say that a cached resource may be used instead of being retrieved.

While serving items directly out of the cache can yield significant benefits, I’ve seen a few organizations get bitten by BLOB caching and excessive max-age periods.  This is particularly true when BLOB caching and long max-age periods are employed in environments where images and other BLOB cached resources are regularly replaced and changed-out.  Let me illustrate with an example.

Suppose a site collection that hosts collaboration activities for a graphic design group is being served through a Web application zone where BLOB caching is enabled and a max-age period of 43,200 seconds (12 hours) is specified.  One of the designers who uses the site collection arrives in the morning, launches her browser, and starts doing some work in the site collection.  Most of the scripts, CSS, images, and other BLOB assets that are retrieved will be cached by the user’s browser for the rest of the work day.  No additional fetches for such assets will take place.

In this particular scenario, caching is probably a bad thing.  Users trying to collaborate on images and other similar (BLOB) content are probably going to be disrupted by the effect of BLOB caching.  The max-age value (duration) in-use would either need to be dialed-back significantly or BLOB caching would have to be turned-off entirely.

What you don’t see can hurt you

There’s one more very important point I want to make when it comes to BLOB caching and the use of the max-age attribute: the default <BlobCache /> element doesn’t come with a max-age attribute value, but that doesn’t mean that there isn’t one in-use.  If you fail to specify a max-age attribute value, you end up with the default of 86,400 seconds – 24 hours.

This wasn’t always the case!  In some recent exploratory work I was doing with Fiddler, I was quite surprised to discover client-side caching taking place where previously it hadn’t.  When I first started playing around with BLOB caching shortly after MOSS 2007 was released, omitting the max-age attribute in the <BlobCache /> element meant that a max-age value of zero (0) was used.  This had the effect of caching BLOB resources in the file system cache on WFEs without those resources getting cached in public, cross-session form on the client-side.  To achieve extended client-side caching, a max-age value had to be explicitly assigned.

Somewhere along the line, this behavior was changed.  I’m not sure where it happened, and attempts to dig back through older VM images (for HTTP response comparisons) didn’t give me a read on when Microsoft made the change.  If I had to guess, though, it probably happened somewhere around service pack 1 (SP1).  That’s strictly a guess, though.  I had always gotten into the habit of explicitly including a max-age value – even if it was zero – so it wasn’t until I was playing with the BLOB caching defaults in a SharePoint 2010 environment that I noticed the 24 hour client-side caching behavior by default.  I then backtracked to verify that the behavior was present in both SharePoint 2007 and SharePoint 2010, and it affected both authenticated and anonymous users.  It wasn’t a fluke.

So watch out: if you don’t specify a max-age value, you’ll get 24 hour client-side caching by default!  If users complain of images that “won’t update” and stale BLOB-based content, look closely at max-age effects.
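If you want to check what a given <BlobCache /> element implies, a few lines of Python will do it. This is a sketch rather than an official tool: `effective_max_age` is a hypothetical helper, the 86,400 second fallback simply encodes the default behavior I observed and described above, and the sample elements use a shortened path expression.

```python
import xml.etree.ElementTree as ET

DEFAULT_MAX_AGE = 86400  # seconds (24 hours); the implicit default described in this post


def effective_max_age(blob_cache_xml: str) -> int:
    """Return the max-age (in seconds) that applies to a <BlobCache /> element."""
    element = ET.fromstring(blob_cache_xml)
    value = element.get("max-age")
    # A missing attribute does not mean "no client-side caching".
    return int(value) if value is not None else DEFAULT_MAX_AGE


implicit = r'<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|css|js)$" maxSize="10" enabled="true" />'
explicit = r'<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|css|js)$" maxSize="10" enabled="true" max-age="0" />'

print(effective_max_age(implicit) / 3600, "hours")
print(effective_max_age(explicit) / 3600, "hours")
```

The takeaway is the same as in the text: if you want no extended client-side caching, say so explicitly with a max-age of zero.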

An alternate viewpoint on the topic

As I was finishing up this post, I decided that it would probably be a good idea to see if anyone else had written on this topic.  My search quickly turned up Chris O’Brien’s “Optimization, BLOB caching and HTTP 304s” post which was written last year.  It’s an excellent read, highly informative, and covers a number of items I didn’t go into.

Throughout this post, I took the viewpoint of a SharePoint administrator who is seeking to control WFE load and Internet bandwidth consumption.  Chris’ post, on the other hand, was written primarily with developer and end-user concerns in mind.  I wasn’t aware of some of the concerns that Chris points out, and I learned quite a few things while reading his write-up.  I highly recommend checking out his post if you have a moment.

Additional Reading and References

  1. Event: SharePoint Saturday Ozarks (June 2010)
  2. Blog Post: We Drift Deeper Into the Sound … as the (BLOB Cache) Flush Comes
  3. Blog: Todd Klindt’s SharePoint Admin Blog
  4. Blog: Becky Bertram’s Blog
  5. Definition: lollygag
  6. Technology: Akamai’s EdgePlatform
  7. Wikipedia: Pi
  8. TechNet: Cache settings operations (SharePoint Server 2010)
  9. Bill Baer: The Office Web Applications Cache
  10. SharePoint Site: Hawaiian Airlines
  11. SharePoint Site: Ferrari
  12. W3C Site: Cache-Control explanations
  13. Tool: Fiddler
  14. Blog Post: Chris O’Brien: Optimization, BLOB caching and HTTP 304s

Manually Clearing the MOSS 2007 BLOB Cache

This post investigates manual flushing of the MOSS BLOB cache via file system deletion, why such flushes might be needed, and how they should be carried out. Some common troubleshooting questions (and answers to them) are also covered.

It’s a fact of life when dealing with many caching systems: for all the benefits they provide, they occasionally become corrupt or require some form of intervention to ensure healthy ongoing operation.  The MOSS Binary Large Object (BLOB) cache, or disk-based cache, is no different.

Is BLOB Cache Corruption a Common Problem?

In my experience, the answer is “no.”  The MOSS BLOB cache generally requires little maintenance and attention beyond ensuring that it has enough disk space to properly store the objects it fetches from the lists within the content databases housing your publishing site collections.

How Should a Flush Be Carried Out?

When corruption does occur or a cache flush is desired for any reason, the built-in “Disk Based Cache Reset” option is typically adequate for flushing the BLOB cache on a single server and single web application zone.  This option (circled in red on the page shown to the right) is exposed through the Site collection object cache menu item on a publishing site’s Site Collection Administration menu.  Executing a flush is as simple as checking the supplied checkbox and clicking the OK button at the bottom of the page.  When a flush is executed in this fashion, it affects only the server to which the postback occurs and only the web application through which the request is directed.  If a site collection is extended to multiple web applications, only one web application’s BLOB cache is affected by this operation.

Alternatively, my MOSS 2007 Farm-Wide BLOB Cache Flushing Solution (screenshot shown on the right) can be used to clear the BLOB cache folders associated with a target site collection across all servers in a farm and across all web applications (zones) serving up the site collection.  This solution utilizes a different mechanism for flushing, but the net effect produced is the same as for the out-of-the-box (OOTB) mechanism: all BLOB-cached files for the associated site collection are deleted from the file system, and the three BLOB cache tracking files for each affected web application (IIS site) are reset.

For more information on the internals of the BLOB Cache, the flush process, and the files I just mentioned, see my previous post entitled We Drift Deeper Into the Sound … as the (BLOB Cache) Flush Comes.

Okay, I Tried a Flush and it Failed.  Now What?

If the aforementioned flush mechanisms simply aren’t working for you, you’re probably staring down the barrel of a manual BLOB cache flush.  Just delete all of the files in the target BLOB cache folder (as specified in the web.config) and you should be good to go, right?

Wrong.

Jumping in and simply deleting files without stopping requests to the affected site collection (or rather, the web application/applications servicing the site collection) risks sending you down the road to (further) cache corruption.  This risk may be small for sites that see little traffic or are relatively small, but the risk grows with increasing request volume and site collection size.  Allow me to illustrate with an example.

The Context

Let’s say that you decided to manually clear the BLOB cache for a sizable publishing site collection that is heavily trafficked.  You go into the file system, find your BLOB cache folder (by default, C:\blobCache), open it up, select all files and sub-folders contained within, and press the <Delete> key on your keyboard.  Deletion of the BLOB cache files and sub-folders commences.

Deleting the sub-folders and files isn’t an instantaneous operation, though.  It takes some time.  While the deletion is taking place, let’s say that your MOSS publishing site collections are still up and servicing requests.  The web applications for which BLOB caching is enabled are still attempting to use the very folders and files currently being deleted.

The Race Condition

For the duration of the deletion, a race condition is in effect that can yield some fairly unpredictable results.  Consider the following possible execution sequence.  Note: this example is hypothetical, but I’ve seen results on multiple occasions that imply this execution sequence (or something similar to it).

  1. The deletion operation deletes one or more of the .bin files at the root of a web application’s BLOB cache folder.  These files are used by MOSS to track the contents of the BLOB cache, the number of times it was flushed, etc.
  2. A request for a resource that would normally be present in the BLOB cache arrives at the web server.  An attempted lookup for the resource in the BLOB cache folder fails because the .bin files are gone as a result of the actions taken in the last step.
  3. The absence of the .bin files kicks off some housekeeping.  Ultimately, a “fresh” set of .bin files is written out.
  4. The requested resource is fetched into the BLOB cache (sub-)folder structure and the .bin files are updated so that subsequent requests for the resource are served from the file system instead of the content database.
  5. The deletion operation, which has been running the whole time, deletes the file and/or folder containing the resource that was just fetched.

Once the deletion operation has concluded, the resource that was fetched in step #4 is still tracked in the BLOB cache’s dump.bin file, but as a result of step #5, the resource no longer actually exists in the BLOB cache file system.  Net effect: requests for such resources return HTTP 404 errors.

Since image files are the most common BLOB-cached resources, broken link images (for example, that nasty red “X” in place of an image in Internet Explorer) are shown for these tracked-but-missing resources.  No amount of browser refreshing brings the image back from the server; only an update to the image in the content database (which triggers a re-fetch of the affected resource into the BLOB cache) or another flush operation fixes the issue as long as BLOB caching remains enabled.
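The interleaving described above can be replayed with a toy model in Python. Everything here is a stand-in of my own invention rather than MOSS internals: a set named `tracked` plays the role of the .bin tracking files, a set named `on_disk` plays the role of the cached files, and the resource names are made up.

```python
def replay_manual_delete_race() -> set:
    """Replay the hypothetical interleaving and return tracked-but-missing resources."""
    tracked = {"logo.png", "banner.jpg"}   # stand-in for the .bin tracking files
    on_disk = {"logo.png", "banner.jpg"}   # stand-in for files in the BLOB cache folder

    # Step 1: the deletion removes the .bin tracking files first.
    tracked = None

    # Steps 2-3: a request arrives, the lookup fails, and a fresh (empty)
    # set of tracking files is written out.
    if tracked is None:
        tracked = set()

    # Step 4: the requested resource is re-fetched and tracked again.
    on_disk.add("logo.png")
    tracked.add("logo.png")

    # Step 5: the still-running deletion removes every cached file,
    # including the one that was just fetched.
    on_disk.clear()

    # Anything still tracked but no longer on disk ends up as an HTTP 404.
    return tracked - on_disk


print(sorted(replay_manual_delete_race()))  # prints ['logo.png']
```

The same set difference (tracked minus on disk) is a handy mental model when troubleshooting: the broken images are exactly the resources the cache believes it has but doesn't.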

Proper Manual Clearing

The key to avoiding the type of corruption scenario I just described is to ensure that requests aren’t serviced by the web application or applications that are tied to the BLOB cache.  Luckily, this is accomplished in a relatively straightforward fashion.

Before attempting either of the approaches I’m about to share, though, you need to know where (in the server file system) your BLOB cache root folder is located.  By default, the BLOB cache root folder is located at C:\blobCache; however, most conscientious administrators change this path to point to a data drive or non-system partition.

Location of BLOB cache root folder in web.config

If you are unsure of the location of the BLOB cache root folder containing resources for your site collection, it’s easy enough to determine it by inspecting the web.config file for the web application housing the site collection.  As shown in the sample web.config file on the right, the location attribute of the <BlobCache> element identifies the BLOB cache root folder in which each web application’s specific subfolder will be created.

Be aware that saving any changes to the web.config file will result in an application pool recycle, so it’s generally a good idea to review a copy of the web.config file when inspecting it rather than directly opening the web.config file itself.
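If you would rather not read the XML by eye, a short script can pull the attribute out of a copy of the file. The following is a minimal sketch that uses Python's standard XML parser; the function name and the sample document are illustrative assumptions, and the sample's surrounding <SharePoint> element is there only to give the <BlobCache> element a parent.

```python
import xml.etree.ElementTree as ET


def blob_cache_settings(web_config_xml: str) -> dict:
    """Return the attributes of the first <BlobCache> element, or {} if absent."""
    root = ET.fromstring(web_config_xml)
    element = root.find(".//BlobCache")
    return dict(element.attrib) if element is not None else {}


# Hypothetical, heavily trimmed web.config content used for illustration.
SAMPLE = r"""<configuration>
  <SharePoint>
    <BlobCache location="E:\MOSS\BLOB Cache" path="\.(gif|jpg|png|css|js)$" maxSize="10" enabled="true" />
  </SharePoint>
</configuration>"""

settings = blob_cache_settings(SAMPLE)
```

Here `settings["location"]` holds the BLOB cache root folder. Reading the text from a copy of web.config, rather than the live file, keeps the application pool from being touched.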

The Quick and Dirty Approach

When you just want to “get it done” as quickly as possible using the least number of steps, this is the process:

  1. Stop the World Wide Web Publishing Service on the target server.  This can be accomplished from the command line (via net stop w3svc) or the Services MMC snap-in (via Start –> Administrative Tools –> Services) as shown on the right.
  2. Once the World Wide Web Publishing Service stops, simply delete the BLOB cache root folder.  Ensure that the deletion operation completes before moving on to the next step.
  3. Restart the World Wide Web Publishing service (via Services or net start w3svc).

Though this approach is quick with regard to time and effort invested, it’s certainly “dirty,” coarse, and not without disadvantages.  First, using this approach prevents the web server from servicing *any* web requests for the duration of the operation.  This includes not only SharePoint requests, but requests for any other web site that may be served from the server.

Second, the “quick and dirty” approach wipes out the entire BLOB cache – not just the cached content associated with the web application housing your site collection (unless, of course, you have a single web application that hasn’t been extended).  This is the functional equivalent of trying to drive a nail with a sledgehammer, and it’s typically overkill in most production scenarios.
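For anyone who wants to script the three steps, here is a rough sketch. The net stop and net start commands come straight from the steps above; the function name and the dry_run switch are assumptions added for illustration, and the script would need to run from an elevated prompt on the web server itself.

```python
import shutil
import subprocess


def quick_and_dirty_flush(blob_cache_root: str, dry_run: bool = True) -> list:
    """Stop W3SVC, delete the BLOB cache root folder, then restart W3SVC.

    Returns the planned steps. With dry_run=True nothing is executed.
    """
    steps = [
        ("run", ["net", "stop", "w3svc"]),
        ("delete", blob_cache_root),
        ("run", ["net", "start", "w3svc"]),
    ]
    if dry_run:
        return steps

    subprocess.run(steps[0][1], check=True)
    try:
        # rmtree returns only after the deletion has finished.
        shutil.rmtree(blob_cache_root)
    finally:
        # Restart the service even if the deletion failed part-way.
        subprocess.run(steps[2][1], check=True)
    return steps


plan = quick_and_dirty_flush(r"C:\blobCache")
```

The default dry run only returns the plan, which makes it easy to review what would happen before passing dry_run=False.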

The Controlled (Granular) Approach

There is a less invasive alternative to the “Quick and Dirty” technique I just described, and it is the procedure I recommend for production environments and other scenarios where actions must be targeted and impact minimized.  The screenshots that follow are specific to IIS7 (Windows Server 2008), but the fundamental activities covered in each step are the same for IIS6 even if execution is somewhat different.

  1. Determine the IIS ID of the web application servicing the site collection for which the flush is being performed.  This is easily accomplished using the Internet Information Services (IIS) Manager (accessible through the Administrative Tools menu) as shown to the right.  If I’m interested in clearing the BLOB cache of a site collection that is hosted within the InternalHomeWeb (Default) web application, for example, the IIS site ID of interest is 1043653284.
  2. Determine the name of the application pool that is servicing the web application.  In IIS7, this is accomplished by selecting the web application (InternalHomeWeb (Default)) in the list of sites and clicking the Basic Settings… link under Edit Site in the Site Actions menu on the right-hand side of the window.  The dialog box that pops up clearly indicates the name of the associated application pool (as shown on the right, circled in red).  Note the name of the application pool for the next step.
  3. Stop the application pool that was located in the previous step.  This will shut down the web application and prevent MOSS from serving up requests for the site collections housed within the web application, thus avoiding the sort of race condition described earlier.  If multiple application pools are used to partition web applications within different worker processes, then shutting down the application pool is “less invasive” than stopping the entire World Wide Web Publishing Service as described in “The Quick and Dirty Approach.”  If all (or most) web applications are serviced by a single application pool, though, then there may be little functional benefit to stopping the application pool.  In such a case, it may simply be easier to stop the World Wide Web Publishing Service as described in “The Quick and Dirty Approach.”
  4. Open Windows Explorer and navigate to the BLOB cache root folder.  For the purposes of this example, we’ll assume that the BLOB cache root folder is located at E:\MOSS\BLOB Cache.  Within the root folder should be a sub-folder with a name that matches the IIS site ID determined in step #1 (1043653284).  Either delete the entire sub-folder (E:\MOSS\BLOB Cache\1043653284), or select the files within the sub-folder and delete them (as shown above).
  5. Once the deletion has completed, restart the application pool that was shut down in step #3.  If the World Wide Web Publishing Service was shut down instead, restart it.

Taking the approach just described affects the fewest number of cached resources necessary to ensure that the site collection in question (or rather, its associated web application/applications) starts with a “clean slate.”  If web applications are partitioned across multiple application pools, then this approach also restricts the resultant service outage to only those site collections ultimately being served by the application pool being shut down and restarted.
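The deletion in step #4 is the part most worth scripting, since a typo in the path could wipe out the wrong folder. Below is a minimal sketch of that one step, assuming the application pool has already been stopped by hand; the function name is hypothetical and the numeric check is simply a guard against passing something other than an IIS site ID.

```python
import shutil
from pathlib import Path


def clear_site_cache(blob_cache_root: str, iis_site_id: str) -> Path:
    """Delete only the sub-folder belonging to one IIS site and return its path."""
    if not iis_site_id.isdigit():
        raise ValueError("IIS site IDs are numeric, got %r" % iis_site_id)
    target = Path(blob_cache_root) / iis_site_id
    if target.is_dir():
        # Removes the sub-folder and everything beneath it; sibling
        # sub-folders for other IIS sites are left untouched.
        shutil.rmtree(target)
    return target
```

With the example values from the steps above, the call would be `clear_site_cache(r"E:\MOSS\BLOB Cache", "1043653284")`, followed by restarting the application pool.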

Some Common Questions and Concerns

Q: I have multiple servers or web front-ends.  Do I need to take them all down and manually flush them as a group?

The BLOB cache on each MOSS server operates independently of other servers in the farm, so the answer is “no.”  Servers can be addressed one at a time and in any order desired.

Q: I’ve successfully performed a manual flush and brought everything back up, but I’m *still* seeing an old image/script/etc.  What am I doing wrong?

Interestingly enough, this type of scenario oftentimes has little to do with the actual server-side BLOB cache itself.

One of the attributes that can (and should) be configured when enabling the BLOB cache is the max-age attribute.  The max-age attribute specifies the duration of time, in seconds, that client-side browsers should cache resources that are retrieved from the MOSS BLOB cache.  Subsequent requests for these resources are then served directly out of the client-side cache and not made to the MOSS server until a duration of time (specified by the max-age attribute) is exceeded.

If a BLOB cache is flushed and it appears that old or incorrect resources (commonly images) are being returned when requested, it might be that the resources are simply cached on the local system and being returned from the cache instead of being fetched from the server.  Flushing locally-cached items (or deleting “Temporary Internet files” in Internet Explorer’s terminology) is a quick way to ensure that requests are being passed to the SharePoint server.
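The effect of max-age can be reasoned about with simple arithmetic. The sketch below is a simplified model of browser behavior (it ignores forced refreshes and other cache headers), and the function name and timestamps are illustrative assumptions.

```python
from datetime import datetime, timedelta


def served_from_browser_cache(fetched_at: datetime, max_age_seconds: int,
                              now: datetime) -> bool:
    """True while the browser may reuse its local copy without asking the server."""
    return now - fetched_at < timedelta(seconds=max_age_seconds)


fetched = datetime(2010, 1, 1, 12, 0, 0)
one_day = 86400  # a max-age of one day, expressed in seconds
```

With a one-day max-age, a resource fetched at noon is still answered locally 23 hours later, no matter how many times the server-side BLOB cache has been flushed in the meantime.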

Q: I’m running into problems with a manual deletion.  Sometimes all files within the cache folder can’t be deleted, or sometimes I run into strange files that have a size of zero bytes.  What’s going on?

I haven’t seen this happen too often, but when I have seen it, it’s been due to problems with (or corruption in) the underlying file system.  If regular CHKDSK operations aren’t scheduled for the drive housing the BLOB cache, it’s probably time to set them up.

Additional Reading and References

  1. MSDN: Caching In Office SharePoint 2007
  2. CodePlex: MOSS 2007 Farm-Wide BLOB Cache Flushing Solution
  3. Blog Post: We Drift Deeper Into the Sound … as the (BLOB Cache) Flush Comes

MOSS Object Cache Memory Tuning is not an Intuitive Process

This post discusses the process of tuning the memory allocation for the Object Cache that is used by MOSS publishing sites. It includes some warnings regarding the “Publishing cache hit ratio” performance counter, and it describes the counter-intuitive use of the

I’ve been meaning to do a small write-up on a couple of key Object Cache points, but other things kept trumping my desire to put this post together.  I finally found the nudge I needed (or rather, gave myself a kick in the butt) after discussing the topic a bit with Andrew Connell following a presentation he gave at a SharePoint Users of Indiana user group meeting.  Thanks, Andrew!

A Brief Bit of Background

As I may have mentioned in a previous post, I’ve spent the bulk of the last two years buried in a set of Internet-facing MOSS publishing sites that are the public presence for my current client.  Given that my current client is a Fortune 50 company, it probably comes as no surprise when I say that the sites see quite a bit of daily traffic.  Issues due to poor performance tuning and inefficient code have a way of making themselves known in dramatic fashion.

Some time ago, we were experiencing a whole host of critical performance issues that ultimately stemmed from a variety of sources: custom code, infrastructure configuration, cache tuning parameters, and more.  It took a team of Microsoft experts, along with professionals working for the client, to systematically address each item and bring operations back to a “normal” state.  Though we ultimately worked through a number of different problem areas, one area in particular stood out: the MOSS Object Cache and how it was “tuned.”

What is the MOSS Object Cache?

The MOSS Object Cache is memory that’s allocated on a per-site collection basis to store commonly-accessed objects, such as navigational data, query results (cross-list and cross-site), site properties, page layouts, and more.  Object Caching should not be confused with Page Output Caching (which is an extension of ASP.NET’s built-in page caching capability) or BLOB Caching/Disk-Based Caching (which uses the local server’s file system to store images, CSS, JavaScript, and other resource-type objects).

Publishing sites make use of the Object Cache without any intervention on the part of administrators.  By default, a publishing site’s Object Cache receives up to 100MB of memory for use when the site collection is created.  This allocation can be seen on the Object Cache Settings site collection administration page within a publishing site:

Object Cache Settings Page

Note that I said that up to 100MB can be used by the Object Cache by default.  The size of the allocation simply determines how large the cache can grow in memory before item ejection, flushing, and possible compactions result.  The maximum cache size isn’t a static allocation, so allocating 500MB of memory, for example, won’t deprive the server of 500MB of memory unless the amount of data going into the cache grows to that level.  I’m taking a moment to point this out because I wasn’t (personally) aware of this when I first started working with the Object Cache.  This point also becomes relevant in a story I’ll be telling in a bit.

Microsoft’s TechNet site has an article that provides pretty good coverage of caching within MOSS (including the Object Cache), so I’m not going to go into all of the details it covers in this post.  I will make the assumption that the information presented in the TechNet article has been read and understood, though, because it serves as the starting point for my discussion.

Object Cache Memory Tuning Basics

The TechNet article indicates that two specific indicators should be watched for tuning purposes.  Those two indicators, along with their associated performance counters, are

  • Cache hit ratio (SharePoint Publishing Cache/Publishing cache hit ratio)
  • Object discard rate (SharePoint Publishing Cache/Total object discards)

The image below shows these counters highlighted on a MOSS WFE where all SharePoint Publishing Cache counters have been added to a Performance Monitor session:

Basic Publishing Tuning Performance Counters

According to the article, the Publishing cache hit ratio should remain above 90% and a low object discard rate should be observed.  This is good advice, and I’m not saying that it shouldn’t be followed.  In fact, my experience has shown Publishing cache hit ratio values of 98%+ are relatively common for well-tuned publishing sites possessing largely static content.

The “Dirty Little Secret” about the Publishing Cache Hit Ratio Counter

As it turns out, though, the Publishing cache hit ratio counter should come with a very large warning that reads as follows:

WARNING: This counter only resets with a server reboot. Data it displays has been aggregating for as long as the server has been up.

This may not seem like such a big deal, particularly if you’re looking at a new site collection.  Let me share a painful personal experience, though, that should drive home how important a point this really is.

I was attempting to do a little Object Cache tuning for a client to help free up some memory to make application pool recycles cleaner, and I was attempting to see if I could adjust the Object Cache allocations for multiple (about 18) site collections downward.  We were getting into a memory-constrained position, and a review of the Publishing cache hit ratio values for the existing site collections showed that all sites were turning in 99%+ cache hit ratios.  Operating under the (previously described) mistaken assumption that Object Cache memory was statically allocated, I figured that I might be able to save a lot of memory simply by adjusting the memory allocations downward.

Mistaken understanding in mind, I went about modifying the Object Cache allocation for one of the site collections.  I knew that we had some data going into the cache (navigational data and a few cross-list query result sets), so I figured that we couldn’t have been using a whole lot of memory.  I adjusted the allocation down dramatically (to 10MB) on the site collection and I periodically checked back over the course of several hours to see how the Publishing cache hit ratio fared.

After a chunk of the day had passed, I saw that the Publishing cache hit ratio remained at 99%+.  I considered my assumption and understanding about data going into the Object Cache to be validated, and I went on my way.  What I didn’t realize at the time was that the actual Publishing cache hit ratio counter value was driven by the following formula:

Publishing cache hit ratio = total cache hits / (total cache hits + total cache misses) * 100%

Note the pervasive use of the word “total” in the formula.  In my defense, it wasn’t until we engaged Microsoft and made requests (which resulted in many more internal requests) that we learned the formulas that generate the numbers seen in many of the performance counters.  To put it mildly, the experience was “eye opening.”

In reality, the site collection was far from okay following the tuning I performed.  It truly needed significantly more than the 10MB allocation I had given it.  If it were possible to reset the Publishing cache hit ratio counter or at least provide a short-term snapshot/view of what was going on, I would have observed a significant drop following the change I made.  Since our server had been up for a month or more, and had been doing a good job of servicing requests from the cache during that time, the sudden drop in objects being served out of the Object Cache was all but undetectable in the short-term using the Publishing cache hit ratio.

To spell this out even further for those who don’t want to do the math: a highly-trafficked publishing site like one of my client’s sites may service 50 million requests from the Object Cache over the course of a month.  Assuming that the site collection had been up for a month with a 99% Object Cache hit ratio, plugging the numbers into the aforementioned formula might look something like this:

Publishing cache hit ratio = 49500000 / (49500000 + 500000) * 100% = 99.0%

50 million Object Cache requests per month breaks down to about 1.7 million requests per day.  Let’s say that my Object Cache adjustment resulted in an extremely pathetic 10% cache hit ratio.  That means that of 1.7 million object requests, only 170000 of them would have been served from the Object Cache itself.  Even if I had watched the Publishing cache hit ratio counter for the entire day and seen the results of all 1.7 million requests, here’s what the ratio would have looked like at the end of the day (assuming one month of uptime):

Publishing cache hit ratio = 49670000 / (49670000 + 2030000) * 100% = 96.1%

Net drop: only about 2.9% over the course of the entire day!
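The same arithmetic can be checked in a few lines, keeping hits and misses as separate running totals. This is only a sketch using the illustrative figures from the example; the variable names are assumptions and nothing here reads real performance counters.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Hit ratio as a percentage: hits / (hits + misses) * 100."""
    return 100.0 * hits / (hits + misses)


# One month of uptime at a 99% hit ratio.
month_hits, month_misses = 49_500_000, 500_000

# One further day: 1.7 million requests at a 10% hit ratio.
day_hits, day_misses = 170_000, 1_530_000

# What the counter reports: everything since the last reboot.
cumulative = hit_ratio(month_hits + day_hits, month_misses + day_misses)

# What actually happened during the day being observed.
windowed = hit_ratio(day_hits, day_misses)
```

The cumulative value stays at roughly 96% while the windowed value is 10%, which is why a ratio computed over the observation window alone would be far more telling than the counter itself.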

Seeing this should serve as a healthy warning for anyone considering the use of the Publishing cache hit ratio counter alone for tuning purposes.  In publishing environments where server uptime is maximized, the Publishing cache hit ratio may not provide any meaningful feedback unless the sampling time for changes is extended to days or even weeks.  Such long tuning timelines aren’t overly practical in many heavily-trafficked sites.

So, What Happens When the Memory Allocation isn’t Enough?

In plainly non-technical terms: it gets ugly.  Actual results will vary based on how memory starved the Object Cache is, as well as how hard the web front-ends (WFEs) in the farm are working on average.  As you might expect, systems under greater stress or load tend to manifest symptoms more visibly than systems encountering lighter loads.

In my case, one of the client’s main sites was experiencing frequent Object Cache thrashing, and that led to spells of extremely erratic performance during times when flushes and cache compactions were taking place.  The operations I describe are extremely resource intensive and can introduce blocking behavior in the request pipeline.  Given the volume of requests that come through the client’s sites, the entire farm would sometimes drop to its knees as the Object Cache struggled to fill, flush, and serve as needed.  Until the problem was located and the allocation was adjusted, a lot of folks remained on-call.

Tuning Recommendations

First and foremost: don’t adjust the size of the Object Cache memory allocation downwards unless you’ve got a really good reason for doing so, such as extreme memory constraints or some good internal knowledge indicating that the Object Cache simply isn’t being used in any substantial way for the site collection in question.  As I’ve witnessed firsthand, the performance cost of under-allocating memory to the Object Cache can be far worse than the potential memory savings gained by tweaking.

Second, don’t make the same mistake I made and think that the Object Cache memory allocation is a static chunk of memory that’s claimed by MOSS for the site collection.  The Object Cache uses only the memory it needs, and it will only start ejecting/flushing/compacting the Object Cache after the cache has become filled to the specified allocation limit.

And now, for the $64,000-contrary-to-common-sense tip …

For tuning established site collections and the detection of thrashing behavior, Microsoft actually recommends using the Object Cache compactions performance counter (SharePoint Publishing Cache/Total number of cache compactions) to guide Object Cache memory allocation.  Since cache compactions represent the greatest threat to ongoing optimal performance, Microsoft concluded (while working to help us) that monitoring the Total number of cache compactions counter was the best indicator of whether or not the Object Cache was memory starved and in trouble:

Total number of cache compactions highlighted in Performance Monitor

Steve Sheppard (a very knowledgeable Microsoft Escalation Engineer with whom I worked and highly recommend) wrote an excellent blog post that details the specific process he and the folks at Microsoft assembled to use the Total number of cache compactions counter in tuning the Object Cache’s memory allocation.  I recommend reading his post, as it covers a number of details I don’t include here.  The distilled guidelines he presents for using the Total number of cache compactions counter basically break counter values into three ranges:

  • 0 or 1 compactions per hour: optimal
  • 2 to 6 compactions per hour: adequate
  • 7+ compactions per hour: memory allocation insufficient

In short: more than six cache compactions per hour is a solid sign that you need to adjust the site collection’s Object Cache memory allocation upwards.  At this level of memory starvation within the Object Cache, there are bound to be secondary signs of performance problems popping up (for example, erratic response times and increasing ASP.NET request queue depth).
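Turning two readings of the counter into one of the three ranges is straightforward. The sketch below assumes the counter is a running total, so the difference between two samples gives the compactions in between; the function names are hypothetical, and placing fractional rates in the lower band is an assumption of this sketch rather than part of the guidelines.

```python
def compactions_per_hour(count_start: int, count_end: int, hours: float) -> float:
    """Average compactions per hour between two samples of the running total."""
    return (count_end - count_start) / hours


def classify(rate: float) -> str:
    """Map a compactions-per-hour rate onto the three guideline ranges."""
    if rate < 2:
        return "optimal"        # 0 or 1 per hour
    if rate < 7:
        return "adequate"       # 2 to 6 per hour
    return "insufficient"       # 7 or more per hour


# Hypothetical samples taken four hours apart.
rate = compactions_per_hour(count_start=120, count_end=156, hours=4)
verdict = classify(rate)
```

With these sample values the rate works out to 9 compactions per hour, which lands in the range where the allocation should be raised.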

Conclusion

We were able to restore Object Cache performance to acceptable levels (and adjust our allocation down a bit), but we lacked good guidance and a quantifiable measure until the Total number of cache compactions performance counter came to light.  Keep this in your back pocket for the next time you find yourself doing some tuning!

Addendum

I owe Steve Sheppard an additional debt of gratitude for keeping me honest and cross-checking some of my earlier statements and numbers regarding the Publishing cache hit ratio.  Though the counter values persist beyond an IISReset, I had incorrectly stated that they persist beyond a reboot and effectively never reset.  The values do reset, but only after a server reboot.  I’ve updated this post to reflect the feedback Steve supplied.  Thank you, Steve!

Additional Reading and References

  1. User Group: SharePoint Users of Indiana
  2. Blog: Andrew Connell
  3. TechNet: Caching In Office SharePoint Server 2007
  4. Blog: Steve Sheppard

We Drift Deeper Into the Sound … as the (BLOB Cache) Flush Comes

This post investigates BLOB caching within MOSS and includes a discussion of how the BLOB cache is internally implemented, how flushing operations are carried out, and the differences between single-server (UI) and farm-wide flushes.

Most publishing site administrators have at least some degree of familiarity with the binary large object (BLOB) cache that is supplied by the MOSS platform, but trying to find information describing how it actually works its magic can be tough.  This post is an attempt to shed a bit of light on the structure, implementation, and operations of the BLOB cache.

Before going too far, though, I should apologize to the group Motorcycle for twisting the title and lyrics of one of their more popular trance songs (“As The Rush Comes”) for the purpose of this post.  I guess I simply couldn’t resist the opportunity to have a little (slightly juvenile) fun.

What is the MOSS BLOB Cache?

Also known as disk-based caching, BLOB caching is one of the three forms of caching supplied/supported by MOSS (not WSS) out-of-the-box (OOTB).  Simply put, the BLOB cache is a mechanism that allows MOSS to locally store “larger” list items (images, CSS, and more) within the file system of web front-ends (WFEs) so that these resources can be served to callers more efficiently than round-tripping to the content database each time a request for such a resource is received.

The rest of this post assumes that you’re familiar with the basics of the MOSS BLOB cache.  If you aren’t, I’d recommend checking out MSDN (“Caching In Office SharePoint 2007”) for a primer.

Some BLOB Cache Internals

Before discussing how flushes are carried out, it’s worth spending a few minutes talking about the internals of the BLOB cache.  Having an understanding of what’s going on “under the hood” helps when explaining some of the peculiarities I’ll be describing a little later in this post.

The MOSS BLOB caching mechanism is implemented primarily with the help of two types (classes) that live within the Microsoft.SharePoint.Publishing namespace: the BlobCache type and its associated BlobCacheEntry type.  Each BlobCache object possesses a dictionary that houses BlobCacheEntry instances, and each BlobCacheEntry object represents an SPListItem (SharePoint list item) object that is being stored (cached) in the local file system of the server.

The scope of any BlobCache instance is a single IIS web site, and this is no surprise given that the BlobCache is enabled and disabled through the following (default) entry in the SharePoint web site’s web.config file:

<BlobCache location="C:\blobCache" path="\.(gif|jpg|png|css|js)$" maxSize="10" enabled="false" />

As shown, BLOB caching is disabled by default.  Since BLOB caching is enabled and disabled via the web.config file, configuration and “awareness of operation” is largely a manual affair.  From within the SharePoint browser UI, it cannot be easily determined if BLOB caching is enabled or disabled in the same way that this information can be determined for page output caching and object caching.
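The path attribute in that entry is a regular expression, so it is easy to check which requests it would cover. The sketch below uses Python's re module as a stand-in for the .NET regular expression engine; the case-insensitive match and the function name are assumptions made for illustration.

```python
import re

# The default path expression from the web.config entry above.
DEFAULT_PATH_PATTERN = r"\.(gif|jpg|png|css|js)$"


def is_blob_cache_candidate(url_path: str,
                            pattern: str = DEFAULT_PATH_PATTERN) -> bool:
    """True if the requested path matches the BLOB cache path expression."""
    # Case-insensitive matching is an assumption of this sketch.
    return re.search(pattern, url_path, re.IGNORECASE) is not None
```

Checking a few paths shows that the default expression covers `/PublishingImages/logo.png` but not `/Pages/default.aspx`, and that a `.jpeg` extension is not covered unless it is added to the expression.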

This leads to another point that is also worth mentioning: though an Internet Information Services (IIS) web site and a SharePoint web application are fairly synonymous in the case of a single zone web application, the one-to-one equivalence breaks down when a web application is extended to multiple zones from within Central Administration.  In such an extended scenario, each zone (Default, Internet, Intranet, Extranet, and Custom) has its own IIS web site with its own web.config, so it is possible for BLOB caching to be enabled in one zone and disabled in another for the same site collections.  The URL used to access a site collection becomes important in this scenario.

Setting the Wheels in Motion

The <BlobCache /> section that resides within the web.config for an IIS web site is recognized and processed by the MOSS PublishingHttpModule type.  As its name implies, this type (which also resides in the Microsoft.SharePoint.Publishing namespace) is an HttpModule.  Being an HttpModule, the PublishingHttpModule must be present as a child of the <httpModules /> element within the web.config for an IIS web site in order to carry out its duties.  Under normal circumstances, MOSS takes care of this:

PublishingHttpModule Wire-Up

The PublishingHttpModule itself is responsible for coordinating a number of caching-related operations for MOSS (more than just BLOB caching), and these operations all begin when an instance of the PublishingHttpModule is initialized at the same time that IIS is setting up the SharePoint/ASP.NET application pipeline.  When IIS sets up this pipeline and the PublishingHttpModule.Init method is called, the following actions take place with regard to the BLOB cache:

  1. The site’s web.config configuration settings for the BLOB cache get read and processed.
  2. Assuming settings are found, the PublishingHttpModule creates a new BlobCache object instance to service the (IIS) web site.  This happens whether or not BLOB caching is actually enabled.  Put another way: all sites for which the PublishingHttpModule is active have a BlobCache object “assigned” to them whether that object is in use (enabled) or not.
  3. The BlobCache instance takes care of a number of startup housekeeping items like computing file paths, setting up internal dictionaries, and ensuring that a consistent and ready state is established to facilitate requests.
  4. Assuming all settings are consistent and valid, the BlobCache object instance registers itself with the hosting environment; it then spins up a separate (independent) thread to rehydrate saved settings (for cached objects), create indexes, and perform some additional startup activities.  This “maintenance thread” then stays alive to regularly perform background checks for things like flush requests, site changes, etc. – but only if BLOB caching is enabled within the web.config.  If BLOB caching isn’t enabled, no additional work is performed on the thread.
  5. Finally, the BlobCache instance’s RewriteUrl method is registered as a handler for the AuthorizeRequest event of the SharePoint application (HttpApplication) for which the pipeline was established.  Since the AuthorizeRequest event fires for each SharePoint web request prior to actual page processing, it gives the BlobCache instance a chance to inspect a requested URL and possibly do something with it – such as serve an object back from the disk-based BLOB cache instead of allowing the request to proceed through “normal channels” (which may involve database object lookup).

At the end of this process, a BlobCache object exists for all publishing sites (that is, sites where the PublishingHttpModule is active).  Again, this happens whether or not BLOB caching is actually enabled for the IIS site … though the BlobCache instance will only process requests (that is, perform useful actions in the RewriteUrl method) if it has been enabled to do so via the appropriate web.config setting.

BLOB Cache File System Structure

The following image illustrates the file system of a typical server that is implementing BLOB caching.  In the case of this server, the BLOB cache location has been set to E:\MOSS\BLOBCache within the web.config file of each IIS web site utilizing the cache:

Sample BLOB Cache File and Folder Structure

Within the E:\MOSS\BLOBCache folder are two subfolders named 748546212 and 1553899298.  Each of these folders houses BLOB cache content for a different IIS site; each web site for which BLOB caching is enabled ends up with its own folder.  The folder names (for example, 748546212) are nothing more than each web site’s ID value as assigned by IIS.  These ID values are readily visible within the Internet Information Services (IIS) Manager snap-in, making it easy to correlate folders with their associated IIS web sites.

Within each BLOB cache subfolder (web site folder) are three files that are maintained by MOSS; more specifically, they’re maintained by the BlobCache object instance servicing the web site.  These files are critical to the operation of the BLOB cache, and they (primarily) serve to persist critical BlobCache variables and state during application pool shutdowns (when the BlobCache object is destroyed):

  • change.bin: This file contains serialized change tokens (SPChangeToken) for objects being cached in the local file system.  These tokens allow the BlobCache maintenance thread to query the content source(s) and subsequently update the contents of the BLOB cache with any items that are identified as having changed since the last maintenance sweep.
  • dump.bin: This file contains a serialized copy of the BlobCache’s cache dictionary.  The dictionary maintains information for all objects being tracked and maintained by the BlobCache object; each key/value pair in the dictionary consists of a local file path (key) and its associated BlobCacheEntry (value).
  • flushcount.bin: This file contains nothing more than the serialized value of the cacheFlushCount for the BlobCache object.  Practically speaking, this value allows a BlobCache to determine if a flush has been requested while it was shut down.

In a properly functioning BLOB cache, these three .bin files will always be present.  If any of these files should become corrupt or be deleted, the BlobCache will execute a flush to remedy its inconsistent state.
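A quick way to spot a cache folder that is about to trigger such a flush is to check for the three files by name. This is a minimal sketch that only tests for presence (it cannot tell whether a file is corrupt); the function name is an assumption.

```python
from pathlib import Path

# The three state files described above.
REQUIRED_BIN_FILES = ("change.bin", "dump.bin", "flushcount.bin")


def missing_bin_files(site_cache_folder: str) -> list:
    """Return the names of any expected .bin files that are not present."""
    folder = Path(site_cache_folder)
    return [name for name in REQUIRED_BIN_FILES
            if not (folder / name).is_file()]
```

An empty list means all three files are present; anything else names the files whose absence would lead the BlobCache to flush itself.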

In a site where web requests have been processed and files have been cached, additional folders and files will be present in addition to the change.bin, dump.bin, and flushcount.bin files.  Additional folders (and subfolders) reflect the URL path hierarchy of the site being serviced by the BlobCache object.  The files within these (path) folders correspond one-to-one with list items (that is, BLOB assets) that have been requested, and the cached files themselves have the same name as their corresponding list items with the addition of a .cache extension.

As an example, consider a site collection that is located at http://www.myurl.com and for which BLOB caching is enabled.  If the BLOB cache is configured to cache JPEG images and a user requests http://www.myurl.com/PublishingImages/test.jpeg, we can expect two things once the request has completed:

  • The BLOB cache folder servicing the http://www.myurl.com site within the server’s file system will have a subfolder within it named PUBLISHINGIMAGES.
  • The PUBLISHINGIMAGES subfolder will have a file named TEST.JPEG.cache.

Small side note which may be evident: the BlobCache object creates all cache-resident paths and filenames (save for the .cache extension) in uppercase.
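The URL-to-file mapping described above can be modeled in a few lines. This is only a sketch in Python (not SharePoint code): the function name cache_path_for is hypothetical, the pairing of site ID 748546212 with http://www.myurl.com is borrowed from the earlier folder example purely for illustration, and URL-decoding of path segments is my assumption based on the folder names with spaces that appear later in this post.

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import unquote, urlsplit


def cache_path_for(url: str, cache_root: str, iis_site_id: int) -> PureWindowsPath:
    """Return the cache file path expected for a requested BLOB asset (illustrative model)."""
    # Server-relative path of the request, e.g. /PublishingImages/test.jpeg
    url_path = unquote(urlsplit(url).path)
    # Folder and file names are uppercased; the leading "/" is not a folder.
    parts = [p.upper() for p in PurePosixPath(url_path).parts if p != "/"]
    if not parts:
        raise ValueError("URL does not name a file")
    # The .cache extension is appended as-is (it is not uppercased).
    parts[-1] += ".cache"
    # One subfolder per IIS site, named after the IIS site ID.
    return PureWindowsPath(cache_root, str(iis_site_id), *parts)


print(cache_path_for("http://www.myurl.com/PublishingImages/test.jpeg",
                     r"E:\MOSS\BLOBCache", 748546212))
```

PureWindowsPath is used so that the sketch produces Windows-style paths regardless of the machine it runs on.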

What Are the Mechanics of a Flush?

The BlobCache can flush itself if it detects any internal problems (for example, one or more of its .bin files is missing or corrupt), but the process can also be requested by an external source or event.  The actual BLOB cache flush process is relatively straightforward and follows this progression (assuming the BLOB cache has a working folder; that is, it hasn’t somehow been deleted):

  1. The BlobCache acquires a writer lock for its working folder to prevent other operations during the flush that’s about to be conducted.
  2. The BlobCache attempts to move its working folder to a temporary location – a new folder identified by a freshly generated globally unique identifier (GUID) string – in preparation for the flush.
  3. If the previous folder move (to the temporary “GUID folder”) succeeded, the BlobCache attempts to delete the temporary folder.  If the previous move attempt failed, the BlobCache attempts an in-place deletion of the working folder.
  4. If the folder deletion attempt fails, the BlobCache waits two seconds before attempting the folder deletion operation once again.  If the deletion fails a second time, the BlobCache leaves the temporary folder (or the original folder if the folder move failed in step #2) alone and proceeds.
  5. The BlobCache performs internal housekeeping to clean up dictionaries, reset tracking variables, create a new BLOB cache subfolder (again, folder name is derived from the IIS site ID), and write out a new set of state files (change.bin, dump.bin, and flushcount.bin) to the folder.
  6. With everything cleaned up and ready to go, the BlobCache releases its Mutex writer lock and normal operations resume.
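The six steps above can be sketched as follows. This is a Python model of the progression, not the BlobCache’s actual code: the threading.Lock stands in for the writer lock, the temporary GUID folder is assumed to be a sibling of the working folder, and the state files are written as empty placeholders (the real files hold serialized state).

```python
import shutil
import threading
import time
import uuid
from pathlib import Path

STATE_FILES = ("change.bin", "dump.bin", "flushcount.bin")
_folder_lock = threading.Lock()  # stand-in for the BlobCache's writer lock


def _try_delete(folder: Path) -> bool:
    """Attempt to delete a folder tree; report success instead of raising."""
    try:
        shutil.rmtree(folder)
        return True
    except OSError:
        return False


def flush(working_folder: Path, retry_delay: float = 2.0) -> None:
    with _folder_lock:                                        # step 1: take the lock
        doomed = working_folder.with_name(str(uuid.uuid4()))  # step 2: GUID-named folder
        try:
            working_folder.rename(doomed)
        except OSError:
            doomed = working_folder                           # move failed: delete in place
        if not _try_delete(doomed):                           # step 3: first delete attempt
            time.sleep(retry_delay)                           # step 4: wait, then retry once
            _try_delete(doomed)                               # a second failure is tolerated
        working_folder.mkdir(parents=True, exist_ok=True)     # step 5: fresh folder ...
        for name in STATE_FILES:
            (working_folder / name).write_bytes(b"")          # ... and fresh state files
    # step 6: the lock is released when the with block exits
```

Moving the folder before deleting it means the cache can be recreated at the original path right away, even when the delete of the old content is slow or fails.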

Single-Server Flush Versus Farm-Wide Flush

I mentioned that an external source or event can request a flush.  A flush is typically requested in one of two ways:

  • A single-server flush can be requested from within the SharePoint browser UI via the Site Collection Administration column’s “Site collection object cache” link.
  • A farm-wide flush can be requested via STSADM.exe (note the qualifiers supplied by Maxime Bombardier at the bottom of the page) or with the help of a third-party tool like my MOSS 2007 Farm-Wide BLOB Cache Flushing Solution.

A single-server flush request is executed through the SharePoint browser UI on the ObjectCacheSettings.aspx application page.  The relevant portion of that page appears below:

The ObjectCacheSettings.aspx Page

A request that is made through the ObjectCacheSettings.aspx page results in a direct call to the BlobCache object servicing the associated IIS site (and working folder) on the server receiving the postback (flush) request.  Once the FlushCache call is made, the BlobCache carries out the flush as previously described.

A farm-wide flush request, on the other hand, is carried out in a very different fashion.  The following is a section of the BlobCacheFarmFlush.aspx page from the BlobCacheFarmFlush solution:

The BlobCacheFarmFlush.aspx Page

A farm-wide flush is executed by incrementing a custom property value (named blobcacheflushcount) on the target site collection’s parent SPWebApplication.  A change in this property value propagates to all servers since the affected SPWebApplication.Properties collection is updated and maintained in the SharePoint farm configuration database.  Each BlobCache object servicing a site collection under the affected SPWebApplication picks up the property change and carries out a flush on the working folder it is responsible for managing.

Request Mechanism Impact on Flush Process

As you might expect, the choice of flush request mechanism (single-server versus farm-wide) has a profound effect on what actually happens during the flush process.

Consider a MOSS farm that has two WFEs (MOSSWFE1 and MOSSWFE2) serving up page requests for a single site collection.  The site collection is exposed through an IIS web site on each server with a URL of http://internal.samplesite.com, and this URL is associated with the default web application.  The site collection is also exposed through a web application that has been extended to the Internet zone, and its IIS site has a URL of http://www.samplesite.com.  BLOB caching is enabled on both servers for each of the two IIS web sites, so a total of four working folders (2 servers * 2 sites) are in-play for BLOB caching purposes.  A (simplified) visual representation looks something like this:

MOSS Farm with Two WFEs

Each of the aforementioned IIS web sites is represented by circled numbers 1 through 4 in the diagram above, while the configuration database is represented by a circled number 5; I’ll be referring to these (numbers) in the descriptions that follow.  Pay attention, too, to the IDs for each of the two IIS sites on each server (748546212 for the Internet zone and 1553899298 for the default zone).

Single-Server Flush

Requesting a single-server flush via the SharePoint browser UI results in a request to (or rather, through) one site on one server.  Prior to such a request, let’s look at how the BLOB cache might appear on MOSSWFE1:

MOSSWFE1 BLOB Cache (Pre-Flush)

As you can see, the BLOB cache folders for both IIS sites on MOSSWFE1 (that is, #1 and #2 in the previous farm diagram) have cached items in them.  The http://www.samplesite.com (#1) site has a “MISCELLANEOUS SHOTS” subfolder (which will have one or more cached resources in it), and the internal.samplesite.com site (#2) has a “BRIAN HEATHERS WEDDING” subfolder (also with cached resources).

For the sake of discussion, let’s say that a single-server BLOB cache flush request is made against MOSSWFE1 through the site collection via #2 (the internal.samplesite.com site).  Once the flush has been executed, the BLOB cache folder structure would appear as follows:

MOSSWFE1 BLOB Cache (Post-Flush)

Notice that the “BRIAN HEATHERS WEDDING” subfolder is gone from the site with ID 1553899298 (internal.samplesite.com, or #2).  Further examination of the folder would also confirm that all .bin files had been reset – a clear sign that a flush had taken place.  The cache folder for the other site at 748546212 (http://www.samplesite.com, or #1), on the other hand, remains unchanged.  Each of the BLOB cache folders (#3 and #4) on MOSSWFE2 also remains unaffected.

A single-server flush, therefore, is not only restricted to a single server (MOSSWFE1 in this example), but it also impacts only the specific IIS site (or SharePoint zone) through which the flush request is made.  In the case of the example above, a site administrator requesting a BLOB cache flush through http://internal.samplesite.com has no impact whatsoever on any of the cached files for http://www.samplesite.com.

This can have significant implications in many Internet publishing scenarios where publicly facing sites (zones) only permit anonymous access for security reasons.  In such situations, no OOTB mechanism exists to actually permit a flush request for the public zone/site given that such a flush is a privileged operation available only to site collection administrators.

Thankfully, there is a way to address this problem …

Farm-Wide Flush

In a farm-wide flush, the point of origin for the change that initiates a flush is #5 – the farm configuration database.  As described earlier in this post, the blobcacheflushcount property on the SPWebApplication (web application) that houses the target site collection (in the case of the BlobCacheFarmFlush solution) is incremented.  When the property is incremented, the BlobCache instances servicing the IIS sites under the SPWebApplication detect the property value change and carry out a flush.

Examining the file system for sites #3 and #4 on MOSSWFE2 prior to a farm-wide flush, we might see the following folder structure:

MOSSWFE2 BLOB Cache (Pre-Flush)

Once a farm-wide flush has been executed via STSADM or through a tool like the BlobCacheFarmFlush solution, the BLOB cache area of the file system (for sites #3 and #4) on MOSSWFE2 would appear like this:

MOSSWFE2 BLOB Cache (Post-Flush)

A review of MOSSWFE1 would reveal the same file system changes; BLOB cache folders for #1 and #2 would also be reset.

Unlike the single-server BLOB cache flush via the SharePoint browser UI, a farm-wide flush impacts all WFEs in the farm serving up the site collection.  Arguably the more important (and non-obvious) difference, though, is that the farm-wide flush impacts all zones/IIS sites for the web application serving the site collection.  In the case of the example above, a farm-wide flush request through any of the available URLs on either server results in BLOB caches for #1, #2, #3, and #4 being flushed.  This tends to make a farm-wide flush the preferred flush mechanism for the publishing site example I cited earlier (where public access occurs through an anonymous-only zone/site).
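To make the difference in scope concrete, here is a small Python model of the four working folders from the farm diagram. It is only an illustration of which folders each mechanism touches; the class and function names are hypothetical, and the cached folder names on MOSSWFE2 are placeholders since the post does not name them.

```python
from dataclasses import dataclass, field


@dataclass
class BlobCacheFolder:
    server: str
    zone_url: str
    iis_site_id: int
    cached_items: set = field(default_factory=set)


def build_farm() -> dict:
    """Working folders keyed by their circled number in the farm diagram."""
    return {
        1: BlobCacheFolder("MOSSWFE1", "http://www.samplesite.com", 748546212,
                           {"MISCELLANEOUS SHOTS"}),
        2: BlobCacheFolder("MOSSWFE1", "http://internal.samplesite.com", 1553899298,
                           {"BRIAN HEATHERS WEDDING"}),
        3: BlobCacheFolder("MOSSWFE2", "http://www.samplesite.com", 748546212,
                           {"PLACEHOLDER A"}),
        4: BlobCacheFolder("MOSSWFE2", "http://internal.samplesite.com", 1553899298,
                           {"PLACEHOLDER B"}),
    }


def single_server_flush(farm: dict, server: str, zone_url: str) -> list:
    """Flush only the folder for the server and zone that received the request."""
    hit = [n for n, c in farm.items() if c.server == server and c.zone_url == zone_url]
    for n in hit:
        farm[n].cached_items.clear()
    return hit


def farm_wide_flush(farm: dict) -> list:
    """Flush every zone on every server; all four folders in this model belong
    to the same web application (default zone plus its Internet-zone extension)."""
    for c in farm.values():
        c.cached_items.clear()
    return sorted(farm)


print(single_server_flush(build_farm(), "MOSSWFE1", "http://internal.samplesite.com"))
print(farm_wide_flush(build_farm()))
```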

A Watch-Out with Farm-Wide Flush Requests

There is one additional point that should be made with regard to farm-wide flushes.  In order for a flush to take place on a WFE, the IIS application pool servicing the targeted web application must be running.  If the application pool isn’t running (hasn’t yet been started or perhaps has shut down due to a lack of requests), it will appear that the flush had “no effect” on the server.

The reason for this is relatively straightforward.  As described towards the beginning of this post, BlobCache object instances and their associated maintenance threads are created when IIS establishes a SharePoint pipeline (and SPHttpApplication) for request processing.  If this pipeline isn’t yet ready to service requests for a targeted web application (perhaps because the IIS worker process hasn’t started up or the application pool was recycled but not “primed”), then the SPWebApplication’s blobcacheflushcount property change won’t be detected at the time it is altered.  No maintenance thread = no property change detection = no flush.

Since the cacheFlushCount for each BLOB cache is serialized and tracked via the flushcount.bin file, though, detection of the web application’s flush property value change occurs as soon as the BlobCache object is instantiated at the time of pipeline setup.  The result is that a BLOB cache flush occurs as soon as the worker process or new application domain (and by extension, the BlobCache instance and its maintenance thread) spins up to begin servicing requests.
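The startup check can be pictured with a short Python model. Two things here are assumptions on my part: the on-disk format (a 4-byte integer, whereas the real flushcount.bin holds a .NET-serialized value) and the use of a simple inequality to decide that a flush was requested while the cache was offline. The function names are hypothetical.

```python
import struct
from pathlib import Path
from typing import Optional

FLUSH_COUNT_FILE = "flushcount.bin"


def write_flush_count(working_folder: Path, count: int) -> None:
    """Persist the flush count (hypothetical 4-byte little-endian format)."""
    (working_folder / FLUSH_COUNT_FILE).write_bytes(struct.pack("<i", count))


def read_flush_count(working_folder: Path) -> Optional[int]:
    """Return the persisted count, or None if the file is missing or corrupt."""
    try:
        (count,) = struct.unpack("<i", (working_folder / FLUSH_COUNT_FILE).read_bytes())
        return count
    except (OSError, struct.error):
        return None


def flush_needed_on_startup(working_folder: Path, web_app_flush_count: int) -> bool:
    """Compare the persisted count with the web application's property value."""
    persisted = read_flush_count(working_folder)
    # A missing or unreadable file is an inconsistent state, which also means a flush.
    return persisted is None or persisted != web_app_flush_count
```

Because the comparison happens when the cache object is created, a property change made while the application pool was stopped is not lost; it is simply acted upon later.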

Conclusion

It is my hope that this overview provides you with some insight into the internals of the MOSS BLOB cache, as well as a basis for understanding how flush mechanisms differ.  As always, I welcome any feedback or questions you might have.

Additional Reading and References

  1. MSDN: Caching In Office SharePoint 2007
  2. Microsoft Support: ASP.NET HTTP Modules and HTTP Handlers Overview
  3. MSDN: Object Caching
  4. CodePlex: MOSS 2007 Farm-Wide BLOB Cache Flushing Solution

The ApplyApplicationContentToLocalServer Method and Why It Comes Up Short

This post explores the SPWebService’s ApplyApplicationContentToLocalServer method, the constraints one faces when using it, and an alternative to its use when updating application page sitemap files.

Caching capabilities that are available (or exposed) through MOSS are something I spend a fair number of working hours focusing on.  MOSS publishing farms can make use of quite a few caching options, and wise administrators find ways to leverage them all for maximum scalability and performance. While helping a client work through some performance and scalability issues recently, I ran into some annoying problems with disk-based caching – also known as BLOB (Binary Large OBject) caching. These problems inspired me to create the BlobCacheFarmFlush solution that I’ve shared on CodePlex, and it was during the creation of this solution that I wrangled with the ApplyApplicationContentToLocalServer method.

Background

The BlobCacheFarmFlush solution itself has a handful of moving parts, and the element I’m going to focus on in this post is the administration page (BlobCacheFarmFlush.aspx) that gets added to the farm upon Feature activation.  In particular, I want to share some of the lessons I learned while figuring out how to get the page’s navigational (breadcrumb) support operating properly.

Unlike “standard” content pages that one might deploy through a SharePoint Feature or solution package, application pages (also called “layouts pages” because they go into the LAYOUTS folder within SharePoint’s 12 hive) don’t come with wired-up breadcrumb support.  An example of the type of breadcrumb to which I’m referring appears below (circled in red):

Application Page Breadcrumb Example

Unless additional steps are taken during the installation of your application pages (beyond simply placing them in the LAYOUTS folder), breadcrumbs like the one shown above will not appear.  It’s not that application pages (which derive from LayoutsBasePage or UnsecuredLayoutsBasePage) don’t include support for breadcrumbs – they do.  The reason breadcrumbs fail to show is because the newly added application pages themselves are not integrated into the sitemap files that describe the navigational hierarchy of the layouts pages.

Wiring Up Breadcrumb Support

Getting breadcrumbs to appear in your own application pages requires that you update the layouts sitemap files for each of the (IIS) sites serving up content on each of the SharePoint web front-end (WFE) servers in your farm.  The files to which I’m referring are named layouts.sitemap and appear in the _app_bin folder of each IIS site folder on the WFE.  An example of one such file (in its _app_bin folder) appears below.

A SharePoint Site's LAYOUTS SiteMap File

I’m a “best practices” kind of guy, so when I was doing research for my BlobCacheFarmFlush solution, I was naturally interested in trying to make the required sitemap modifications in a way that was both easy and supported.  It didn’t take much searching on the topic before I came across Jan Tielens’ blog post titled “Adding Breadcrumb Navigation To SharePoint Application Pages, The Easy Way.”  In his blog post, Jan basically runs through the scenario I described above (though in much greater detail than I presented), and he mentions that another reader (Brian Staton) turned him onto a very simple and straightforward way of making the required sitemap modifications.  I’ll refer you to Jan’s blog post for the specifics, but the two-step quick summary goes like this:

  1. Create a layouts.sitemap.*.xml file that contains your sitemap navigation additions and deploy it to the LAYOUTS folder within SharePoint’s 12 hive on a server.
  2. Execute code that implements one of the two approaches shown below (typically on Feature activation):
// Approach #1: Top-down starting at the SPFarm level
SPFarm.Local.Services.GetValue<SPWebService>().ApplyApplicationContentToLocalServer();

// Approach #2: Applying to the sites within an SPWebApplication
myWebApp.WebService.ApplyApplicationContentToLocalServer();

This isn’t much code, and it’s pretty clear that the magic rests with the ApplyApplicationContentToLocalServer method.  This method carries out a few operations, but the one in which we’re interested involves taking the new navigation nodes in the layouts.sitemap.*.xml file and integrating them into the layouts.sitemap file for each IIS site residing under a target SPWebService instance.  With the new nodes (which tie the new application pages into the navigational hierarchy) present within each layouts.sitemap file, breadcrumbs appear at the top of the new application pages when they are rendered.

I took this approach for a spin, and everything looked great!  My sitemap additions were integrated as expected, and my breadcrumb appeared on the BlobCacheFarmFlush.aspx page.  All was well … until I actually deployed my solution to its first multi-server SharePoint environment.  That’s when I encountered my first problem.

Problem #1: The “Local” Part of the ApplyApplicationContentToLocalServer Method

When I installed and activated the BlobCacheFarmFlush solution in a multi-server environment, the breadcrumbs failed to appear on my application page.  It took a little legwork, but I discovered that the ApplyApplicationContentToLocalServer method has “Local” in its name for a reason: the changes made through the method’s actions only impact the server on which the method is invoked.

This contrasts with the behavior that SharePoint objects commonly exhibit.  The changes that are made through (and to) many SharePoint types impact data that is actually stored in SQL Server, and changes made through any farm member get persisted back to the appropriate database and become available through all servers within the farm.  The ApplyApplicationContentToLocalServer method, on the other hand, carries out its operations directly against the files and folders of the server on which the method is called, and the changes that are made do not “automagically” appear on or through other farm members.

The Central Administration host server for the farm in which I was activating my Feature wasn’t one of the WFEs serving up my application page.  When I activated my Feature from within Central Admin, my navigation additions were incorporated into the affected sites on the local (Central Admin) host … but the WFEs serving up actual site pages (and my application page) were not updated.  Result: no breadcrumb on my application page.

This issue is one of those problems that wouldn’t normally be discovered in a typical development environment.  Most of the SharePoint developers I know do their work within a virtual machine (VM) of some sort, so it’s not until one moves out of such an environment and into a multi-server environment that this type of deployment problem even makes itself known.  This issue only serves to underscore how important it is to test Features and solutions in a typical target deployment environment before releasing them for general use.

Putting my thinking cap back on, I worked to come up with another way to integrate the sitemap changes I needed in a way that was multi-server friendly.  The ApplyApplicationContentToLocalServer method still seemed like a winner given all that it did for a single line of code; perhaps all I needed to do was create and run a one-time custom timer job (that is, schedule a custom SPJobDefinition subclass) on each server within the farm and have that timer job execute the ApplyApplicationContentToLocalServer method locally.

I whipped-up a custom timer job to carry out this action and took it for a spin.  That’s when I ran into my second problem.

Problem #2: Rights Required for ApplyApplicationContentToLocalServer Method Invocation

The documentation for the ApplyApplicationContentToLocalServer method ends with this one line:

Only local administrators can call this method.

Prior to the creation of the custom timer job that I was going to use to update the sitemap files on each of the WFEs, I had basically ignored this point.  The local administrator requirement quickly became a barricade for my custom timer job, though.

Timer jobs, both SharePoint-supplied and custom, are executed within the context of the SharePoint Timer Service (OWSTIMER.EXE).  The Timer Service runs in an elevated security context with regard to the SharePoint farm, but its privileges shouldn’t extend beyond the workings of SharePoint.  Though some SharePoint administrators mistakenly believe that the Timer Service account (also known as the “database access account” or “farm service account”) requires local administrator rights on each server within the SharePoint farm, Microsoft spells out that this is neither required nor recommended.

The ApplyApplicationContentToLocalServer method works during Feature activation when the activating user is a member of the Local Administrators group on the server where activation is taking place – a common scenario.  The process breaks down, however, if the method call occurs within the context of the SharePoint Timer Service account because it isn’t (or shouldn’t be) a member of the Local Administrators group.  Attempts to call the ApplyApplicationContentToLocalServer method from within a timer job fail and result in an “Access Denied” message being written to the Application Event Log.  A quick look at the first section of code inside the method itself (using Reflector) makes this point pretty clearly:

if (!SPAdministrationServiceUtilities.IsCurrentUserMachineAdmin())
{
    throw new SecurityException(SPResource.GetString("AccessDenied", new object[0]));
}

This revelation told me that the ApplyApplicationContentToLocalServer method simply wasn’t going to cut the mustard for my purposes unless I wanted to either (a) require that the Timer Service account be added to the Local Administrators group on each server in the farm, or (b) require that an administrator manually execute an STSADM command or custom command line application to carry out the method call.  Neither of these options was acceptable to me.

Method Deconstruction

Since I couldn’t use the ApplyApplicationContentToLocalServer method directly, I wanted to dissect it to the extent that I could in order to build my own process in a manner that replicated the method’s actions as closely as possible.  Performing the dissection (again via Reflector), I discovered that the method was basically iterating through each SPIisWebSite in each SPWebApplication within the SPWebService object being targeted.  As implied by its type name, each SPIisWebSite represents a web site within IIS – so each SPIisWebSite maps to a physical web site folder within the file system at C:\Inetpub\wwwroot\wss\VirtualDirectories (by default if IIS folders haven’t been redirected).

Once each of the web site folder paths is known, it isn’t hard to drill down a bit further to each layouts.sitemap file within the _app_bin folder for a given IIS web site.  With the fully qualified path to each layouts.sitemap file computed, it’s possible to carry out a programmatic XML merge with the new sitemap data from a layouts.sitemap.*.xml file that is deployed with a custom Feature or solution.  The ApplyApplicationContentToLocalServer method carries out such a merge through the private (and obfuscated) MergeAspSiteMapFiles method of the SPAspSiteMapFile internal type, but only after it has created a backup copy of the current layouts.sitemap file using the SPAspSiteMapFile.Copy method.
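The backup-then-merge idea can be sketched in Python. This is not a reconstruction of MergeAspSiteMapFiles (which I could only view in obfuscated form); it is a minimal model under stated assumptions: siteMapNode elements carry no XML namespace, each addition names its parent through a parentUrl attribute, URLs are matched case-insensitively, and the “.bak” backup name is hypothetical.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def layouts_sitemap_path(iis_site_folder: Path) -> Path:
    """layouts.sitemap lives in the _app_bin folder of each IIS site folder."""
    return iis_site_folder / "_app_bin" / "layouts.sitemap"


def merge_sitemap(layouts_sitemap: Path, additions: Path) -> int:
    """Merge nodes from a layouts.sitemap.*.xml file; return how many were added."""
    # Back up the current file before touching it (backup name is hypothetical).
    shutil.copy2(layouts_sitemap, layouts_sitemap.with_name(layouts_sitemap.name + ".bak"))

    tree = ET.parse(layouts_sitemap)
    # Index the existing nodes by lowercased URL so parents can be located.
    by_url = {n.get("url", "").lower(): n for n in tree.iter("siteMapNode")}

    added = 0
    for new in ET.parse(additions).getroot().iter("siteMapNode"):
        url = new.get("url", "").lower()
        parent_url = new.get("parentUrl")
        if not parent_url or url in by_url:
            continue  # no parent named, or node already merged
        parent = by_url.get(parent_url.lower())
        if parent is None:
            continue  # parent page is not in this sitemap
        attrs = {k: v for k, v in new.attrib.items() if k != "parentUrl"}
        by_url[url] = ET.SubElement(parent, "siteMapNode", attrs)
        added += 1

    tree.write(layouts_sitemap, encoding="utf-8", xml_declaration=True)
    return added
```

Skipping nodes whose URL is already present keeps the merge safe to run more than once, which matters if the same update is applied again on a server.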

The Solution

With an understanding of the process that is carried out within the ApplyApplicationContentToLocalServer method, I proceeded to create my own class that effectively executed the same set of steps.  The result was the UpdateLayoutsSitemapTimerJob custom timer job definition that is part of my BlobCacheFarmFlush solution.  This class mimics the enumeration of SPWebApplication and SPIisWebSite objects, the backup of affected layouts.sitemap files, and the subsequent XML sitemap merge of the ApplyApplicationContentToLocalServer method.  The class is without external dependencies (beyond the SharePoint object model), and it is reusable in its current form.  Simply drop the class into a SharePoint project and call its DeployUpdateTimerJobs static method with the proper parameters – typically from the FeatureActivated method of a custom SPFeatureReceiver.  The class then takes care of provisioning a timer job instance that will update the layouts.sitemap navigational hierarchy for affected sites on each of the servers within the farm.

As an aside: while putting together the UpdateLayoutsSitemapTimerJob, there were times when I thought I had to be missing something.  On a handful of occasions, I found myself thinking, “Certainly there had to be a multi-server friendly version of the ApplyApplicationContentToLocalServer method.”  When I didn’t find one (after much searching), I had the good fortune of stumbling upon Vincent Rothwell’s “Configuring the breadcrumb for pages in _layouts” blog post.  Vincent’s post predates my own by a hefty two and a half years, but in it he describes a process that is very similar to the one I eventually ended up implementing in my custom timer job.  Seeing his post helped me realize I wasn’t losing my mind and that I was on the right track.  Thank you, Vincent.

Conclusion

I can sum up the contents of this post pretty simply: when developing application pages that entail sitemap updates, avoid using the ApplyApplicationContentToLocalServer method unless you’re (a) certain that your Feature will be installed into single server environments only, or (b) willing to direct those doing the installation and activation to carry out some follow-up administration on each WFE in the SharePoint farm.

Why does the ApplyApplicationContentToLocalServer method exist?  I did some thinking, and my guess is that it is leveraged primarily when service packs, hotfixes, and other additions are configured via the SharePoint Products and Technologies Configuration Wizard.  Anytime a SharePoint farm is updated with a patch or hotfix, the wizard is run on each server by a local administrator.

An examination of the LAYOUTS folder on one of my farm members provided some indirect support for this notion.  In my LAYOUTS folder, I found the layouts.sitemap.search.xml file, and it was dated 3/25/2008.  I believe (I’m not positive) that this file was deployed with the SharePoint Infrastructure Updates in the middle of 2008, and those updates introduced a number of new search admin pages for MOSS.  Since the contents of the layouts.sitemap.search.xml file include quite a few new search-related navigation nodes, my guess is that the ApplyApplicationContentToLocalServer method was leveraged to merge the navigation nodes for the new search pages when the configuration wizard was run.

In the meantime, if you happen to find a way to use this method in a multi-server deployment scenario that doesn’t involve the configuration wizard, I’d love to hear about it!  The caveat, of course, is that it has to be a best-practices approach – no security changes, no extra manual work/steps for farm administrators, etc.

Additional Reading and References

  1. MSDN: Caching In Office SharePoint 2007
  2. CodePlex: MOSS 2007 Farm-Wide BLOB Cache Flushing Solution
  3. Jan Tielens: Adding Breadcrumb Navigation To SharePoint Application Pages, The Easy Way
  4. MSDN: SPWebService.ApplyApplicationContentToLocalServer Method
  5. TechNet: Plan for administrative and service accounts (Office SharePoint Server)
  6. Red Gate Software: .NET Reflector
  7. CodePlex: UpdateLayoutsSitemapTimerJob class
  8. Vincent Rothwell: Configuring the breadcrumb for pages in _layouts