The Threadripper

This post is me finally doing what I told so many people I was going to do a handful of weeks back: share the “punch list” (i.e., the parts list) I used to put together my new workstation. And unsurprisingly, I chose to build my workstation based upon AMD’s Threadripper CPU.

Getting Old

I make a living and support my family through work that depends on a computer, as I’m sure many of you do. And I’m sure that many of you can understand when I say that working on a computer day-in and day-out, one develops a “feel” for its performance characteristics.

While undertaking project work and other “assignments” over the last bunch of months, I began to feel like my computer wasn’t performing with the same “pep” that it once had. It was subtle at first, but I began to notice it more and more often – and that bugged me.

So, I attempted to uninstall some software, kill off some boot-time services and apps that were of questionable use, etc. Those efforts sometimes got me some performance back, but the outcome wasn’t sustained or consistent enough to really make a difference. I was seriously starting to feel like I was wading through quicksand anytime I tried to get anything done.

The Last Straw

There isn’t any one event that made me think “Jeez, I really need a new computer” – but I still recall the turning point for me because it’s pretty vivid in my mind.

I subscribe to the Adobe Creative Cloud. Yes, it costs a small fortune each year, and each time I pay the bill, I wonder if I get enough use out of it to justify the expense. I invariably decide that I do end up using it quite a bit, though, so I keep re-upping for another year. At least I can write it off as a business expense.

Well, I was trying to go through a recent batch of digital photos using Adobe Lightroom, and my system was utterly dragging. And whenever my system does that for a prolonged period, I hop over to the Windows Task Manager and start monitoring. And when I did that with Lightroom, this is what I saw:

Note the 100% CPU utilization in the image. Admittedly, RamboxPro looks like the culprit here, and it was using a fair bit of memory … but that’s not the whole story.

Since the start of this ordeal, I’ve become more judicious in how many active tabs I spin up in Rambox Pro. It’s a great utility, but like every Chromium-based tool, it’s an absolute pig when it comes to memory usage. Have you ever looked at your memory consumption when you have a lot of Google Chrome tabs open? That’s what’s happening with Rambox Pro. So be warned and be careful.

I’m used to the CPU spiking for brief periods of time, but the CPU sat pegged at 100% utilization for the duration that Lightroom was running – literally the entire time. And not until I shut down Lightroom did the utilization start to settle back down.

I thought about this for a while. I know that Adobe does some work to optimize/enhance its applications to make the most of systems with multiple CPU cores and symmetric multiprocessing when it’s available to the applications. The types of tasks most Adobe applications deal with are the sort that people tend to buy beefy machines for, after all: video editing, multimedia creation, image manipulation, etc.

After observing Lightroom and how it brought my processor to its knees, I decided to do a bit of research.

Research and Realization

At the time, my primary workstation was operating based on an Intel Core i7-5960X Extreme processor. When I originally built the system, there was no consumer desktop processor that was faster or had more cores (that I recall). Based on the (then) brand new Haswell E series from Intel, the i7-5960X had eight cores that each supported hyperthreading. It had an oversized L3 cache of 20MB, “new” virtualization support and extensions, 40 PCIe lanes, and all sorts of goodies baked-in. I figured it was more than up to handling current, modern day workstation tasks.

Yeah – not quite.

In researching that processor, I learned that it had been released in September of 2014 – roughly six years prior. Boy, six years flies by when you’re not paying attention. Life moves on, but like a new car that’s just been driven off the lot, that shiny new PC you just put together starts losing value as soon as you power it up.

The Core i7 chip and the system based around it are still very good at most things today – in fact, I’m going to set my son up with that old workstation as an upgrade from his Core i5 (which he uses primarily for video watching and gaming). But for the things I regularly do day in and day out – running VMs, multimedia creation and editing, etc. – that Core i7 system is significantly behind the times. With six years under its belt, a computer system tends to start receiving email from AARP.

The Conversation and Approval

So, my wife and I had “the conversation,” and I ultimately got her buy-in on the construction of a new PC. Let me say, for the record, that I love my wife. She’s a rational person, and as long as I can effectively plead my case that I need something for my job (being able to write it off helps), she’s behind me and supports the decision.

Tracy and I have been married for 17 years, so she knows me well. We both knew that the new system was going to likely cost quite a bit of money to put together … because my general thinking on new computer systems (desktops, servers, or whatever) boils down to a few key rules and motivators:

  1. Nine times out of ten, I prefer to build a system (from parts) over buying one pre-assembled. This approach ensures that I get exactly what I want in the system, and it also helps with the “continuing education” associated with system assembly. It also forces me to research what’s currently available at the time of construction, and that invariably ends up helping at least one or two friends in the assembly of new systems that they want to put together or purchase.
  2. I generally try to build the best performing system I can with what’s available at the time. I’ll often opt for a more expensive part if it’s going to keep the system “viable” for a longer period of time, because getting new systems isn’t something I do very often. I would absolutely love to get new systems more often, but I’ve got to make these last as long as I can – at least until I’m independently wealthy (heh … don’t hold your breath – I’m certainly not).
  3. As an adjunct to point #2 (above), I tend to opt for more expensive parts and components if they will result in a system build that leaves room for upgrades/part swaps down the road. Base systems may roll over only every half-dozen years or so, but parts and upgrades tend to flow into the house at regular intervals. Nothing simply gets thrown out or decommissioned. Old systems and parts go to the rest of the family, get donated to a friend in need, etc.
  4. When I’m building a system, I have a use in mind. I’m fortunate that I can build different computers for different purposes, and I have two main systems that I use: a primary workstation for business, and a separate machine for gaming. That doesn’t mean I won’t game on my workstation and vice versa, but any such usage is secondary; I select parts for a system’s intended purpose.
  5. Although I strive to be on the cutting edge, I’ve learned that it’s best to stay off the bleeding edge when it comes to my primary workstation. I’ve been burned a time or two by trying to get the absolute best and newest tech. When you depend on something to earn a living, it’s typically not a bad idea to prioritize stability and reliability over the “shiny new objects” that aren’t proven yet.

Threadripper: The Parts List

At last – the moment that some of you may have been waiting for: the big reveal!

I want to say this at the outset: I’m sharing this selection of parts (and some of my thinking while deciding what to get) because others have specifically asked. I don’t value religious debates over “why component ‘xyz’ is inferior to ‘abc'” nearly as much as I once did in my youth.

So, general comments and questions on my choice of parts are certainly welcome, but the only thing you’ll hear are crickets chirping if you hope to engage me in a debate …

The choice of which processor to go with wasn’t all that difficult. Well, maybe a little.

Given that this was going into the machine that would be swapped-in as my new workstation, I figured most medium-to-high end current processors available would do the job. Many of the applications I utilize can get more done with a greater number of processing cores, and I’ve been known to keep a significant number of applications open on my desktop. I also continue to run a number of virtual machines (on my workstation) in my day-to-day work.

In recent years, AMD has been flogging Intel in many different benchmarks – more specifically, the high-end desktop (non-gaming) performance range of benchmarks that are the domain of multi-core systems. AMD’s manufacturing processes are also more advanced (Intel is still stuck on 10nm-14nm while AMD has been on 7nm), and they’ve finally left the knife at home and brought a gun to the fight – especially with Threadripper. It reminds me of a period of time decades ago when AMD was able to outperform Intel with the Athlon FX-series (I loved the FX-based system I built!).

I realize benchmarks are won by some companies one day, and someone else the next. Bottom line for me: performance per core at a specific price point has been held by AMD’s Ryzen chips for a while. I briefly considered a Ryzen 5 or 9, but I opted for the Threadripper when I acknowledged that the system would have to last me a fairly long time. Yes, it’s a chunk of change … but Threadripper was worth it for my computing tasks.

Had I been building a gaming machine, it’s worth noting that I probably would have gone Intel, as their chips still tend to perform better for single-threaded loads that are common in games.

First off, you should know that I generally don’t worry about motherboard performance. Yes, I know that differences exist and motherboard “A” may be 5% faster than motherboard “B.” At the end of the day, they’re all going to be in the same ballpark (except for maybe a few stinkers – and ratings tend to frown on those offerings …)

For me, motherboard selection is all about capabilities and options. I want storage options, and I especially want robust USB support. Features and capabilities tend to become more available as cost goes up (imagine that!), and I knew right off that I was going to probably spend a pretty penny for the appropriate motherboard to drop that Threadripper chip into.

I’ve always had good luck with ASUS motherboards, and it doesn’t hurt that the ROG Zenith II Extreme Alpha was highly rated and reviewed. After all, it has a name that sounds like the next-generation terminator, so how could I go wrong?!?!?!

Everything about the board says high end, and it satisfies the handful of requirements I had. And some I didn’t have (but later found nice, like that 10Gbps Ethernet port …)

“Memory, all alone in the moonlight …”

Be thankful you’re reading that instead of listening to me sing it. Barbra Streisand I am not.

Selecting memory doesn’t involve as many decision points as other components in a new system, but there are still a few to consider. There is, of course, the overall amount of memory you want to include in the system. My motherboard and processor supported up to 256GB, but that would be overkill for anything I’d be doing. I settled on 128GB, and I decided to get that as 4x32GB DIMMs rather than 8x16GB so I could expand (easily) later if needed.

It has been noted that, because of their architecture, the performance of Ryzen chips can be impacted significantly by memory speeds. The “sweet spot” before prices grew beyond my desire to purchase appeared to be about 3200MHz. And if possible, I wanted to get memory with the lowest possible CAS (column access strobe) latency I could find, as that number tends to matter the most among memory timings (CAS, tRAS, tRP, and tRCD).
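For anyone comparing kits: CAS latency is counted in clock cycles, so it only becomes comparable across memory speeds once it is converted to time. A rough worked example follows. The CL16 and CL18 figures are illustrative assumptions rather than the specs of any particular kit, and “3200MHz” is treated as a 3200 MT/s data rate:

```latex
\[
t_{\mathrm{CAS}}\ (\text{ns}) = \frac{\mathrm{CL} \times 2000}{\text{data rate (MT/s)}}
\]
\[
\text{CL16 at 3200 MT/s: } \frac{16 \times 2000}{3200} = 10\ \text{ns}
\qquad
\text{CL18 at 3200 MT/s: } \frac{18 \times 2000}{3200} = 11.25\ \text{ns}
\]
```

At the same speed, then, each step down in CL translates directly into a shorter wait for the first piece of data.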

I found what I wanted with the Corsair Vengeance RGB series. I’ve had a solid experience with Corsair memory in the past, so once I confirmed the numbers it was easy to pull the trigger on the purchase.

There are 50 million cases and case makers out there. I’ve had experience with many of them, but getting a good case (in my experience) is as much about timing as any other factor (like vendor, cost, etc).

Because I was a bit more focused on the other components, I didn’t want to spend a whole lot of time on the case. I knew I could get one of those diamonds in the rough (i.e., cheap and awesome) if I were willing to spend some time combing reviews and product slicks … but I’ll confess: I dropped back and punted on this one. I pulled open my Maximum PC and/or PC Gamer magazines (I’ve been subscribing for years) and looked at what they recommended.

And that was as hard as it got. Sure, the Cosmos C700P was pricy, but it looked easy enough to work with. Great reviews, too.

When the thing was delivered, the one thing I *wasn’t* prepared for was the sheer SIZE of the case. Holy shnikes – this is a BIG case. Easily the biggest non-server case I’ve ever owned. It almost doesn’t fit under my desk, but thankfully it just makes it with enough clearance that I don’t worry.

Oh yeah, there’s something else I realized with this case: I was accruing quite the “bling show” of RGB lighting-capable components. Between the case, the memory, and the motherboard, I had my own personal 4th of July show brewing.

Power supplies aren’t glamorous, but they’re critical to any stable and solid system. 25 years ago, I lived in an old apartment with atrocious power. I would go through cheap power supplies regularly. It was painful and expensive, but it was instructional. Now, I do two things: buy an uninterruptible power supply (UPS) for everything electronic, and purchase a good power supply for any new build. Oh, and one more thing: always have another PSU on-hand.

I started buying high-end Corsair power supplies around the time I built my first gaming machine which utilized videocards in SLI. That was the point in nVidia’s history when the cards had horrible power consumption stats … and putting two of them in a case was a quick trip to the scrap heap for anything less than 1000W.

That PSU survived and is still in-use in one of my machines, and that sealed the deal for me for future PSU needs.

This PSU can support more than I would ever throw at it, and it’s fully modular *and* relatively high efficiency. Fully modular is the only way to go these days; it definitely cuts down on cable sprawl.

Much like power supplies, CPU coolers tend not to be glamorous. The most significant decision point is “air cooled” or “liquid cooled.” Traditionally, I’ve gone with air coolers since I don’t overclock my systems and opt for highly ventilated cases. It’s easier (in my opinion) and tends to be quite a bit cheaper.

I have started evolving my thinking on the topic, though – at least a little bit. I’m not about to start building custom open-loop cooling runs like some of the extreme builders out there, but there are a host of sealed closed-loop coolers that are well-regarded and highly rated.

Unsurprisingly, Corsair makes one of the best (is there anything they don’t do?). I believe Maximum PC put the H100i PRO all-in-one at the top of their list. It was a hair more than I wanted to spend, but in the context of the project’s budget (growing with each piece), it wasn’t bad.

And oh yeah: it *also* had RGB lighting built-in. What the heck?

I initially had no plans (honestly) of buying another videocard. My old workstation had two GeForce 1080s (in SLI) in it, and my thinking was that I would re-use those cards to keep costs down.

Ha. Ha ha. “Keep costs down” – that’s funny! Hahahahahahaha…

At first, I did start with one of the 1080s in the case. But there were other factors in the mix I hadn’t foreseen. Those two cards were going to take up a lot of room in the case and limit access to the remaining PCI express slots. There’s also the time-honored tradition of passing one of the 1080s down to my son Brendan, who is also a gamer.

Weak arguments, perhaps, but they were enough to push me over the edge into the purchase of another RTX 2080Ti. I actually picked it up at the local Micro Center, and there’s a bit of a story behind it. I originally purchased the wrong card (one that had connections for an open-loop cooling system), so I returned it and picked up the right card while doing so. That card (the right one) was only available as an open box item (at a substantially reduced price). Shortly after powering my system on with the card plugged in, it was clear why it was open-box: it had hardware problems.

Thus began the dance with EVGA support and the RMA process. I’d done the dance before, so I knew what to expect. EVGA has fantastic support anyway, so I was able to RMA the card back (shipping nearly killed me – ouch!), and I got a new RTX 2080Ti at an ultimately “reasonable” price.

Now my son will get a 1080, I’ve got a shiny new 2080Ti … and nVidia just released the new 30 series. Dang it!

Admittedly, this was a Micro Center “impulse buy.” That is, the specific choice of card was the impulse buy. I knew I was going to get an external sound card (i.e., aside from the motherboard-integrated sound) before I’d really made any other decision tied to the new system.

For years I’ve been hearing that the integrated sound chips they’re now putting on motherboards have gotten good enough that a separate, discrete sound card is no longer necessary for those wanting high-quality audio. Forget about SoundBlaster – no longer needed!

I disagree.

I’ve tried using integrated sound on a variety of motherboards, and there’s always been something … sub-standard. In many cases, the chips and electronics simply weren’t shielded enough to keep powerline hum and other interference out. In other cases, the DSP associated with the audio would chew CPU cycles and slow things down.

Given how much I care about my music – and my picky listening habits (we’ll say “discerning audiophile tendencies”) – I’ve found that I’m only truly happy with a sound card.

I’d always gotten SoundBlaster cards in the past, but I’ve been kinda wondering about SoundBlaster for a while. They were still making good (or at least “okay”) cards in my opinion, but their attempts to stay relevant seemed to be taking them down some weird avenues. So, I was open to the idea of another vendor.

The ASUS card looked to be the right combo: a minimalist card with a high signal-to-noise ratio and low distortion. And thus far, it’s been fantastic. An impulse buy that actually worked out!

Much like the choice of CPU, picking the SSD that would be used as my Windows system (boot) drive wasn’t overly difficult. This was the device that my system would be booting from, using for memory swapping, and other activities that would directly impact perceived speed and “nimbleness.” For those reasons alone, I wanted to find the fastest SSD I could reasonably purchase.

Historically, I’ve purchased Samsung Pro SSD drives for boot drive purposes and have remained fairly brand loyal. If something “ain’t broke, ya don’t fix it.” But when I saw that Seagate had a new M.2 SSD out that was supposed to be pretty doggone quick, I took notice. I picked one up, and I can say that it’s a really sweet SSD.

The only negative thing or two that Tom’s Hardware had to say about it was that it was “costly” and had “no heatsink.” In the plus category, Tom’s said that it had “solid performance,” a “large write cache,” that it was “power efficient,” had “class-leading endurance,” and they liked its “aesthetics.” They also said it “should be near the top of your best ssds list.”

And about the cost: Micro Center actually had the drive for substantially less than its list price, so I jumped at it. I’m glad I did, because I’ve been very happy with its performance. That happiness is based on nothing more than my perception, though. Full disclosure: I haven’t actually benchmarked system performance (yet), so I don’t have numbers to share. Maybe a future post …

Unsurprisingly, my motherboard selection came with built-in RAID capability. That RAID capability actually extended to NVMe drives (a first for one of my systems), so I decided to take advantage of it.

Although it’s impractical from a data stability and safety standpoint, I decided that I was going to put together a RAID-0 (striped) “disk” array with two M.2 drives. I figured I didn’t need maximum performance (as I did with my boot/system drive), so I opted to pull back a bit and be a little more cost-efficient.

It’s no surprise (or at least, I don’t think it should be a surprise), then, that I opted to go with Samsung and a pair of 970 EVO Plus M.2 NVMe drives for that array. I got a decent deal on them (another Micro Center purchase), and so with two of the drives I put together a 4TB pretty-darn-quick array – great for multimedia editing, recording, a temporary area … and oh yeah: a place to host my virtual machine disks. Smooth as butta!

For more of my “standard storage” needs – where data safety trumped speed of operations – I opted for a pair of Seagate IronWolf 6TB NAS drives in a RAID-1 (mirrored) array configuration. I’ve been relatively happy with Seagate’s NAS series. Truthfully, both Seagate and Western Digital did a wonderful thing by offering their NAS/Red series of drives. The companies acknowledge the reality that a large segment of the computing population are leaving machines and devices running 24/7, and they built products to work for that market. I don’t think I’ve had a single Red/NAS-series drive fail yet … and I’ve been using them for years now.

In any case, there’s nothing amazing about these drives. They do what they’re supposed to do. If I lose one, I just need to get another back in and let the array rebuild itself. Sure, I’ll be running in degraded fashion for a while, but that’s a small price to pay for a little data safety.

I believe in protection in layers – especially for data. That’s a mindset that comes out of my experience doing disaster recovery and business continuity work. Some backup process that you “set and forget” isn’t good enough for any data – yours or mine. That’s a perspective I tried to share and convey in the DR guides that John Ferringer and I wrote back in the SharePoint 2007 and 2010 days, and it’s a philosophy I adhere to even today.

The mirroring of the 6TB IronWolf drives provides one layer of data protection. The additional 10TB Western Digital Red drive I added as a system level backup target provides another. I’ve been using Acronis True Image as a backup tool for quite a few years now, and I’m generally pretty happy with the application, how it has operated, and how it has evolved. About the only thing that still bugs me (on a minor level) is the relative lack of responsiveness of UI/UX elements within the application. I know the application is doing a lot behind the scenes, but as a former product manager for a backup product myself (Idera SharePoint Backup), I have to believe that something could be done about it.

Thoughts on backup product/tool aside, I back up all the drives in my system to my Z: drive (the 10TB WD drive) a couple of times per week:

Acronis Backup Intervals

I use Acronis’ incremental backup scheme and maintain about a month’s worth of backups at any given time; that seems to strike a good balance between capturing data changes and maintaining enough disk space.

I have one more backup layer in addition to the ones I’ve already described: off-machine. Another topic for another time …

Last but not least, I have to mention my trusty Blu-ray optical drive. Yes, it does do writing … but I only ever use it to read media. If I didn’t have a large collection of Blu-rays that I maintain for my Plex Server, I probably wouldn’t even need the drive. With today’s Internet speeds and the ease of moving large files around, optical media is quickly going the way of the floppy disk.

I had two optical drives in my last workstation, and I have plenty of additional drives downstairs, so it wasn’t hard at all to find one to throw in the machine.

And that’s all I have to say about that.

Some Assembly Required

Of course, I’d love to have just purchased the parts and have the “assembly elves” show up one night while I was sleeping, do their thing, and I’d have woken up the next morning with a fully functioning system. In reality, it was just a tad bit more involved than that.

I enjoy putting new systems together, but I enjoy it a whole lot less when it’s a system that I rely upon to get my job done. There was a lot of back and forth, as well as plenty of hiccups and mistakes along the way.

I took a lot of pictures and even a small amount of video while putting things together, and I chronicled the journey to a fair extent on Facebook. Some of you may have even been involved in the ongoing critique and ribbing (“Is it built yet?”). If so, I want to say thanks for making the process enjoyable; I hope you found it as funny and generally entertaining as I did. Without you folks, it wouldn’t have been nearly as much fun. Now, if I can just find a way to magically pay the whole thing off …

The Media Chronicle

I’ll close this post out with some of the images associated with building Threadripper (or for Spencer Harbar: THREADRIPPER!!!)

Definitely a Step Up

I’ll conclude this post with one last image, and that’s the image I see when I open Windows Device Manager and look at the “Processors” node:

I will admit that the image gives me all sorts of warm fuzzies inside. Seeing eight hyperthreading cores used to be impressive, but now that I’ve got 32 cores, I get a bit giddy.

Thanks for reading!

What CDN Usage Does for SharePoint Online (SPO) Performance

If you need the what’s what on CDNs (content delivery networks), this is a bit of quick reading that will get you up to speed with what a CDN is, how to configure your SPO tenant to use a CDN, and the benefits that CDNs can bring.

The (Not Entirely Obvious) TL;DR Answer

Since I’m taking the time to write about the topic, you can safely guess that yes, CDNs make a difference with SPO page operations. In many cases, proper CDN configuration will make a substantial difference in SPO page performance. So enable CDN use NOW!

The Basis For That Answer: Introduction

Knowing that some folks simply want the answer up-front, I hope that I’ve satisfied their curiosity. The rest of this post is dedicated to explaining content delivery networks (CDNs), how they operate, and how you can easily enable them for use within your SharePoint Online (SPO) sites.

Let me first address a misconception that I have sometimes encountered among SPO administrators and developers (including some MVPs) – that being that CDNs don’t really “do a whole lot” to help site and/or page performance. Sure, usage of a CDN is recommended … but a common misunderstanding is that a CDN is really more of a “nice-to-have” than “need-to-have” element for SPO sites. Of the people saying such things, oftentimes that judgment comes without any real research, knowledge, or testing. Skeptics typically haven’t read the documentation (the “non-RTFM crowd”) and haven’t actually spent any time profiling and troubleshooting the performance of SPO sites. Since I enjoy addressing perf. problems and challenges, I’ve been fortunate to experience firsthand the benefits that CDNs can bring. By the end of this post, I hope I’ll have made converts of a CDN skeptic or two.

What Is A CDN?

A CDN is a Content Delivery Network. There are a lot of (good) web resources that describe and illustrate what CDNs are and how they generally operate (like this one and this one), so I’m not going to attempt to “add value” with my own spin. I will simply call attention to a couple of the key characteristics that we really care about in our use of CDNs with SPO.

  1. A CDN, at its core, can be thought of as a system of distributed (typically geographically so) servers for caching and offloading of SPO content. Rather than needing to go to the Microsoft network and data center where your tenant is located in order to fetch certain files from SPO, your browser can instead go to a (geographically) closer CDN server to get those same files.
  2. By virtue of going to a closer CDN instead of the Microsoft network, the chances that you’ll have a “bigger pipe” with more bandwidth – and less latency/delay – are greater. This usually translates directly to an improvement in performance.
  3. In addition to giving us the opportunity to download certain SPO files faster and with less delay, CDNs can do other things to improve the experience for the SPO files they serve. For instance, CDN servers can pass files back to the browser with cache-control headers that allow intermediate caches to re-serve downloaded files to other users (i.e., to users who haven’t actually downloaded the files), let browsers store downloaded files locally (to avoid having to download them again for a period of time), and more.

If you didn’t know about CDNs prior to this post, or didn’t understand how they could help you, I hope you’re beginning to see the possibilities!

The Arrival Of The Office 365 CDN

It wasn’t all that long ago that Microsoft was a bit more “modest” in its use of CDNs. Microsoft certainly made use of them, but prior to the implementation of its own content delivery networks, Microsoft frequently turned to a company called Akamai for CDN support.

When I first started presenting on SharePoint and its built-in caching mechanisms, I often spoke about Akamai and their edge network when talking about BLOB caching and how the max-age cache-control header could be configured and misconfigured. Back then, “Akamai” was basically synonymous with “CDN,” and that’s how many of us thought about the company. They were certainly leading the pack in the CDN service space.

Back then, if you were attempting to download a large file from Microsoft (think DVD images, ISO files, etc.), then there was a good chance that the download link your browser would receive (from Microsoft’s servers) would actually point to an Akamai edge node near your location geographically instead of a Microsoft destination.

Fast forward to today. In addition to utilizing third-party CDNs like those deployed by Akamai, Microsoft has built (and is improving) their own first-party CDNs. There are a couple of benefits to this. First, many of the concerns tied to data regulations that restrict third-party housing of your data (yes, even in temporary locations like a CDN) can be largely avoided. In the case of CDNs that Microsoft is running, there is no hand-off to a third party and thus much less practical concern regarding who is housing your data.

Second, with their own CDNs, Microsoft has a lot more latitude and ability to extend the specifics of CDN configuration and operation to its customers. And that’s what they’ve done with the Office 365 CDN.

Set Up The O365 CDN For Your Tenant’s Use

Now we’re talking! This next part is particularly important, and it’s what drove the creation of this post. It’s also the one bit of information that I promised Scott Stewart at Microsoft that I would try to get “out in the wild” as quickly and as visibly as possible.

So, if you remember nothing else from this post, please remember this:

Set-SPOTenantCdnEnabled -CdnType Public -Enable $true

That is the line of PowerShell that needs to be executed (against your SPO tenant, so you need to have a connection to your tenant established first) to enable transparent CDN support for public files. Run that, and non-sensitive files of public origin from SPO will begin getting cached in a CDN and served from there.

The line of PowerShell I shared goes through the SharePoint Online Management Shell – something most organizations using SPO (and their admins in particular) have installed somewhere.
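For anyone who hasn't connected to a tenant from the management shell before, the full sequence might look something like the sketch below. The tenant name "contoso" is purely a placeholder (an assumption on my part), and the two Get- cmdlets at the end are just one way to confirm that the change took.

```powershell
# Assumption: "contoso" is a placeholder tenant; substitute your own admin URL.
Import-Module Microsoft.Online.SharePoint.PowerShell -DisableNameChecking
Connect-SPOService -Url "https://contoso-admin.sharepoint.com"

# Enable the public CDN (the same line shared above).
Set-SPOTenantCdnEnabled -CdnType Public -Enable $true

# Confirm the setting and list the origins the public CDN will serve from.
Get-SPOTenantCdnEnabled -CdnType Public
Get-SPOTenantCdnOrigins -CdnType Public
```

Expect to be prompted for credentials on connect, and the account you use needs SharePoint admin rights on the tenant.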

It is also possible to enable CDN support if you’re using the PnP PowerShell module, if that’s your preference, by executing the following PowerShell:

Set-PnPTenantCdnEnabled -CdnType Public -Enable $true

No matter how you enable the CDN, it should be noted that the PowerShell I’ve elected to share (above) enables CDN usage for files of public origin only. It is easy enough to alter the parameters being passed in our PowerShell command so as to cover all files, public and private, by switching -CdnType to Both (with the SPO management shell) or executing another line of PowerShell after the first that swaps -CdnType Public with -CdnType Private (in the case of the SharePointPnP PowerShell module).

The reason I chose only public enablement is that your organization may be bound by restrictions or policies that prohibit or limit CDN use with private files. This is discussed a bit in the O365 CDN post originally cited, but it’s best to do your own research.
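If you do decide that private files are fair game in your organization, the variations would look roughly like this. This is only a sketch; it assumes you already have a connection to your tenant established with whichever module you're using, and you would run one option or the other, not both.

```powershell
# Option 1: SharePoint Online Management Shell.
# Covers public and private origins in a single call.
Set-SPOTenantCdnEnabled -CdnType Both -Enable $true

# Option 2: PnP PowerShell.
# Run the public line first, then a second line for private.
Set-PnPTenantCdnEnabled -CdnType Public -Enable $true
Set-PnPTenantCdnEnabled -CdnType Private -Enable $true

# Either way, check what is enabled afterwards (management shell shown here).
Get-SPOTenantCdnEnabled -CdnType Public
Get-SPOTenantCdnEnabled -CdnType Private
```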

Enabling CDN support for public files, however, is considered to be safe in general.

What Sort Of Improvements Can I Potentially See?

I’ve got a series of images that I use to illustrate performance improvements when files are served via CDN instead of SPO list/library, and those files are from Microsoft. Thankfully, MS makes the images I tend to use (and a discussion of them) freely available, and they are presented at this link for your reading and reference.

The example that is called out in the link I just shared involves offloading of the jQuery JavaScript library from SPO to CDN. The real world numbers that were captured reduced fetch-and-load time from just over 1.5 seconds to less than half a second (<500ms). That is no small change … and that’s for just one file!

The Other (Secret) Benefit Of CDNs

I guess “Secret” is technically the wrong choice of term here. A more accurate description would be to say that I seldom hear or see anyone talking about another CDN benefit I consider to be very important and significant. That benefit, quite simply, involves improving file fetching and retrieval parallelism when a web page and associated assets (CSS, JS, images, etc.) are requested for download by your browser. In plain English: CDNs typically improve file downloading by allowing the browser to issue a greater number of concurrent file requests.

To help with this concept and its explanation, I’ve created a couple of diagrams that I’ll share with you. The first one appears below, and it is meant to represent the series of steps a browser might execute when retrieving everything needed to show a (SharePoint/SPO) page. As we’ve talked about, what is commonly thought of as a single page in a SharePoint site is, more accurately, a page containing all sorts of dependent assets: image files, JavaScript files, cascading style sheets, and a whole bunch more.

A request for a SharePoint page housed at http://www.thesite.com might start out with one request, but your browser is going to need all of the files referenced within the context of that page (default.aspx, in our case) to render correctly. See below:

To get what’s needed to successfully render the example SharePoint page without CDN support, we follow the numbers:

  1. Your browser issues an HTTP request for the page you want to load – http://www.thesite.com/default.aspx in the case of the example above.
  2. That page request goes to (and is served by) the web server/front-end that can return the page.
  3. Our page needs other files to render properly, like styling.css, logo.png, functions.js, and more. These get queued-up and returned according to some rules – more on this in a minute.
  4. In step four (4), files get returned to the browser. Notice I say “no more than six at a time” in the illustration. That’s important and will come into play once we start introducing CDN support to the page/site.

You might be wondering, “Only six files at a time? Really? Why the limitation?” Well, I should start by saying the limit is probably six … maybe a bit more, perhaps a bit less. The specific number depends on the browser you’re using. There was a good summary answer on StackOverflow to a related (but slightly different) question that provides some additional discussion.

Section eight (8) of the HTTP specification (RFC 2616) specifically addresses HTTP connections, how they should be handled, how proxies should be negotiated, etc. For our purposes, the practical implementation of the HTTP specification by modern browsers generally limits the number of concurrent/active connections a browser can have to any given host or URL to six (6).

Notice how I worded that last sentence. Since you folks are smart cookies, I’ll bet you’re already thinking “Wait a minute. CDNs typically have different URLs/hosts from the sites they cache” and you’re imagining what happens (or can happen) when a new source (i.e., different host/URL) is introduced.

This illustration roughly outlines the fetch process when a CDN is involved:

Steps one (1) through four (4) of the fetch process with a CDN are basically still the same as was illustrated without a CDN a bit earlier. When the page is served-up in step three (3) and returned in step four (4), though, there are some differences and additional activity taking place:

  1. Since at least one CDN is in-use for the SPO environment, some of the resource links within the page that is returned will have different URLs. For instance, whereas styling.css was previously served from the SPO environment in the non-CDN example, it might now be referenced through the CDN host shown as http://cdn.source.com/styling.css
  2. The requested file is retrieved, and …
  3. Files come back to the client browser from the CDN at the same time they’re being passed-back from the SPO environment.

Since we’re dealing with two different URLs/hosts in our CDN example (http://www.thesite.com and cdn.source.com), our original six (6) file concurrent download limitation transforms into a 12 file limitation (two hosts serving six files at a time, 2 x 6 = 12).
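To put some rough numbers on that, here is a back-of-the-envelope sketch. It assumes every file takes about the same time to download and that each host allows six files at a time, which is a big simplification of what a real browser does. The function name is something I made up for illustration.

```powershell
# Hypothetical helper: how many "rounds" of downloads are needed when each
# host can serve at most $PerHostLimit files at once. Hosts download in
# parallel, so the slowest host determines the total.
function Get-DownloadRounds {
    param(
        [int[]] $FilesPerHost,
        [int] $PerHostLimit = 6
    )
    $rounds = 0
    foreach ($count in $FilesPerHost) {
        $needed = [int][math]::Ceiling([double]$count / $PerHostLimit)
        if ($needed -gt $rounds) { $rounds = $needed }
    }
    return $rounds
}

Get-DownloadRounds -FilesPerHost 24        # all 24 files on one host: 4 rounds
Get-DownloadRounds -FilesPerHost 12, 12    # split across site + CDN: 2 rounds
```

Same 24 files, half the rounds, simply because a second host was introduced.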

Whether or not the CDN-based process is ultimately faster than without a CDN depends on a great many factors: your Internet bandwidth, the performance of your computer, the complexity/structure of the page being served-up, and more. In the majority of cases, though, at least some performance improvement is observed. In many cases, the improvement can be quite substantial (as referenced and discussed earlier).

Additional Note: 8/24/2020

In a bit of laziness on my part, I didn’t do a prior article search before writing this post. As fate would have it, Bob German (a friend and fellow MVP – well, he was an MVP prior to joining Microsoft a couple of years back) wrote a great post at the end of 2017 that I became aware of this morning with a series of tweets. Bob’s post is called “Choosing a CDN for SharePoint Client Solutions” and is a bit more developer-oriented. That being said, it’s a fantastic post with good information that is a great additional read if you’re looking for more material and/or a slightly different perspective. Nice work, Bob!

Post Update: 8/26/2020

Anders Rask was kind enough to point out that the PnP PowerShell line I originally had listed wasn’t, in fact, PnP PowerShell. That specific line of PowerShell has since been updated to reflect the correct way of altering a tenant’s CDN with the PnP PowerShell cmdlets. Many thanks for the catch, Anders!

Conclusion

So, to sum-up: enable CDN use within your SPO tenant. The benefits are compelling!

References

  1. Microsoft Docs: Use The Office 365 Content Delivery Network (CDN) With SharePoint Online
  2. Imperva: What Is A CDN?
  3. Akamai: What Does CDN Stand For?
  4. MDN Web Docs: Cache-Control
  5. Company: Akamai
  6. Presentations: Caching-In For SharePoint Performance
  7. Akamai: Download Delivery
  8. Microsoft Docs: Configure Cache Settings For A Web Application In SharePoint Server
  9. Blog Post: Do You Know What’s Going To Happen When You Enable The SharePoint BLOB Cache?
  10. LinkedIn: Scott Stewart
  11. Microsoft Docs: Enabling O365 CDN support for public origin files.
  12. Microsoft Docs: Get Started With SharePoint Online Management Shell
  13. Microsoft Docs: PnP PowerShell Overview
  14. Microsoft Docs: Set Up And Configure The Office 365 CDN By Using PnP PowerShell
  15. Microsoft Docs: What Performance Gains Does A CDN Provide?
  16. Push Technologies: Browser Connection Limitations
  17. StackOverflow: How many maximum number of simultaneous Chrome connections/threads I can start through Selenium WebDriver?
  18. W3.org: RFC 2616, Section 8: Connection

Workflow 1.0 Beta and SQL Server Aliases Do Not Play Nicely Together

My recent attempts to configure the Windows Azure Workflow service (Workflow 1.0 Beta) with a SQL Server alias didn’t go so well. If you’re playing with Workflow 1.0 Beta, stay away from aliases!

I’ve been doing a bit of build-out with the new SharePoint 2013 Preview in anticipation of some development work, and I’ve documented a few snags that I’ve hit along the way. Although I ran into some additional problems with the SharePoint 2013 Preview yesterday, this post isn’t about SharePoint specifically; it’s about the Windows Azure Workflow service – also known (at this point in time) simply as Workflow 1.0 Beta.

A Bit of Background

If you’re brand-new to the SharePoint 2013 scene, you may not yet have heard: the future for workflow lies outside of SharePoint, not within it. The Windows Azure Workflow service (yes, it even has “Azure” in the name if you’re running it on-premise and not in the cloud) is industrial-strength stuff, and it promises all sorts of improvements over workflow as we know it (and use it) right now.

To take advantage of Windows Azure Workflow at this point in the SharePoint 2013 release cycle requires the installation of the Workflow 1.0 Beta. The installation is not a particularly complicated process, but that’s probably because I’ve been using a solid resource.

Note: the “solid resource” I’m referring to is CriticalPath Training’s VM setup guide. I’ve been using it as a reference as I’ve been doing my SharePoint 2013 build-outs; the guide itself is fantastic and comes with some supporting PowerShell scripts to help things along. The guide and scripts are freely available here – you just need to create an account on the CriticalPath Training site to download them. I recommend them if you’re just getting started with the SharePoint 2013 Preview.

So, what’s my beef with the Workflow 1.0 Beta? To summarize it in a few words: Workflow 1.0 Beta doesn’t seem to work with SQL Server aliases. I certainly tried, but in the end I was forced to abandon using an alias.

How I Initially Configured It

If you read my previous “An unexpected error has occurred” post, then you know that there are four different VMs I’m configuring for a SharePoint 2013 environment. Two of those VMs are of interest in the discussion about Workflow 1.0 Beta configuration:

  • SP2013-SQL. A SQL Server 2012 Enterprise VM
  • SP2013-APPS. A utility server for running Workflow 1.0 Beta and other “off-box” services

As a general rule of thumb, anytime I need to establish a SQL Server connection, I try to create a SQL Server alias to avoid tightly coupling my SQL Server consumers/clients directly to a SQL Server instance. This buys me some flexibility in the unfortunate event that a server dies, I need to relocate databases, etc.

I was planning to install the Workflow 1.0 Beta on my SP2013-APPS virtual machine, and I knew that Workflow 1.0 Beta would need to connect to my SP2013-SQL SQL Server. So, I created both a 32-bit alias and a 64-bit alias called SpSqlAlias for the default SQL Server instance residing on SP2013-SQL (which happened to be at IP address 172.16.0.2) as shown on the left.
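If you want to double-check what aliases actually exist on a box without clicking through the configuration tool, a quick sketch like the one below can help. It relies on my understanding that client aliases are stored in the Registry under a ConnectTo key, with the 32-bit aliases living under WOW6432Node on a 64-bit OS.

```powershell
# List SQL Server client aliases from the 64-bit and 32-bit Registry locations.
$paths = @(
    'HKLM:\SOFTWARE\Microsoft\MSSQLServer\Client\ConnectTo',
    'HKLM:\SOFTWARE\WOW6432Node\Microsoft\MSSQLServer\Client\ConnectTo'
)

foreach ($path in $paths) {
    if (Test-Path -Path $path) {
        $key = Get-Item -Path $path
        foreach ($name in $key.GetValueNames()) {
            # Target typically looks like "DBMSSOCN,<host>,<port>" for TCP/IP
            # or "DBNMPNTW,<pipe>" for Named Pipes.
            [pscustomobject]@{
                Location = $path
                Alias    = $name
                Target   = $key.GetValue($name)
            }
        }
    }
}
```

In my case I would expect to see SpSqlAlias listed twice, once per location, with a TCP/IP target pointing at 172.16.0.2.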

Once the alias was created and all other prerequisites were addressed, I started the Workflow 1.0 Beta installation process. In the Workflow Configuration Wizard, I supplied my SQL Server alias in place of a server name, checked the connection, and was given a green check-mark. As the configuration process started, everything looked good. Even the Service Bus farm management and gateway databases were created without issue.

The problems started shortly thereafter, though, during the creation of a default container. Basically, I didn’t get any further. I literally stared at the screen on the right for a full ten (10) minutes without seeing any meaningful activity in the Details box. After 10 minutes had elapsed, the configuration process failed and I was treated to an exception message and stack trace. Omitting the inner exception detail, here’s what I was told:

[sourcecode language="text"]
System.Management.Automation.CmdletInvocationException: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) ---> System.Data.SqlClient.SqlException: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) ---> System.ComponentModel.Win32Exception: The system cannot find the file specified
[/sourcecode]

Validating the Alias

Of course, the first thing I double-checked was the SQL Server to ensure that it was responding. It was. I even backed through the configuration wizard a couple of steps and verified (with the “Test Connection” button) that I could reach the SQL Server. No issues there: my SQL Server alias was valid as far as the configuration wizard was concerned.

Looking more closely at the exception message left me suspicious. This part in particular made me raise my eyebrow:

(provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)

Named Pipes Provider? I had specified a TCP/IP alias, not Named Pipes. Changing the permitted 32-bit and 64-bit client protocols (again, via the SQL Server Configuration Manager) to make sure that TCP/IP was enabled and Named Pipes was disabled …

Permitted Client Protocols

… made no difference, either – I’d still get an exception from the Named Pipes Provider. It looked as though one or more steps in the configuration process were “doing their own thing,” ignoring my alias and client protocols configuration, and (as a result) having trouble reaching the SQL Server.
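One way to sanity-check an alias outside of the configuration wizard is to open a plain .NET connection with it and compare that against the real host name. The sketch below is just that, a sketch: the function name is made up, and it assumes Windows PowerShell, integrated security, and access to the master database.

```powershell
# Hypothetical helper: try to open a connection and report what happened.
function Test-SqlConnection {
    param([string] $Server)

    $conn = New-Object System.Data.SqlClient.SqlConnection
    $conn.ConnectionString =
        "Server=$Server;Database=master;Integrated Security=True;Connect Timeout=15"
    try {
        $conn.Open()
        "OK: $Server"
    }
    catch {
        # The message names the provider (TCP vs. Named Pipes) that failed.
        "FAILED: $Server -> $($_.Exception.Message)"
    }
    finally {
        $conn.Dispose()
    }
}

Test-SqlConnection -Server 'SpSqlAlias'   # the alias
Test-SqlConnection -Server 'SP2013-SQL'   # the actual host name
```

If both come back OK from the same server where the wizard is failing, that at least suggests the alias itself is fine and the problem sits with whatever the wizard is doing under the hood.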

Trying to Go with the Flow

The thought that entered my mind was, “Ok – don’t fight it if you don’t have to.” If the configuration wizard was going to fall back to using Named Pipes, then I’d go ahead and set up a Named Pipes alias. I wasn’t thrilled about the idea, but I’d rather have the SQL Server alias in-place than no alias at all.

So much for that thought.

I played with the actual Named Pipes alias format quite a bit, but in the end the result was always the same.

Trying to configure with SQL alias (named pipes) and failing

Attempts to use a TCP/IP alias always failed partway through configuration, and attempts to use a Named Pipes alias never even got started.

The Result

I gave it some more thought … and came up empty. So, I dumped any remaining aliases, ensured that all client protocols were back to their fully enabled state, and tried to do the configuration with just the SQL Server host name (to connect to the default instance).

The result?

 Successful completion of configuration

Using just the host name, I had no issues performing the configuration.

The Conclusion

If you are setting up Workflow 1.0 Beta, stay away from SQL Server aliases. As best as I can tell, they aren’t (yet) supported. I’m hopeful that this is just a beta bug or limitation.

On the other hand, if you think I’ve gone off the deep end and can find some way to get the Workflow 1.0 Beta configuration to run with SQL Server aliases, please let me know – I’d love to hear about it!

References and Resources

  1. Blog Post: “An unexpected error has occurred” after Installing SharePoint 2013
  2. Microsoft Download Center: Workflow 1.0 Beta
  3. TechNet: What’s new in workflow in SharePoint Server 2013
  4. CriticalPath Training: SharePoint Server 2013 Preview Virtual Machine Setup Guide
  5. MSDN: Create or Delete a Server Alias for Use by a Client (SQL Server Configuration Manager)

“An unexpected error has occurred” after Installing SharePoint 2013

After installing the current SharePoint 2013 preview build, I was greeted by an “An unexpected error has occurred” message while trying to navigate to the Central Administration site. This post represents the steps I took to troubleshoot the problem and implement a least-privileges fix for it.

You’ve undoubtedly heard the news: SharePoint 2013 is coming. The preview is available right now, and you can download it from TechNet if you want to join in the fun. Just make sure you can meet the hardware and environmental prerequisites. They’re somewhat brutal.

As you might have guessed from the title of this post, I’ve been trying to get in on the SharePoint 2013 fun. There are a number of things I’m supposed to be working on for SharePoint 2013, so building out a SharePoint 2013 environment with the new preview build has been high on my list of things to do.

This post is about a very recent experience with a SharePoint 2013 installation and configuration … and yes, it’s one that had me looking long and hard for a happy pill.

As with many of my other blog posts, this post takes a winding, iterative approach towards analyzing problems and trying to find solutions. Please bear with me or jump to the “Implementing the Change” section near the end if you want to blindly apply a change (based on the blog post title) and hope for the best.

Hitting a Small Snag

This blog post would be something of a disappointment if all it said was “… SharePoint 2013 installed without issue, and my environment lived happily ever after.”

No such luck; just look at the screenshot on the left. Sometimes I feel like I’m a magnet for “bad technology karma” despite my attempts to keep a clean slate in that area. Of course, SharePoint 2013 is only in the preview stages of release, so hiccups are bound to occur. I accept that. Like many of you, I went through it with SharePoint 2010 and SharePoint 2007, as well.

Strangely, though, I built-out a SharePoint 2013 environment with an earlier build (prior to the release of the current preview) some time ago. That’s why I was really surprised to see the message shown in the screenshot immediately upon completing a run of the SharePoint 2013 Products Configuration Wizard:

An unexpected error has occurred.

That’s it. No additional information, no qualification – just a technological “whoops” accompanied by the equivalent of a shoulder shrug from my VM environment.

The Setup

Let me take a step back to describe the environment I had put into place before trying to install and configure the SharePoint 2013 binaries.

One major difference between my latest SharePoint 2013 setup attempt and the previous (successful) attempt was the make-up of the server environment. After learning of some of the install restrictions that are specific to SharePoint 2013 (for example, Office Web Apps require their own server), I decided to build out the following virtual servers on my laptop and assemble them into a domain:

  • SP2013-DC: a Windows 2008 R2 Enterprise domain controller (for my virtual spdc.com domain)
  • SP2013-SQL: a Windows 2008 R2 Enterprise server running SQL Server 2012 Enterprise
  • SP2013-WFE: a Windows 2008 R2 Enterprise all-in-one SharePoint 2013 Server
  • SP2013-APPS: a Windows 2008 R2 Enterprise “extra” server for roles/components that couldn’t be installed alongside SharePoint

Overkill? Perhaps, but I wanted to get a feel for how the different components might interact in a “real” production environment.

I also opted for a least privileges install so that I could start to understand where some of the security boundaries had shifted versus SharePoint 2010. Since I planned to use the farm for my development efforts, I didn’t want to make the common developer mistake of shoehorning everything onto one server with unrestricted privileges. Such an approach dodges security-related issues during development, but it also tends to yield code that falls apart (or at least generates security concerns) upon first contact with a “real” SharePoint environment.

Failed Troubleshooting

As stated earlier, my setup problems started after I installed the SharePoint 2013 bits and ran the SharePoint 2013 Products Configuration Wizard. The browser window that popped-up following the configuration wizard’s run was trying to take me to the Farm Configuration wizard that lives inside the Central Administration site. Clearly I hadn’t gotten very far in configuring my environment.

I started looking in some of the usual locations for additional troubleshooting hints. Strangely, I couldn’t quickly find any:

  • The Central Administration site application pool looked okay and was spun-up
  • My Application and System event logs were pretty doggone clean – exceptionally few errors and warnings, and none that appeared relevant to the current problem
  • I didn’t see anything in the Security log to suggest problems

I tried an IISRESET. I rebooted the VM. I checked my SQL alias to make sure nothing was messed-up there. I checked my farm service account permissions in SQL Server to ensure that the account had the dbcreator and securityadmin role assignments as well as rights to the associated databases. Heck, I even deprovisioned the server and re-ran the SharePoint 2013 Products Configuration Wizard twice – once with a complete wipe of the databases. Nothing I did seemed to make a difference. Time after time, I kept getting “An unexpected error has occurred.”

Some Insight

Maybe it was my go ‘rounds with previous SharePoint beta releases, or maybe it was a combination of Eric Harlan’s and Todd Klindt’s spirits reaching out to me (the point of commonality between Todd and Eric: the two of them are fond of saying “it’s always permissions”). Whatever the source, I decided to start playing around with some account rights. Since I was setting up a least-privileges environment, it made sense that rights and permissions (or some lack of them) could be a factor.

The benefit of having gotten nearly nowhere on my farm configuration task was that there wasn’t much to really troubleshoot. Only a handful of application pools had been created (as shown on the right), and only one or two accounts were actually in-play. Since my Central Administration site was having trouble coming up, and knowing that the Central Administration site runs in the context of the farm service/timer service account, I focused my efforts there.

In my farm, I had assigned SPDC\svcSPFarm for use by the timer service. This account was a basic domain account at the start – nothing special, and no interesting rights to speak of. To see if I could make any progress on getting the Central Administration site to come up, I dropped the account into the Domain Admins group and tried to access the Central Administration site again.

I had no luck at first … but after an IISRESET and a re-launch of the site, Central Administration came up. I pulled the account out of the Domain Admins group and re-tried the site. It came up, but again – after an IISRESET, I was back to “An unexpected error has occurred.”

I repeated the process again, but the second time around I used the local (SP2013-WFE) Administrators group instead of the Domain Admins group. The results were the same: adding SPDC\svcSPFarm to the Administrators group allowed me to bring Central Administration up, and removing the account from the Administrators group brought things back down.

Hunch confirmed: it looked like I was dealing with some sort of rights or permissions issue.

Of course, knowing that there is a rights or permissions issue and knowing what the specific issue is are two very different things. The practical part of me screamed “just leave the account in the Administrators group and move on.”

Unfortunately, I don’t deal well with not knowing why something doesn’t work. It’s a personal hang-up that I have. So, I started with some low-impact/low-effort troubleshooting: I adjusted my VM’s Audit Policy settings (via the Local Security Policy MMC snap-in) to report on all failures that might pop-up.

Unfortunately, the only thing this change actually did for me was reveal that some sort of WinHttpAutoProxySvc service issue was popping-up when SPDC\svcSPFarm wasn’t an administrator. After a few minutes of researching the service, I decided that it probably wasn’t an immediate factor in the problem I was trying to troubleshoot.

So much for finding a quick answer.

Wading Into the Muck

I knew that I needed to dig deeper, and I knew where my troubleshooting was going to take me next. Honestly, I wasn’t too excited.

I dug into my SysInternals folder and dug out Process Monitor. For those of you who aren’t familiar with Process Monitor, I’ll sum it up this way: it’s the “nuclear option” when you need diagnostic information regarding what’s happening with the applications and services running on your system. Process Monitor collects file system activity, Registry reads/writes, network calls – pretty much everything that’s happening at a process level. It’s a phenomenal tool, but it generates a tremendous amount of information. And you need to wade through that information to find what you’re looking for.

I did an IISRESET, fired-up Process Monitor, and tried to bring up the Central Administration site once again. Since the SPDC\svcSPFarm account was no longer an administrator, I knew that the site would fail to come up. My hope was that Process Monitor would provide some insight into where things were getting stuck.

Over the course of the roughly 30 seconds it took the application pool to spin-up and then hand me a failure page, Process Monitor collected over 220,000 events.

Gulp.

I don’t know how you feel about it, but 220,000 events was downright intimidating to me. “Browsing” 220,000 events wasn’t going to be feasible. I’d worked with Process Monitor before, though, and I knew that the trick to making headway with the tool was in judicious use and application of its filtering capabilities.

Initially, I created filters to rule out a handful of processes that I knew wouldn’t be involved – things like Internet Explorer (iexplore.exe), Windows Explorer (Explorer.EXE), etc. Each filter that I added brought the number of events down, but I was still dealing with thousands upon thousands of events.

After a little thinking, I got a bit smarter with my filtering. First, I knew that I was dealing with an ASP.NET application pool; that was, after all, where Central Administration ran. That meant that the activity in which I was interested was probably taking place within an IIS worker process (w3wp.exe). I set a filter to show only those events that were tied to w3wp.exe activity.

Second, I knew that my farm service account (SPDC\svcSPFarm) was at the heart of my rights and permissions issue. So, I decided to filter out any activity that wasn’t tied to this account.

Applying those two filters got me down to roughly 50,000 events. Excluding SUCCESS results dropped me to 10,000 events. Some additional tinkering and exclusions brought the number down even lower. I was still wading through a large number of results, though, and I didn’t see anything that I could put my finger on.
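I did all of this filtering inside Process Monitor itself, but the same idea can be sketched in PowerShell against a capture that has been saved out as CSV. Treat the following as an illustration under a few assumptions: the function name and file name are made up, and it assumes the export includes the "Process Name", "User", and "Result" columns (the User column has to be added to the display before exporting).

```powershell
# Hypothetical helper: keep only non-SUCCESS events for one process and account.
function Select-FarmAccountFailures {
    param(
        [Parameter(Mandatory)] [object[]] $Events,
        [string] $ProcessName = 'w3wp.exe',
        [string] $User = 'SPDC\svcSPFarm'
    )
    $Events | Where-Object {
        $_.'Process Name' -eq $ProcessName -and
        $_.User -eq $User -and
        $_.Result -ne 'SUCCESS'
    }
}

# Assumption: the capture was saved to this (made-up) file name as CSV.
$csvPath = '.\central-admin-failure.csv'
if (Test-Path -Path $csvPath) {
    $events = Import-Csv -Path $csvPath

    # Summarize what is left by result so the unusual ones stand out.
    Select-FarmAccountFailures -Events $events |
        Group-Object -Property Result |
        Sort-Object -Property Count -Descending |
        Select-Object -Property Count, Name
}
```

Grouping by result turns thousands of rows into a short list of result types, which makes it a lot easier to spot something that only shows up in the failing capture.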

Next, I decided to place SPDC\svcSPFarm back into the Administrators group and do another Process Monitor capture. As expected, I captured a few hundred thousand events. I went through the process of applying filters and whittling things down as I had done the first time. Then I spent a lot of time going back and forth between the successful and unsuccessful runs looking for differences that might explain what I was seeing.

Two Bit Comedy

After doing a number of comparisons, I began to focus on a series of entries that were tagged with a result message of BAD IMPERSONATION (as seen below). I was seeing 145 of these entries (out of 220,000+ events) when the Central Administration site was failing to come up. When SPDC\svcSPFarm was part of the local Administrators group, though, I wasn’t seeing any of the entries.

BAD IMPERSONATION entries in Process Monitor

My gut told me that these BAD IMPERSONATION entries were probably a factor in my situation, so I started looking at them a bit more closely.

Many of the entries were seemingly non-specific attempts to access the Registry, but I did notice a handful of file and Registry accesses where an explicit impersonation attempt was being made with the current user’s account context. In the example on the right, for instance, an attempt was being made by the worker process to use my account context (SPDC\s0ladmin) for a CreateFile operation – and that attempt was failing.

This led me to formulate (what may seem like an obvious) hypothesis: seeing the BAD IMPERSONATION results, I suspected that the SPDC\svcSPFarm account was lacking something like the ability to replace a process-level token, log on interactively, or something like that. I’m certainly no expert when it comes to the specific boundaries and abilities associated with each rights assignment, but again – my gut was telling me that I should probably play around with some of the User Rights Assignments (via Local Security Policy) to see if I might get lucky.

A Fortunate Discovery

I popped open the Local Security Policy MMC snap-in on the SP2013-WFE VM once again, and I navigated down to the User Rights Assignment node. At first glance, I feared that my gut feeling was off-the-mark. Looking through the rights assignments available, I saw that SPDC\svcSPFarm had already been granted the ability to Replace a process level token and Log on as a service – presumably by the SharePoint 2013 Products Configuration Wizard.

I continued looking at the various rights assignments, though, and I discovered one that looked promising: Impersonate a client after authentication. SPDC\svcSPFarm hadn’t been granted that right in my environment, and it seemed to me that such a right might be handy in getting rid of the BAD IMPERSONATION results I was seeing with Process Monitor. I took a leap, granted SPDC\svcSPFarm the ability to Impersonate a client after authentication (as shown on the left), performed an IISRESET, and tried to reach the Central Administration site.

And I’ll be darned if it didn’t actually work.

I don’t normally get lucky like that, but hey – I wasn’t going to argue with it. I browsed around the Central Administration site for a bit to see if the site would remain responsive, and I didn’t notice anything out of the ordinary. I also performed an IISRESET and brought the Central Administration site back up with Process Monitor running just to double-check things. Sure enough, the BAD IMPERSONATION results were gone.
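If you would rather check this right from a prompt than from the snap-in, something like the following sketch should do it. It needs to run from an elevated PowerShell session on the SharePoint server, and it leans on the fact that Impersonate a client after authentication is known internally as SeImpersonatePrivilege. The output file name is just something I picked.

```powershell
# Export the local user rights assignments to a temporary file.
$cfg = Join-Path -Path $env:TEMP -ChildPath 'user-rights.inf'
secedit /export /cfg $cfg /areas USER_RIGHTS | Out-Null

# Show which accounts hold "Impersonate a client after authentication".
# Entries are usually listed as SIDs prefixed with an asterisk.
Get-Content -Path $cfg | Select-String -Pattern '^SeImpersonatePrivilege'
```

Because the entries tend to come back as SIDs, you may need to translate your farm service account to its SID to confirm whether it appears in the list.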

The Fix?

I honestly have no idea whether this problem was specific to my environment or something that might be occurring in other SharePoint 2013 preview environments. I also don’t know if my solution is the “appropriate” solution to resolve the issue. It works for now, but I still have a lot of configuration and actual development work left to do to validate what I’ve implemented.

Since I’m trying to maintain a least-privileges install, though, I’m willing to try this out for a while instead of falling back to placing my farm service account (SPDC\svcSPFarm) in the Administrators group. Placing the account in that group is a last resort for me.

In case you were wondering: I did perform some level of verification on this change. Since the account I was running as (SPDC\s0ladmin) was itself a member of Domain Admins, I created a standard domain user account (SPDC\joe.nobody – he’s always my go-to guy in these situations) and added it to the Farm Administrators group in Central Administration. I then did an IISRESET and opened a browser to the Central Administration site from the domain controller (SP2013-DC) to see if SPDC\joe.nobody could indeed access the site. No troubles. The fact that the SPDC\joe.nobody account wasn’t a member of either Domain Admins or the local Administrators group (on SP2013-WFE) did not block the account from reaching Central Administration. No “An unexpected error has occurred” reared its head.

Implementing the Change

If you are of a similar mindset to me (i.e., you don’t like to elevate privileges unnecessarily) and find yourself unable to reach Central Administration with the same symptoms I’ve described, here is the quick run-through on how to grant your farm/timer service account the Impersonate a client after authentication right as I did:

  1. On your SharePoint Server, go to Start > Administrative Tools > Local Security Policy to open the Local Security Policy MMC snap-in.
  2. When the snap-in opens, navigate (in the left Tree view) to the Security Settings > Local Policies > User Rights Assignment node.
  3. Locate the Impersonate a client after authentication policy in the right-hand pane.
  4. Right-click the policy and select the Properties item that appears in the pop-up menu.
  5. A dialog box will appear. Click the Add User or Group … button on the dialog box.
  6. In the Select Users, Computers, Service Accounts, or Groups dialog box that appears, add your farm service/timer service account.
  7. Click the OK button on each of the two open dialog boxes to exit out of them.
  8. Close the Local Security Policy MMC snap-in.
  9. Perform an IISRESET and verify that the Central Administration site actually comes up instead of the “An unexpected error has occurred” message.
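If you'd rather confirm the result from a script than by eyeballing the snap-in, here is a minimal Python sketch. It assumes you have exported the server's user rights to a text file with something like `secedit /export /cfg rights.inf /areas USER_RIGHTS` and read that file into a string (these exports are typically UTF-16 encoded). The sample text, the function names, and the idea of matching on an account name are my own illustration rather than anything from my actual environment; on a real server the entries are often SIDs rather than names, so you would likely need to compare against the account's SID instead.

```python
# Name of the "Impersonate a client after authentication" right as it
# appears in a secedit export.
IMPERSONATE = "SeImpersonatePrivilege"

# Illustrative sample only (not captured from a real server).
SAMPLE_EXPORT = """\
[Unicode]
Unicode=yes
[Privilege Rights]
SeAssignPrimaryTokenPrivilege = *S-1-5-19,*S-1-5-20,SPDC\\svcSPFarm
SeServiceLogonRight = SPDC\\svcSPFarm
SeImpersonatePrivilege = *S-1-5-19,*S-1-5-20,*S-1-5-32-544,*S-1-5-6
[Version]
signature="$CHICAGO$"
Revision=1
"""


def parse_privilege_rights(inf_text):
    """Return {right_name: [entries]} from the [Privilege Rights] section."""
    rights = {}
    in_section = False
    for raw in inf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            # Track whether we are inside the section we care about.
            in_section = line[1:-1].strip().lower() == "privilege rights"
            continue
        if in_section and "=" in line:
            name, _, value = line.partition("=")
            # Entries are comma-separated; SIDs carry a leading '*'.
            entries = [e.strip().lstrip("*") for e in value.split(",") if e.strip()]
            rights[name.strip()] = entries
    return rights


def has_right(rights, right_name, account):
    """Case-insensitive check for an account name or SID under a right."""
    wanted = account.lower()
    return any(entry.lower() == wanted for entry in rights.get(right_name, []))


if __name__ == "__main__":
    rights = parse_privilege_rights(SAMPLE_EXPORT)
    if has_right(rights, IMPERSONATE, "SPDC\\svcSPFarm"):
        print("Farm account already has the impersonation right.")
    else:
        print("Farm account is missing the impersonation right.")
```

In the sample data the farm account holds Replace a process level token and Log on as a service but not the impersonation right, which mirrors the state I found before making the change.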

Conclusion

If the change that I described in this post and implemented in my environment causes problems or requires further adjustment, I’ll update this post. My goal certainly isn’t to mislead – only to share and hopefully help those who may find themselves in the same situation as me.

If you’ve seen this problem in your SharePoint 2013 preview environment, please let me know. I’d love to hear about it, as well as how you worked through (or around) it!

UPDATE (9/4/2012)

I ran into the same issue with the account that was being used to serve up non-Central Admin site collections; i.e., the account that I was using as the identity for the application pools servicing the web applications I created. In my environment, this was SPDC\svcSpContentWebs as seen below (for the SharePoint – 80 application pool):

IIS Application Pools

Attempts to bring up a site collection without the Impersonate a client after authentication privilege being assigned to the SPDC\svcSpContentWebs account would usually yield nothing more than a blank screen. As with the farm service account, there was very little to go on until I went in with Process Monitor and found a bunch of BAD IMPERSONATION results:

ProcMon for svcSpContentWebs

At this point, I’m willing to bet that any other accounts that are assigned as application pool identities will need to be granted the Impersonate a client after authentication privilege, as well.

In addition to the Impersonate a client after authentication privilege, I also ended up having to grant the SPDC\svcSpContentWebs account the Log on as a batch job privilege from within the Local Security Policy MMC snap-in. Without the privilege to Log on as a batch job, I was receiving an HTTP 503 error every time I tried to bring up a site collection. Troubleshooting this problem wasn’t as difficult, though; the System event log helped, thanks to the following description on the WAS (Windows Process Activation Service) Event 5021 warning that was appearing:

The identity of application pool SharePoint – 80 is invalid. The user name or password that is specified for the identity may be incorrect, or the user may not have batch logon rights. If the identity is not corrected, the application pool will be disabled when the application pool receives its first request.  If batch logon rights are causing the problem, the identity in the IIS configuration store must be changed after rights have been granted before Windows Process Activation Service (WAS) can retry the logon. If the identity remains invalid after the first request for the application pool is processed, the application pool will be disabled. The data field contains the error number.

In my case, my account credentials were correct, but for some reason the Log on as a batch job right hadn’t been assigned to the SPDC\svcSpContentWebs account. Each time the application pool tried to spin up, it failed and was stopped; I’d then get two warnings from WAS (5021 and 5057) in my System event log, and that would be followed by a WAS 5059 error.
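To keep track of which account needed which right, the findings so far can be written down as a small lookup. This is only a sketch of what I observed in my own environment, not an authoritative list of SharePoint requirements: the role labels (`farm`, `app_pool`) and the `missing_rights` helper are hypothetical names of mine, while `SeImpersonatePrivilege` and `SeBatchLogonRight` are the constant names Windows uses for Impersonate a client after authentication and Log on as a batch job.

```python
# Rights I ended up granting by hand, per account role (my observations only).
REQUIRED_RIGHTS = {
    # Farm/timer service account, e.g. SPDC\svcSPFarm
    "farm": {"SeImpersonatePrivilege"},
    # Application pool identity, e.g. SPDC\svcSpContentWebs
    "app_pool": {"SeImpersonatePrivilege", "SeBatchLogonRight"},
}


def missing_rights(role, granted):
    """Return the rights from REQUIRED_RIGHTS[role] that are not in `granted`."""
    return sorted(REQUIRED_RIGHTS[role] - set(granted))


if __name__ == "__main__":
    # Hypothetical state: the app pool identity has impersonation only.
    for right in missing_rights("app_pool", {"SeImpersonatePrivilege"}):
        print("Still needs:", right)
```

An account that comes back with `SeBatchLogonRight` missing would be a candidate for the HTTP 503 and WAS 5021 symptoms described above; one missing `SeImpersonatePrivilege` would be a candidate for the BAD IMPERSONATION results.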

References and Resources

  1. TechNet: Download Microsoft SharePoint 2013 Preview
  2. TechNet: Plan Office Web Apps Server Preview
  3. Blog: Eric Harlan
  4. Blog: Todd Klindt
  5. TechNet: Windows Sysinternals Process Monitor