Revisiting the Basement Datacenter in 2016

Here we are in 2016. If you’ve been following my blog for a while, you might recall a post I threw together back in 2010 called Portrait of a Basement Datacenter. At the time, I was living on the west side of Cincinnati with my wife (Tracy) and three-year-old twins (Brendan and Sabrina). We were kind of shoehorned into that house; there just wasn’t a lot of room. Todd Klindt visited once and had dinner with us. He didn’t say it, but I’m sure he thought it: “gosh, there’s a lot of stuff in this little house.”

[Image: Servers in 2010]

All of my computer equipment (or rather, nearly all of my computer equipment) was in the basement. I had what I called a “basement datacenter,” and it was quite a collection of PCs and servers in varying form factors and with a variety of capabilities.

The image on the right is how things looked in 2010. Just looking at the picture brings back a bunch of memories for me, and it also reminds me a bit of what we (as server administrators) could and couldn’t easily do. For example, nowadays we virtualize nearly everything without a second thought. Six years ago, virtualization technology certainly existed … but it hadn’t hit the level of adoption that it’s cruising at today. I look at all the boxes on the right and think “holy smokes – that’s a lot of hardware. I’m glad I don’t have all of that anymore.” It seemed like I had drives and computers everywhere, and they were all sucking down juice. I had two APC 1600W UPS units acting as battery backups back then. With all the servers plugged in, they were drawing quite a bit of power. And yeah – I had the electric bill to prove it.

So, What’s Changed?

For starters, we now live on the east side of Cincinnati and have a much bigger house than we had way back when. Whenever friends come over and get a tour of the house, they inevitably head downstairs and get to see what’s in the unfinished portion of the basement. That’s where the servers are nowadays, and this is what my basement datacenter looks like in 2016:

[Images: Servers in 2016; Purpose of each server]

In reality, quite a bit has changed. We have much more space in our new house, and although the “server area” is smaller overall, it’s basically a dedicated working area where all I really do is play with tech, fix machines, store parts, etc. If I need to sit at a computer, I go into the gaming area or upstairs to my office. But if I need to fix a computer? I do it here.

In terms of capabilities, the last six years have been good to me.

All Hail The Fiber

Back on the west side of town, I had a BPL (broadband-over-powerline) Internet hookup from Duke Energy and The CURRENT Group. Nowadays, I don’t even know what’s happening with that technology. It looks like Duke Energy may be trying to move away from it? In any case, I know it gave me a symmetric pipe to the Internet, and I think I had about 10Mbps up and down. I also had a secondary DSL connection (from Cincinnati Bell) that was about 2.5Mbps down and 1Mbps up.

Once I moved back to the east side of Cincinnati and Anderson Township, the doors were blown off of the barn in terms of bandwidth. Initially, I signed with Time Warner Cable for a 50Mbps download / 5Mbps upload primary connection to my house. I made the mistake of putting in a business circuit (well, I was running a business), so while it gave me some static IP address options, it ended up costing a small fortune.

[Image: InternetSpeed2016]

My costly agreement with Time Warner ended last year, and for that I’m thankful. Nowadays, I have Cincinnati Bell Fiber coming to my house (Fioptics), and it’s a full-throttle connection. I pay for gigabit download speeds and have roughly a 250Mbps upload pipe. Realistically, the bandwidth varies … but there’s a ton of it, even on a bad day. The image on the right shows the bandwidth to my desktop as I’m typing this post. No, it’s not gigabit (at this moment) … but really, should I complain about 330Mbps download speeds from the Internet? Realistically speaking, some of the slowdown is likely due to my equipment. Running full gigabit Ethernet takes good wiring, quality switches, fast firewalls, and more. You’re only as fast as your slowest piece of equipment.

I do keep a backup connection with Time Warner Cable in case the fiber goes down, and my TMG firewall does a great job of failing over to that backup connection if something goes wrong. And yes, I’ve had a problem with the fiber once or twice. But it’s been resolved quickly, and I was back up in no time. Frankly, I love Cincinnati Bell’s fiber.

What About Storage?

[Image: ProRaid]

In the last handful of years, storage limits have popped over and over again. You can buy 8TB drives on Amazon.com right now, and they’re not prohibitively expensive. We’ve come a long way in just a half dozen years, and the limits just keep expanding.

I have a bunch of storage downstairs, and frankly I’m pretty happy with it. I’ve graduated from the random drives and NAS appliances that used to occupy my basement. These days, I use Mediasonic RAID enclosures. You pop some drives in, connect an eSATA cable (or USB cable, if you have to), and away you go. They’ve been great self-contained pass-through drive arrays for specific virtual machines running on my Hyper-V hosts.  I’ve been running the Mediasonic arrays for quite a few years now, and although this isn’t a study in “how to build a basement datacenter,” I’d recommend them to anyone looking for reliable storage enclosures. I keep one as a backup unit (because eventually one will die), and as a group they seem to be in good shape at this point in time. The enclosures supply the RAID-5 that I want (and yeah, I’ve had *plenty* of drives die), so I’ve got highly-available, hot-swappable storage where I need it.

Oh, and don’t mind the minions on my enclosures. Those of you with children will understand. Those who don’t have children (or who don’t have children in the appropriate age range) should either just wait it out or go watch Despicable Me.

Hey? What About The Cloud?

[Image: Servers and their shelf]

The astute will ask “why are you putting all this hardware in your house instead of shifting to the cloud?” You know, that’s a good question. I work for Cardinal Solutions Group, and we’re a Microsoft managed partner with a lot of Office 365 and Azure experience. Heck, I’m Cardinal’s National Solution Manager for Office 365, so The Cloud is what I think about day-in and day-out.

First off, I love the cloud. For enterprise-scale engagements, the cloud (and Microsoft’s Azure capabilities, in particular) is awesome. Microsoft has done a lot to make it easier (not “easy,” but “easier”) for us to build for the cloud, put our stuff (like pictures, videos, etc.) in the cloud, and get things off of our thumb drives and backup boxes and into a place where they are protected, replicated, and made highly available.

What I’m doing in my basement doesn’t mean I’m “avoiding” the cloud. Actually, I moved my family onto an Office 365 plan to give them email and capabilities they didn’t have before. My kids have their first email addresses now, and they’re learning how to use email through Office 365. I’m going to move the SharePoint site collection that I maintain for our family (yes, I’m that big of a geek) over to SharePoint Online because I don’t want to wrangle with it at home any longer. Keeping SharePoint running is a pain-in-the-butt, and I’m more than happy to hand that over to the Office 365 folks.

I’ll still be tinkering with SharePoint VMs for sure with the work I do, but I’m happy to turn over operational responsibility to Microsoft for my family’s site collection.

The Private Cloud

[Image: ServerShelfLeft]

So even though I believe in The Cloud (i.e., “the big cloud that’s out there with all of our data”), I also believe in the “private cloud,” “personal cloud,” or whatever you want to call it. When I work from the Cardinal office, my first order of business is to VPN back to my house (again, through my TMG Firewall – they’ll have to pry it from my cold, dead hands) so that I have access to all of my files and systems at home.

Accessing stuff at home is only part of it, though. The other part is just knowing that I’m going through my network, interacting with my systems, and still feeling like I have some control in our increasingly disconnected world. My Plex server is there, and my file shares are available, and I can RDP into my desktop to leverage its power for something I’m working on. There’s a comfort in knowing my stuff is on my network and servers.

Critical data makes it to the cloud via OneDrive, Dropbox, etc., but I still can’t afford to pay for all of my stuff to be in the cloud. Prices are dropping all of the time, though. Will I ever give up my basement datacenter? Probably not, because maintaining it helps me keep my technical skills sharpened … but it’s also a labor of love.

Additional Reading and References

  1. Blog Post: Portrait of a Basement Datacenter
  2. Blog: Todd Klindt’s SharePoint Admin Blog
  3. Department of Justice: Current Group Broadband Overview
  4. Site: Cincinnati Bell Fioptics
  5. TechNet: Threat Management Gateway
  6. Amazon.com: Seagate Archive 8 TB Internal Hard Drive
  7. Amazon.com: Mediasonic PRORAID Drive Enclosure
  8. Amazon.com: Despicable Me
  9. Company: Cardinal Solutions Group

Bare Metal Bugaboos

Having recently recovered from a firewall outage using Windows Server 2008’s bare metal restore capabilities, I figured I’d write a quick post to cover how I did it. I also cover one really big learning I picked up as a result of the process.

I had one of those “aw nuts” moments last night.

At some point yesterday afternoon, I noticed that none of the computers in the house could get out to the Internet.  After verifying that my wireless network was fine and that internal DNS was up-and-running, I traced the problem back to my Forefront Threat Management Gateway (TMG) firewall.  Attempting to RDP into it proved fruitless, and when I went downstairs and looked at the front of the server, I noticed the hard drive activity light was constantly lit.

So, I powered the server off and brought it back on.  Problem solved … well, not really. It happened again a couple of hours later, so I repeated the process and made a mental note that I was going to have to look at the server when I had a chance.

Demanding My Attention

Well, things didn’t “stay fixed.”  Later in the evening, the same lack of connectivity surfaced again.  I went to the basement, powered the server off, and brought it back up.  That time, though, the server wouldn’t start and complained about having nothing to boot from.

As I did a reset and watched it boot again, I could see the problem: although the server knew that something was plugged in for boot purposes, it couldn’t tell that what was plugged in was a 250GB SATA drive.  Ugh.

When I run into those types of situations, the remedy is pretty clear: a new hard drive.  I always have a dozen or more hard drives sitting around (comes from running a server farm in the basement), and I grabbed a 500GB Hitachi drive that I had left over from another machine.  Within five minutes, the drive was in the server and everything was hooked back up.

Down to the Metal

Of course, a new hard drive was only half of the solution.  The other half of the equation involved restoring from backup.  In this case, a bare metal restore from backup was the most appropriate course of action since I was starting with a blank disk.

For those who may not be familiar with the concept of bare metal restoration, you can get a quick primer from Wikipedia.  I use Microsoft’s System Center Data Protection Manager 2010 (DPM) to protect the servers in my environment, so I knew that I had an image from which I could restore my TMG box.  I just dreaded the thought of doing so.

Why the worry?  Well, I think Arthur C. Clarke summed it up best with the following quote:

Any sufficiently advanced technology is indistinguishable from magic.

The Cold Sweats

Now bare metal restore isn’t “magic,” but it is relatively sophisticated technology … and it’s still an area that seems plagued with uncertainties.

I have to believe that I’m not the only one who feels this way.  I’ve co-authored two books on SharePoint disaster recovery, and the second book includes a chapter I wrote that covers bare metal restore on a Windows 2008 server.  My experience with bare metal restores can be summarized as follows: when it works, it’s awesome … but it doesn’t always work as we’d want it to.  When it doesn’t work, it’s plain ol’ annoying in that it doesn’t explain why.

So, it’s with that mindset that I started the process of trying to clear away my server’s lobotomized state.  These are the steps I carried out to get ready for the restore:

  1. [Image: DPM console] I went into the DPM console, selected the most recent bare metal restore recovery point available to me (as shown on the right), and restored the contents of the folder to a network file share – in my case, \\VMSS-FILE1\RESTORE. Note: you’ll notice a couple of restore points available after the one I selected; those were created in the time since I did the restore but before I wrote this post.
  2. The approximately 21GB bare metal restore image was created on the share.  I do have gigabit Ethernet on my network, and since I recently built out a new DPM server with faster hardware, it really didn’t take too long to get the image restored to the designated file share – maybe five minutes or so.  The result was a single folder in the designated file share.
  3. [Image: Folder structure for restore share] I carried out a little manipulation on the folder that DPM created; specifically, I cut out two levels of sub-folders and made sure that the WindowsImageBackup folder was available directly from the top of the share as shown at the left.  The Windows Recovery Environment (or WinRE) is picky about this detail; if it doesn’t see the folder structure it expects when restoring from a network share, it will declare that nothing is available for you to restore from – even though you know better.
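I did the folder shuffle in step 3 by hand, but it could be scripted. What follows is only a minimal Python sketch under some assumptions: the function name is made up for illustration, the share is reachable as an ordinary path, and exactly one WindowsImageBackup folder exists somewhere beneath it. It is not something DPM or WinRE provides.

```python
import shutil
from pathlib import Path

# Folder name the post says WinRE expects at the top of the share.
IMAGE_FOLDER = "WindowsImageBackup"


def promote_image_folder(share_root: Path) -> Path:
    """Move a nested WindowsImageBackup folder up to the top of share_root.

    Hypothetical helper: it assumes exactly one such folder exists somewhere
    below share_root and refuses to guess otherwise.
    """
    target = share_root / IMAGE_FOLDER
    if target.is_dir():
        # Already where WinRE wants it; nothing to do.
        return target

    matches = [p for p in share_root.rglob(IMAGE_FOLDER) if p.is_dir()]
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one {IMAGE_FOLDER} folder, found {len(matches)}"
        )

    shutil.move(str(matches[0]), str(target))
    return target
```

Something like promote_image_folder(Path(r"\\VMSS-FILE1\RESTORE")) would then leave the image folder at the top of the share. The intermediate folders DPM created are left behind (empty) rather than deleted, which seemed like the safer default for a restore share.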

In Recovery

With my actual restore image ready to go on the file share, I booted into the WinRE using a bootable USB memory stick with Windows Server 2008 R2 on it.  I walked through the process of selecting Repair your computer, navigating out to the file share, choosing my restore image, etc.  The process is relatively easy to stumble through, but if you want it in a lot of detail, I’d encourage you to read Chapter 5 (Windows Server 2008 Backup and Restore) in our SharePoint 2010 Disaster Recovery Guide.  In that chapter, I walk through the restore process in step-by-step fashion with screenshots.

[Image: Additional restore options dialog]

I got to the point in the wizard where I was prompted to select additional options for restore as shown on the left.  By default, the WinRE will format and repartition disks as needed.  In my case, that’s what I wanted; after all, I was putting a brand new drive in (one that was larger than the original), so formatting and partitioning was just what the doctor ordered.  I also had the ability to exclude some drives (through Exclude disks) from the recovery process – not something I had to worry about given that my system image only covered one hard drive.  If my hard drive required additional drivers (as might be needed with a drive array, RAID card, or something equivalent), I also had the opportunity to supply them with the Install drivers option.  Again, this was a basic in-place restore; the only thing I needed was a clean-up of the hard drive I supplied, so I clicked Next.

[Image: Confirmation dialog]

I confirmed the details of the operation on the next page, and everything looked right to me.  I then paused to mentally double-check everything I was about to do.

In my experience, the dialog on the left is the last point of easily grasped normal wizard activity before the WinRE restore wizard takes off and we enter “magic land.”  As I mentioned, when restores work … they just chug right along and it looks easy.  When bare metal and system state restores don’t work, though, the error messages are often unintelligible and downright useless from a troubleshooting and remediation perspective.  I hoped that my restore would be one of the happy restores that chugged right along and made me proud of my backup and restore prowess.

I crossed my fingers and clicked the Next button.

<Insert Engine Dying Noises Here>

[Image: A picture of the restore going belly-up]

The screenshot on the right shows what happened almost immediately after I clicked Next.

Well, you knew this blog post would be a whole lot less interesting if everything went according to plan.

Once I worked through my panic and settled down, I looked a little closer.  I understood “The system image restore failed” without much interpretation, but I had no idea what to make of this:

Error details: The parameter is incorrect. (0x80070057)

That was the extent of what I had to work with.  All I could do was close out and try again.  Sheesh.
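For what it’s worth, the hex value can at least be taken apart. The short Python sketch below splits an HRESULT into its standard fields; the function name is mine, and the two constants are the standard Windows values for FACILITY_WIN32 and ERROR_INVALID_PARAMETER. Decoding it only confirms what the dialog already said in words. It does nothing to explain which parameter was incorrect.

```python
FACILITY_WIN32 = 7            # standard facility code for wrapped Win32 errors
ERROR_INVALID_PARAMETER = 87  # Win32 error text: "The parameter is incorrect."


def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility, and error code."""
    return {
        "failure": bool((value >> 31) & 1),  # top bit set means failure
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility field
        "code": value & 0xFFFF,              # low 16 bits: the error code
    }


parts = decode_hresult(0x80070057)
print(parts)  # {'failure': True, 'facility': 7, 'code': 87}
```

So 0x80070057 is simply Win32 error 87 wrapped in an HRESULT: a generic “bad argument” result that could have come from almost anywhere in the restore pipeline.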

Head Scratching

[Image: Advanced options dialog]

Let’s face it: there aren’t a whole lot of options to play with in the WinRE when it comes to bare metal restore.  The screenshot on the left shows the Advanced options you have available to you, but there really isn’t much to them.  I experimented with the Automatically check and update disk error information checkbox, but it really didn’t have an effect on the process.  Nevertheless, I tried restores with all combinations of the checkboxes set and cleared.  No dice.

With the Advanced options out of the way, there was really only one other place to look: the Exclude disks dialog.  I knew Install drivers wasn’t needed, because I had no trouble accessing my disks and wasn’t using anything like a RAID card or some other advanced disk configuration.

[Image: Disk exclusion dialog]

I popped open the disk exclusion dialog (shown on the right) and tried running a restore after excluding all of the disks except the Hitachi disk to which I would be writing data (Disk 2).  Again, no dice – I continued to get the aforementioned error and couldn’t move forward.

I knew that DPM created usable bare metal images, and I knew that the WinRE worked when it came to restoring those images, so I knew that I had to be doing something wrong.  After another half an hour of goofing around, I stopped my thrashing and took stock of what I had been doing.

My Inner Archimedes

My eureka moment came when I put a few key pieces of information together:

  • While writing the chapter on Windows Server 2008 Backup and Restore for the SharePoint 2010 DR book, I’d learned that image restores from WinRE are very persnickety about the number of disks you have and the configuration of those disks.
  • When DPM was creating backups, only three hard drives were attached to the server: the original 250GB system drive and two 30GB SSD caching drives.
  • Booting into WinRE from a memory stick was causing a distinctly visible fourth “drive” to show up in the list of available disks.

The bootable USB stick had to be a factor, so I put it away and pulled out a Windows Server 2008 R2 installation disk.  I then booted into the WinRE from the DVD and walked through the entire restore process again.  When I got to the confirmation dialog and pressed the Next button this time around, I received no “The parameter is incorrect” errors – just a progress bar that tracked the restore operation.

Takeaway

The one point that’s going to stick with me from here on out is this: if I’m doing a bare metal restore, I need to be booting into the WinRE from a DVD or from some other source that doesn’t affect my drives list.  I knew that the disks list was sensitive on restore, but I didn’t expect USB drives to have any sort of effect on whether or not I could actually carry out the desired operation.  I’m glad I know better now.
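The sanity check I’ll be doing in my head from now on looks something like the Python sketch below. To be clear about the assumptions: the function name and the disk labels are illustrative, the lists are typed in by hand from what the Exclude disks dialog shows, and I don’t actually know that WinRE compares disk counts internally. All I observed is that the error went away once the extra USB “drive” was gone.

```python
from typing import Sequence


def extra_disk_count(disks_at_backup: int, disks_visible_now: Sequence[str]) -> int:
    """Return how many more disks are visible now than when the backup was taken."""
    return len(disks_visible_now) - disks_at_backup


# Three disks were attached when DPM took its backups (system drive + two SSDs).
DISKS_AT_BACKUP = 3

booted_from_usb = ["500GB Hitachi", "30GB SSD", "30GB SSD", "USB boot stick"]
booted_from_dvd = ["500GB Hitachi", "30GB SSD", "30GB SSD"]

print(extra_disk_count(DISKS_AT_BACKUP, booted_from_usb))  # 1
print(extra_disk_count(DISKS_AT_BACKUP, booted_from_dvd))  # 0
```

Anything other than zero is my cue to change how I’m booting into the WinRE before I even bother clicking Next.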

Additional Reading and References

  1. Product Overview: Forefront Threat Management Gateway 2010
  2. Wikipedia: Bare-metal restore
  3. Product Overview: System Center Data Protection Manager 2010
  4. Book: SharePoint 2010 Disaster Recovery Guide

DPM, RPC, and DCOM with Forefront TMG 2010

In this post, I discuss a couple of DCOM/RPC snags I ran into while configuring Microsoft’s Data Protection Manager (DPM) 2007 client protection agent to run on my new Forefront Threat Management Gateway (TMG) 2010 server. I also walk through the troubleshooting approach I took to resolve the issues that appeared.

In a recent post, I was discussing my impending move to Microsoft’s Forefront Threat Management Gateway (TMG) 2010 on my home network.  As part of the move, I was going to decommission two Microsoft Internet Security and Acceleration (ISA) 2006 servers and an old Windows Server 2008 remote access services (RAS) box and replace them with a single TMG 2010 server – a big savings in terms of server maintenance and power consumption.

I completed the move about a week ago, and I’ve been very happy with TMG thus far.  TMG’s ISP redundancy and load balancing features have been fantastic, and they’ve allowed me to use my Internet connections much more effectively.

As a user of ISA since its original 2000 release, I also had no problem jumping in and working with the TMG management GUI.  It was all very familiar from the get-go.  Call me “very satisfied” thus far.

This afternoon, I took a few moments to un-join the old ISA servers from the domain, power them down, and clean things up.  I had also planned to take a little time integrating the new TMG box into my Data Protection Manager (DPM) 2007 backup rotation.  Unfortunately, though, the DPM integration took a bit longer than expected …

Backup Brigade

For those unfamiliar with the operation of DPM, I’ll take a couple of moments to explain a bit about how it works.  In order for DPM to do its thing, any computer that is going to be protected must have the DPM 2007 Protection Agent installed on it.  Once the DPM Protection Agent is installed and configured, the DPM server communicates through the agent (which operates as a Windows service) to leverage the protected system’s Volume Shadow Copy Service (VSS) for backups.

Installing the DPM agent typically isn’t a big challenge for common client computers, and it can be accomplished directly from within the DPM management GUI itself.  When the agent is installed through the GUI, DPM connects to the computer to be protected, installs the agent, and configures it to point back to the DPM server.  No manual intervention is required.

On some systems, though, it’s simply easier to install and configure the agent directly from the to-be-protected system itself.  A locked-down server (like a TMG box) falls into this category, so I manually put the agent installation package on the TMG server, ran it, and then ran the follow-up Attach-ProductionServer PowerShell command (from the DPM Management Shell) on the DPM server.  The install proceeded without issue, and the associated attach (on the DPM server) went off without a hitch.  I thought I was good to go.

I fired up the management GUI on the DPM Server, went into the Agents tab under Management, and discovered that I couldn’t connect to the TMG server.

[Image: DPM Agent Issue]

The Checklist

The fact that I couldn’t connect to the TMG server (SS-TMG1) from my DPM box was a bit of an eyebrow lifter, but it wasn’t entirely unexpected.  Communication between a DPM server and the DPM agent leverages DCOM, and I’d had to jump through a few hoops to ensure that the DPM server could communicate with the ISA boxes previously.

I suspected that an RPC/DCOM issue was in play, but I was having trouble seeing where the problem might be. So, I reviewed where things stood.

  • Without an explicit firewall exception, Windows Firewall will block communication between a DPM server and its agents.  I confirmed that Windows Firewall wasn’t in play and that TMG itself was handling all of the firewall action.
  • Examining TMG, I confirmed that I had a rule in place that permitted all traffic between my DPM server (SS-TOOLS1) and the TMG box itself.
    [Image: DPM-TMG Access Rule]
  • [Image: DPM-TMG Access Rule (with RPC)] Strict RPC compliance is another potential problem with DPM on both ISA and TMG, as requiring strict compliance blocks any DCOM traffic.  DCOM (and any other traffic that doesn’t explicitly begin RPC exchanges by communicating with the RPC endpoint mapper on the target server) gets dropped by the RPC Filter unless the checkbox for Enforce strict RPC compliance is unchecked.  I confirmed that my rule wasn’t requiring strict compliance (as shown on the right).
  • I made sure that my DPM server wasn’t listed as a member of either the Enterprise Remote Management Computers or Remote Management Computers Computer Sets in TMG.  These two Computer Sets are singled out by a couple of TMG System Policy rules that can affect their ability to call into TMG via RPC and DCOM.
  • I reviewed all System Policy rules that might impact inbound RPC calls to the TMG server, and I couldn’t find any that would (or should) be influencing DPM’s ability to connect to its agent.
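One cheap check that fits alongside the list above is a plain TCP connection test against the RPC endpoint mapper, which listens on TCP port 135. The Python sketch below is only that: a reachability probe with a function name I made up. A successful connection says nothing about the dynamically assigned ports that DCOM goes on to use, so treat a True result as “not obviously blocked” rather than “working.”

```python
import socket

RPC_ENDPOINT_MAPPER_PORT = 135  # well-known TCP port for the RPC endpoint mapper


def can_connect(host: str, port: int = RPC_ENDPOINT_MAPPER_PORT,
                timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Host names are the ones used in this post; run the check from each
    # machine toward the other, since a firewall can treat directions differently.
    for target in ("SS-TMG1", "SS-TOOLS1"):
        print(target, can_connect(target))
```

Running it in both directions matters: a rule that lets one side call in does not necessarily cover traffic going the other way.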

I also went out to the Forefront TMG Product Team’s Blog to see what advice they might have to offer, and I found this excellent article on RPC and TMG – well worth a read if you’re trying to troubleshoot RPC problems.  Unfortunately, it didn’t offer me any tips that would help in my situation.

Watching The Traffic

I may have simply had tunnel vision, but I was still obsessed with the notion that strict RPC checking was causing my problems.  To see if it was, I decided to fire up TMG’s live logging and see what was happening when DPM tried to connect to its agent.  I set the logging to display only traffic that originated from the DPM box, and this is what I saw.

[Image: Watching Traffic from DPM]

There was nothing wrong that I could see.  My access rule was clearly being utilized (the one that doesn’t enforce strict RPC checking), and I wasn’t seeing any errors – just connection initiations and closes.  Traffic from DPM to TMG looked clean.

Taking A Step Back

I was frustrated, and I clearly needed to consider the possibility that I didn’t have a good read on the problem.  So, I went to the Windows Application event log to see if it might provide some insight.  I probably should have started with the event logs instead of TMG itself and firewall rules … but better late than never, I figured.

[Image: DPM Agent Can't Communicate]

Popping open Event Viewer, I was greeted with the image you see on the left.  What I saw was enlightening, for I did have a problem with communication between the DPM agent and the DPM server.  The part that intrigued me, though, was the fact that the problem was with outbound communication (that is, from TMG server to DPM server) – not the other way around as I had originally suspected.  All of my focus had been on troubleshooting traffic coming into TMG because I’d been interpreting the errors I’d seen to mean that the DPM server couldn’t reach the agent – not that the agent couldn’t “phone home,” so to speak.

I knew for a fact that the DPM Server, SS-TOOLS1, didn’t have the Windows Firewall service running.  Since the service wasn’t running, there was no way that the agent’s attempts to communicate with DPM could (or rather, should) be getting blocked at the destination.  That left the finger of blame pointing at TMG.

On The Way Out

I decided to repeat the traffic watching exercise I’d conducted earlier, but instead of watching traffic coming into the TMG box from my DPM server, I elected to watch traffic going the other direction – from TMG to DPM.  Here’s what I saw:

[Image: Watching Traffic to DPM]

The “a-ha” moment for me came when I saw the firewall rule that was actually governing RPC traffic to the DPM box from TMG.  It wasn’t the DPM All <=> SS-TMG1 rule I’d established — it was a system policy rule called [System] Allow RPC from Forefront TMG to trusted servers.

System policy rules are normally hidden in the firewall policy tab, so I had to explicitly show them to review them.  Once I did, there it was – rule 22.

[Image: System Policy Rule 22]

Note that this rule applies to all traffic from the TMG server to the Internal network; I’ll be talking about that more in a bit.

[Image: System Policy Editor]

I couldn’t edit the rule in-place; I needed to use the System Policy editor.  So, I fired up the System Policy Editor and traced the rule back to its associated configuration group.  As it turned out, the rule was tied to the Active Directory configuration group under Authentication Services.

As the picture on the left clearly shows, the Enforce strict RPC compliance checkbox was checked.  Once I unchecked it and applied the configuration change, the DPM agent began communicating with the DPM server without issue.  Problem solved.

What Happened?

I was fairly sure that I hadn’t experienced this sort of trouble installing the DPM Protection Agent under ISA Server 2006, so I tried to figure out what might have happened.

I didn’t recall having to adjust the target system policy under ISA when installing the DPM agent originally, but a quick boot and check of my old ISA server revealed that the checkbox was indeed unchecked (meaning that strict RPC compliance wasn’t being enforced).  I’d apparently made the change at some point and forgotten about it.  I suspect I’d messed with it in the distant past while working on passing AD information through ISA, getting VPN functionality up-and-running, or perhaps something else.

Implications

Bottom line: TMG enforces strict compliance for RPC traffic that originates on the TMG server (Local Host) and is destined for the Internal network.  Since System Policy Rules are applied before administrator-defined Firewall Policy Rules, RPC traffic from the TMG server to the Internal network will always be governed by the system policy unless that policy is disabled.

In this particular scenario, the DPM 2007 Protection Agent’s operation was impacted.  Even though I’d created a rule that I thought would govern interactions between DPM and TMG, the reality is that it only governed RPC traffic coming into TMG – not traffic going out.

In reality, any service or application that sends DCOM traffic originating on the TMG server to the Internal network is going to be affected by the Allow RPC from Forefront TMG to trusted servers rule unless the associated system policy is adjusted.
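The ordering issue is easier to see in a toy model. The Python sketch below is emphatically not TMG’s rule engine: it ignores protocols, users, and everything else TMG really evaluates, and the Rule class, the host-to-network mapping, and the function name are all assumptions made for illustration. It only shows ordered, first-match evaluation using the rule names and host names from this post.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence

# Toy mapping of hosts to the TMG network each one belongs to.
NETWORK_OF = {"SS-TMG1": "Local Host", "SS-TOOLS1": "Internal"}


@dataclass(frozen=True)
class Rule:
    name: str
    sources: FrozenSet[str]       # host names or network names
    destinations: FrozenSet[str]
    strict_rpc: bool              # "Enforce strict RPC compliance" checkbox


def governing_rule(rules: Sequence[Rule], src: str, dst: str) -> Optional[Rule]:
    """Return the first rule, in list order, that covers traffic from src to dst."""
    for rule in rules:
        src_ok = src in rule.sources or NETWORK_OF[src] in rule.sources
        dst_ok = dst in rule.destinations or NETWORK_OF[dst] in rule.destinations
        if src_ok and dst_ok:
            return rule
    return None


SYSTEM_RPC = Rule(
    "[System] Allow RPC from Forefront TMG to trusted servers",
    frozenset({"Local Host"}), frozenset({"Internal"}), strict_rpc=True,
)
DPM_RULE = Rule(
    "DPM All <=> SS-TMG1",
    frozenset({"SS-TOOLS1", "SS-TMG1"}), frozenset({"SS-TOOLS1", "SS-TMG1"}),
    strict_rpc=False,
)

# System policy rules are evaluated before administrator-defined rules.
POLICY = [SYSTEM_RPC, DPM_RULE]

print(governing_rule(POLICY, "SS-TOOLS1", "SS-TMG1").name)  # DPM All <=> SS-TMG1
print(governing_rule(POLICY, "SS-TMG1", "SS-TOOLS1").name)  # the [System] rule
```

Inbound traffic from the DPM server falls through to my own rule, while outbound traffic from the TMG box is claimed by the system policy rule first, strict RPC checkbox and all. That asymmetry is the whole story of this post in about forty lines.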

Conclusion

The core findings of this post have been documented by others (in a variety of forms/scenarios) for ISA, but this go-round with TMG and the DPM association caught me off-guard such that I thought it would be worth sharing my experience with other firewall administrators.  If anyone else moving to TMG takes the “build it from the ground up” approach that I did, then the system policy I’ve been discussing may get missed.  Hopefully this post will serve as a good lesson (or reminder for veteran firewall administrators).

UPDATE (10/26/2010)

Thomas K. H. Bittner (former MVP from Germany who runs the Windows Server System Reference Architecture blog) contacted me and shared a blog post he wrote on configuring DPM 2010 and TMG communication; the post can be found here: http://msmvps.com/blogs/wssra/archive/2010/10/20/configure-the-forefront-tmg-2010-to-allow-dpm-2010-communication.aspx.  Thomas’ post is fantastic in that it is extremely granular in its configuration of communication channels between TMG and DPM.  If you would prefer to lock things down more securely than I demonstrate, then I highly recommend checking out the post that Thomas wrote.

Additional Reading and References

  1. Recent Post: Portrait of a Basement Datacenter
  2. Microsoft: Forefront Threat Management Gateway 2010
  3. Microsoft: Internet Security and Acceleration Server 2006
  4. Microsoft: System Center Data Protection Manager
  5. Forefront TMG Product Team Blog: RPC Filter and “Enable strict RPC compliance”

Portrait of a Basement Datacenter

In this post, I take a small detour from SharePoint to talk about my home network, how it has helped me to grow my skill set, and where I see it going.

Whenever I’m speaking to other technology professionals about what I do for a living, there’s always a decent chance that the topic of my home network will come up.  This seems to be particularly true when talking with up-and-coming technologists, as I’m commonly asked by them how I managed to get from “Point A” (having transitioned into IT from my previous life as a polymer chemist) to “Point B” (consulting as a SharePoint architect).

I thought it would be fun (and perhaps informative) to share some information, pictures, and other geek tidbits on the thing that seems to consume so much of my “free time.”  This post also allows me to make good on the promise I made to a few people to finally put something online for them to see.

Wait … “Basement Datacenter?”

For those on Twitter who may have seen my occasional use of the hashtag #BasementDatacenter: I can’t claim to have originated the term, though I fully embrace it these days.  The first time I heard the term was when I was having one of the aforementioned “home network” conversations with a friend of mine, Jason Ditzel.  Jason is a Principal Consultant with Microsoft, and we were working together on a SharePoint project for a client a couple of years back.  He was describing his love for his recently acquired Windows Home Server (WHS) and how I should have a look at the product.  I described why WHS probably wouldn’t fit into my network, and that led Jason to comment that Microsoft would have to start selling “Basement Datacenter Editions” of its products.  The term stuck.

So, What Does It Look Like?

[Images: Basement Datacenter - Legend; Basement Datacenter - Front Shot]

Two pictures appear on the right.  The left-most shot is a picture of my server shelves from the front.  Each of the computing-related items in the picture is labeled in the right-most shot.  There are obviously other things in the pictures, but I tried to call out the items that might be of some interest or importance to my fellow geeks.

[Image: Behind The Servers]

Generally speaking, things look relatively tidy from the front.  Of course, I can’t claim to have the same degree of organization in the back.  The shot on the left displays how things look behind and to the right of the shots that were taken above.  All of the power, network, and KVM cabling runs are in the back … and it’s messy.  I originally had things nicely organized with cables of the proper length, zip ties, and other aids.  Unfortunately, servers and equipment shift around enough that the organization system wasn’t sustainable.

While doing the network planning and subsequent setup, I’m happy that I at least had the foresight to leave myself ample room to move around behind the shelves.  If I hadn’t, my life would be considerably more difficult.

On the topic of shelves: if you ever find yourself in need of extremely heavy duty, durable industrial shelves, I highly recommend this set of shelves from Gorilla Rack.  They’re pretty darn heavy, but they’ll accept just about any amount of weight you want to put on them.

I had to include the shot below to give you a sense of the “ambiance.”

Under The Cover Of Colorful Lighting

Anyone who’s been to my basement (which I lovingly refer to as “the bunker”) knows that I have a thing for dim but colorful lighting.  I normally illuminate my basement area with Christmas lights, colored light bulbs, etc.  Frankly, things in the basement are entirely too ugly (and dusty) to be viewed under normal lighting.  It may be tough to see from this shot, but the servers themselves contribute some light of their own.

Why On Earth Do You Have So Many Servers?

After seeing my arrangement, the most common question I get is “why?”  It’s actually an easy one to answer, but to do so requires rewinding a bit.

Many years ago, when I was a “young and hungry” developer, I was trying to build a skill set that would allow me to work in the enterprise – or at least on something bigger than a single desktop.  Networking was relatively new to me, as was the notion of servers and server-side computing.  The web had only been visual for a while (anyone remember text-based surfing?  Quite a different experience …), HTML 3 was the rage, Microsoft was trying to get traction with ASP, ActiveX was the cool thing to talk about (or so we thought), etc.

It was around that time that I set up my first Windows NT4 server.  I did so on the only hardware I had left over from my first Pentium purchase – a humble 486 desktop.  I eventually got the server running, and I remember it being quite a challenge.  Remember: Google and “answers at your fingertips” weren’t available a decade or more ago.  Servers and networking also weren’t as forgiving and self-correcting as they are nowadays.  I learned an awful lot while troubleshooting and working on that server.

Before long, though, I wanted to learn more than was possible on a single box.  I wanted to learn about Windows domains, I wanted to figure out how proxies and firewalls worked (anyone remember Proxy Server 2.0?), and I wanted to start hosting online Unreal Tournament and Half Life games for my friends.  With everything new I learned, I seemed to pick up some additional hardware.

When I moved out of my old apartment and into the house that my wife and I now have, I was given the bulk of the basement for my “stuff.”  My network came with me during the move, and shortly after moving in I re-architected it.  The arrangement changed, and of course I ended up adding more equipment.

Fast-forward to now.  At this point in time, I actually have more equipment than I want.  When I was younger and single, maintaining my network was a lot of fun.  Now that I have a wife, kids, and a great deal more responsibility both in and out of work, I’ve been trying to re-engineer things to improve reliability, reduce size, and keep maintenance costs (both time and money) down.

I can’t complain too loudly, though.  Without all of this equipment, I wouldn’t be where I’m at professionally.  Reading about Windows Server, networking, SharePoint, SQL Server, firewalls, etc., has been important for me, but what I’ve gained from reading pales in comparison to what I’ve learned by *doing*.

How Is It All Set Up?

I actually have documentation for most of what you see (ask my Cardinal SharePoint team), but I’m not going to share that here.  I will, however, mention a handful of bullets that give you an idea of what’s running and how it’s configured.

  • I’m running a Windows 2008 domain (recently upgraded from Windows 2003)
  • With only a couple of exceptions, all the computers in the house are domain members
  • I have redundant ISP connections (DSL and BPL) with static IP addresses so I can do things like my own DNS resolution
  • My primary internal network is gigabit Ethernet; I also have two 802.11g access points
  • All my equipment is UPS protected because I used to lose a lot of equipment to power irregularities and brownouts
  • I believe in redundancy.  Everything is backed up with Microsoft Data Protection Manager, and in some cases I even have redundant backups (e.g., with SharePoint data).

There’s certainly a lot more I could cover, but I don’t want to turn this post into more of a document than I’ve already made it.

Fun And Random Facts

Some of these are configuration related, some are just tidbits I feel like sharing.  All are probably fleeting, as my configuration and setup are constantly in flux:

Beefiest Server: My SQL Server, a Dell T410 with quad-core Xeon and about 4TB worth of drives (in a couple of RAID configurations)

Wimpiest Server: I’ve got some straggling Pentium 3, 1.13GHz, 512MB RAM systems.  I’m working hard to phase them out as they’re of little use beyond basic functions these days.

Preferred Vendor: Dell.  I’ve heard plenty of stories from folks who don’t like Dell, but quite honestly, I’ve had very good luck with them over the years.  About half of my boxes are Dell, and that’s probably where I’ll continue to shop.

Uptime During Power Failure: With my oversize UPS units, I’m actually good for about an hour’s worth of uptime across my whole network during a power failure.  Of course, I have to start shutting down well before that (to ensure graceful power-off).

Most Common Hardware Failure: Without a doubt, I lose power supplies far more often than any other component.  I think that’s due in part to the age of my machines, the fact that I haven’t always bought the best equipment, and a couple of other factors.  When a machine goes down these days, the first thing I test and/or swap out is a power supply.  I keep at least a couple spares on-hand at all times.

Backup Storage: I have a ridiculous amount of drive space allocated to backups.  My DPM box alone has 5TB worth of dedicated backup storage, and many of my other boxes have additional internal drives that are used as local backup targets.

Server Paraphernalia: Okay, so you may have noticed all the “junk” on top of the servers.  Trinkets tend to accumulate there.  I’ve got a set of Matrix characters (Agent Smith and Neo), a Pip-Boy (of Fallout fame), Cheshire Cat and Alice (from American McGee’s Alice game), a Warhammer mech (one of the Battletech originals), a “cat in the bag” (don’t ask), a multimeter, and other assorted stuff.

Cost Of Operation: I couldn’t begin to tell you, though my electric bill is ridiculous (last month’s was about $400).  Honestly, I don’t want to try to calculate it for fear of the result inducing some severe depression.
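For anyone braver than I am, the arithmetic itself is simple: average draw in watts, times hours of operation, times the utility rate.  Here is a minimal sketch of that estimate.  The `monthly_cost` function is a hypothetical helper, and the 1,500 W draw and $0.10 per kWh rate are made-up placeholders rather than measurements from my network or figures from my bill.

```python
def monthly_cost(avg_watts: float, rate_per_kwh: float, days: int = 30) -> float:
    """Estimate the electricity cost of a constant load running around the clock."""
    kwh = avg_watts / 1000 * 24 * days   # watts -> kilowatts, then kilowatt-hours
    return kwh * rate_per_kwh


# Placeholder inputs for illustration only (not real measurements).
estimate = monthly_cost(avg_watts=1500, rate_per_kwh=0.10)
print(f"Estimated monthly cost: ${estimate:.2f}")
```

Plugging in a real average draw (a UPS load reading or an inline watt meter would supply one) and the rate from an actual bill would give a number worth worrying about.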

Where Is It All Going?

As I mentioned, I’m actively looking for ways to get my time and financial costs down.  I simply don’t have the same sort of time I used to have.

Given rising storage capacities and processor capabilities, it probably comes as no surprise to hear me say that I’ve started turning towards virtualization.  I have two servers that act as dedicated Hyper-V hosts, and I fully expect the trend to continue.

Here are a few additional plans I have for the not-so-distant future:

  • I just purchased a Dell T110 that I’ll be configuring as a Microsoft Forefront Threat Management Gateway 2010 (TMG) server.  I currently have two Internet Security and Acceleration Server 2006 servers (one for each of my ISP connections) and a third Windows Server 2008 for SSL VPN connectivity.  I can get rid of all three boxes with the feature set supplied by one TMG server.  I can also dump some static routing rules and confusing firewall configuration in the process.  That’s hard to beat.
  • I’m going to see about virtualizing my two domain controllers (DCs) over the course of the year.  Even though the machines are backed up, the hardware is near the end of its usable life.  Something is eventually going to fail that I can’t replace.  By virtualizing the DCs, I gain a lot of flexibility (I can move them around on physical hardware) and can get rid of two more physical boxes.  Box reduction is the name of the game these days!  I’ll probably build a new (virtual) DC on Windows Server 2008 R2; migrate FSMO roles, DNS, and DHCP responsibilities to it; and then phase out the physical DCs – rather than try a P2V move.
  • With SharePoint Server 2010 coming, I’m going to need to get some even beefier server hardware.  I’m learning and working just fine with the aid of desktop virtualization right now (my desktop is a Core i7-920 with 12GB RAM), but that won’t cut it for “production use” and testing scenarios when SharePoint Server 2010 goes RTM.

Conclusion

If the past has taught me anything, it’s that additional needs and situations will arise that I haven’t anticipated.  I’m relatively confident that the infrastructure I have in place will be a solid base for any “coming attractions,” though.

If you have any questions or wonder how I did something, feel free to ask!  I can’t guarantee an answer (good or otherwise), but I do enjoy discussing what I’ve worked to build.

Additional Reading and References

  1. LinkedIn: Jason Ditzel
  2. Product: Gorilla Rack Shelves
  3. Networking: Cincinnati Bell DSL
  4. Networking: Current BPL
  5. Microsoft: System Center Data Protection Manager
  6. Dell: PowerEdge Servers
  7. Microsoft: Hyper-V Getting Started Guide
  8. Movie: The Matrix
  9. Gaming: Fallout Site
  10. Gaming: American McGee’s Alice
  11. Gaming: Warhammer BattleMech
  12. Microsoft: Forefront Threat Management Gateway 2010
  13. Microsoft: Internet Security & Acceleration Server 2006