Finding Duplicate GUIDs in Your SharePoint Site Collection

In this self-described “blog post you should never need,” I talk about finding objects with duplicate GUIDs in a client’s SharePoint site collection. I supply the PowerShell script used to find the duplicate GUIDs and offer some suggestions for how you might remedy such a situation.

This is a bit of an oldie, but I figured it might help one or two random readers.

Let me start by saying something right off the bat: you should never need what I’m about to share.  Of course, how many times have you heard “you shouldn’t ever really need this” when it comes to SharePoint?  I’ve been at it a while, and I can tell you that things that never should happen seem to find a way into reality – and into my crosshairs for troubleshooting.

Disclaimer

The story and situation I’m about to share is true.  I’m going to speak in generalities when it comes to the identities of the parties and software involved, though, to “protect the innocent” and avoid upsetting anyone.

The Predicament

I was part of a team that was working with a client to troubleshoot problems that the client was encountering when they attempted to run some software that targeted SharePoint site collections.  The errors that were returned by the software were somewhat cryptic, but they pointed to a problem handling certain objects in a SharePoint site collection.  The software ran fine when targeting all other site collections, so we naturally suspected that something was wrong with only one specific site collection.

After further examination of logs that were tied to the software, it became clear that we had a real predicament.  Apparently, the site collection in question contained two or more objects with the same identity; that is, the objects had ID properties possessing the same GUID.  This isn’t anything that should ever happen, but it had.  SharePoint continued to run without issue (interestingly enough), but the duplication of object GUIDs made it downright difficult for any software that depended on unique object identities being … well, unique.

Although the software logs told us which GUID was being duplicated, we didn’t know which SharePoint object or objects the GUID was tied to.  We needed a relatively quick and easy way to figure out the name(s) of the object or objects which were being impacted by the duplicate GUIDs.

Tackling the Problem

It is precisely in times like those described that PowerShell comes to mind.

My solution was to whip-up a PowerShell script (FindDuplicateGuids.ps1) that processed each of the lists (SPList) and webs (SPWeb) in a target site collection.  The script simply collected the identities of each list and web and reported back any GUIDs that appeared more than once.

The script created works with both SharePoint 2007 and SharePoint 2010, and it has no specific dependencies beyond SharePoint being installed and available on the server where the script is run.

########################
# FindDuplicateGuids.ps1
# Author: Sean P. McDonough (sean@sharepointinterface.com)
# Blog: http://SharePointInterface.com
# Last Update: August 29, 2013
#
# Usage from prompt: ".\FindDuplicateGuids.ps1 <siteUrl>"
#   where <siteUrl> is site collection root.
########################


#########
# IMPORTS
# Import/load common SharePoint assemblies that house the types we'll need for operations.
#########
Add-Type -AssemblyName "Microsoft.SharePoint, Version=12.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c"


###########
# FUNCTIONS
# Leveraged throughout the script for one or more calls.
###########
function SpmBuild-WebAndListIdMappings {param ($siteUrl)
	$targetSite = New-Object Microsoft.SharePoint.SPSite($siteUrl)
	$allWebs = $targetSite.AllWebs
	$mappings = New-Object System.Collections.Specialized.NameValueCollection
	foreach ($spWeb in $allWebs)
	{
		$webTitle = "WEB '{0}'" -f $spWeb.Title
		$mappings.Add($spWeb.ID, $webTitle)
		$allListsForWeb = $spWeb.Lists
		foreach ($currentList in $allListsForWeb)
		{
			$listEntry = "LIST '{0}' in Web '{1}'" -f $currentList.Title, $spWeb.Title
			$mappings.Add($currentList.ID, $listEntry)
		}
		$spWeb.Dispose()
	}
	$targetSite.Dispose()
	return ,$mappings
}

function SpmFind-DuplicateMembers {param ([System.Collections.Specialized.NameValueCollection]$nvMappings)
	$duplicateMembers = New-Object System.Collections.ArrayList
	$allkeys = $nvMappings.AllKeys
	foreach ($keyName in $allKeys)
	{
		$valuesForKey = $nvMappings.GetValues($keyName)
		if ($valuesForKey.Length -gt 1)
		{
			[void]$duplicateMembers.Add($keyName)
		}
	}
	return ,$duplicateMembers
}


########
# SCRIPT
# Execution of actual script logic begins here
########
$siteUrl = $Args[0]
if ($siteUrl -eq $null)
{
	$siteUrl = Read-Host "`nYou must supply a site collection URL to execute the script"
}
if ($siteUrl.EndsWith("/") -eq $false)
{
	$siteUrl += "/"
}
Clear-Host
Write-Output ("Examining " + $siteUrl + " ...`n")
$combinedMappings = SpmBuild-WebAndListIdMappings $siteUrl
Write-Output ($combinedMappings.Count.ToString() + " GUIDs processed.")
Write-Output ("Looking for duplicate GUIDs ...`n")
$duplicateGuids = SpmFind-DuplicateMembers $combinedMappings
if ($duplicateGuids.Count -eq 0)
{
	Write-Output ("No duplicate GUIDs found.")
}
else
{
	Write-Output ($duplicateGuids.Count.ToString() + " duplicate GUID(s) found.")
	Write-Output ("Non-unique GUIDs and associated objects appear below.`n")
	foreach ($keyName in $duplicateGuids)
	{
		$siteNames = $combinedMappings[$keyName]
		Write-Output($keyName + ": " + $siteNames)
	}
}
$dumpData = Read-Host "`nDo you want to send the collected data to a file? (Y/N)"
if ($dumpData -match "y")
{
	$fileName = Read-Host "  Output file path and name"
	Write-Output ("Results for " + $siteUrl) | Out-File -FilePath $fileName
	$allKeys = $combinedMappings.AllKeys
	foreach ($currentKey in $allKeys)
	{
		Write-Output ($currentKey + ": " + $combinedMappings[$currentKey]) | Out-File -FilePath $fileName -Append
	}
}
Write-Output ("`n")

Running this script in the client’s environment quickly identified the two lists that contained the same ID GUIDs.  How did they get that way?  I honestly don’t know, nor am I going to hazard a guess …

What Next?

If you’re in the unfortunate position of owning a site collection that contains objects possessing duplicate ID GUIDs, let me start by saying “I feel for you.”

Having said that: the quickest fix seemed to be deleting the objects that possessed the same GUIDs.  Those objects were then rebuilt.  I believe we handled the delete and rebuild manually, but there’s nothing to say that an export and subsequent import (via the Content Deployment API) couldn’t be used to get content out and then back in with new object IDs. 

A word of caution: if you do leverage the Content Deployment API and do so programmatically, simply make sure that object identities aren’t retained on import; that is, make sure that SPImportSettings.RetainObjectIdentity = false – not true.

Additional Reading and References

  1. TechNet: Import and export: STSADM operations
  2. MSDN: SPImportSettings.RetainObjectIdentity