Find large files through Search in SharePoint 2010

How to get a list of large files in a (very large) SharePoint 2010 Farm, sorted by file size?

I discussed this requirement with my friend and teammate Riccardo Celesti, and we ended up with the same answer: use search!

Unfortunately, since the customer’s environment is still running on top of SharePoint Server 2010, the solution was not that immediate: we would need to add a metadata property mapped to the file size (and schedule a full crawl afterwards, which may not be feasible).

Or… it’s SharePoint 2010, right? We still have the FullText SQL Query Language available.

Typically, the SQL Query Language should not be an option, since this feature has been removed in SharePoint 2013 and was already considered obsolete in SharePoint 2010 (take a look at the “tip” at the beginning of the MSDN page).

Anyway, this was a “fire and forget” requirement, so here’s what we ended up with:

Add-PSSnapin Microsoft.SharePoint.PowerShell
$site = Get-SPSite http://the-url-of-a-site
# Build a FullText SQL query in the context of this site collection
$query = New-Object Microsoft.Office.Server.Search.Query.FullTextSqlQuery($site)
$query.ResultTypes = "RelevantResults"
$query.RowLimit = 1000
$query.EnableStemming = $false
$query.TrimDuplicates = $false
# The Size managed property holds the document size in bytes
$query.QueryText = "SELECT URL, filename, size, path FROM SCOPE() WHERE IsDocument=1 ORDER BY size DESC"
$results = $query.Execute()
# Load the relevant results into a DataTable and show them in a grid view
$table = New-Object System.Data.DataTable
$table.Load($results["RelevantResults"], "OverwriteChanges")
$table | ogv

Hope this is useful 🙂

PowerShell Profiles and Processor Architectures

I use PowerShell profiles quite extensively, especially because I typically work on several (virtual) machines, and profiles make it easy to keep my settings in sync.

Profiles are independent of the processor architecture, which means that you should be careful when loading snap-ins or invoking features that rely on a specific architecture.

One way to ensure that your profile scripts run smoothly in both x86 and x64 processes is to apply conditional logic based on the PROCESSOR_ARCHITECTURE environment variable ($env:PROCESSOR_ARCHITECTURE), as in the sketch below.
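
Here’s a minimal sketch of such a guard; the SharePoint snap-in is just an example of an x64-only dependency:

# Load architecture-specific dependencies only where they can actually run
if ($env:PROCESSOR_ARCHITECTURE -eq "AMD64")
{
    # 64-bit process: safe to load x64-only snap-ins
    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
}
else
{
    Write-Warning "32-bit process detected: skipping x64-only snap-ins."
}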

I would suggest performing this check even if you think you will only ever run your scripts from the x64 shell.

You may think “no, I’ll never be using the x86 version of PowerShell”… you’re probably right, until you try to launch a script from within the Visual Studio IDE 🙁

Ugh!

How to quickly identify large lists with PowerShell

It’s easy, and it’s just one line (word wrapping aside 🙂).

Get-SPWebApplication http://webappurl | Get-SPSite | Get-SPWeb |% { $_.Lists | select @{n="Url";e={$_.RootFolder.ServerRelativeUrl}}, Title, ItemCount } | sort ItemCount -Descending

Here I’m traversing the contents of a Web Application in order to iterate over each List (in each Web, in each Site of the Web Application):

Get-SPWebApplication http://webappurl | Get-SPSite | Get-SPWeb |% { $_.Lists | … }

Then, I’m using the Select-Object cmdlet to add custom objects to the pipeline, built from the list data using both standard properties (Title, ItemCount) and a calculated property for the list URL:

{ $_.Lists | select @{n="Url";e={$_.RootFolder.ServerRelativeUrl}}, Title, ItemCount }

Finally, it’s just a matter of manipulating the custom objects collections, applying sorting, filtering, grouping or any other set operations according to your specific needs.

In addition, you may choose to bring this data into Excel for further analysis, which is easy to achieve by adding the Export-CSV cmdlet at the end of the pipeline.
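
For example (the output path is just a placeholder):

Get-SPWebApplication http://webappurl | Get-SPSite | Get-SPWeb |% { $_.Lists | select @{n="Url";e={$_.RootFolder.ServerRelativeUrl}}, Title, ItemCount } | sort ItemCount -Descending | Export-Csv -Path C:\temp\LargeLists.csv -NoTypeInformation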

In my case I generated an Excel spreadsheet from the data above, applied a custom number filter to the data set, and created a line chart on the data series.

Deleting a Site Collection that cannot be deleted

Today I stumbled upon a strange issue on a production SharePoint 2013 Farm.

I had a bunch of site collections created using a batch script, and one of those was unreachable.

It’s quite common (well, at least… it happens sometimes 🙂) that a site creation process fails, resulting in resources that are not provisioned correctly.

Since file provisioning is one of the last operations performed during site creation, you may get a 404 when accessing the site home page.

But this was not the case. I got a 400 response (i.e. Bad Request). I could not even navigate to application pages (which are not “provisioned” as ghosted resources).

The symptoms of something gone wrong were quite evident in several places.

The Central Administration displayed the site in the sites list, but without any reference to the Content Database where it should have been created. No way to remove it using the web interface (all pages displaying information about the site had no content at all).

OK, let’s clean it up and remove it via script.

A simple Get-SPSite returned a valid object, but a subsequent Remove-SPSite failed with the dreaded “Unknown SPRequest error occurred”.

I had no time to investigate, so I had to find a quick solution: some sort of “force delete” for sites that cannot be deleted the normal way.

Therefore I used a not-so-well-known method on the Content Database object: Microsoft.SharePoint.Administration.SPContentDatabase.ForceDeleteSite (see http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.administration.spcontentdatabase.forcedeletesite.aspx).

The PowerShell code is definitely simple:

$site = Get-SPSite http://siteurl
$siteId = $site.Id
$siteDatabase = $site.ContentDatabase
# Delete the site collection directly at the content database level
$siteDatabase.ForceDeleteSite($siteId, $false, $false)

This way I managed to remove the corrupt site collection (and recreate it with the same command I had used in the batch script, which this time completed successfully).

Hope this is useful 🙂

SharePoint Management Shell vs standard PowerShell console

You know that most of the administrative tasks you perform on a SharePoint Farm can be accomplished using PowerShell scripting. You get a bunch of cmdlets (more than 600, indeed!) that are registered through the Microsoft.SharePoint.PowerShell snap-in, and you can even add your own cmdlets using standard SharePoint deployment techniques.

That’s why sometimes you just end up firing up the default PowerShell console and typing:

Add-PSSnapIn Microsoft.SharePoint.PowerShell

Well… you get something that is “similar” to the SharePoint Management Shell 🙂

The SharePoint Management Shell is nothing more than a standard PowerShell console (powershell.exe) where the snap-in is automatically loaded upon startup, leveraging a startup script:

C:\Windows\System32\WindowsPowerShell\v1.0\PowerShell.exe -NoExit " & 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\CONFIG\POWERSHELL\Registration\sharepoint.ps1' "

The sharepoint.ps1 script is just a few lines of code:

$ver = $host | select version
if ($ver.Version.Major -gt 1) {$Host.Runspace.ThreadOptions = "ReuseThread"}
Add-PsSnapin Microsoft.SharePoint.PowerShell
Set-Location $home

In the above snippet you can see the automatic loading of the SharePoint snap-in and a “change directory” instruction that brings you to your default home directory (typically, C:\Users\yourusername).

But there’s an additional line:

if ($ver.Version.Major -gt 1) {$Host.Runspace.ThreadOptions = "ReuseThread"}

This line sets the PSThreadOptions of the PowerShell runspace to ReuseThread, which means that the same thread is used for all command line invocations for the lifetime of the shell.

The other options are (see http://msdn.microsoft.com/it-it/library/system.management.automation.runspaces.psthreadoptions(v=vs.85).aspx):

  • Default
    On the local computer, UseNewThread is used for runspaces and ReuseThread is used for runspace pools. Server settings are used for remote runspaces and runspace pools. This field is introduced in Windows PowerShell 2.0.
  • ReuseThread
    Creates a new thread for the first invocation and then re-uses that thread in subsequent invocations. This field is introduced in Windows PowerShell 2.0.
  • UseCurrentThread
    Execution occurs on the thread that called the Invoke method. This option is not valid for asynchronous calls. This field is introduced in Windows PowerShell 2.0.
  • UseNewThread
    Creates a new thread for each invocation. This field is introduced in Windows PowerShell 2.0.
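
Incidentally, here’s a minimal sketch (nothing SharePoint-specific) of how you could build your own runspace with the same ReuseThread behavior:

# Create a standalone runspace that reuses a single thread across invocations
$rs = [System.Management.Automation.Runspaces.RunspaceFactory]::CreateRunspace()
$rs.ThreadOptions = [System.Management.Automation.Runspaces.PSThreadOptions]::ReuseThread
$rs.Open()
$ps = [System.Management.Automation.PowerShell]::Create()
$ps.Runspace = $rs
# Each invocation should report the same managed thread id
$ps.AddScript("[Threading.Thread]::CurrentThread.ManagedThreadId").Invoke()
$ps.Commands.Clear()
$ps.AddScript("[Threading.Thread]::CurrentThread.ManagedThreadId").Invoke()
$rs.Close()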

If you want proof of this behavior, just open both a standard PowerShell console and a SharePoint Management Shell, then run the following command a few times in each shell:

[Threading.Thread]::CurrentThread.ManagedThreadId

This instruction prints the internal identifier of the managed thread (i.e. of the CLR Thread object).

You’ll notice that you get the same value in the SharePoint Management Shell, whereas you get different values in the standard one. By the way, PowerShell ISE uses the ReuseThread option too.

What’s the issue with the default value, i.e. having a new thread spawned (or picked up from a pool) for every command invocation?

Well… first of all, some SharePoint objects are not completely thread safe, so you should pay careful attention when using them from different threads (even if, in this particular case, the multiple threads are not executed in parallel 🙂).

Another noticeable issue you may experience, though, is resource leakage: if you create SharePoint objects on one thread and then lose any reference to that thread, you end up wasting resources that are never disposed correctly.

Wrapping it up… the SharePoint Management Shell is nothing special, but it’s a little bit more than just adding a snap-in!

Efficiently purge large lists

Sometimes you may need to clear the contents of a huge SharePoint list.

You can do this through the web user interface, but you will probably face some limitations: bulk operations are limited to 100 elements, the folder structure may be an obstacle, etc…

If you prefer writing script code (and if you have remote access to a farm server), you may save quite a lot of time.

Here’s how:

function ClearList([Microsoft.SharePoint.SPWeb]$web, [Microsoft.SharePoint.SPList]$list)
{
    # Build a single CAML batch containing a Delete command for every item
    $sbDelete = New-Object System.Text.StringBuilder
    [void]$sbDelete.Append("<?xml version=`"1.0`" encoding=`"UTF-8`"?><Batch>")
    $command = "<Method><SetList>" + $list.ID + "</SetList><SetVar Name=`"ID`">{0}</SetVar><SetVar Name=`"Cmd`">Delete</SetVar></Method>"
    $items = $list.Items
    foreach ($item in $items)
    {
        [void]$sbDelete.Append([System.String]::Format($command, $item.ID.ToString()))
    }
    [void]$sbDelete.Append("</Batch>")
    try
    {
        # Execute the whole batch in a single server round trip
        $web.ProcessBatchData($sbDelete.ToString())
    }
    catch [Exception]
    {
        Write-Host $_.Exception.ToString()
    }
}
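
A possible invocation (the URL and the list title are placeholders):

$web = Get-SPWeb http://siteurl
$list = $web.Lists["Name of your list"]
ClearList $web $list
$web.Dispose()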

This can be used only for plain SharePoint lists; for document libraries you should use a slightly different approach. Anyway…

Hope this is useful!

PowerShell – Importing and Exporting the command History

You know… one of the key architectural principles of PowerShell is that “everything is an object”.

So, once you start thinking with this approach in mind, even tedious tasks can become extremely easy.

Consider, for example, a situation where you have typed tons of commands (or combinations of those) in a PowerShell console, and you wish to “save your work” without having to retype everything again.

If you are thinking about copying and pasting everything from the console UI (whose screen buffer size you had previously set to 1 million rows, didn’t you? 🙂), wait a while and try to execute:

Get-History

You’ll get back a collection of objects that represent the commands you have typed so far.

And of course, since they are PowerShell objects, they can be serialized and saved just by executing the Export-Clixml cmdlet:

Get-History | Export-Clixml -Path c:\yourhistoryfile.xml

Then, just close your console, open it again and type something like:

Add-History -InputObject (Import-Clixml -Path c:\yourhistoryfile.xml)

Now, type Get-History again and… voila, everything’s there!

You can now execute one of the commands just by running the Invoke-History cmdlet, passing in the command index as returned by the Get-History output.
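
For example (the index 3 is just a placeholder for whatever Get-History shows you):

# Re-run the history entry with Id 3
Invoke-History -Id 3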

Easy, isn’t it?

PowerShell input parameters – No script is an island!

A script is almost always dependent on values provided by the caller, or at least on the environment that hosts the script execution.

Thus, if it’s true that the “core” of a script is its embedded logic, it is also very important to write a robust data collection strategy: parameters, type checking, value checking and fallback logic are key topics in this area.

Enter the power of script blocks!

One of my favorite features of the PowerShell scripting language is its use of script blocks, something similar to the anonymous functions you may be used to writing as lambda expressions in C#, or in some functional language of your choice.

A nice use of script blocks is for script parameters initialization.

Consider, for example, this simple script file:

Param([int]$n)

$n * 10

This extremely useful 🙂 computation relies on an integer being passed in by the caller.

But what happens if the user forgets to pass a number to the script?

Well… nothing, indeed: the $n variable will have a default value of 0, and the script will return 0.

If 0 is not a good value for you, you can use default values for parameters:

Param([int]$n = 1)

$n * 10

This returns 10 (1 * 10) if no input is received.
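
For example, assuming the script is saved as times10.ps1 (a hypothetical name):

.\times10.ps1 -n 5   # returns 50
.\times10.ps1        # returns 10, thanks to the default value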

What if you want to do some value checking, instead, in order to ensure that a value is provided explicitly by the user?

A default parameter value can be an expression… oops… a script block!

Just try this, if you want to throw an error if no input is available:

Param([int]$n = $(throw "No input"))

$n * 10

Or, if you want to give your users a last chance, you could force them to provide data by explicitly requesting it:

Param([int]$n = $(Read-Host -Prompt "Tell me more…"))

$n * 10

Cool, isn’t it?

How to set a Custom SP2010 Error Page with PowerShell

Custom application pages mapping is a brand new feature you get in SharePoint 2010.

What this does is let you specify which page you want the system to redirect to in certain situations.

These situations include the login page, the Access Denied page, as well as the generic error page.

Here’s how you can set this all up, with just a couple of lines of PowerShell scripting:

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
$wa = Get-SPWebApplication http://yoursite.com
$url = "/_LAYOUTS/GreenPoint2010/Error.aspx"
# Pick the mapping slot to override (here, the generic error page)
$cp = [Microsoft.SharePoint.Administration.SPWebApplication+SPCustomPage]"Error"
$wa.UpdateMappedPage($cp, $url)
$wa.Update()
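
If you later need to revert to the out-of-the-box page, my understanding (an assumption worth verifying against the documentation) is that passing $null as the URL removes the custom mapping:

# Assumption: a $null URL removes the custom mapping and restores the default page
$wa.UpdateMappedPage($cp, $null)
$wa.Update()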

How to download a WSP file from the Farm Solution Store

I’ve been following several migration projects over the last few months, especially from SharePoint 2007 to SharePoint 2010, and of course I’m trying to stay within the standard approach as much as I can.

By “standard” I mean (roughly):

  • Premigration check
  • Deploy code customizations
  • Apply farm and webapp configurations
  • Perform a content database attach

This can be smooth or can be painful, especially during the last phase.

But… well, things can start to go wrong early, and unfortunately not only during the premigration check.

For example, more often than not we find a ton of WSPs deployed, then we ask our customer to provide the WSPs (the packages, not necessarily the source code), and this fails with a sad “I don’t know where they are”. Or, which may be even worse, we get a different version of the package, which may or may not work as expected.

That’s why I wrote these 5 lines of PowerShell code that let us download the exact wsp package that is deployed.

This script is supposed to run on a SharePoint 2007 farm (that’s why I cannot leverage the cmdlets included in the SharePoint 2010 PowerShell snap-in).

Here it is:

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
$farm = [Microsoft.SharePoint.Administration.SPFarm]::Local
$solution = $farm.Solutions["nameofthewsp.wsp"]
$file = $solution.SolutionFile
$file.SaveAs("c:\wsp\nameofthewsp.wsp")

With some extra work, you could make this script iterate over all installed solution packages and, maybe, save them as a single zip file for convenience.
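
Here’s a minimal sketch of the iteration part (the target folder is a placeholder and must already exist):

# Save every solution in the farm solution store to a local folder
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint") | Out-Null
$farm = [Microsoft.SharePoint.Administration.SPFarm]::Local
foreach ($solution in $farm.Solutions)
{
    # SPSolution.Name is the package file name (e.g. something.wsp)
    $solution.SolutionFile.SaveAs("c:\wsp\" + $solution.Name)
}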