Find large files through Search in SharePoint 2010

How to get a list of large files in a (very large) SharePoint 2010 Farm, sorted by file size?

I discussed about this requirement with my friend and teammate Riccardo Celesti and we ended up with the same answer: use search!

Unfortunately, since the customer’s environment is still running on top of SharePoint Server 2010, the solution was not that immediate: we would need to add a metadata property mapped to the file size (and schedule a full crawl afterwards, which may be not feasible).

Or… it’ SharePoint 2010, right? We still have the FullText SQL Query Language available.

Typically, the SQL Query Language should not be an option, since this feature has been removed in SharePoint 2013 and was already considered obsolete in SharePoint 2010 (take a look at the “tip” at the beginning the MSDN page).

Anyway, this was a “fire and forget” requirement, so here’s what we ended up with:

Add-PSSnapin Microsoft.SharePoint.PowerShell
$site = Get-SPSite http://the-url-of-a-site
$query = new-Object Microsoft.Office.Server.Search.Query.FullTextSqlQuery($site);
$query.ResultTypes = "RelevantResults"
$query.RowLimit = 1000
$query.EnableStemming = $false
$query.TrimDuplicates = $false
$query.QueryText = "SELECT URL, filename, size, path FROM SCOPE() WHERE IsDocument=1 ORDER BY size DESC"
$results = $query.Execute()
$table = New-Object System.Data.DataTable
$table.Load($results["RelevantResults"], "OverwriteChanges")
$table | ogv

 

Hope useful 🙂

Refinement Filter Generator – Show more links

Refinements are a welcome addition to the Search/Query capabilities in the default SharePoint 2010 user interface. I remember that most of our SharePoint 2007 projects that involved some search results customization have been implemented relying on the Faceted Search project (still available on CodePlex) or on some commercial tolls (Ontolica, now Surfray, being one of those).

The way refinement panels work is somewhat complicated, as it entails server components (aka filter generators), server controls (OOTB) as well as some XSLT tricks that you can use to customize the refinement UI.

One of these tricks determines how the “Show More Links” feature is realized.

You need the help of the filter generator in order to discriminate between “visible” links and links that are instead hidden, waiting for an explicit request by the user.

The filter generator returns a bunch of XML markup, where you may have two sections (two elements) that contain either “top results” and “all results” (the threshold is typically defined by a property of the generator instance).

You may end-up with something like this [most of the code has been omitted]:

XmlElement element = filterXml.CreateElement("FilterCategory");

Element.SetAttribute("Id", category.Id);

element.SetAttribute("ShowMoreLink", category.ShowMoreLink);

element.SetAttribute("MoreLinkText", category.MoreLinkText);

element.SetAttribute("LessLinkText", category.LessLinkText);

XmlElement topElements = filterXml.CreateElement("Filters");

XmlElement moreElements = filterXml.CreateElement("MoreFilters");

Of course, most of the hard work is done by the logics of the refinement filter generator.

But… there’s something that is up to you – the site builder – which is of course the presentation of these aggregations.

OOTB, SharePoint 2010 comes with some XSLT that probably fits most of your needs.

But this may not be enough when, for example, you have developed a custom filter generator (or you have downloaded or bought one) and you need to make it available.

The example I’m bringing to your attention, and which of course gives the title to this post, is related to the way the OOTB XSLT renders the “All Results” fragment.

Just take a look and try to find-out what I’m talking about:

<xsl:if test="$ShowMoreLink=’TRUE’">

<xsl:variable name="MoreFilters" select="MoreFilters/Filter" />

<xsl:choose>

<xsl:when test="$FilterCategoryId and ($FilterCategoryId != ”) and ($FilterCategoryType = ‘Microsoft.Office.Server.Search.WebControls.ManagedPropertyFilterGenerator’)">

</xsl:when>

<xsl:when test="$FilterCategoryId and ($FilterCategoryId != ”) and ($FilterCategoryType = ‘Microsoft.Office.Server.Search.WebControls.TaxonomyFilterGenerator’)">…

</xsl:when>

<xsl:choose>

Got it?

The OOTB XSLT template applies (and then renders) the “All Results” section differently according to the type (the .NET Type) of the refinement filter generator. In the above snippet, there’s a “choose/when” statement (the equivalent of the “switch/case” or “Select/Case” syntax construct you may be familiar with) that is there exactly for this purpose.

Needless to say that, if you need to render links also for refinement panels based on your custom generator, you also need to update the XSLT transformation accordingly:

<xsl:when test="$FilterCategoryId and ($FilterCategoryId != ”) and ($FilterCategoryType = ‘GreenTeam.SharePoint2010.EnterpriseSearch.MultiValueFilterGenerator, GreenTeam.SharePoint2010.EnterpriseSearch, Version=1.0.0.0, Culture=neutral,PublicKeyToken=7b398aba874a4ea9’)">

</xsl:when>

Voila!

Refinement Panel Metadata Threshold

If you find your refinement suddenly disappearing, leaving a confused user (and some hedache for you), double check this property!

Its behavior is straightforward, and is exactly what the propperty name suggests.

You can define a threshold (unsigned integer value) that controls when the refiner is shown, based on the results that are returned by the search query.

If the occurrences of the underlying metadata property do not exceeed this threshold, the refiner simply disappears.

This may be appreciated: imagine a situation where you have defined, say, twenty refiners and you do not want to flood the page with every single kind of faceted filter. You need a way to prioritize this, and the Metadata Threshold property is what you are looking for.

On the opposite, you can of course eliminate it (just set the value to 1) if you need to keep a consistent layout across query executions.

If a property is there, and if you know what it does and how to prevent it, you gain flexibility!

Extract Crawled Properties Mappings

Add-PSSnapin microsoft.sharepoint.powershell -ErrorAction SilentlyContinue

$searchapp = Get-SPEnterpriseSearchServiceApplication “Search Service Application Name Here”

$category = Get-SPEnterpriseSearchMetadataCategory –Identity “Category Name Here” -SearchApplication $searchapp

$properties = Get-SPEnterpriseSearchMetadataCrawledProperty -SearchApplication $searchapp -Category $category

foreach ($crawled in $properties)

{

$name = $crawled.Name

$managedProperties = $crawled.GetMappedManagedProperties()

foreach ($managed in $managedProperties)

{

$managedName = $managed.Name   

ac “c:\search.csv” “$name,$managedName”

}

}

I definitely love PowerShell 🙂