Everything you need to know about Cloud Hybrid Search

Summary: This article discusses the new cloud hybrid search service application. Use this article to configure cloud hybrid search in your organization and learn what you need to know.

What is cloud hybrid search
In the past, Microsoft provided hybrid search scenarios between your on-premises SharePoint environment and SharePoint Online that were based on query federation. For instance, when you searched in SharePoint Online for a document stored in your on-premises environment, the query was sent to the on-premises environment and the results were returned to the user in SharePoint Online. Microsoft released a script to automate this: http://blogs.msdn.com/b/spses/archive/2015/11/17/office-365-sharepoint-hybrid-configuration-wizard.aspx.

In September 2015, Microsoft released the new cloud hybrid search service application.

Instead of using query federation to surface results in your environment, it relies on indexing your on-premises content in Office 365. This removes much of the setup complexity and makes it possible to mix results from SharePoint on-premises and Office 365 in a single result block. You can set up this new feature using SharePoint 2013 or SharePoint 2016.

Figure 1 shows a representation of the “old” hybrid search architecture and figure 2 shows the “new” hybrid search architecture.
Figure 1: Hybrid federated search architecture

In this old scenario, the user enters a query in the on-premises search center. SharePoint sends the query to the on-premises query component and the SharePoint Online query component. Results cannot be interleaved out of the box in this scenario, because SharePoint on-premises and SharePoint Online each have a separate index. However, there are several third-party solutions available that make it possible.

Figure 2: New cloud hybrid search architecture

In this scenario, the cloud search service crawls the content sources on-premises and sends the parsed content to the SharePoint Online content processing component. After processing the content and doing ACL mapping – for security trimming purposes – the data is saved in the SharePoint Online index. Because the index is saved online, it is now possible to interleave results in your search results and use the data in Delve.

Please note that the new cloud hybrid search is still in preview, so things might change along the way!

If you want your on-premises SharePoint to show SharePoint Online results, you still have to configure the outgoing federated hybrid search. See the following link for more information: http://blogs.technet.com/b/wbaer/archive/2014/03/24/one-way-outbound-hybrid-search-step-by-step-and-onedrive-for-business.aspx

Why cloud hybrid search
Not all companies are ready to make the move to the cloud for all their workloads. In order to help customers make the move for specific workloads, Microsoft now provides an easy way to gradually move to the cloud while maintaining a great search experience for end users.

By using the new Cloud hybrid search solution, users are able to search content from the following sources from within SharePoint Online:

  • SharePoint 2007/2010/2013/2016
  • File shares
  • BCS

The content from all these sources is indexed in Office 365, which gives Microsoft the ability to interleave results across sources based on relevancy, use the Office 365 ranking model and even include all of this in Delve!

Organizations can also scale down search infrastructure as content processing and analytics are handled by Office 365.

Prerequisites for cloud hybrid search
In order to use the new Hybrid search functionality, make sure you have installed the following prerequisites for your environment.

SharePoint on-premises

  • If you use SharePoint 2013, make sure you have installed the August 2015 CU or later. I recommend the latest CU without known regressions, as there have been improvements to the hybrid search.
  • If you use SharePoint 2016, use the public SharePoint Server 2016 IT Preview.
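
To check the CU requirement from PowerShell, you can compare the farm build number against a minimum. The sketch below is a hypothetical helper, not part of any Microsoft tooling; the build number 15.0.4745.1000 for the August 2015 CU is an assumption you should verify against the official SharePoint 2013 update list.

```powershell
# Minimal sketch: compare an installed build number against a required minimum.
# Test-MinimumBuild is a hypothetical helper name. The default minimum
# (assumed to be the August 2015 CU build) must be verified before use.
function Test-MinimumBuild {
    param(
        [Parameter(Mandatory=$true)][version] $Installed,
        [version] $Required = '15.0.4745.1000'
    )
    # [version] objects compare component by component
    return ($Installed -ge $Required)
}

Test-MinimumBuild -Installed '15.0.4569.1506'   # older build: returns False
Test-MinimumBuild -Installed '15.0.4753.1000'   # newer build: returns True
```

On a SharePoint server you could pass the real value with Test-MinimumBuild -Installed (Get-SPFarm).BuildVersion.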

Office 365
The cloud search service application is currently not available for customers outside the regular Office 365 multitenant service, including China data center customers and Government cloud customers.

Account synchronization
Accounts need to be synchronized to Office 365 in order to have a single identity for users. All users that want to make use of Office 365 hybrid search need a SharePoint Online license assigned.

The tools below are supported to perform directory synchronization:

If you do not have any of the above synchronization tools deployed in your environment, I recommend using AADConnect. It can also configure ADFS for you, so you can enjoy the full single sign-on experience.

Software needed during configuration of hybrid search
On the SharePoint server where you are performing the configuration of hybrid search, you will need to install the following prerequisite software in this specific order.

Onboarding script
The onboarding script will create the trust between your on-premises SharePoint environment and Office 365. You can download the script along with documentation from the Microsoft Connect Site if you’re a member of the preview program.

If you do not have access to the cloud hybrid search preview program, you can request access via the link http://connect.microsoft.com/office/SelfNomination.aspx?ProgramID=8647&pageType=1. Make sure you are using the latest version prior to execution.

Creating the cloud search service application
After you have installed all the prerequisites, it’s time to create the cloud search service application, which is pretty straightforward. You can use any script you prefer; just add the parameter “-CloudIndex $true” to the New-SPEnterpriseSearchServiceApplication cmdlet.

On the server that is running SharePoint Server 2013 or SharePoint Server 2016 Preview, copy the sample script below and save it as CreateCloudSSA.ps1 and run it. This will create a single-server Search Service Application topology. If you want a highly available search service infrastructure, you have to manually adjust the script to your needs.

This script was taken from: http://blogs.msdn.com/b/spses/archive/2015/09/15/cloud-hybrid-search-service-application.aspx

## Gather mandatory parameters ##
## Note: SearchServiceAccount needs to already exist in Windows Active Directory as per Technet Guidelines https://technet.microsoft.com/library/gg502597.aspx ##

Param(
    [Parameter(Mandatory=$true)][string] $SearchServerName,
    [Parameter(Mandatory=$true)][string] $SearchServiceAccount,
    [Parameter(Mandatory=$true)][string] $SearchServiceAppName,
    [Parameter(Mandatory=$true)][string] $DatabaseServerName
)

Add-PSSnapin Microsoft.SharePoint.Powershell -ea 0

## Validate if the supplied account exists in Active Directory and whether supplied as domain\username
if ($SearchServiceAccount.Contains("\")) # if True then domain\username was used
{
    $Account = $SearchServiceAccount.Split("\")
    $Account = $Account[1]
}
else # no domain was specified at account entry
{
    $Account = $SearchServiceAccount
}

$domainRoot = [ADSI]''
$dirSearcher = New-Object System.DirectoryServices.DirectorySearcher($domainRoot)
$dirSearcher.filter = "(&(objectClass=user)(sAMAccountName=$Account))"
$results = $dirSearcher.findall()

if ($results.Count -gt 0) # Test for user not found
{
    Write-Output "Active Directory account $Account exists. Proceeding with configuration"

    ## Validate whether the supplied SearchServiceAccount is a managed account. If not make it one.
    if (Get-SPManagedAccount | ?{$_.username -eq $SearchServiceAccount})
    {
        Write-Output "Managed account $SearchServiceAccount already exists!"
    }
    else
    {
        Write-Output "Managed account does not exist - creating it"
        $ManagedCred = Get-Credential -Message "Please provide the password for $SearchServiceAccount" -UserName $SearchServiceAccount
        try
        {
            # -ErrorAction Stop turns a failure into a terminating error so the catch block runs
            New-SPManagedAccount -Credential $ManagedCred -ErrorAction Stop
        }
        catch
        {
            Write-Output "Unable to create managed account for $SearchServiceAccount. Please validate user and domain details"
            return
        }
    }

    Write-Output "Creating Application Pool"
    # Application pool name derived from the service application name
    $appPoolName = $SearchServiceAppName + "_AppPool"
    $appPool = New-SPServiceApplicationPool -name $appPoolName -account $SearchServiceAccount

    Write-Output "Starting Search Service Instance"
    Start-SPEnterpriseSearchServiceInstance $SearchServerName

    Write-Output "Creating Cloud Search Service Application"
    $searchApp = New-SPEnterpriseSearchServiceApplication -Name $SearchServiceAppName -ApplicationPool $appPool -DatabaseServer $DatabaseServerName -CloudIndex $true

    Write-Output "Configuring Admin Component"
    $searchInstance = Get-SPEnterpriseSearchServiceInstance $SearchServerName
    $searchApp | get-SPEnterpriseSearchAdministrationComponent | set-SPEnterpriseSearchAdministrationComponent -SearchServiceInstance $searchInstance
    $admin = ($searchApp | get-SPEnterpriseSearchAdministrationComponent)

    Write-Output "Waiting for the admin component to be initialized"
    # Give up after 20 minutes; adjust the timeout to your environment
    $timeoutTime = (Get-Date).AddMinutes(20)
    do {Write-Output .;Start-Sleep 10;} while ((-not $admin.Initialized) -and ($timeoutTime -ge (Get-Date)))
    if (-not $admin.Initialized) { throw 'Admin Component could not be initialized'}

    Write-Output "Inspecting Cloud Search Service Application"
    $searchApp = Get-SPEnterpriseSearchServiceApplication $SearchServiceAppName

    Write-Output "Setting IsHybrid Property to 1"
    $searchApp.SetProperty("IsHybrid",1)
    $searchApp.Update()

    #Output some key properties of the Search Service Application
    Write-Host "Search Service Properties"
    Write-Host "Hybrid Cloud SSA Name    : " $searchapp.Name
    Write-Host "Hybrid Cloud SSA Status  : " $searchapp.Status
    Write-Host "Cloud Index Enabled      : " $searchApp.CloudIndex

    Write-Output "Configuring Search Topology"
    $searchApp = Get-SPEnterpriseSearchServiceApplication $SearchServiceAppName
    $topology = $searchApp.ActiveTopology.Clone()

    # Keep an existing admin component; add one only if the topology has none
    $oldComponents = @($topology.GetComponents())
    if (@($oldComponents | ? { $_.GetType().Name -eq "AdminComponent" }).Length -eq 0)
    {
        $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.AdminComponent $SearchServerName))
    }
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.CrawlComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.ContentProcessingComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.AnalyticsProcessingComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.QueryProcessingComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.IndexComponent $SearchServerName,0))

    # Remove the components of the cloned topology, except the admin component
    $oldComponents | ? { $_.GetType().Name -ne "AdminComponent" } | foreach { $topology.RemoveComponent($_) }

    Write-Output "Activating topology"
    $topology.Activate()
    $timeoutTime = (Get-Date).AddMinutes(20)
    do {Write-Output .;Start-Sleep 10;} while (($searchApp.GetTopology($topology.TopologyId).State -ne "Active") -and ($timeoutTime -ge (Get-Date)))
    if ($searchApp.GetTopology($topology.TopologyId).State -ne "Active") { throw 'Could not activate the search topology'}

    Write-Output "Creating Proxy"
    $searchAppProxy = new-spenterprisesearchserviceapplicationproxy -name ($SearchServiceAppName+"_proxy") -SearchApplication $searchApp

    Write-Output "Cloud hybrid search service application provisioning completed successfully."
}
else # The account must exist before we can proceed with the script
{
    Write-Output "Account supplied for Search Service does not exist in Active Directory."
    Write-Output "Script is quitting. Please create the account and run again."
} # End Else

The output should look similar to figure 3.
Figure 3: Create-SSA.ps1 output, creating a cloud search service application
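
The script waits twice, once for the admin component and once for the topology, by polling a condition until a deadline passes. As a rough illustration, that pattern can be isolated in a small helper. Wait-Until and its parameters are hypothetical names, not part of the original script or of SharePoint.

```powershell
# Minimal sketch of the polling-with-deadline pattern used in the script.
# Wait-Until, its parameter names and its defaults are hypothetical.
function Wait-Until {
    param(
        [Parameter(Mandatory=$true)][scriptblock] $Condition,
        [int] $TimeoutSeconds = 1200,
        [int] $IntervalSeconds = 10
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    # Re-evaluate the condition until it is true or the deadline has passed
    while (-not (& $Condition)) {
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $IntervalSeconds
    }
    return $true
}

# Example: a condition that becomes true on the third check
$state = @{ Checks = 0 }
$ok = Wait-Until -Condition { $state.Checks++; $state.Checks -ge 3 } -TimeoutSeconds 5 -IntervalSeconds 0
$ok   # True
```

With the objects from the script, the condition could be { $admin.Initialized } or { $searchApp.GetTopology($topology.TopologyId).State -eq "Active" }, followed by a throw when the helper returns $false.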

Proxy configuration for hybrid cloud search
If your organization uses a proxy to allow Internet access, you have to configure this proxy for hybrid cloud search as well. For a more in-depth article, please look at http://sharepointrelated.com/2015/12/11/cloud-hybrid-search-proxy-settings/, but for now we can just add the proxy settings to the machine config: “C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config\machine.config”

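Here’s an example of what this would look like:

    <system.net>
      <defaultProxy>
        <proxy usesystemdefault="false" proxyaddress="" bypassonlocal="true" />
      </defaultProxy>
    </system.net>

Set proxyaddress to the URL of your proxy server. The <system.net> section goes directly inside the <configuration> element, after <configSections>. If the file already contains a <system.net> section, add the <defaultProxy> element to it instead of creating a second section. To make it easier to find when you need it, you could place it right before the </configuration> tag.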
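
To confirm what the file actually contains after editing, you can read the proxy address back with PowerShell. This is a minimal sketch that assumes the proxy element sits under system.net/defaultProxy, which is where .NET looks for it. Get-DefaultProxyAddress is a hypothetical helper, and the sample XML and the address http://proxy.contoso.local:8080 are placeholders.

```powershell
# Minimal sketch: read the proxy address from machine.config-style XML.
# Get-DefaultProxyAddress is a hypothetical helper; the sample XML and
# proxy address below are placeholders, not values from a real environment.
function Get-DefaultProxyAddress {
    param([Parameter(Mandatory=$true)][string] $ConfigXml)
    $doc = [xml] $ConfigXml
    $node = $doc.SelectSingleNode('/configuration/system.net/defaultProxy/proxy')
    if ($null -eq $node) { return $null }   # no proxy element configured
    return $node.GetAttribute('proxyaddress')
}

$sample = @'
<configuration>
  <system.net>
    <defaultProxy>
      <proxy usesystemdefault="false" proxyaddress="http://proxy.contoso.local:8080" bypassonlocal="true" />
    </defaultProxy>
  </system.net>
</configuration>
'@

Get-DefaultProxyAddress -ConfigXml $sample   # http://proxy.contoso.local:8080
```

Against the real file, pass the content of machine.config instead of the sample, for example with Get-Content -Raw on the path mentioned above.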

Onboarding the cloud search service application
After successfully installing the prerequisites and configuring the cloud search service application, it is time to start the onboarding process. The onboarding process will create a trust between your SharePoint on-premises and Office 365 environment. This will allow SharePoint to move the index to Office 365 for further processing.

Run the onboarding script:

.\Onboard-CloudHybridSearch.ps1 -PortalUrl "https://yourtenant.sharepoint.com" -CloudSSAId "<Cloud Search Service Application name>"

Enter your Global Administrator credentials when prompted.

Figure 4 shows what your output should resemble.

Figure 4: Running the cloud hybrid search onboarding script on the server that runs SharePoint Server 2013

The onboarding script changes quite frequently. Its name and parameters have already changed a bit since I ran it, so check the current documentation for the correct parameters before you run the script.

Configure content source in cloud search service application
You can configure the content source in your new cloud search service application as you would in any other on-premises SharePoint environment. As Figure 5 shows, you configure the content source of the cloud search service application in Search Administration.
Figure 5: In SharePoint Search Administration you can edit (configure) the content source for the cloud search service application.

Enter the start addresses that you would like to crawl and start a full crawl for the content source. After the crawl is done, check the crawl log for the specific content source to see if all went well.
Figure 6: Check the crawl logs for any errors or warnings

If you see 1 Top Level Error with the following error message:

AzureServiceProxy caught Exception: *** Microsoft.Office.Server.Search.AzureSearchService.AzureException: AzurePlugin was not able to get Tenant Info from configuration server
 at Microsoft.Office.Server.Search.AzureSearchService.AzureServiceProxy.GetAzureTenantInfo(String portalURL, String realm, String& returnPropertyValue, String propertyName)
 at Microsoft.Office.Server.Search.AzureSearchService.AzureServiceProxy.SubmitDocuments(String azureServiceLocation, String authRealm, String SPOServiceTenantID, String SearchContentService_ContentFarmId, String portalURL, String testId, String encryptionCert, Boolean allowUnencryptedSubmit, sSubmitDocument[] documents, sDocumentResult[]& results, sAzureRequestInfo& RequestInfo) ***

Make sure to check your proxy configuration (http://sharepointrelated.com/2015/12/11/cloud-hybrid-search-proxy-settings/).

Verifying results: perform a query in SharePoint Online
In Office 365, search for a document and it will return results for both SharePoint Online and SharePoint on-premises if cloud hybrid search is configured correctly.

Figure 7 shows example results from a search that includes the following sources:

  • SharePoint Online
  • SharePoint on-premises
  • File shares

Figure 7: Searching content in Office 365 returning results from both on-premises and SharePoint Online

If you want to return results only from your on-premises content, add the “isexternalcontent:1” property restriction to your query.

As figure 8 shows, this returns only on-premises results.
Figure 8: Using the isexternalcontent:1 property shows search results only from on-premises.
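
If you prefer to verify from a script rather than the search box, the same query text can be sent to the SharePoint search REST endpoint (/_api/search/query). The sketch below only builds the request URL; New-SearchQueryUrl is a hypothetical helper, the tenant URL is a placeholder, and calling the endpoint additionally requires an authenticated session, which is not shown here.

```powershell
# Minimal sketch: build a search REST URL, optionally restricted to
# on-premises content. New-SearchQueryUrl is a hypothetical helper name.
function New-SearchQueryUrl {
    param(
        [Parameter(Mandatory=$true)][string] $PortalUrl,
        [Parameter(Mandatory=$true)][string] $QueryText,
        [switch] $OnPremisesOnly
    )
    # Append the property restriction described above
    if ($OnPremisesOnly) { $QueryText = "$QueryText isexternalcontent:1" }
    $encoded = [uri]::EscapeDataString($QueryText)
    return "$($PortalUrl.TrimEnd('/'))/_api/search/query?querytext='$encoded'"
}

New-SearchQueryUrl -PortalUrl 'https://yourtenant.sharepoint.com' -QueryText 'budget' -OnPremisesOnly
# https://yourtenant.sharepoint.com/_api/search/query?querytext='budget%20isexternalcontent%3A1'
```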

That’s it!

I hope this helps you use the new hybrid cloud search service application.

* All figures in this article were taken in a client’s test environment, with approval.

Cloud Hybrid Search proxy settings

Let me start by saying thanks to @johankroese and @vanHooijdonk for helping with troubleshooting this issue!
If you want the solution without going through the full post, scroll down to the end of this blog post.

The new Cloud Hybrid Search was released in preview in September 2015. After seeing the demos, I was really excited to try it. One of our customers was already in need of this technology, so we proposed to configure Cloud Hybrid Search on their test environment.

The onboarding process seemed pretty straightforward and we used http://blogs.msdn.com/b/spses/archive/2015/09/15/cloud-hybrid-search-service-application.aspx as a reference. We read the documentation, grabbed the on-boarding script, installed the prerequisites and got started.

After running the onboarding script the first time, some errors were thrown, but after running the script for the 2nd time, everything seemed to work out just perfectly. All steps were executed, no red parts in the output of the script.

The next step would be to start a Full crawl. We created a specific content source in the on-premises environment. This content source contains only 1 site collection with a few documents inside, just to check the functionality. Unfortunately, the crawl failed. After inspecting the log, we saw the following:

AzureServiceProxy caught Exception: *** Microsoft.Office.Server.Search.AzureSearchService.AzureException: AzurePlugin was not able to get Tenant Info from configuration server
at Microsoft.Office.Server.Search.AzureSearchService.AzureServiceProxy.GetAzureTenantInfo(String portalURL, String realm, String& returnPropertyValue, String propertyName)
at Microsoft.Office.Server.Search.AzureSearchService.AzureServiceProxy.SubmitDocuments(String azureServiceLocation, String authRealm, String SPOServiceTenantID, String SearchContentService_ContentFarmId, String portalURL, String testId, String encryptionCert, Boolean allowUnencryptedSubmit, sSubmitDocument[] documents, sDocumentResult[]& results, sAzureRequestInfo& RequestInfo) ***

For some reason, SharePoint wasn’t able to submit index data for documents to Azure for processing. We created a thread on the TechNet forum for Cloud Search Service Application Preview.

As we were aware that a proxy was being used, we started checking the proxy configuration. There are a lot of places where you can configure a proxy if you are looking for them:

  • Web.config
  • Netsh winhttp set proxy
  • Browser settings for service account
  • Search Service Application proxy

After trying all these, we still couldn’t get it to work. After using Wireshark, we found out that the upload was still not using our proxy server. Instead, it tried uploading directly to Azure.

The solution
After discussing this with my colleague @vanHooijdonk, he reminded me that there is another place where you can configure proxy settings. The machine config: “C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config\machine.config”

    <system.net>
      <defaultProxy>
        <proxy usesystemdefault="false" proxyaddress="" bypassonlocal="true" />
      </defaultProxy>
    </system.net>

After applying the proxy there, with proxyaddress set to the URL of the proxy server, everything started to work! Place the <system.net> section directly inside the <configuration> element, after <configSections>, or add the <defaultProxy> element to an existing <system.net> section. Personally, I placed it right before the closing </configuration> tag, so I can find it more easily next time.

A new blog post will come soon with a more detailed explanation on how to configure Cloud Hybrid Search.

Download all content in a site collection


I’ve been working on a script that will allow you to download all files that are stored in SharePoint in a given site collection.

If the destination path does not exist, the script offers to create it for you. Before the script runs, it also checks if the site collection exists.

Run the script like this:

.\Get-SPContent.ps1 -SiteCollection "<SiteCollectionURL>" -Destination "<Path>"


The console shows which libraries were exported to your file system.

—– * Advanced * —–

If you have specific requirements as to which (type of) libraries you want to export, you can change the following line to fit your requirements:

$lists = $web.lists | ?{$_.itemcount -ge "1" -And $_.Hidden -eq $false -And $_.BaseType -eq "DocumentLibrary"} #Excludes all hidden libraries and empty libraries

Below is the code you can save as Get-SPContent.ps1

param (
    [Parameter(Mandatory=$true)]
    [ValidateScript({asnp *sh* -EA SilentlyContinue;if (Get-SPSite $_){$true}else{Throw "Site collection $_ does not exist"}})]
    [string]$SiteCollection,

    [Parameter(Mandatory=$true)]
    [ValidateScript({
        if (Test-Path $_) {$true}
        else
        {
            # Offer to create the destination folder when it does not exist
            $d = $_
            $title = "Create Folder?";
            $message = "$_ doesn't exist, do you want the script to create it?";
            $yes = New-Object System.Management.Automation.Host.ChoiceDescription "&Yes", "Creates directory $_";
            $no = New-Object System.Management.Automation.Host.ChoiceDescription "&No", "Exits script";
            $options = [System.Management.Automation.Host.ChoiceDescription[]]($yes,$no);
            $result = $host.ui.PromptForChoice($title,$message,$options,1);
            switch ($result)
            {
                0 {New-Item $d -Type Directory | Out-Null;$true}
                1 {Throw "Please create the folder before running the script again. `nExiting script"}
            }
        }
    })]
    [string]$Destination
)

Asnp *sh* -EA SilentlyContinue

Start-SPAssignment -Global | Out-Null

# Returns the URLs of all webs in the site collection
function Get-SPWebs($SiteCollection){
    $SiteCollection = Get-SPSite $SiteCollection
    $webs = @()
    $SiteCollection.allwebs | %{$webs += $_.url}
    return $webs
}

# Walks all document libraries of each web and downloads their files
function Get-SPFolders($webs)
{
    foreach($web in $webs)
    {
        $web = Get-SPWeb $web
        Write-Host "`n$($web.url)"

        $lists = $web.lists | ?{$_.itemcount -ge "1" -And $_.Hidden -eq $false -And $_.BaseType -eq "DocumentLibrary"} #Excludes all hidden libraries and empty libraries
        #$lists = $web.lists | ?{$_.title -eq "Documents" -and $_.itemcount -ge "1" -And $_.BaseType -eq "DocumentLibrary"} #Change any identifier here
        foreach($list in $lists)
        {
            Write-Host "- $($list.RootFolder.url)"

            #Download files in root folder
            $rootfolder = $web.GetFolder($list.RootFolder.Url)
            Download-SPContent $rootfolder

            #Download files in subfolders
            foreach($folder in $list.folders)
            {
                $folder = $web.GetFolder($folder.url)
                Download-SPContent $folder
            }
        }
    }
}

# Writes every file of the folder to the destination directory
function Download-SPContent($folder)
{
    foreach($file in $folder.Files)
    {
        $binary = $file.OpenBinary()
        $stream = New-Object System.IO.FileStream($destination + "/" + $file.Name), Create
        $writer = New-Object System.IO.BinaryWriter($stream)
        $writer.Write($binary)
        $writer.Close()
    }
}

$webs = Get-SPWebs -SiteCollection $Sitecollection
Get-SPFolders -Webs $webs

Stop-SPAssignment -Global

Restore deleted site collections SharePoint 2013

In SharePoint 2013 it is possible to restore an accidentally deleted site collection. For more information, read this article: http://technet.microsoft.com/en-us/library/hh272537.aspx

You can use the Restore-SPDeletedSite cmdlet to restore a site collection.

However, if you removed the site collection with the Remove-SPSite cmdlet in PowerShell, the site collection will not be stored in an SPDeletedSite object.

This means you cannot restore a site collection that has been removed using PowerShell.


Add PDF mimetype for all Web Applications one-liner

By default, PDF files cannot be opened directly from SharePoint 2010/SharePoint 2013.

To add the PDF mimetype to all Web Applications (instead of doing it separately for each Web Application), you can use the following one-liner:

Get-SPWebApplication | %{$_.AllowedInlineDownloadedMimeTypes.Add("application/pdf");$_.Update()}
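
Running the one-liner a second time adds the MIME type again. If you want a rerunnable variant, a check before adding is enough. The sketch below uses a hypothetical helper, Add-IfMissing, and demonstrates it on a plain .NET collection so it runs without SharePoint; the commented lines show how it might be applied to the web applications.

```powershell
# Minimal sketch: add a value to a collection only when it is not there yet.
# Add-IfMissing is a hypothetical helper name. Returns $true when it added
# the value, $false when the value was already present.
function Add-IfMissing {
    param(
        [AllowEmptyCollection()] $Collection,
        [string] $Value
    )
    if ($Collection.Contains($Value)) { return $false }
    [void] $Collection.Add($Value)
    return $true
}

# Demonstration on a plain string collection
$types = New-Object 'System.Collections.ObjectModel.Collection[string]'
Add-IfMissing -Collection $types -Value 'application/pdf'   # True, value added
Add-IfMissing -Collection $types -Value 'application/pdf'   # False, already present

# Possible use on a SharePoint server (only updates when something changed):
# Get-SPWebApplication | ForEach-Object {
#     if (Add-IfMissing -Collection $_.AllowedInlineDownloadedMimeTypes -Value 'application/pdf') { $_.Update() }
# }
```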

Get all subsites of a subsite

Getting a list of all subsites of a particular site (not a site collection) was a little more work than I expected, so here is how I did it.

Let’s say we have the following site structure:


What if we want an overview of all sites under “Https://portal.sharepointrelated.com/Projects”?

My first thought was to use the “Webs” property of the SPWeb object. Unfortunately, this only shows the direct subsites for this site. This means that for “Https://portal.sharepointrelated.com/projects”, it only shows the Level 3 sites.


To work around this, I used the “AllWebs” property of the SPSite object and filtered the URLs starting with “Https://portal.sharepointrelated.com/projects”.

Here is the code used: (Download .zip file)

param (
    [Parameter(Mandatory=$true)][ValidateNotNullOrEmpty()][String]$StartWeb,
    [Boolean]$IncludeStartWeb = $true
)

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$subsites = ((Get-SPWeb $StartWeb).Site).AllWebs | ?{$_.Url -like "$StartWeb*"}

# Skip the start web itself when IncludeStartWeb is $false
if (-not $IncludeStartWeb) { $subsites = $subsites | ?{$_.Url.TrimEnd("/") -ne $StartWeb.TrimEnd("/")} }

foreach($subsite in $subsites) { Write-Host $subsite.Url }

As you can see in the source code, I added 2 parameters to the script:

StartWeb: String. This is the starting URL. All subsites under this site will be shown in the result.

IncludeStartWeb: Boolean. When set to $false, the output will not include the URL provided in the StartWeb parameter.
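
One thing to watch with a plain prefix match is that a sibling site such as “/projects2” also starts with “/projects”. The sketch below shows one way to filter on a path boundary instead, using plain strings so it runs without SharePoint. Select-SubsiteUrl is a hypothetical helper, and the sample URLs below the portal are made up for illustration.

```powershell
# Minimal sketch: keep only URLs at or below a start URL, matching on a
# path boundary so sibling sites with a similar name are excluded.
# Select-SubsiteUrl and the sample site names are hypothetical.
function Select-SubsiteUrl {
    param(
        [string[]] $Urls,
        [Parameter(Mandatory=$true)][string] $StartWeb,
        [bool] $IncludeStartWeb = $true
    )
    $root = $StartWeb.TrimEnd('/')
    foreach ($url in $Urls) {
        $isStart = $url.TrimEnd('/') -eq $root
        $isBelow = $url.StartsWith("$root/", [System.StringComparison]::OrdinalIgnoreCase)
        if (($isStart -and $IncludeStartWeb) -or ($isBelow -and -not $isStart)) { $url }
    }
}

$all = @(
    'https://portal.sharepointrelated.com/projects',
    'https://portal.sharepointrelated.com/projects/alpha',
    'https://portal.sharepointrelated.com/projects/alpha/docs',
    'https://portal.sharepointrelated.com/projects2'
)

# Returns the two URLs below /projects; /projects itself and /projects2 are left out
$below = @(Select-SubsiteUrl -Urls $all -StartWeb 'https://portal.sharepointrelated.com/projects' -IncludeStartWeb $false)
$below
```

In the script above, the same idea could be applied to the Url property of each web returned by AllWebs.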