Integrating Sitecore’s Azure Search Provider

Having previously worked with the Lucene and Solr search providers, I decided to test how easy it is to switch from the Lucene search provider to the Azure search provider. I set about changing the default configuration on a Habitat installation (Habitat 8.2 Update 5), which comes pre-configured for Lucene. I ran into a couple of interesting issues and learned some things along the way.

Sitecore have a helpful guide for setting up Azure Search and connecting to it. Once you have the search service set up in Azure and you have configured your Sitecore instance to connect to that service, you can get back to reconfiguring the Sitecore instance and Habitat solution.

Configuration Changes

Enable the following files in App_Config/Include:

ContentTesting/Sitecore.ContentTesting.Azure.IndexConfiguration.config
FXM/Sitecore.FXM.Azure.DomainsSearch.DefaultIndexConfiguration.config
FXM/Sitecore.FXM.Azure.DomainsSearch.Index.Master.config
FXM/Sitecore.FXM.Azure.DomainsSearch.Index.Web.config
ListManagement/Sitecore.ListManagement.Azure.Index.List.config
ListManagement/Sitecore.ListManagement.Azure.IndexConfiguration.config
Social/Sitecore.Social.Azure.Index.Master.config
Social/Sitecore.Social.Azure.Index.Web.config
Social/Sitecore.Social.Azure.IndexConfiguration.config
Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config
Sitecore.ContentSearch.Azure.Index.Analytics.config
Sitecore.ContentSearch.Azure.Index.Core.config
Sitecore.ContentSearch.Azure.Index.Master.config
Sitecore.ContentSearch.Azure.Index.Web.config
Sitecore.Marketing.Azure.Index.Master.config
Sitecore.Marketing.Azure.Index.Web.config
Sitecore.Marketing.Azure.IndexConfiguration.config
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Azure.Index.Master.config
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Azure.Index.Web.config
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Azure.IndexConfiguration.config

Disable the following files in App_Config/Include:

ContentTesting/Sitecore.ContentTesting.Lucene.IndexConfiguration.config
FXM/Sitecore.FXM.Lucene.DomainsSearch.DefaultIndexConfiguration.config
FXM/Sitecore.FXM.Lucene.DomainsSearch.Index.Master.config
FXM/Sitecore.FXM.Lucene.DomainsSearch.Index.Web.config
ListManagement/Sitecore.ListManagement.Lucene.Index.List.config
ListManagement/Sitecore.ListManagement.Lucene.IndexConfiguration.config
Social/Sitecore.Social.Lucene.Index.Analytics.Facebook.config
Social/Sitecore.Social.Lucene.Index.Master.config
Social/Sitecore.Social.Lucene.Index.Web.config
Social/Sitecore.Social.Lucene.IndexConfiguration.config
Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config
Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.Xdb.config
Sitecore.ContentSearch.Lucene.Index.Analytics.config
Sitecore.ContentSearch.Lucene.Index.Core.config
Sitecore.ContentSearch.Lucene.Index.Master.config
Sitecore.ContentSearch.Lucene.Index.Web.config
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Master.config
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Web.config
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.IndexConfiguration.config
Sitecore.Marketing.Lucene.Index.Master.config
Sitecore.Marketing.Lucene.Index.Web.config
Sitecore.Marketing.Lucene.IndexConfiguration.config
Sitecore.Speak.ContentSearch.Lucene.config

Code Changes

The web.config contains an <appSettings> entry for setting the search provider. The supported providers are Lucene, Solr and Azure. Set this to Azure:

<add key="search:define" value="Azure" />

The next step is to update the Habitat code to work with Azure. Two configuration files are set up to add additional fields to the Lucene index. These files need to be updated to conform to the Azure configuration requirements for adding fields to the Azure cloud index.

Open the configuration file located in “App_Config\Include\Foundation\” for Foundation.LocalDatasource.config and replace the <contentSearch> section with the following:

<contentSearch>
  <indexConfigurations>
    <defaultCloudIndexConfiguration type="Sitecore.ContentSearch.Azure.CloudIndexConfiguration, Sitecore.ContentSearch.Azure">
      <documentOptions>
        <fields hint="raw:AddComputedIndexField">
          <field fieldName="local_datasource_content" storageType="NO" indexType="TOKENIZED">Sitecore.Foundation.LocalDatasource.Infrastructure.Indexing.LocalDatasourceContentField, Sitecore.Foundation.LocalDatasource</field>
        </fields>
      </documentOptions>
    </defaultCloudIndexConfiguration>
  </indexConfigurations>
</contentSearch>

Open the configuration file located in “App_Config\Include\Foundation\” for Foundation.Indexing.config and replace the <contentSearch> section with the following:

<contentSearch>
  <indexConfigurations>
    <defaultCloudIndexConfiguration type="Sitecore.ContentSearch.Azure.CloudIndexConfiguration, Sitecore.ContentSearch.Azure">
      <fieldMap type="Sitecore.ContentSearch.Azure.FieldMaps.CloudFieldMap, Sitecore.ContentSearch.Azure">
        <fieldNames hint="raw:AddFieldByFieldName">
          <field fieldName="all_templates" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.Collections.Generic.List`1[[System.String, mscorlib]]" settingType="Sitecore.ContentSearch.Azure.CloudSearchFieldConfiguration, Sitecore.ContentSearch.Azure">
            <Analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
          </field>
          <field fieldName="has_presentation" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.Boolean" settingType="Sitecore.ContentSearch.Azure.CloudSearchFieldConfiguration, Sitecore.ContentSearch.Azure" />
          <field fieldName="has_search_result_formatter" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.Boolean" settingType="Sitecore.ContentSearch.Azure.CloudSearchFieldConfiguration, Sitecore.ContentSearch.Azure" />
          <field fieldName="search_result_formatter" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" type="System.String" settingType="Sitecore.ContentSearch.Azure.CloudSearchFieldConfiguration, Sitecore.ContentSearch.Azure" />
        </fieldNames>
      </fieldMap>
      <virtualFields type="Sitecore.ContentSearch.VirtualFieldProcessorMap, Sitecore.ContentSearch">
        <processors hint="raw:AddFromConfiguration">
          <add fieldName="content_type" type="Sitecore.Foundation.Indexing.Infrastructure.Fields.SearchResultFormatterComputedField,Sitecore.Foundation.Indexing"/>
        </processors>
      </virtualFields>
      <documentOptions type="Sitecore.ContentSearch.Azure.CloudSearchDocumentBuilderOptions,Sitecore.ContentSearch.Azure" >
        <fields hint="raw:AddComputedIndexField">
          <field fieldName="has_presentation" storageType="no" indexType="untokenized">Sitecore.Foundation.Indexing.Infrastructure.Fields.HasPresentationComputedField, Sitecore.Foundation.Indexing</field>
          <field fieldName="all_templates" storageType="no" indexType="untokenized">Sitecore.Foundation.Indexing.Infrastructure.Fields.AllTemplatesComputedField, Sitecore.Foundation.Indexing</field>
          <field fieldName="has_search_result_formatter" storageType="no" indexType="untokenized">Sitecore.Foundation.Indexing.Infrastructure.Fields.HasSearchResultFormatterComputedField, Sitecore.Foundation.Indexing</field>
          <field fieldName="search_result_formatter" storageType="no" indexType="untokenized">Sitecore.Foundation.Indexing.Infrastructure.Fields.SearchResultFormatterComputedField, Sitecore.Foundation.Indexing</field>
        </fields>
      </documentOptions>
    </defaultCloudIndexConfiguration>
  </indexConfigurations>
</contentSearch>

You also need to add additional lines to the Sitecore <settings> section to set the default search index to the Azure Search provider. The <settings> should be set as follows:

<settings>
  <setting name="ContentSearch.ParallelIndexing.Enabled" value="true" />
  <setting name="ContentSearch.DefaultIndexType">
    <patch:attribute name="value">Sitecore.ContentSearch.Azure.CloudSearchProviderIndex, Sitecore.ContentSearch.Azure</patch:attribute>
  </setting>
  <setting name="ContentSearch.DefaultIndexConfigurationPath">
    <patch:attribute name="value">contentSearch/indexConfigurations/defaultCloudIndexConfiguration</patch:attribute>
  </setting>
</settings>

After making these configuration changes, run the indexing manager to create the search indexes, including the newly mapped field names defined above.

Running into Issues

The Sitecore Azure Search provider comes with a set of limitations in comparison to Solr and Lucene. This means that you will get different search results when using Azure compared to Solr/Lucene.

When debugging the searchService.cs code, I found that it was failing at the following line:

rootPredicates = rootPredicates.Or(item => item.Path.StartsWith(provider.Root.Paths.FullPath));

Due to the limitations of Sitecore Search with Azure, “StartsWith” matches terms located in any part of the field value. That is not the desired result in this instance, so this line needed to be amended.

I tested updating this to use “Contains” and passed in the Root.ID rather than the FullPath string value:

rootPredicates = rootPredicates.Or(item => item.Paths.Contains(provider.Root.ID));

This was still causing an error. After investigating the data stored in the sitecore_master_index, I could see that the field “path_1” was not storing the item ID in its original format.

For example, the Sitecore Root.ID of

{11111111-1111-1111-1111-111111111111}

was stored in the index as

11111111111111111111111111111111

I decided to test updating the predicate to reference the “path_1” field directly. First, I needed to strip the Root.ID value to match the format saved in the index and then use this stripped value in the predicate.

var stripID = provider.Root.ID.ToString().Replace("{", "").Replace("-", "").Replace("}", "");
rootPredicates = rootPredicates.Or(item => item["path_1"].Contains(stripID));
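The same stripping can be sketched with the .NET Guid type instead of chained Replace calls. This is a minimal, standalone illustration rather than the Habitat code: IndexIdFormat and ToIndexFormat are hypothetical names, and it assumes the ID is available as a braced GUID string. Note that the "N" format specifier produces lowercase digits, whereas the Replace chain above keeps the original casing, so check which casing your index actually stores before swapping one for the other.

```csharp
using System;

public static class IndexIdFormat
{
    // Hypothetical helper (not part of Sitecore or Habitat): converts a braced
    // GUID string to 32 hex digits with no braces or hyphens.
    public static string ToIndexFormat(string sitecoreId)
    {
        // Guid.Parse accepts the braced form; the "N" specifier emits the
        // 32 digits in lowercase without any separators.
        return Guid.Parse(sitecoreId).ToString("N");
    }

    public static void Main()
    {
        Console.WriteLine(ToIndexFormat("{11111111-1111-1111-1111-111111111111}"));
        // prints 11111111111111111111111111111111
    }
}
```

Parsing the value as a GUID also means a malformed ID fails with a FormatException instead of silently producing a string that never matches anything in the index.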

This allowed the searchService to execute without error.

Testing

I then set about testing the search. The main drawback I found was with searching for a phrase. If you stick to single words, the search works as expected. When you search for multiple words, each word is searched separately and the results are then combined. The results returned are not restricted to those that match the full phrase.

If you search for the term “contact us”, you will get results that contain “contact” and results that contain “us”. An About Us page could be returned in the results because it contains the word “us”, even if it does not contain the word “contact”. This behaviour should be taken into account when considering Azure Search with Sitecore.
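To make the difference concrete, here is a small in-memory sketch of the two matching styles. It does not call Sitecore or Azure Search at all; PhraseSearchDemo, MatchesAnyTerm and MatchesPhrase are hypothetical names, and the tokenizer is a deliberately naive split on spaces. It only illustrates why a page containing “us” is returned for “contact us” when terms are matched individually.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PhraseSearchDemo
{
    // Naive tokenizer for illustration only: lowercase, split on spaces.
    private static string[] Tokenize(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Per-term matching: true if ANY query word occurs in the content.
    // This mirrors the combined-results behaviour described above.
    public static bool MatchesAnyTerm(string content, string query)
    {
        var words = new HashSet<string>(Tokenize(content));
        return Tokenize(query).Any(term => words.Contains(term));
    }

    // Phrase matching: the query words must occur consecutively, in order.
    public static bool MatchesPhrase(string content, string query)
    {
        var words = Tokenize(content);
        var terms = Tokenize(query);
        if (terms.Length == 0) return false;

        for (int i = 0; i + terms.Length <= words.Length; i++)
        {
            if (words.Skip(i).Take(terms.Length).SequenceEqual(terms)) return true;
        }
        return false;
    }

    public static void Main()
    {
        const string aboutPage = "About us and our team";
        Console.WriteLine(MatchesAnyTerm(aboutPage, "contact us")); // True
        Console.WriteLine(MatchesPhrase(aboutPage, "contact us"));  // False
    }
}
```

If phrase-only results matter for your site, one option is to post-filter the returned results in code along these lines, at the cost of extra processing after the query.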

Originally published at arekibo.com.

Arekibo Communications