Sitecore Search Highlighting with Solr: the highlights

In this post

Examples of how to get going with search result highlighting, using the Sitecore ContentSearch API and Solr

What does highlighting look like?

Solr’s highlighting system is extremely powerful. A simple use-case is to show the part of the document which matched a user’s search terms. We call this part a snippet. We can even supply some HTML to wrap the matching terms:

Search: healthy
Wrap with: <em> </em>
Snippet: The <em>healthy</em> workplace toolkits support you either as a health care employer..

Code: A Basic Search

Our documents have a field called ‘Summary’. Sitecore and the ContentSearch API don’t know about this field by default, so we create a custom SearchResultItem class to include the field in our search results:

using System;
using System.Runtime.Serialization;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;

public class SearchResultWithSummary : SearchResultItem
{
    [IndexField("summary_t")]
    [DataMember]
    public virtual string Summary { get; set; }
}

Let’s search for any documents with the word healthy in the Summary field. Note that highlighting is currently only supported when we search directly through SolrNet, so we’ll construct the query that way.

const string searchField = "summary_t";
const string searchValue = "healthy";

var index = ContentSearchManager.GetIndex(string.Format("sitecore_{0}_index", Sitecore.Context.Database.Name));
using (var context = index.CreateSearchContext())
{
	var results = context.Query<SearchResultWithSummary>(new SolrQueryByField(searchField, searchValue), new QueryOptions());

	foreach (var result in results)
	{
		@result.Summary
		// Results:
		// - The healthy workplace toolkits support you either as a health care employer, RCN workplace representative, employment agency or host organisation to create healthy working environments.
		// - Engaging families, communities and schools to change the outlook of a generation. The Healthy Weight Commitment Foundation is a broad-based, not-for-profit organization whose mission is to help reduce obesity.
		// - People who are homeless are more likely than the general population to have poor health. Through our Healthy Futures project, we help homeless people when they are admitted to hospital.
	}
}

Code: Let’s add highlighting!

We populate a QueryOptions object with a HighlightingParameters configuration, and pass this in when creating our query. We specify (Fields) the field or fields to include in the highlight snippet returned by Solr, (BeforeTerm) the token to place before our matched terms, and (AfterTerm) the token to place after the matched terms.

const string searchField = "summary_t";
const string searchValue = "healthy";

var queryOptions = new QueryOptions
{
	Highlight = new HighlightingParameters
	{
		Fields = new[] { searchField },
		BeforeTerm = "<em>",
		AfterTerm = "</em>"
	}
};

Now, let’s execute our query, passing in the queryOptions object. The results object we get back now contains a populated Highlights collection.

var index = ContentSearchManager.GetIndex(string.Format("sitecore_{0}_index", Sitecore.Context.Database.Name));
using (var context = index.CreateSearchContext())
{
	var results = context.Query<SearchResultWithSummary>(new SolrQueryByField(searchField, searchValue), queryOptions);

	foreach (var result in results)
	{
		var highlights = results.Highlights[result.Fields["_uniqueid"].ToString()];

		if (highlights.Any())
		{
			<ul>
				@foreach (var highlight in highlights)
				{
					<li style="color: #696969">@result.Name</li>
					//The Healthy Workplace Toolkits
					<li>@Html.Raw(string.Join(",", highlight.Value))</li>
					// - The <em>healthy</em> workplace toolkits support you either as a health care employer, RCN workplace representative, employment agency or host organisation to create <em>healthy</em> working environments.
				}
			</ul>        
		}
	}
}

Controlling the size of the snippet

Solr allows us to pass in a parameter, Fragsize, to control the approximate length (in characters) of the snippet returned to us. I recommend playing around with this to suit your needs.

var queryOptions = new QueryOptions
{
	Highlight = new HighlightingParameters
	{
		Fields = new[] { searchField },
		BeforeTerm = "<em>",
		AfterTerm = "</em>",
		Fragsize = 30
	}
};
// - The <em>healthy</em> workplace toolkits support
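
Under the hood, these options end up as ordinary Solr request parameters. Here's a rough sketch in plain C# (no Sitecore or SolrNet references) of the equivalent query string for Solr's original highlighter. SolrHighlightUrl and Build are made-up names for illustration, and the exact parameters SolrNet sends on your behalf may differ, so treat this as a way to reason about the request rather than a copy of it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SolrHighlightUrl
{
    // Builds the query-string part of a Solr /select request with highlighting switched on.
    // Hypothetical helper: it only mirrors the options used in this post.
    public static string Build(string field, string term, string before, string after, int fragsize)
    {
        var parameters = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("q", field + ":" + term),
            new KeyValuePair<string, string>("hl", "true"),               // turn highlighting on
            new KeyValuePair<string, string>("hl.fl", field),             // Fields
            new KeyValuePair<string, string>("hl.simple.pre", before),    // BeforeTerm
            new KeyValuePair<string, string>("hl.simple.post", after),    // AfterTerm
            new KeyValuePair<string, string>("hl.fragsize", fragsize.ToString()) // Fragsize
        };

        // URL-encode every key and value so the <em> tags survive the trip.
        return string.Join("&", parameters.Select(
            p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
    }
}
```

Pasting the result after /select? on your core's URL in a browser is a quick way to try different Fragsize values without redeploying anything.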

A choice of highlighters!

Solr supports different highlighters – take a look at the “Choosing a Highlighter” section in the Solr documentation: https://lucene.apache.org/solr/guide/6_6/highlighting.html

The newest, shiniest highlighter (which shipped with Solr 6.4) is the Unified Highlighter (https://lucene.apache.org/solr/guide/6_6/highlighting.html#Highlighting-TheUnifiedHighlighter). With this highlighter, we can drop the Fragsize parameter and get back a whole sentence containing our highlighted terms. We have to add another parameter to the QueryOptions object, ExtraParams, to tell Solr which highlighter to use:

var queryOptions = new QueryOptions
{
	Highlight = new HighlightingParameters
	{
		Fields = new[] { searchField },
		BeforeTerm = "<em>",
		AfterTerm = "</em>"
	},
	ExtraParams = new List<KeyValuePair<string, string>>
	{
		new KeyValuePair<string, string>("hl.method", "unified")
	}
};
// - Through our <em>healthy</em> Futures project, we help homeless people when they are admitted to hospital.

Can I use Linq?

To make use of the QueryOptions object, we have to query directly through SolrNet. Losing our fancy ContentSearch Linq capabilities is a big deal! Here’s a not-so-great workaround to get it back. We serialize the Linq query to a string, then use it to create a native SolrNet query, attaching our QueryOptions once again.

var query = context.GetQueryable<SearchResultWithSummary>().Where(x => x.Summary.Contains(searchValue));
var solrQuery = new SolrQuery(((IHasNativeQuery)query).Query.ToString());
var results = context.Query<SearchResultWithSummary>(solrQuery, queryOptions);

Feedback

I’d love to hear nicer ways of working with Linq and Highlighting – please let me know any work you’ve done in this area!

Hundreds of renderings? Your first-page-load could be sloooow

In this post

Having many subfolders of MVC views could impact page-load time.

Helix-style Feature folders

In a Helix-style solution, it’s common to group your MVC views by feature:

 /Views/Navigation/Nav.cshtml
 /Views/Navigation/Secondary/SecondaryNav.cshtml
 /Views/News/Headlines.cshtml
 /Views/News/Ticker/NewsTicker.cshtml

Large solutions may see 50, 60, 70+ MVC views making up a single page. If these views are in separate subfolders, we’ve noticed a performance penalty.

Just Helix-style solutions?

No, definitely not. Any solution with many views in many subfolders. Sitecore or no-Sitecore.

When will this affect me?

Each time you deploy to a new folder (e.g. D:\Web\Octopus-1.2.3.4\), a new Temporary ASP.NET Files folder is populated with JIT-compiled versions of your .cshtml files. This is why you typically see slow first-page-load times after a new deployment.

The technical details

Shout out: Oleg Volkov’s blog details what is going on here: https://ogvolkov.wordpress.com/2017/02/28/slow-asp-net-mvc-website-startup-in-case-of-many-view-folders/. Thanks, Oleg!

The System.Web.Compilation.BuildManager class (https://referencesource.microsoft.com/#System.Web/Compilation/BuildManager.cs,1662) contains a method, CompileWebFile(..), which JIT compiles your .cshtml files. In a handy performance boost, CompileWebFile(..) will batch this compilation, working on an entire directory at a time. This means that having 100 views in a single directory will compile a lot faster than having 100 views in 100 directories.
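
As a back-of-the-envelope model, and assuming roughly one compilation batch per directory (a simplification of the behaviour described above, not something measured), you can count the distinct directories behind a page's views. ViewBatchEstimator is a hypothetical helper written for this post:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ViewBatchEstimator
{
    // Counts the distinct directories among virtual view paths such as "~/Views/A/001.cshtml".
    // Under the one-batch-per-directory assumption, this approximates the number of batches.
    public static int CountDirectories(IEnumerable<string> viewPaths)
    {
        return viewPaths
            .Select(path => path.Substring(0, path.LastIndexOf('/')))
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .Count();
    }
}
```

Feeding it the list of views used on a page gives a quick feel for whether that page is closer to the one-folder or the many-folders end of the scale.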

How much slower?

We did some strikingly unscientific testing by including 400 Partial Views on a page.

400 Views in 1 Folder

 @Html.Partial("~/Views/A/001.cshtml")
 @Html.Partial("~/Views/A/002.cshtml")
 ...
 @Html.Partial("~/Views/A/400.cshtml")
  • Create new directory, deploy to this directory
  • IIS Reset
  • First page load: 58s

400 Views in 40 Folders

 @Html.Partial("~/Views/B/1/001.cshtml")
 @Html.Partial("~/Views/B/1/002.cshtml")
 ...
 @Html.Partial("~/Views/E/10/010.cshtml")
  • Create new directory, deploy to this directory
  • IIS Reset
  • First page load: 3m26s

What’s the solution?

We went with MVC View precompilation (using https://github.com/StackExchange/StackExchange.Precompilation) because moving all .cshtml files to a single directory wasn’t a viable option. This brings the compilation time back down for us, and first-page-load after a deployment is now under 1 minute (previously 7+!).

Enable xConnect on a local developer machine

In this post

Example configuration files and certificate set-up steps.

What I wanted to do

Install xConnect alongside Sitecore XP 9.0.1 on a developer machine, in xp0 configuration. I didn’t have PowerShell 5.1 installed, so I had to go ahead without SIF. Yes, it was a bit of a nightmare.

Prerequisites

  • Packages for XP Single from Sitecore Downloads
  • DACPAC databases from that package installed
  • xConnect IIS site from that package hosted locally (devxc.perks.com)
  • Sitecore XP9 hosted locally (dev.perks.com)

Generate a server certificate

New-SelfSignedCertificate -certstorelocation cert:\LocalMachine\My -dnsname *.perks.com

Generate a client certificate

New-SelfSignedCertificate -certstorelocation cert:\LocalMachine\My -dnsname devxc.perks.com

Note down the thumbprint for later:

Thumbprint                                Subject
----------                                -------
7E8DAE07DA298A9681D867F4B65BF4241C064A92  CN=devxc.perks.com

Export and Import

Export the client and server certificates (using Certificate Manager) and re-import them to the following locations:

  • LocalMachine > Trusted Root Certification Authorities

Assign the *.perks.com certificate

Assign the *.perks.com server certificate to the IIS sites dev.perks.com and devxc.perks.com

Ensure devxc.perks.com has Require SSL [ON] and Client certificates [Accept]

Add certificate details to Sitecore XP ConnectionStrings.config

<add name="xconnect.collection.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.referencedata.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.marketingautomation.reporting.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.marketingautomation.operations.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />

Add certificate details to xConnect ConnectionStrings.config

<add name="xconnect.collection.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.referencedata.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />

Modify xConnect AppSettings.config

<add key="AllowInvalidClientCertificates" value="true" />
<add key="validateCertificateThumbprint" value="7E8DAE07DA298A9681D867F4B65BF4241C064A92" />

Restart! Restart!

Restart IIS, your machine, switch your house lights on and off a few times. Open a window.

Troubleshooting

Export and Import the certificates into:

  • Current User > Personal
  • Current User > Trusted Root Certification Authorities

Use Certificate Manager to grant Read permissions to your certificates.

Remove any non-self-signed certificates from your Local Machine > Trusted Root Certification Authorities store. Be careful doing this on your work PC, as some corporate certificates may be affected.
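
If you're seeing "The certificate was not found", it can help to check the lookup yourself. Below is a minimal sketch that mirrors the StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint settings from the connection strings. CertificateCheck is a made-up helper, not part of Sitecore, and the normalisation step is there because thumbprints copied out of certificate dialogs can carry spaces or invisible characters:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography.X509Certificates;

public static class CertificateCheck
{
    // Keeps only hex digits and upper-cases them, dropping spaces and hidden characters.
    public static string NormalizeThumbprint(string thumbprint)
    {
        return new string(thumbprint.Where(c => Uri.IsHexDigit(c)).ToArray()).ToUpperInvariant();
    }

    // Returns true if a certificate with this thumbprint is in LocalMachine > Personal (My).
    public static bool ExistsInLocalMachineMy(string thumbprint)
    {
        var store = new X509Store(StoreName.My, StoreLocation.LocalMachine);
        try
        {
            store.Open(OpenFlags.ReadOnly);
            // validOnly: false, because a self-signed certificate may not pass chain validation.
            var matches = store.Certificates.Find(
                X509FindType.FindByThumbprint, NormalizeThumbprint(thumbprint), false);
            return matches.Count > 0;
        }
        finally
        {
            store.Close();
        }
    }
}
```

Run ExistsInLocalMachineMy under the same account as your IIS app pool if you can: a certificate that is visible to you may still be unreadable to the app pool identity.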

Associated error messages

FATAL [Experience Analytics]: Failed to synchronize segments. Message: Ensure definition type did not complete successfully. StatusCode: 401, ReasonPhrase: 'Invalid certificate', Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
Exception: System.InvalidOperationException
Message: The certificate was not found.
Source: Sitecore.Xdb.Common.Web
   at Sitecore.Xdb.Common.Web.CertificateWebRequestHandlerModifier.Process(HttpClientHandler handler)
ERROR Exception when executing agent aggregation/aggregator
Exception: Sitecore.XConnect.XdbCollectionUnavailableException
Message: The HTTP response was not successful: Forbidden
Source: Sitecore.Xdb.Common.Web

Next steps

Please don’t use any of these steps in production! I’m only hacking around to get things running locally.

Feedback

Comment here or find @perks on Twitter. Feedback and corrections happily received.

xConnect error when using Deploy Marketing Definitions tool

In this post

Installing xConnect to an 8.0 > 9.0.1 Sitecore upgrade resulted in some duplicate items in the content tree. This stops the Deploy Marketing Definitions tool from completing.

Problem

I had nearly completed an installation of xConnect with Sitecore 9.0.1, in xp0 configuration. Toward the end, I used the Control Panel > Analytics > Deploy Marketing Definitions tool. It thought for a little while, then blew up, asking me to check the logs. When I did, I found this message:

8876 16:00:59 ERROR One or more exceptions occurred while processing the subscribers to the 'item:saving' event.
Exception[1]: System.InvalidOperationException 
Message[1]: Multiple items were found by alias 'Field Completed' 
Source[1]: Sitecore.Marketing.xMgmt 
 at Sitecore.Marketing.Definitions.Repository.ItemDefinitionRepositoryBase`1.GetItemIdByAlias(String alias)
 at Sitecore.Marketing.Definitions.Repository.ItemDefinitionRepositoryBase`1.GetByAlias(String alias, CultureInfo cultureInfo, Boolean includeInactiveVersion)
 at Sitecore.Marketing.Definitions.DefinitionManagerBase`2.GetByAlias(String alias, CultureInfo cultureInfo, Boolean includeInactiveVersion)
 at Sitecore.Marketing.xMgmt.Definitions.ItemEventHandler.ValidateAlias[TDefinitionInterface](ItemData itemData, Template itemTemplate, Guid expectedTemplateId, Dictionary`2 templateIdsInheritanceDictionary)
 at Sitecore.Marketing.xMgmt.Definitions.ItemEventHandler.ValidateItemName(ItemData itemData)
 at Sitecore.Marketing.xMgmt.Definitions.ItemEventHandler.OnItemSaving(Object sender, EventArgs args)
 at Sitecore.Events.Event.EventSubscribers.RaiseEvent(String eventName, Object[] parameters, EventResult result)

Fix

There were two Field Completed items (with the same template) in my tree under master:/sitecore/system/Settings/Analytics/Page Events/*

Checking the create dates, I see that the items in the Forms folder are newer. I guess the earlier ones are an overhang from our previous Sitecore 8 installation. In any case, I just renamed the older versions to {0} Old and now the Deploy Marketing Definitions tool completes. I’m now going to remove the duplicate items entirely, and ensure any links are redirected to the new version.
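
To spot this kind of clash before running the tool, one option is to group item paths by name and look for repeats. Here's a minimal sketch in plain C#. DuplicateNameFinder is a made-up helper, and it assumes you have already pulled the item paths under Page Events into a list of strings by whatever means you prefer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateNameFinder
{
    // Groups item paths by their last segment (the item name) and
    // returns the names that appear more than once.
    public static List<string> FindDuplicateNames(IEnumerable<string> itemPaths)
    {
        return itemPaths
            .GroupBy(path => path.Substring(path.LastIndexOf('/') + 1), StringComparer.OrdinalIgnoreCase)
            .Where(group => group.Count() > 1)
            .Select(group => group.Key)
            .ToList();
    }
}
```

Any name it returns is a candidate for the "Multiple items were found by alias" error above.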

Create a custom Solr index in Sitecore 9

Hello there. 

Hi! So you want to create a new Solr index?

Yes, I think so?

It’s a great idea. You’ll be familiar with the big three, sitecore_core_index, sitecore_master_index and sitecore_web_index, but you don’t have to stop there! You can create individual indexes for certain content types on your site, such as Products. Smaller, more individualised indexes are easier to maintain and troubleshoot, faster to rebuild, and can be faster to query.

Are they hard to set up?

Not as hard as you’d expect! Let’s create one now.

OK. My Solr is set up and I can access the web UI on https://solr:8983/solr/#/ – what now?

Let’s create the physical Solr core.

  1. Find your Solr index folder for the sitecore_master_index. Mine was at C:\solr\solr-6.6.2\server\solr\sitecore_master_index
  2. Copy this whole folder (into the same parent folder) and call it sitecore_master_products_index
  3. Inside the sitecore_master_products_index folder, open up the core.properties file and change the name property to read sitecore_master_products_index
  4. Restart Solr (I use the solr stop and solr start commands)
  5. Now, go to https://solr:8983/solr/#/ and check out your cores – you will have a new one!
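
If you create cores like this often, step 3 is easy to script. The sketch below only rewrites the name line of a core.properties file that has already been read into a string. CoreProperties is a hypothetical helper, and it assumes the file uses a simple name=value line for the core name, as mine did:

```csharp
using System;

public static class CoreProperties
{
    // Replaces the value of the "name=" line and leaves every other line untouched.
    public static string Rename(string content, string newName)
    {
        var lines = content.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        for (var i = 0; i < lines.Length; i++)
        {
            if (lines[i].TrimStart().StartsWith("name=", StringComparison.Ordinal))
            {
                lines[i] = "name=" + newName;
            }
        }
        // Joined with "\n" for simplicity.
        return string.Join("\n", lines);
    }
}
```

Combine it with File.ReadAllText and File.WriteAllText on the copied folder's core.properties, then restart Solr as in step 4.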

Awesome, it’s there. So I get that we copied the sitecore_master_index and renamed it to sitecore_master_products_index – and in Solr I can see that it contains thousands of documents already, copied from sitecore_master_index. How do I clean things up?

Well, good question. We want to delete all of the existing items in this index and start afresh. You can do this via a web browser – just call this URL:

https://solr:8983/solr/sitecore_master_products_index/update?commit=true&stream.body=<delete><query>*:*</query></delete>
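
Browsers will usually encode the angle brackets in that URL for you. If you call it from code instead, it's safer to encode the body yourself. This is a small sketch using the same endpoint and parameters as the URL above. SolrDeleteUrl is a made-up helper name, and the base URL is whatever your own Solr instance uses:

```csharp
using System;

public static class SolrDeleteUrl
{
    // Builds a delete-everything URL for the given core, URL-encoding the XML body.
    public static string Build(string solrBaseUrl, string core)
    {
        const string body = "<delete><query>*:*</query></delete>";
        return solrBaseUrl.TrimEnd('/') + "/" + core
            + "/update?commit=true&stream.body=" + Uri.EscapeDataString(body);
    }
}
```

For example, SolrDeleteUrl.Build("https://solr:8983/solr", "sitecore_master_products_index") gives a URL you can request with whichever HTTP client you like. Double-check the core name first: this wipes every document in it.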

Radical. Everything is deleted. Soo. I want to use this index to only contain certain types of content from Sitecore. How do I configure it properly?

We just need to add a single configuration file to Sitecore. It’s below. It looks mostly like the configuration file for sitecore_master_index, but we change two important things: (a) which templates we want to include in our index, and (b) which fields we want to include in our index. In your real solution, this will take a bit of time to set up, but being selective is the whole point of creating a custom index, and you’ll want to keep it as trim as possible.

Here’s the whole config file, which I’ve called Sitecore.ContentSearch.Solr.Index.Master.Products.config:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:search="http://www.sitecore.net/xmlconfig/search/">
  <sitecore role:require="Standalone or ContentManagement" search:require="solr">
    <contentSearch>
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="sitecore_master_products_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
            <param desc="name">$(id)</param>
            <param desc="core">$(id)</param>
            <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
            <configuration ref="contentSearch/indexConfigurations/defaultSolrIndexConfiguration">
              <documentOptions type="Sitecore.ContentSearch.SolrProvider.SolrDocumentBuilderOptions, Sitecore.ContentSearch.SolrProvider">
                <indexAllFields>false</indexAllFields>

                <!-- Included fields -->
                <include hint="list:AddIncludedField">
                  <ProductName>{E676F36E-B0E0-4BE5-998A-329A8F9055FD}</ProductName>
                  <LongDescription>{8A978A2E-0E7A-4415-9163-2F4ECF85A3AB}</LongDescription>
                </include>

                <!-- Included templates -->
                <include hint="list:AddIncludedTemplate">
                  <Product>{665DC431-673A-4D63-B9A6-00EB148E693C}</Product>
                </include>
              </documentOptions>
            </configuration>
            <strategies hint="list:AddStrategy">
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/syncMaster" />
            </strategies>
            <locations hint="list:AddCrawler">
              <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
                <Database>master</Database>
                <Root>/sitecore</Root>
              </crawler>
            </locations>
            <enableItemLanguageFallback>false</enableItemLanguageFallback>
            <enableFieldLanguageFallback>false</enableFieldLanguageFallback>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>

The two bits you’ll need to replace here are the bits commented as Included Fields and Included Templates:

<!-- Included fields -->
<include hint="list:AddIncludedField">
  <ProductName>{E676F36E-B0E0-4BE5-998A-329A8F9055FD}</ProductName>
  <LongDescription>{8A978A2E-0E7A-4415-9163-2F4ECF85A3AB}</LongDescription>
</include>

<!-- Included templates -->
<include hint="list:AddIncludedTemplate">
  <Product>{665DC431-673A-4D63-B9A6-00EB148E693C}</Product>
</include>

OK, done. I’ve added my list of templates, and fields here. So, can I reindex now and see my new content?

Absolutely. Go into Sitecore > Control Panel > Indexing Manager, find your index and rebuild it.

When you’re done, go back to the Solr UI and see your documents! If things didn’t go quite to plan, check your site’s Crawling.log, which will contain any indexing errors.

Production ready?

Well, not quite. You might want to create a sitecore_web_products_index and use the Sitecore.ContentSearch.Solr.Index.Web.config configuration file as an example of how to register it in Sitecore. Using Sitecore’s conventions for master and web keeps the surprises to a minimum.

Search on, pals!