Enable xConnect on a local developer machine


In this post

Example configuration files and certificate set-up steps.

What I wanted to do

Install xConnect alongside Sitecore XP 9.0.1 on a developer machine, in xp0 configuration. I didn’t have Powershell 5.1 installed, so had to go ahead without SIF. Yes, it was a bit of a nightmare.


  • Packages for XP Single from Sitecore Downloads
  • DACPAC databases from that package installed
  • xConnect IIS site from that package hosted locally (devxc.perks.com)
  • Sitecore XP9 hosted locally (dev.perks.com)

Generate a server certificate

New-SelfSignedCertificate -certstorelocation cert:\LocalMachine\My -dnsname *.perks.com

Generate a client certificate

New-SelfSignedCertificate -certstorelocation cert:\LocalMachine\My -dnsname devxc.perks.com

Note down the thumbprint for later:

Thumbprint                                Subject
----------                                -------
7E8DAE07DA298A9681D867F4B65BF4241C064A92  CN=devxc.perks.com

Export and Import

Export the client and server certificates (using Certificate Manager) and re-import them to the following locations:

  • LocalMachine > Trusted Root Certification Authorities

Assign the *.perks.com certificate

Assign the *.perks.com server certificate to the IIS sites dev.perks.com and devxc.perks.com

Ensure devxc.perks.com has Require SSL [ON] and Client certificates [Accept]

Add certificate details to Sitecore XP Connectionstrings.config

<add name="xconnect.collection.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.referencedata.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.marketingautomation.reporting.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.marketingautomation.operations.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />

Add certificate details to xConnect Connectionstrings.config

<add name="xconnect.collection.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />
<add name="xdb.referencedata.client.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=7E8DAE07DA298A9681D867F4B65BF4241C064A92;AllowInvalidClientCertificates=true" />

Modify xConnect AppSettings.config

<add key="AllowInvalidClientCertificates" value="true" />
<add key="validateCertificateThumbprint" value="7E8DAE07DA298A9681D867F4B65BF4241C064A92" />

Restart! Restart!

Restart IIS, your machine, switch your house lights on and off a few times. Open a window.


Export and Import the certificates into:

  • Current User > Personal
  • Current User > Trusted Root Certification Authorities

Use Certificate Manager to grant Read permissions to your certificates.

Remove any non-self-signed certificates from your Local Machine > Trusted Root Certification Authorities store. Beware doing this on your work PC, as some corporate certificates may be changed.

Associated error messages

FATAL [Experience Analytics]: Failed to synchronize segments. Message: Ensure definition type did not complete successfully. StatusCode: 401, ReasonPhrase: 'Invalid certificate', Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
Exception: System.InvalidOperationException
Message: The certificate was not found.
Source: Sitecore.Xdb.Common.Web
   at Sitecore.Xdb.Common.Web.CertificateWebRequestHandlerModifier.Process(HttpClientHandler handler)
ERROR Exception when executing agent aggregation/aggregator
Exception: Sitecore.XConnect.XdbCollectionUnavailableException
Message: The HTTP response was not successful: Forbidden
Source: Sitecore.Xdb.Common.Web

Next steps

Please don’t use any of these steps in production! I’m only hacking around to get things running locally.

For further reading, check out:


Comment here or find @perks on Twitter. Feedback and corrections happily received.

xConnect error when using Deploy Marketing Definitions tool

In this post

Installing xConnect to an 8.0 > 9.0.1 Sitecore upgrade resulted in some duplicate items in the content tree. This stops the Deploy Marketing Definitions tool from completing.


I had nearly completed an installation of xConnect with Sitecore 9.0.1, in xp0 configuration. Toward the end, I used the Control Panel > Analytics > Deploy Marketing Definitions tool. It thought for a little while, then blew up, asking me to check the logs. When I did, I found this message:

8876 16:00:59 ERROR One or more exceptions occurred while processing the subscribers to the 'item:saving' event.
Exception[1]: System.InvalidOperationException 
Message[1]: Multiple items were found by alias 'Field Completed' 
Source[1]: Sitecore.Marketing.xMgmt 
 at Sitecore.Marketing.Definitions.Repository.ItemDefinitionRepositoryBase`1.GetItemIdByAlias(String alias)
 at Sitecore.Marketing.Definitions.Repository.ItemDefinitionRepositoryBase`1.GetByAlias(String alias, CultureInfo cultureInfo, Boolean includeInactiveVersion)
 at Sitecore.Marketing.Definitions.DefinitionManagerBase`2.GetByAlias(String alias, CultureInfo cultureInfo, Boolean includeInactiveVersion)
 at Sitecore.Marketing.xMgmt.Definitions.ItemEventHandler.ValidateAlias[TDefinitionInterface](ItemData itemData, Template itemTemplate, Guid expectedTemplateId, Dictionary`2 templateIdsInheritanceDictionary)
 at Sitecore.Marketing.xMgmt.Definitions.ItemEventHandler.ValidateItemName(ItemData itemData)
 at Sitecore.Marketing.xMgmt.Definitions.ItemEventHandler.OnItemSaving(Object sender, EventArgs args)
 at Sitecore.Events.Event.EventSubscribers.RaiseEvent(String eventName, Object[] parameters, EventResult result)


There were two Field Completed items (with the same template), in my tree at master:/sitecore/system/Settings/Analytics/Page Events/*

Checking the create dates, I see that the items in the Forms folder are newer. I guess the earlier ones are an overhang from our previous Sitecore 8 installation. In any case, I just renamed the older versions to {0} Old and now the Deploy Marketing Definitions tool completes. I’m now going to remove the duplicate items entirely, and ensure any links are redirected to the new version.

Create a custom Solr index in Sitecore 9

Hello there. 

Hi! So you want to create a new Solr index?

Yes, I think so?

It’s a great idea. You’ll be familiar with the big three, sitecore_core_index, sitecore_master_index and sitecore_web_index, but you don’t have to stop there! You can create individual indexes for certain content types on your site, such as Products. Smaller, more individualised indexes are easier to maintain, troubleshoot, faster to rebuild and can be faster to query.

Are they hard to set up?

Not as hard as you’d expect! Let’s create one now.

OK. My Solr is set up and I can access the web UI on https://solr:8983/solr/#/ – what now?

Let’s create the physical Solr core.

  1. Find your Solr index folder for the sitecore_master_index. Mine was at C:\solr\solr-6.6.2\server\solr\sitecore_master_index
  2. Copy this whole folder (into the same parent folder) and call it sitecore_master_products_index
  3. Inside the sitecore_master_products_index folder, open up the core.properties file and change the name property to read sitecore_master_products_index
  4. Restart Solr (I use the solr stop and solr start commands – see below)
  5. Now, go to https://solr:8983/solr/#/ and check out your cores – you will have a new one!

Awesome, it’s there. So I get that we copied the sitecore_master_index and renamed it to sitecore_master_products_index – and in Solr I can see that it contains thousands of documents already, copied from sitecore_master_index. How do I clean things up?

Well, good question. We want to delete all of the existing items in this index and start afresh. You can do this via a web browser – just call this URL:


Radical. Everything is deleted. Soo. I want to use this index to only contain certain types of content from Sitecore. How do I configure it properly?

We need to just add a single configuration file to Sitecore. It’s below. It looks mostly like the configuration file for sitecore_master_index, but we change two important things, (a) which template types we want to include in our index and (b) which field types we want to include in our index. In your real solution, this will take a bit of time to set up, but being selective is the whole point of creating a custom index, and you’ll want to keep it as trim as possible.

Here’s the whole config file, which I’ve called Sitecore.ContentSearch.Solr.Index.Master.Products.config:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:search="http://www.sitecore.net/xmlconfig/search/">
  <sitecore role:require="Standalone or ContentManagement" search:require="solr">
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="sitecore_master_products_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
            <param desc="name">$(id)</param>
            <param desc="core">$(id)</param>
            <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
              <configuration ref="contentSearch/indexConfigurations/defaultSolrIndexConfiguration">
                  <documentOptions type="Sitecore.ContentSearch.SolrProvider.SolrDocumentBuilderOptions, Sitecore.ContentSearch.SolrProvider">

                      <!-- Included fields -->
                      <include hint="list:AddIncludedField">

                      <!-- Included templates -->
                      <include hint="list:AddIncludedTemplate">

            <strategies hint="list:AddStrategy">
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/syncMaster" />
            <locations hint="list:AddCrawler">
              <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">

The two bits you’ll need to replace here are the bits commented as Included Fields and Included Templates:

<!-- Included fields -->
<include hint="list:AddIncludedField">

<!-- Included templates -->
<include hint="list:AddIncludedTemplate">

OK, done. I’ve added my list of templates, and fields here. So, can I reindex now and see my new content?

Absolutely. Go into Sitecore > Control Panel > Indexing Manager, find your index and rebuild it.

When you’re done, go back to the Solr UI and see your documents! If things didn’t go quite to plan, check in your site Crawling.log, which will contain any indexing errors.

Production ready?

Well, not quite. You might want to create a sitecore_web_products_index and use the Sitecore.ContentSearch.Solr.Index.Web.config configuration file as an example of how to register it in Sitecore. Using Sitecore’s conventions for master and web keep the surprises to a minimum.

Search on, pals!


Sitecore 9: ContentSearch Solr query quirks with spaces and wildcards

Sitecore provides a Linq powered IQueryable mechanism with which you can build powerful search queries. Your query will be translated into a native query for your underlying search engine (eg. Solr). There are some odd quirks (bugs?) with this translation in Sitecore 9.0 and 9.0.1 when your search term includes a space. Let’s take a look.

In the below examples, context is an instance of IProviderSearchContext, which you’d typically wire up with dependency injection. In each case, we’re looking to query something from the index based the item’s path in the Sitecore tree.

Querying on exact matches:

context.GetQueryable().Where(x => x.Path == "Hello");
 Translates to: {_fullpath:(Hello)}

Ok! This makes sense.

context.GetQueryable().Where(x => x.Path == "Hello World");
 Translates to: {_fullpath:("Hello World")}

Notice that if your query term has a space, we need to wrap the term in quotes.

context.GetQueryable().Where(x => x.Path == "\\Hello");
 Translates to: {_fullpath:(\\Hello)}

Backslash? No problem.

context.GetQueryable().Where(x => x.Path == "/Hello");
 Translates to: {_fullpath:(\/Hello)}

Forwardslash? We need to escape that with a ‘\’

context.GetQueryable().Where(x => x.Path == "\\Hello World");
 Translates to: {_fullpath:("\\Hello World")}

Backslash with space? No problem, just add the quotes.

context.GetQueryable().Where(x => x.Path == "/Hello World");
 Translates to: {_fullpath:("\/Hello World")}

As above, we’re all good, the forwardslash is just escaped.

Querying on partial matches – where things get interesting:

context.GetQueryable().Where(x => x.Path.Contains("Hello"));
 Translates to: {_fullpath:(*Hello*)}

All good. Here, we wrap our search term in a wildcard, *

context.GetQueryable().Where(x => x.Path.Contains("Hello World"));
 Translates to: {_fullpath:("\*Hello\\ World\*")}

Uh oh! Something weird has happened. The quotes and wildcard seem to have got mixed up, and we’ve ended up with something which won’t return the results we want. Having read more about wildcard / space combinations here , we probably want to end up with something simpler, like {_fullpath:(*Hello\ World*)}

context.GetQueryable().Where(x => x.Path.Contains("\\Hello"));
 Translates to: {_fullpath:(*\\Hello*)}

No problem with this partial match, as we don’t have a space to deal with.

context.GetQueryable().Where(x => x.Path.Contains("/Hello"));
 Translates to: {_fullpath:(*\/Hello*)}

Again, fine.

context.GetQueryable().Where(x => x.Path.Contains("\\Hello World"));
 Translates to: {_fullpath:("\*\\Hello\\ World\*")}

The space completely breaks everything here

context.GetQueryable().Where(x => x.Path.Contains("/Hello World"));
 Translates to: {_fullpath:("\*\/Hello\\ World\*")}

and here..


I raised this with Sitecore and it has been raised as a bug. In the meantime – if you can get away with using StartsWith rather than Contains, you’ll find this works OK:

context.GetQueryable().Where(x => x.Path.StartsWith("Hello World"));
 Translates to: {_fullpath:(Hello\ World*)}

Which is just about perfect.

Sitecore 9 : The partial view ‘/sitecore/shell/client/Speak/Layouts/Layouts/Speak-Layout.cshtml’ was not found or no view engine supports the searched locations. The following locations were searched: /sitecore/shell/client/Speak/Layouts/Layouts/Speak-Layout.cshtml

When logging into Sitecore 9 and trying to access the Launchpad (/sitecore/shell/sitecore/client/applications/launchpad) – you may get the following error:

The partial view '/sitecore/shell/client/Speak/Layouts/Layouts/Speak-Layout.cshtml' was not found or no view engine supports the searched locations. The following locations were searched:

This is actually an easy one – make sure you have configured your site to either ContentManagement or Standalone mode. In your Web.config:

<add key="role:define" value="ContentManagement" />


In my case – for testing, I had swapped to ContentDelivery mode and forgotten to change back.

Sitecore Solr Error: Processing Field Name. Resolving Multiple Field found on Solr Field Map. No matching template field on index field name, return type ‘String’ and field type ”

After an upgrade to Sitecore 9, our Sitecore search logs were filled with thousands of warnings, like the below:

WARN Processing Field Name : Overview Text. Resolving Multiple Field found on Solr Field Map. No matching template field on index field name 'overview_text', return type 'String' and field type ''

What’s the fix?

You need to add field mappings for each of the fields in your Solr index. In our case, we had no mapping for ‘overview_text’, so Sitecore / Solr didn’t know how to treat the field. Add a config patch and specify a returnType for the fields you see as warnings in the log:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:search="http://www.sitecore.net/xmlconfig/search/">
            <fieldNames hint="raw:AddFieldByFieldName">
              <field fieldName="overview_text" returnType="text" />

** UPDATE 25/01/2018 **

While the above is suitable for adding a few fields, having hundreds or thousands of fields in your Solr index will lead to having to maintain lots of the above configuration entries. I raised a ticket with Sitecore, and was told “We registered this behavior as a bug with the reference #​195567”. Sitecore’s suggested workaround is to add a log4Net filter which will stop the problematic entries from reaching the log. For example:

<!-- Filter out Solr log warnings-->
      <appender name="SearchLogFileAppender">
        <filter type="log4net.Filter.StringMatchFilter">
          <regexToMatch  value="Resolving Multiple Field found on Solr Field Map. No matching solr search field configuration on index field name|Search field name in Solr with Template Resolver is returning no entry|Resolving Multiple Field found on Solr Field Map. No matching template field on index field name|Solr with Template Resolver is returning multiple entry|is being skipped. Reason: No Field Type Name" />
          <acceptOnMatch value="false" />

Hopefully a proper fix or configuration guidance will be released at some point.

Sitecore Solr setup: Document is missing mandatory uniqueKey field: id

While reconfiguring Sitecore (8.2u5) to use Solr (6.6.1) instead of Lucene, I came across the following error:

Document is missing mandatory uniqueKey field: id

In full:

Job started: Index_Update_IndexName=sitecore_master_index|#Exception: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> SolrNet.Exceptions.SolrConnectionException: <?xml version="1.0" encoding="UTF-8"?>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">1</int></lst><lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst><str name="msg">Document is missing mandatory uniqueKey field: id</str><int name="code">400</int></lst>
 ---> System.Net.WebException: The remote server returned an error: (400) Bad Request.
 at System.Net.HttpWebRequest.GetResponse()
 at HttpWebAdapters.Adapters.HttpWebRequestAdapter.GetResponse()
 at SolrNet.Impl.SolrConnection.GetResponse(IHttpWebRequest request)
 at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters)
 --- End of inner exception stack trace ---
 at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters)
 at SolrNet.Impl.SolrConnection.Post(String relativeUrl, String s)

Here’s what to check.

  • Does your Solr core index config directory have a file called managed-schema? If so, delete this file and reload the core. Solr will be ignoring any changes you’re making to schema.xml and using managed-schema instead. Deleting this file and reloading the core will pick up your latest version of schema.xml


Delete this file



Reload the core



Rebuild the index in Sitecore and the error should be gone. 


Digging into Unicorn configs in Sitecore Habitat

Sitecore Habitat ships with Unicorn as the serialization utility of choice. Unicorn works in the background during the development of a Sitecore implementation and writes out a YAML copy of templates, renderings and other Sitecore items to disk.
You can then add these YAML (.yml) files to your version control repository, merge changes with other developers and ensure they are picked up / deployed to target environments as part of your continuous integration (CI) pipeline.

In Habitat, we bundle a Unicorn-specific configuration file with each module – in line with the Helix principle that any assets relating to a module should be bundled and deployed with that module.

If we had a feature module called ‘Car’, we’d create a config patch in App_Config/Include/Feature called Feature.Car.Serialization.Config

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
    <configuration name="Feature.Car" description="Feature Car" dependencies="Foundation.*" extends="Helix.Feature">
      <include domain="modules" pattern="^Feature Car .*$" />

The key line setting up this configuration file is:

<configuration name="Feature.Car" description="Feature Car" dependencies="Foundation.*" extends="Helix.Feature">

Aside from the obvious name and description, dependencies tells Unicorn to run other configurations before this one (in our case, run any other configuration named Foundation.* first. Extends allows us to inherit from a parent config, Helix.Feature. This is a great feature which removes unnecessary code duplication in Helix solutions (which can have hundreds of modules!)

With this simple config (which extends the Helix.Feature abstract configuration), Unicorn will sync any Sitecore definition items under any folder called /Feature/Car in Sitecore. Happily, the Helix.Feature abstract config will ensure that Unicorn puts its .yml files alongside the rest of your source code for the Car module. We can track down the configuration line which determines the serialization location in Unicorn.Helix.config:

<configuration name="Helix.Feature" abstract="true" extends="Helix.Base">
   <include name="Templates" database="master" path="/sitecore/templates/$(layer)/$(module)" />
   <include name="Renderings" database="master" path="/sitecore/layout/renderings/$(layer)/$(module)" />
   <include name="Media" database="master" path="/sitecore/media library/$(layer)/$(module)" />


Read more about Unicorn configuration at Kam’s blog.

Elasticon London 2017 (for a Sitecore developer)

Even though Elasticsearch is built on the same foundations (Apache Lucene) as Solr, we in the Sitecore community don’t see a lot of cases where Solr or Coveo have been replaced by Elasticsearch as the main search component.

Today I’ve been at Elasticon London 2017 and have been soaking up new product releases by Elastic. Here’s some notes I scribbled and points which would be interesting to the Sitecore world.

Elasticsearch 6.0

  • Elasticsearch 6.0 is the next major release – but no firm release date has been set just yet. The official answer right now is “coming soon”.
  • Elastic have recognised how painful the upgrade process is – particularly between 2.x and 5.x. When performing the upgrade, you must re-index all of your data to make it 5.x compatible – a huge job on large clusters!
  • While the 2.x to 5.x upgrade path isn’t going to get much easier, Elastic are making sure that the 5.x to 6.x upgrade path doesn’t require you to re-index all data, or take nodes offline.
  • The .NET client creates REST requests with a JSON payload which proxies requests and responses between your code and the Elasticsearch cluster. This is the client model they’re sticking with, and are actually rewriting the native Java client (which currently doesn’t generate REST requests) to be more in line with the .NET client.
  • When it comes to swapping out Solr and Coveo for Elasticsearch, I think this would be a very individual decision based on the needs of your project.




Anomaly Detection

  • You can now create and set off unsupervised machine learning jobs to continually parse any data and highlight anomalies.
  • The engineer I spoke to said usually “around three weeks” of learning will be enough to begin pulling out anomalies.
  • This could have applications ranging from security (detecting usual IP or DNS activity), to marketing: spotting if a ‘suspicious’ user journey is taking place (whatever this may be!) or perhaps highlighting if a user is stuck or lost.
  • The Elastic stack uses Beats – agents which can monitor a set of files, database, or network packets and streams the data into an Elasticsearch instance / cluster.




  • Elastic founder and CEO Shay Banon announced during his keynote that Elastic have acquired Opbeat, a Copenhagen based company whose product adds monitoring and profiling to JavaScript applications (think Node.js, React and Angular).
  • While this might have limited applicability for a lot of Sitecore solutions (where React and Angular might not be the norm), the interesting thing here will be to wait and see how Elastic fit Opbeat into their stack. My guess would be that they’ll extend the product and make it a more general-purpose monitoring and profiling tool.



Machine Learning

  • Elastic’s Steve Dodson (who heads up the Machine Learning product) showed us the current offering, which again is centred around anomaly detection.
  • Most of Elastic’s demo use-cases for anomaly detection are for ops-level indicators like 404’s, 500s, response time, DDOS detection, and so on. Machine Learning kicks in to ignore ‘regular’ surges such as an increase in page response time during weekly batch jobs – but still alerting if the surge is stronger than usual.
  • With some fiddling, you could set up an anomaly detection profiler which tolerates a certain amount of server errors after a code release (and assumes you’ll fix them), but alerts you if it looks like you’ve broken something really big.
  • There was a preview of forecasting, a feature of the upcoming Elasticsearch 6.0. Forecasting does exactly as you’d expect – look at historical data and predict statistics for a future window.



Elasticsearch SQL

Being an ex-database nerd, I loved this session. I’m not sure how the wider search community are going to feel about writing SQL again, but here’s what Elastic have in development:

  • A SQL-like DSL which is 50% ANSI SQL and 50% Elasticsearch-specific syntax additions.
  • You can run queries like SHOW TABLES to list all indexes, DESCRIBE my_index to show fields (columns) and datatypes. You can run search queries like this:
SELECT * FROM my_index WHERE QUERY('+chris perks')
  • All SQL queries translate into the same old Elasticsearch QueryDSL
  • Other constructs they’re including are GROUP BY, HAVING, even JOINs are in there – all translating to their equivalent Elasticsearch QueryDSL commands.
  • You can even wrap SQL in JSON and use it via a REST call *confused face emoji*
  • This is still heavily in development and won’t be released for a while.



There’s plenty of overlap with what the Elastic stack offers, and what you’ll already have set up with Sitecore, Solr, and xDB. There’ll be a fair amount of plumbing work to get Elasticsearch set up properly with Sitecore, whereas you get this out of the box for Solr and/or Coveo.

As Elastic expand, they’re adding many new tools and capabilities to their stack, so it’s definitely not correct to see Elasticsearch as a like-for-like replacement for Solr or Coveo or parts of xDB.

I can see use cases for engaging Elasticsearch alongside your current setup, if you either need a particular Elastic capability which Sitecore / Solr doesn’t give you (such as Machine Learning, or streaming data). Or, you have more faith in the scaling capability of Elasticsearch than you do Sitecore and Solr.


Visualising Sitecore Analyzers

When Sitecore indexes your content, Lucene analyzers work to break down your text into a series of individual tokens. For instance, a simple analyzer might convert input text to lowercase, split into separate words, and remove punctuation:

  • input: Hi there! My name is Chris.
  • output tokens: “hi”, “there”, “my”, “name”, “is”, “chris”

While this happens behind the scenes, and is usually not of too much interest outside of diagnostics or curiosity, there’s a way we can view the output of the analyzers bundled with Sitecore.

Let’s get some input text to analyze, in both English and French:

var text = "Did the Quick Brown Fox jump over the Lazy Dog?";
var text_fr = "Le Fox brune rapide a-t-il sauté sur le chien paresseux?";

Next, let’s write a generic method which takes some text and a Lucene analyzer, and runs the text through the analyzer:

private static void displayTokens(Analyzer analyzer, string text)
    var stream = analyzer.TokenStream("content", new StringReader(text));
    var term = stream.AddAttribute();
    while (stream.IncrementToken())
      Console.Write("'" + term.Term + "', ");

Now, let’s try this out on some Sitecore analyzers!

  • CaseSensitiveStandardAnalyzer retains case, but removes punctuation and stop words (common words which offer no real value when searching)
displayTokens(new CaseSensitiveStandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30), text);
> 'Did', 'Quick', 'Brown', 'Fox', 'jump', 'over', 'Lazy', 'Dog'
  • LowerCaseKeywordAnalyzer convers the input to lowercase, but retains the punctuation and doesn’t split the input into separate words.
displayTokens(new LowerCaseKeywordAnalyzer(), text);
> 'did the quick brown fox jump over the lazy dog?
  • NGramAnalyzer breaks text up into trigrams which are useful for autocomplete. See more here.
displayTokens(new NGramAnalyzer(), text);
> 'did_the_quick', 'the_quick_brown', 'quick_brown_fox', 'brown_fox_jump', 'fox_jump_over', 'jump_over_the', 'over_the_lazy', 'the_lazy_dog
  • StandardAnalyzerWithStemming introduces stemming, which finds a common root for similar words (lazy, lazily, laze -> lazi)
displayTokens(new StandardAnalyzerWithStemming(Lucene.Net.Util.Version.LUCENE_30), text);
> 'Did', 'the', 'Quick', 'Brown', 'Fox', 'jump', 'over', 'the', 'Lazi', 'Dog'
displayTokens(new SynonymAnalyzer(new XmlSynonymEngine("synonyms.xml")), text);
> 'did', 'quick', 'fast', 'rapid', 'brown', 'fox', 'jump', 'over', 'lazy', 'dog
  • Lastly, we try a FrenchAnalyzer. Stop words are language specific, and so the community often contributes analyzers which will remove stop words in languages other than English. In the example below, we remove common French words.
displayTokens(new FrenchAnalyzer(Lucene.Net.Util.Version.LUCENE_30), text_fr);
> 'le', 'fox', 'brun', 'rapid', 't', 'saut', 'chien', 'pares'

The full code is here: (https://gist.github.com/christofur/e2ea406c21bccd3b032c9b861df0749b)