Visualising Sitecore Analyzers

When Sitecore indexes your content, Lucene analyzers work to break down your text into a series of individual tokens. For instance, a simple analyzer might convert input text to lowercase, split into separate words, and remove punctuation:

  • input: Hi there! My name is Chris.
  • output tokens: “hi”, “there”, “my”, “name”, “is”, “chris”
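As a rough illustration of those three steps (this is not how Lucene implements its analyzers), a minimal sketch in plain C# might look like the following. The class and method names are made up for this example, and it relies on nothing beyond the .NET base class library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only; not part of Lucene or Sitecore.
public static class SimpleAnalyzerSketch
{
    // Lowercase the input, then split on any run of characters that is
    // not a letter or digit. Splitting this way also discards punctuation.
    public static string[] Tokenize(string text)
    {
        return Regex.Split(text.ToLowerInvariant(), @"[^\p{L}\p{Nd}]+")
                    .Where(token => token.Length > 0) // drop empty edge entries
                    .ToArray();
    }
}
```

Calling SimpleAnalyzerSketch.Tokenize("Hi there! My name is Chris.") returns the six tokens listed above.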

This happens behind the scenes and is rarely of interest outside of diagnostics or curiosity, but there’s a way we can view the output of the analyzers bundled with Sitecore.

Let’s get some input text to analyze, in both English and French:

var text = "Did the Quick Brown Fox jump over the Lazy Dog?";
var text_fr = "Le Fox brune rapide a-t-il sauté sur le chien paresseux?";

Next, let’s write a generic method which takes some text and a Lucene analyzer, and runs the text through the analyzer:

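private static void displayTokens(Analyzer analyzer, string text)
{
    // Run the text through the analyzer; "content" is the field name passed to it
    var stream = analyzer.TokenStream("content", new StringReader(text));
    var term = stream.AddAttribute<ITermAttribute>();
    // Print each token the analyzer emits
    while (stream.IncrementToken())
    {
        Console.Write("'" + term.Term + "', ");
    }
}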

Now, let’s try this out on some Sitecore analyzers!

  • CaseSensitiveStandardAnalyzer retains case, but removes punctuation and stop words (common words which offer no real value when searching)
displayTokens(new CaseSensitiveStandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30), text);
> 'Did', 'Quick', 'Brown', 'Fox', 'jump', 'over', 'Lazy', 'Dog'
  • LowerCaseKeywordAnalyzer converts the input to lowercase, but retains the punctuation and doesn’t split the input into separate words.
displayTokens(new LowerCaseKeywordAnalyzer(), text);
> 'did the quick brown fox jump over the lazy dog?'
  • NGramAnalyzer breaks text up into trigrams which are useful for autocomplete. See more here.
displayTokens(new NGramAnalyzer(), text);
> 'did_the_quick', 'the_quick_brown', 'quick_brown_fox', 'brown_fox_jump', 'fox_jump_over', 'jump_over_the', 'over_the_lazy', 'the_lazy_dog'
  • StandardAnalyzerWithStemming introduces stemming, which finds a common root for similar words (lazy, lazily, laze -> lazi)
displayTokens(new StandardAnalyzerWithStemming(Lucene.Net.Util.Version.LUCENE_30), text);
> 'Did', 'the', 'Quick', 'Brown', 'Fox', 'jump', 'over', 'the', 'Lazi', 'Dog'
  • SynonymAnalyzer lowercases the input, removes stop words, and adds synonyms loaded from an XML file (here, quick also produces fast and rapid)
displayTokens(new SynonymAnalyzer(new XmlSynonymEngine("synonyms.xml")), text);
> 'did', 'quick', 'fast', 'rapid', 'brown', 'fox', 'jump', 'over', 'lazy', 'dog'
  • Lastly, we try a FrenchAnalyzer. Stop words are language specific, and so the community often contributes analyzers which will remove stop words in languages other than English. In the example below, we remove common French words.
displayTokens(new FrenchAnalyzer(Lucene.Net.Util.Version.LUCENE_30), text_fr);
> 'le', 'fox', 'brun', 'rapid', 't', 'saut', 'chien', 'pares'
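To get a feel for what the NGramAnalyzer output above represents, here is a minimal sketch that builds the same kind of word trigrams in plain C#. It is an illustration only: WordTrigramSketch is a hypothetical helper, and the set of split characters is an assumption rather than Sitecore’s actual tokenization.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only; not part of Lucene or Sitecore.
public static class WordTrigramSketch
{
    public static List<string> Trigrams(string text)
    {
        // Lowercase, then split into words on spaces and basic punctuation
        // (an assumed, simplified tokenization).
        var words = text.ToLowerInvariant()
                        .Split(new[] { ' ', '?', '!', '.', ',' },
                               StringSplitOptions.RemoveEmptyEntries);

        // Slide a window of three words across the list, joining each with "_".
        var result = new List<string>();
        for (var i = 0; i + 3 <= words.Length; i++)
        {
            result.Add(string.Join("_", words.Skip(i).Take(3)));
        }
        return result;
    }
}
```

For the English sample sentence, this yields eight trigrams, from did_the_quick through the_lazy_dog.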

The full code is here: (https://gist.github.com/christofur/e2ea406c21bccd3b032c9b861df0749b)


using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.Tokenattributes;
using Sitecore.ContentSearch.LuceneProvider.Analyzers;
using System;
using System.IO;

namespace SitecoreAnalyzers
{
    class Program
    {
        static void Main(string[] args)
        {
            var text = "Did the Quick Brown Fox jump over the Lazy Dog?";
            var text_fr = "Le Fox brune rapide a-t-il sauté sur le chien paresseux?";

            Console.WriteLine("** CaseSensitiveStandardAnalyzer **");
            displayTokens(new CaseSensitiveStandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30), text);

            Console.WriteLine("** LowerCaseKeywordAnalyzer **");
            displayTokens(new LowerCaseKeywordAnalyzer(), text);

            Console.WriteLine("** NGramAnalyzer **");
            displayTokens(new NGramAnalyzer(), text);

            Console.WriteLine("** StandardAnalyzerWithStemming **");
            displayTokens(new StandardAnalyzerWithStemming(Lucene.Net.Util.Version.LUCENE_30), text);

            Console.WriteLine("** SynonymAnalyzer – see http://firebreaksice.com/sitecore-synonym-search-with-lucene/ **");
            displayTokens(new SynonymAnalyzer(new XmlSynonymEngine("synonyms.xml")), text);

            Console.WriteLine("** FrenchAnalyzer **");
            displayTokens(new FrenchAnalyzer(Lucene.Net.Util.Version.LUCENE_30), text_fr);

            Console.ReadKey();
        }

        private static void displayTokens(Analyzer analyzer, string text)
        {
            var stream = analyzer.TokenStream("content", new StringReader(text));
            var term = stream.AddAttribute<ITermAttribute>();
            while (stream.IncrementToken())
            {
                Console.Write("'" + term.Term + "', ");
            }
            Console.Write("\n\n");
        }
    }
}

Returning JSON errors from Sitecore MVC controllers

ASP.NET MVC gives us IExceptionFilter, with which we can create custom, global exception handlers to apply to controller actions.

using System.Web.Mvc;

public class ExceptionLoggingFilter : FilterAttribute, IExceptionFilter
{
	public void OnException(ExceptionContext filterContext)
	{
		// filterContext contains lots of information about our exception, controller, action, etc.
		var message = filterContext.Exception.Message;
		var stackTrace = filterContext.Exception.StackTrace;
		var controller = filterContext.Controller.GetType().Name;
		var result = filterContext.Result.GetType().Name;
		var userAgent = filterContext.HttpContext.Request.UserAgent;
		// Log or report these values here
	}
}

We can apply this filter to all action methods by adding it to the list of global filters:

public class FilterConfig {
	public static void RegisterGlobalFilters(GlobalFilterCollection filters) {
		filters.Add(new ExceptionLoggingFilter());
	}
}

and wiring this up to our application in our Application_Start method:

FilterConfig.RegisterGlobalFilters(GlobalFilters.Filters);

In Sitecore

As you may expect, Sitecore exposes this functionality through pipeline processors. Sitecore defines its own IExceptionFilter implementation (much like our snippet above), which kicks off the mvc.exception pipeline, passing along the ExceptionContext object.

As client developers, it is our job to create an appropriate processor to accept the ExceptionContext and do something with it. Let’s run through an example where we want to return a JSON representation of the error, loaded with as much useful information as possible.

For more reading on Sitecore controller actions returning JSON, have a look at John West’s post here: https://community.sitecore.net/technical_blogs/b/sitecorejohn_blog/posts/use-json-and-mvc-to-retrieve-item-data-with-the-sitecore-asp-net-cms

So, first up, create an empty handler class, which inherits from ExceptionProcessor:

public class JSONExceptionHandler :
	Sitecore.Mvc.Pipelines.MvcEvents.Exception.ExceptionProcessor
{
	public override void Process(Sitecore.Mvc.Pipelines.MvcEvents.Exception.ExceptionArgs args)
	{

	}
}

Create a Web.config include, to add this processor to the mvc.exception pipeline:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <mvc.exception>
        <processor type="Bleep.Handlers.JSONExceptionHandler, Bleep.Handlers"/>
      </mvc.exception>
    </pipelines>
  </sitecore>
</configuration>

Ok! Now our JSONExceptionHandler class will be called each time an exception occurs in MVC code. So, let’s grab all the detail we can from the ExceptionContext class and return it as JSON:

public override void Process(Sitecore.Mvc.Pipelines.MvcEvents.Exception.ExceptionArgs args)
{
	var filterContext = args.ExceptionContext;

	filterContext.Result = new JsonResult
	{
		JsonRequestBehavior = JsonRequestBehavior.AllowGet,
		Data = new
		{
			Message = filterContext.Exception.Message,
			StackTrace = filterContext.Exception.StackTrace,
			Controller = filterContext.Controller.GetType().Name,
			Result = filterContext.Result.GetType().Name,
			UserAgent = filterContext.HttpContext.Request.UserAgent,
			ItemName = args.PageContext.Item.Name,
			Device = args.PageContext.Device.DeviceItem.Name,
			User = filterContext.HttpContext.User.Identity.Name
		}
	};

	filterContext.ExceptionHandled = true;

	// Log the error
	Sitecore.Diagnostics.Log.Error("MVC exception processing "
		+ Sitecore.Context.RawUrl, args.ExceptionContext.Exception, this);
}

This will produce a result such as:

[Image: ExceptionFilter2]

Happy hacking!

A workaround for missing ViewData in Sitecore MVC

Passing data between Sitecore renderings can get tricky.

Sending messages between sibling renderings can lead us to worry about the order in which they render, and you may end up with renderings tightly coupled to other renderings. Jeremy Davis discusses ways to switch the order of rendering execution on his blog here: https://jermdavis.wordpress.com/2016/04/04/getting-mvc-components-to-communicate/

[Image: Share_Data_1]

Pass data down, not across

My preferred approach is for renderings to be as isolated as possible and not need to talk to siblings. In a regular MVC site, we would instantiate a ViewModel, and pass it down to any child (or partial) views as needed. If a child view doesn’t change this ViewModel at all, we don’t have to worry about order of execution or changes of state.

In Sitecore, we can achieve this by wrapping child renderings in a parent Controller Rendering. This Controller Rendering creates and prepares the ViewModel, and then passes it down to one or more child renderings.

[Image: Share_Data_2]

Let’s recap on the main points here:

  1. Our parent Controller Rendering creates and prepares a ViewModel. This parent specifies a view, which contains one or more placeholders.
  2. This ViewModel is passed along to any child renderings currently attached to the placeholders.
  3. During execution, child renderings do not modify the ViewModel. We may even consider the ViewModel immutable while rendering takes place.

Sitecore has a peculiarity here which makes our job difficult. Each rendering gets a new instance of ViewData – explained by Kern Herskind Nightingale here: http://stackoverflow.com/a/35210022/638064. This puts a stop to us using ViewData to pass our ViewModel down from the parent rendering to child renderings.

The Workaround

There’s a way you can ensure that ViewData is correctly passed down from parent to child renderings. Let’s go through how this is possible.

  1. In your top level controller, create a ViewModel, which will be passed down to all child renderings.
public ActionResult ParentContainer()
{
    var viewModel = new {PageSize = 3, CurrentPage = 2, Results = Sitecore.Context.Item.Fields["Results"].Value};
    return View();
}
  2. Add it to the ViewData collection in the current ViewContext.
public ActionResult ParentContainer()
{
    var viewModel = new {PageSize = 3, CurrentPage = 2, Results = Sitecore.Context.Item.Fields["Results"].Value};
    ContextService.Get().GetCurrent<ViewContext>().ViewData.Add("_SharedModel", viewModel);
    return View();
}
  3. In each child rendering, fetch the ViewModel and add it to the local ViewData for the current rendering (which will be empty at this point). View Renderings will do this step for you, so you don’t need to do anything special there.
public ActionResult ChildRendering()
{
    // Get any ViewData previously added to this ViewContext
    var contextViewData = ContextService.Get().GetCurrent<ViewContext>().ViewData;
    contextViewData.ToList().ForEach(x => ViewData.Add(x.Key, x.Value));
    return View();
}
  4. Et voila! You now have access to the same ViewModel for each of your child renderings.
@{
    Layout = null;
    var viewModel = ViewData["_SharedModel"];
}
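One wrinkle with the steps above: the model created in the parent controller is an anonymous type, so the view receives it as a plain object and cannot easily read PageSize or CurrentPage from it. A possible way around this, sketched below on the assumption that you are free to introduce a small named class, is to store a typed model and cast it when reading. SharedViewModel and SharedModelSketch are hypothetical names, and the dictionary parameter stands in for ViewData (ViewDataDictionary implements IDictionary<string, object>).

```csharp
using System.Collections.Generic;

// Hypothetical named model, replacing the anonymous type used in the parent controller.
public class SharedViewModel
{
    public int PageSize { get; set; }
    public int CurrentPage { get; set; }
    public string Results { get; set; }
}

// Hypothetical helper, for illustration only.
public static class SharedModelSketch
{
    // Returns the shared model, or null when the key is missing
    // or holds a value of some other type.
    public static SharedViewModel Read(IDictionary<string, object> viewData)
    {
        object value;
        return viewData.TryGetValue("_SharedModel", out value)
            ? value as SharedViewModel
            : null;
    }
}
```

In a child view, SharedModelSketch.Read(ViewData) would then give you a typed model, or null if the parent never added one.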

Making it better

MVC offers us even better tools to remove code duplication. If you have a lot of child renderings needing access to your shared ViewModel, you will end up repeating the code from step 3 in every one of them. Let’s refactor that into a filter attribute.

public class RetrieveViewDataFilter : ActionFilterAttribute
{
    // ActionFilterAttribute already implements IActionFilter, so we only override what we need
    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        // Merge ViewData from the current ViewContext into this controller's ViewData
        var contextViewData = ContextService.Get().GetCurrent<ViewContext>().ViewData;
        contextViewData.ToList().ForEach(x => filterContext.Controller.ViewData.Add(x.Key, x.Value));
    }
}

Now, we just need to add this attribute to any action methods that want to access shared ViewData from higher up in the stack:

[RetrieveViewDataFilter]
public ActionResult ChildRendering()
{
    return View();
}
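One thing to watch for: Add throws if a key is already present in the target dictionary. If a child action might put its own values into ViewData first, a more forgiving merge could look like the sketch below. ViewDataMergeSketch and MergeMissing are hypothetical names, and the plain IDictionary parameters stand in for the two ViewData instances.

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class ViewDataMergeSketch
{
    // Copies entries from source into target, but keeps any value
    // the target already holds for the same key.
    public static void MergeMissing(
        IDictionary<string, object> source,
        IDictionary<string, object> target)
    {
        foreach (var pair in source)
        {
            if (!target.ContainsKey(pair.Key))
            {
                target[pair.Key] = pair.Value;
            }
        }
    }
}
```

Keeping the child’s own values on a clash is a design choice here; if the parent’s values should win instead, drop the ContainsKey check.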

There we go. I’m sure Sitecore will amend their implementation at some point, but until then, we have an immutable, single-direction ViewData flow.