Sitecore 7 & Solr – Multi term search

The Solr search provider in Sitecore 7 works quite well if the search query contains a single term. However, if the search query contains multiple terms, you will need to tweak your code a little bit to get the expected results.

The following code performs a search using a query that contains a single term.

    var indexName = "sitecore_web_index";
    var index = ContentSearchManager.GetIndex(indexName);
    using (var context = index.CreateSearchContext())
    {
        var query = "paint";
        // SearchItem stands in for your own result class; it must expose a Title property.
        var dataQuery = context.GetQueryable<SearchItem>()
            .Where(i => i.Title == query);
        var results = dataQuery.GetResults().Hits.Select(h => h.Document);
    }

The Sitecore search provider generates the following Solr query:

title_t:(paint)

However, if you replace

    var query = "paint";

with

    var query = "purple paint";

the generated Solr query will be:

title_t:("purple paint")

If the search query contains whitespace characters, the search provider automatically wraps the query terms in double quotes. Solr treats a double-quoted value as a phrase query, so it only matches titles in which the terms appear next to each other and in that order. It will return results containing the phrase “purple paint” in the title, and will fail to match titles such as “purple and red paint”.

In order to work around the whitespace problem, you can implement one of the following solutions:

Solution 1 – Using the PredicateBuilder

The idea here is to avoid generating any query that contains whitespace characters by converting a multi-term query into OR predicates, one per term, as follows:

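    var indexName = "sitecore_web_index";
    var index = ContentSearchManager.GetIndex(indexName);
    string keyword = "purple paint";
    // Seed an OR chain with False (an AND chain would start from True);
    // a True seed would make the whole OR predicate trivially true.
    // SearchItem stands in for your own result class with a Title property.
    var predicate = PredicateBuilder.False<SearchItem>();
    foreach (var word in keyword.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
    {
        predicate = predicate.Or(item => item.Title == word);
    }
    using (var context = index.CreateSearchContext())
    {
        var dataQuery = context.GetQueryable<SearchItem>().Where(predicate);
    }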

This will generate the following Solr query:

 title_t:(purple) OR title_t:(paint)

This is an OK solution, but I am not a big fan of it because:

  1. Solr can handle multi-term queries perfectly well on its own, provided the generated Solr query doesn’t contain double quotes, e.g. title_t:(purple paint).
  2. The above code snippet looks simple because the query is used to search one field (title_t). If the query has to search multiple fields, such as title, description, tags and taxonomy, you will end up writing a predicate for each field (four in this case), which is a bit messy.
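To make the second point concrete, here is a rough sketch of the same OR approach spread over three fields. The ArticleSearchItem class, its property names and the index field names are hypothetical and only serve the illustration; PredicateBuilder is assumed to come from Sitecore.ContentSearch.Linq.Utilities, and the chain is seeded with False, the usual starting point for OR.

```csharp
using System;
using System.Linq.Expressions;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq.Utilities;
using Sitecore.ContentSearch.SearchTypes;

// Hypothetical result type: the property and index field names are assumptions.
public class ArticleSearchItem : SearchResultItem
{
    [IndexField("title")]
    public string Title { get; set; }

    [IndexField("description")]
    public string Description { get; set; }

    [IndexField("tags")]
    public string Tags { get; set; }
}

public static class KeywordPredicates
{
    // Builds (Title == w1 OR Description == w1 OR Tags == w1 OR Title == w2 ...).
    public static Expression<Func<ArticleSearchItem, bool>> Build(string keyword)
    {
        var predicate = PredicateBuilder.False<ArticleSearchItem>();
        var words = keyword.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (var word in words)
        {
            // One extra line per searchable field, repeated for every word.
            predicate = predicate.Or(item => item.Title == word);
            predicate = predicate.Or(item => item.Description == word);
            predicate = predicate.Or(item => item.Tags == word);
        }

        return predicate;
    }
}
```

Every additional field adds another Or line inside the loop, so a two-word query over four fields already produces eight clauses.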

Solution 2 – Let Solr sort it out for you

The default configuration of the Solr search provider in Sitecore 7 maps the Sitecore text data types such as Rich Text, Single-Line Text and Multi-Line Text to the Solr text_general type.

The Solr text_general type is configured by default to use the solr.StandardTokenizerFactory tokenizer, followed by a few filters. The tokenizer splits the text into tokens using whitespace and punctuation characters as delimiters.

Luckily, the Solr search provider doesn’t wrap the generated query in double quotes if the query contains punctuation characters. So a simple query-reformatting function can fix the problem, as follows.

    private string FormatQuery(string query)
    {
        if (!string.IsNullOrEmpty(query))
        {
            // Replace spaces with '@' so the provider does not quote the query.
            return query.Trim(' ').Replace(' ', '@');
        }
        return query;
    }

The search code should then look like:

    var indexName = "sitecore_web_index";
    var index = ContentSearchManager.GetIndex(indexName);
    using (var context = index.CreateSearchContext())
    {
        var query = FormatQuery("purple paint");
        // SearchItem stands in for your own result class; it must expose a Title property.
        var dataQuery = context.GetQueryable<SearchItem>()
            .Where(i => i.Title == query);
        var results = dataQuery.GetResults().Hits.Select(h => h.Document);
    }

The generated Solr query will look like:

title_t:(purple@paint)

Solr then tokenises this automatically into (purple) and (paint).
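One caveat with FormatQuery as written: it only replaces the space character, so a tab would still count as whitespace, and two spaces in a row become two '@' characters. A slightly more defensive variant could collapse every run of whitespace into a single separator. This is only a sketch; the QueryFormatter class name is made up for the example, and '@' is kept as the separator purely because the function above uses it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QueryFormatter
{
    // Collapses each run of whitespace (spaces, tabs, newlines) into a single '@'
    // so that the value handed to the search provider contains no whitespace.
    public static string FormatQuery(string query)
    {
        if (string.IsNullOrWhiteSpace(query))
        {
            // Nothing to reformat; return the input unchanged.
            return query;
        }

        return Regex.Replace(query.Trim(), @"\s+", "@");
    }
}
```

With this version, QueryFormatter.FormatQuery("  purple   paint ") and QueryFormatter.FormatQuery("purple\tpaint") both return purple@paint.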

 

2 Comments

  1. Ali said:

    Hi,
    you can use the following line in your loop to get the desired results:
    predicate = predicate.And(item => item.Title == word);

    Change “Or” to “And”.

    December 22, 2015