Extending Sitefinity Search and searching by category

Posted on October 25, 2012

Overview

Sitefinity Search, true to the Sitefinity mantra, provides not only a powerful set of features and a user-friendly UI to manage them, but also a set of extension points that give developers the freedom to make whatever tweaks their requirements call for.

In this blog post we discuss the extension points that the search feature provides and illustrate them with a sample that hooks into three parts of the search utility: the index, the search box widget and the search results widget.

The example

To illustrate the ideas discussed in this blog post we will walk through a code sample that implements searching by category. This consists of:

  • Including categories in our search index document and our search results
  • Extending the search box to include a dropdown for our existing categories
  • Extending the result widget to pass a query to Lucene that filters by the category 



But first, let's get down to some of the core concepts of Sitefinity search and how it can be extended.

Extending Search

In terms of usability, Sitefinity search gives administrators and business users the ability to set up very sophisticated search functionality. But it also goes beyond this: it gives developers room to be creative and extend search based on ideas and requirements that do not necessarily fit inside a box. Those extension points are exactly what we discuss in this post. To get into more detail, let us first talk about the engine that Sitefinity Search is based on.

Sitefinity and Lucene .NET

Sitefinity provides its own search implementation on top of the increasingly popular Lucene .NET search engine. This brings a few benefits:

  • The search utility supports the immense list of great features that Lucene provides, including scalability, performance, efficient search algorithms, fielded searching, simultaneous updates and searching, powerful query types, etc.
  • The search utility is entirely in the hands of the Sitefinity development team, so there is no need to rely on a third-party search appliance that may or may not provide extension points. This allows us to create great features like content-aware search, search in documents, search in custom fields, search in custom modules and so on.
  • It also lets you use our extension points to meet the requirements at hand. With Lucene, Sitefinity can expose several types of hooks into the search functionality, allowing developers to plug into the engine and customize the search process to a very large extent.
  • Sitefinity Search does not require additional licensing and does not rely on expensive search services or appliances.

Core components of search and indexing

Every search engine essentially consists of a crawler (spider), an index and a search box. Crawling is not really the right term for a built-in search engine, though: the CMS notifies the index about changes, so no crawling is needed.

Indexing is the process of extracting plain text from the CMS items (pages, blog posts, news) and saving it in a structure optimized for fast searching.

Sitefinity creates and writes to index files that are stored in the file system – in the App_Data folder of the Sitefinity site.

What actually gets indexed?

By default the indexed fields are Title, Content, Summary, ContentType, OriginalItemId, Link, DateCreated, PipeId, Language and IdentityField. This is enough to resolve each content item, but there is a reason we index the content with Lucene instead of querying the database: performance. Adding additional fields is supported out of the box.


How the search works

Now that we have some grounding in the engine behind the search indexes, let's look at the search process itself. Each time a user creates content (e.g. news, events, list items, custom content, pages) the publishing system kicks in.

When you publish, you invoke a content pipe that persists data to the publishing point. In simpler terms, Sitefinity not only writes data to the database but also exposes this data and asks other systems: "Hey, do you want to do your thing with this data, without bothering me with the details?" In technical terms, the place where this question is asked is the publishing point. From there the search outbound pipe takes all the data it wants and writes it into the Lucene indices, much in the same way that the RSS pipe takes this data to expose an RSS feed or the Twitter pipe... well... tweets.
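The publish-and-notify idea above can be sketched in a few lines of C#. This is only an illustration of the pattern: `IOutboundPipe`, `RecordingPipe` and `PublishingPoint` are hypothetical names made up for this sketch and are not Sitefinity's actual classes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; these are not Sitefinity's classes.
public interface IOutboundPipe
{
    void Push(IDictionary<string, string> item);
}

// Stand-in for a search or RSS pipe: it only records what it was handed.
public class RecordingPipe : IOutboundPipe
{
    public readonly string Name;
    public readonly List<string> Log = new List<string>();

    public RecordingPipe(string name)
    {
        Name = name;
    }

    public void Push(IDictionary<string, string> item)
    {
        // A real search pipe would write to the index here.
        Log.Add(Name + " received: " + item["Title"]);
    }
}

public class PublishingPoint
{
    private readonly List<IOutboundPipe> pipes = new List<IOutboundPipe>();

    public void Subscribe(IOutboundPipe pipe)
    {
        pipes.Add(pipe);
    }

    // Called once per published item; the publisher does not know what each pipe does.
    public void Publish(IDictionary<string, string> item)
    {
        foreach (IOutboundPipe pipe in pipes)
        {
            pipe.Push(item);
        }
    }
}
```

Subscribing two `RecordingPipe` instances and calling `Publish` once hands the same item to both of them, which is the "do your thing with this data" step in miniature.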

From this point on the ball is in the court of the search pipe, which translates Sitefinity content into something Lucene can understand. The pipe sends all the needed data to the search provider, which then writes to the Lucene files. Technically, the search index is a set of segmented binary files, and you can think of it as the physical representation of the content items, with all the relevant fields persisted. The files would probably not tell you much if you opened them in Notepad, but you can think of their contents as key:value pairs of field:indexed content.
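To make the key:value picture concrete, here is a minimal sketch of one indexed item represented as a plain dictionary. The field names are the defaults listed earlier in this post plus the Categories field used in the sample; every value is invented for illustration, and the real index stores this in Lucene's binary format, not in a .NET dictionary.

```csharp
using System;
using System.Collections.Generic;

public static class IndexDocumentSketch
{
    // Field names come from the post; all values are made-up examples.
    public static Dictionary<string, string> Create()
    {
        return new Dictionary<string, string>
        {
            { "Title", "Party in New York" },
            { "Content", "Plain text extracted from the item body" },
            { "Summary", "Short description of the item" },
            { "ContentType", "Telerik.Sitefinity.News.Model.NewsItem" },
            { "OriginalItemId", "00000000-0000-0000-0000-000000000001" },
            { "Link", "/news/party-in-new-york" },
            { "DateCreated", "2012-10-25" },
            { "PipeId", "hypothetical-pipe-id" },
            { "Language", "en" },
            { "IdentityField", "hypothetical-identity-value" },
            { "Categories", "Community" }
        };
    }
}
```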

As we keep publishing and editing content in Sitefinity, the search indexes are updated for us automatically. When we click reindex, the same work is done in batch.

Figure: How Sitefinity writes to the search index

What happens when we search

The search box widget is probably the simplest widget you can think of: it simply points the browser to the URL of a search results page.

There, the search results widget is the one that actually talks to the Sitefinity APIs. The API here is the search service, which passes the query to Lucene. So the anatomy of search comes down to three extensible components (the index, the search box widget and the search results widget), each exposing its own extension points.



What parts can be extended?

Now that we have a better understanding of the entire search infrastructure, a logical question is where it can be extended.

1)      Presentation. The search box and search results widgets are based on widget templates that can be modified to display practically anything: fields, images, promoted searches. Code-behind can also be added to the results widget. Lucene gives us back enough information about the indexed content items (besides the indexed fields there is also metadata such as the item GUID, URL, default page and type) to give you everything you need to extend this almost without limit.

2)      Indexed fields. Sitefinity lets you define custom fields and custom modules. These are part of the search infrastructure, and you can easily define which of them you want Sitefinity to search. For more information, check this blog post.

3)      Search queries. The queries you type in have the full power of the Lucene query syntax, which means:

  • Wildcards (Party in New * returns Party in New York and Party in New Mexico)
  • Phrase queries ("new york")
  • Proximity queries ("foo bar"~4 means foo and bar are within 4 words of each other)
  • Keyword matching (title:"foo bar" AND body:"quick fox")
  • Boolean conditions (title:Sitefinity AND content:Party)
  • Boosts ((title:foo OR title:bar)^1.5 (body:foo OR body:bar) means matches in the title carry heavier weight)
  • etc.

Type in any of those to experiment with how they affect the search results. The sample discusses an elegant way to achieve keyword matching.
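If you would rather assemble such queries in code than type them by hand, small string helpers are enough to reproduce the syntax shown above. The sketch below is plain string building and assumes nothing about Sitefinity; the class and method names (`LuceneQueryText`, `Phrase`, `Field`, `And`, `Or`, `Near`, `Boost`) are made up for this example.

```csharp
using System;
using System.Globalization;

public static class LuceneQueryText
{
    // Wraps a value in quotes; inside a quoted phrase only \ and " are escaped here.
    public static string Phrase(string value)
    {
        return "\"" + value.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
    }

    // Fielded match, e.g. title:"foo bar"
    public static string Field(string field, string value)
    {
        return field + ":" + Phrase(value);
    }

    public static string And(string left, string right)
    {
        return "(" + left + " AND " + right + ")";
    }

    public static string Or(string left, string right)
    {
        return "(" + left + " OR " + right + ")";
    }

    // Proximity match, e.g. "foo bar"~4
    public static string Near(string words, int distance)
    {
        return Phrase(words) + "~" + distance.ToString(CultureInfo.InvariantCulture);
    }

    // Appends a boost factor to an existing clause, e.g. (...)^1.5
    public static string Boost(string clause, double factor)
    {
        return clause + "^" + factor.ToString(CultureInfo.InvariantCulture);
    }
}
```

For example, `LuceneQueryText.And(LuceneQueryText.Field("title", "Sitefinity"), LuceneQueryText.Field("content", "Party"))` produces `(title:"Sitefinity" AND content:"Party")`, and `LuceneQueryText.Near("foo bar", 4)` produces `"foo bar"~4`.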

4)      The search input. You can manipulate the query that is passed to the results page and ultimately to Lucene, and by doing this you can add any type of functionality and logic behind search. In the code sample we explore this option.

5)      Using the search service independently. The search service exposes public APIs which you can use to search for items in your own custom development. A great example can be seen in this Autocomplete blog post.

Code Sample Walkthrough

In this code sample we set up search by category. Here are the steps involved in the initial preparation:

  • Set up search to index the categories field. Reindex afterwards to have it index all content.
  • Add the project attached to this blog post to your solution and add reference SitefinityWebApp to it. Fix the assembly references if you are running a version different than Sitefinity 5.2
  • Register both the new search box and search results widget in the toolbox using Sitefinity Thunder
  • Add a virtual path for our search box widget
  • Just for kicks, make the search results widget also highlight the category field by setting Title,Content,Categories as the search fields in the Advanced Options.
  • Now kindly ask your users to type in AND Categories:Community every time they search.

Just kidding. This would actually work, because of the way you can query Lucene, but instead we will quickly build a great-looking UI that lets our customers choose a category and submit the search. Here is what happens in the code sample:

In the project we have added the following tweaks to the search box:
  • A ComboBox displayed on the template (it might as well be any other control).
  • The ComboBox is data bound to the categories in InitializeControls.
  • The JavaScript takes the selected category and sends it as an additional query parameter to the search results page.
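The sample does this last step in JavaScript. To keep to one language in this post, here is the same URL shape sketched in C#. The parameter names `indexCatalogue`, `searchQuery` and `category`, as well as the page path used in the example, are assumptions made for this sketch; check the sample project for the names it actually uses.

```csharp
using System;

public static class SearchUrlSketch
{
    // Builds the results-page URL; the parameter names are assumptions for this sketch.
    public static string Build(string resultsPageUrl, string catalogue, string query, string category)
    {
        string url = resultsPageUrl
            + "?indexCatalogue=" + Uri.EscapeDataString(catalogue)
            + "&searchQuery=" + Uri.EscapeDataString(query);

        // The category is optional: without it the URL is an ordinary search.
        if (!string.IsNullOrEmpty(category))
        {
            url += "&category=" + Uri.EscapeDataString(category);
        }

        return url;
    }
}
```

Calling `SearchUrlSketch.Build("/search-results", "site-search", "new york", "Community")` gives `/search-results?indexCatalogue=site-search&searchQuery=new%20york&category=Community`. Because the value is URL-encoded on the way out, the results widget has to decode it before using it.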

We have also added logic for modifying the search results.

This is actually the fun part: we hook directly into Lucene, specifically at the place where the query is built. By inheriting from the SearchResults widget you can override the BuildSearchQuery method. Here the code sample provides a handy extension method that generates the search query based on terms, AND/OR conditions and so on, and it is configured explicitly to support categorized search as well. Categorized search is only one of many things you can do with this mechanism to add extra logic to your search query.

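    // Ask the search service for a query builder.
    var queryBuilder = service.CreateQueryBuilder();

    if (category != null)
    {
        // Open a group combined with AND; it is closed after the category term is added.
        queryBuilder.CreateGroup(QueryOperator.And);
    }

    ...

    if (category != null)
    {
        // Decode the category value that was passed in the URL.
        queryBuilder.AddTerm("Categories", HttpUtility.UrlDecode(category));
        queryBuilder.CloseGroup(); // close the category group
    }

    var compiledQuery = queryBuilder.GetQuery();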
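
In plain Lucene query text, the effect of the category group is roughly the same as appending AND Categories:... to whatever the user typed. The helper below is a minimal sketch of that idea using string concatenation only. It is not the output of the query builder above, and `CategoryFilterSketch` is a name invented for this example.

```csharp
using System;

public static class CategoryFilterSketch
{
    // Returns the user's query unchanged when no category is selected,
    // otherwise requires the Categories field to match as well.
    public static string Apply(string userQuery, string category)
    {
        if (string.IsNullOrEmpty(category))
        {
            return userQuery;
        }

        // Escape the two characters that would break out of a quoted phrase.
        string escaped = category.Replace("\\", "\\\\").Replace("\"", "\\\"");
        return "(" + userQuery + ") AND Categories:\"" + escaped + "\"";
    }
}
```

For instance, `CategoryFilterSketch.Apply("party", "Community")` yields `(party) AND Categories:"Community"`, which is the query users would otherwise have to type themselves.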


You can download the code sample below:

Search By Category code sample

Happy Searching!


The Progress Team
