The out-of-the-box Sitefinity CMS search indexing is based on Lucene.NET. Lucene uses a combination of the Vector Space Model (VSM) and the Boolean model of Information Retrieval to determine how relevant a document is to a user's query. It assigns a default score between 0 and 1 to every search result, depending on multiple factors related to document relevancy. The score is calculated dynamically for each search, meaning that the same document can have different scores for different searches. This is due to the Lucene score normalization algorithms.
Sitefinity CMS exposes a mechanism for influencing the Lucene search results by choosing the algorithm used to calculate the search score and by boosting selected documents. This article explains how you can customize the Lucene scoring in Sitefinity CMS.
A common use case is boosting recently modified documents so that they appear as more relevant search results. This article uses that scenario to demonstrate how to customize the default scoring mechanism. When you customize the Lucene scoring mechanism, the Sitefinity CMS API exposes the default Lucene score and all the document info, so you can design multiple approaches to boosting the score:
finalScore = defaultScore * (1/contentAge)
A multiplier function produces a value by which the default Lucene score is multiplied. To boost documents based on how recent they are, content age is the most suitable value to consider. Content age represents the difference between now and the time the document was last modified. One disadvantage of this approach is that the multiplier is a simple reciprocal of the content age and is undefined when contentAge is 0. Another possible problem is that the multiplier can grow very large for small content ages, making the default score irrelevant.
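The two drawbacks above can be checked numerically. The following is a minimal, language-neutral sketch in Python, not Sitefinity CMS or Lucene.NET code; the function name `reciprocal_boost` and the use of days as the age unit are assumptions made for illustration.

```python
def reciprocal_boost(default_score: float, content_age_days: float) -> float:
    """Apply finalScore = defaultScore * (1 / contentAge)."""
    return default_score * (1.0 / content_age_days)


# A one-day-old document keeps its score; a ten-day-old one drops to a tenth of it.
fresh = reciprocal_boost(0.5, 1)
older = reciprocal_boost(0.5, 10)

# For a very small age the multiplier dwarfs the default score.
huge = reciprocal_boost(0.5, 0.001)

# An age of exactly 0 cannot be scored at all with this formula.
try:
    reciprocal_boost(0.5, 0)
    zero_age_ok = True
except ZeroDivisionError:
    zero_age_ok = False
```

Here `huge` ends up in the hundreds even though the default score is only 0.5, and `zero_age_ok` is `False` because the division fails.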
finalScore = defaultScore * (1/(constant + contentAge))
An alternative approach is adding a constant to the formula, where the constant can be any positive number, depending on how much you want to boost new results. For example, 2 does the job relatively well.
With a constant added, the boosting function keeps the same simple shape, but it is defined when contentAge is 0 and its maximum is capped at 1/constant, while recent items are still boosted more than older ones.
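As a rough illustration of the formula with a constant, here is a small Python sketch. It is language-neutral arithmetic rather than Sitefinity CMS code; the name `damped_boost` and the age unit of days are assumptions, and the default of 2 is the example value mentioned above.

```python
def damped_boost(default_score: float, content_age_days: float,
                 constant: float = 2.0) -> float:
    """Apply finalScore = defaultScore * (1 / (constant + contentAge))."""
    return default_score * (1.0 / (constant + content_age_days))


# An age of 0 is now valid: the multiplier is at its maximum, 1 / constant.
today = damped_boost(0.8, 0)

# Older content still receives a smaller multiplier.
six_days = damped_boost(0.8, 6)
```

With a constant of 2, a document modified just now keeps half of its default score, and the multiplier can never exceed 0.5.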
finalScore = defaultScore * ((boostFactor / (maxRampFactor + days)) ^ (1 / curveAdjustmentFactor))
To address the limited flexibility of the previous functions, you can introduce several constants, such as boostFactor, maxRampFactor, and curveAdjustmentFactor, which give you finer control over the shape of the boosting curve. For example, a function that does the job well is:
finalScore = defaultScore * ((100 / (5 + days)) ^ (1 / 5))
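To get a feel for the example constants, the multiplier can be evaluated for a few ages. The sketch below is plain Python arithmetic, not Sitefinity CMS code; the function name `curve_boost` and the sampled ages are assumptions chosen for illustration.

```python
def curve_boost(days: float,
                boost_factor: float = 100.0,
                max_ramp_factor: float = 5.0,
                curve_adjustment_factor: float = 5.0) -> float:
    """Return (boostFactor / (maxRampFactor + days)) ^ (1 / curveAdjustmentFactor)."""
    return (boost_factor / (max_ramp_factor + days)) ** (1.0 / curve_adjustment_factor)


# Print the multiplier for content modified today, a week, a month,
# 95 days, and a year ago.
for days in (0, 7, 30, 95, 365):
    print(f"{days:>3} days -> multiplier {curve_boost(days):.3f}")
```

With these constants the multiplier is exactly 1 at 95 days, because 100 / (5 + 95) = 1. Newer content is boosted above its default score, up to about 1.82 for content modified today, and older content is gradually pushed below it.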
To better understand how to fine-tune these constants to fit your preferences, refer to the following diagram visualizing the boosting formula:
To implement the custom Lucene scoring, you need to plug in to the Sitefinity CMS LuceneSearchService and replace the default scoring algorithm with a custom one that inherits from the Lucene CustomScoreQuery class.
To create a custom score query, you must start by adding a new class which inherits from the Lucene CustomScoreProvider. This provider is responsible for the search score logic. Inside the new class you must override the CustomScore method. This method gives you access to the Lucene document and the default score, which you can obtain by making a call to the base class method. From the document object you can extract the LastModified field value and use it to determine the document age in days. Now that you have access to the content age and default score, you can implement your desired custom scoring logic. For example, to implement an exponential boosting function with several constants, as described earlier in this article, you can add a method in your custom provider called CalculateBoost. You can call this method from the CustomScore method and pass the calculated content age as a parameter. Inside CalculateBoost you can calculate a boost value based on the additional constants you define and the content age input. Finally, you can return the calculated boost value, and use it inside the CustomScore method to adjust the default score (adjustedScore = baseScore * boost).
adjustedScore = baseScore * boost
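The scoring logic described above can be sketched independently of Lucene. The following Python outline is a language-neutral illustration only and does not use the Lucene.NET or Sitefinity CMS APIs. The function names, the clamping of future dates to an age of 0, and the constants (taken from the earlier example formula) are assumptions.

```python
from datetime import datetime, timedelta, timezone


def content_age_days(last_modified: datetime, now: datetime) -> float:
    """Age of the document in days; future dates are clamped to 0 (assumption)."""
    age = (now - last_modified).total_seconds() / 86400.0
    return max(age, 0.0)


def calculate_boost(days: float,
                    boost_factor: float = 100.0,
                    max_ramp_factor: float = 5.0,
                    curve_adjustment_factor: float = 5.0) -> float:
    """Boost value derived from the content age and the tuning constants."""
    return (boost_factor / (max_ramp_factor + days)) ** (1.0 / curve_adjustment_factor)


def custom_score(base_score: float, last_modified: datetime, now: datetime) -> float:
    """adjustedScore = baseScore * boost."""
    days = content_age_days(last_modified, now)
    return base_score * calculate_boost(days)


now = datetime(2023, 6, 1, tzinfo=timezone.utc)
recent = custom_score(0.5, now - timedelta(days=1), now)
stale = custom_score(0.5, now - timedelta(days=365), now)
```

Two documents with the same base score of 0.5 end up ranked by recency: `recent` is larger than `stale`.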
Once you have completed implementing the custom score provider, you must add a new class and inherit from the Lucene CustomScoreQuery class. Inside this class you must override the GetCustomScoreProvider method, which instructs Lucene which provider to use when determining the search score. In the overridden GetCustomScoreProvider method you must return your custom score provider. The following code sample demonstrates the full implementation:
To configure Sitefinity CMS to use your custom score logic, you must create a custom LuceneSearchService, where you will return the custom score query instead of the default one.
You must start by adding a new class which inherits from the Sitefinity CMS LuceneSearchService class. Inside the new class, override the BuildLuceneQuery method. In your implementation of the BuildLuceneQuery method you must get an instance of the Lucene QueryParser, and parse the compiled query, which comes as a method argument. Then you must instantiate your custom score query class and pass the parsed query as an argument. Finally, return the object that is constructed by your custom score query class from the BuildLuceneQuery method. This way the parsed query will go through your custom logic and will be passed back to the Sitefinity CMS default code flow. The following sample demonstrates implementing a custom LuceneSearchService to achieve this functionality:
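As a language-neutral illustration of how the three pieces relate (provider, score query, and search service), here is a small Python model of the wiring. It is not Sitefinity CMS or Lucene.NET code: every class and method name is a hypothetical stand-in, the "parser" is a trivial string normalizer, and a document is modeled as a dictionary with an assumed `age_days` key.

```python
class BaseScoreProvider:
    """Stand-in for a default provider: returns the score unchanged."""

    def custom_score(self, doc: dict, sub_query_score: float) -> float:
        return sub_query_score


class RecencyScoreProvider(BaseScoreProvider):
    """Custom provider: multiplies the base score by a recency boost."""

    def custom_score(self, doc: dict, sub_query_score: float) -> float:
        base = super().custom_score(doc, sub_query_score)
        boost = (100.0 / (5.0 + doc["age_days"])) ** (1.0 / 5.0)
        return base * boost


class BaseScoreQuery:
    """Stand-in for a score query that wraps an already parsed query."""

    def __init__(self, sub_query: str):
        self.sub_query = sub_query

    def get_custom_score_provider(self) -> BaseScoreProvider:
        return BaseScoreProvider()

    def score(self, doc: dict, sub_query_score: float) -> float:
        return self.get_custom_score_provider().custom_score(doc, sub_query_score)


class RecencyScoreQuery(BaseScoreQuery):
    """Custom query: tells the engine which provider to use."""

    def get_custom_score_provider(self) -> BaseScoreProvider:
        return RecencyScoreProvider()


class BaseSearchService:
    """Stand-in for the default service: only parses the query text."""

    def build_query(self, text: str):
        return text.strip().lower()


class RecencySearchService(BaseSearchService):
    """Custom service: parses the query, then wraps it in the custom score query."""

    def build_query(self, text: str):
        parsed = super().build_query(text)
        return RecencyScoreQuery(parsed)


query = RecencySearchService().build_query("  Sitefinity Search ")
new_doc_score = query.score({"age_days": 0}, 0.5)
old_doc_score = query.score({"age_days": 365}, 0.5)
```

The design point is that the service only decides which query type is returned, and the query only decides which provider is used, so the scoring formula stays isolated in the provider.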
To complete the task, you must replace the default LuceneSearchService with your custom one. You can do this either through the Sitefinity CMS administrative backend or inside your website Global.asax class.
To replace the default LuceneSearchService with your custom one via configurations, follow these steps:
Alternatively, you can replace the default LuceneSearchService with your custom one through code via the Sitefinity CMS ServiceBus implementation. To do this, implement the following code inside your Global.asax:
NOTE: You can use the approach described in this article to boost your content search score based on any other field, using any custom algorithm. Just choose the formula that best represents the boost significance for your specific case and modify the default boost. You can also chain multiple boosting formulas.
Copyright © 2023 Progress Software Corporation and/or its subsidiaries or affiliates.
All Rights Reserved.