How to: Exclude a Page from Sitefinity Internal Search

June 20, 2013

As you probably know, Sitefinity CMS makes it easy to stop external search crawlers (such as Googlebot) from indexing a page: uncheck the page's "Allow search engines to index this page" property. However, the internal Sitefinity search engine will still index that page, and it will still appear in the search results on your website.

Use the steps below to gain more control over which pages Sitefinity indexes automatically.

1. In Visual Studio, create a class that inherits from the PageInboundPipe class in the Telerik.Sitefinity.Publishing.Pipes namespace. Override its LoadPageNodes and CanProcessItem methods:

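// Namespaces as found in a typical Sitefinity project; adjust if your version differs.
using System.Collections.Generic;
using System.Linq;
using Telerik.Sitefinity.Pages.Model;
using Telerik.Sitefinity.Publishing.Pipes;

public class PagePipeNoIndex : PageInboundPipe
{
    // Filter the loaded page nodes through the same check used for single items.
    protected override IEnumerable<PageNode> LoadPageNodes()
    {
        return base.LoadPageNodes().Where(n => this.CanProcessItem(n));
    }

    public override bool CanProcessItem(object item)
    {
        if (item == null)
            return false;

        var pageData = item as PageData;
        if (pageData != null)
        {
            // Skip backend pages and pages that are not marked as crawlable.
            if (pageData.NavigationNode.IsBackend || !pageData.Crawlable)
                return false;
        }

        var pageNode = item as PageNode;
        if (pageNode != null)
        {
            if (pageNode.IsBackend)
                return false;

            // Only standard and external pages are indexed, and only when crawlable.
            bool indexableType = pageNode.NodeType == NodeType.Standard
                || pageNode.NodeType == NodeType.External;
            if (!indexableType || !pageNode.Page.Crawlable)
                return false;
        }

        // Anything that passed the checks above is left to the default logic.
        return base.CanProcessItem(item);
    }
}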

These overrides are invoked every time Sitefinity needs to update its page search index (e.g. when a new page is created or an existing page is updated). They check the value of the Crawlable property, which corresponds to the state of the "Allow search engines to index this page" checkbox, and do not add the page to the index if the checkbox is unchecked.
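
To see the filtering rule in isolation, here is a minimal sketch that runs without Sitefinity. The FakePage class and the IndexFilterSketch helper are hypothetical stand-ins invented for this illustration; they are not Sitefinity types, and they only mirror the backend and Crawlable checks from the pipe above (the node type check is left out for brevity).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Sitefinity page; NOT the real PageNode or PageData.
public class FakePage
{
    public string Title { get; set; }
    public bool IsBackend { get; set; }
    public bool Crawlable { get; set; }
}

public static class IndexFilterSketch
{
    // Mirrors the checks in CanProcessItem: reject nulls, backend pages,
    // and pages whose Crawlable flag is switched off.
    public static bool CanIndex(FakePage page)
    {
        if (page == null)
            return false;

        if (page.IsBackend)
            return false;

        return page.Crawlable;
    }

    // Mirrors LoadPageNodes: pass the full set through the same predicate.
    public static IEnumerable<FakePage> SelectIndexable(IEnumerable<FakePage> pages)
    {
        return pages.Where(CanIndex);
    }
}
```

Given a frontend page with Crawlable set to true, a frontend page with Crawlable set to false, and a backend page, SelectIndexable returns only the first one. That is the same outcome the custom pipe aims for in the internal search index.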

2. Replace the built-in page pipe with the custom pipe from above. This is done in the Global.asax.cs file as follows:

// Namespaces as found in a typical Sitefinity project; adjust if your version differs.
using System;
using Telerik.Sitefinity.Abstractions;
using Telerik.Sitefinity.Publishing;
using Telerik.Sitefinity.Publishing.Pipes;

public class Global : System.Web.HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        // Wait for Sitefinity to finish starting before touching the pipe registry.
        Bootstrapper.Initialized += Bootstrapper_Initialized;
    }

    void Bootstrapper_Initialized(object sender, Telerik.Sitefinity.Data.ExecutedEventArgs e)
    {
        if (e.CommandName == "Bootstrapped")
        {
            ReplacePagePipeWithCustomPagePipe();
        }
    }

    private void ReplacePagePipeWithCustomPagePipe()
    {
        // Remove the default page pipe.
        PublishingSystemFactory.UnregisterPipe(PageInboundPipe.PipeName);

        // Register PagePipeNoIndex under the original page pipe name, so that
        // when the publishing system tries to use the page pipe it gets the new one.
        PublishingSystemFactory.RegisterPipe(PageInboundPipe.PipeName, typeof(PagePipeNoIndex));
    }
...
}

That's it. Build the project, and from now on, if you uncheck the "Allow search engines to index this page" checkbox, the page will be hidden from both external search crawlers and the internal Sitefinity search.

To learn more about the Publishing system in Sitefinity CMS please check this blog post or the online documentation.

Veselin Vasilev
