Index the contents of Excel files

To index the contents of .xlsx files, you must create a text extractor that reads the data from published .xlsx files and adds it to a search index.

Perform the following:

  1. Open your project in Visual Studio.
  2. Open the Package Manager Console and install the following NuGet package from the Sitefinity CMS NuGet repository:
    Install-Package Telerik.Windows.Documents.Spreadsheet -Version 2015.1.401.40
  3. Build the project and open it.
  4. Add the XlsxTextExtractor class to your project.
  5. Implement the class by following these steps:
    1. Inherit from ITextExtractor.
    2. Define the private members.
    3. Get the mime type.
    4. Implement the GetText method of the XlsxTextExtractor class.
    5. Implement the private method ExecuteDocumentAction.

      SAMPLE: For more information, see the XlsxTextExtractor.cs class in the Sitefinity CMS documentation-samples GitHub repository.

  6. In the Sitefinity CMS backend, navigate to Administration » Settings » Advanced » DocumentService » ExtractorSettings » Create new.
  7. In the MimeType field, enter application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
  8. In the ExtractorType field, enter SitefinityWebApp.XlsxTextExtractor, SitefinityWebApp
  9. Save your changes and restart the application.
  10. Create a search index for documents and reindex it.
    For more information, see Administration: Create search indexes.
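
The class outlined in step 5 might look roughly like the following. This is a minimal sketch under stated assumptions, not the sample from the GitHub repository. The members of Sitefinity's ITextExtractor are not listed in this article, so the sketch declares a hypothetical stand-in interface, ITextExtractorSketch. Replace it with the real ITextExtractor and adjust the member names and signatures to match. The sketch also assumes that XlsxFormatProvider is available through the package installed in step 2, and that ExecuteDocumentAction takes a stream and a callback.

```csharp
using System;
using System.IO;
using System.Text;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders.OpenXml.Xlsx;
using Telerik.Windows.Documents.Spreadsheet.Model;

namespace SitefinityWebApp
{
    // Hypothetical stand-in for Sitefinity's ITextExtractor (assumption):
    // the real interface's members may differ.
    public interface ITextExtractorSketch
    {
        string MimeType { get; }
        string GetText(Stream documentStream);
    }

    // Step 1: inherit from the extractor interface.
    public class XlsxTextExtractor : ITextExtractorSketch
    {
        // Step 2: private members.
        private const string XlsxMimeType =
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";

        // Step 3: report the MIME type that this extractor handles.
        public string MimeType
        {
            get { return XlsxMimeType; }
        }

        // Step 4: collect the text of every non-empty cell, separated by spaces.
        public string GetText(Stream documentStream)
        {
            var text = new StringBuilder();
            ExecuteDocumentAction(documentStream, cellText => text.Append(cellText).Append(' '));
            return text.ToString().Trim();
        }

        // Step 5: import the workbook and invoke the action for each non-empty cell.
        private static void ExecuteDocumentAction(Stream documentStream, Action<string> action)
        {
            // Copy to memory first, in case the source stream is not seekable (assumption).
            using (var buffer = new MemoryStream())
            {
                documentStream.CopyTo(buffer);
                buffer.Position = 0;

                var provider = new XlsxFormatProvider();
                Workbook workbook = provider.Import(buffer);

                foreach (Worksheet worksheet in workbook.Worksheets)
                {
                    CellRange used = worksheet.UsedCellRange;
                    for (int row = used.FromIndex.RowIndex; row <= used.ToIndex.RowIndex; row++)
                    {
                        for (int column = used.FromIndex.ColumnIndex; column <= used.ToIndex.ColumnIndex; column++)
                        {
                            CellSelection cell = worksheet.Cells[row, column];
                            ICellValue value = cell.GetValue().Value;
                            CellValueFormat format = cell.GetFormat().Value;

                            // Formula cells contribute their computed result.
                            string cellText = value.GetResultValueAsString(format);
                            if (!string.IsNullOrWhiteSpace(cellText))
                            {
                                action(cellText);
                            }
                        }
                    }
                }
            }
        }
    }
}
```

The namespace and class name match the ExtractorType value entered in step 8, so if you rename either one, update that setting as well.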
