Advanced Tagging with Sitecore Cortex and TextAnalytics

Cortex is one of the most anticipated new features of Sitecore 9.1. It brings the power of Artificial Intelligence to our beloved platform. For developers, Cortex can be seen as a set of ready-made Sitecore patterns and structures for integrating with external AI services in different scenarios.

Automatic AI-based Content Tagging is one of the things that can be done with Cortex. You can check the official documentation on how it works for content editors; it's pretty straightforward. When the process is finished, your items will have the "Semantics" field populated with automatically generated tags. Pretty neat!

[Image: Text Analytics Tags]

These tags can be used for further rule-based personalization, or to feed and train an external AI API and achieve more sophisticated things, such as AI-based recommendations.

The native implementation uses the Open Calais API as its AI service. The same service is used by SXA in its multi-site tagging mechanism.

Using Text Analytics with Cortex for advanced Content Tagging

We introduce the new Sitecore ContentTagging Text Analytics module for advanced AI-based Content Tagging with Cortex.

The module not only replaces the original Open Calais endpoint with Microsoft Text Analytics, but also adds some important features:

  1. Monitored fields - Ability to set up specific fields as the source of content used by Text Analytics for tagging, permitting the exclusion of content that might not be relevant for the AI;

  2. Skip unchanged items - Only items with changes in the monitored fields will be processed, speeding up the mass-tagging process and saving requests to the Web API;

  3. Selective tagging - You can specify which templates will be processed by making them inherit from our base template _TextAnalyticsEntity;

  4. Advanced tagging - You can specify a pattern for the tag naming, using the placeholder words {key} and {value}.
    E.g.: The default configuration "{key} - {value}" will create items with the following names:
    [Image: Text Analytics Tags]

  5. Entity Detection - The tags are created based on detected entities, returned by the Entity Recognition method. The full entity information is also stored in a Name-Value List field named "Entities".
    These values can be used to quickly retrieve data to feed and train external AI APIs.
    [Image: Text Analytics Entities]

  6. Sentiment Analysis - Your content will be tagged according to the sentiment expressed, on a scale that goes from 0 (negative) to 1 (positive). Check this post to understand how it works.
    This information is available as the numeric value provided by Text Analytics, and as a Sentiment Range (Negative, Neutral or Positive) that can also be customized.
    [Image: Text Analytics Sentiments]

  7. Turn endpoints on and off - Sentiment Analysis and Entity Detection can be individually enabled or disabled.
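
To make the tag name pattern and the Sentiment Range idea more concrete, here is a minimal Python sketch. It is not the module's code (the module itself runs inside Sitecore); it only illustrates the behaviour described above. The function names, the sample entity and the 0.4/0.6 thresholds are assumptions made for illustration, and the sketch assumes that {key} holds the entity type and {value} the entity name.

```python
import re
from dataclasses import dataclass


def format_tag_name(pattern: str, key: str, value: str) -> str:
    """Replace the {key} and {value} placeholders in a tag name pattern."""
    values = {"key": key, "value": value}
    # Single pass, so text inserted for one placeholder is never re-scanned.
    return re.sub(r"\{(key|value)\}", lambda m: values[m.group(1)], pattern)


@dataclass(frozen=True)
class SentimentRange:
    name: str
    upper_bound: float  # inclusive upper limit of this range


# Hypothetical thresholds; the module lets you customize the ranges.
DEFAULT_RANGES = (
    SentimentRange("Negative", 0.4),
    SentimentRange("Neutral", 0.6),
    SentimentRange("Positive", 1.0),
)


def classify_sentiment(score: float, ranges=DEFAULT_RANGES) -> str:
    """Map a 0..1 sentiment score to the first range that contains it."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("sentiment score must be between 0 and 1")
    for sentiment_range in ranges:
        if score <= sentiment_range.upper_bound:
            return sentiment_range.name
    return ranges[-1].name
```

Under these assumptions, an entity of type "Location" named "Paris" combined with the default pattern gives the tag name "Location - Paris", and a score of 0.95 falls in the "Positive" range.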

What you will need

Before you start, make sure that you have:

  1. Sitecore 9.1
  2. A valid Cognitive Services account
  3. A subscription to the Text Analytics API
  4. The access key and endpoint for Text Analytics

Installation

The module installation is pretty straightforward:

  1. Download the module package here - make sure to get the newest version;
  2. Install the package using the Installation Wizard.

In scaled environments, you only need to install this module on your Content Management nodes.

Configuration

After installing this package, make sure to go through the following steps. 

  1. Open /App_Config/Include/Sitecore.ContentTagging.TextAnalytics.config
  2. Add your TextAnalytics subscription key to the setting Sitecore.ContentTagging.TextAnalytics.SubscriptionKey
  3. Add your TextAnalytics endpoint to the setting Sitecore.ContentTagging.TextAnalytics.EndPoint
  4. Adjust the setting Sitecore.ContentTagging.TextAnalytics.FieldsToProcess to reflect your field names (or leave it empty to use all fields)
  5. To have your items processed and tagged, activate their templates by making them inherit from /sitecore/templates/Modules/TextAnalyticsTagging/_Text Analytics Entity
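
If you want to sanity-check your key and endpoint before pasting them into the config, a small request built outside Sitecore can help. The Python sketch below only constructs a sentiment request in the shape used by the Text Analytics v2 REST API (a JSON list of documents plus the Ocp-Apim-Subscription-Key header). The function name is hypothetical, and it assumes your endpoint already includes the version path, such as ".../text/analytics/v2.0"; the exact endpoint format the module expects in its EndPoint setting may differ.

```python
import json
import urllib.request


def build_sentiment_request(endpoint: str, subscription_key: str, text: str,
                            language: str = "en") -> urllib.request.Request:
    """Build (but do not send) a POST request to the sentiment method."""
    # Assumption: `endpoint` ends with the API version, e.g. /text/analytics/v2.0
    url = endpoint.rstrip("/") + "/sentiment"
    body = {"documents": [{"id": "1", "language": language, "text": text}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request. A valid key and endpoint should return a JSON document carrying a score between 0 and 1 for each document, while a wrong key typically results in an HTTP 401 error.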

Settings

Here is a list of the available settings:

  1. Sitecore.ContentTagging.TextAnalytics.SubscriptionKey - Text Analytics subscription key
  2. Sitecore.ContentTagging.TextAnalytics.EndPoint - Text Analytics endpoint
  3. Sitecore.ContentTagging.TextAnalytics.DefaultConfigurationName - Configuration name for TextAnalytics
  4. Sitecore.ContentTagging.TextAnalytics.SentimentRangesRepository - Path or ID for the sentiments range repository
  5. Sitecore.ContentTagging.TextAnalytics.TagRepository - Path or ID for the tag repository (default is the path that Sitecore uses out of the box)
  6. Sitecore.ContentTagging.TextAnalytics.TagNamePattern - Naming pattern to use when creating tag items - {key} and {value} will be replaced by respective values
  7. Sitecore.ContentTagging.TextAnalytics.FieldsToProcess - Fields used as content to feed the TextAnalyzer - A pipe-delimited list of field names or IDs
    When empty, all fields are used, except internal fields (those starting with "__")
  8. Sitecore.ContentTagging.TextAnalytics.SentimentAnalysisEnabled - Enable/Disable Sentiment Detection
  9. Sitecore.ContentTagging.TextAnalytics.EntityAnalysisEnabled - Enable/Disable Entity Detection for Tags
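
The FieldsToProcess rule, together with the idea of skipping unchanged items, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the module's implementation: all function names are hypothetical, the sketch matches field names only (not IDs), it treats names as case-insensitive, and the hash-based fingerprint is just one possible way to detect that monitored content has changed.

```python
import hashlib


def parse_fields_to_process(setting: str) -> list[str]:
    """Split the pipe-delimited setting into field names, dropping blanks."""
    return [name.strip() for name in setting.split("|") if name.strip()]


def select_content(fields: dict[str, str], setting: str) -> dict[str, str]:
    """Pick the fields whose content would be sent for analysis."""
    wanted = {name.lower() for name in parse_fields_to_process(setting)}
    if not wanted:
        # Empty setting: use every field except internal ones ("__" prefix).
        return {n: v for n, v in fields.items() if not n.startswith("__")}
    # Assumption: field names are compared case-insensitively.
    return {n: v for n, v in fields.items() if n.lower() in wanted}


def content_fingerprint(selected: dict[str, str]) -> str:
    """Hash the selected fields so unchanged content can be detected."""
    digest = hashlib.sha256()
    for name in sorted(selected):
        digest.update(name.encode("utf-8") + b"\0")
        digest.update(selected[name].encode("utf-8") + b"\0")
    return digest.hexdigest()
```

Comparing the fingerprint stored after the last tagging run with the current one tells you whether any monitored field changed. Edits to fields outside the list, or to internal fields when the list is empty, leave the fingerprint untouched, so the item could be skipped.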

Source code

You can find the source code here - feel free to clone and contribute!