Jeremy Davis
Sitecore, C# and web development
Article printed from: https://blog.jermdavis.dev/posts/2023/track-changes-rich-text

Tracking content changes for Rich Text

Some config defaults aren't right for every circumstance

Published 30 January 2023

A requirement which comes up every so often is that external systems need to know about changes to content that lives in Sitecore. As with most technical problems, there are a variety of ways that you can solve a problem like this, and they all have different pros and cons. One of my colleagues has been working on a project like this recently, and the approach required there meant we bumped up against an interesting configuration challenge. If you're writing code that monitors content changes you might need to think about this:

The challenge

If you find yourself needing to track changes in content from outside of your Sitecore instance, one of the options to look at is Sitecore's events database tables. Content changes that happen during editing and publishing get written into these, giving you a way to query them back from systems which don't natively know about Sitecore. Each event gets a row, and it includes a blob of JSON data to describe what changed.

For example, saving some changes to an item might produce a row like this:

[Image: A SQL query against the events table returning a row of change data]

And when you look at the data stored in the InstanceData field, the JSON looks like:

{
  "ItemId": "3df2d3e5-5cd0-45db-a8cb-79ed5ed25236",
  "ItemName": "Rich Example",
  "LanguageName": "en",
  "TemplateId": "3ef9d40b-e62f-4ac3-8f31-8f146346ec27",
  "VersionNumber": 1,
  "FieldChanges": [
    {
      "FieldId": "d9cf14b1-fa16-4ba6-9288-e8a174d4d522",
      "OriginalValue": "20221105T191213Z",
      "Value": "20221105T191406Z"
    },
    {
      "FieldId": "8cdc337e-a112-42fb-bbb4-4143751e123f",
      "OriginalValue": "f6380303-c646-4f12-8c01-7dfec3b58574",
      "Value": "542ad055-06f5-4968-a2a7-25692141c9c2"
    },
    {
      "FieldId": "88cfa053-b4de-45a1-b1f1-81bc712a184e",
      "OriginalValue": "A plain text value",
      "Value": "New plain text value!"
    },
    {
      "FieldId": "badd9cf9-53e0-4d0c-bcc0-2d784c282f6a",
      "OriginalValue": "sitecore\\Admin",
      "Value": "sitecore\\Admin"
    }
  ],
  "IsSharedFieldChanged": false,
  "IsUnversionedFieldChanged": false,
  "PropertyChanges": []
}


It's showing the changes recorded - or so you might assume...

But if you take a look at the schema for the example item that changed here, you'll note there is more than one content field, and I changed both of them to get the data above:

[Image: Content Editor showing two content fields]

When I match up the field IDs, the "Simple Copy" field is present there along with some "when was the item changed" fields, but the "Fancy Copy" rich text field in the schema isn't, despite its data having changed too... So why is that?
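Matching field IDs by eye gets tedious, so here is a minimal Python sketch of the same check. It assumes you have already read the InstanceData JSON out of the events table into a string. The function name `missing_fields` is hypothetical, and pairing the "Simple Copy" and "Fancy Copy" labels with these IDs is inferred from the sample data in this post rather than read from the template.

```python
import json

# Field IDs from the InstanceData samples in this post.
# The labels are inferred from the sample values, not confirmed from the template.
SIMPLE_COPY_ID = "88cfa053-b4de-45a1-b1f1-81bc712a184e"
FANCY_COPY_ID = "11e266a0-4b59-49c6-be4d-10561f9a805e"

# Trimmed copy of the first InstanceData sample (before any config change).
instance_data = """
{
  "ItemId": "3df2d3e5-5cd0-45db-a8cb-79ed5ed25236",
  "ItemName": "Rich Example",
  "FieldChanges": [
    {
      "FieldId": "d9cf14b1-fa16-4ba6-9288-e8a174d4d522",
      "OriginalValue": "20221105T191213Z",
      "Value": "20221105T191406Z"
    },
    {
      "FieldId": "8cdc337e-a112-42fb-bbb4-4143751e123f",
      "OriginalValue": "f6380303-c646-4f12-8c01-7dfec3b58574",
      "Value": "542ad055-06f5-4968-a2a7-25692141c9c2"
    },
    {
      "FieldId": "88cfa053-b4de-45a1-b1f1-81bc712a184e",
      "OriginalValue": "A plain text value",
      "Value": "New plain text value!"
    }
  ]
}
"""


def missing_fields(instance_data_json, expected_ids):
    """Return the expected field IDs that have no entry in FieldChanges."""
    event = json.loads(instance_data_json)
    recorded = {change["FieldId"].lower() for change in event.get("FieldChanges", [])}
    return [field_id for field_id in expected_ids if field_id.lower() not in recorded]


missing = missing_fields(instance_data, [SIMPLE_COPY_ID, FANCY_COPY_ID])
print(missing)  # only the rich text field's ID is reported as missing
```

Run against the first sample, this reports only the rich text field as absent, which matches what the manual comparison shows.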

An explanation

Well, for performance reasons not all fields get written into the event tables by default. If you dig into the configuration of your site, in the sitecore.config file you'll find this:

<sitecore>
    ...snip...

    <!--  EVENT SETTINGS
    Here is a list of settings for different event types.
    -->
    <eventSettings>
        <!--  SAVED ITEM REMOTE SETTINGS
        The settings that control the item:saved:remote event.
        -->
        <savedItemRemoteSettings type="Sitecore.Events.Settings.SavedItemRemoteSettings, Sitecore.Kernel">
            <!--  EXCLUDE FIELD'S TYPES
             This setting allows you to specify which types of fields shouldn't be serialized when the item:saved:remote event
             is triggered and the EventQueue.SavedItemRemote.SerializeAllFields setting is set to 'true'.
            -->
            <exclude hint="list:ExcludeType">
                <Text>Rich Text</Text>
                <Text>Word Document</Text>
                <Text>html</Text>
            </exclude>
        </savedItemRemoteSettings>
    </eventSettings>

    ...snip...
</sitecore>


By default the eventing framework deliberately doesn't record the changes to some fields which might have particularly large values. It makes sense as a performance improvement, to avoid spamming your databases. But there are some circumstances where you might want these things recorded. And the "monitoring content changes externally" scenario is one of them. With some planning, it's easy enough to configure your system to clear out old rows from the events tables after a sensible period of time - so you can work to reduce the impact of tracking these things if you need to.

So if you do need to track these changes, you'll need a config patch to change this...

It seems like this should be easy - patching Sitecore config is usually simple. But the structure of this bit of config is trickier than many. A <Text/> element with no attributes gives a patch nothing obvious to match on, which makes it difficult to write a <patch:delete/> command. You can replace the <exclude/> parent to set its children, but that's a slightly inelegant solution which risks overwriting changes from other patches.

But Google solved this issue for me, by pointing me at a post from Jason St-Cyr from back in 2016. By using a patch:instead attribute whose expression matches the inner text of the element, you can substitute in a patch:delete:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <eventSettings>
      <savedItemRemoteSettings type="Sitecore.Events.Settings.SavedItemRemoteSettings, Sitecore.Kernel">
        <exclude hint="list:ExcludeType">
          <Text patch:instead="Text[.='Rich Text']">
            <patch:delete/>
          </Text>
        </exclude>
      </savedItemRemoteSettings>
    </eventSettings>
  </sitecore>
</configuration>
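To see why matching on inner text is enough to single out one of several identical <Text/> elements, here is a small Python sketch using the same `Text[.='Rich Text']` predicate against a copy of the default exclude list. This only imitates the effect of the patch with Python's standard `xml.etree.ElementTree` module (the `[.='text']` predicate needs Python 3.7 or later). It is not how Sitecore's patch engine is implemented, and `remove_by_inner_text` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Copy of the default exclude list from Sitecore's config, as shown earlier.
DEFAULT_EXCLUDE = """
<exclude hint="list:ExcludeType">
    <Text>Rich Text</Text>
    <Text>Word Document</Text>
    <Text>html</Text>
</exclude>
"""


def remove_by_inner_text(exclude, field_type):
    """Remove <Text> children whose text equals field_type; return what is left."""
    # The predicate [.='...'] selects elements by their complete text content.
    for match in exclude.findall(f"Text[.='{field_type}']"):
        exclude.remove(match)
    return [text.text for text in exclude.findall("Text")]


exclude = ET.fromstring(DEFAULT_EXCLUDE)
remaining = remove_by_inner_text(exclude, "Rich Text")
print(remaining)  # ['Word Document', 'html']
```

The other two excluded types are left alone, which is the advantage over replacing the whole <exclude/> parent.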


By applying that patch Sitecore will start to record the change data for rich text fields, and when you look at the InstanceData field in the events table you'll see those changes:

{
  "ItemId": "3df2d3e5-5cd0-45db-a8cb-79ed5ed25236",
  "ItemName": "Rich Example",
  "LanguageName": "en",
  "TemplateId": "3ef9d40b-e62f-4ac3-8f31-8f146346ec27",
  "VersionNumber": 1,
  "FieldChanges": [
    {
      "FieldId": "11e266a0-4b59-49c6-be4d-10561f9a805e",
      "OriginalValue": "Rich text value!",
      "Value": "New rich text value!"
    },
    {
      "FieldId": "d9cf14b1-fa16-4ba6-9288-e8a174d4d522",
      "OriginalValue": "20221105T191406Z",
      "Value": "20221105T215505Z"
    },
    {
      "FieldId": "8cdc337e-a112-42fb-bbb4-4143751e123f",
      "OriginalValue": "542ad055-06f5-4968-a2a7-25692141c9c2",
      "Value": "660383c0-7873-4f48-8297-5c52aabaddee"
    },
    {
      "FieldId": "88cfa053-b4de-45a1-b1f1-81bc712a184e",
      "OriginalValue": "Plain text value!",
      "Value": "New plain text value!"
    },
    {
      "FieldId": "badd9cf9-53e0-4d0c-bcc0-2d784c282f6a",
      "OriginalValue": "sitecore\\Admin",
      "Value": "sitecore\\Admin"
    }
  ],
  "IsSharedFieldChanged": false,
  "IsUnversionedFieldChanged": false,
  "PropertyChanges": []
}
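One detail in this data is worth handling if an external system consumes it: an entry in FieldChanges doesn't always mean the value differs (the last entry above has the same OriginalValue and Value). The Python sketch below filters down to entries whose values really changed. It assumes the InstanceData JSON is already available as a string, and `real_changes` is a hypothetical name.

```python
import json

# Two entries from the InstanceData sample above. A raw string is used so the
# escaped backslashes reach the JSON parser exactly as they appear in the data.
sample = r"""
{
  "FieldChanges": [
    {
      "FieldId": "11e266a0-4b59-49c6-be4d-10561f9a805e",
      "OriginalValue": "Rich text value!",
      "Value": "New rich text value!"
    },
    {
      "FieldId": "badd9cf9-53e0-4d0c-bcc0-2d784c282f6a",
      "OriginalValue": "sitecore\\Admin",
      "Value": "sitecore\\Admin"
    }
  ]
}
"""


def real_changes(instance_data_json):
    """Map field ID -> (old, new) for entries whose value actually differs."""
    event = json.loads(instance_data_json)
    return {
        change["FieldId"]: (change["OriginalValue"], change["Value"])
        for change in event.get("FieldChanges", [])
        if change["OriginalValue"] != change["Value"]
    }


changes = real_changes(sample)
for field_id, (old, new) in changes.items():
    print(f"{field_id}: {old!r} -> {new!r}")
```

Whether you want to drop the unchanged entries depends on what the external system does with them; this is just one way to reduce noise.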


So if you need it, there's your answer...

Footnote: This has been sitting in my post queue for a while now. Since I wrote it Sitecore 10.3 got released, including Webhook connectivity for editorial events. If you're considering code to track changes from outside, you might find that helps you if you can use the v10.3 release. Sitecore would argue we should avoid spelunking through their database tables for support reasons, so using the supported Webhook endpoints reduces the risk a bit. Though the config issue above likely still applies.
