Building a Custom Sitecore RenderField Processor to Validate Internal Links in RTE Fields

Broken internal links are one of those problems that quietly pile up in a Sitecore solution. Editors paste content, pages get renamed or moved, and suddenly your rich text fields are full of links that no longer go anywhere.

Sitecore doesn’t validate internal links at render time by default. If a link points to a deleted or unpublished item, it still renders. That’s bad for user experience, SEO, and confidence in the CMS.

In this post, I’ll walk through how to build a custom RenderField processor that validates internal links inside Rich Text Editor (RTE) fields and prevents broken links from rendering.

The Problem

Most Sitecore solutions rely heavily on RTE fields for flexible content. Editors can add links using the internal link picker, which stores links like this:

<a href="~/link.aspx?_id=GUID&amp;_z=z">Some link</a>

At render time, Sitecore resolves this to a friendly URL. But if the target item is:

  • Deleted
  • Not published
  • Not accessible in the current language

the link still renders, often pointing to a 404.

You usually don’t notice until users report it.

Why Use a RenderField Processor?

Sitecore’s RenderField pipeline runs every time a field is rendered. That makes it a perfect place to inspect and modify RTE output before it hits the page.

With a custom processor, you can:

  • Detect internal links in RTE fields
  • Resolve the target item
  • Validate publishing and language
  • Remove or disable invalid links
  • Optionally log or flag the issue

All without changing templates or editor behavior.

High-Level Approach

The solution follows these steps:

  1. Run only for RTE fields
  2. Scan the field's HTML value
  3. Find internal Sitecore links
  4. Resolve the target item
  5. Validate the item
  6. Modify the output if the link is invalid
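The steps above can be sketched in plain .NET, without any Sitecore types. This is a minimal sketch rather than the processor itself: `LinkValidationSketch`, `ReplaceBrokenLinks`, the `isValid` callback (standing in for the item lookup and validation in steps 4 and 5) and the `fallbackHref` parameter are all hypothetical names. It also assumes the ampersand in the link may appear either as `&amp;` or as a bare `&`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkValidationSketch
{
    // Matches a Sitecore dynamic link and captures the 32-digit item ID.
    // Assumption: the ampersand may be HTML-encoded or bare.
    private static readonly Regex InternalLink = new Regex(
        @"~/link\.aspx\?_id=([A-Fa-f0-9]{32})&(?:amp;)?_z=z",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // isValid is a hypothetical callback standing in for the Sitecore item lookup.
    public static string ReplaceBrokenLinks(string html, Func<Guid, bool> isValid, string fallbackHref)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        return InternalLink.Replace(html, match =>
        {
            // "N" is the 32-digit format without braces or hyphens.
            var id = Guid.ParseExact(match.Groups[1].Value, "N");

            // Keep valid links untouched; swap broken ones for the fallback.
            return isValid(id) ? match.Value : fallbackHref;
        });
    }
}
```

Passing the validation in as a callback keeps the string handling testable on its own; inside the processor the callback would wrap the database lookup.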

Creating the Custom Processor

First, create the processor class:

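using System;
using System.Text.RegularExpressions;
using Sitecore.Data;
using Sitecore.Diagnostics;
using Sitecore.Pipelines.RenderField;

/// <summary>
/// Custom Sitecore RenderField processor that validates internal links in RTE fields.
/// Broken internal links are rewritten to point at a 404 page item.
/// </summary>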
public class RTEContentResolver
{
    private const string InternalLinkPattern = @"~/link\.aspx\?_id=([A-Fa-f0-9]{32})&amp;_z=z";
    private const string NotFoundPageId = "{D7AEB6F8-A175-4559-ADA1-462E9EEEA3E2}";

    public virtual void Process(RenderFieldArgs args)
    {
        Assert.ArgumentNotNull(args, nameof(args));

        try
        {
            if (args.Item == null || args.Result == null)
            {
                return;
            }

            if (!string.Equals(args.FieldTypeKey, "rich text", StringComparison.OrdinalIgnoreCase))
            {
                return;
            }

            var fieldValue = args.FieldValue;
            if (string.IsNullOrWhiteSpace(fieldValue))
            {
                return;
            }

            var resolvedContent = ResolveRTELinks(fieldValue);

            // Update the result if links were modified
            if (!fieldValue.Equals(resolvedContent))
            {
                args.Result.FirstPart = resolvedContent;
            }
        }
        catch (Exception ex)
        {
            Log.Error($"Error in RTEContentResolver: {ex.Message}", ex, this);
        }
    }

    /// <summary>
    /// Resolves and validates all internal links in RTE content.
    /// </summary>
    /// <param name="rteContent">The RTE field content</param>
    /// <returns>The RTE content with validated links</returns>
    private string ResolveRTELinks(string rteContent)
    {
        if (string.IsNullOrWhiteSpace(rteContent))
        {
            return rteContent;
        }

        // The static Regex.Matches overload uses .NET's regex cache, so the pattern is not re-parsed on every render
        var matches = Regex.Matches(rteContent, InternalLinkPattern, RegexOptions.IgnoreCase);
        var result = rteContent;

        foreach (Match match in matches)
        {
            var itemIdString = match.Groups[1].Value;

            if (TryParseItemId(itemIdString, out var itemId))
            {
                // Check if item exists in web database and is published
                if (!IsItemPublished(itemId))
                {
                    var notFoundItemId = new ID(NotFoundPageId);
                    var originalLink = match.Value;
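                    // Dynamic links use the 32-character short ID (no braces or hyphens), so format it explicitly
                    var replacementLink = $"~/link.aspx?_id={notFoundItemId.ToShortID()}&amp;_z=z";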
                    result = result.Replace(originalLink, replacementLink);
                    Log.Warn($"Broken or unpublished link found in RTE content. ItemId: {itemId} - Redirecting to 404 page.", this);
                }
            }
        }

        return result;
    }

    /// <summary>
    /// Attempts to parse a string as a valid Sitecore Item ID.
    /// </summary>
    private bool TryParseItemId(string itemIdString, out ID itemId)
    {
        itemId = null;

        if (string.IsNullOrWhiteSpace(itemIdString))
        {
            return false;
        }

        try
        {
            itemId = new ID(itemIdString);
            return !itemId.IsNull;
        }
        catch
        {
            return false;
        }
    }

    /// <summary>
    /// Checks if an item exists and is published in the web database.
    /// </summary>
    private bool IsItemPublished(ID itemId)
    {
        if (itemId.IsNull)
        {
            return false;
        }

        var webDatabase = Sitecore.Configuration.Factory.GetDatabase("web");
        if (webDatabase == null)
        {
            return false;
        }

        var item = webDatabase.GetItem(itemId);
        if (item == null)
        {
            return false;
        }

        // GetItem uses the context language, so a count of zero means the item
        // has no published version in the current language
        return item.Versions.Count > 0;
    }
}

This example keeps the anchor in place and repoints invalid links at a designated 404 page. You could also:

  • Remove the link entirely but keep the inner text
  • Replace with a span
  • Add a CSS class
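As one possible variation on the options above, the whole anchor can be swapped for a span that keeps the inner text and carries a CSS class. The sketch below is plain .NET and makes several assumptions: `BrokenLinkMarkup`, `UnwrapBrokenLinks`, the `isValid` callback and the `broken-link` class name are hypothetical, the href is assumed to be double-quoted, and a regex is used only because RTE anchors are simple. A real HTML parser would be more robust.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BrokenLinkMarkup
{
    // Matches a whole anchor whose href is a dynamic link.
    // Group 1 = 32-digit item ID, group 2 = inner HTML of the anchor.
    private static readonly Regex InternalAnchor = new Regex(
        @"<a\b[^>]*href=""~/link\.aspx\?_id=([A-Fa-f0-9]{32})&(?:amp;)?_z=z""[^>]*>(.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.Compiled);

    // isValid is a hypothetical callback standing in for the Sitecore item lookup.
    public static string UnwrapBrokenLinks(string html, Func<Guid, bool> isValid)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        return InternalAnchor.Replace(html, match =>
        {
            var id = Guid.ParseExact(match.Groups[1].Value, "N");
            if (isValid(id))
            {
                return match.Value; // leave working links alone
            }

            // Keep the visible text, drop the link, and tag it for styling.
            return "<span class=\"broken-link\">" + match.Groups[2].Value + "</span>";
        });
    }
}
```

The class name gives front-end code or Experience Editor styles something to hook into if broken links should be highlighted for editors.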

Registering the Processor

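Next, patch the processor into the renderField pipeline. Placing it right after GetFieldValue means that, in the default pipeline order, it runs before Sitecore expands the ~/link.aspx references into friendly URLs, so the item IDs are still visible to the regex.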

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <renderField>
        <processor
          patch:after="processor[@type='Sitecore.Pipelines.RenderField.GetFieldValue, Sitecore.Kernel']"
          type="YourNamespace.RTEContentResolver, YourAssembly" />
      </renderField>
    </pipelines>
  </sitecore>
</configuration>

Place this in a patch file under App_Config/Include.

Things to Watch Out For

A few practical considerations:

  • Performance: Parsing HTML on every render has a cost. Keep the logic tight.
  • Experience Editor: You may want to skip validation in EE to avoid confusing editors.
  • Caching: Output caching helps reduce repeated processing.
  • Multisite setups: Validate against the correct site and database.
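One way to keep the per-render cost down, alongside output caching, is to remember recent validation results so the same item ID is not looked up on every render. The following is a minimal sketch in plain .NET, not part of the processor above: `LinkValidityCache`, its `ttl` and `clock` parameters and the `lookup` callback are hypothetical names. Results can be stale until the time-to-live expires, so a real solution would probably also need to clear the cache after a publish.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class LinkValidityCache
{
    // Maps item ID -> (validation result, time the result was stored).
    private readonly ConcurrentDictionary<Guid, Tuple<bool, DateTime>> _entries =
        new ConcurrentDictionary<Guid, Tuple<bool, DateTime>>();

    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _clock;

    // clock is injectable so expiry can be exercised in tests; defaults to UTC now.
    public LinkValidityCache(TimeSpan ttl, Func<DateTime> clock = null)
    {
        _ttl = ttl;
        _clock = clock ?? (() => DateTime.UtcNow);
    }

    // lookup is a hypothetical callback standing in for the database check.
    public bool IsValid(Guid id, Func<Guid, bool> lookup)
    {
        var now = _clock();

        Tuple<bool, DateTime> entry;
        if (_entries.TryGetValue(id, out entry) && now - entry.Item2 < _ttl)
        {
            return entry.Item1; // still fresh, skip the lookup
        }

        var valid = lookup(id);
        _entries[id] = Tuple.Create(valid, now);
        return valid;
    }
}
```

A short time-to-live trades a small window of stale results for far fewer lookups on pages with many links.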

Optional Enhancements

Once the basics work, you can extend this approach:

  • Log broken links for reporting
  • Show warnings in Experience Editor
  • Validate media links as well
  • Add feature flags to enable per site

Happy coding 🙂

Ramiro Batallas

Principal Backend Engineer at Oshyn Inc. (https://www.oshyn.com/). I have over 15 years of experience as a .NET software developer, implementing applications with MVC, SQL, Sitecore, and Episerver, and using methodologies like UML, CMMI, and Scrum. As a team player, I can be described as a self-motivator with excellent analytical, communication, problem-solving, and decision-making skills.