Building a Custom Sitecore RenderField Processor to Validate Internal Links in RTE Fields
Broken internal links are one of those problems that quietly pile up in a Sitecore solution. Editors paste content, pages get renamed or moved, and suddenly your rich text fields are full of links that no longer go anywhere.
Sitecore doesn’t validate internal links at render time by default. If a link points to a deleted or unpublished item, it still renders. That’s bad for user experience, SEO, and confidence in the CMS.
In this post, I’ll walk through how to build a custom RenderField processor that validates internal links inside Rich Text Editor (RTE) fields and prevents broken links from rendering.
The Problem
Most Sitecore solutions rely heavily on RTE fields for flexible content. Editors can add links using the internal link picker, which stores links like this:
<a href="~/link.aspx?_id=GUID&_z=z">Some link</a>
At render time, Sitecore resolves this to a friendly URL. But if the target item is:
- Deleted
- Not published
- Not accessible in the current language
the link still renders, often pointing to a 404.
You usually don’t notice until users report it.
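To make the stored format concrete, here is a minimal sketch that pulls the target item IDs out of a raw RTE value using only the .NET base class library. The class and method names (InternalLinkScanner, FindTargetIds) are hypothetical, and the pattern assumes the ampersand may be stored either raw or encoded as &amp;.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InternalLinkScanner
{
    // Matches ~/link.aspx?_id=<32 hex chars> followed by "&_z=z" or "&amp;_z=z".
    private static readonly Regex LinkPattern = new Regex(
        @"~/link\.aspx\?_id=([A-F0-9]{32})&(?:amp;)?_z=z",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the distinct item IDs referenced by internal links in the HTML.
    public static ISet<Guid> FindTargetIds(string html)
    {
        var ids = new HashSet<Guid>();
        if (string.IsNullOrEmpty(html))
        {
            return ids;
        }

        foreach (Match match in LinkPattern.Matches(html))
        {
            // "N" is the 32-digit format without dashes or braces.
            if (Guid.TryParseExact(match.Groups[1].Value, "N", out var id))
            {
                ids.Add(id);
            }
        }

        return ids;
    }
}
```

Each ID returned this way is a candidate for the existence and publishing checks described in the rest of this post.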
Why Use a RenderField Processor?
Sitecore’s RenderField pipeline runs every time a field is rendered. That makes it a perfect place to inspect and modify RTE output before it hits the page.
With a custom processor, you can:
- Detect internal links in RTE fields
- Resolve the target item
- Validate publishing and language
- Remove or disable invalid links
- Optionally log or flag the issue
All without changing templates or editor behavior.
High-Level Approach
The solution follows these steps:
- Run only for RTE fields
- Scan the field's stored HTML, before Sitecore expands the links
- Find internal Sitecore links
- Resolve the target item
- Validate the item
- Modify the output if the link is invalid
Creating the Custom Processor
First, create the processor class:
using System;
using System.Text.RegularExpressions;
using Sitecore.Data;
using Sitecore.Diagnostics;
using Sitecore.Pipelines.RenderField;

/// <summary>
/// Custom Sitecore RenderField processor that validates internal links in RTE fields.
/// Broken internal links are redirected to a 404 page.
/// </summary>
public class RTEContentResolver
{
    // Matches stored internal links; the ampersand may be raw or encoded as "&amp;".
    private const string InternalLinkPattern = @"~/link\.aspx\?_id=([A-Fa-f0-9]{32})&(?:amp;)?_z=z";

    // ID of the 404 page item; replace it with the ID from your own solution.
    private const string NotFoundPageId = "{D7AEB6F8-A175-4559-ADA1-462E9EEEA3E2}";

    public virtual void Process(RenderFieldArgs args)
    {
        Assert.ArgumentNotNull(args, nameof(args));
        try
        {
            if (args.Item == null || args.Result == null)
            {
                return;
            }

            // Only rich text fields can contain editor-inserted internal links.
            if (!string.Equals(args.FieldTypeKey, "rich text", StringComparison.OrdinalIgnoreCase))
            {
                return;
            }

            var fieldValue = args.FieldValue;
            if (string.IsNullOrWhiteSpace(fieldValue))
            {
                return;
            }

            var resolvedContent = ResolveRTELinks(fieldValue);

            // Update the result only if links were modified
            if (!string.Equals(fieldValue, resolvedContent, StringComparison.Ordinal))
            {
                args.Result.FirstPart = resolvedContent;
            }
        }
        catch (Exception ex)
        {
            // A validation failure should never stop the field from rendering.
            Log.Error($"Error in RTEContentResolver: {ex.Message}", ex, this);
        }
    }
    /// <summary>
    /// Resolves and validates all internal links in RTE content.
    /// </summary>
    /// <param name="rteContent">The RTE field content</param>
    /// <returns>The RTE content with validated links</returns>
    private string ResolveRTELinks(string rteContent)
    {
        if (string.IsNullOrWhiteSpace(rteContent))
        {
            return rteContent;
        }

        // ToShortID() yields the 32-character form used in ~/link.aspx links.
        var notFoundShortId = new ID(NotFoundPageId).ToShortID().ToString();
        var result = rteContent;

        foreach (Match match in Regex.Matches(rteContent, InternalLinkPattern, RegexOptions.IgnoreCase))
        {
            var itemIdString = match.Groups[1].Value;

            // Skip links whose ID cannot be parsed or whose target is available.
            if (!TryParseItemId(itemIdString, out var itemId) || IsItemPublished(itemId))
            {
                continue;
            }

            // Swap only the ID so the rest of the link keeps its original form.
            var originalLink = match.Value;
            var replacementLink = originalLink.Replace(itemIdString, notFoundShortId);
            result = result.Replace(originalLink, replacementLink);

            Log.Warn($"Broken or unpublished link found in RTE content. ItemId: {itemId} - Redirecting to 404 page.", this);
        }

        return result;
    }
    /// <summary>
    /// Attempts to parse a string as a valid Sitecore Item ID.
    /// </summary>
    private bool TryParseItemId(string itemIdString, out ID itemId)
    {
        itemId = null;
        if (string.IsNullOrWhiteSpace(itemIdString))
        {
            return false;
        }

        try
        {
            itemId = new ID(itemIdString);
            return !itemId.IsNull;
        }
        catch
        {
            return false;
        }
    }
    /// <summary>
    /// Checks if an item exists in the web database and has a version in the current language.
    /// </summary>
    private bool IsItemPublished(ID itemId)
    {
        if (itemId.IsNull)
        {
            return false;
        }

        var webDatabase = Sitecore.Configuration.Factory.GetDatabase("web");
        if (webDatabase == null)
        {
            return false;
        }

        // A missing item was deleted or never published.
        var item = webDatabase.GetItem(itemId);
        if (item == null)
        {
            return false;
        }

        // The item exists; require at least one version in the current language.
        return item.Versions.Count > 0;
    }
}
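This example rewrites each broken link so that it points to a designated 404 page (the item identified by NotFoundPageId) and leaves the link text untouched. You could instead:
- Remove the anchor but keep its inner text
- Replace the link with a span
- Add a CSS class so broken links can be styled or found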
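As one illustration of the span option, the sketch below swaps the anchor that targets a broken item for a span that keeps the inner text. BrokenLinkUnwrapper, Unwrap, and the broken-link CSS class are hypothetical names, and the regex assumes well-formed, non-nested anchors; an HTML parser would be more robust for messy markup.

```csharp
using System.Text.RegularExpressions;

public static class BrokenLinkUnwrapper
{
    // Replaces every <a> that links to brokenShortId (32 hex chars, no braces)
    // with a <span> that preserves the anchor's inner HTML.
    public static string Unwrap(string html, string brokenShortId)
    {
        if (string.IsNullOrEmpty(html) || string.IsNullOrEmpty(brokenShortId))
        {
            return html;
        }

        var pattern =
            @"<a\b[^>]*href=""~/link\.aspx\?_id=" + Regex.Escape(brokenShortId) +
            @"&(?:amp;)?_z=z""[^>]*>(.*?)</a>";

        return Regex.Replace(
            html,
            pattern,
            m => "<span class=\"broken-link\">" + m.Groups[1].Value + "</span>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

Dropping the span wrapper and returning only m.Groups[1].Value gives the "keep the inner text" variant.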
Registering the Processor
Next, patch the processor into the renderField pipeline.
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <renderField>
        <processor
          patch:after="processor[@type='Sitecore.Pipelines.RenderField.GetFieldValue, Sitecore.Kernel']"
          type="YourNamespace.RTEContentResolver, YourAssembly" />
      </renderField>
    </pipelines>
  </sitecore>
</configuration>
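Place this in a patch file under App_Config/Include. The position matters: the processor looks for links in their stored ~/link.aspx form, so it has to run before the ExpandLinks processor turns them into friendly URLs.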
Things to Watch Out For
A few practical considerations:
- Performance: Parsing HTML on every render has a cost. Keep the logic tight.
- Experience Editor: You may want to skip validation in EE to avoid confusing editors.
- Caching: Output caching helps reduce repeated processing.
- Multisite setups: Validate against the correct site and database.
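On the performance point, one option is to remember validation results so that the same target item is not looked up on every render. The sketch below is a hypothetical helper (LinkValidationCache is not part of Sitecore) built only on the .NET base class library; the lookup itself is passed in as a delegate. It assumes you clear the cache when content is published, for example from a publish:end event handler, otherwise results go stale.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class LinkValidationCache
{
    private readonly ConcurrentDictionary<Guid, bool> _results =
        new ConcurrentDictionary<Guid, bool>();

    private readonly Func<Guid, bool> _validate;

    // validate performs the real check, e.g. a lookup in the web database.
    public LinkValidationCache(Func<Guid, bool> validate)
    {
        _validate = validate ?? throw new ArgumentNullException(nameof(validate));
    }

    // Returns the cached result, running the check only on a cache miss.
    public bool IsValid(Guid itemId)
    {
        return _results.GetOrAdd(itemId, _validate);
    }

    // Call after a publish so newly published or removed items are re-checked.
    public void Clear()
    {
        _results.Clear();
    }
}
```

Under concurrent requests GetOrAdd may run the check more than once for the same ID, which is harmless here because the check has no side effects.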
Optional Enhancements
Once the basics work, you can extend this approach:
- Log broken links for reporting
- Show warnings in Experience Editor
- Validate media links as well
- Add feature flags to enable per site
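A per-site feature flag and the Experience Editor skip mentioned earlier can share one guard that the processor checks before doing any work. This is only a sketch: LinkValidationSwitch and the validateInternalLinks attribute are hypothetical names, and it assumes you add that attribute to the relevant site definitions yourself.

```csharp
using System;
using Sitecore;

public static class LinkValidationSwitch
{
    // Hypothetical custom attribute on the <site> definition; not a built-in setting.
    private const string SiteFlagName = "validateInternalLinks";

    public static bool IsEnabledForCurrentRequest()
    {
        // Leave links untouched while editors work in the Experience Editor.
        if (Context.PageMode.IsExperienceEditor)
        {
            return false;
        }

        var site = Context.Site;
        if (site == null)
        {
            return false;
        }

        // Custom attributes on the site definition are exposed through Properties.
        return string.Equals(
            site.Properties[SiteFlagName],
            "true",
            StringComparison.OrdinalIgnoreCase);
    }
}
```

With a guard like this, validation stays off unless a site opts in, which makes a gradual rollout across a multisite instance easier.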
Happy coding 🙂