Getting related content based on categories in Episerver Find

Getting related content based on taxonomy is a common requirement on websites, and I was recently tasked with building exactly that in Episerver using categories and Episerver Find.

What was needed was a block that resolved the categories for the page it was on, then found pages with similar categories, ordered by relevance (i.e. the number of category matches). This sounds like it could be fairly complex, but I was surprised by just how straightforward it was, so I thought I'd share the steps involved.

First, you need to index the IDs of all categories and their ancestors (including the ancestors is optional; do it only if you also want to match on those). This is done via an extension method:

public static IEnumerable<int> CategoryIds(this ICategorizable categorizable)
{
	if (categorizable.Category == null)
	{
		return new List<int>();
	}

	var categoryIds = new List<int>();

	var categoryRepository = ServiceLocator.Current.GetInstance<CategoryRepository>();

	var root = categoryRepository.GetRoot();

	foreach (var categoryId in categorizable.Category)
	{
		var category = categoryRepository.Get(categoryId);

		// Resolve all categories up to the root, getting their ID
		while (category != null && !category.ID.Equals(root.ID))
		{
			categoryIds.Add(category.ID);
			category = category.Parent;
		}
	}

	return categoryIds.Distinct();
}
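To make the ancestor walk concrete, here is a minimal sketch that runs without Episerver. SimpleCategory and AncestorWalk.Resolve are hypothetical stand-ins for Episerver's Category and the extension method above; only the traversal logic is meant to be the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Episerver's Category: an ID plus a parent link.
public sealed class SimpleCategory
{
    public SimpleCategory(int id, SimpleCategory parent = null)
    {
        Id = id;
        Parent = parent;
    }

    public int Id { get; }
    public SimpleCategory Parent { get; }
}

public static class AncestorWalk
{
    // Collects the IDs of the assigned categories and all of their ancestors,
    // stopping before the root so the root itself is never included.
    public static List<int> Resolve(IEnumerable<SimpleCategory> assigned, SimpleCategory root)
    {
        var ids = new List<int>();

        foreach (var start in assigned)
        {
            var category = start;
            while (category != null && category.Id != root.Id)
            {
                ids.Add(category.Id);
                category = category.Parent;
            }
        }

        // Sibling categories share ancestors, so remove the duplicates.
        return ids.Distinct().ToList();
    }
}
```

With a tree of Root (1) > News (2) > Sport (3) and News (2) > Politics (4), a page tagged with Sport and Politics resolves to the IDs 3, 2 and 4: the shared parent is indexed once and the root is left out.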

These values can then be included in the Episerver Find index:

SearchClient.Instance.Conventions.ForInstancesOf<ICategorizable>().IncludeField(x => x.CategoryIds());

That's the setup out of the way!

So, now we need a way to get content that matches any of those categories. To do this, we can use the Episerver Find FilterBuilder to create a simple Or filter that allows us to get any content as long as it has at least one category match:

public static ITypeSearch<IContent> MatchCategories(this ITypeSearch<IContent> search, ICollection<int> categoryIds)
{
	var categoryFilter = SearchClient.Instance.BuildFilter<IContent>();

	categoryFilter = categoryIds.Aggregate(categoryFilter,
		(current, categoryId) => current.Or(x => ((ICategorizable)x).CategoryIds().Match(categoryId)));

	return search.Filter(categoryFilter);
}

Using the extension method above, we can now actually get some results. The problem is that all results are scored equally regardless of how many categories match, so we have no idea of relevance. It would be great if we could use BoostMatching to boost results for each category match. Well, we can! It's just a case of iterating over the category IDs and adding a boost for each one individually:

foreach (var categoryId in categoryIds)
{
	search = search.BoostMatching(x => ((ICategorizable)x).CategoryIds().Match(categoryId), 1);
}

That's everything we need to get results scored by the number of matching categories. For example, if a result shares 5 categories with the request, it'll have a score of 5.
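The effect of the boosts can be illustrated without a search index. The sketch below assumes each matching category contributes exactly 1 to a result's score, which mirrors the intent of the loop above but is not a description of how Episerver Find computes scores internally. RelatedRanking, Rank and the page names in the usage note are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RelatedRanking
{
    // Scores each candidate page by how many of the requested category IDs it has,
    // drops pages with no match at all (the job of the Or filter in the search),
    // and returns the rest with the highest score first.
    public static List<KeyValuePair<string, int>> Rank(
        IReadOnlyDictionary<string, int[]> pageCategories,
        ICollection<int> requestedIds)
    {
        return pageCategories
            .Select(page => new KeyValuePair<string, int>(
                page.Key,
                page.Value.Distinct().Count(id => requestedIds.Contains(id))))
            .Where(scored => scored.Value > 0)
            .OrderByDescending(scored => scored.Value)
            .ToList();
    }
}
```

For a request with category IDs 1, 2 and 3, a page indexed with 1, 2 and 3 scores 3, a page indexed with only 3 scores 1, and a page indexed with only 9 is excluded.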

Putting this all together looks like this:

var search = _client.Search<IContent>()
	.Filter(x => x.MatchTypeHierarchy(typeof(ICategorizable)))
	.MatchCategories(categoryIds);

foreach (var categoryId in categoryIds)
{
	search = search.BoostMatching(x => ((ICategorizable)x).CategoryIds().Match(categoryId), 1);
}

var result = search.GetContentResult();

Sit back and enjoy that relevant content!
