When you're working with a "provider" model for services in your applications, you get used to the assumption that everything follows the Liskov Substitution Principle and whatever provider you plug in will work in the same way. Unfortunately, for software out in the real world that's not always entirely true. Recently I came across an example of this which helped point out a bug in some search code in Sitecore...
var index = fetchContextIndex(someContentItem);
var predicate = buildTheSearchCriteria(currentState);

using (IProviderSearchContext context = index.CreateSearchContext())
{
    var query = context
        .GetQueryable<SearchResultItem>()
        .Filter(predicate);

    var fullResultsSet = query.GetResults();
    var totalResults = fullResultsSet.Count();

    // Display the number of matches
}
Cue some head scratching...
But as some of you no doubt already spotted, that's not the documented way you're supposed to get the total number of results for a query. As a number of Google hits point out, the property TotalSearchResults is the thing we should be using here. And that returns the correct value for both Coveo and Lucene.
If the query had included pagination, the issue would have revealed itself straight away, as that would have highlighted the different behaviours of Count() and TotalSearchResults when your query result set is bigger than the results page size. But because the code in question didn't do that, the bug slipped through...
If you look into the code for the SearchResults<TSource> class, you'll see that it exposes both the property TotalSearchResults and an IEnumerable.
The value of the TotalSearchResults property is set specifically by the provider generating the results:
public int TotalSearchResults { get; private set; }
That value is set by the constructor, and it can be independent of the size of the results page being returned for this query.
But the value of a call to Count() for this collection will be based on the enumerator that the class exposes. The implementation of IEnumerable returns an enumeration taken from the inner Hits collection:
IEnumerator<SearchHit<TSource>> IEnumerable<SearchHit<TSource>>.GetEnumerator()
{
    return this.Hits.GetEnumerator();
}
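The difference between the enumerator and the stored total can be sketched with a small stand-in class. This is not the Sitecore implementation: PagedResults<T>, CountVersusTotalDemo and BuildResults are hypothetical names made up for illustration, and the only things borrowed from the real class are the idea of a Hits collection and a separately stored TotalSearchResults value.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a provider's result object; NOT the Sitecore class.
public sealed class PagedResults<T> : IEnumerable<T>
{
    // Only the hits for the page that was actually returned.
    public IReadOnlyList<T> Hits { get; }

    // Total matches in the index, reported separately by the provider.
    public int TotalSearchResults { get; private set; }

    public PagedResults(IEnumerable<T> hits, int totalSearchResults)
    {
        Hits = hits.ToList();
        TotalSearchResults = totalSearchResults;
    }

    // Enumeration walks the returned page only, so LINQ's Count() sees just these items.
    public IEnumerator<T> GetEnumerator() => Hits.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountVersusTotalDemo
{
    // Builds a result set the way a paging provider might:
    // totalMatches items match, but only the first pageSize come back.
    public static PagedResults<int> BuildResults(int totalMatches, int pageSize)
    {
        var allMatches = Enumerable.Range(1, totalMatches);
        return new PagedResults<int>(allMatches.Take(pageSize), totalMatches);
    }

    public static void Main()
    {
        var results = BuildResults(totalMatches: 97, pageSize: 10);

        Console.WriteLine($"Count(): {results.Count()}");                       // Count(): 10
        Console.WriteLine($"TotalSearchResults: {results.TotalSearchResults}"); // TotalSearchResults: 97
    }
}
```

In the snippet at the top of this post, the equivalent fix would be to read fullResultsSet.TotalSearchResults rather than calling fullResultsSet.Count().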
For Lucene, a query with no pagination will return all the index items matched up to the maximum defined in the config setting for "max result set size" (the ContentSearch.SearchMaxResults setting in your config files). In this case, that was more than 97 so the whole result set was returned and hence it looked like the code was working. But Coveo seems to default to a page of 10 results if you fail to specify pagination.

If you think about it, that behaviour makes some sense. Lucene is running in the same process as your site, so it's not a big issue for it to return all the result data if you don't explicitly apply a pagination clause to your query. (You still should though!) It's just shuffling memory about, which is fairly fast to do. However, Coveo runs out-of-process (and in the worst case might be out in the cloud if you use the SaaS version), so defaulting to only returning details for the first 10 results if there is no pagination clause could help prevent performance issues from huge result sets being pushed across the network.
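To make the provider difference concrete, here is a rough simulation of two providers behind one interface. Everything in it is an assumption for illustration: ISearchProvider, InProcessProvider, RemoteProvider and ProviderSwapDemo are invented names, the maximum of 500 is an arbitrary value standing in for "more than 97", and neither class reflects how Lucene or Coveo is really implemented. It only models the default-paging behaviour described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical provider contract: both implementations promise the same shape of answer.
public interface ISearchProvider
{
    // pageSize == null means the caller did not ask for pagination.
    (IReadOnlyList<int> Hits, int Total) Search(IReadOnlyList<int> matches, int? pageSize = null);
}

// Stand-in for an in-process provider: with no paging it returns
// everything up to a configured maximum.
public sealed class InProcessProvider : ISearchProvider
{
    private readonly int _maxResults;

    public InProcessProvider(int maxResults)
    {
        _maxResults = maxResults;
    }

    public (IReadOnlyList<int> Hits, int Total) Search(IReadOnlyList<int> matches, int? pageSize = null)
        => (matches.Take(pageSize ?? _maxResults).ToList(), matches.Count);
}

// Stand-in for an out-of-process provider: with no paging it falls
// back to a small default page to keep network payloads down.
public sealed class RemoteProvider : ISearchProvider
{
    private const int DefaultPageSize = 10;

    public (IReadOnlyList<int> Hits, int Total) Search(IReadOnlyList<int> matches, int? pageSize = null)
        => (matches.Take(pageSize ?? DefaultPageSize).ToList(), matches.Count);
}

public static class ProviderSwapDemo
{
    // Runs an unpaginated search and reports how many hits came back versus the total.
    public static (int HitCount, int Total) Run(ISearchProvider provider, int matchCount)
    {
        var matches = Enumerable.Range(1, matchCount).ToList();
        var (hits, total) = provider.Search(matches);
        return (hits.Count, total);
    }

    public static void Main()
    {
        var inProcess = Run(new InProcessProvider(maxResults: 500), matchCount: 97);
        var remote = Run(new RemoteProvider(), matchCount: 97);

        Console.WriteLine($"In-process: {inProcess.HitCount} hits of {inProcess.Total}");
        Console.WriteLine($"Remote:     {remote.HitCount} hits of {remote.Total}");
    }
}
```

With 97 matches and no page size, the in-process stand-in hands back all 97 hits while the remote one hands back 10, yet both report a total of 97. Code that counts hits only works with the first provider; code that reads the total works with either.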
So take care people – Barbara Liskov might not approve, but sometimes you need to be wary about swapping out providers. There can be justifications for why behaviour isn't always exactly the same, and those variations can lead to subtle bugs if you're not paying attention...
And reading the documentation so you understand the right way to use the objects in question helps too 😉