Jeremy Davis
Sitecore, C# and web development
Article printed from: https://blog.jermdavis.dev/posts/2022/revisit-extract-solr-query-sitecore

Revisiting extracting the raw Solr query

Not all breaking changes make it into release notes

Published 19 December 2022

A while back I wrote a post about how you could extract the raw Solr query from Sitecore's ContentSearch APIs. Usually the queries hide behind LINQ operations, but there are times when having the raw text can be helpful - sometimes Sitecore's API doesn't support the operations you need. That work was done under Sitecore v10.0, but having tried to repeat it under v10.2, I discovered it no longer works. There have been some changes under the surface of ContentSearch which require a different approach. So if you need to do this under v10.2, here's how:

The issue

I sat down to write some grouped query code in a v10.2 project recently, and copied in my "grab the search query" code from my original post. But the compiler was not happy with it:

A compiler warning that the ContentSearch query cannot be cast to GenericQueryable as this is deprecated

For Google's benefit the important bits of that error are: [deprecated] class Sitecore.ContentSearch.Linq.Parsing.GenericQueryable<TElement,TQuery> and GenericQueryable<SearchResultItem,SolrCompositeQuery> is obsolete. Please use IndexQueryable<TElement, TQuery> instead.

Ignoring that compiler message didn't help, as I ended up with a NullReferenceException at runtime.

That doesn't bode well. So what has happened? Well, firing up the debugger and looking at the IQueryable at runtime, its base type is now IndexQueryable<T,SolrCompositeQuery>:

Debugger looking at the internal type for a search query, and seeing it is IndexQueryable<T,SolrCompositeQuery>

So under the surface, something that changed during the work for v10.1 or v10.2 has revised the internal type used to represent a Solr query before it's executed.

So let's cast our IQueryable<T> to this new type and see what we get:

Autocomplete for the IndexQueryable type - showing no public GetQuery() method now exists

An issue indeed. There's no public GetQuery() method here any more. Reverting back to ILSpy for a sec to examine the definition of this class, we can see the following in the code:

// Sitecore.ContentSearch.Linq.Parsing.IndexQueryable<TElement,TQuery>
using System.Linq.Expressions;

private TQuery GetQuery(Expression expression)
{
	return _queryTranslator.Translate(expression);
}


The change to the underlying types has hidden GetQuery() away as private now.

That's a frustrating change in the light of the documentation still suggesting there are queries where you need to know the raw Solr query language rather than just the LINQ expressions...

A solution

But all is not lost - Microsoft have provided us with Reflection to solve this problem. Yes it has some performance challenges, and yes it means our code might well break again with future releases of ContentSearch. But it allows a way forward here at least.

You can use the reflection APIs to get a reference to any method of a class and execute it, no matter whether it's public or private. You get the Type for your object, find the MethodInfo for the code you want to call, and pass in any parameters you need. So in this case, we can hack together a simple method that can call GetQuery() for us:

public string GetQuery<T>(IQueryable<T> query)
{
    var type = query.GetType();
    var method = type.GetMethod("GetQuery", BindingFlags.Instance | BindingFlags.NonPublic);

    var solrQuery = (SolrCompositeQuery)method.Invoke(query, new object[] { query.Expression });

    return solrQuery.ToString();
}
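
Since this relies on a private method that could be renamed or removed in a future release, it may be worth failing with a clear message rather than another NullReferenceException. The following is only a minimal sketch of that idea: StandInQueryable and PrivateMethodDemo are hypothetical names used so the example runs without Sitecore, and the string parameter stands in for the real Expression argument.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a class whose useful method has been made private,
// in the same way IndexQueryable<TElement, TQuery> hides GetQuery().
public class StandInQueryable
{
    private string GetQuery(string expression)
    {
        return "translated(" + expression + ")";
    }
}

public static class PrivateMethodDemo
{
    public static string CallGetQuery(object target, string expression)
    {
        // Private instance methods are only found when both flags are supplied.
        var method = target.GetType().GetMethod(
            "GetQuery", BindingFlags.Instance | BindingFlags.NonPublic);

        // GetMethod() returns null when nothing matches, so check before invoking.
        if (method == null)
        {
            throw new MissingMethodException(target.GetType().FullName, "GetQuery");
        }

        return (string)method.Invoke(target, new object[] { expression });
    }
}
```

Calling PrivateMethodDemo.CallGetQuery(new StandInQueryable(), "x") returns "translated(x)". Against the real types the same null check would slot in between the GetMethod() and Invoke() calls.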


And that can be run against the IQueryable that gets created by a ContentSearch LINQ query:

using (var ctx = ContentSearchManager.GetIndex("sitecore_master_index").CreateSearchContext())
{
    var q = ctx.GetQueryable<SearchResultItem>()
        .Where(i => i.TemplateName == "Search Item")
        .Where(i => i.Paths.Contains(new Sitecore.Data.ID("{80A1CDA6-8A15-4BD6-8A71-59025A5A8219}")));

    var queryText = GetQuery(q);

    // do something with the query text
}


So that solves my immediate problem - I can get the basic query again now.

If you're doing this somewhere that performance is important, then you don't want to do the reflection method lookup for every execution. You could do this in a static constructor for whatever controller is running this search code, but I figure it's probably most useful to factor it out into a helper class:

public static class QueryExtensions<T> 
{
    private static readonly MethodInfo _getQuery;

    static QueryExtensions()
    {
        var type = typeof(IndexQueryable<T, SolrCompositeQuery>);
        _getQuery = type.GetMethod("GetQuery", BindingFlags.Instance | BindingFlags.NonPublic);
    }

    public static string GetQuery(IQueryable<T> query)
    {
        var solrQuery = (SolrCompositeQuery)_getQuery.Invoke(query, new object[] { query.Expression });

        return solrQuery.ToString();
    }
}


Extension methods have to be declared in a non-generic static class, so this generic class can't expose GetQuery() as one directly. What it does give you is a static field per query result type for the method, which cuts down the slow reflection lookups: reflection is only called once for each query result type you use it against.
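
If you do want extension-method syntax, one possible approach is a non-generic static class with a cache keyed by the runtime type. This is only a sketch and not something I've run against Sitecore: StandInQuery<T> and GetRawQuery() are hypothetical names so the example is self-contained, and with the real types the extension would presumably target IQueryable<T> and pass query.Expression instead of a string.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical stand-in for the real queryable type; only the private method matters here.
public class StandInQuery<T>
{
    private string GetQuery(string expression)
    {
        return typeof(T).Name + ":" + expression;
    }
}

// Extension methods must live in a non-generic static class, so the cache is
// keyed by runtime type rather than by a generic type parameter.
public static class StandInQueryExtensions
{
    private static readonly ConcurrentDictionary<Type, MethodInfo> _methods =
        new ConcurrentDictionary<Type, MethodInfo>();

    public static string GetRawQuery<T>(this StandInQuery<T> query, string expression)
    {
        // The lookup normally runs once per closed type. Under contention the
        // factory can run more than once, which is harmless for this use.
        var method = _methods.GetOrAdd(
            query.GetType(),
            t => t.GetMethod("GetQuery", BindingFlags.Instance | BindingFlags.NonPublic));

        return (string)method.Invoke(query, new object[] { expression });
    }
}
```

With this in place, new StandInQuery<int>().GetRawQuery("x") returns "Int32:x". The trade-off against the generic static class is a dictionary lookup on each call in exchange for the nicer call syntax.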

A side-note about geographic queries

If you were paying attention to the code example I showed in the first screenshot above, you'll have noticed a call to WithinRadius(). That's a useful LINQ method that ContentSearch provides for doing geographic queries. But you'll see an interesting result when you call my helper code above for a query which uses that method. Taking this LINQ expression:

var q = ctx.GetQueryable<SearchResultItem>()
    .Where(i => i.TemplateName == "Search Item")
    .Where(i => i.Paths.Contains(new Sitecore.Data.ID("{80A1CDA6-8A15-4BD6-8A71-59025A5A8219}")))
    .WithinRadius(i => i.Location, new Coordinate(52.843647, -3.944892), 5);


Printing the query with my example class above gives you this:

query: "(_templatename:(\"Search Item\") AND _path:(\"80a1cda68a154bd68a7159025a5a8219\")) AND _val_:__boost"


There's no geographic clause in there... So what's going on?

What I realised after a bit of thinking is that Solr has some ways of breaking up your query into "the main query" and a "filter query" to optimise its caching strategies. When you issue a geographic query via ContentSearch it puts the distance-related bits into the "filter query" part of the data sent to Solr, and you don't see that using the code above.

Doing a bit of digging, you can find this data though. The result of the call to GetQuery is a SolrCompositeQuery object. That includes a field for the main query as well as a Filter field for the filter query. The Filter field is of type SolrQuery and it has a Query property which returns its text. So the helper method defined above can be adjusted to return both:

public static class QueryExtensions<T> 
{
    private static readonly MethodInfo _getQuery;

    static QueryExtensions()
    {
        var type = typeof(IndexQueryable<T, SolrCompositeQuery>);
        _getQuery = type.GetMethod("GetQuery", BindingFlags.Instance | BindingFlags.NonPublic);
    }

    public static (string query, string filter) GetQuery(IQueryable<T> query)
    {
        var solrQuery = (SolrCompositeQuery)_getQuery.Invoke(query, new object[] { query.Expression });
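        // Not every query will necessarily carry a SolrQuery filter, so use a safe
        // cast and return null for the filter text rather than throwing.
        var filterQuery = solrQuery.Filter as SolrQuery;

        return (solrQuery.ToString(), filterQuery?.Query);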
    }
}


And the results of running this against the LINQ query above are:

query: "(_templatename:(\"Search Item\") AND _path:(\"80a1cda68a154bd68a7159025a5a8219\")) AND _val_:__boost",
filter: "{!geofilt sfield=coordinate_rpt pt=52.843647,-3.944892 d=5}"
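
These two strings correspond to Solr's standard q and fq request parameters, so if you want to replay the search by hand you need to send both. As a rough sketch, assuming the text from the helper can be passed to Solr as-is (SolrRequestSketch, BuildSelectUrl and the base URL are hypothetical names for illustration, not part of ContentSearch):

```csharp
using System;

public static class SolrRequestSketch
{
    // baseUrl is a hypothetical core address, e.g. "https://localhost:8983/solr/sitecore_master_index".
    public static string BuildSelectUrl(string baseUrl, string query, string filter)
    {
        // The main query goes in "q".
        var url = baseUrl.TrimEnd('/') + "/select?q=" + Uri.EscapeDataString(query);

        // The filter query goes in "fq", and is left out when there is none.
        if (!string.IsNullOrEmpty(filter))
        {
            url += "&fq=" + Uri.EscapeDataString(filter);
        }

        return url;
    }
}
```

Pasting the resulting URL into a browser or the Solr admin UI is a quick way to check that the pair of strings really does reproduce the results you get through ContentSearch.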


Problem solved...
