Jeremy Davis
Sitecore, C# and web development
Article printed from: https://blog.jermdavis.dev/posts/2023/migrating-t4-to-source-generators

Some fun migrating a T4 Template to a Source Generator

Some pros, some cons, and a change of approach

Published 24 April 2023

My recent post about messing up with inheritance came out of some work to migrate some (fairly old) T4 Template code generation to .Net's newer Source Generators feature. Excluding my own mistakes, this process wasn't as easy as I'd hoped. So it seemed like a good topic to jot some notes down about, in case others are facing similar challenges...

Background

A fairly common use case for T4 templates was to take some things already defined in a solution's code, and process them to generate extra code. A pattern here might be something like a developer defining a partial class that models some data:

namespace BlogExample
{

    [T4ToString]
    public partial class MyDataModel
    {
        public int DataItemOne { get; set; }
        public int DataItemTwo { get; set; }
    }

}

					

and then a T4 Template can look for some marker on existing classes (say the [T4ToString] attribute here) and generate the other half of the partial class to implement some boilerplate method:

//
// Generated Code - do not edit
//
namespace BlogExample
{

    public partial class MyDataModel
    {
        public override string ToString()
        {
            return $"MyDataModel: DataItemOne={DataItemOne}, DataItemTwo={DataItemTwo}";
        }
    }

}

					

The T4 template code here can use reflection to look at the compiled code, and work out what classes it needs to process, and what the generated code should look like. It might look something like:

<#@ template debug="false" hostspecific="false" language="C#" #>
<#@ assembly name="System.Core" #>
<#@ import namespace="System.Linq" #>
<#@ import namespace="System.Text" #>
<#@ import namespace="System.Reflection" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ assembly name="$(SolutionDir)FrameworkExample\bin\Debug\FrameworkExample.dll" #>
<#@ import namespace="FrameworkExample" #>
<#@ output extension=".cs" #>
//
// Generated Code - do not edit
//
namespace BlogExample
{
<#
    var sb = new StringBuilder();
	
	var attrType = typeof(T4ToStringAttribute);
	var assembly = attrType.Assembly;
	var types = assembly.GetTypes().Where(a => a.GetCustomAttributes(attrType, true).Length > 0);

    foreach(var t in types)
	{
		var parameters = t.GetProperties(BindingFlags.Instance | BindingFlags.Public);

		var propertyList = string.Empty;
		foreach(var p in parameters)
		{
		    if (propertyList.Length > 0)
            {
				propertyList += ", ";
            }
            propertyList += $"{p.Name}={{{p.Name}}}";
		}
#>
	public partial class <#=t.Name#>
	{
		public override string ToString()
		{
			return $"<#=t.Name#>: <#=propertyList#>";
		}
	}
<#
	}
#>
}

					

I had some code which worked broadly like this in the project I was trying to migrate to .Net 7, and I quickly realised that the T4 code wasn't going to work in the migrated codebase. T4 templates are no longer properly supported, and won't work in the build process here. The core problem seems to be that they can't load (and hence use) .Net Core DLLs. There are some hacks that can work in some circumstances, but they weren't relevant to my work.

The replacement technology is the newer "Source Generator" approach. This uses C# classes which implement the ISourceGenerator interface to generate extra source files which can then be compiled. But the migration isn't necessarily obvious...

The big differences

The T4 approach runs after the C# compiler has turned the source that you typed out into a DLL. It has to, because it's using reflection to decide what code to generate. On the plus side, using reflection to do this analysis is pretty easy for most developers. But on the down side it does lead to an annoying side-effect. Generally you don't commit your generated code to source control, so a freshly cloned copy of a solution using this pattern won't compile cleanly the first time: the hand-written code references generated code, but the generation can't run until a build has completed...

flowchart LR
  HSC[Human provided source]
  CMP1[Compiler tries to generate DLLs]
  T4[T4 generates more code]
  CMP2[Compiler tries to generate DLLs]
  HSC--Basic code compiled-->CMP1
  CMP1--Reflection reads basic code-->T4
  T4--Source for enhanced objects-->CMP2
  CMP2--Original source references enhanced objects-->HSC

That circular reference can be a bit of a pain, but is generally sorted out by running the build twice, and by how you break up code between the projects in your solution. The hierarchy of your projects matters here.

That isn't true with the source generator approach. These don't need to wait for an initial compilation because they operate directly on the source code. Generators run after the compiler has parsed your code, but before anything has been compiled into IL. So the circular dependency above doesn't exist:

flowchart LR
  HSC[Human provided source]
  GEN[Source Generator adds more source]
  CMP[Compiler generates DLLs]
  HSC--Parsed by compiler-->GEN
  GEN--Extra source now parsed-->CMP

But this does mean the code in the generator cannot use Reflection to work out what to create...

Setup for the new method

The first thing to pay attention to here is that a Source Generator has to run inside the compiler's context. That means your generators have to live in .Net Standard 2.0 DLLs, so they need to be defined in a separate project from the one you'll be generating your code in.

If you're going to want to debug the generator code you're writing (and trust me, you will!) then you need to have installed the .Net Compiler Platform component into Visual Studio. You can do that via the Visual Studio Installer. For your relevant instance of VS, click "modify", "Individual components", then search for ".Net Compiler" and make sure it's checked:

Selecting the .Net Compiler Platform in the VS Installer

When you create your generator project you need to ensure you've added some references and settings. At a minimum you need to add the NuGet packages for Microsoft.CodeAnalysis.CSharp and Microsoft.CodeAnalysis.Analyzers, though you may also need others. You also need a specific ItemGroup. I don't think there's any UI for this right now, so you need to add it to the project file manually:

<ItemGroup>
	<None Include="$(OutputPath)\$(AssemblyName).dll" Pack="true" PackagePath="analyzers/dotnet/cs" Visible="false" />
</ItemGroup>

					

and there are some settings which seem to be required in the PropertyGroup for controlling the project:

<EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles>
<CompilerGeneratedFilesOutputPath>Generated</CompilerGeneratedFilesOutputPath>
<IsRoslynComponent>true</IsRoslynComponent>

					

With that done you can add a basic class for your generator:

[Generator]
public class BlogExampleGenerator : ISourceGenerator
{
    public void Execute(GeneratorExecutionContext context)
    {
    }

    public void Initialize(GeneratorInitializationContext context)
    {
    }
}

					

Once that's in place you can add your generator DLL to your code project, and configure it as a Source Generator. Again there doesn't seem to be an easy UI for this, so you can add the relevant XML to your project file:

<ItemGroup>
	<ProjectReference Include="..\SourceGenerator\SourceGenerator.csproj"
						OutputItemType="Analyzer"
						ReferenceOutputAssembly="false" />
</ItemGroup>

					

It needs to reference the project (or the DLL) for your analyser and it needs to have the OutputItemType="Analyzer" attribute. If your project also depends on classes defined in the generator project for compiling (if you defined a marker attribute there for example) then you need the ReferenceOutputAssembly set to true. Otherwise, this can be false.

With all that set up, you can build your solution, and you should be able to compile everything...

Debugging your code

In your source generator project, you need to set up a debug profile which knows what the generator should run against. This is the bit the .Net Compiler Platform is needed for - it adds the debug approach used here. So if you can't set this bit up, it's because that module is missing.

Open the properties dialog for your source generator project and scroll down to the debug section near the bottom, then click the "Open debug launch profiles UI" link. Click the "new" button, and add a "Roslyn Component" entry, and pick your main project in the dropdown to set the source data used to run the generator for debugging:

The UI for configuring source generator debugging, showing the 'Roslyn Component' debug profile, and the relevant settings to make

With that done, you can set your generator project as the Startup Project, and click play to debug it. Breakpoints etc should work as normal, and all the context data will be from the project you picked in the dialog above.

The alternative code

But the real meat of this is how you replace the reflection code from T4 with parsing of the source tree for the generator. When it's invoked during the build process, a generator gets an instance of the GeneratorExecutionContext - which provides a lot of state about the current build. But the key thing for this task is the Compilation object, and its SyntaxTrees collection. Each SyntaxTree is a structure representing one file being compiled that has been parsed. That includes the stuff you think about as your code, as well as things like AssemblyInfo.cs which may be generated for you.

Each of the SyntaxTree objects is the complete parsed model of a source file, so it includes all of the tokens, keywords and aspects of the language syntax. So to find "all classes tagged with an attribute" you need to look through it to find the nodes which describe a class, and then look for the nodes which describe the attributes added to that class.
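To get a feel for that structure, it can help to dump the node kinds for a small file. This is only a sketch: it parses a string with CSharpSyntaxTree.ParseText rather than running inside a generator, and the SyntaxDump class and DescribeNodes method are names I've made up for illustration. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class SyntaxDump
{
    // Hypothetical helper: returns one indented line per syntax node,
    // so the nesting of the parsed tree is visible.
    public static List<string> DescribeNodes(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        return tree.GetRoot()
            .DescendantNodes()
            // Indent by the number of ancestors, then append the node's SyntaxKind
            .Select(n => new string(' ', n.Ancestors().Count() * 2) + n.Kind())
            .ToList();
    }
}
```

Running DescribeNodes over the MyDataModel example from earlier should list entries such as NamespaceDeclaration, ClassDeclaration, AttributeList, Attribute and PropertyDeclaration, which are the node types a generator has to navigate.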

So the revised code ends up being a LINQ query on all the descendants of the syntax tree to find any class declarations. Each of those has a list of attributes, which can then be searched for the name of the attribute. Note, however, that since we're working with the C# source and not the .Net IL here, the attribute name is whatever was typed in the code, which is usually the short form without the <something>Attribute suffix.

That ends up looking something like:

var classesWithAttribute = context.Compilation.SyntaxTrees
        .SelectMany(st => st.GetRoot()
                .DescendantNodes()
                .OfType<ClassDeclarationSyntax>()
                .Where(r => r.AttributeLists
                    .SelectMany(al => al.Attributes)
                    // Name.ToString() gives the name as written, without surrounding whitespace or comments
                    .Any(a => a.Name.ToString() == "T4ToString")));
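
Because the syntax tree holds whatever the developer typed, a class marked [T4ToStringAttribute] or [BlogExample.T4ToString] would slip past an exact comparison with "T4ToString". A slightly more forgiving match might look like the sketch below. The AttributeNames class and Matches method are hypothetical names, and it only compares the attribute's name as a string (for example the result of a.Name.ToString()) rather than resolving the type properly, which would need the semantic model.

```csharp
using System;

public static class AttributeNames
{
    // Hypothetical helper: compares the name as written in source
    // against a marker name, ignoring any namespace qualifier and
    // an optional "Attribute" suffix.
    public static bool Matches(string writtenName, string markerName)
    {
        // Drop any namespace qualifier: "BlogExample.T4ToString" -> "T4ToString"
        var simpleName = writtenName.Substring(writtenName.LastIndexOf('.') + 1).Trim();

        const string suffix = "Attribute";
        if (simpleName.Length > suffix.Length &&
            simpleName.EndsWith(suffix, StringComparison.Ordinal))
        {
            // "T4ToStringAttribute" -> "T4ToString"
            simpleName = simpleName.Substring(0, simpleName.Length - suffix.Length);
        }

        return string.Equals(simpleName, markerName, StringComparison.Ordinal);
    }
}
```

In the query this would replace the equality check with something like .Any(a => AttributeNames.Matches(a.Name.ToString(), "T4ToString")).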

					

Once those ClassDeclarationSyntax objects have been pulled out, they can be searched for things like properties or methods by looking for their child items which have the right type, such as PropertyDeclarationSyntax for a property.

Depending on what you're doing here, you may find you have to pay more attention to the variations which C# allows in declarations here. If you're extracting data from a property, you may have to consider whether the one you're looking at has a "normal" get/set body with braces, or an expression body using a => for example. That's not the case with the previous reflection approach though - because all of those end up with broadly the same IL structure after compilation.
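As a rough illustration of that difference, the sketch below parses some source and reports which form each property uses. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced, and PropertyShapes and Describe are invented names for this example. A real generator would read from context.Compilation.SyntaxTrees instead of a string.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class PropertyShapes
{
    // Hypothetical helper: maps each property name to the syntax form used to declare it.
    public static Dictionary<string, string> Describe(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return root.DescendantNodes()
            .OfType<PropertyDeclarationSyntax>()
            .ToDictionary(
                p => p.Identifier.ValueText,
                p => p.ExpressionBody != null
                    ? "expression body"      // e.g. public int Total => A + B;
                    : p.AccessorList != null
                        ? "accessor list"    // e.g. public int A { get; set; }
                        : "unknown");
    }
}
```

Individual accessors inside the braces can themselves be expression-bodied (get => _value;), so code that needs the body of a getter may have another level of variation to handle.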

But with the changes made, the logic for a generator to replace the T4 template above looks something like:

using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;
using System.Linq;
using System.Text;

namespace SourceGenerator
{

    [Generator]
    public class BlogExampleGenerator : ISourceGenerator
    {
        public void Execute(GeneratorExecutionContext context)
        {
            // Find every class declaration tagged with [T4ToString]
            var classesWithAttribute = context.Compilation.SyntaxTrees
                    .SelectMany(st => st.GetRoot()
                            .DescendantNodes()
                            .OfType<ClassDeclarationSyntax>()
                            .Where(r => r.AttributeLists
                                .SelectMany(al => al.Attributes)
                                .Any(a => a.Name.ToString() == "T4ToString")));

            foreach (var c in classesWithAttribute)
            {
                var name = c.Identifier.ValueText;

                // Only direct members, so properties of any nested classes are ignored
                var props = c.Members
                    .OfType<PropertyDeclarationSyntax>()
                    .Select(n => n.Identifier.ValueText);

                var propertyList = string.Empty;

                foreach(var p in props)
                {
                    if (propertyList.Length > 0)
                    {
                        propertyList += ", ";
                    }
                    propertyList += $"{p}={{{p}}}";
                }

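                // In a $@"..." string a doubled quote is an escaped quote and doubled braces are literal braces.
                // The namespace must match the one the original partial class lives in.
                context.AddSource($"{name}.g.cs", SourceText.From($@"// <auto-generated/>
namespace BlogExample
{{

	public partial class {name}
	{{
		public override string ToString()
		{{
			return $""{name}: {propertyList}"";
		}}
	}}

}}
", Encoding.UTF8));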
            }
        }

        public void Initialize(GeneratorInitializationContext context)
        {
        }
    }

}

					

Conclusions

While it's great that this approach gets away from the old "circular reference" problem, you do need to think differently about how you go about reading the parsed source here. It's much more flexible and enables some interesting new approaches, but understanding the structure of the parsed source tree is definitely harder than the old Reflection model.

Another nice difference here is that Source Generators know how to output multiple files. So it's much easier to have a situation where you generate one file per class you extend, if that's your preference.

And it's kind of nice to step away from T4's <# ... #> template syntax here, and the challenges that can pose for structure and layout of your work. While I've not made use of it in the examples above, I think the addition of Raw String Literals in C# 11 could make writing the code "template" in your generator much easier than it's been in the past...
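For a flavour of what that might look like, here's a minimal sketch of the template written as a raw string literal. The TemplateSketch class and Render method are hypothetical, and it assumes the generator project is compiled with C# 11 or later. As far as I know that means setting LangVersion explicitly, since a .Net Standard 2.0 project defaults to an older language version.

```csharp
using System;

public static class TemplateSketch
{
    // Hypothetical helper: builds the generated source for one class.
    public static string Render(string ns, string name, string propertyList)
    {
        // The $$ prefix means interpolation holes are written {{like this}},
        // so single braces and quotes in the template are emitted literally.
        return $$"""
            // <auto-generated/>
            namespace {{ns}}
            {
                public partial class {{name}}
                {
                    public override string ToString()
                    {
                        return $"{{name}}: {{propertyList}}";
                    }
                }
            }
            """;
    }
}
```

The generator's loop could then call something like context.AddSource($"{name}.g.cs", SourceText.From(TemplateSketch.Render("BlogExample", name, propertyList), Encoding.UTF8)), with no doubled braces or doubled quotes to keep track of.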
