Jeremy Davis
Sitecore, C# and web development

Automatic packages from TFS: #3 – Pipelines and data transformation

Published 17 August 2014
Updated 25 August 2016
This is post 3 in an ongoing series titled Automatic packages from TFS

In the first two posts in this series we've looked at command-line parameters and fetching data, and then at saving package files. This week, we'll look at how the fetched data can be transformed into the package data.

I said back in the first post that I wanted to build this tool using a pipeline-style architecture. The approach is good because it's flexible: the set of pipeline components that can be run, and the order in which they are executed, can both be configured. That means the one tool can use config settings (and potentially extensions) to support different projects. So for this blog post I'll look at the code to configure and run the pipeline, and the set of components I've managed to come up with so far for my prototype.

Running the components

The code to run the pipeline is very simple:

public class ProcessingPipeline
{
    private ProjectConfiguration _config;

    public ProcessingPipeline(ProjectConfiguration config)
    {
        _config = config;
    }

    public PackageProject Run(IDictionary<string, SourceControlActions> input)
    {
        PipelineData pd = new PipelineData();
        pd.Configuration = _config;
        pd.Input = input;
        pd.Output = new PackageProject();

        // Hand the shared data object to each configured component in turn
        foreach (IPipelineComponent cmp in _config.PipelineComponents)
        {
            cmp.Run(pd);
        }

        return pd.Output;
    }
}


To create an instance of the pipeline we pass in the ProjectConfiguration (more of that later). When we run the pipeline, the input is the data that came back from Source Control, and the output is a PackageProject like the one we saw last week.

The input is used to create a helper data object PipelineData - and it's this object that's handed to the individual pipeline components in turn. It's pretty trivial:

public class PipelineData
{
    public IDictionary<string, SourceControlActions> Input { get; set; }
    public PackageProject Output { get; set; }
    public ProjectConfiguration Configuration { get; set; }
}


All it does is store the input data, the configuration data and the output project in one convenient object. Each pipeline component then only needs to implement a simple interface:

public interface IPipelineComponent
{
    void Run(PipelineData data);
}


All you need to be able to do is pass it the PipelineData for it to process.
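To make the contract concrete, here is a minimal, self-contained sketch of a component being run against a shared data object. The CopyItemPaths component and the cut-down PipelineData (plain strings instead of SourceControlActions and PackageProject) are hypothetical stand-ins invented for illustration, not the prototype's real types.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the post's PipelineData, so the sketch compiles on its own.
public class PipelineData
{
    public IDictionary<string, string> Input { get; set; }
    public IList<string> Output { get; set; }
}

public interface IPipelineComponent
{
    void Run(PipelineData data);
}

// Hypothetical component: copies every input path ending in ".item" to the output.
public class CopyItemPaths : IPipelineComponent
{
    public void Run(PipelineData data)
    {
        foreach (var path in data.Input.Keys)
        {
            if (path.EndsWith(".item", StringComparison.OrdinalIgnoreCase))
            {
                data.Output.Add(path);
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var data = new PipelineData
        {
            Input = new Dictionary<string, string>
            {
                { "/sitecore/content/Home.item", "edit" },
                { "/Website/Web.config", "edit" }
            },
            Output = new List<string>()
        };

        // Every component receives the same PipelineData, in list order.
        var components = new List<IPipelineComponent> { new CopyItemPaths() };
        foreach (var cmp in components)
        {
            cmp.Run(data);
        }

        // Only the .item path should have been copied across.
        Console.WriteLine(string.Join(", ", data.Output));
    }
}
```

Because each component only reads and writes the shared object, adding or reordering steps does not require changing any other component.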

Revisiting configuration

Since I made the first post in this series, I've revised the configuration approach for the code a bit. One side effect of that is that I have now moved most of the Sitecore-project-specific configuration into an external XML file which is read at runtime. That allows specifying assorted settings, but the key part for today is the pipeline components and their config.

An example of the XML for the config file might be:

<configuration>
  <input type="Sitecore.TFS.PackageGenerator.Inputs.TFSCommandLine,Sitecore.TFS.PackageGenerator"/>
  <!-- The parser also reads a <workingFolder> element at this level; its value is not shown in this example -->
  <pipelineComponents>
    <component type="Sitecore.TFS.PackageGenerator.PipelineComponents.SetPackageMetadata,Sitecore.TFS.PackageGenerator"/>
    <component type="Sitecore.TFS.PackageGenerator.PipelineComponents.RemoveUnwantedItems,Sitecore.TFS.PackageGenerator"/>
    <component type="Sitecore.TFS.PackageGenerator.PipelineComponents.RenameFiles,Sitecore.TFS.PackageGenerator"/>
    <component type="Sitecore.TFS.PackageGenerator.PipelineComponents.ExtractFilesToDeploy,Sitecore.TFS.PackageGenerator"/>
    <component type="Sitecore.TFS.PackageGenerator.PipelineComponents.ExtractItemsToDeploy,Sitecore.TFS.PackageGenerator"/>
    <component type="Sitecore.TFS.PackageGenerator.PipelineComponents.ExtractBinariesToDeploy,Sitecore.TFS.PackageGenerator"/>
    <component type="Sitecore.TFS.PackageGenerator.PipelineComponents.ExtractDeletionsToDeploy,Sitecore.TFS.PackageGenerator"/>
  </pipelineComponents>
  <settings>
    <setting name="TFSCommandLine.ToolPath" value="C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\TF.exe"/>
    <setting name="SetPackageMetadata.PackageName" value="GeneratedPackage" />
    <setting name="RemoveUnwantedItems.ExtensionsToIgnore" value=".scproj,.csproj,.sln,.vspscc,.tds,.sql,web.config,web.debug.config,web.release.config,packages.config"/>
    <setting name="RemoveUnwantedItems.FoldersToIgnore" value="/deployments/,.TDS_Debug.,/css/includes/,/externalpackages/,/buildconfig/"/>
    <setting name="RenameFiles.Extensions" value=".less|.css"/>
    <setting name="ExtractFilesToDeploy.WebProjectFolder" value="/ClientName/ProjectName/Main/Source/Client.Project.Website"/>
    <setting name="ExtractBinariesToDeploy.ProjectPathMap" value="Client.Project.Website|/bin/Client.Project.Website.dll|/bin/Client.Project.Website.pdb" />
  </settings>
  <output type="Sitecore.TFS.PackageGenerator.Outputs.SaveXmlToDisk,Sitecore.TFS.PackageGenerator"/>
</configuration>


The two bits of this we're interested in for the moment are the <pipelineComponents> and <settings> elements.

The config here is processed by the ProjectConfiguration class. It's created via a static method, which is called with the path of the config XML to load:

public class ProjectConfiguration
{
    public string WorkingFolder { get; private set; }
    public IList<IPipelineComponent> PipelineComponents { get; private set; }
    public IDictionary<string, string> Settings { get; private set; }
    public IInput Input { get; private set; }
    public IOutput Output { get; private set; }
    public ConsoleLog Log { get; private set; }

    private ProjectConfiguration()
    {
        PipelineComponents = new List<IPipelineComponent>();
        Settings = new Dictionary<string, string>(StringComparer.CurrentCultureIgnoreCase);
        Log = new ConsoleLog();
    }

    public static ProjectConfiguration Load(string file)
    {
        ProjectConfiguration pc = new ProjectConfiguration();

        using (var xr = new System.Xml.XmlTextReader(file))
        {
            var xml = XDocument.Load(xr);
            // Populate the new instance from the loaded document
            pc.parse(xml);
        }

        return pc;
    }
}


Parsing the XML is done via:

    private void parse(XDocument xml)
    {
        XElement root = xml.Element("configuration");

        string inputType = root.Element("input").Attribute("type").Value;
        Input = createInstance<IInput>(inputType);

        string outputType = root.Element("output").Attribute("type").Value;
        Output = createInstance<IOutput>(outputType);

        WorkingFolder = root.Element("workingFolder").Value;

        foreach (var component in root.Element("pipelineComponents").Elements("component"))
        {
            string type = component.Attribute("type").Value;

            Type t = Type.GetType(type);
            System.Reflection.ConstructorInfo ci = t.GetConstructor(System.Type.EmptyTypes);
            IPipelineComponent cmp = ci.Invoke(null) as IPipelineComponent;

            // Components run in the order they appear in the config file
            PipelineComponents.Add(cmp);
        }

        foreach (var item in root.Element("settings").Elements("setting"))
        {
            Settings.Add(item.Attribute("name").Value, item.Attribute("value").Value);
        }
    }


This uses the LINQ to XML APIs to extract the appropriate elements and process their values. (The non-prototype version of this code will need better error handling here, of course.) In this version of the code, the input (getting changes from TFS) and the output (saving the package XML to disk) have been abstracted out to plugin types, with a view to perhaps supporting multiple input and output options in the future. The XML for both of these elements contains a .NET type descriptor, and the createInstance() method attempts to convert that from the string description into a valid object:

private T createInstance<T>(string type)
{
    Type t = Type.GetType(type);
    System.Reflection.ConstructorInfo ci = t.GetConstructor(System.Type.EmptyTypes);
    return (T)ci.Invoke(null);
}


It parses the type descriptor, extracts the parameterless constructor, and then invokes it to generate an object - and it uses generic type parameters to cast this object to the correct type.
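As a rough idea of what the "better error handling" might look like, the sketch below adds null checks around the same reflection calls. PluginFactory, DemoInput and the exception messages are hypothetical names made up for this example; only Type.GetType, GetConstructor and Invoke come from the code above.

```csharp
using System;

public interface IInput { }

// Hypothetical plugin type used only for this demonstration.
public class DemoInput : IInput { }

public static class PluginFactory
{
    public static T CreateInstance<T>(string typeName) where T : class
    {
        // Type.GetType returns null when the name cannot be resolved.
        Type t = Type.GetType(typeName);
        if (t == null)
            throw new InvalidOperationException("Cannot find type '" + typeName + "'.");

        // Fail early if the configured type is not the kind of plugin we expect.
        if (!typeof(T).IsAssignableFrom(t))
            throw new InvalidOperationException("Type '" + typeName + "' does not implement " + typeof(T).Name + ".");

        // GetConstructor returns null when there is no public parameterless constructor.
        var ci = t.GetConstructor(Type.EmptyTypes);
        if (ci == null)
            throw new InvalidOperationException("Type '" + typeName + "' has no public parameterless constructor.");

        return (T)ci.Invoke(null);
    }
}

public static class Program
{
    public static void Main()
    {
        // AssemblyQualifiedName has the same "Type, Assembly" shape as the config file entries.
        IInput input = PluginFactory.CreateInstance<IInput>(typeof(DemoInput).AssemblyQualifiedName);
        Console.WriteLine(input.GetType().Name);

        try
        {
            PluginFactory.CreateInstance<IInput>("No.Such.Type");
        }
        catch (InvalidOperationException ex)
        {
            // A typo in the config now produces a readable message instead of a NullReferenceException.
            Console.WriteLine(ex.Message);
        }
    }
}
```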

For the pipeline components in the config XML, the code just repeats the same pattern, looping over the elements describing each component and adding each new instance to a collection. For the settings, the name and value pairs are extracted to a dictionary.
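One way a component might consume that dictionary is sketched below. The GetList helper is an assumption made for illustration and is not part of the prototype; the setting name and the comma-separated format are taken from the sample config above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Hypothetical helper: returns the comma-separated parts of a setting,
    // or an empty array if the setting is missing or blank.
    public static string[] GetList(IDictionary<string, string> settings, string name)
    {
        string raw;
        if (!settings.TryGetValue(name, out raw) || string.IsNullOrWhiteSpace(raw))
            return new string[0];

        return raw.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                  .Select(s => s.Trim())
                  .ToArray();
    }

    public static void Main()
    {
        // Same comparer as ProjectConfiguration, so lookups ignore case.
        var settings = new Dictionary<string, string>(StringComparer.CurrentCultureIgnoreCase)
        {
            { "RemoveUnwantedItems.ExtensionsToIgnore", ".scproj,.csproj,.sln" }
        };

        string[] ignored = GetList(settings, "removeunwanteditems.extensionstoignore");
        string path = "/Main/Source/Client.Project.Website/Client.Project.Website.csproj";

        // True when the path ends with any ignored extension.
        bool skip = ignored.Any(ext => path.EndsWith(ext, StringComparison.OrdinalIgnoreCase));
        Console.WriteLine(skip);
    }
}
```

Prefixing each setting name with the component name, as the sample config does, keeps all the settings in one flat dictionary without name clashes between components.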

So we can update some of the code from the first post in this series, and load the pipelines and settings configuration, then run the pipeline as follows:

ProjectConfiguration config = ProjectConfiguration.Load(pathToConfigFile);

var data = config.Input.ProcessWorkItems(config.WorkingFolder, cmdParams.StartChangeSet, cmdParams.EndChangeSet);

ProcessingPipeline pp = new ProcessingPipeline(config);
var packageData = pp.Run(data);

var xml = packageData.ToXml();

config.Output.Store(xml, cmdParams.PackageFileName);


So what pipeline steps are needed?

Working through the set of operations required, my prototype makes use of the following steps so far:

  • Set Package Metadata: This fills in a few of the core bits of metadata, such as the package name and author.
  • Remove Unwanted Items: Some items found via TFS do not need to go into packages, such as Visual Studio project files, or folders relating to build configuration. This operation removes those from the list of discovered changes so they are ignored.
  • Rename Files: Some files may have a different name or extension in Source Control compared to the name / extension that needs to go in the package. In my case a checked-in change to a .less file needs to be deployed as a .css file, and this operation will transform the name.
  • Extract Files to Deploy: This component iterates the list of changes and finds the remaining things that are not serialised Sitecore items and are not C# files, and adds them to the package as disk files to deploy.
  • Extract Items to Deploy: Likewise, this component will find all of the .item changes reported by TFS and transform their name in order to add them correctly to the package as Sitecore Items.
  • Extract Binaries to Deploy: This component finds the C# changes reported by TFS, and matches them against the names of projects included in the configuration to generate the correct set of binary files to add to the package.
  • Extract Deletions to Deploy: Finally, any Source Control operations marked as "delete" are found by this code and enumerated in the package "Read Me" metadata, since there is no automated way to delete individual files or items through a package.
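As an illustration of one of these steps, the Rename Files transformation could be sketched roughly as follows. This is not the prototype's actual component: the Rename helper is hypothetical, and it assumes the "from|to" shape of the RenameFiles.Extensions value in the sample config.

```csharp
using System;

public static class Program
{
    // Hypothetical helper: swaps the "from" extension for the "to" extension
    // when the path ends with "from", and leaves other paths alone.
    public static string Rename(string path, string mapping)
    {
        string[] parts = mapping.Split('|');
        if (parts.Length != 2)
            throw new FormatException("Expected 'from|to' but got '" + mapping + "'.");

        if (path.EndsWith(parts[0], StringComparison.OrdinalIgnoreCase))
            return path.Substring(0, path.Length - parts[0].Length) + parts[1];

        return path;
    }

    public static void Main()
    {
        // A .less path is rewritten to .css; the .js path is returned unchanged.
        Console.WriteLine(Rename("/css/site.less", ".less|.css"));
        Console.WriteLine(Rename("/js/site.js", ".less|.css"));
    }
}
```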

At the back of my mind I'm considering the possibility of post-deployment steps for things like automatically deleting files or items, or running SQL scripts. But those are still just ideas at this stage.

Next week, I plan to drill down into the code for these individual pipeline components to show how they work, and how they interact with the configuration data.
