Jeremy Davis
Sitecore, C# and web development
Article printed from: https://blog.jermdavis.dev/posts/2014/automatic-packages-from-tfs-4-pipeline-component-internals

Automatic packages from TFS: #4 – Pipeline component internals

Published 25 August 2014
Updated 25 August 2016
This is post 4 in an ongoing series titled Automatic packages from TFS

In last week's entry I listed out the set of pipeline components required to generate a package. This week, let's have a look at what goes into each of those components and see how they interact with the configuration and source control data.

1: Basic Package Metadata

The first component that runs is `SetPackageMetadata`. This is a simple component that just sets some basic package metadata. For the moment that is just the package's name and the name of the package creator:
public class SetPackageMetadata : IPipelineComponent
{
    public const string PackageNameKey = "SetPackageMetadata.PackageName";

    public void Run(PipelineData data)
    {
        if (!data.Configuration.Settings.ContainsKey(PackageNameKey))
        {
            data.Configuration.ThrowMissingConfigurationException(PackageNameKey, "This should be the name to set for the package");
        }

        data.Output.Metadata.Name = data.Configuration.Settings[PackageNameKey];
        data.Output.Metadata.Author = System.Environment.UserDomainName + "\\" + System.Environment.UserName;
    }
}

					

All the pipeline components implement the IPipelineComponent interface and expose the Run() method that is called by the parent. When called, this component does two things. First it looks at the configuration data that is passed in and checks whether it includes a key called "SetPackageMetadata.PackageName". (For all the configuration settings I've chosen to use a "ComponentName.SettingName" pattern to avoid clashes.) If the key does not exist in the configuration dictionary then the component can't continue, so it raises an exception to indicate the missing data. Otherwise it carries on and uses the configuration data and the system environment data to update properties in the package metadata.
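Every component in this post repeats the same "check the key, fail if it is missing, otherwise read it" step. As a rough sketch only, that step could be wrapped in a helper like the one below. `SettingsHelper` and `Require` are hypothetical names that are not part of the tool, the settings are assumed to be a string-to-string dictionary, and a plain `InvalidOperationException` stands in for the tool's own ThrowMissingConfigurationException() call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original tool: wraps the
// "check the key exists, otherwise fail" step each component repeats.
public static class SettingsHelper
{
    public static string Require(IDictionary<string, string> settings, string key, string description)
    {
        string value;
        if (!settings.TryGetValue(key, out value))
        {
            // The real tool calls ThrowMissingConfigurationException here;
            // a plain exception stands in for it in this sketch.
            throw new InvalidOperationException(
                "Missing configuration setting '" + key + "': " + description);
        }

        return value;
    }
}
```

With something like this in place, a component could read its setting in one line, for example `SettingsHelper.Require(settings, "SetPackageMetadata.PackageName", "This should be the name to set for the package")`.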

2: Remove any unwanted changes

The second step in the pipeline is to remove any changed files from the Source Control data that we don't want to process. The job of `RemoveUnwantedItems` is to fetch the configuration required for this, and then use that data to filter the Source Control changes.
public class RemoveUnwantedItems : IPipelineComponent
{
    public const string ExtensionsToIgnoreKey = "RemoveUnwantedItems.ExtensionsToIgnore";
    public const string FoldersToIgnoreKey = "RemoveUnwantedItems.FoldersToIgnore";

    private string[] extensions;
    private string[] folders;

    public void Run(PipelineData data)
    {
        if (!data.Configuration.Settings.ContainsKey(ExtensionsToIgnoreKey))
        {
            data.Configuration.ThrowMissingConfigurationException(ExtensionsToIgnoreKey, "A comma separated list of file extensions to ignore when processing. Processed by an 'ends with' query.");
        }
        extensions = data.Configuration.Settings[ExtensionsToIgnoreKey].Split(',');

        if (!data.Configuration.Settings.ContainsKey(FoldersToIgnoreKey))
        {
            data.Configuration.ThrowMissingConfigurationException(FoldersToIgnoreKey, "A comma separated list of folders to ignore. Processed by a 'contains' query.");
        }
        folders = data.Configuration.Settings[FoldersToIgnoreKey].Split(',');

        string[] keys = new string[data.Input.Keys.Count];
        data.Input.Keys.CopyTo(keys, 0);

        foreach (string key in keys)
        {
            if (hasExcludedExtension(key) || containsExcludedFolder(key) || noExtension(key))
            {
                data.Input.Remove(key);
            }
        }
    }
}

					

As in the previous code, the first thing to do is fetch the configuration parameters. Here, however, they are stored as comma separated lists, so we also have to split them into arrays that we can iterate over later. Once this is done, the code iterates the set of changes passed in through the input data for the pipeline. Since we need to modify this collection while iterating, the code starts by copying the keys of the input dictionary and then iterates that copy. For each key it checks whether it has an excluded extension, an excluded folder in its path, or no extension at all. If any of those tests matches, the item is removed from the input dictionary.
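The copy-the-keys step matters because, on .NET Framework, removing entries from a Dictionary while enumerating it throws an InvalidOperationException. The following is a minimal, generic sketch of the same snapshot idea using LINQ rather than CopyTo(). The `SnapshotRemoval` class and `RemoveWhere` method are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotRemoval
{
    // Removes every entry whose key matches the predicate and returns how many were removed.
    public static int RemoveWhere<TValue>(IDictionary<string, TValue> input, Func<string, bool> shouldRemove)
    {
        // ToList() copies the keys, so the loop never enumerates the dictionary being changed.
        List<string> keys = input.Keys.ToList();
        int removed = 0;

        foreach (string key in keys)
        {
            if (shouldRemove(key) && input.Remove(key))
            {
                removed++;
            }
        }

        return removed;
    }
}
```

The predicate plays the role of the three tests in the component: any key it accepts is dropped from the input.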

The tests for extensions and folders are simple:

private bool hasExcludedExtension(string key)
{
    foreach (string extension in extensions)
    {
        if (key.EndsWith(extension, StringComparison.InvariantCultureIgnoreCase))
        {
            return true;
        }
    }

    return false;
}

private bool containsExcludedFolder(string key)
{
    foreach (string folder in folders)
    {
        if (key.CaseInsensitiveContains(folder))
        {
            return true;
        }
    }

    return false;
}

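private bool noExtension(string key)
{
    // Look only at the final path segment. If the key has no '/' at all,
    // treat the whole key as the segment rather than letting Substring() throw.
    int lastSlash = key.LastIndexOf('/');
    string remainder = lastSlash >= 0 ? key.Substring(lastSlash) : key;

    // A segment without a '.' has no extension. This also covers a key
    // that ends in '/', where the remainder is just "/".
    return !remainder.Contains(".");
}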

					

CaseInsensitiveContains() is a helper extension method that hides the rather verbose use of System.Globalization needed to make this test.
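The helper itself isn't shown in this post. As a sketch of what it might look like, assuming it is an extension method on string that compares using the current culture, something like this would do the job. The class name `StringExtensions` and the null handling are my assumptions here.

```csharp
using System;
using System.Globalization;

public static class StringExtensions
{
    // Returns true when 'value' occurs anywhere in 'source', ignoring case.
    public static bool CaseInsensitiveContains(this string source, string value)
    {
        if (source == null || value == null)
        {
            return false;
        }

        // CompareInfo.IndexOf() returns -1 when there is no match.
        return CultureInfo.CurrentCulture.CompareInfo
            .IndexOf(source, value, CompareOptions.IgnoreCase) >= 0;
    }
}
```

Note that an empty `value` counts as a match with this approach, so an empty entry in the folder list would exclude every file.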

3: Renaming files

If the input from source control contains things like changes to `.less` files that need to have the extension changed in the package output, then this component will process the change. It takes a slightly more complex input configuration to specify a list of old and new names, and then checks the source control data for entries that need changing:
public class RenameFiles : IPipelineComponent
{
    public const string ExtensionsKey = "RenameFiles.Extensions";

    public void Run(PipelineData data)
    {
        if(!data.Configuration.Settings.ContainsKey(ExtensionsKey))
        {
            data.Configuration.ThrowMissingConfigurationException(ExtensionsKey, "A list of renames to perform. Separate entries with commas. Separate source and target names with a pipe.");
        }
        Dictionary<string, string> renames = new Dictionary<string, string>(StringComparer.CurrentCultureIgnoreCase);
        var extensions = data.Configuration.Settings[ExtensionsKey].Split(',');
        foreach (string extension in extensions)
        {
            var parts = extension.Split('|');
            renames.Add(parts[0].Trim(), parts[1].Trim());
        }

        string[] files = new string[data.Input.Keys.Count];
        data.Input.Keys.CopyTo(files, 0);

        foreach (string file in files)
        {
            foreach (var rename in renames)
            {
                if (file.EndsWith(rename.Key, StringComparison.CurrentCultureIgnoreCase))
                {
                    SourceControlActions actions = data.Input[file];
                    data.Input.Remove(file);

                    string newKey = file.Substring(0, file.LastIndexOf('.')) + rename.Value;

                    data.Input.Add(newKey, actions);

                    break;
                }
            }
        }
    }
}

					

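This code follows the same pattern as the previous examples: read and parse the configuration, copy the input keys so the dictionary can be modified while iterating, and then swap each matching entry for one with the new extension.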
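To make the configuration format concrete, here is a small standalone sketch of the parsing step. The setting value used in the usage note is an example I've made up, and `RenameConfigExample` is a hypothetical name. Unlike the component above, this version reports a malformed entry with a FormatException instead of failing on a missing array index.

```csharp
using System;
using System.Collections.Generic;

public static class RenameConfigExample
{
    // Parses "old|new,old|new" into a case-insensitive lookup of renames.
    public static Dictionary<string, string> Parse(string setting)
    {
        var renames = new Dictionary<string, string>(StringComparer.CurrentCultureIgnoreCase);

        foreach (string entry in setting.Split(','))
        {
            string[] parts = entry.Split('|');
            if (parts.Length != 2)
            {
                throw new FormatException("Expected 'old|new' but found '" + entry + "'");
            }

            renames[parts[0].Trim()] = parts[1].Trim();
        }

        return renames;
    }
}
```

Calling `RenameConfigExample.Parse(".less|.css, .scss|.css")` gives a dictionary with two entries, both mapping to ".css".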

4: Getting the right files to deploy

Getting the relevant files to deploy is the next task to process. This requires knowing where the web project being deployed is located, in order to be able to adjust the paths returned by Source Control into the relative style of path required in a package:
public class ExtractFilesToDeploy : IPipelineComponent
{
    public const string WebProjectFolderKey = "ExtractFilesToDeploy.WebProjectFolder";

    public void Run(PipelineData data)
    {
        if (!data.Configuration.Settings.ContainsKey(WebProjectFolderKey))
        {
            data.Configuration.ThrowMissingConfigurationException(WebProjectFolderKey, "This should be the disk path to the web project folder.");
        }
        string webProjectFolder = data.Configuration.Settings[WebProjectFolderKey]; 
            

        var fs = new PackageModel.PackageSourceFiles();
        fs.Name = "Files to deploy";

        foreach (var item in data.Input)
        {
            if (!item.Key.EndsWith(".item", StringComparison.CurrentCultureIgnoreCase) && !item.Key.EndsWith(".cs", StringComparison.CurrentCultureIgnoreCase)  && item.Value.IsNotDelete())
            {
                fs.Add(item.Key.Replace(webProjectFolder, ""));
            }
        }

        data.Output.Sources.AddSource(fs);
    }
}

					

For each item in the source data we need to exclude anything that's a serialised item, a C# file or a deletion operation. For all the source changes that remain, the absolute Source Control Working Folder path is adjusted by removing the web project location in order to turn it into a relative path. All of these items are added to a PackageSourceFiles object, which is in turn added to the output package itself.
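The component uses string.Replace() for the path adjustment, which is case-sensitive and would also remove the folder text if it appeared later in the path. If that ever became a problem, one possible alternative is to strip the folder only when it is a prefix, ignoring case. This is just a sketch: `PathExample` and `ToPackagePath` are hypothetical names, and the paths in the usage note are invented.

```csharp
using System;

public static class PathExample
{
    // Strips the web project folder from the front of a working folder path.
    // Paths that do not start with the folder are returned unchanged.
    public static string ToPackagePath(string fullPath, string webProjectFolder)
    {
        if (fullPath.StartsWith(webProjectFolder, StringComparison.OrdinalIgnoreCase))
        {
            return fullPath.Substring(webProjectFolder.Length);
        }

        return fullPath;
    }
}
```

For example, `PathExample.ToPackagePath(@"C:\Work\Project\Website\layouts\main.aspx", @"c:\work\project\website")` returns `\layouts\main.aspx`.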

5: Getting Sitecore items

The key bit of this tool is to get changes to serialised items into the output package. The `ExtractItemsToDeploy` component does this for us:
public class ExtractItemsToDeploy : IPipelineComponent
{
    private ProjectConfiguration cfg;

    public void Run(PipelineData data)
    {
        cfg = data.Configuration;

        var pi = new PackageModel.PackageSourceItems();
        pi.Name = "Items to deploy";

        foreach (var item in data.Input)
        {
            if (item.Key.EndsWith(".item", StringComparison.CurrentCultureIgnoreCase) && item.Value.IsNotDelete())
            {
                try
                {
                    pi.Add(formatItemIdentifer(item.Key));
                }
                catch (FileNotFoundException)
                {
                    cfg.Log.WriteLine("Serialised Item not found for '" + item.Key + "' - Cannot add to package.");
                }
                    
            }
        }

        data.Output.Sources.AddSource(pi);
    }
}

					

This goes through each item in our input data from Source Control and looks for those which are serialised Sitecore items and are not marked as deletions. Using the items that match this pattern, a PackageSourceItems is constructed and added to the overall package. So far so simple.

But if you look at the XML in a set of Sitecore items you'll note that the definition of each item to package looks a bit odd compared to what we get from Source Control:

<xitems>
  ...
  <Entries>
    <x-item>/master/sitecore/system/Dictionary/ProjectName/Forms/Login/ce_Password_Strength/{DD5E504F-5FF9-477F-A2FB-B3905B76368C}/invariant/0</x-item>
    <x-item>/master/sitecore/system/Dictionary/ProjectName/Forms/Payment/ce_Payment_ChequeNext/{D6F0CC98-7FB2-4930-A42B-AABA89766AEB}/invariant/0</x-item>
  </Entries>
  ...
</xitems>

					

For each entry in the file, there's a path – but it looks a bit odd. It seems to start with the Sitecore database name, followed by the content tree path, and then followed by a GUID, an indication of the language code and finishing off with the version number.

So how can we generate that string of data when all we know is the disk path of the serialised item, and we don't have access to Sitecore's APIs? After a fair amount of head scratching I hit upon a solution: read the data from the serialised item file itself. The method formatItemIdentifer() above performs that process.

If we look at the content of one of those files, we see roughly the following:

----item----
version: 1
id: {DD5E504F-5FF9-477F-A2FB-B3905B76368C}
database: master
path: /sitecore/system/Dictionary/ChristiesEducation/Forms/Login/ce_Password_Strength
parent: {36777232-C3FA-4AB3-A7C7-4EA0161F8FEC}
name: ce_Password_Strength
master: {00000000-0000-0000-0000-000000000000}
template: {6D1CD897-1936-4A3A-A511-289A94C2A7B1}
templatekey: Dictionary entry

----field----
field: {580C75A8-C01A-4580-83CB-987776CEB3AF}
name: Key
key: key
content-length: 20

ce_Password_Strength
----version----
language: en
version: 1
revision: 37cb107f-7b88-4d66-93aa-4f0f8914c989

----field----
field: {2BA3454A-9A9C-4CDF-A9F8-107FD484EB6E}
name: Phrase
key: phrase
content-length: 68

At least 7 characters, where one or more is a punctuation character.

					

You can see that all the data we need is stored in the "item" section that begins the file: the database name, the GUID and the path. For the purposes of the prototype here I'm not bothered by specific item versions or languages, so we can fix those bits of the data to "invariant" for the language and "0" for the version in order to tell Sitecore to ignore those options when building the package.

Hence we can write a very simple parser to turn the data into a dictionary. It ignores any blank lines, and looks for lines marked with four hyphens – processing data until it reaches one of these regions marked with a title other than "item". And finally it breaks the remaining strings into name / value pairs:

private IDictionary<string, string> parseFile(StreamReader sr)
{
    Dictionary<string, string> data = new Dictionary<string, string>(StringComparer.CurrentCultureIgnoreCase);

    string line;

    // ReadLine() returns null at the end of the file, so stop there rather than looping forever
    while ((line = sr.ReadLine()) != null)
    {
        if (string.IsNullOrWhiteSpace(line))
        {
            continue;
        }

        if (line.StartsWith("----"))
        {
            string region = line.Replace("----", "");

            // Only the leading "item" region is needed, so stop at the first other region
            if (StringComparer.CurrentCultureIgnoreCase.Compare(region, "item") != 0)
            {
                break;
            }
        }
        else
        {
            // Split on the first colon only, so values that contain colons stay intact
            string[] parts = line.Split(new[] { ':' }, 2);
            if (parts.Length == 2)
            {
                data[parts[0].Trim()] = parts[1].Trim();
            }
        }
    }

    return data;
}

					

And we can generate the correct path data using code as follows:

private string formatItemIdentifer(string key)
{
    string identifier;

    using (var tr = File.OpenText(key))
    {
        var data = parseFile(tr);

        string db = data["database"];
        string id = data["id"];
        string path = data["path"];

        identifier = string.Format("/{0}{1}/{2}/invariant/0", db, path, id);
    }

    return identifier;
}

					

It takes the disk path from Source Control and opens it as a text file and parses it using the code above. And then it generates the correct path using the data extracted from the disk file.
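To see the two steps working together without needing a file on disk, here is a self-contained sketch that reads the header from any TextReader and builds the identifier. It is a condensed variation on the code above rather than the tool's actual implementation: `ItemIdentifierExample` is a hypothetical name, and it uses ordinal comparisons where the original uses culture-aware ones.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ItemIdentifierExample
{
    // Reads the leading "item" region and formats the package entry path.
    public static string Format(TextReader reader)
    {
        var header = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0)
            {
                continue;
            }

            if (line.StartsWith("----", StringComparison.Ordinal))
            {
                // Any region marker other than "item" ends the header.
                if (!line.Equals("----item----", StringComparison.OrdinalIgnoreCase))
                {
                    break;
                }
                continue;
            }

            // Split on the first colon only.
            string[] parts = line.Split(new[] { ':' }, 2);
            if (parts.Length == 2)
            {
                header[parts[0].Trim()] = parts[1].Trim();
            }
        }

        return string.Format("/{0}{1}/{2}/invariant/0",
            header["database"], header["path"], header["id"]);
    }
}
```

Passing a `StringReader` over the header lines of a serialised item should produce a string in the same shape as the x-item entries shown earlier: database, then path, then GUID, then "invariant/0".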

Now this approach works well, and solves the core problem of not having access to Sitecore – however it requires that the data exists on disk. Hence this makes it a requirement of the tool that the current state of your working folder includes all the correct .item files. If you don't do a "get-latest" to ensure the files are available then you will see errors from the tool – FileNotFoundExceptions.

Another possible thing to do with this tool in the future might be to add the ability to perform this get operation automatically. However I'm thinking of this tool being used in a build workflow where this job should already have been performed in order to perform the build itself.

6: Adding binary files to deploy

The last thing which needs to be added to the package is the binaries generated from the code in the web package. The `ExtractBinariesToDeploy` component does this. The configuration for this is a bit more complex, as it needs to be a set of project names, with their accompanying set of binaries. Currently that's a comma separated list, where each item is a pipe-separated list of one project followed by a number of binary files. The following bit of code parses the data out of the config setting into a Dictionary containing a list of strings:
private Dictionary<string, List<string>> extractProjectPathMap(PipelineData data)
{
	Dictionary<string, List<string>> projectPathMap = new Dictionary<string, List<string>>(StringComparer.CurrentCultureIgnoreCase);
	var projectsCfg = data.Configuration.Settings[ProjectPathMapKey].Split(',');
	foreach (string project in projectsCfg)
	{
		string[] parts = project.Split('|');

		List<string> files = new List<string>();
		for (int i = 1; i < parts.Length; i++)
		{
			files.Add(parts[i]);
		}

		projectPathMap.Add(parts[0], files);
	}

	return projectPathMap;
}

					

That code is used by the core of the pipeline component to parse the configuration before it starts to process the individual items. It goes through the set of projects configured, and for each one, it checks to see if it can find any C# source changes in the input data. For any configured projects that do have changes, it adds the relevant binaries to the package:

public class ExtractBinariesToDeploy : IPipelineComponent
{
    public const string ProjectPathMapKey = "ExtractBinariesToDeploy.ProjectPathMap";

    public void Run(PipelineData data)
    {
        var fs = new PackageModel.PackageSourceFiles();
        fs.Name = "Binaries to deploy";

        if (!data.Configuration.Settings.ContainsKey(ProjectPathMapKey))
        {
            data.Configuration.ThrowMissingConfigurationException(ProjectPathMapKey, "A comma separated list of settings. Each one is a pipe separated list of items. The first part is the project path. All subsequent parts are binaries to add if C# files in the project have been changed.");
        }

        var projectPathMap = extractProjectPathMap(data);
        var projects = projectPathMap.Keys.ToList();

        foreach (var item in data.Input)
        {
            if (item.Key.EndsWith(".cs", StringComparison.CurrentCultureIgnoreCase))
            {
                foreach (string project in projects.ToArray())
                {
                    if (item.Key.CaseInsensitiveContains(project))
                    {
                        projects.Remove(project);

                        foreach (string file in projectPathMap[project])
                        {
                            fs.Add(file);
                        }
                    }
                }

                if (projects.Count == 0)
                {
                    break;
                }
            }
        }

        data.Output.Sources.AddSource(fs);
    }
}

					

And finally this adds a PackageSourceFiles to the output package that holds all of the binaries that it was necessary to add.

7: Dealing with deletions

Since there's no direct approach to deleting individual files or items inside a Sitecore package, for the moment I'm addressing deletions by adding them as an instruction to the ReadMe section of the package metadata:
public class ExtractDeletionsToDeploy : IPipelineComponent
{
    public void Run(PipelineData data)
    {
        bool deletionsFound = false;

        StringBuilder sb = new StringBuilder();
        sb.Append("The following items require deletion:\r\n");

        foreach (var item in data.Input)
        {
            if ( (item.Value & SourceControlActions.Delete) == SourceControlActions.Delete)
            {
                deletionsFound = true;
                sb.Append(item.Key);
                sb.Append("\r\n");
            }
        }

        if (deletionsFound)
        {
            data.Output.Metadata.ReadMe = sb.ToString();
        }
    }
}

					

I'm considering the idea that a post-deployment event might be able to automate the removal of these items in the future.
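The bitwise test in this component, and the IsNotDelete() calls in the earlier ones, both rely on SourceControlActions being a flags-style enum. Its definition isn't part of this post, so the following is only a sketch of how the pieces could fit together. The enum members and their values are assumptions, as is the `IsDelete()` helper.

```csharp
using System;

// Hypothetical stand-in for the tool's enum; the member names and values are assumptions.
[Flags]
public enum SourceControlActions
{
    None = 0,
    Add = 1,
    Edit = 2,
    Delete = 4
}

public static class SourceControlActionsExtensions
{
    // True when the Delete bit is set, whatever other actions are combined with it.
    public static bool IsDelete(this SourceControlActions actions)
    {
        return (actions & SourceControlActions.Delete) == SourceControlActions.Delete;
    }

    // The inverse test used by the components that skip deleted entries.
    public static bool IsNotDelete(this SourceControlActions actions)
    {
        return !actions.IsDelete();
    }
}
```

Because the values are combined as bit flags, a change recorded as both an edit and a delete still counts as a deletion under this test.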

Next step...

So that's the set of pipeline components I'm using for my prototype – but as previously suggested, the code is designed to be extensible to allow them to be improved or replaced in future work.

Next week's post will be the last part of this series, I think: a wrap-up and the final source code.
