Jeremy Davis
Sitecore, C# and web development
Article printed from: https://blog.jermdavis.dev/posts/2019/reformatting-config-xml-so-its-easier-to-diff

Reformatting config XML so it's easier to diff

Published 30 September 2019

Every so often pretty much every developer ends up in a situation where they're looking at a bug that manifests on one platform, but not on another. The sort of bug where you end up spending hours looking through log and config files for a subtle difference. I found myself looking into just this sort of bug recently, but on a site where (to my frustration) the config files were full of comments and whitespace differences across platforms that made diffing really hard**. Spotting that subtle bug-causing difference is pretty much impossible when your diff is full of noise... So how can we fix that?

With some PowerShell, of course!

I realised life would be much simpler for me if I could standardise the formatting and drop all the comments from the config. That way any differences that the diff tool showed up would be actual configuration differences, and not any of the annoying noise. So I hacked up some code to iterate a tree of .config files and reformat them for me...

Via the System.Xml.XmlDocument object we can load a config file while ignoring comments, and then write it back to disk with standardised indenting with a function like this:

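function tidyConfigFile
{
    param(
        [string]$file
    )

    Write-Host "Modifying $file"

    # Tell the reader to skip comment nodes while parsing
    $settings = New-Object System.Xml.XmlReaderSettings
    $settings.IgnoreComments = $true

    $reader = [System.Xml.XmlReader]::Create($file, $settings)

    try
    {
        $xml = New-Object System.Xml.XmlDocument
        $xml.Load($reader)
    }
    finally
    {
        # Always release the file handle, even if parsing fails, so the file isn't left locked
        $reader.Dispose()
    }

    # PreserveWhitespace is false by default, so Save() writes the file back with standard indenting
    $xml.Save($file)
}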

					

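The only thing of note there is that you need to use an XmlReader with IgnoreComments set to get the XmlDocument to strip out comments when it's loading. Loading the file straight into the XmlDocument would keep them. The re-indenting comes for free: PreserveWhitespace is false by default, so the original layout is discarded on load and Save() writes the document back out with standard indenting.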
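
To see the effect of that setting in isolation, here's a minimal sketch that parses the same XML from a string with and without IgnoreComments and counts the comment nodes that survive. The Get-CommentCount function and the sample text are made up for illustration, they're not part of the script above:

```powershell
# Sample XML with one comment at document level and one inside the root element
$sample = @'
<?xml version="1.0" ?>
<!-- hello -->
<alpha>
  <!-- inner -->
  <test x="y"/>
</alpha>
'@

# Hypothetical helper: load XML text and report how many comment nodes were kept
function Get-CommentCount
{
    param(
        [string]$xmlText,
        [bool]$ignoreComments
    )

    $settings = New-Object System.Xml.XmlReaderSettings
    $settings.IgnoreComments = $ignoreComments

    $textReader = New-Object System.IO.StringReader($xmlText)
    $reader = [System.Xml.XmlReader]::Create($textReader, $settings)

    try
    {
        $xml = New-Object System.Xml.XmlDocument
        $xml.Load($reader)
    }
    finally
    {
        $reader.Dispose()
    }

    # comment() is the XPath node test that matches comment nodes
    return $xml.SelectNodes("//comment()").Count
}

Get-CommentCount $sample $false   # comments kept: 2
Get-CommentCount $sample $true    # comments dropped: 0
```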

So that will take some messy XML like:

<?xml version="1.0" ?>
<!-- hello -->
<alpha>



<test x="y"/>

</alpha>

					

and neaten it up to:

<?xml version="1.0"?>
<alpha>
  <test x="y" />
</alpha>

					

So a little recursive function to step through a tree of folders, updating any config files it finds, will sort out the rest of the task:

function processFolder
{
    param(
        [string]$directory
    )

    Write-Host "Processing $directory"

    # -File stops a folder whose name happens to end in ".config" being treated as a file
    $files = Get-ChildItem $directory -Filter *.config -File | Select-Object -ExpandProperty FullName

    foreach($file in $files)
    {
        tidyConfigFile $file
    }

    $children = Get-ChildItem $directory -Directory | Select-Object -ExpandProperty FullName

    foreach($child in $children)
    {
        processFolder $child
    }
}
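
If you'd rather not write the recursion yourself, Get-ChildItem can walk the tree for you with -Recurse. This is only a sketch of that alternative, not what I actually ran, and Get-ConfigFilePaths is a made-up name:

```powershell
# Hypothetical helper: returns the full path of every .config file under $directory
function Get-ConfigFilePaths
{
    param(
        [string]$directory
    )

    # -Recurse descends into subfolders, -File skips any folders that match the filter
    Get-ChildItem -Path $directory -Filter *.config -File -Recurse |
        Select-Object -ExpandProperty FullName
}

# Usage, assuming tidyConfigFile from above is already defined in the session:
# Get-ConfigFilePaths ".\CopiedConfig" | ForEach-Object { tidyConfigFile $_ }
```

The trade-off is that you lose the per-folder "Processing" message that the recursive version prints.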

					

So with that in place, I could just put copies of my two sets of config in one folder, and let the script go...

[Image: Copied Config]

processFolder ".\CopiedConfig"

					

To get myself a set of files that were much easier to diff.
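
Once both sets are tidied, a quick way to narrow down where to point the diff tool is to list only the files whose content differs. Here's a rough sketch, assuming the two sets sit in separate subfolders (the SiteA and SiteB names are invented for the example) and that Get-FileHash is available, which needs PowerShell 4 or later. Get-ChangedConfigFiles is a made-up name, and it only walks the left-hand tree, so files that exist only on the right aren't reported:

```powershell
# Hypothetical helper: reports .config files that differ between two folder trees
function Get-ChangedConfigFiles
{
    param(
        [string]$leftRoot,
        [string]$rightRoot
    )

    # Work with absolute file system paths so the relative path can be cut out reliably
    $leftRoot = (Resolve-Path $leftRoot).ProviderPath
    $rightRoot = (Resolve-Path $rightRoot).ProviderPath

    $leftFiles = Get-ChildItem -Path $leftRoot -Filter *.config -File -Recurse

    foreach($file in $leftFiles)
    {
        # Path of this file relative to the left root, without a leading separator
        $relative = $file.FullName.Substring($leftRoot.Length).TrimStart('\', '/')
        $other = Join-Path $rightRoot $relative

        if(-not (Test-Path -LiteralPath $other))
        {
            "Only in left: $relative"
        }
        elseif((Get-FileHash -LiteralPath $file.FullName).Hash -ne (Get-FileHash -LiteralPath $other).Hash)
        {
            "Different: $relative"
        }
    }
}

# Usage (folder names are assumptions):
# Get-ChangedConfigFiles ".\CopiedConfig\SiteA" ".\CopiedConfig\SiteB"
```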

Simple...

** – Yes, I know full well the fact there are differences like this is a massive code smell. But sometimes when you're the support developer on projects with little budget you just have to hold your nose and make it work, rather than spending days reworking the entire deployment process...