Jeremy Davis
Sitecore, C# and web development
Article printed from: https://blog.jermdavis.dev/posts/2014/custom-sitemap-filespart-three

Custom Sitemap Files – Part Three

Published 20 May 2014
Updated 25 August 2016
C# Sitecore ~3 min. read
This is post 3 of 4 in a series titled Custom Sitemap Files

The third part of this series looks at how we can add images to our XML Sitemap files. The first two posts covered the configuration and the basic code to get entries into Sitemap files.

Getting images into the sitemap requires two things: first, some rules specifying which images to include, and second, some code to extract those images from the content and write them into the sitemap files. The code to deal with images that are specified in fields on the web page item is easy, but we also need to deal with the situation where the image is referred to by a component that has been dynamically bound to the page.

The configuration for an "image field on the current page" reference is pretty easy:

Image field template

We declare a new template for the configuration of these images. Each instance of this will specify the name of a Sitecore Image field that we'll try to load the image from. And we configure the Sitemap File template to allow image configuration to be added as children. We can then add instances of this template to our test config:

Image field settings

Now, for images that are in the data sources of components on our page we need a little more data:

Component image template

This template will be used in the same way (to create items that are children of our sitemap config) but here we let the user choose a component type to look for as well as the name of the field to find in its data source. And we can set that up like so:

Component image settings

With that configuration in place (and some extra test data) we now need some code to process it. That splits into three pieces: additions to the configuration classes we defined to represent the config items, new code to extract the image data, and finally the extra code to serialise the data.

The class to extract the image configuration into is fairly simple:

public class ImageConfiguration
{
    public bool IsFieldImage { get; private set; }
    public string ImageFieldName { get; private set; }
    public string ImageComponent { get; private set; }

    public ImageConfiguration(SiteConfiguration sc, Item img)
    {
        if (img.TemplateID == Identifiers.SitemapComponentImageTemplateID)
        {
            IsFieldImage = false;

            ImageFieldName = img.Fields[Identifiers.SitemapImageFieldName_ComponentBasedFieldID].Value;

            string cmpID = img.Fields[Identifiers.SitemapImageComponentFieldID].Value;
            if (!string.IsNullOrWhiteSpace(cmpID))
            {
                ImageComponent = cmpID.ToUpperInvariant();
            }
        }
        if (img.TemplateID == Identifiers.SitemapFieldImageTemplateID)
        {
            IsFieldImage = true;

            ImageFieldName = img.Fields[Identifiers.SitemapImageFieldName_FieldBasedFieldID].Value;
        }
    }
}

					

And we can extend the configuration class for each sitemap to include a set of these images:

public class SiteConfiguration
{
    public string SitemapFilename { get; private set; }
    public string SitemapSourceDatabaseName { get; private set; }
    public Database SitemapSourceDatabase { get; private set; }
    public ID SitemapRootItem { get; private set; }
    public IEnumerable<string> SitemapIncludeLanguages { get; private set; }
    public IEnumerable<ID> SitemapIncludeTemplates { get; private set; }
    public IEnumerable<ImageConfiguration> ImageExtraction { get; private set; }

    private Language getLanguage(string name)
    {
        Language l;

        if (Language.TryParse(name, out l))
        {
            return l;
        }

        return null;
    }

    public SiteConfiguration(Item siteItem)
    {
        SitemapFilename = siteItem.Fields[Identifiers.SitemapFilenameFieldID].Value;

        SitemapSourceDatabaseName = siteItem.Fields[Identifiers.SitemapSourceDatabaseFieldID].Value;

        SitemapSourceDatabase = Sitecore.Configuration.Factory.GetDatabase(SitemapSourceDatabaseName);

        ID rootItem = ID.Null;
        ID.TryParse(siteItem.Fields[Identifiers.SitemapRootItemFieldID].Value, out rootItem);
        SitemapRootItem = rootItem;

        SitemapIncludeLanguages = siteItem.Fields[Identifiers.SitemapIncludeLanguagesFieldID].Value.Split('|')
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .Select(s => ID.Parse(s))
            .Select(i => SitemapSourceDatabase.GetItem(i))
            .Select(l => l.Name);

        SitemapIncludeTemplates = siteItem.Fields[Identifiers.SitemapIncludeTemplatesFieldID].Value.Split('|')
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .Select(s => ID.Parse(s));

        var imageExtraction = siteItem.Axes.SelectItems(string.Format("./*[@@templateid='{0}' or @@templateid='{1}']", Identifiers.SitemapFieldImageTemplateID, Identifiers.SitemapComponentImageTemplateID));
        if (imageExtraction != null && imageExtraction.Length > 0)
        {
            ImageExtraction = imageExtraction
                .Select(i => new ImageConfiguration(this, i));
        }
        else
        {
            ImageExtraction = new List<ImageConfiguration>();
        }
    }
}

					

And we can extend the classes we use to store our sitemap data before we serialise it too – by adding a class to store and serialise image data:

public class SitemapUrl
{
    private List<SitemapImage> _images = new List<SitemapImage>();

    public IEnumerable<SitemapImage> Images { get { return _images; } }

    public void Add(SitemapImage img)
    {
        _images.Add(img);
    }

    public void AddRange(IEnumerable<SitemapImage> imgs)
    {
        _images.AddRange(imgs);
    }

    public string Location { get; set; }
    public DateTime? LastModified { get; set; }
    public ChangeFrequency? ChangeFrequency { get; set; }
    public Single? Priority { get; set; }

    public XElement Serialise()
    {
        XElement root = new XElement("url");

        root.Add(new XElement("loc", Location));
        
        foreach(SitemapImage img in _images)
        {
            root.Add(img.Serialise());
        }

        if(LastModified.HasValue)
        {
            root.Add(new XElement("lastmod", LastModified.Value.ToString("yyyy-MM-dd")));
        }

        if (ChangeFrequency.HasValue)
        {
            root.Add(new XElement("changefreq", ChangeFrequency.Value.ToString().ToLower()));
        }

        if (Priority.HasValue)
        {
            root.Add(new XElement("priority", Priority.Value.ToString("0.0")));
        }

        return root;
    }
}

public class SitemapImage
{
    public string Location { get; set; }
    public string Caption {get;set;}
    public string GeoLocation {get;set;}
    public string Title {get;set;}
    public string License {get;set;}

    public static readonly XNamespace Namespace = "http://www.google.com/schemas/sitemap-image/1.1";

    public XElement Serialise()
    {
        XElement img = new XElement(Namespace + "image");

        img.Add(new XElement(Namespace + "loc", Location));

        if (!string.IsNullOrWhiteSpace(Caption))
        {
            img.Add(new XElement(Namespace + "caption", Caption));
        }

        if(!string.IsNullOrWhiteSpace(GeoLocation))
        {
            img.Add(new XElement(Namespace + "geo_location", GeoLocation));
        }

        if(!string.IsNullOrWhiteSpace(Title))
        {
            img.Add(new XElement(Namespace + "title", Title));
        }

        if(!string.IsNullOrWhiteSpace(License))
        {
            img.Add(new XElement(Namespace + "license", License));
        }

        return img;
    }
}
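
The image elements only come out with the `image:` prefix if that prefix is declared somewhere above them in the document. As a minimal, standalone sketch of how LINQ to XML handles this (the class name, the `Build()` method and the example URLs are made up for illustration, and putting `url` and `loc` directly into the sitemap namespace is an assumption for this demo, not necessarily how the series' code builds the root element):

```csharp
using System;
using System.Xml.Linq;

public static class ImageNamespaceDemo
{
    // Namespace URIs as they appear in the sitemap output for this series.
    private static readonly XNamespace Sitemap = "http://www.sitemaps.org/schemas/sitemap/0.9";
    private static readonly XNamespace Image = "http://www.google.com/schemas/sitemap-image/1.1";

    public static XElement Build()
    {
        return new XElement(Sitemap + "urlset",
            // Declaring the prefix once on the root lets every child in the
            // image namespace serialise as <image:...> instead of carrying
            // its own xmlns attribute.
            new XAttribute(XNamespace.Xmlns + "image", Image.NamespaceName),
            new XElement(Sitemap + "url",
                new XElement(Sitemap + "loc", "http://test/en/Home"),
                new XElement(Image + "image",
                    new XElement(Image + "loc", "http://test/~/media/Images/Example.ashx"))));
    }

    public static void Main()
    {
        // Writes the small document to the console so the prefixes can be inspected.
        Console.WriteLine(Build());
    }
}
```

Without the `XAttribute` line the XML is still valid, but each image element gets its own namespace declaration, which makes the file noisier to read.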

					

And with all that in place, we can start updating the code to generate the data. According to the schema definition for Sitemap XML, the image data goes into the url element, so we can add the code to the place in last week's code where we were generating that bit of data – the processLanguage() method. All we change here is to add a call to a new processImages() method and add the results to the images collection in the SitemapUrl object:

private SitemapUrl processLanguage(SiteConfiguration sc, Item item, Language l)
{
    SitemapUrl url = new SitemapUrl();

    var uo = new Sitecore.Links.UrlOptions()
    {
        AlwaysIncludeServerUrl = true,
        LanguageEmbedding = Sitecore.Links.LanguageEmbedding.AsNeeded,
        ShortenUrls = true,
        Language = l
    };

    url.Location = Sitecore.Links.LinkManager.GetItemUrl(item, uo);

    // process images
    url.AddRange(processImages(sc, item, l));

    DateField df = (DateField)item.Fields[Identifiers.__UpdatedFieldID];
    url.LastModified = df.DateTime;

    if (item.Fields.Contains(Identifiers.SitemapPriorityFieldID))
    {
        float f;
        if (float.TryParse(item.Fields[Identifiers.SitemapPriorityFieldID].Value, out f))
        {
            url.Priority = f;
        }
    }

    if (item.Fields.Contains(Identifiers.SitemapChangeFrequencyFieldID))
    {
        ChangeFrequency cf;
        if (Enum.TryParse(item.Fields[Identifiers.SitemapChangeFrequencyFieldID].Value, true, out cf))
        {
            url.ChangeFrequency = cf;
        }
    }

    return url;
}

					

The new method needs to process all the config we've set up for images, and generate the relevant data:

private IEnumerable<SitemapImage> processImages(SiteConfiguration sc, Item item, Language l)
{
    List<SitemapImage> imgs = new List<SitemapImage>();

    foreach (ImageConfiguration icfg in sc.ImageExtraction)
    {
        if (icfg.IsFieldImage)
        {
            SitemapImage img = makeImg(icfg, item);
            if (img != null)
            {
                imgs.Add(img);
            }
        }
        else
        {
            string xml = LayoutField.GetFieldValue(item.Fields["__Renderings"]);
            LayoutDefinition ld = LayoutDefinition.Parse(xml);

            DeviceDefinition deviceDef = ld.GetDevice("{FE5D7FDF-89C0-4D99-9AA3-B5FBD009C9F3}");

            foreach(RenderingDefinition renderingDef in deviceDef.GetRenderings(icfg.ImageComponent))
            {
                Item itm = sc.SitemapSourceDatabase.GetItem(renderingDef.Datasource);
                if (itm != null)
                {
                    SitemapImage img = makeImg(icfg, itm);
                    if (img != null)
                    {
                        imgs.Add(img);
                    }
                }
            }                    
        }
    }

    return imgs;
}

					

So, if the image configuration says "get the image from a field on the current item" then we just read the value from that field. But for the "get the image from a component data source" config we need to do a bit more work.

Sitecore stores its information about page layouts inside the "__Renderings" field on an item. This contains some XML that describes what's called a Layout Delta. (All that is explained here) To find out which components are on the page, we ask a couple of helper classes to fetch the layout XML for us and then parse it into Sitecore's object model for this data. Then we need to specify what device we're using. Here I've hard-coded it to use the "Default" device's ID. In reality you'd probably want that to be configurable in some way. Once we have a device, we can call GetRenderings() on it to get back a list of any components being displayed that make use of the UI component defined by the ID we pass in, and that ID comes from our configuration item. (Back in the config above, we picked the component with a Droptree, so the value we have here is a GUID.) We then iterate the set of RenderingDefinition objects we get back, and try to add an image item for each one.
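
To make the lookup a bit more concrete, here is a rough sketch of the same idea using plain LINQ to XML instead of Sitecore's classes. It is not what GetRenderings() does internally. It assumes a simplified layout XML where devices are `d` elements, renderings are `r` elements, and the `id` and `ds` attributes hold the rendering ID and data source; the rendering and data source IDs in the sample are invented, and only the device ID comes from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LayoutXmlSketch
{
    // Returns the data source of every rendering of the given type on the
    // given device. Element and attribute names are assumptions about the
    // layout XML format, not a documented contract.
    public static List<string> FindDatasources(string layoutXml, string deviceId, string renderingId)
    {
        XElement root = XElement.Parse(layoutXml);

        return root.Elements("d")
            .Where(d => SameId((string)d.Attribute("id"), deviceId))
            .SelectMany(d => d.Elements("r"))
            .Where(r => SameId((string)r.Attribute("id"), renderingId))
            .Select(r => (string)r.Attribute("ds"))
            // Renderings without a data source have nothing to read an image from.
            .Where(ds => !string.IsNullOrWhiteSpace(ds))
            .ToList();
    }

    // GUID strings can differ in case, so compare them case-insensitively.
    private static bool SameId(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string xml =
            "<r>" +
            "<d id=\"{FE5D7FDF-89C0-4D99-9AA3-B5FBD009C9F3}\">" +
            "<r id=\"{11111111-1111-1111-1111-111111111111}\" ds=\"{AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA}\" />" +
            "<r id=\"{22222222-2222-2222-2222-222222222222}\" ds=\"{BBBBBBBB-BBBB-BBBB-BBBB-BBBBBBBBBBBB}\" />" +
            "<r id=\"{11111111-1111-1111-1111-111111111111}\" />" +
            "</d>" +
            "</r>";

        // Only the first rendering matches the ID and has a data source.
        foreach (string ds in FindDatasources(xml,
            "{FE5D7FDF-89C0-4D99-9AA3-B5FBD009C9F3}",
            "{11111111-1111-1111-1111-111111111111}"))
        {
            Console.WriteLine(ds);
        }
    }
}
```

The case-insensitive comparison is the reason the configuration class upper-cases the component ID: it avoids missing a match just because two GUID strings were formatted differently.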

Generating the sitemap data for the image happens in the makeImg() method:

private SitemapImage makeImg(ImageConfiguration icfg, Item item)
{
    var mo = new Sitecore.Resources.Media.MediaUrlOptions()
    {
        AlwaysIncludeServerUrl = true,
        AbsolutePath = true
    };

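    ImageField img = (ImageField)item.Fields[icfg.ImageFieldName];
    // MediaItem is null when the field points at media that no longer exists, so skip those.
    if (img != null && !string.IsNullOrWhiteSpace(img.Value) && img.MediaItem != null)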
    {
        var si = new SitemapImage();
        si.Location = Sitecore.Resources.Media.MediaManager.GetMediaUrl(img.MediaItem, mo);
            
        string caption = img.MediaItem.Fields[Identifiers.DescriptionFieldID].Value;
        if (!string.IsNullOrWhiteSpace(caption))
        {
            si.Caption = caption;
        }

        string title = img.MediaItem.Fields[Identifiers.TitleFieldID].Value;
        if (!string.IsNullOrWhiteSpace(title))
        {
            si.Title = title;
        }

        return si;
    }

    return null;
}

					

That uses the MediaManager to generate the URL for the image, extracts a few properties and returns our data item for serialising.

So when we wire all that up, and re-generate our sitemap XML we get:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://test/en/sitecore/content/Home</loc>
    <image:image>
      <image:loc>http://test/~/media/System/Simulator Backgrounds/Android Phone.ashx</image:loc>
    </image:image>
    <image:image>
      <image:loc>http://test/~/media/System/Simulator Backgrounds/Blackberry.ashx</image:loc>
    </image:image>
    <lastmod>2014-10-05</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>http://test/ja-JP/sitecore/content/Home</loc>
    <image:image>
      <image:loc>http://test/~/media/System/Simulator Backgrounds/Blackberry.ashx</image:loc>
    </image:image>
    <lastmod>2013-09-11</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>http://test/en/sitecore/content/Home/Global shared content/Sample</loc>
    <lastmod>2014-29-21</lastmod>
  </url>
  <url>
    <loc>http://test/en/sitecore/content/Home/Sample Page</loc>
    <image:image>
      <image:loc>http://test/~/media/Images/HTC_000004.ashx</image:loc>
      <image:caption>A quick snapshot</image:caption>
      <image:title>A view across some countryside</image:title>
    </image:image>
    <lastmod>2014-12-03</lastmod>
  </url>
</urlset>

					

And there we have it. Configurable sitemaps that can include image data as well as the more usual stuff.
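
If you want a quick sanity check on a generated file, one option is to read it back and count the image entries per page. This is only a sketch: the class and method names are hypothetical, and the sample is a trimmed copy of the output above rather than a real file on disk.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SitemapReadBack
{
    private static readonly XNamespace Sitemap = "http://www.sitemaps.org/schemas/sitemap/0.9";
    private static readonly XNamespace Image = "http://www.google.com/schemas/sitemap-image/1.1";

    // Trimmed copy of the sample output: one page with two images, one with none.
    public const string Sample = @"<urlset xmlns=""http://www.sitemaps.org/schemas/sitemap/0.9"" xmlns:image=""http://www.google.com/schemas/sitemap-image/1.1"">
  <url>
    <loc>http://test/en/sitecore/content/Home</loc>
    <image:image><image:loc>http://test/~/media/System/Simulator Backgrounds/Android Phone.ashx</image:loc></image:image>
    <image:image><image:loc>http://test/~/media/System/Simulator Backgrounds/Blackberry.ashx</image:loc></image:image>
  </url>
  <url>
    <loc>http://test/en/sitecore/content/Home/Global shared content/Sample</loc>
  </url>
</urlset>";

    // Counts the image entries under the url element whose loc matches pageUrl.
    public static int CountImages(XElement urlset, string pageUrl)
    {
        return urlset.Elements(Sitemap + "url")
            .Where(u => (string)u.Element(Sitemap + "loc") == pageUrl)
            .SelectMany(u => u.Elements(Image + "image"))
            .Count();
    }

    public static void Main()
    {
        XElement urlset = XElement.Parse(Sample);

        Console.WriteLine(CountImages(urlset, "http://test/en/sitecore/content/Home"));
        Console.WriteLine(CountImages(urlset, "http://test/en/sitecore/content/Home/Global shared content/Sample"));
    }
}
```

For a real file you would swap `XElement.Parse(Sample)` for `XElement.Load()` with the path to the generated sitemap.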

If you're interested in playing with this yourself you can download a package file of the Sitecore templates and example config and a zip file of the example C# files to put together yourself and play with. But please remember that this is just proof-of-concept code, so it's not sensible to use it in production as-is.

I think I've got one more post to make on this topic – next will be some thoughts on how you can address sitemap generation on very large sites, where performance can be an issue.
