I suspect a fairly common scenario for Sitecore developers is launching a new site which replaces an existing one with a shiny new design and content structure. On these projects, whoever is in charge of SEO will usually want redirects in place from important old URLs to their new equivalents. These redirects ensure that users who have bookmarks to the old pages don't see 404s, and they help to keep the search engine rankings which had been acquired by the old site.
Another common scenario these days is for new websites to serve all of their pages under HTTPS, rather than just the "sensitive" pages as we might have done in the past.
When you combine these two needs, you can end up with more complicated redirection rules than you might have needed in the past. If you're planning to make use of the Url Redirect module from the Sitecore Marketplace, my experiences doing this might be of help to you:
We need zero or one of these rules to trigger for any request arriving at the server.
In my case, detecting a URL which requires a redirect needs an expression based on the pattern:
^the-url-to-redirect/?$
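Which translates as: the start of the path, then the literal text the-url-to-redirect, then an optional trailing slash, then the end of the path.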
When constructing these rules you need to remember that you don't see the protocol, host name or querystring at this point. If you're not confident with regular expressions, a testing tool such as the .Net Regex Tester can help you get the syntax right. This is generally easier than trying to debug the expressions by running redirect rules in Sitecore.
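If you'd rather check a pattern in a few lines of script than in an online tester, something like the sketch below works. It uses Python's `re` module rather than .NET's regex engine (the two agree for simple patterns like this one), and it assumes the rule is matched against the bare path with no leading slash, which is what the pattern above implies. The function name and the sample paths are made up for illustration.

```python
import re

# The pattern from the text: anchored at both ends, optional trailing slash.
OLD_URL = re.compile(r"^the-url-to-redirect/?$")


def needs_redirect(path: str) -> bool:
    """Return True if the bare path (no protocol, host or querystring) matches."""
    return OLD_URL.search(path) is not None


# Illustrative paths: the first two should match, the last two should not.
for candidate in [
    "the-url-to-redirect",
    "the-url-to-redirect/",
    "the-url-to-redirect/child",
    "prefix/the-url-to-redirect",
]:
    print(candidate, needs_redirect(candidate))
```

The two anchors are what stop the rule firing for child pages or for longer URLs that merely contain the old path.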
Your rule item probably ends up looking a bit like this:
If this matches, you then need a Redirect item to describe what should happen.
Here, we need to specify a few things:
The first two are dealt with by the replacement expression that we define in the Rewrite URL field. This will look something like:
https://{HTTP_HOST}/my-new-site-url
The {HTTP_HOST} token will be replaced with whatever the host was in the original request. Using this approach rather than hard-coding the host allows the rules to be tested on non-production servers prior to deployment.
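To make the benefit concrete, here is a minimal sketch of what that token substitution amounts to. It is not the module's code: `expand_host` is a hypothetical helper and the host names are invented, but it shows how one rule produces the right target on each environment.

```python
def expand_host(template: str, request_host: str) -> str:
    """Replace the {HTTP_HOST} token with the host from the incoming request."""
    return template.replace("{HTTP_HOST}", request_host)


TEMPLATE = "https://{HTTP_HOST}/my-new-site-url"

# Hypothetical hosts: the same rule works on QA and on production.
print(expand_host(TEMPLATE, "qa.example.com"))
print(expand_host(TEMPLATE, "www.example.com"))
```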
The third and fourth bullets are dealt with by the Redirect Type field and the Stop processing of subsequent rules checkbox.
So your redirect item will look something like:
You'll need to follow that pattern to create a set of rules for redirecting the old URLs that your SEO people require. You may find that using folders to organise the rules for sub-levels of the old site will help keep things neater. Folders are ignored by the logic which processes the rules, so they just help you organise rules.
First of all, the Inbound Rule here needs to match everything, as we only get here if no other rule has matched. The example rules that ship with the module use (.*) for this purpose. The brackets here are required to make a "group" in Regular Expression terms: basically some characters we're going to want to be able to fetch later.
Secondly, you need to add a Condition to qualify the Inbound Rule. We've already told it to "match any path" but we only want this rule to trigger if the request is made under HTTP. The data needed for that is as follows:
That means the rule only triggers when the flag saying "is the request under HTTPS" is false.
And finally, we need the redirect:
The only difference here is the use of the {R:1} token in the target URL. This fetches back the group we matched above, so we can paste the original requested URL into our re-written request.
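Putting the three pieces of this catch-all rule together, the logic can be sketched roughly as follows. This is an illustration only: the function name is hypothetical, and the target template `https://{HTTP_HOST}/{R:1}` is my assumption about what the redirect target looks like, not something copied from the module.

```python
import re
from typing import Optional

CATCH_ALL = re.compile(r"(.*)")       # inbound rule: match any path, capture it
TARGET = "https://{HTTP_HOST}/{R:1}"  # assumed shape of the redirect target


def https_redirect(path: str, host: str, is_https: bool) -> Optional[str]:
    """Return the redirect URL for a plain-HTTP request, or None if already HTTPS."""
    if is_https:
        # The condition fails, so the rule does not trigger.
        return None
    match = CATCH_ALL.match(path)
    # {R:1} is the first captured group, i.e. the whole requested path.
    return TARGET.replace("{HTTP_HOST}", host).replace("{R:1}", match.group(1))


print(https_redirect("some/page", "www.example.com", is_https=False))
print(https_redirect("some/page", "www.example.com", is_https=True))
```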
But when I'd finished testing this and deployed it to my QA infrastructure, it stopped working as expected. Suddenly I was seeing the rules firing in such a way that two redirects were required. The generic HTTPS rule would trigger first for my test, followed by the specific rule.
Cue a few hours spent with the source code from GitHub and the trusty Visual Studio debugger...
The reason, it turns out, is that the rules are fetched by a query when the module starts up, and that query is based on the item template, not the content structure. The order items come back in isn't really defined here, but is most likely related to the order they exist in the underlying database. On my dev machine, I'd created them in the correct order so by luck they were working there, but copying them across to the QA site via a package hadn't maintained that ordering. Hence the list of rules in memory ended up in a different order to the rules in the content tree, because the tree is always sorted by the __sortorder field.
I ended up solving my problem by adding "sort by __sortorder" into the redirect module code, so that each time a rule is updated the cached rule-set in memory is re-sorted. The change set is available via github if you find yourself in a similar situation, and I've submitted a Pull Request for this and some other minor changes. So hopefully that behaviour (or an improved version of it) can end up in the Marketplace module in the future...
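The effect of the ordering is easy to reproduce outside Sitecore. The sketch below is not the module's code or my patch: the `Rule` class, the rule names and the sort-order values are all invented, and the HTTPS condition is left out for brevity. It just shows how "first matching rule wins" gives a different answer depending on the order of the in-memory list.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    name: str
    pattern: str
    sort_order: int  # stands in for the item's __sortorder value


def first_match(rules: List[Rule], path: str) -> Optional[str]:
    """Return the name of the first rule, in list order, whose pattern matches."""
    for rule in rules:
        if re.search(rule.pattern, path):
            return rule.name
    return None


# The order the query might hand the rules back in: generic rule first.
loaded = [
    Rule("force-https", r"(.*)", 200),
    Rule("old-page", r"^the-url-to-redirect/?$", 100),
]

print(first_match(loaded, "the-url-to-redirect"))         # force-https
by_sort_order = sorted(loaded, key=lambda r: r.sort_order)
print(first_match(by_sort_order, "the-url-to-redirect"))  # old-page
```

With the unsorted list the catch-all rule fires first, which is the double redirect I was seeing on QA. Once the list is sorted the specific rule gets its chance before the generic one.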
One point worth noting is that because of the way that Sitecore manages the values in the __sortorder field, if you put rules into folders, more than one rule can end up with the same sort order value. Sitecore restarts its numbering from scratch for each folder, so if the rules are fetched from more than one folder you can see duplication. And unsurprisingly, the outcome of the sort operation is undefined for these duplicates. If this is going to be an issue for you, you probably need to manually adjust the values in the sort order field, to make sure their values sort correctly. This field lives in the "Standard Fields" for Sitecore Items, so you'll need to make sure those are visible in Content Editor in order to modify it. Or alternatively you might consider modifying the code to sort by a custom "rule priority" field that you create yourself.
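Here is a rough illustration of the duplicate problem and of the custom-priority idea. Everything in it is hypothetical: the `priority` field does not exist in the module, and the names and numbers are invented. Python's sort happens to be stable, so tied rules simply keep the order they were fetched in. Other sort implementations may not even promise that much.

```python
from typing import Callable, List, NamedTuple


class Rule(NamedTuple):
    name: str
    sort_order: int  # per-folder value, so it can repeat across folders
    priority: int    # hypothetical custom field, unique across all rules


# Two rules from different folders that ended up with the same sort order.
folder_a_rule = Rule("news-redirect", 100, 10)
folder_b_rule = Rule("force-https", 100, 20)


def names_by(rules: List[Rule], key: Callable[[Rule], int]) -> List[str]:
    """Sort the rules by the given key and return just their names."""
    return [r.name for r in sorted(rules, key=key)]


# Two possible orders the underlying query could return the rules in.
one_fetch = [folder_a_rule, folder_b_rule]
other_fetch = [folder_b_rule, folder_a_rule]

# Tied sort orders: the result depends on the fetch order.
print(names_by(one_fetch, lambda r: r.sort_order))
print(names_by(other_fetch, lambda r: r.sort_order))

# Unique priorities: the result is the same either way.
print(names_by(one_fetch, lambda r: r.priority))
print(names_by(other_fetch, lambda r: r.priority))
```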