Recently I wrote about an issue I encountered where a client's website was missing its GeoIP data (and the related back-end analytics data) entirely. While the changes discussed in that post solved the problem of there being no MongoDB data for GeoIP lookups at all, I continued to see odd issues with many users not being located after those fixes were made. Sorting this out seems to suggest that some of the "common wisdom" about configuring GeoIP for analytics isn't right – so here are my latest findings:
After the change in my previous post, I started to see some GeoIP data being recorded. But when I looked carefully at what was stored, I noted that the only data being captured was for addresses in Cardiff and London. I realised that those two locations happened to correspond to the network locations of me (on Kagool's internal network) and the client's authors. That looked suspicious, so I spent more time looking at the state of the site's configuration and infrastructure. However, none of the information I was able to dig up seemed helpful or relevant.
So having exhausted what I knew, and the miraculous powers of Google, I tried raising a support request with Sitecore...
After I'd explained the setup of the site, provided the config and log files, and jumped through all the usual hoops of making a support request, the suggestion I was given was that there was probably an issue with Sitecore's config for the load balancing and how it processed traffic...
When some load balancers relay traffic from the internet to servers on your network, they can change the source address of the network packet that the server sees. Rather than the source being the IP address of the internet user's computer (which originally made the request), it can end up as the IP address of the load balancer itself. Obviously that would break things like geo-location or IP-address-based security, so there needs to be a way to handle that issue. Step forward
HTTP header. When the load balancer changes the source IP address, it will add an HTTP header which specifies the original IP address. That allows the receiving web server to process that header data to do geo-location or other source address based processing. (NB: The header isn't always called
- some kit uses different names, but that's probably the most common one)
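As a rough illustration of what the receiving server has to do, here's a minimal Python sketch. It is not Sitecore's code: the function name `resolve_client_ip` is made up for this example, and I'm assuming the header in question is the widely used `X-Forwarded-For`, whose value is a comma-separated list with the original client first.

```python
import ipaddress


def resolve_client_ip(remote_addr, headers, forwarded_header="X-Forwarded-For"):
    """Return the best guess at the original client IP address.

    remote_addr      -- source address of the connection (may be the load balancer)
    headers          -- dict of request header names to values
    forwarded_header -- ASSUMPTION: the header your load balancer adds
    """
    # Header names are case-insensitive, so normalise before looking up.
    lookup = {name.lower(): value for name, value in headers.items()}
    raw = lookup.get(forwarded_header.lower(), "")

    # The left-most entry is conventionally the original client.
    first = raw.split(",")[0].strip()
    if not first:
        return remote_addr

    try:
        return str(ipaddress.ip_address(first))
    except ValueError:
        # Not a bare IP address, so fall back to the connection's source.
        return remote_addr


print(resolve_client_ip("10.0.0.4", {"X-Forwarded-For": "203.0.113.7, 10.0.0.4"}))
print(resolve_client_ip("10.0.0.4", {}))
```

Bear in mind that any client can send this header itself, so it should only be trusted when you know the request really did come via your own load balancer.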
So Support's initial suggestion was that the internet traffic going to the CD servers in my cluster was passing through a load-balancing device in Azure which was rewriting the source IP address, but my site was not configured to understand this. They reasoned that I was seeing location data in the logs from client and Kagool users because our traffic reached the site by a different route, which did not make use of the load balancing. But they figured the public traffic was not having a location set because Sitecore did not know to expect the source address being rewritten. The GeoIP config for Sitecore has a setting called
where you can tell the analytics processing code that your network traffic includes a "source address was re-written" header. And that setting was empty in my config.
With that suggestion in hand, I did a bit of digging to try and work out what header might be being added by the load balancing hardware. I dropped some test code onto the CM and CD servers, so I could look at the raw headers arriving with client requests. But I was unable to see anything that looked relevant in this case, no matter where I was on the internet when I browsed. Both my laptop (on the company network) and my phone (on the public internet) showed something like:
There was nothing that looked like it might be an
or similar header...
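The check I was doing by eye can be sketched in a few lines of Python. This isn't the test code I actually dropped onto the servers; the function name and the list of candidate headers are my own illustrative assumptions, and the list is certainly not exhaustive since different kit uses different names.

```python
# ASSUMPTION: an illustrative, non-exhaustive list of header names that
# proxies and load balancers commonly use to pass on the original client.
CANDIDATE_HEADERS = (
    "X-Forwarded-For",
    "Forwarded",
    "X-Real-IP",
    "X-Client-IP",
    "True-Client-IP",
)


def find_forwarding_headers(headers):
    """Return only the request headers that look like 'original address' headers."""
    wanted = {name.lower() for name in CANDIDATE_HEADERS}
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in wanted  # header names are case-insensitive
    }


# Hypothetical request headers, purely for demonstration.
sample = {"Host": "example.com", "User-Agent": "test", "Accept": "*/*"}
print(find_forwarding_headers(sample))
```

An empty result, as in my case, suggests that nothing between the browser and the server is announcing a rewritten source address.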
Cue a lot of back-and-forth with Support to explain this result, and discussions with the infrastructure people who owned the Azure setup to see if they could cast any light on the issue. Eventually I made Support's config change anyway – since that was the best way to finally prove this was not the issue. And surprise surprise, it made no difference...
So after being sidetracked by that for quite a while, Support went back to the config files and came up with an interesting suggestion: they pointed out that the site's CD servers had
set to false, but the CM server had it set to true. I was surprised that they'd commented on that, as I had thought this was configured correctly. After all, the documentation and blog posts I'd read (like the info
here) had all pointed to "set it on only one server" being the correct setup...
Support said I should be enabling the
on all the servers in my cluster that needed to resolve locations in order to make everything work, and they were right. Once this setting was changed on the CD servers, every visitor to the site was correctly looked up:
So if you're using a recent-ish version of Sitecore (The site in question was on v8.1 – so probably anything that age or newer) it seems like the documentation and blog posts around the
setting are out-of-date when they say "set it on only one machine".
Support are suggesting that the code and its behaviour seem to have changed at some point, and you do now need to set it on all your servers that need to resolve locations, as per the documentation that discusses this setting for Sitecore v9. That might mean "all your CD servers" or perhaps "all the CM and CD servers" depending on how your cluster is set up, and what results you want.
Those warnings in the docs about "setting this on more than one server can cause problems" no longer seem to be relevant.
So make a note of that, and avoid spending as much time debugging as I have 😉