Recently I wrote about an issue I encountered where a client's website was missing its GeoIP data (and the related back-end analytics data) entirely. While the changes discussed in that post solved the problem of there being no MongoDB data for GeoIP lookups at all, I continued to see odd issues with many users not being located after those fixes were made. Sorting this out seems to suggest that some of the "common wisdom" about configuring GeoIP for analytics isn't right – so here are my latest findings:
So having exhausted what I knew, and the miraculous powers of Google, I tried raising a support request with Sitecore...
When some load balancers relay traffic from the internet to servers on your network, they can change the source address of the network packet that the server sees. Rather than the source being the IP address of the internet user's computer (which originally made the request), it can end up as the IP address of the load balancer itself. Obviously that would break things like geo-location or IP address based security, so there needs to be a way to handle that issue. Step forward the X-Forwarded-For HTTP header. When the load balancer changes the source IP address, it adds an HTTP header which specifies the original IP address. That allows the receiving web server to process that header data to do geo-location or other source address based processing. (NB: The header isn't always called X-Forwarded-For; some kit uses different names, but that's probably the most common one.)
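To make the idea concrete, here is a minimal sketch of how a server might pick the original client address once it knows which forwarding header to expect. It is written in Python purely for illustration; it is not Sitecore's code, and the function name resolve_client_ip and its parameters are my own assumptions. It also assumes the common convention that the left-most entry in the header is the original client, and that the header can be trusted (which is only true when it is set by your own load balancer).

```python
import ipaddress
from typing import Mapping, Optional


def resolve_client_ip(
    remote_addr: str,
    headers: Mapping[str, str],
    forwarded_header: Optional[str] = "X-Forwarded-For",
) -> str:
    """Return a best guess at the original client IP (hypothetical helper)."""
    # No header configured: behave as if there were no load balancer.
    if not forwarded_header:
        return remote_addr

    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(forwarded_header.lower(), "")

    # The header may hold a comma-separated chain: "client, proxy1, proxy2".
    # Assumption: the first entry that parses as a bare IP is the client.
    for candidate in raw.split(","):
        candidate = candidate.strip()
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            continue  # skip empty or malformed entries
        return candidate

    # Header missing or unusable: fall back to the packet's source address.
    return remote_addr
```

For example, calling resolve_client_ip("10.0.0.5", {"X-Forwarded-For": "203.0.113.7, 10.0.0.5"}) returns "203.0.113.7", while a request with no such header falls back to "10.0.0.5". Entries that carry a port suffix are skipped by this sketch, so real kit may need extra handling.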
So Support's initial suggestion was that the internet traffic going to the CD servers in my cluster was passing through a load-balancing device in Azure which was re-writing the source IP address, but my site was not configured to understand this. They reasoned that I was seeing location data in the logs for client and Kagool users because our network path to the site took a different route which did not go through the load balancing. But they figured the public traffic was not having a location set because Sitecore did not know to expect the source address being rewritten. The GeoIP config for Sitecore has a setting called Analytics.ForwardedRequestHttpHeader where you can tell the analytics processing code that your network traffic includes a "source address was re-written" header. And that setting was empty in my config.
With that suggestion in hand, I did a bit of digging to try and work out what header might be being added by the load balancing hardware. I dropped some test code onto the CM and CD servers, so I could look at the raw headers arriving with client requests. But I was unable to see anything that looked relevant in this case, no matter where I was on the internet when I browsed. Both my laptop (on the company network) and my phone (on the public internet) showed something like:
There was nothing that looked like it might be an X-Forwarded-For or similar header...
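The kind of check I was doing by eye can be sketched as a small filter over the request headers. This is an illustration in Python, not the test code I actually deployed; the function name find_forwarding_headers is hypothetical, and the list of header names is a handful of common ones rather than an exhaustive set.

```python
from typing import Dict, Mapping

# Illustrative, non-exhaustive list of headers that proxies and load
# balancers commonly use to pass on the original client address.
KNOWN_FORWARDING_HEADERS = (
    "x-forwarded-for",
    "forwarded",
    "x-real-ip",
    "x-client-ip",
    "true-client-ip",
)


def find_forwarding_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return the subset of headers that look like forwarding headers."""
    found: Dict[str, str] = {}
    for name, value in headers.items():
        lowered = name.lower()
        # Match known names, plus a loose heuristic for vendor variants.
        if (
            lowered in KNOWN_FORWARDING_HEADERS
            or "forward" in lowered
            or "client-ip" in lowered
        ):
            found[name] = value
    return found
```

An empty result, which is effectively what I saw, suggests nothing between the browser and the server is advertising a re-written source address.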
Cue a lot of back-and-forth with Support to explain this result, and discussions with the infrastructure people who owned the Azure setup to see if they could cast any light on the issue. Eventually I made Support's config change anyway – since that was the best way to finally prove this was not the issue. And surprise surprise, it made no difference...
Support said I should be enabling the Analytics.PerformLookup setting on all the servers in my cluster that needed to resolve locations in order to make everything work, and they were right. Once this setting was changed on the CD servers, every visitor to the site was correctly looked up:
Bingo...
Support are suggesting that the code and its behaviour seem to have changed at some point, and that you now need to enable this setting on every server that needs to resolve locations, as per the documentation that discusses this setting for Sitecore v9. That might mean "all your CD servers" or perhaps "all the CM and CD servers" depending on how your cluster is set up, and what results you want.
Those warnings in the docs about "setting this on more than one server can cause problems" no longer seem to be relevant.
So make a note of that, and avoid spending as much time debugging as I have 😉