Every so often I get asked to look at existing Sitecore installs and write up reports on what's good and what's not so good about them. I spend time looking at lots of stuff, like the infrastructure, the databases, the code and security. But time after time I find myself writing up a similar issue...
And that is that almost every site I've looked at has had log files that contain regularly recurring errors.
Modern instances of Sitecore generate a lot of logs: The base CMS log, search crawling logs, FXM logs – plus probably some logs for extra modules you've added. And it seems to me that for a lot of Sitecore instances these files are just one of those annoying consumers of disk space that need tidying up every so often. But we shouldn't ignore them – they're our window into the internals of Sitecore and how well it's running.
The issues I find in the logs vary between the installs I look at. Sometimes it's small stuff, like the keep-alive not being able to access the URL it's been given, or performance counters not working. But surprisingly often it's significant things that will adversely affect the performance or features of the site: incorrect email sending configuration, bad database connection strings, unhandled exceptions or SQL errors.
I've seen this issue with other software too: if you have Coveo or Solr installed as part of your Sitecore solution, they have their own logs as well. And don't forget Windows issues. Lots of red or yellow icons in Event Viewer are a similar indication that something significant isn't right with your site.
Don't have an "out of sight is out of mind" attitude to all these useful hints your servers are collecting for you.
Monitor. The. Logs.
Fix. The. Errors.
Can't be much clearer than that, I think?
Not that I want to do myself out of work, but if you're going to run a site you should have a plan to ensure you regularly check your logs. When you find issues occurring you need to schedule time to find the cause of the fault and to fix it.
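As a rough illustration of what a regular check could look like, here's a minimal Python sketch that counts how often each warning or error message turns up in a folder of log files, so the regular offenders float to the top. It's only a sketch under some assumptions: that each entry starts with a thread name, a time and a level (something like `ManagedPoolThread #3 10:15:02 ERROR ...`), and that the files match `log*.txt`. The function names are made up for this example, so adjust the pattern and the glob to whatever your instance actually writes.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed entry shape: "<thread> <HH:MM:SS> <LEVEL> <message>".
# Only WARN, ERROR and FATAL entries are counted; INFO lines and
# stack-trace continuation lines do not match and are skipped.
LINE_PATTERN = re.compile(
    r"^(?P<thread>.+?)\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>WARN|ERROR|FATAL)\s+(?P<message>.*)$"
)


def summarise_lines(lines):
    """Count how often each (level, message) pair occurs in the given lines."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line.rstrip("\r\n"))
        if match:
            counts[(match["level"], match["message"].strip())] += 1
    return counts


def summarise_folder(log_folder, pattern="log*.txt"):
    """Combine the counts for every file in log_folder that matches pattern."""
    counts = Counter()
    for path in sorted(Path(log_folder).glob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes in it.
        with path.open(encoding="utf-8", errors="replace") as handle:
            counts.update(summarise_lines(handle))
    return counts
```

Calling `summarise_folder(...)` with the path to your log folder and then `.most_common(20)` on the result gives you the twenty most frequent messages. Anything near the top of that list with a big number next to it is a good candidate for the "schedule time to fix it" pile.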
In an ideal world, I'd recommend:
You should certainly do this on your Production and UAT / Test sites – but you may well find benefits from being equally pro-active about checking logs in development. The earlier you find an issue, the cheaper it is to fix, after all.
I appreciate that this sort of thing is not always easy to schedule. Work priorities often favour new features over solving log-file problems that aren't bringing the site down. Plus, if you have a third party maintaining your infrastructure, access to the log data might not be trivial. But in the long run I would argue that any hassle here is worth it. "A stitch in time" as they say...
And then you can have happy log files. And maybe one day I'll get to write that mythical "there is nothing wrong with this instance of Sitecore" report.
Well, you can't blame me for wishing, eh? 😉