There are some things in Sitecore that you just take for granted will work. Loading items is a good example of this. I'll admit that user error can get in your way, but usually if you can see an item in the content tree, you can write code that will load it without issues. So I was pretty confused when I came across a scenario recently where this did not appear to work correctly. In case anyone else hits this challenge, here's what happened:
I have a client who is making use of the Commerce Connect module to pull in data about product categories for a sales website. The back-end commerce system is their own, but they have added some customisation to Commerce Connect to allow their data to be moved from the back-end system into Sitecore each night. They had noticed that sometimes, when new trees of content were being added by the import process, some of the data would come over, but other bits would not. And that's where I got involved – looking to see what might be going wrong with the import process.
After working through the custom bits of their integration code in the Visual Studio debugger, I spotted a place where this behaviour might occur. When it tried to import a new item of "product category" data, it was looking to load the correct parent item (so the new child could be created under it). It looked like when a parent couldn't be loaded, it would fail to process new children for that item. But oddly, in some circumstances I could step through the code and see that the parent item was visible in the Sitecore content tree, but the request to load it still failed.
That confused me. So I did what I usually do when code confuses me: I try to reproduce the situation with the simplest possible thing. And after a few tries, I hit on the following process:
If I deleted all the product category data from Sitecore, I could run the Commerce Connect synchronisation process and it would complete ok. But having done that I could run some simple PowerShell that showed the problem:
get-childitem "/sitecore/content/Product Repository/Product Classifications/storeName/Grocery/Baby Store"
Write-host "--"
get-item "/sitecore/content/Product Repository/Product Classifications/storeName/Grocery/Baby Store/Baby Bath Skin Care"
Write-host "--"
get-item -path master: -ID "{F3D09AFA-D9B3-11C7-14B1-4D1096878F21}" | select -ExpandProperty Paths | select FullPath
Which gave the following result:
Attempting to load the parent item and show its children worked fine – and it would show that "Baby Bath Skin Care" was a child of the parent. But trying to load that item explicitly by its path failed with an "item not found" error. However, asking for that "Baby Bath Skin Care" item by its ID worked fine.
Why on earth would loading an item by path fail when it was clearly present in the content tree?
Some other issues we'd been looking at for this client had centred around the publishing service and the descendants tables in Sitecore databases. That led me to realise that the "cleanup database" option in the Control Panel could fix my issue: If I clicked it, and then re-ran my test script afterwards, it would succeed... Clearly this wasn't a fast query issue – so what was going on?
A good hint towards what was up came from my colleague Martin – who suggested that the database cleanup would be clearing caches as well as its other operations. So I tried using cache.aspx to clear caches instead of the database cleanup, and that too fixed the issue when I ran my test script – so Martin was definitely on to something.
But out of the box you can't look at Sitecore's caches in any more detail, so I fell back to this excellent post about looking at individual caches. (Thank you Brian!) You need to make a couple of tweaks to this for it to work with recent releases of Sitecore, and I took the opportunity to quickly hack in some code to show what was in a particular cache, to help with my debugging. [Code here, if you think it might help your work]
So I repeated my test a few times, clearing one cache at a time, until I was able to work out that it was the "master[paths]" cache that was causing my issues. Clearing just that one would allow the script to complete ok.
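If you want to see which caches exist before you start clearing them one by one, something like this rough sketch should list them. It assumes you run it in a Sitecore PowerShell Extensions console (so the Sitecore.Kernel assembly is already loaded), and that your Sitecore version exposes CacheManager.GetAllCaches() with Name and Count on each entry. I'd expect that to be the case for recent releases, but check it against yours:

```powershell
# Assumption: running inside Sitecore PowerShell Extensions, where the
# Sitecore API is available without loading any assemblies.
# Lists every cache whose name starts with "master", with its entry count.
[Sitecore.Caching.CacheManager]::GetAllCaches() |
    Where-Object { $_.Name -like "master*" } |
    Sort-Object Name |
    Select-Object Name, Count
```

This only tells you the names and how many entries each cache holds. It doesn't show the entries themselves, which is why I needed the extra debugging code mentioned above.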
Before clearing this cache, it included an entry for the item that wouldn't load:
But after clearing it and re-running the script the contents would look like this:
That seemed like a cause for the issue – initially the paths cache contains empty GUIDs, which means trying to load an item by path will fail until the cache is cleared and gets re-populated with the correct IDs.
I've raised this issue with Sitecore Support, who have confirmed a bug in how the caching behaviour works with Commerce Connect, which will get fixed in a future release. In the meantime, if you find yourself encountering a similar issue you can reference bug 157425 when talking to support.
But if you need to work around the problem, just clearing the master[paths] cache after your Commerce Connect import runs should help you.
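As a rough idea of what that could look like, here's a minimal sketch that clears just that one cache by name. It isn't the code from my client's solution. It assumes a Sitecore PowerShell Extensions session and that CacheManager.GetAllCaches() and the Clear() method on each cache are available in your version. How you trigger it after the import (a scheduled task, a step at the end of the sync, or by hand) is up to you:

```powershell
# Assumption: running inside Sitecore PowerShell Extensions.
$cacheName = "master[paths]"

# Use -eq rather than -like here: with -like, the square brackets in the
# name would be treated as a wildcard character class and never match.
$caches = [Sitecore.Caching.CacheManager]::GetAllCaches() |
    Where-Object { $_.Name -eq $cacheName }

if ($caches) {
    # Clear only the matching cache, leaving the others warm.
    $caches | ForEach-Object { $_.Clear() }
    Write-Host "Cleared cache: $cacheName"
}
else {
    Write-Host "No cache named $cacheName was found"
}
```

Clearing a single cache is a lot cheaper than clearing everything via cache.aspx or running a database cleanup, so it shouldn't hurt performance much if you do it once after each nightly import.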