I'm in the middle of trying to plan out the transition of a Sitecore 10 development project from PaaS deployments, over to the Azure Kubernetes Service. There's some great info out there, but there have also been some interesting things I've wondered about that seem less documented right now. So here are some things I've learned this week:
Big thanks to Bart Plasmeijer, Rob Ahnemann, Kamruz Jaman, Jeff L'Heureux and the people of SitecoreLunch who have helped me work out answers to my questions about all this. Maybe these points will be helpful to others who are trying to work out what the right approach for their projects is too.
There seem to be some conflicting priorities for choosing the version of Windows your images are based on. On one hand, you have your development docker images, which run on your local computer. Here you really want to optimise for "best local compatibility" and "lowest impact on your PC". That tends to suggest that you should run the latest versions of Docker Desktop, Windows and the newest versions of the Sitecore base containers. So it feels like the newest Windows "SAC" release would be the best bet here. And Sitecore have recently added containers for this to their feed.
(As I type this, the current SAC release Sitecore support is 2004, but Microsoft already have a newer release, which Sitecore should support soon, I'm told)
But then there's Kubernetes. When you look at the config files you can download for configuring a production Kubernetes cluster, you'll see that these only describe using images based on the "LTSC" version of Windows. Which version should you be deploying to Production?
I did a pile of googling, but didn't really find a good answer. But Sitecore Slack to the rescue – Bart Plasmeijer explained the answer I missed:
According to Microsoft, Azure Kubernetes Service currently only supports the LTSC version of Windows. So if you're deploying to AKS for production use, you have to use whatever the current LTSC release is right now.
Knowing the answer to the first point brings up a new issue. If we know we have to have LTSC images for production, should we use the same for local development, or might it make sense to build different images for local development vs onward deployment? A single set of images would seem to offer the simplest overall compatibility, but it would lead to loss of the latest Docker features on your development images.
I have a local development setup where developers build some custom images that include a few project-specific things, but do not include any of the main project's code. The devs start up Docker and publish that code from Visual Studio on top of the containers. That lets them run the site locally and do their development without worrying about any solution files being baked into their containers. But it's no good for onward deployment.
So when I sat and thought about this, it seemed to make sense that my project actually needs different Dockerfiles for two scenarios. I need CM / CD Dockerfiles to make my developer images, which code under development gets dropped on top of (and which might also target the 2004 version of Windows on my developers' machines). And I also need CM / CD Dockerfiles for onward deployment, which need to include the compiled code by default.
When you look at the examples available, they always seem to have just a single set of Dockerfiles, and they run the same containers locally as they go on to deploy. So this made me wonder if my thinking was wrong here...
But having raised this issue on Slack and discussed it with Rob Ahnemann and the Sitecore Lunch people, it seems I'm not barking up the wrong tree there. Other people follow a similar model to optimise their image builds for different scenarios. And Kamruz Jaman pointed out that "build an image per windows release" is also a common approach – ensuring there's an image optimised for whichever O/S your developers and production servers are using.
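To get a feel for what "separate developer and deployment images, per Windows release" means in practice, here's a rough sketch that just lists the image references such a build would produce. Everything named in it is my own assumption for illustration: the registry `myregistry.azurecr.io`, the `myproject` prefix, the `dev` / `deploy` variant names, and the `2004` / `ltsc2019` release labels (on the basis that the LTSC release at the time of writing is the 2019 one).

```python
from itertools import product

# Hypothetical values for illustration; none of these names come from a real project.
REGISTRY = "myregistry.azurecr.io"
ROLES = ["cm", "cd"]
# "dev" images leave the solution code out; "deploy" images bake it in.
VARIANTS = ["dev", "deploy"]
# Assumed Windows releases: 2004 (SAC) for developer PCs, ltsc2019 (LTSC) for AKS.
WINDOWS_RELEASES = ["2004", "ltsc2019"]


def image_reference(role, variant, windows_release, version):
    """Compose one image reference from its role, variant, release and version."""
    return f"{REGISTRY}/myproject-{role}-{variant}:{version}-{windows_release}"


def build_matrix(version):
    """List every role / variant / Windows release combination to build."""
    return [
        image_reference(role, variant, release, version)
        for role, variant, release in product(ROLES, VARIANTS, WINDOWS_RELEASES)
    ]


if __name__ == "__main__":
    for reference in build_matrix("1.0.0"):
        print(reference)
```

In a real setup you'd probably trim this matrix rather than build all of it. For example, the `deploy` variant only needs the LTSC release if AKS is the only place it runs.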
And (again) the answer to #2 led me to a new question. In my developer setup, I have a folder of Dockerfiles for all the roles in an XP deployment. So when docker-compose starts up an XP instance of the site, it's using custom images for all the roles. However the Dockerfiles for the xConnect role images don't actually do anything right now. The Dockerfiles exist in case I need to make custom changes to them later, but they don't currently add anything meaningful to the images.
So I started wondering about whether it was necessary for my Kubernetes image build to do anything with these Dockerfiles. Might it just be easier to not bother building them, save myself some disk space on my Azure Container Registry and stick with the default Sitecore images for xConnect when I deploy onwards? (At least until a requirement comes up for customising xConnect somehow, that is)
Asking this on Sitecore Lunch got a couple of useful responses about why it's right to build all the roles each time. Jeff L'Heureux pointed out that it's simpler to have all your images share the same version tag for onward deployment – which seems like a key reason why building images for roles you're not customising still makes sense. The other key point is that due to the nature of Docker's layered file system, the image you'd be adding to your Azure Container Registry should be very small here – so the convenience of the common tags won't cost much in storage.
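Here's a small sketch of the "one version tag for every role" idea, assuming each role's Dockerfile starts with `ARG BASE_IMAGE` followed by `FROM ${BASE_IMAGE}`, so that an uncustomised role like xConnect is just a pass-through of its base image. The registry name, the `myproject` prefix, the `./docker/build/<role>` folders and the base image references are all placeholders I've made up, not real Sitecore image names. It only builds the `docker build` argument lists and doesn't run anything.

```python
# Hypothetical names throughout: the registry, role list, folder layout and
# base image references are placeholders, not real Sitecore image names.
REGISTRY = "myregistry.azurecr.io"
BASE_IMAGES = {
    "cm": "base.example/sitecore-cm:10.0.0-ltsc2019",
    "cd": "base.example/sitecore-cd:10.0.0-ltsc2019",
    # Not customised yet, but still built so it shares the common tag.
    "xconnect": "base.example/sitecore-xconnect:10.0.0-ltsc2019",
}


def build_commands(version):
    """Return one `docker build` argument list per role, all sharing one tag."""
    commands = []
    for role, base_image in BASE_IMAGES.items():
        commands.append([
            "docker", "build",
            "--build-arg", f"BASE_IMAGE={base_image}",
            "-t", f"{REGISTRY}/myproject-{role}:{version}",
            f"./docker/build/{role}",
        ])
    return commands


if __name__ == "__main__":
    for command in build_commands("1.0.0"):
        print(" ".join(command))
```

Because the pass-through image adds no layers of its own on top of its base, pushing it to the registry should mostly amount to a new manifest pointing at existing layers, which is the storage point made above.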
I'm sure I'll bump into more questions like this as I work through this transition. But for now I can get on with setting up my deployment process...