Jeremy Davis
Sitecore, C# and web development
Article printed from: https://blog.jermdavis.dev/posts/2024/xmcloud-white-screen-rendering-certificates

Blank white screen from local XM Cloud Rendering host?

No idea how I broke it, but do have one way to fix it...

Published 06 May 2024

There are some days when technology just doesn't want to play ball. And in my experience 99% of these days are when you're on a developer training course and it's the exercise/labs machine that's being difficult. I had this recently on the XM Cloud developer intro course. I've no idea if anyone else would ever see this issue (or how it was caused) but searching Google didn't return much useful info, and I did find a fix for my problem. So it's documentation time...

The issue

I had the opportunity to take Sitecore's XM Cloud Developer Fundamentals training recently. These courses are all delivered online these days, and you get given a virtual machine to run your class exercises on. Following the course instructions I'd created an XM Cloud instance and set up the deploy process to use in the class. And the next step was to get local development working, so I could follow the instructions to create some new components.

The steps for this should be fairly simple:

  • Clone the git repo of your source code to your lab machine.
  • Configure the containers to run on the lab machine.
  • Tweak the JSS rendering host config to work locally.
  • Browse the containers for your site, and get on with the exercises.

And for most of my course colleagues, this all worked fine. But when I got to the final step, and tried to fire up Experience Editor to edit some content I was presented with this:

A browser showing an empty white screen, rather than the expected web page

Rather than an example website I got a blank white page. Not a "you've not configured your rendering host" error. Not a "you broke Next.js" error. Just a big empty space.

I went back through the lab notes, looking to see if I messed something up, but couldn't see anything obvious I'd done wrong. And I tried "clone the source to a fresh folder and repeat all the steps" just in case I'd typo'd something important. But neither of these steps had an effect. And my course trainer (and fellow students) hadn't seen this issue before either...

For the rest of the course I worked around this by publishing my code changes up to my cloud instance instead of testing them locally. This was slower, but it worked fine for finishing the training. But once I was done with the course, my interest had been piqued and I had to dive into trying to find a fix.

Diagnosing it

It's usually a good idea to start with what the computer is telling you. So I looked at the rendering host logs to see if there was some sort of error here:

Docker Desktop's window showing the logs from Sitecore's Rendering Host container. No obvious errors are shown.

But (to my limited knowledge of Node and Next.js) that all looked fine. So I tried the browser's developer tools. And the network trace gave me my first big clue:

The web browser's developer tools window showing some failed requests for JavaScript files.

While the overall page being requested was a 200 and returned data, a collection of JS files were coming back with an error. And when I requested one of these directly I got:

A browser showing the result of requesting one of the broken JavaScript files - an SSL error saying that the domain name does not match the certificate signature.

So the main JS files were reporting certificate errors and failing to load. For a technology that does a load of client-side rendering (like Next.js) that did seem like an important problem which might cause the white screen I was seeing. But why the error?

Well, the original request was going to xmcloudcm.localhost which was returning a valid certificate. But the broken requests were going to www.xmcloudpreview.localhost and they were coming back with the default certificate from Traefik. And since that did not match this specific domain name, the browser was unhappy. If I clicked the "just ignore this certificate error and give me the file please" option in the browser then I did get back JavaScript, so the problem wasn't really with the file, but with the way the data was being routed and returned.
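To make the mismatch concrete, here is a rough Python sketch of the name check a browser performs. It is a simplification of the real matching rules, the function names are hypothetical, and the list of loaded names assumes the lab machine started with only the sxastarter wildcard and xmcloudcm certificates.

```python
def name_covers(cert_name: str, hostname: str) -> bool:
    """Return True if a certificate name covers the requested host name."""
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        # A wildcard stands in for exactly one left-most label.
        return cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


def covering_names(cert_names: list[str], hostname: str) -> list[str]:
    """List the certificate names that would satisfy a request for hostname."""
    return [name for name in cert_names if name_covers(name, hostname)]


# Assumed starting state: the two names loaded before any fix was applied.
loaded = ["*.sxastarter.localhost", "xmcloudcm.localhost"]
print(covering_names(loaded, "xmcloudcm.localhost"))           # ['xmcloudcm.localhost']
print(covering_names(loaded, "www.xmcloudpreview.localhost"))  # []
```

An empty list corresponds to the situation above: nothing specific matches the preview host name, so the default certificate comes back instead.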

Hacking a fix for it

I scratched my head a bit, got some advice from the Sitecore Lunch crew, and did a bunch of Googling. There were a lot of posts about "Rendering host does not trust your certificates" and "you broke your javascript code" but nothing which seemed to address this problem specifically. But I did come across one post which mentioned white screens in association with certificate errors. So I decided to try that approach.

In Sitecore container setups, all of the certificate-related stuff is dealt with by Traefik. It acts as a reverse proxy and terminates SSL for incoming requests, so all the traffic between the containers internally is unencrypted. The setup is broadly:

flowchart LR
  client[Client's web browser]
  subgraph Docker
    traefik[Traefik Proxy]
    cm[XM CM Server]
    rendering[Rendering Host]
  end
  certs[(Certificate data and config)]
  client <-- https --> traefik
  traefik <-- http --> cm
  traefik <-- http --> rendering
  certs -.-> traefik

There are two sets of data files on your physical machine which tell Traefik how to handle request routing and certificates. First, you have to give it certificate files and get them loaded. Second, your Docker Compose files tell the system which domain names get routed to which containers via Traefik. So in theory, to handle this scenario we can modify those bits of data to add a new certificate and route the related DNS name.

The init.ps1 script for Sitecore containers grabs a copy of mkcert.exe and sets up its root certificate. So if you need to add a new SSL domain to your Docker setup, you can use that tool to add a new certificate file. That means opening a console in the <repo>\docker\traefik\certs folder and running .\mkcert <your certificate name>. In this case I wanted a wildcard for anything in the xmcloudpreview.localhost domain, so the command was .\mkcert *.xmcloudpreview.localhost. That will generate a private key and a certificate signed by your root:

Windows Explorer showing the certificates folder under the Traefik setup in Sitecore's Docker config. An extra certificate has been added to try and resolve the error in the previous image.
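The file names mkcert writes follow from the name you pass it, which matters because the Traefik config has to reference them exactly. A small Python sketch of that mapping for a single name, based on the file names seen in this setup (the function name is hypothetical, and the suffix mkcert adds when given several names is not handled):

```python
def mkcert_file_names(host: str) -> tuple[str, str]:
    """Predict the certificate and key file names for one host name."""
    # The asterisk of a wildcard name is written as "_wildcard" on disk.
    base = host.replace("*", "_wildcard")
    return f"{base}.pem", f"{base}-key.pem"


print(mkcert_file_names("*.xmcloudpreview.localhost"))
# ('_wildcard.xmcloudpreview.localhost.pem', '_wildcard.xmcloudpreview.localhost-key.pem')
```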

For Traefik to use these files you need to update some config. Under <repo>\docker\traefik\config\dynamic there's a file called certs_config.yaml which you can add your new files to:

tls:
  certificates:
    - certFile: C:\etc\traefik\certs\_wildcard.sxastarter.localhost.pem
      keyFile: C:\etc\traefik\certs\_wildcard.sxastarter.localhost-key.pem
    - certFile: C:\etc\traefik\certs\xmcloudcm.localhost.pem
      keyFile: C:\etc\traefik\certs\xmcloudcm.localhost-key.pem
    - certFile: C:\etc\traefik\certs\_wildcard.xmcloudpreview.localhost.pem
      keyFile: C:\etc\traefik\certs\_wildcard.xmcloudpreview.localhost-key.pem
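Since these paths are typed by hand, it can be worth checking that every file the config names actually exists in the certs folder. The following is only a sketch: it reads the certFile and keyFile lines with a regular expression rather than a real YAML parser, and both function names are hypothetical.

```python
import re
from pathlib import PureWindowsPath

# Matches "certFile: <path>" and "keyFile: <path>", with or without a leading "- ".
ENTRY = re.compile(r"^\s*(?:-\s*)?(?:certFile|keyFile):\s*(.+?)\s*$", re.MULTILINE)


def referenced_cert_files(config_text: str) -> list[str]:
    """Collect the bare file names referenced by the config text."""
    return [PureWindowsPath(m.group(1)).name for m in ENTRY.finditer(config_text)]


def missing_cert_files(config_text: str, files_present: list[str]) -> list[str]:
    """Return referenced file names that are absent from files_present."""
    present = {name.lower() for name in files_present}
    return [n for n in referenced_cert_files(config_text) if n.lower() not in present]
```

You could feed it the text of certs_config.yaml and a directory listing of <repo>\docker\traefik\certs; an empty result means every referenced file was found.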

If the domain name you're adding doesn't already exist in the setup for the solution you'll need to map it through Traefik to a container that will serve the responses. That's done with labels in your compose files. So as an example, if you needed to route www.your.domain.here to the rendering host, (and you'd added the right certificate above) then you'd be adding:

rendering:
    image: ${REGISTRY}${COMPOSE_PROJECT_NAME}-rendering:${VERSION:-latest}
    build:
      context: ./docker/build/rendering
      target: ${BUILD_CONFIGURATION}
      args:
        PARENT_IMAGE: ${REGISTRY}${COMPOSE_PROJECT_NAME}-nodejs:${VERSION:-latest}
    volumes:
      - .\src\sxastarter:C:\app
    environment:
      SITECORE_API_HOST: "http://cm"
      NEXTJS_DIST_DIR: ".next-container"
      PUBLIC_URL: "https://${RENDERING_HOST}"
      JSS_EDITING_SECRET: ${JSS_EDITING_SECRET}
      SITECORE_API_KEY: "${SITECORE_API_KEY_xmcloudpreview}"
      DISABLE_SSG_FETCH: ${DISABLE_SSG_FETCH}
    depends_on:
      - cm
      - nodejs
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.rendering-secure.entrypoints=websecure"
      # A router has a single rule, so both host names go into one expression.
      - "traefik.http.routers.rendering-secure.rule=Host(`${RENDERING_HOST}`) || Host(`www.your.domain.here`)"
      - "traefik.http.routers.rendering-secure.tls=true"

That makes sure Traefik knows what to do with the requests as they come in. It wasn't necessary in my particular situation, as that name was already in place, but your scenario might require that change.
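If you do end up routing more than one host name to the same container, bear in mind that a Traefik router takes one rule expression, with alternatives joined by `||`. A tiny Python sketch of building that expression (the helper name is hypothetical, and this was not needed for the fix described here):

```python
def host_rule(*hostnames: str) -> str:
    """Build a Traefik rule that matches any of the given host names."""
    if not hostnames:
        raise ValueError("at least one host name is required")
    return " || ".join(f"Host(`{name}`)" for name in hostnames)


print(host_rule("${RENDERING_HOST}", "www.your.domain.here"))
# Host(`${RENDERING_HOST}`) || Host(`www.your.domain.here`)
```

The result is the value that goes after `traefik.http.routers.rendering-secure.rule=` in the label.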

Once you've made these changes, you'll need to use docker compose to stop and start your containers, in order for the config to get reloaded.

With that done, I tried reloading my pages and got:

A browser showing the same page that was a white screen above, now with Experience Editor being displayed correctly.

Success! I've got it working! No more certificate errors in the network trace, and proper content on the screen.

In conclusion...

I've still not worked out how my lab machine got broken in the first place, so I've probably worked around entirely the wrong problem here. (I'm fairly sure there would have been a better solution.) But despite that I've got my lab instance working, and I've learned a new thing about how to configure Sitecore containers anyway. And maybe this might be helpful to some others...
