I've been working on a deployment of Sitecore using containers recently, and hit a scenario which isn't discussed much in the Microsoft documentation: How do you go about setting it all up if you can't use Active Directory accounts across your DevOps and Azure instances? Having done some digging, here's what I've learned so far:
I needed to run Sitecore in AKS, but there was a security boundary between DevOps (owned and run by my client) and my dev/QA Azure instance (owned and run by my company). Normally this would be pretty easy – in DevOps you can create a Service Connection for an instance of AKS and give it your AD credentials for Azure. But that doesn't work for me – for tedious technical reasons it is not possible for me to log in as my Azure account when I'm accessing the client's DevOps, so I can't create one of these easy authentication tokens.
(For the curious: I have a "username@mycompany.com" Azure AD account which is used for my Azure access at my company. But I also have the same username registered as a standard Microsoft account, which is not tied to my company's AD. The client used the standard Microsoft account to grant me access to their DevOps, because they don't have access to our company AD either. So whenever I try to fill in the Service Connection form to supply my credentials for this link, DevOps only accepts my Microsoft account password. I cannot use the company AD account which would actually grant rights to the AKS instance. A silly issue – but a blocker nonetheless.)
So how can I give DevOps permissions to push new deployments into my AKS instance?
It turns out that Kubernetes has its own approach to security – and this is something I can use to solve my problem. Most of the docs that you come across for AKS assume that you'll be using Active Directory, but it turns out you can use role-based access control (referred to as RBAC in the docs) with internal accounts too. So you can create an account in Kubernetes, grant it some permissions and then give its authentication tokens to DevOps to use when it connects.
Sounds simple – but like many things here, it requires some thought.
Kubernetes has the concept of "service accounts" – users which can perform actions in the system, but which are expected to be used by other computers rather than people. Some documentation only discusses these inside your cluster, for when pods need to communicate with each other. However, there is also documentation covering how they can be used by external systems to perform actions in your cluster.
From the command line you can create a new service account with kubectl. So I started by creating one in my Sitecore deployment's namespace. However, despite a variety of experiments here, I was unable to make that work correctly – it was easy to create the account, but AKS would always refuse to perform any actions under these credentials. After a lot of research, I found one article that described putting your service account into the "kube-system" namespace. And this worked for me.
So the command I created the user with ended up as:
kubectl -n kube-system create serviceaccount azure-pipeline
Once you have an account, you need to "bind" it to a role, telling AKS which rights the account should have. Kubernetes calls this a "role binding". And Kubernetes roles can be scoped either to your application's namespace or to the entire cluster. Given the kube-system issue above, a cluster-scoped role and binding seemed more appropriate.
Again, I tried creating my own role, with what I thought was the right set of permissions, but was unable to make it work. So I fell back to one of the built-in admin roles. While this isn't best security practice, it's good enough to make the whole thing work while I learn more.
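For reference, the sort of restricted role I was aiming for would look something like the Yaml below – a ClusterRole limited to the resources the pipeline actually touches. Note this is a sketch of the kind of thing I was attempting rather than something I've got working, so treat the permission list as a starting point only:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: azure-pipeline-deployer
rules:
  # Untested sketch: enough to manage Deployments plus the config they reference.
  # Your manifests will probably need other resources adding here.
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]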
So the role binding to a cluster role can be set up:
kubectl create clusterrolebinding azure-pipeline-binding --clusterrole=cluster-admin --serviceaccount=kube-system:azure-pipeline
Note how the namespace here is specified by prefixing the user as <namespace>:<user>, rather than with the -n parameter.
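If you want to sanity-check what got created, you can describe the binding and confirm the service account shows up under its subjects:

kubectl describe clusterrolebinding azure-pipeline-binding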
(You can also do this from Yaml files, of course – but I'll stick to the command line here)
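For completeness, the Yaml equivalent of those two commands would look roughly like this – same names as above, applied with kubectl apply -f:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: azure-pipeline
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: azure-pipeline-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: azure-pipeline
    namespace: kube-system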
When you click "Manage" for the AKS service connection in DevOps, you need to change the authentication method to "Service Account" to use this approach. That requires you to fill in two key bits of info:
First is the server URL. The DevOps dialog suggests a kubectl command to get this info, but when I paste that into my console it just errors. If you remove the "jsonpath" filter from that suggested command, though, it will give you the data you need – the server URL is in the output, and you can copy that into the DevOps field.
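(For what it's worth, the command in question is a variant of kubectl config view with a jsonpath filter on the end, so the cut-down version is something like the below – the "server:" line in its output is the URL you want:)

kubectl config view --minify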
You then need the authentication data. Again, DevOps gives you some commands to run, and again these broke for me. So cue some more research... What I found I could do was look up the service account by running
kubectl get serviceaccounts azure-pipeline -o custom-columns=":secrets[0].name" -n kube-system
to get the name of the token inside AKS – something like
azure-pipeline-token-ajdsa
And then you can pass the token name you get to a second command:
kubectl get secret azure-pipeline-token-ajdsa -o json -n kube-system
to get a blob of json:
{ "apiVersion": "v1", "data": { "ca.crt": "... redacted ...", "namespace": "a3ViZS1zeXN0ZW0=", "token": "... redacted ..." }, "kind": "Secret", "metadata": { "annotations": { "kubernetes.io/service-account.name": "azure-pipeline", "kubernetes.io/service-account.uid": "37d737b1-2e91-42c9-8e29-0104956ade8d" }, "creationTimestamp": "2020-12-04T09:56:03Z", "name": "azure-pipeline-token-ajdsa", "namespace": "kube-system", "resourceVersion": "11194", "selfLink": "/api/v1/namespaces/kube-system/secrets/azure-pipeline-token-ajdsa", "uid": "80ea29fc-26b4-46a5-bdaf-509aac101fba" }, "type": "kubernetes.io/service-account-token" }
And this is what you need to paste into the DevOps form. Once that's done and saved in your Service Connection, the DevOps kubectl step can use it to connect to your AKS instance.
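As an aside, the pipeline step that consumes this connection ends up fairly simple. A rough sketch using the classic Kubectl task is below – note that "My AKS Connection", the "sitecore" namespace and the k8s/ folder are made-up names for illustration, and newer pipelines may prefer the Kubernetes manifest tasks instead:

steps:
  - task: Kubernetes@1
    displayName: Apply Sitecore manifests
    inputs:
      connectionType: Kubernetes Service Connection
      kubernetesServiceEndpoint: My AKS Connection  # the Service Connection set up above
      namespace: sitecore                           # illustrative namespace name
      command: apply
      arguments: -f k8s/                            # illustrative path to the Yaml files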
Early on in this process I came across some documentation describing the can-i command for checking if a user can do an operation. In theory, you can use a command like:
kubectl auth can-i create deployment -n my-deployment --as azure-pipeline
and the response comes back as "yes" or "no" depending on whether that user has the appropriate rights or not.
But I cannot make this work. Following the pattern above I was able to create an account that worked correctly with DevOps. But no matter what query I make via the command line to test those rights, it always says "no".
This caused me a certain amount of delay when I was working out the steps above. I was trying to avoid running deployments – as they take a while. I assume there's some key thing I've missed here about how the can-i command works which explains this issue. But I've not worked it out yet...
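(It may be relevant that Kubernetes identifies service accounts internally as system:serviceaccount:<namespace>:<name>, and that --as with a bare name impersonates an ordinary user instead – so perhaps the check needs to look more like the command below, though I've not confirmed that this is the answer:)

kubectl auth can-i create deployment -n my-deployment --as system:serviceaccount:kube-system:azure-pipeline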
Before I started trying to get the DevOps release to work, I'd been testing some releases from the command line, and those had been working fine. So when I ran the first DevOps release that succeeded, I was surprised to find that, after DevOps declared it had finished applying the release, AKS was erroring because it was unable to pull the requisite images.
Stepping back and thinking about it, that did make some sense. My AKS instance sits alongside my ACR instance in my Azure subscription. That means when I ran command-line deploys as "me", they were running with the correct security tokens to access both AKS and ACR. But once DevOps takes over using the special token above, it doesn't have access to my ACR...
Another pass through the documentation teaches me another new thing. You can add the auth details for your ACR as a secret in your AKS instance:
kubectl create secret docker-registry my-company-acr --docker-server=myacr.azurecr.io --docker-username=myacr --docker-password=1234343+sdasdsdaKJKweq --docker-email=my-email@somewhere.com
(You need to enable the "admin account" option under "Access Keys" in your ACR to get a standard username / password to use here)
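(If you'd rather fetch those credentials from the command line than the portal, the Azure CLI can do it – assuming the admin account is enabled and you're logged in to the right subscription:)

az acr credential show --name myacr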
And then you can tweak the yaml files for the pods you're deploying, by telling them the name of the secret you created above. It took me a couple of goes to get this in the right place in my files. I am developing a profound hatred of significant whitespace in config files 😉
Anyway, the right thing to set is:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xdbrefdata
  labels:
    app: xdbrefdata
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xdbrefdata
  template:
    metadata:
      labels:
        app: xdbrefdata
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      imagePullSecrets:
        - name: my-company-acr
      containers:
        - name: sitecore-xp1-xdbrefdata
          image: myacr.azurecr.io/mysite-xp1-xdbrefdata:latest
          ports:
            - containerPort: 80
          env:
            - name: Database_Server
              valueFrom:
                secretKeyRef:
                  name: sitecore-database
                  key: sitecore-databaseservername.txt
          livenessProbe:
            httpGet:
              path: /healthz/live
              port: 80
              httpHeaders:
                - name: X-Kubernetes-Probe
                  value: Liveness
            timeoutSeconds: 300
            periodSeconds: 30
            failureThreshold: 3
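(For the record, deploying that from the command line is just an ordinary apply – the file name and namespace here are placeholders for whatever you've called yours:)

kubectl apply -f xdbrefdata.yaml -n my-deployment
kubectl get pods -n my-deployment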
And with that in place, my deployments work from both the command line and from DevOps.
I need to work out how to reduce the rights given to the DevOps deploy process, but at least it works now...