Jeremy Davis
Sitecore, C# and web development

Security fun with Azure Kubernetes Service

Published 21 December 2020

I've been working on a deployment of Sitecore using containers recently, and hit a scenario which isn't discussed much in the Microsoft documentation: How do you go about setting it all up if you can't use Active Directory accounts across your DevOps and Azure instances? Having done some digging, here's what I've learned so far:

My issue

I needed to run Sitecore in AKS, but there was a security boundary between DevOps (owned and run by my client) and my dev/QA Azure instance (owned and run by my company). Normally this would be pretty easy – in DevOps you can create a Service Connection for an instance of AKS and give it your AD credentials for Azure. But that doesn't work for me – for tedious technical reasons it is not possible for me to log in as my Azure account when I'm accessing the client's DevOps, so I can't create one of these easy authentication tokens.

(For the curious: I have a "username@mycompany.com" Azure AD account which is used for my Azure access within my company. But I also have the same username as a standard Microsoft account which is not tied to my company AD. The client used the standard Microsoft account to grant me access to their DevOps, because they don't have access to our company AD either. So whenever I fill in the Service Connection form to set up this link, DevOps only accepts my Microsoft account password – I cannot use the company AD account which would actually grant rights to the AKS instance. A silly issue, but a blocker nonetheless.)

[Image: The security boundary issue]

So how can I give DevOps permissions to push new deployments into my AKS instance?

K8s tokens

It turns out that Kubernetes has its own approach to security – and this is something I can use to solve my problem. Most of the docs that you come across for AKS assume that you'll be using Active Directory, but it turns out you can use role-based access control (referred to as RBAC in the docs) with internal accounts too. So you can create an account in Kubernetes, grant it some permissions and then give its authentication tokens to DevOps to use when it connects.

Sounds simple – but like many things here, it requires some thought.

Getting an account

Kubernetes has the concept of "service accounts" – users which can perform actions in the system but are expected to be used by other computers rather than people. Some documentation only discusses this inside your cluster, when pods need to communicate. However there is also some documentation that discusses how these can be used for external systems to perform actions in your cluster.

From the command line you can create a new service account with kubectl. So I started by creating one in my Sitecore deployment's namespace. However, despite a variety of experiments, I was unable to make that work correctly – it was easy to create the account, but AKS would always refuse to perform any actions under its credentials. After a lot of research I found one article that described putting the service account into the "kube-system" namespace instead. And that worked for me.

So the command I ended up creating the account with was:

kubectl -n kube-system create serviceaccount azure-pipeline


Once you have an account, you need to "bind" it to a role, telling AKS which rights the account should have. Kubernetes calls this a "role binding". Roles can be scoped either to a single namespace or to the entire cluster – and given the namespace issue above, a cluster-scoped role and binding seemed more appropriate.

Again, I tried creating my own role, with what I thought was the right set of permissions, but was unable to make it work. So I fell back to one of the built-in admin roles. While this isn't best security practice, it's good enough to make the whole thing work while I learn more.

So the binding of the service account to a built-in cluster role can be set up with:

kubectl create clusterrolebinding azure-pipeline-binding --clusterrole=cluster-admin --serviceaccount=kube-system:azure-pipeline


Note how the namespace is specified here by prefixing the service account name as <namespace>:<user>, rather than with the -n parameter.

(You can also do this from Yaml files, of course – but I'll stick to the command line here)
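
If you want a quick sanity check that the binding was created and points at the right account, describing it should list the cluster-admin role and the kube-system service account under its subjects:

kubectl describe clusterrolebinding azure-pipeline-binding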

Getting the credentials to use

When you click "Manage" for the AKS service connection in DevOps, you need to change the authentication method to "Service Account" to use this approach. That requires you to fill in two key bits of info:

[Image: The Service Account authentication form]

First is the server URL. As you can see from the image above, DevOps suggests a command to get this info. But when I paste that into my console, I get this:

[Image: Server error]

But if you remove the "jsonpath" there, it will give you the data you need:

[Image: Valid server output]

The server URL is highlighted here – and you can copy that into the DevOps field.
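
For the record, a jsonpath-free way to fish out the same value is just to print the minified config for the current context and read the URL from the cluster entry – something along these lines:

kubectl config view --minify

The server: line under the clusters section is the value to paste into DevOps.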

You then need the authentication data. Again, DevOps gives you some commands to run, and again these broke for me. So cue some more research... What I found I could do was query the service account by running

kubectl get serviceaccounts azure-pipeline -o custom-columns=":secrets[0].name" -n kube-system


to get the name of the token secret inside AKS – something like azure-pipeline-token-ajdsa. You can then pass that secret name to a second command:

kubectl get secret azure-pipeline-token-ajdsa -o json -n kube-system


to get a blob of JSON:

{
    "apiVersion": "v1",
    "data": {
        "ca.crt": "... redacted ...",
        "namespace": "a3ViZS1zeXN0ZW0=",
        "token": "... redacted ..."
    },
    "kind": "Secret",
    "metadata": {
        "annotations": {
            "kubernetes.io/service-account.name": "azure-pipeline",
            "kubernetes.io/service-account.uid": "37d737b1-2e91-42c9-8e29-0104956ade8d"
        },
        "creationTimestamp": "2020-12-04T09:56:03Z",
        "name": "azure-pipeline-token-ajdsa",
        "namespace": "kube-system",
        "resourceVersion": "11194",
        "selfLink": "/api/v1/namespaces/kube-system/secrets/azure-pipeline-token-ajdsa",
        "uid": "80ea29fc-26b4-46a5-bdaf-509aac101fba"
    },
    "type": "kubernetes.io/service-account-token"
}


And this is what you need to paste into the DevOps form. Once that's done and saved in your Service Connection, the DevOps kubectl step can use it to connect to your AKS instance.
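
If you want to sanity-check the token before handing it to DevOps, one option (a rough sketch, assuming a bash-like shell with curl available – the data fields in the secret are base64 encoded, so they need decoding) is to call the API server directly with it:

# extract and decode the bearer token from the secret
TOKEN=$(kubectl get secret azure-pipeline-token-ajdsa -n kube-system -o jsonpath="{.data.token}" | base64 --decode)
# <server-url> is the server URL from earlier; -k skips TLS verification
# (the ca.crt from the same secret could be used instead)
curl -k -H "Authorization: Bearer $TOKEN" <server-url>/api/v1/namespaces

If the role binding is correct, this should return JSON listing the cluster's namespaces rather than an authorisation error.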

Another thing that tripped me up...

Early on in this process I came across some documentation describing the `can-i` command for checking if a user can do an operation. In theory, you can use a command like:
kubectl auth can-i create deployment -n my-deployment --as azure-pipeline


and the response comes back as "yes" or "no" depending on whether that user has the appropriate rights or not.

But I cannot make this work. Following the pattern above I was able to create an account that worked correctly with DevOps. But no matter what query I make via the command line to test those rights, it always says "no".

This caused me a certain amount of delay when I was working out the steps above. I was trying to avoid running deployments – as they take a while. I assume there's some key thing I've missed here about how the can-i command works which explains this issue. But I've not worked it out yet...
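
One thing I've since wondered about, though I've not been back to verify it: when impersonating a service account, the --as parameter normally expects the full system:serviceaccount:<namespace>:<name> form rather than the bare account name, so the check may need to look more like:

kubectl auth can-i create deployment -n my-deployment --as system:serviceaccount:kube-system:azure-pipeline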

And there was one more thing to trip me up...

Before I started trying to get the DevOps release to work, I'd been testing some releases from the command line, and those had been working fine. So when I ran the first DevOps release that succeeded, I was surprised to find that, after DevOps declared it had finished applying the release, AKS was erroring because it was unable to pull the requisite images.

Stepping back and thinking about it, that did make some sense. My AKS instance sits alongside my ACR instance in my company's Azure subscription. That means when I ran command-line deploys as "me" they were running with the correct security tokens to access both AKS and ACR. But once DevOps takes over using its special token above, that token doesn't have access to my ACR...

Another pass through the documentation taught me another new thing: you can add the auth details for your ACR as a secret in your AKS instance:

kubectl create secret docker-registry my-company-acr --docker-server=myacr.azurecr.io --docker-username=myacr --docker-password=1234343+sdasdsdaKJKweq --docker-email=my-email@somewhere.com


(You need to enable the "admin account" option under "Access Keys" in your ACR to get a standard username / password to use here)
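
If you'd rather do that from a console than the portal, the Azure CLI can flip the same switch and then show the generated credentials (assuming you're logged in to the subscription that owns the registry – the registry name here is just a placeholder):

az acr update --name myacr --admin-enabled true
az acr credential show --name myacr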

And then you can tweak the yaml files for the pods you're deploying, by telling them the name of the secret you created above. It took me a couple of goes to get this in the right place in my files. I am developing a profound hatred of significant whitespace in config files 😉

Anyway, the right thing to set is:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: xdbrefdata
  labels:
    app: xdbrefdata
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xdbrefdata
  template:
    metadata:
      labels:
        app: xdbrefdata
    spec:
      nodeSelector:
        kubernetes.io/os: windows
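      # imagePullSecrets points at the ACR credentials secret created above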
      imagePullSecrets:
      - name: my-company-acr
      containers:
      - name: sitecore-xp1-xdbrefdata
        image: myacr.azurecr.io/mysite-xp1-xdbrefdata:latest
        ports:
        - containerPort: 80
        env:
        - name: Database_Server
          valueFrom:
            secretKeyRef:
              name: sitecore-database
              key: sitecore-databaseservername.txt
        livenessProbe:
          httpGet:
            path: /healthz/live
            port: 80
            httpHeaders:
            - name: X-Kubernetes-Probe
              value: Liveness
          timeoutSeconds: 300
          periodSeconds: 30
          failureThreshold: 3


And with that in place, my deployments work from both the command line and from DevOps.

I need to work out how to reduce the rights given to the DevOps deploy process, but at least it works now...
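
When I get to that, a starting point might be a custom cluster role with only the verbs and resource types the deployment yaml actually touches, bound in place of cluster-admin. Given that my earlier attempt at a custom role didn't work, treat this as an untested sketch – the role and binding names are made up, and it may well need extra resource types (ingresses, jobs and so on) before a real deploy succeeds:

kubectl create clusterrole pipeline-deployer --verb=get,list,watch,create,update,patch,delete --resource=deployments,services,pods,secrets,configmaps
kubectl create clusterrolebinding azure-pipeline-restricted-binding --clusterrole=pipeline-deployer --serviceaccount=kube-system:azure-pipeline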
