How To Write KMS Policies That Don't Suck

No one really wants to be the one to admit it, but most people don't really understand KMS Policies, and they write KMS Policies that kind of suck. IAM in AWS is confusing enough as is, and it gets even worse when resource policies are taken into account. Most people seem to do okay with S3 Bucket Policies, but since KMS Policies work a little differently than most resource policies, engineers tend to struggle with them a lot more.

In this post, we're going to be touching on some common mistakes with KMS Policies, and provide some insight on how to write policies that don't suck. At least not as much.

How KMS Policies Differ

Before getting into what a good KMS Policy looks like, it helps to have some context on how KMS Policies differ from other resource policies in AWS, like S3 Bucket Policies. You can deploy an S3 Bucket with no policy at all, and it will work fine. As long as principals in the account have the right IAM permissions, they'll be able to access the Bucket.

KMS is not like this. Given the sensitivity of encryption keys, KMS takes the opposite approach. It doesn't matter what IAM permissions you have; if the KMS Policy doesn't explicitly (through delegation to the account or a specific principal) allow use of the KMS CMK, you can't use it. No matter what. When access is delegated to the account, the actual permissions that a principal has for a KMS CMK will be the intersection of the KMS Policy and IAM Policies, in addition to any relevant SCPs and Permission Boundaries.
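The intersection idea can be sketched in a few lines of Python. This is a deliberately simplified model, not how AWS evaluates requests: it assumes the Key Policy delegates to the account, and it ignores SCPs, Permission Boundaries, grants, conditions, and explicit denies. The function names and the example action lists are made up for illustration.

```python
from fnmatch import fnmatchcase


def allows(granted_actions, action):
    """Return True if any granted pattern (e.g. "kms:*") matches the action.

    Matching is lowercased on both sides because IAM action names are
    case-insensitive.
    """
    return any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in granted_actions
    )


def effective_permission(key_policy_actions, iam_actions, action):
    """Simplified model of the delegated case.

    The action is allowed only if BOTH the Key Policy (delegating to the
    account) and the principal's IAM policy allow it.
    """
    return allows(key_policy_actions, action) and allows(iam_actions, action)


# Hypothetical example: IAM grants everything, but the Key Policy only
# delegates describe/list style actions to the account.
key_policy_actions = ["kms:Describe*", "kms:List*"]
iam_actions = ["kms:*"]

can_describe = effective_permission(key_policy_actions, iam_actions, "kms:DescribeKey")
can_decrypt = effective_permission(key_policy_actions, iam_actions, "kms:Decrypt")
print(can_describe, can_decrypt)
```

The point of the model is that kms:* in IAM buys you nothing for actions the Key Policy never delegated.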

Delegating Permissions to IAM/Root

When you create a new KMS CMK through the console, the default policy applied to it delegates all KMS permissions to the AWS Account the KMS CMK was created in. Note that while it says root, it's not actually referring to the root user, but to the AWS Account itself. See AWS Account Principals for more info. That Policy looks like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:root"
            },
            "Action": "kms:*",
            "Resource": "*"
        }
    ]
}

This is one of the most fundamentally misunderstood parts of KMS Policies, for whatever reason. This statement allows any Principal in this AWS Account to perform any action with or on this KMS CMK, as long as they have the corresponding permissions from IAM. In this state, this KMS CMK is fully open within the account, with no additional protections offered by the KMS Policy.

Depending on your needs, this could be fine. Maybe it's a generic KMS CMK in a small AWS Account, and no protections are needed beyond IAM. It's obviously not perfect, but I understand.

If One Statement is Fine, More is Better, Right?

This is generally where things go off the rails. Luckily, most mistakes with KMS Policies don't make the KMS CMK less secure than the above policy. The AWS Console sets people up for failure here when it gives you the option to start adding Key Administrators and Key Users. Without really understanding what's going on, you'd think that you need to add the IAM Roles you want to be able to use or administer the CMK. After all, it's right there.

So you add your admin IAM Role to the administrators section, maybe a few IAM Roles to the user section so they can spin up EC2 Instances and encrypt objects and such, and you end up with something like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:root"
            },
            "Action": "kms:*",
            "Resource": "*"
        },
        {
            "Sid": "Allow access for Key Administrators",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:role/Team-Admin-Role"
            },
            "Action": [
                "kms:Create*",
                "kms:Describe*",
                "kms:Enable*",
                "kms:List*",
                "kms:Put*",
                "kms:Update*",
                "kms:Revoke*",
                "kms:Disable*",
                "kms:Get*",
                "kms:Delete*",
                "kms:TagResource",
                "kms:UntagResource",
                "kms:ScheduleKeyDeletion",
                "kms:CancelKeyDeletion"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:role/Team-Power-User-Role"
            },
            "Action": [
                "kms:Encrypt",
                "kms:Decrypt",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*",
                "kms:DescribeKey"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Allow attachment of persistent resources",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:role/Team-Power-User-Role"
            },
            "Action": [
                "kms:CreateGrant",
                "kms:ListGrants",
                "kms:RevokeGrant"
            ],
            "Resource": "*",
            "Condition": {
                "Bool": {
                    "kms:GrantIsForAWSResource": "true"
                }
            }
        }
    ]
}

AWS added a bunch of other permissions, and we see certain administrative permissions delegated to our Admin IAM Role, and permissions that allow use of the KMS CMK to our Power User Role. This is least privilege, this is good, right?

The issue is that most of the time, the IAM Roles specified in these additional sections also have KMS permissions in IAM. In that situation, these extra statements don't change the actual permissions around the KMS CMK at all because ALL actions are already delegated to IAM in this Account from the very first statement, referring to root. We've basically just stacked allow statements on top of each other, even though the first one is effectively Allow: *.

The only thing this actually accomplishes is making the KMS Policy harder to decipher for people who aren't as familiar with the way these work, and providing the appearance of least privilege without actually restricting anything. This is bad. Having a KMS CMK that is obviously open is better than one that is the same level of open but has the appearance of being restricted, because it can trick people into thinking things are more secure than they are.
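If you want to spot this pattern across a pile of keys, a rough check like the following Python sketch can help. It is an illustration only: the helper names are hypothetical, the policy is a trimmed-down version of the one above, and "shadowed" here just means "sits next to an unconditional kms:* delegation to the account". A flagged statement still matters for a principal that has no KMS permissions in IAM, so treat the output as a prompt to look closer, not a verdict.

```python
def as_list(value):
    """Policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_account_root(principal):
    """True if the principal includes an account ARN ending in :root."""
    if not isinstance(principal, dict):
        return False
    return any(arn.endswith(":root") for arn in as_list(principal.get("AWS", [])))


def delegates_everything(stmt):
    """True for an unconditional Allow of kms:* to the account."""
    return (
        stmt.get("Effect") == "Allow"
        and is_account_root(stmt.get("Principal"))
        and "Condition" not in stmt
        and any(a in ("kms:*", "*") for a in as_list(stmt.get("Action", [])))
    )


def shadowed_allow_sids(policy):
    """Return Sids of Allow statements that sit alongside a full delegation."""
    statements = as_list(policy.get("Statement", []))
    if not any(delegates_everything(s) for s in statements):
        return []
    return [
        s.get("Sid", "<no Sid>")
        for s in statements
        if s.get("Effect") == "Allow" and not delegates_everything(s)
    ]


# Trimmed-down version of the console-generated policy shown above.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789123:root"},
            "Action": "kms:*",
            "Resource": "*",
        },
        {
            "Sid": "Allow access for Key Administrators",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789123:role/Team-Admin-Role"},
            "Action": ["kms:Describe*", "kms:Put*", "kms:ScheduleKeyDeletion"],
            "Resource": "*",
        },
        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789123:role/Team-Power-User-Role"},
            "Action": ["kms:Decrypt", "kms:Encrypt"],
            "Resource": "*",
        },
    ],
}

shadowed = shadowed_allow_sids(policy)
print(shadowed)
```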

Writing a Better KMS Policy That Sucks Less

The absolute most secure KMS Policy would be one that doesn't delegate anything to root/IAM, and only allows the specific actions desired for specific principals. Honestly, I don't think this is reasonable to expect in most situations. If you aren't delegating management permissions to IAM, there is a much higher risk of people locking themselves out of the CMK, and recovering from that generally means contacting AWS Support to regain access.

We're going to look at two situations real quick to discuss what better KMS Policies can look like. First we'll touch on a KMS Key that is only used by applications/services, then we'll look at one where people are more involved.

A quick note about admin'ing the CMK: The policies that follow will still be delegating admin permissions to IAM, which would allow anyone in the Account with the proper IAM permissions to do things like change the Key Policy. In a perfect world, permissions like kms:PutKeyPolicy would only be delegated to IAM Roles used for deployment, since you have proper pipelines/deployment methods for resources, right? Right?

These Key Policies suck less, but they still aren't perfect, since perfect requires a whole lot else to be going on around IAM, deployment processes, etc. This article is intended to provide some insight on what you can do now in your environment without expecting absolute perfection.

KMS Key for App/Service

For the first situation, imagine we're building parts for our application, and we're setting up a KMS CMK for it. Maybe we're building something like a Lambda pulling from/sending to some SQS Queues. In this case, we need admins able to manage the KMS CMK, but not use it, and we need the Lambda Role to be able to use the key, but it has no reason to modify the CMK.

To do this, we're going to look at changing what is delegated to IAM, not reforming the Key Admin section. As stated earlier, if everything is delegated to IAM, all the following allow statements are pointless. What we want is to delegate only KMS Management permissions to IAM for this CMK. Hopefully we don't have a bunch of random non-admin principals in the account with kms:*. If you do, there are other issues going on that need to be addressed.

Then we're going to delegate the specific KMS actions required to the IAM Role used by our Lambda. The KMS Policy ends up looking something like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions to Admin Key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:root"
            },
            "Action": [
                "kms:CancelKeyDeletion",
                "kms:CreateAlias",
                "kms:Delete*",
                "kms:Describe*",
                "kms:Disable*",
                "kms:Enable*",
                "kms:Get*",
                "kms:List*",
                "kms:PutKeyPolicy",
                "kms:Revoke*",
                "kms:ScheduleKeyDeletion",
                "kms:TagResource",
                "kms:UntagResource",
                "kms:Update*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:role/MyLambdaRole"
            },
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKey*"
            ],
            "Resource": "*"
        }
    ]
}

Reference for KMS Permissions required for SQS consumers/producers

Quick side note: if you care about your sanity at all, put policy actions in alphabetical order, and don't glob when there is only one possible action. That goes for IAM as well as resource policies. The AWS console doesn't do those things for whatever reason when it generates the policies shown earlier.
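If your policies are generated or templated, the alphabetical part is easy to automate. Here's a tiny Python sketch; normalize_actions is a made-up helper name, and it only sorts and de-duplicates, it doesn't try to detect unnecessary globs.

```python
def normalize_actions(actions):
    """Return the actions de-duplicated and sorted case-insensitively."""
    if isinstance(actions, str):
        actions = [actions]
    # Secondary sort on the raw string keeps the order deterministic when
    # two entries differ only by case.
    return sorted(set(actions), key=lambda a: (a.lower(), a))


# The "Allow use of the key" actions in the order the console emits them.
console_order = [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey",
]

tidy = normalize_actions(console_order)
print(tidy)
```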

With this KMS Policy, the only principal who can actually use the KMS CMK to encrypt or decrypt data is the Lambda IAM Role. Even IAM Roles/Users with full admin cannot, though they can modify the Key Policy as mentioned earlier. Obviously it's good that only the principal that should be accessing the data can access the data, but there's another benefit here that's often overlooked. You can leverage KMS Policies to restrict other actions indirectly when your IAM situation is less than great. Even if every single IAM Role in your environment has admin (please don't), they can't access Secrets, secure SSM Parameters, encrypted Lambda environment variables, etc., if they aren't specified in the use section of the KMS Policy.
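To make that concrete, here's a small Python sketch that checks what the Key Policy side permits for an admin Role versus the Lambda Role. It's a simplified model under stated assumptions: the policy is a trimmed-down version of the one above, the helper name is hypothetical, conditions and explicit denies are ignored, and a match through the account (root) statement only means the Key Policy permits it, so the principal still needs the matching IAM permission.

```python
from fnmatch import fnmatchcase

ACCOUNT_ROOT = "arn:aws:iam::123456789123:root"
ADMIN_ROLE = "arn:aws:iam::123456789123:role/Team-Admin-Role"
LAMBDA_ROLE = "arn:aws:iam::123456789123:role/MyLambdaRole"

# Trimmed-down version of the Key Policy above (fewer admin actions).
key_policy = {
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": ACCOUNT_ROOT},
            "Action": ["kms:Describe*", "kms:PutKeyPolicy", "kms:ScheduleKeyDeletion"],
        },
        {
            "Effect": "Allow",
            "Principal": {"AWS": LAMBDA_ROLE},
            "Action": ["kms:Decrypt", "kms:GenerateDataKey*"],
        },
    ]
}


def as_list(value):
    """Policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def key_policy_permits(policy, principal_arn, action):
    """True if some Allow statement covers this principal and action.

    A statement covers the principal if it names the principal directly or
    names the account (root). The account case is delegation, so IAM must
    also allow the action for the request to succeed.
    """
    for stmt in policy["Statement"]:
        principals = as_list(stmt["Principal"]["AWS"])
        principal_matches = principal_arn in principals or ACCOUNT_ROOT in principals
        action_matches = any(fnmatchcase(action, p) for p in as_list(stmt["Action"]))
        if stmt["Effect"] == "Allow" and principal_matches and action_matches:
            return True
    return False


# Even assuming the admin Role has kms:* in IAM, the Key Policy never
# delegates Decrypt to the account, so the admin can't decrypt.
admin_can_decrypt = key_policy_permits(key_policy, ADMIN_ROLE, "kms:Decrypt")
admin_can_put_policy = key_policy_permits(key_policy, ADMIN_ROLE, "kms:PutKeyPolicy")
lambda_can_decrypt = key_policy_permits(key_policy, LAMBDA_ROLE, "kms:Decrypt")
print(admin_can_decrypt, admin_can_put_policy, lambda_can_decrypt)
```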

KMS CMKs Indirectly Used by People

A less clear situation is when people are involved. It's easy enough to just provide permissions to a single Lambda IAM Role as seen above, because you know what SQS Queue is involved, and you know what IAM Role is touching it. If we start looking at something like EBS encryption, it's not quite as easy. We may have a ton of EC2 Instances used by different teams that have encrypted volumes. Permissions to use the KMS CMK are required for starting an EC2 Instance, so while the person doesn't actually need to use the CMK to access data directly, they still need the ability to use the CMK.

It's more difficult to start adding a ton of principals to the KMS Policy, and we still don't actually need (and therefore don't want) to grant the ability to use the KMS CMK directly. They just need to start an Instance, not directly decrypt data. And in general, starting an EC2 Instance isn't that sensitive of an action, so this may be a situation where we don't really care who in the AWS Account can do it.

We can achieve this goal through proper utilization of conditions. This KMS Policy starts the same as the previous one, where we're delegating administration permissions to IAM. We then have additional statements that delegate more permissions to IAM if certain conditions are met. Remember that everything in IAM is default deny. Here's the CMK Policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Allow Management of Key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:root"
            },
            "Action": [
                "kms:CancelKeyDeletion",
                "kms:CreateAlias",
                "kms:Delete*",
                "kms:Describe*",
                "kms:Disable*",
                "kms:Enable*",
                "kms:Get*",
                "kms:List*",
                "kms:PutKeyPolicy",
                "kms:Revoke*",
                "kms:ScheduleKeyDeletion",
                "kms:TagResource",
                "kms:UntagResource",
                "kms:Update*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:root"
            },
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt",
                "kms:GenerateDataKey*",
                "kms:ReEncrypt*"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "kms:ViaService": "ec2.us-east-1.amazonaws.com"
                }
            }
        },
        {
            "Sid": "Allow attachment of persistent resources",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789123:root"
            },
            "Action": [
                "kms:CreateGrant",
                "kms:ListGrants",
                "kms:RevokeGrant"
            ],
            "Resource": "*",
            "Condition": {
                "Bool": {
                    "kms:GrantIsForAWSResource": "true"
                },
                "StringEquals": {
                    "kms:ViaService": "ec2.us-east-1.amazonaws.com"
                }
            }
        }
    ]
}

Reference for KMS permissions for EBS encryption

In this Key Policy, we're delegating use permissions like kms:CreateGrant and kms:Encrypt to IAM, but only under certain conditions. Specifically, we're mostly focused on:

"Condition": {
    "StringEquals": {
        "kms:ViaService": "ec2.us-east-1.amazonaws.com"
    }
}

We're only allowing principals in this AWS Account to perform these use actions if the request is coming through the EC2 service. Using conditions in this way allows anyone with ec2:StartInstances and the proper KMS permissions in IAM to start Instances, while still preventing them from using the KMS CMK directly. Crafting Key Policies like this allows us to get a step closer to actual least privilege, only allowing actions if they are being done in the right context.
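Here's a rough Python sketch of how those conditions gate a request. It's a toy evaluator, not AWS's implementation: it only understands the StringEquals and Bool operators used above, the function name is made up, and the request contexts are hypothetical stand-ins for what KMS would see. The key detail is that a direct call to KMS carries no kms:ViaService value at all, so the StringEquals check fails.

```python
def condition_met(condition, context):
    """Evaluate a small subset of policy condition operators.

    Supports StringEquals and Bool only. Every operator and every key must
    match (logical AND); a key missing from the request context fails.
    """
    for operator, expected in condition.items():
        if operator not in ("StringEquals", "Bool"):
            raise ValueError(f"unsupported operator in this sketch: {operator}")
        for key, value in expected.items():
            allowed = value if isinstance(value, list) else [value]
            actual = context.get(key)
            if actual is None:
                return False
            if operator == "Bool":
                matched = str(actual).lower() in [str(v).lower() for v in allowed]
            else:
                matched = actual in allowed
            if not matched:
                return False
    return True


# Condition block from the "Allow attachment of persistent resources" statement.
grant_condition = {
    "Bool": {"kms:GrantIsForAWSResource": "true"},
    "StringEquals": {"kms:ViaService": "ec2.us-east-1.amazonaws.com"},
}

# Hypothetical request contexts.
via_ec2 = {
    "kms:ViaService": "ec2.us-east-1.amazonaws.com",
    "kms:GrantIsForAWSResource": "true",
}
direct_call = {"kms:GrantIsForAWSResource": "false"}  # no kms:ViaService key
via_other_service = {
    "kms:ViaService": "s3.us-east-1.amazonaws.com",
    "kms:GrantIsForAWSResource": "true",
}

print(condition_met(grant_condition, via_ec2))
print(condition_met(grant_condition, direct_call))
print(condition_met(grant_condition, via_other_service))
```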

Conclusion

In this post, I went over some methods of crafting KMS Policies that suck a little less. There's still a ton more that can be discussed, around utilization of explicit denies, and other KMS conditions like kms:RequestAlias, but that's a topic for another post.