CloudFormation
The Getting Started with Karpenter guide uses CloudFormation to bootstrap the cluster to enable Karpenter to create and manage nodes, as well as to allow Karpenter to respond to interruption events.
This document describes the cloudformation.yaml file used in that guide.
These descriptions should allow you to understand:
- What Karpenter is authorized to do with your EKS cluster and AWS resources when using the cloudformation.yaml file
- What permissions you need to set up if you are adding Karpenter to an existing cluster
Overview
To download a particular version of cloudformation.yaml, set the version and use curl to pull the file to your local system:
export KARPENTER_VERSION=v0.32.10
curl https://raw.githubusercontent.com/aws/karpenter-provider-aws/"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > cloudformation.yaml
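The same download can be scripted. The following is a minimal Python sketch, assuming the URL layout of the curl command above stays the same for the version you request; the helper names template_url and download_template are hypothetical and not part of Karpenter.

```python
import urllib.request

# URL layout copied from the curl command above; only the version varies.
URL_TEMPLATE = (
    "https://raw.githubusercontent.com/aws/karpenter-provider-aws/"
    "{version}/website/content/en/preview/getting-started/"
    "getting-started-with-karpenter/cloudformation.yaml"
)


def template_url(version: str) -> str:
    """Return the raw GitHub URL of cloudformation.yaml for a version tag."""
    return URL_TEMPLATE.format(version=version)


def download_template(version: str, dest: str = "cloudformation.yaml") -> None:
    """Fetch the template to a local file (performs a network request)."""
    urllib.request.urlretrieve(template_url(version), dest)


print(template_url("v0.32.10"))
```

Calling download_template("v0.32.10") would write the file to the current directory, much like the curl redirect does.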
Following some header information, the rest of the cloudformation.yaml file describes the resources that CloudFormation deploys.
The sections of that file can be grouped together under the following general headings:
- Node Authorization: Creates a NodeInstanceProfile, attaches a NodeRole to it, and connects it to an IAM Identity Mapping used to authorize nodes to the cluster. This defines the permissions each node managed by Karpenter has to access EC2 and other AWS resources. This doesn’t actually create the IAM Identity Mapping. That part is orchestrated by eksctl in the Getting Started guide.
- Controller Authorization: Creates the KarpenterControllerPolicy that is attached to the service account. Again, the actual creation of the service account (karpenter), which is combined with the KarpenterControllerPolicy, is orchestrated by eksctl in the Getting Started guide.
- Interruption Handling: Allows the Karpenter controller to see and respond to interruptions that occur with the nodes that Karpenter is managing. See the Interruption section of the Disruption page for details.
A lot of the object naming that is done by cloudformation.yaml is based on the following:
- Cluster name: With a username of bob, the Getting Started Guide would name your cluster bob-karpenter-demo. That name is then substituted wherever ${ClusterName} appears in the names below.
- Partition: Any time an ARN is used, it includes the partition name to identify where the object is found. In most cases, that partition name is aws. However, it could also be aws-cn (for China Regions) or aws-us-gov (for AWS GovCloud US Regions).
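To make the naming rules concrete, here is a small Python sketch of how the cluster name and partition end up in resource names and ARNs. The function names are hypothetical, and the account ID and region used in the example call are placeholders rather than values from the guide.

```python
def node_role_name(cluster_name: str) -> str:
    """Mirror of RoleName: KarpenterNodeRole-${ClusterName}."""
    return f"KarpenterNodeRole-{cluster_name}"


def managed_policy_arn(partition: str, policy_name: str) -> str:
    """ARN of an AWS managed policy in the given partition."""
    return f"arn:{partition}:iam::aws:policy/{policy_name}"


def cluster_arn(partition: str, region: str, account_id: str, cluster_name: str) -> str:
    """ARN of the EKS cluster, as used by eks:DescribeCluster."""
    return f"arn:{partition}:eks:{region}:{account_id}:cluster/{cluster_name}"


print(node_role_name("bob-karpenter-demo"))
print(managed_policy_arn("aws-cn", "AmazonEKS_CNI_Policy"))
# Placeholder region and account ID, for illustration only.
print(cluster_arn("aws", "us-west-2", "111122223333", "bob-karpenter-demo"))
```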
Node Authorization
The following sections of the cloudformation.yaml file set up IAM permissions for Kubernetes nodes created by Karpenter.
In particular, this involves setting up a node role that can be attached and passed to instance profiles that Karpenter generates at runtime:
- KarpenterNodeRole
KarpenterNodeRole
This section of the template defines the IAM role attached to generated instance profiles.
Given a cluster name of bob-karpenter-demo, this role would end up being named KarpenterNodeRole-bob-karpenter-demo.
KarpenterNodeRole:
  Type: "AWS::IAM::Role"
  Properties:
    RoleName: !Sub "KarpenterNodeRole-${ClusterName}"
    Path: /
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service:
              !Sub "ec2.${AWS::URLSuffix}"
          Action:
            - "sts:AssumeRole"
    ManagedPolicyArns:
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy"
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonSSMManagedInstanceCore"
The role created here includes several AWS managed policies, which are designed to provide permissions for specific uses needed by the nodes to work with EC2 and other AWS resources. These include:
- AmazonEKS_CNI_Policy: Provides the permissions that the Amazon VPC CNI Plugin needs to configure EKS worker nodes.
- AmazonEKSWorkerNodePolicy: Lets Amazon EKS worker nodes connect to EKS Clusters.
- AmazonEC2ContainerRegistryReadOnly: Allows read-only access to repositories in Amazon Elastic Container Registry (Amazon ECR).
- AmazonSSMManagedInstanceCore: Adds AWS Systems Manager service core functions for Amazon EC2.
If you were to use a node role from an existing cluster, you could skip this provisioning step and pass this node role to any EC2NodeClasses that you create. Additionally, you would need to ensure that the Controller Policy has iam:PassRole permission for the role attached to the generated instance profiles.
Controller Authorization
This section sets the AWS permissions for the Karpenter Controller. When used in the Getting Started guide, eksctl uses these permissions to create a service account (karpenter) that is combined with the KarpenterControllerPolicy.
The resources defined in this section are associated with:
- KarpenterControllerPolicy
Because the scope of the KarpenterControllerPolicy is an AWS region, the cluster’s AWS region is included in the resource ARNs of statements such as AllowScopedEC2InstanceActions.
KarpenterControllerPolicy
A KarpenterControllerPolicy object sets the name of the policy, then defines a set of resources and actions allowed for those resources.
For our example, the KarpenterControllerPolicy would be named: KarpenterControllerPolicy-bob-karpenter-demo
KarpenterControllerPolicy:
  Type: AWS::IAM::ManagedPolicy
  Properties:
    ManagedPolicyName: !Sub "KarpenterControllerPolicy-${ClusterName}"
    # The PolicyDocument must be in JSON string format because we use a StringEquals condition that uses an interpolated
    # value in one of its key parameters which isn't natively supported by CloudFormation
    PolicyDocument: !Sub |
      {
        "Version": "2012-10-17",
        "Statement": [
Someone wanting to add Karpenter to an existing cluster, instead of using cloudformation.yaml, would need to create the IAM policy directly and assign that policy to the role leveraged by the service account using IRSA.
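If you create the policy yourself, the ${...} placeholders that CloudFormation would normally fill in have to be resolved by hand. The following Python sketch imitates that substitution on a shortened, hypothetical one-statement fragment of the policy and checks that the result is valid JSON. It is not how CloudFormation implements !Sub, and the region value is a placeholder.

```python
import json
import re

# Hypothetical one-statement fragment; the real policy has many more statements.
POLICY_TEMPLATE = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowScopedDeletion",
      "Effect": "Allow",
      "Resource": "arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
      "Action": "ec2:TerminateInstances",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned"
        }
      }
    }
  ]
}
"""


def render_policy(template: str, values: dict) -> dict:
    """Replace every ${Name} placeholder, then parse the result as JSON."""

    def lookup(match: re.Match) -> str:
        # Raises KeyError if the template uses a placeholder we did not supply.
        return values[match.group(1)]

    rendered = re.sub(r"\$\{([^}]+)\}", lookup, template)
    return json.loads(rendered)


policy = render_policy(
    POLICY_TEMPLATE,
    {
        "AWS::Partition": "aws",
        "AWS::Region": "us-west-2",  # placeholder region
        "ClusterName": "bob-karpenter-demo",
    },
)
condition = policy["Statement"][0]["Condition"]["StringEquals"]
print(sorted(condition))
```

Note that the cluster name lands inside a condition key, not a value. This is the interpolation that the comment in the template refers to when it explains why the PolicyDocument is written as a JSON string.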
AllowScopedEC2InstanceActions
The AllowScopedEC2InstanceActions statement ID (Sid) identifies a set of EC2 resources that are allowed to be accessed with RunInstances and CreateFleet actions.
For RunInstances and CreateFleet actions, the Karpenter controller can read (but not create) image, snapshot, spot-instances-request, security-group, subnet, and launch-template EC2 resources, scoped for the particular AWS partition and region.
{
"Sid": "AllowScopedEC2InstanceActions",
"Effect": "Allow",
"Resource": [
"arn:${AWS::Partition}:ec2:${AWS::Region}::image/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}::snapshot/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:spot-instances-request/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:security-group/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:subnet/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
],
"Action": [
"ec2:RunInstances",
"ec2:CreateFleet"
]
}
AllowScopedEC2InstanceActionsWithTags
The AllowScopedEC2InstanceActionsWithTags Sid allows the RunInstances, CreateFleet, and CreateLaunchTemplate actions requested by the Karpenter controller to create all fleet, instance, volume, network-interface, or launch-template EC2 resources (for the partition and region), and requires that the kubernetes.io/cluster/${ClusterName} tag be set to owned and a karpenter.sh/nodepool tag be set to any value. This ensures that Karpenter is only allowed to create instances for a single EKS cluster.
{
"Sid": "AllowScopedEC2InstanceActionsWithTags",
"Effect": "Allow",
"Resource": [
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:fleet/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:volume/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:network-interface/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
],
"Action": [
"ec2:RunInstances",
"ec2:CreateFleet",
"ec2:CreateLaunchTemplate"
],
"Condition": {
"StringEquals": {
"aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned"
},
"StringLike": {
"aws:RequestTag/karpenter.sh/nodepool": "*"
}
}
}
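The tag conditions in this statement can be read as a simple predicate over the tags sent with the request. The Python sketch below is a simplified model of that check, not the IAM policy evaluation engine; the function name is hypothetical.

```python
from fnmatch import fnmatchcase


def request_tags_allowed(request_tags: dict, cluster_name: str) -> bool:
    """Model the StringEquals and StringLike conditions on aws:RequestTag."""
    cluster_key = f"kubernetes.io/cluster/{cluster_name}"
    # StringEquals: the cluster tag must be exactly "owned".
    if request_tags.get(cluster_key) != "owned":
        return False
    # StringLike "*": the nodepool tag must be present, with any value.
    nodepool = request_tags.get("karpenter.sh/nodepool")
    return nodepool is not None and fnmatchcase(nodepool, "*")


tags = {
    "kubernetes.io/cluster/bob-karpenter-demo": "owned",
    "karpenter.sh/nodepool": "default",
}
print(request_tags_allowed(tags, "bob-karpenter-demo"))
```

A request tagged for a different cluster, or one missing the karpenter.sh/nodepool tag, fails the predicate. This is how the statement confines instance creation to a single cluster.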
AllowScopedResourceCreationTagging
The AllowScopedResourceCreationTagging Sid allows EC2 CreateTags actions on fleet, instance, volume, network-interface, and launch-template resources while making RunInstances, CreateFleet, or CreateLaunchTemplate calls. Additionally, this ensures that resources can’t be tagged arbitrarily by Karpenter after they are created.
{
"Sid": "AllowScopedResourceCreationTagging",
"Effect": "Allow",
"Resource": [
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:fleet/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:volume/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:network-interface/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
],
"Action": "ec2:CreateTags",
"Condition": {
"StringEquals": {
"aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned",
"ec2:CreateAction": [
"RunInstances",
"CreateFleet",
"CreateLaunchTemplate"
]
},
"StringLike": {
"aws:RequestTag/karpenter.sh/nodepool": "*"
}
}
}
AllowScopedResourceTagging
The AllowScopedResourceTagging Sid allows EC2 CreateTags actions on all instances created by Karpenter after their creation. It enforces that Karpenter is only able to update the tags on cluster instances it is operating on through the kubernetes.io/cluster/${ClusterName} and karpenter.sh/nodepool tags. The aws:TagKeys condition further limits the tag keys that can be set to karpenter.sh/nodeclaim and Name.
{
"Sid": "AllowScopedResourceTagging",
"Effect": "Allow",
"Resource": "arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
"Action": "ec2:CreateTags",
"Condition": {
"StringEquals": {
"aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned"
},
"StringLike": {
"aws:ResourceTag/karpenter.sh/nodepool": "*"
},
"ForAllValues:StringEquals": {
"aws:TagKeys": [
"karpenter.sh/nodeclaim",
"Name"
]
}
}
}
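The ForAllValues:StringEquals condition on aws:TagKeys behaves like a subset test: every tag key in the request has to be one of the listed keys. Here is a minimal Python sketch of that idea, again as a simplified model with a hypothetical function name:

```python
# Tag keys listed under aws:TagKeys in the statement above.
ALLOWED_TAG_KEYS = {"karpenter.sh/nodeclaim", "Name"}


def tag_keys_allowed(requested_keys: list) -> bool:
    """ForAllValues: every requested key must be in the allowed set."""
    return all(key in ALLOWED_TAG_KEYS for key in requested_keys)


print(tag_keys_allowed(["Name", "karpenter.sh/nodeclaim"]))
print(tag_keys_allowed(["Name", "team"]))
```

The second call includes a key outside the list, so the whole request is rejected even though Name on its own would be allowed.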
AllowScopedDeletion
The AllowScopedDeletion Sid allows TerminateInstances and DeleteLaunchTemplate actions to delete instance and launch-template resources, provided that karpenter.sh/nodepool and kubernetes.io/cluster/${ClusterName} tags are set. These tags must be present on all resources that Karpenter is going to delete. This ensures that Karpenter can only delete instances and launch templates that are associated with it.
{
"Sid": "AllowScopedDeletion",
"Effect": "Allow",
"Resource": [
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
"arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
],
"Action": [
"ec2:TerminateInstances",
"ec2:DeleteLaunchTemplate"
],
"Condition": {
"StringEquals": {
"aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned"
},
"StringLike": {
"aws:ResourceTag/karpenter.sh/nodepool": "*"
}
}
}
AllowRegionalReadActions
The AllowRegionalReadActions Sid allows DescribeAvailabilityZones, DescribeImages, DescribeInstances, DescribeInstanceTypeOfferings, DescribeInstanceTypes, DescribeLaunchTemplates, DescribeSecurityGroups, DescribeSpotPriceHistory, and DescribeSubnets actions for the current AWS region. This allows the Karpenter controller to do any of those read-only actions across all related resources for that AWS region.
{
"Sid": "AllowRegionalReadActions",
"Effect": "Allow",
"Resource": "*",
"Action": [
"ec2:DescribeAvailabilityZones",
"ec2:DescribeImages",
"ec2:DescribeInstances",
"ec2:DescribeInstanceTypeOfferings",
"ec2:DescribeInstanceTypes",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeSubnets"
],
"Condition": {
"StringEquals": {
"aws:RequestedRegion": "${AWS::Region}"
}
}
}
AllowSSMReadActions
The AllowSSMReadActions Sid allows the Karpenter controller to read SSM parameters (ssm:GetParameter) from the current region for SSM parameters generated by AWS services.
NOTE: If potentially sensitive information is stored in SSM parameters, you could consider restricting access to these parameters further.
{
"Sid": "AllowSSMReadActions",
"Effect": "Allow",
"Resource": "arn:${AWS::Partition}:ssm:${AWS::Region}::parameter/aws/service/*",
"Action": "ssm:GetParameter"
}
AllowPricingReadActions
Because pricing information does not exist in every region at the moment, the AllowPricingReadActions Sid allows the Karpenter controller to get product pricing information (pricing:GetProducts) for all related resources across all regions.
{
"Sid": "AllowPricingReadActions",
"Effect": "Allow",
"Resource": "*",
"Action": "pricing:GetProducts"
}
AllowInterruptionQueueActions
Karpenter supports interruption queues that you can create as described in the Interruption section of the Disruption page.
This section of the cloudformation.yaml template can give Karpenter permission to access those queues by specifying the resource ARN.
For the interruption queue you created (${KarpenterInterruptionQueue.Arn}), the AllowInterruptionQueueActions Sid gives the Karpenter controller permission to delete messages (DeleteMessage), get the queue URL (GetQueueUrl), and receive messages (ReceiveMessage).
{
"Sid": "AllowInterruptionQueueActions",
"Effect": "Allow",
"Resource": "${KarpenterInterruptionQueue.Arn}",
"Action": [
"sqs:DeleteMessage",
"sqs:GetQueueUrl",
"sqs:ReceiveMessage"
]
}
AllowPassingInstanceRole
The AllowPassingInstanceRole Sid gives the Karpenter controller permission to pass (iam:PassRole) the node role (KarpenterNodeRole-${ClusterName}) to generated instance profiles.
This gives EC2 explicit permission to use the KarpenterNodeRole-${ClusterName} when assigning permissions to generated instance profiles while launching nodes.
{
"Sid": "AllowPassingInstanceRole",
"Effect": "Allow",
"Resource": "${KarpenterNodeRole.Arn}",
"Action": "iam:PassRole",
"Condition": {
"StringEquals": {
"iam:PassedToService": "ec2.amazonaws.com"
}
}
}
AllowScopedInstanceProfileCreationActions
The AllowScopedInstanceProfileCreationActions Sid gives the Karpenter controller permission to create a new instance profile with iam:CreateInstanceProfile, provided that the request is made to a cluster with kubernetes.io/cluster/${ClusterName} set to owned and is made in the current region.
Also, karpenter.k8s.aws/ec2nodeclass must be set to some value. This ensures that Karpenter can generate instance profiles on your behalf based on roles specified in your EC2NodeClasses that you use to configure Karpenter.
{
"Sid": "AllowScopedInstanceProfileCreationActions",
"Effect": "Allow",
"Resource": "*",
"Action": [
"iam:CreateInstanceProfile"
],
"Condition": {
"StringEquals": {
"aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned",
"aws:RequestTag/topology.kubernetes.io/region": "${AWS::Region}"
},
"StringLike": {
"aws:RequestTag/karpenter.k8s.aws/ec2nodeclass": "*"
}
}
}
AllowScopedInstanceProfileTagActions
The AllowScopedInstanceProfileTagActions Sid gives the Karpenter controller permission to tag an instance profile with iam:TagInstanceProfile, based on the values shown below.
Also, karpenter.k8s.aws/ec2nodeclass must be set to some value. This ensures that Karpenter is only able to act on instance profiles that it provisions for this cluster.
{
"Sid": "AllowScopedInstanceProfileTagActions",
"Effect": "Allow",
"Resource": "*",
"Action": [
"iam:TagInstanceProfile"
],
"Condition": {
"StringEquals": {
"aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned",
"aws:ResourceTag/topology.kubernetes.io/region": "${AWS::Region}",
"aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned",
"aws:RequestTag/topology.kubernetes.io/region": "${AWS::Region}"
},
"StringLike": {
"aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass": "*",
"aws:RequestTag/karpenter.k8s.aws/ec2nodeclass": "*"
}
}
}
AllowScopedInstanceProfileActions
The AllowScopedInstanceProfileActions Sid gives the Karpenter controller permission to perform iam:AddRoleToInstanceProfile, iam:RemoveRoleFromInstanceProfile, and iam:DeleteInstanceProfile actions, provided that the request is made to a cluster with kubernetes.io/cluster/${ClusterName} set to owned and is made in the current region.
Also, karpenter.k8s.aws/ec2nodeclass must be set to some value. This permission is further enforced by the iam:PassRole permission. If Karpenter attempts to add a role to an instance profile that it doesn’t have iam:PassRole permission on, that call will fail. Therefore, if you configure Karpenter to use a new role through the EC2NodeClass, ensure that you also specify that role within your iam:PassRole permission.
{
"Sid": "AllowScopedInstanceProfileActions",
"Effect": "Allow",
"Resource": "*",
"Action": [
"iam:AddRoleToInstanceProfile",
"iam:RemoveRoleFromInstanceProfile",
"iam:DeleteInstanceProfile"
],
"Condition": {
"StringEquals": {
"aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned",
"aws:ResourceTag/topology.kubernetes.io/region": "${AWS::Region}"
},
"StringLike": {
"aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass": "*"
}
}
}
AllowInstanceProfileReadActions
The AllowInstanceProfileReadActions Sid gives the Karpenter controller permission to perform iam:GetInstanceProfile actions to retrieve information about a specified instance profile, including understanding if an instance profile has been provisioned for an EC2NodeClass or needs to be re-provisioned.
{
"Sid": "AllowInstanceProfileReadActions",
"Effect": "Allow",
"Resource": "*",
"Action": "iam:GetInstanceProfile"
}
AllowAPIServerEndpointDiscovery
You can optionally allow the Karpenter controller to discover the Kubernetes cluster’s external API endpoint to enable EC2 nodes to successfully join the EKS cluster.
Note: If you are not using an EKS control plane, you will have to specify this endpoint explicitly. See the description of the aws.clusterEndpoint setting in the ConfigMap documentation for details.
The AllowAPIServerEndpointDiscovery Sid allows the Karpenter controller to get that information (eks:DescribeCluster) for the cluster (cluster/${ClusterName}).
{
"Sid": "AllowAPIServerEndpointDiscovery",
"Effect": "Allow",
"Resource": "arn:${AWS::Partition}:eks:${AWS::Region}:${AWS::AccountId}:cluster/${ClusterName}",
"Action": "eks:DescribeCluster"
}
Interruption Handling
Settings in this section allow the Karpenter controller to stand up an interruption queue to receive notification messages from other AWS services about the health and status of instances. For example, this interruption queue allows Karpenter to be aware of Spot instance interruptions that are sent 2 minutes before Spot instances are reclaimed by EC2. Adding this queue allows Karpenter to be proactive in migrating workloads to new nodes. See the Interruption section of the Disruption page for details.
Defining the KarpenterInterruptionQueuePolicy allows Karpenter to see and respond to the following:
- AWS health events
- Spot interruptions
- Spot rebalance recommendations
- Instance state changes
The resources defined in this section include:
- KarpenterInterruptionQueue
- KarpenterInterruptionQueuePolicy
- ScheduledChangeRule
- SpotInterruptionRule
- RebalanceRule
- InstanceStateChangeRule
KarpenterInterruptionQueue
The AWS::SQS::Queue resource is used to create an Amazon SQS standard queue.
Properties of that resource set the QueueName to the name of your cluster, set the time for which SQS retains each message (MessageRetentionPeriod) to 300 seconds, and enable server-side encryption using SQS-owned encryption keys by setting SqsManagedSseEnabled to true.
See SetQueueAttributes for descriptions of some of these attributes.
KarpenterInterruptionQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: !Sub "${ClusterName}"
    MessageRetentionPeriod: 300
    SqsManagedSseEnabled: true
KarpenterInterruptionQueuePolicy
The Karpenter interruption queue policy is created to allow AWS services that we want to receive instance notifications from to push notification messages to the queue.
The AWS::SQS::QueuePolicy resource here applies EC2InterruptionPolicy to the KarpenterInterruptionQueue. The policy allows the events.amazonaws.com and sqs.amazonaws.com services to perform sqs:SendMessage actions on the queue. The statement’s Resource is the queue’s ARN, which the GetAtt function retrieves from KarpenterInterruptionQueue.Arn.
KarpenterInterruptionQueuePolicy:
  Type: AWS::SQS::QueuePolicy
  Properties:
    Queues:
      - !Ref KarpenterInterruptionQueue
    PolicyDocument:
      Id: EC2InterruptionPolicy
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - events.amazonaws.com
              - sqs.amazonaws.com
          Action: sqs:SendMessage
          Resource: !GetAtt KarpenterInterruptionQueue.Arn
Rules
This section allows Karpenter to gather AWS Health Events and direct them to a queue where they can be consumed by Karpenter. These rules include:
ScheduledChangeRule: The AWS::Events::Rule creates a rule where the EventPattern is set to send events from the aws.health source to KarpenterInterruptionQueue.
ScheduledChangeRule:
  Type: 'AWS::Events::Rule'
  Properties:
    EventPattern:
      source:
        - aws.health
      detail-type:
        - AWS Health Event
    Targets:
      - Id: KarpenterInterruptionQueueTarget
        Arn: !GetAtt KarpenterInterruptionQueue.Arn
SpotInterruptionRule: An EC2 Spot Instance Interruption warning tells you that AWS is about to reclaim a Spot instance you are using. This rule allows Karpenter to gather EC2 Spot Instance Interruption Warning events and direct them to a queue where they can be consumed by Karpenter. In particular, the AWS::Events::Rule here creates a rule where the EventPattern is set to send events from the aws.ec2 source to KarpenterInterruptionQueue.
SpotInterruptionRule:
  Type: 'AWS::Events::Rule'
  Properties:
    EventPattern:
      source:
        - aws.ec2
      detail-type:
        - EC2 Spot Instance Interruption Warning
    Targets:
      - Id: KarpenterInterruptionQueueTarget
        Arn: !GetAtt KarpenterInterruptionQueue.Arn
RebalanceRule: An EC2 Instance Rebalance Recommendation signal tells you that a Spot instance is at a heightened risk of being interrupted, allowing Karpenter to get new instances or simply rebalance workloads. This rule allows Karpenter to gather EC2 Instance Rebalance Recommendation signals and direct them to a queue where they can be consumed by Karpenter. In particular, the AWS::Events::Rule here creates a rule where the EventPattern is set to send events from the aws.ec2 source to KarpenterInterruptionQueue.
RebalanceRule:
  Type: 'AWS::Events::Rule'
  Properties:
    EventPattern:
      source:
        - aws.ec2
      detail-type:
        - EC2 Instance Rebalance Recommendation
    Targets:
      - Id: KarpenterInterruptionQueueTarget
        Arn: !GetAtt KarpenterInterruptionQueue.Arn
InstanceStateChangeRule: An EC2 Instance State-change Notification signal tells you that the state of an instance has changed to one of the following states: pending, running, stopping, stopped, shutting-down, or terminated. This rule allows Karpenter to gather EC2 Instance State-change signals and direct them to a queue where they can be consumed by Karpenter. In particular, the AWS::Events::Rule here creates a rule where the EventPattern is set to send events from the aws.ec2 source to KarpenterInterruptionQueue.
InstanceStateChangeRule:
  Type: 'AWS::Events::Rule'
  Properties:
    EventPattern:
      source:
        - aws.ec2
      detail-type:
        - EC2 Instance State-change Notification
    Targets:
      - Id: KarpenterInterruptionQueueTarget
        Arn: !GetAtt KarpenterInterruptionQueue.Arn
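Each of the four rules matches on just two fields of an incoming event: source and detail-type. The Python sketch below models that matching for the patterns listed above. It is a simplified illustration that assumes exact-value matching on those two fields only, not a reimplementation of EventBridge, and the sample event is hypothetical.

```python
# Event patterns copied from the four rules in this section.
RULE_PATTERNS = {
    "ScheduledChangeRule": {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
    },
    "SpotInterruptionRule": {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Spot Instance Interruption Warning"],
    },
    "RebalanceRule": {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance Rebalance Recommendation"],
    },
    "InstanceStateChangeRule": {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
    },
}


def matching_rules(event: dict) -> list:
    """Return the names of rules whose pattern fields all match the event."""
    matched = []
    for name, pattern in RULE_PATTERNS.items():
        # A field matches when the event's value is one of the listed values.
        if all(event.get(field) in allowed for field, allowed in pattern.items()):
            matched.append(name)
    return matched


# Hypothetical, trimmed-down event for illustration.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Spot Instance Interruption Warning",
}
print(matching_rules(sample_event))
```

An event that matches a rule is forwarded to the rule's target, which in every case here is the KarpenterInterruptionQueue.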