karpenter-provider-aws: ERROR controller.provisioning Provisioning failed, launching node, creating cloud provider instance, with fleet error(s), UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: XZX0joS
Version
Karpenter Version: v0.16.1
kubectl version: client (1.25) and server (1.22)
Kubernetes Version: v1.22
Expected Behavior
Karpenter is active and ready to begin provisioning nodes. Create some pods using a deployment, and watch Karpenter provision nodes in response.
Actual Behavior
Created some pods using a deployment; Karpenter failed to provision nodes.
INFO controller.provisioning Launching node with 5 pods requesting {"cpu":"5125m","pods":"7"} from types inf1.2xlarge, t3a.2xlarge, c5d.2xlarge, t3.2xlarge, m5.2xlarge and 308 other(s) {"commit": "b157d45", "provisioner": "default"}
ERROR controller.provisioning Provisioning failed, launching node, creating cloud provider instance, with fleet error(s), UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: XZX0joSxj6TJ98
Steps to Reproduce the Problem
Terraform code to provision an EKS cluster with the Karpenter IRSA role and instance profile.
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.14.4"
name = "vpc-${local.cluster_name}"
cidr = var.cidr
azs = data.aws_availability_zones.available.names
private_subnets = var.private_subnets
public_subnets = var.public_subnets
elasticache_subnets = var.elasticache_subnets
enable_nat_gateway = true
single_nat_gateway = true
one_nat_gateway_per_az = false
enable_dns_hostnames = true
enable_dns_support = true
# VPC Flow Logs (Cloudwatch log group and IAM role will be created)
enable_flow_log = true
create_flow_log_cloudwatch_log_group = true
create_flow_log_cloudwatch_iam_role = true
flow_log_max_aggregation_interval = 60
public_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
"karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name # for Karpenter auto-discovery
}
private_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
tags = local.tags
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "18.29.0"
cluster_name = local.cluster_name
cluster_version = "1.22"
cluster_endpoint_private_access = true
cluster_endpoint_public_access = true
vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id
subnet_ids = data.terraform_remote_state.vpc.outputs.public_subnets
cluster_enabled_log_types = var.log_types
manage_aws_auth_configmap = true
aws_auth_roles = var.aws_auth_roles
aws_auth_users = var.aws_auth_users
aws_auth_accounts = var.aws_auth_accounts
#Required for Karpenter role below
enable_irsa = true
create_cloudwatch_log_group = false
cloudwatch_log_group_retention_in_days = 3
node_security_group_additional_rules = {
ingress_nodes_karpenter_port = {
description = "Cluster API to Node group for Karpenter webhook"
protocol = "tcp"
from_port = 8443
to_port = 8443
type = "ingress"
source_cluster_security_group = true
}
}
node_security_group_tags = {
# NOTE - if creating multiple security groups with this module, only tag the
# security group that Karpenter should utilize with the following tag
# (i.e. - at most, only one security group should have this tag in your account)
"karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name
}
# Only need one node to get Karpenter up and running.
# This ensures core services such as VPC CNI, CoreDNS, etc. are up and running
# so that Karpenter can be deployed and start managing compute capacity as required
eks_managed_node_groups = {
"${local.cluster_name}" = {
#attach_cluster_primary_security_group = true
capacity_type = "ON_DEMAND"
instance_types = ["m5.large"]
# Not required nor used - avoid tagging two security groups with same tag as well
create_security_group = false
# Ensure enough capacity to run 2 Karpenter pods
min_size = 2
max_size = 3
desired_size = 2
iam_role_additional_policies = [
"arn:${local.partition}:iam::aws:policy/AmazonSSMManagedInstanceCore", # Required by Karpenter
"arn:${local.partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy",
"arn:${local.partition}:iam::aws:policy/AmazonEKS_CNI_Policy",
"arn:${local.partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly", #for access to ECR images
"arn:${local.partition}:iam::aws:policy/CloudWatchAgentServerPolicy"
]
tags = {
# This will tag the launch template created for use by Karpenter
"karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name
}
}
}
}
resource "aws_iam_instance_profile" "karpenter" {
name = "KarpenterNodeInstanceProfile-${local.cluster_name}"
role = module.eks.eks_managed_node_groups["${local.cluster_name}"].iam_role_name
}
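As a quick sanity check (my suggestion, not part of the original report; assumes the AWS CLI is configured and CLUSTER_NAME is exported as in the Helm step below), you can confirm the instance profile exists and wraps the node role that Karpenter passes to EC2:

# Verify the instance profile Karpenter hands to EC2 exists and
# contains the managed node group's IAM role.
aws iam get-instance-profile \
  --instance-profile-name "KarpenterNodeInstanceProfile-${CLUSTER_NAME}" \
  --query 'InstanceProfile.Roles[].RoleName'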
module "karpenter_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "5.3.3"
role_name = "${local.cluster_name}-karpenter"
attach_karpenter_controller_policy = true
karpenter_tag_key = "karpenter.sh/discovery/${local.cluster_name}"
karpenter_controller_cluster_id = module.eks.cluster_id
karpenter_controller_ssm_parameter_arns = [
"arn:${local.partition}:ssm:*:*:parameter/aws/service/*"
]
karpenter_controller_node_iam_role_arns = [
module.eks.eks_managed_node_groups["${local.cluster_name}"].iam_role_arn
]
oidc_providers = {
ex = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["karpenter:karpenter"]
}
}
}
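After terraform apply, it can help to inspect the controller role's trust policy and attached policies. This is a hedged suggestion; the role name below simply follows the role_name argument above. The trust policy must reference the cluster's OIDC provider and the karpenter:karpenter service account, or the controller's AssumeRoleWithWebIdentity calls will fail before any EC2 call is even attempted:

# Print the trust policy of the Karpenter controller role, then list
# the managed policies attached to it.
aws iam get-role \
  --role-name "${CLUSTER_NAME}-karpenter" \
  --query 'Role.AssumeRolePolicyDocument'
aws iam list-attached-role-policies \
  --role-name "${CLUSTER_NAME}-karpenter"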
output.tf
output "cluster_arn" {
description = "The Amazon Resource Name (ARN) of the cluster"
value = module.eks.cluster_arn
}
output "cluster_certificate_authority_data" {
description = "Base64 encoded certificate data required to communicate with the cluster"
value = module.eks.cluster_certificate_authority_data
}
output "cluster_endpoint" {
description = "Endpoint for EKS control plane."
value = module.eks.cluster_endpoint
}
output "cluster_id" {
description = "The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready"
value = module.eks.cluster_id
}
output "cluster_oidc_issuer_url" {
description = "The URL on the EKS cluster for the OpenID Connect identity provider"
value = module.eks.cluster_oidc_issuer_url
}
output "cluster_platform_version" {
description = "Platform version for the cluster"
value = module.eks.cluster_platform_version
}
output "cluster_status" {
description = "Status of the EKS cluster. One of `CREATING`, `ACTIVE`, `DELETING`, `FAILED`"
value = module.eks.cluster_status
}
output "cluster_primary_security_group_id" {
description = "Cluster security group that was created by Amazon EKS for the cluster. Managed node groups use this security group for control-plane-to-data-plane communication."
value = module.eks.cluster_primary_security_group_id
}
output "cluster_region" {
description = "The AWS region the cluster has been depoyed to"
value = var.region
}
output "eks_managed_node_groups" {
description = "Map of attribute maps for all EKS managed node groups created."
value = module.eks.eks_managed_node_groups
}
output "cluster_iam_role_arn" {
description = "IAM role ARN of the EKS cluster."
value = module.eks.cluster_iam_role_arn
}
output "cluster_iam_role_name" {
description = "IAM role name of the EKS cluster."
value = module.eks.cluster_iam_role_name
}
output "cluster_iam_role_unique_id" {
description = "Stable and unique string identifying the IAM role."
value = module.eks.cluster_iam_role_unique_id
}
output "oidc_provider_arn" {
description = "The ARN of the OIDC Provider"
value = module.eks.oidc_provider_arn
}
output "karpenter_irsa_iam_role_arn" {
description = "ARN of IAM role"
value = module.karpenter_irsa.iam_role_arn
}
output "karpenter_irsa_iam_role_name" {
description = "Name of IAM role"
value = module.karpenter_irsa.iam_role_name
}
output "karpenter_irsa_iam_role_path" {
description = "Path of IAM role"
value = module.karpenter_irsa.iam_role_path
}
output "karpenter_irsa_iam_role_unique_id" {
description = "Unique ID of IAM role"
value = module.karpenter_irsa.iam_role_unique_id
}
output "aws_iam_instance_profile" {
description = "Karpenter discovers the InstanceProfile using the name KarpenterNodeRole-ClusterName."
value = aws_iam_instance_profile.karpenter.name
}
output "vpc_ic" {
description = "VPC ID"
value = data.terraform_remote_state.vpc.outputs.vpc_id
}
helm repo add karpenter https://charts.karpenter.sh/
helm repo update
helm upgrade --install --namespace karpenter --create-namespace \
karpenter karpenter/karpenter \
--version ${KARPENTER_VERSION} \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
--set clusterName=${CLUSTER_NAME} \
--set clusterEndpoint=${CLUSTER_ENDPOINT} \
--set aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
--wait # for the defaulting webhook to install before creating a Provisioner
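To confirm the chart wired up IRSA as intended (a hedged check, not part of the original steps), inspect the service account annotation:

# Print the IRSA role ARN annotation on the Karpenter service account;
# an empty result means the controller falls back to the node's own role.
kubectl get serviceaccount karpenter -n karpenter \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'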
Resource Specs and Logs
Provisioner specs
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery/${CLUSTER_NAME}: ${CLUSTER_NAME}
  securityGroupSelector:
    karpenter.sh/discovery/${CLUSTER_NAME}: ${CLUSTER_NAME}
EOF
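As a sanity check on the selectors above (a suggestion, not part of the original reproduction; assumes the AWS CLI targets the cluster's region), confirm that subnets and security groups actually carry the custom discovery tag:

# Both queries should return at least one ID; an empty result means
# Karpenter cannot discover that resource via the tag-based selector.
aws ec2 describe-subnets \
  --filters "Name=tag:karpenter.sh/discovery/${CLUSTER_NAME},Values=${CLUSTER_NAME}" \
  --query 'Subnets[].SubnetId'
aws ec2 describe-security-groups \
  --filters "Name=tag:karpenter.sh/discovery/${CLUSTER_NAME},Values=${CLUSTER_NAME}" \
  --query 'SecurityGroups[].GroupId'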
Pod spec: This deployment uses the pause image
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: 1
EOF
kubectl scale deployment inflate --replicas 5
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
ERROR controller.provisioning Provisioning failed, launching node, creating cloud provider instance, with fleet error(s), UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: C7p8L0t12N16ndxkKcGjcONv8J49w9BbgBmdY
kubectl logs karpenter-5c77486564-jvdm7 -n karpenter
Defaulted container "controller" out of: controller, webhook
{"level":"info","ts":1663558880.1839774,"logger":"fallback","caller":"injection/injection.go:61","msg":"Starting informers..."}
2022-09-19T03:41:20.178Z INFO Successfully created the logger.
2022-09-19T03:41:20.178Z INFO Logging level set to: debug
2022-09-19T03:41:20.184Z INFO controller Initializing with version v0.16.1 {"commit": "b157d45"}
2022-09-19T03:41:20.184Z INFO controller Setting GC memory limit to 966367641, container limit = 1073741824 {"commit": "b157d45"}
2022-09-19T03:41:20.203Z DEBUG controller.aws Using AWS region us-east-1 {"commit": "b157d45"}
2022-09-19T03:41:20.403Z DEBUG controller.aws Discovered caBundle, length 1099 {"commit": "b157d45"}
2022-09-19T03:41:20.403Z INFO controller loading config from karpenter/karpenter-global-settings {"commit": "b157d45"}
I0919 03:41:20.516867 1 leaderelection.go:243] attempting to acquire leader lease karpenter/karpenter-leader-election...
2022-09-19T03:41:20.517Z INFO controller starting metrics server {"commit": "b157d45", "path": "/metrics"}
E0919 03:41:20.559780 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io "karpenter-leader-election" already exists
2022-09-19T03:41:21.074Z INFO controller.aws.pricing updated spot pricing with 558 instance types and 2629 offerings {"commit": "b157d45"}
2022-09-19T03:41:22.060Z INFO controller.aws.pricing updated on-demand pricing with 558 instance types {"commit": "b157d45"}
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 27 (12 by maintainers)
It is still happening on v0.27.0, and I am not using Terraform at all.
@FernandoMiguel, it is not a bug in Karpenter; it is a bug in the module https://github.com/terraform-aws-modules/terraform-aws-iam/tree/v5.5.0/modules/iam-role-for-service-accounts-eks. I filed a request at https://github.com/terraform-aws-modules/terraform-aws-iam/issues/284. For now, I downloaded the iam-role-for-service-accounts-eks module and made changes in policies.tf. The code is as follows.
I added iam_instance_profile to the launch template and am now able to bring up Ubuntu EC2 instances as EKS nodes. However, I have to use the policy AmazonEKS_Karpenter_Controller_Policy-karpenter-eks-dev that I created in the AWS console instead of the policy AmazonEKS_Karpenter_Controller_Policy-20220922191658668300000010 created by my Terraform script. I get "UnauthorizedOperation" if I use the policy AmazonEKS_Karpenter_Controller_Policy-20220922191658668300000010 (for the complete errors, please refer to the messages I posted earlier). The difference between these two policies is the tags.

Per @FernandoMiguel, I may use the module terraform-aws-eks-blueprints; I will look into it. Does anyone have an idea why the policy created by the above Terraform code does not work? I referenced https://karpenter.sh/v0.16.2/getting-started/getting-started-with-terraform/ for the above Terraform code. Or how could "Getting Started with Terraform" be improved so that the policy works?
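One hedged way to compare the two policies (the ARN below is a placeholder; substitute each policy's real ARN in turn) is to dump each default policy version and diff the documents, which should surface the differing tag-condition blocks:

# Fetch the default version of a policy document; run once per policy
# and diff the two outputs.
POLICY_ARN="arn:aws:iam::<account-id>:policy/AmazonEKS_Karpenter_Controller_Policy-karpenter-eks-dev"
aws iam get-policy-version \
  --policy-arn "$POLICY_ARN" \
  --version-id "$(aws iam get-policy --policy-arn "$POLICY_ARN" --query Policy.DefaultVersionId --output text)" \
  --query PolicyVersion.Document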
You can decode the authorization failure message to understand what the issue is about:
https://aws.amazon.com/premiumsupport/knowledge-center/ec2-not-auth-launch/
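For example (assuming the calling principal has the sts:DecodeAuthorizationMessage permission), the encoded message from the Karpenter log decodes to a JSON document naming the denied action, resource, and failing condition:

# Decode the opaque authorization failure message from the error above;
# jq is optional and only pretty-prints the decoded JSON.
aws sts decode-authorization-message \
  --encoded-message "<encoded message from the error>" \
  --query DecodedMessage \
  --output text | jq .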