aws-cdk: eks: fail to create eks nodegroup in cn-north-1

Describe the bug

Hi folks,

I ran into a problem when using the AWS CDK for Python to create an EKS cluster. Details below:

My local env:

(.venv) [ec2-user@ip-10-0-1-73 python-cdk]$ cdk --version
2.67.0 (build b6f7f39)
(.venv) [ec2-user@ip-10-0-1-73 python-cdk]$ python3 --version
Python 3.7.10
(.venv) [ec2-user@ip-10-0-1-73 python-cdk]$ cat /proc/version
Linux version 5.10.144-127.601.amzn2.x86_64 (mockbuild@ip-10-0-44-229) (gcc10-gcc (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1), GNU ld version 2.35-21.amzn2.0.1) #1 SMP Thu Sep 29 01:11:59 UTC 2022

Here is the core code:

node_role = iam.Role.from_role_arn(self, 'eks-node-role-arn-lookup', 'arn:aws-cn:iam::xxxxxxxxxxx:role/eks-node-role')

cluster.add_nodegroup_capacity(
    nodegroup_name,
    nodegroup_name=nodegroup_name,
    instance_types=[ec2.InstanceType(instance_type)],
    min_size=1,
    max_size=3,
    capacity_type=capacity_type,
    disk_size=disk_size,
    ami_type=ami_type,
    node_role=node_role
)

When I create the node role manually and pass it in, the deployment succeeds. But when I remove the node_role parameter, like this:

cluster.add_nodegroup_capacity(
    nodegroup_name,
    nodegroup_name=nodegroup_name,
    instance_types=[ec2.InstanceType(instance_type)],
    min_size=1,
    max_size=2,
    capacity_type=capacity_type,
    disk_size=disk_size,
    ami_type=ami_type
)

the following error is thrown:

Resource handler returned message: "Following required service principals [ec2.amazonaws.com.cn] were not found in the trust relationships of nodeRole arn:aws-cn:iam::4123xxxxxxx:role/eks-cluster-stack-eksgitlabrunnerclusterNodegroupg-1EPH8PW36YZ3A (Service: Eks, Status Code: 400, Request ID: 6f4cc1b1-4fd2-4072-887c-abc6ddf60d58)" (RequestToken: 7c7be61d-a2a5-3e36-1a34-e6a54c71d72a, HandlerErrorCode: InvalidRequest)

But I think the principal [ec2.amazonaws.com.cn] is correct for the cn-north-1 region.

Could you please help look into this problem?

Expected Behavior

When I do not specify the node role in the method call, I expect CDK to create the node role automatically.

Method doc: https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_eks/Cluster.html#aws_cdk.aws_eks.Cluster.add_nodegroup_capacity

Current Behavior

In the cn-north-1 region, deployment fails when CDK creates the node role automatically.

I checked the principal in another of my EC2 roles, and [ec2.amazonaws.com.cn] is the correct configuration there.

It seems that CDK does not recognize this principal.
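
One way to check which principal actually ends up in the generated role is to inspect the synthesized CloudFormation template. The following is a minimal sketch and not part of the original report: the helper name `role_service_principals` is hypothetical, and it assumes the template is the JSON that `cdk synth` writes under `cdk.out/`, with each role's trust policy stored in `AssumeRolePolicyDocument`.

```python
import json


def role_service_principals(template):
    """Map each AWS::IAM::Role logical ID to the service principals it trusts.

    `template` is a CloudFormation template already parsed into a dict.
    """
    found = {}
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::IAM::Role":
            continue
        policy = resource.get("Properties", {}).get("AssumeRolePolicyDocument", {})
        principals = []
        for statement in policy.get("Statement", []):
            principal = statement.get("Principal", {})
            if not isinstance(principal, dict):
                continue  # e.g. "*", which names no service principal
            service = principal.get("Service", [])
            # "Service" may be a string, a list, or an intrinsic function
            # (a dict); anything that is not a list is kept as one entry.
            if not isinstance(service, list):
                service = [service]
            principals.extend(service)
        found[logical_id] = principals
    return found


# Made-up template fragment for illustration; a real one would be loaded with
# json.load() from a file such as cdk.out/<stack-name>.template.json.
sample_template = json.loads("""
{
  "Resources": {
    "NodegroupRole": {
      "Type": "AWS::IAM::Role",
      "Properties": {
        "AssumeRolePolicyDocument": {
          "Statement": [
            {"Action": "sts:AssumeRole", "Effect": "Allow",
             "Principal": {"Service": "ec2.amazonaws.com"}}
          ]
        }
      }
    }
  }
}
""")

print(role_service_principals(sample_template))
```

If the nodegroup role in the real template lists a principal other than the one named in the error message, that would match the behavior described here.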

Reproduction Steps

Use the CDK code above. With the node_role parameter removed, nodegroup creation fails in the cn-north-1 region.

Possible Solution

Manually create the node role and hard-code its ARN in the CDK code.
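
For the manually created role, the trust relationship has to contain the principal named in the error message. Below is a minimal sketch of such a trust policy built in plain Python. The function name `node_role_trust_policy` is hypothetical, and the result is only the trust policy document, not the permissions a node role also needs.

```python
import json


def node_role_trust_policy(service_principals):
    """Return an IAM trust policy letting the given services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": list(service_principals)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# ec2.amazonaws.com.cn is the principal the error message asks for in cn-north-1.
policy = node_role_trust_policy(["ec2.amazonaws.com.cn"])
print(json.dumps(policy, indent=2))
```

The printed JSON can serve as the trust relationship when the role is created by hand; the role's ARN is then passed in through `iam.Role.from_role_arn`, as in the code at the top of this report.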

Additional Information/Context

No response

CDK CLI Version

2.67.0

Framework Version

No response

Node.js Version

v16.18.0

OS

Amazon Linux 2

Language

Python

Language Version

3.7.10

Other information

No response

About this issue

  • State: open
  • Created a year ago
  • Comments: 17 (11 by maintainers)

Most upvoted comments

I can confirm that we can successfully deploy an EKS cluster in China regions using escape hatches, as below:

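import * as eks from 'aws-cdk-lib/aws-eks';
import * as iam from 'aws-cdk-lib/aws-iam';
import { KubectlV26Layer as KubectlLayer } from '@aws-cdk/lambda-layer-kubectl-v26';

// scope and vpc are defined elsewhere in the stack
const cluster = new eks.Cluster(scope, 'EksCluster', {
  vpc,
  version: eks.KubernetesVersion.V1_26,
  kubectlLayer: new KubectlLayer(scope, 'KubectlLayer'),
  defaultCapacity: 2,
});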

// override the service principal for the default nodegroup
overrideServicePrincipal(cluster.defaultNodegroup?.role.node.defaultChild as iam.CfnRole)

const ng = cluster.addNodegroupCapacity('NG', {
  desiredSize: 2,
});

// override the service principal for the additional nodegroup
overrideServicePrincipal(ng.role.node.defaultChild as iam.CfnRole)


function overrideServicePrincipal(role: iam.CfnRole) {
  role.addPropertyOverride('AssumeRolePolicyDocument.Statement.0.Principal.Service', ['ec2.amazonaws.com', 'ec2.amazonaws.com.cn'])
}

% kubectl get no
NAME                                          STATUS   ROLES    AGE     VERSION
ip-10-0-140-206.cn-north-1.compute.internal   Ready    <none>   2m34s   v1.26.2-eks-a59e1f0
ip-10-0-141-57.cn-north-1.compute.internal    Ready    <none>   2m20s   v1.26.2-eks-a59e1f0
ip-10-0-174-210.cn-north-1.compute.internal   Ready    <none>   2m34s   v1.26.2-eks-a59e1f0

This is a temporary workaround until the issue is fixed in CDK.
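
Since the original report uses Python, the same escape hatch can be sketched in Python as well. This is a translation of the TypeScript workaround above, not code from the thread: `override_service_principal` is a hypothetical helper that relies only on the role's underlying `CfnRole` exposing `add_property_override`, and `RecordingRole` is a stand-in used here so the helper can be run without a CDK app.

```python
# Property path and principals copied from the TypeScript workaround.
PRINCIPAL_PATH = "AssumeRolePolicyDocument.Statement.0.Principal.Service"
EC2_PRINCIPALS = ["ec2.amazonaws.com", "ec2.amazonaws.com.cn"]


def override_service_principal(cfn_role):
    """Force both EC2 service principals onto a role's trust policy.

    `cfn_role` is expected to be the iam.CfnRole behind a nodegroup role.
    """
    cfn_role.add_property_override(PRINCIPAL_PATH, EC2_PRINCIPALS)


# Intended use inside a stack (assumes `cluster` is an eks.Cluster):
#   ng = cluster.add_nodegroup_capacity("NG", desired_size=2)
#   override_service_principal(ng.role.node.default_child)


class RecordingRole:
    """Stand-in for iam.CfnRole that records overrides instead of applying them."""

    def __init__(self):
        self.overrides = {}

    def add_property_override(self, property_path, value):
        self.overrides[property_path] = value


fake_role = RecordingRole()
override_service_principal(fake_role)
print(fake_role.overrides)
```

Listing both principals mirrors the TypeScript version, so the same stack code would not need to branch on the target region.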

Hi @pahud, yes, I can create the EKS cluster with v2.65 and v2.66, but only without the Nodegroup resource. I think it is like this:

const cluster = new eks.Cluster(this, 'Cluster', {
  vpc,
  version: eks.KubernetesVersion.V1_24,
  defaultCapacity: 0,
  kubectlLayer,
});

Here is my python code:

# Called from within the stack (self is the Stack instance)
vpc = ec2.Vpc.from_lookup(self, "my-vpc", vpc_id=vpc_id)
# eks cluster
cluster = self.create_eks_cluster(vpc)

# Method on the same stack class
def create_eks_cluster(self, vpc):
    cluster = eks.Cluster(
        self,
        "eks-cluster",
        cluster_name=cluster_name,
        vpc=vpc,
        default_capacity=0,  # do not create the default capacity
        version=eks.KubernetesVersion.V1_24
    )
    return cluster

@pahud Yes, I can deploy the EKS Cluster via cdk v2.65 in cn-north-1.

@Bruce-Lu674 The relevant team is working on it. I don't have an ETA at the moment, but I will update here when I see the issue is fixed (hopefully very soon).

By the way, are you able to successfully deploy EKS with CDK 2.65 in cn-north-1?