terraform-provider-aws: [Bug]: IPAM allocation fails with "InvalidIpamPoolAllocationId"
Terraform Core Version
1.3.2
AWS Provider Version
4.32.0 and 4.50.0
Affected Resource(s)
aws_vpc_ipam_pool_cidr_allocation
Expected Behavior
I expected the IPAM allocation to be created successfully.
Actual Behavior
In our environment we are using a multi-account setup. The IPAM pools are created in one account and shared via RAM with another account. We run into the issue described below when we try to allocate a CIDR in the shared IPAM pool.
To create our IPAM pool allocation we are using this snippet in our code:
resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra" {
  count          = var.cf_subnet_infra_count
  ipam_pool_id   = var.ipam_pool_id
  netmask_length = 27
}
But immediately afterwards we get the following error:
Error: InvalidIpamPoolAllocationId.NotFound: The IPAM pool allocation (ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01) does not exist.
status code: 400, request id: 9683f21c-8972-4c40-8227-72f5c219e5d3
with aws_vpc_ipam_pool_cidr_allocation.vpc-ipam-pool-alloc-cidr-cf-subnet-infra[0],
on ipam_pool_allocations.tf line 1, in resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra":
1: resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra" {
When we run “aws ec2 get-ipam-pool-allocations --ipam-pool-id ipam-pool-0b325bb4efc6dacae”, all IpamPoolAllocations are returned correctly.
- We have not changed anything in the Terraform code regarding the IPAM pool allocations.
- Three different people tried running the setup while assuming the role “workload-terraform-role” and ran into the issue too.
- We also tried running aws_vpc_ipam_pool_cidr_allocation in a different AWS account and ran into the same issue.
- On previous runs a couple of weeks/months ago, the same code correctly created the aws_vpc_ipam_pool_cidr_allocation without throwing the error.
Relevant Error/Panic Output Snippet
No response
Terraform Configuration Files
resource "aws_vpc_ipam_pool_cidr_allocation" "vpc-ipam-pool-alloc-cidr-cf-subnet-infra" {
  count          = var.cf_subnet_infra_count
  ipam_pool_id   = var.ipam_pool_id
  netmask_length = 27
}
Steps to Reproduce
- Create an IPAM pool
- Try to allocate a CIDR in the pool
Debug Output
2023-01-16T10:54:34.077+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Action=GetIpamPoolAllocations&IpamPoolAllocationId=ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01&IpamPoolId=ipam-pool-0b325bb4efc6dacae&Version=2016-11-15
2023-01-16T10:54:34.077+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: -----------------------------------------------------
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: [DEBUG] [aws-sdk-go] DEBUG: Response ec2/GetIpamPoolAllocations Details:
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: ---[ RESPONSE ]--------------------------------------
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: HTTP/1.1 400 Bad Request
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Connection: close
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Transfer-Encoding: chunked
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Cache-Control: no-cache, no-store
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Content-Type: text/xml;charset=UTF-8
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Date: Mon, 16 Jan 2023 09:54:33 GMT
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Server: AmazonEC2
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Strict-Transport-Security: max-age=31536000; includeSubDomains
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: Vary: accept-encoding
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: X-Amzn-Requestid: 9683f21c-8972-4c40-8227-72f5c219e5d3
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5:
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5:
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: -----------------------------------------------------
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: [DEBUG] [aws-sdk-go] <?xml version="1.0" encoding="UTF-8"?>
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: <Response><Errors><Error><Code>InvalidIpamPoolAllocationId.NotFound</Code><Message>The IPAM pool allocation (ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01) does not exist.</Message></Error></Errors><RequestID>9683f21c-8972-4c40-8227-72f5c219e5d3</RequestID></Response>
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: [DEBUG] [aws-sdk-go] DEBUG: Validate Response ec2/GetIpamPoolAllocations failed, attempt 0/25, error InvalidIpamPoolAllocationId.NotFound: The IPAM pool allocation (ipam-pool-alloc-0f1fe03456e174fea9c82affb5ee35e01) does not exist.
2023-01-16T10:54:34.330+0100 [DEBUG] provider.terraform-provider-aws_v4.32.0_x5: status code: 400, request id: 9683f21c-8972-4c40-8227-72f5c219e5d3
Panic Output
No response
Important Factoids
No response
References
No response
Would you like to implement a fix?
None
Alexander Barth (alexander.barth@mercedes-benz.com) on behalf of Mercedes-Benz Tech Innovation GmbH, Provider Information
About this issue
- State: closed
- Created a year ago
- Reactions: 23
- Comments: 18 (5 by maintainers)
Hello Kevin, thanks for putting the effort in for the sample code! In the provider we have mechanisms for retries and waiting, and our PR guidelines suggest that we follow any existing patterns in the resource being modified.
I have added the mechanisms for retry and waiting to account for eventual consistency of the read operation, and I’ve added additional acceptance tests to verify cross-region pool CIDR allocation.
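To illustrate the retry-and-wait idea for an eventually consistent read, here is a minimal sketch in Python. It is not the provider's Go implementation: `AllocationNotFound`, `wait_for_allocation`, and `make_eventually_consistent_lookup` are hypothetical names, and the fake lookup only stands in for a read that returns InvalidIpamPoolAllocationId.NotFound for a short time after the allocation has been created.

```python
import time


class AllocationNotFound(Exception):
    """Hypothetical stand-in for the InvalidIpamPoolAllocationId.NotFound error."""


def wait_for_allocation(lookup, allocation_id, attempts=5, delay=0.5, sleep=time.sleep):
    """Call lookup(allocation_id) until it stops raising AllocationNotFound.

    Returns whatever lookup returns. Re-raises AllocationNotFound if the
    allocation is still missing after the final attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            return lookup(allocation_id)
        except AllocationNotFound:
            if attempt == attempts:
                raise
            # Back off exponentially before asking again.
            sleep(delay * 2 ** (attempt - 1))


def make_eventually_consistent_lookup(visible_after):
    """Build a fake read that fails `visible_after` times, then succeeds."""
    calls = {"count": 0}

    def lookup(allocation_id):
        calls["count"] += 1
        if calls["count"] <= visible_after:
            raise AllocationNotFound(allocation_id)
        return {"IpamPoolAllocationId": allocation_id}

    return lookup


if __name__ == "__main__":
    # The allocation ID below is a placeholder, not a real resource.
    lookup = make_eventually_consistent_lookup(visible_after=2)
    print(wait_for_allocation(lookup, "ipam-pool-alloc-example", delay=0.01))
```

The point of the pattern is that a "not found" response right after creation is treated as "not visible yet" rather than as a hard failure, while a bounded number of attempts still lets a genuinely missing allocation surface as an error.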
I assure you this is being worked on. The provider team does releases each Thursday.
I can successfully reproduce this in an AccTest. Working on a fix.
We’re seeing similar issues with that resource as well (using 4.50.0). IPAM pool isn’t shared with RAM in our case, all operations happen in the same AWS account.
First attempt
- plan
- apply
apply fails with the following error message; however, if we check the AWS Console, the allocation has been created in the IPAM service.
Second attempt
- plan: Resource is shown as tainted.
- apply: Because the resource is tainted, it is being deleted, but that fails as well.
(^ not a typo in the error message, a value is missing)
EDIT: We’ve opened a case with AWS support in the meantime, as we believe this is likely to be an issue with the AWS IPAM service API rather than the provider. We were able to replicate the issue with the AWS CLI as well.
A quick clarification, AWS Enterprise Support does offer Third-Party Product support, including open source software such as Terraform. I agree that having a reproducible case in a script using the AWS CLI is certainly helpful, though not required.
AWS works with HashiCorp and the open source community to evaluate and prioritize issues as per the Terraform AWS Provider FAQ.
I also just started on this issue and added this code block to ipam_pool_cidr_allocation.go
We need this change urgently. Will you work on this within the next few days, or should I open a PR? If the latter, could you share your test code?
@AdamTylerLynch I will do that. I just thought it was worth mentioning that using import was not a workaround for us.
Interesting. For me the script always runs through without any issues and the creation through terraform still throws the error InvalidIpamPoolAllocationId.NotFound. Exact same IAM-Role used.
Mili Durasovic mili.durasovic@mercedes-benz.com, Mercedes-Benz Tech Innovation GmbH Provider Information
I’ve been running the following script, and the issue occurs randomly after a couple of runs
script.sh
We have the exact same problem as @Tailzip. We are referencing the CIDR in a local. It seems that the local is being evaluated far too early. The Terraform resource might indicate that the allocation is done, but it seems like it’s still ongoing asynchronously in AWS.
We tried to remove the tainted resource and then import the resource, but that doesn’t seem to work. Afterwards it showed us a completely new resource being created. Applying that will result in the mentioned error again (“couldn’t find resource”). The import statement looks a little bit weird as well. The text says that the allocation id is used for the import, but the example only shows the resource.
EDIT: We verified the problem with 4.19.0, 4.46.0 and 4.48.0. It seems that it started last week as sporadic behavior, but it now happens consistently.