terraform-provider-helm: Multiple helm_release resources concurrency issue (fatal error: concurrent map read and map write)
TL;DR: I created my own wrapper module (see below) for the Helm provider's helm_release resource. I use that wrapper module multiple times in my main.tf to roll out a complete stack for my cluster (see below). I experience random crashes during terraform plan and apply:
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = transport is closing
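The actual configuration was attached as a zip file (see Steps to Reproduce); purely as an illustration of the pattern described above, it looks something like the following sketch, with all module names, charts, and repositories made up:

```hcl
# modules/release/main.tf -- hypothetical wrapper around helm_release
variable "name" {
  type = string
}

variable "repository" {
  type = string
}

variable "chart" {
  type = string
}

variable "namespace" {
  type = string
}

resource "helm_release" "this" {
  name       = var.name
  repository = var.repository
  chart      = var.chart
  namespace  = var.namespace
  wait       = false
}

# main.tf -- the wrapper instantiated several times; with default
# parallelism Terraform refreshes these releases concurrently,
# which is where the crash shows up.
module "ingress" {
  source     = "./modules/release"
  name       = "ingress-nginx"
  repository = "https://kubernetes.github.io/ingress-nginx"
  chart      = "ingress-nginx"
  namespace  = "ingress"
}

module "cert_manager" {
  source     = "./modules/release"
  name       = "cert-manager"
  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"
  namespace  = "cert-manager"
}
```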
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave “+1” or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform Version and Provider Version
Terraform v0.12.24 and v0.12.25 (always hangs); Terragrunt v0.23.10 and v0.23.18
Provider Version
terraform-provider-helm_v1.2.1_x4
Affected Resource(s)
helm_release
Terraform Configuration Files
Attached as a zip file (see Steps to Reproduce).
Debug Output
Stack Trace
fatal error: concurrent map read and map write
goroutine 31 [running]:
runtime.throw(0x2c94295, 0x21)
/opt/goenv/versions/1.13.7/src/runtime/panic.go:774 +0x72 fp=0xc000aa7020 sp=0xc000aa6ff0 pc=0x102e112
runtime.mapaccess2_faststr(0x2957ac0, 0xc0002aa360, 0xc000432700, 0x1b, 0x1, 0xc000432700)
/opt/goenv/versions/1.13.7/src/runtime/map_faststr.go:116 +0x48f fp=0xc000aa7090 sp=0xc000aa7020 pc=0x1012bdf
github.com/hashicorp/terraform-plugin-sdk/helper/schema.(*DiffFieldReader).ReadField(0xc0002aa300, 0xc0006d2720, 0x3, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x5555555555555555, ...)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/helper/schema/field_reader_diff.go:51 +0x10d fp=0xc000aa71d0 sp=0xc000aa7090 pc=0x197623d
github.com/hashicorp/terraform-plugin-sdk/helper/schema.(*MultiLevelFieldReader).ReadFieldMerge(0xc000b05940, 0xc0006d2720, 0x3, 0x3, 0x2c5b89f, 0x3, 0x0, 0x0, 0x0, 0x0, ...)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/helper/schema/field_reader_multi.go:45 +0x1d8 fp=0xc000aa72e0 sp=0xc000aa71d0 pc=0x19796b8
github.com/hashicorp/terraform-plugin-sdk/helper/schema.(*ResourceData).get(0xc000116540, 0xc0006d2720, 0x3, 0x3, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/helper/schema/resource_data.go:537 +0x2f8 fp=0xc000aa73c8 sp=0xc000aa72e0 pc=0x1986128
github.com/hashicorp/terraform-plugin-sdk/helper/schema.(*ResourceData).getRaw(0xc000116540, 0xc0004326e0, 0x1b, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/helper/schema/resource_data.go:128 +0x75 fp=0xc000aa7430 sp=0xc000aa73c8 pc=0x19837c5
github.com/hashicorp/terraform-plugin-sdk/helper/schema.(*ResourceData).GetOk(0xc000116540, 0xc0004326e0, 0x1b, 0x2c6c3d0, 0xe, 0xc0004326e0)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/helper/schema/resource_data.go:94 +0x5f fp=0xc000aa74e0 sp=0xc000aa7430 pc=0x198351f
github.com/terraform-providers/terraform-provider-helm/helm.k8sGetOk(0xc000116540, 0x2c6c3d0, 0xe, 0x24, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/helm/provider.go:268 +0x93 fp=0xc000aa7588 sp=0xc000aa74e0 pc=0x27709b3
github.com/terraform-providers/terraform-provider-helm/helm.(*KubeConfig).toRawKubeConfigLoader(0xc0006d2060, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/helm/structure_kubeconfig.go:86 +0xd38 fp=0xc000aa7718 sp=0xc000aa7588 pc=0x277bc68
github.com/terraform-providers/terraform-provider-helm/helm.(*KubeConfig).ToRawKubeConfigLoader(0xc0006d2060, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/helm/structure_kubeconfig.go:68 +0xab fp=0xc000aa7780 sp=0xc000aa7718 pc=0x277aecb
github.com/terraform-providers/terraform-provider-helm/helm.(*KubeConfig).ToRESTConfig(0xc0006d2060, 0x0, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/helm/structure_kubeconfig.go:31 +0x2b fp=0xc000aa77b0 sp=0xc000aa7780 pc=0x277ab1b
k8s.io/kubectl/pkg/cmd/util.(*factoryImpl).ToRESTConfig(...)
github.com/terraform-providers/terraform-provider-helm/vendor/k8s.io/kubectl/pkg/cmd/util/factory_client_access.go:63
k8s.io/kubectl/pkg/cmd/util.(*factoryImpl).KubernetesClientSet(0xc0006d2090, 0x0, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/vendor/k8s.io/kubectl/pkg/cmd/util/factory_client_access.go:79 +0x38 fp=0xc000aa77e0 sp=0xc000aa77b0 pc=0x25167e8
helm.sh/helm/v3/pkg/kube.(*Client).IsReachable(0xc0006d20c0, 0x1983656, 0x28d1360)
github.com/terraform-providers/terraform-provider-helm/vendor/helm.sh/helm/v3/pkg/kube/client.go:91 +0x37 fp=0xc000aa7840 sp=0xc000aa77e0 pc=0x2518be7
helm.sh/helm/v3/pkg/action.(*Get).Run(0xc000aa78d0, 0xc000744960, 0x12, 0x28d1360, 0x28d1360, 0x3012010)
github.com/terraform-providers/terraform-provider-helm/vendor/helm.sh/helm/v3/pkg/action/get.go:41 +0x3b fp=0xc000aa7888 sp=0xc000aa7840 pc=0x271acab
github.com/terraform-providers/terraform-provider-helm/helm.getRelease(0xc000a192c0, 0xc000744960, 0x12, 0x28d1360, 0xc0006ced50, 0x1)
github.com/terraform-providers/terraform-provider-helm/helm/resource_release.go:894 +0x64 fp=0xc000aa7900 sp=0xc000aa7888 pc=0x2778fe4
github.com/terraform-providers/terraform-provider-helm/helm.resourceReleaseExists(0xc0002aca10, 0x2a55f40, 0xc0002aa2a0, 0xc0002aca10, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/helm/resource_release.go:730 +0x10b fp=0xc000aa7958 sp=0xc000aa7900 pc=0x277748b
github.com/hashicorp/terraform-plugin-sdk/helper/schema.(*Resource).RefreshWithoutUpgrade(0xc000596c60, 0xc000b685a0, 0x2a55f40, 0xc0002aa2a0, 0xc000146960, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/helper/schema/resource.go:445 +0x24d fp=0xc000aa79c8 sp=0xc000aa7958 pc=0x19812dd
github.com/hashicorp/terraform-plugin-sdk/internal/helper/plugin.(*GRPCProviderServer).ReadResource(0xc00000ebc8, 0x3098960, 0xc000573020, 0xc00013c720, 0xc00000ebc8, 0xc000573020, 0xc0006b5b30)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/internal/helper/plugin/grpc_provider.go:525 +0x3d8 fp=0xc000aa7ad0 sp=0xc000aa79c8 pc=0x19a0b28
github.com/hashicorp/terraform-plugin-sdk/internal/tfplugin5._Provider_ReadResource_Handler(0x2bc4c00, 0xc00000ebc8, 0x3098960, 0xc000573020, 0xc00013c6c0, 0x0, 0x3098960, 0xc000573020, 0xc000684580, 0x54a)
github.com/terraform-providers/terraform-provider-helm/vendor/github.com/hashicorp/terraform-plugin-sdk/internal/tfplugin5/tfplugin5.pb.go:3269 +0x217 fp=0xc000aa7b40 sp=0xc000aa7ad0 pc=0x18b2787
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00053fb00, 0x30c10e0, 0xc000532780, 0xc00013a200, 0xc0003063c0, 0x4108710, 0x0, 0x0, 0x0)
github.com/terraform-providers/terraform-provider-helm/vendor/google.golang.org/grpc/server.go:1024 +0x4f4 fp=0xc000aa7e18 sp=0xc000aa7b40 pc=0x147ad24
google.golang.org/grpc.(*Server).handleStream(0xc00053fb00, 0x30c10e0, 0xc000532780, 0xc00013a200, 0x0)
github.com/terraform-providers/terraform-provider-helm/vendor/google.golang.org/grpc/server.go:1313 +0xd97 fp=0xc000aa7f48 sp=0xc000aa7e18 pc=0x147ea47
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0004ba890, 0xc00053fb00, 0x30c10e0, 0xc000532780, 0xc00013a200)
github.com/terraform-providers/terraform-provider-helm/vendor/google.golang.org/grpc/server.go:722 +0xbb fp=0xc000aa7fb8 sp=0xc000aa7f48 pc=0x148be9b
runtime.goexit()
/opt/goenv/versions/1.13.7/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc000aa7fc0 sp=0xc000aa7fb8 pc=0x105b761
created by google.golang.org/grpc.(*Server).serveStreams.func1
github.com/terraform-providers/terraform-provider-helm/vendor/google.golang.org/grpc/server.go:720 +0xa1
Full Stack Trace
https://gist.github.com/krzysztof-miemiec/2d44d187a75a2f106ccd9c71fbf67883
Expected Behavior
Helm properly plans and applies changes to multiple releases.
Actual Behavior
Random crashes happen during the plan phase (I provided a stack trace for that above), or sometimes everything gets stuck during deployment (no logs 😞).
Steps to Reproduce
- Have a Kubernetes cluster up and running, with free access to it via a configured context
- Download the attached zip file
- Run terraform plan
Important Factoids
I never debugged or contributed to any Go tool, so these are mostly guesses based on my knowledge of Terraform:
- It happens randomly. I guess it occurs more often when you define more helm_release resources (it probably won't occur when multiple instances of the Helm provider are not running at the same time 🤔)
- I noticed that it takes ages to refresh the state of Helm resources, especially ones that are using CRDs (not the ones that define them). Using the -parallelism=1 flag helps a bit with performance, and with this flag I didn't encounter crashes or an infinite wait during the plan phase.
- The infinite wait sometimes happens during terraform apply, even with wait = false.
- If I understand the stack trace correctly, it is a concurrency issue with the Helm provider or Terraform itself.
References
- https://github.com/terraform-providers/terraform-provider-helm/issues/486 - same user-facing error message, but I guess the cause is different
- https://github.com/hashicorp/terraform/issues/24589 - similar root cause; the issue is closed and the PRs are merged, but the fix is not released yet
About this issue
- State: closed
- Created 4 years ago
- Reactions: 18
- Comments: 16 (8 by maintainers)
Commits related to this issue
- fix: avoid concurrent read/write in metadata (#494) * FIX: Looks like rewriting code related to Helm configuration creation (Kubeconfig structure) does the trick - I'm not sure of performance and mem... — committed to krzysztof-miemiec/terraform-provider-helm by krzysztof-miemiec 4 years ago
- Fix: avoid concurrent read/write in metadata (#494) (#525) — committed to hashicorp/terraform-provider-helm by krzysztof-miemiec 4 years ago
The error is not constant. It fails 50-60% of the time. The success rate might be related to the performance of the systems.
@krzysztof-miemiec looks like we hit the same bug at the same time 😉 https://github.com/terraform-providers/terraform-provider-helm/issues/493
Today Terraform version 0.12.25 was released, which fixed a concurrency bug. https://github.com/hashicorp/terraform/blob/v0.12.25/CHANGELOG.md There's a chance that fix addresses this issue, but we still need to test using the reproducer.
I have encountered a similar problem. I created a module that deploys helm releases into a Rancher namespace (Helm provider and rancher2 provider).
At first I tried to run this with 80-90 modules and encountered the problem as well, so I went down to 10 modules.
Deploying >= 10 modules results in constant errors (I did not test fewer than 10). Some of the helm releases were reported as deployed by the Helm provider but remained in the "pending" state. When I checked in Kubernetes I could not see any resources belonging to the specific helm release (pods, deployments, etc.), and helm ls did not show anything. The namespaces were created successfully, since those are handled by the rancher2 provider.
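A minimal sketch of what such a module might look like (hypothetical names and variables, not the reporter's actual code):

```hcl
# Hypothetical module body: a Rancher namespace plus a helm
# release deployed into it. All names are made up.
resource "rancher2_namespace" "this" {
  name       = var.name
  project_id = var.project_id
}

resource "helm_release" "this" {
  name      = var.name
  chart     = var.chart
  namespace = rancher2_namespace.this.name
}
```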
When running with -parallelism=1 everything worked fine; -parallelism=2 does not work either. Running two parallel terraform apply -parallelism=1 commands, against two states with 10 modules each, also works fine. I think it is not an issue with Kubernetes itself; I think it is related to Terraform or the Helm provider.
Tested with Terraform version
Any idea of how I could provide more useful information to debug this?
@krzysztof-miemiec 1.0.0 also works for me; the issue appeared right after updating to 1.1.0 and above.
I'm using 1.0.0 in production at the moment (without any additional flags).
I've tested the Helm provider again with native Kubernetes 1.16.9:
basically, it works 15 out of 15 times with a token specified in the provider's kubernetes block.
Other methods (like an exec command, or a direct kubeconfig with or without an exec path) cause the described problem.
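For reference, a minimal sketch of the token-based setup reported to work above, assuming Helm provider 1.x; the host, token, and CA variables are placeholders:

```hcl
# Credentials passed directly in the kubernetes block instead of
# exec or a kubeconfig file. All values are placeholders.
provider "helm" {
  kubernetes {
    load_config_file       = false
    host                   = "https://k8s.example.com"
    token                  = var.cluster_token
    cluster_ca_certificate = base64decode(var.cluster_ca_cert)
  }
}
```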
I also encounter this error.
It seems that the Helm provider cannot talk to the Kubernetes API. When I set my local kubeconfig to a working Kubernetes context, terraform plan works. When I want to rely on the provider's kubernetes configuration only and set KUBE_LOAD_CONFIG_FILE=false, then I see the issue above.
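If it helps anyone reproduce this: KUBE_LOAD_CONFIG_FILE is the environment-variable form of the provider's load_config_file argument, so a hypothetical in-config equivalent of the setup above is:

```hcl
# In-config equivalent of exporting KUBE_LOAD_CONFIG_FILE=false:
# the provider then ignores ~/.kube/config, so cluster access
# must come from the other kubernetes-block attributes.
provider "helm" {
  kubernetes {
    load_config_file = false
    # host, token, cluster_ca_certificate, etc. must then be
    # supplied here instead of coming from a kubeconfig file
  }
}
```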