kops: `kops create --dry-run` should not fail if cluster state already exists

  1. What kops version are you running?

1.8.0

  2. What Kubernetes version are you running?

1.8.0

  3. What cloud provider are you using?

AWS

  4. What commands did you run? What is the simplest way to reproduce this issue?

Once the cluster state already exists in the state store (e.g. the cluster is already created, or its manifest was uploaded to S3 via kops replace -f ...), try to produce a manifest using the --name (e.g. my-cluster.k8s.local) and --state (e.g. s3://my-cluster-state-store) of that existing cluster:

kops create --dry-run ...
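
For illustration, a fuller form of the dry-run invocation (via the kops create cluster subcommand) looks something like the following; the flag values here are placeholders, not the exact flags from the original report:

kops create cluster --dry-run --output yaml \
    --name my-cluster.k8s.local \
    --state s3://my-cluster-state-store \
    --zones eu-west-1a \
    --node-count 2 \
    --ssh-public-key ~/.ssh/id_rsa.pub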

  5. What happened after the commands executed?

I0117 11:04:51.031445   28032 s3context.go:163] Found bucket "my-cluster-state-store" in region "eu-west-1"
I0117 11:04:51.031751   28032 s3fs.go:176] Reading file "s3://my-cluster-state-store/my-cluster.k8s.local/config"

cluster "my-cluster.k8s.local" already exists; use 'kops update cluster' to apply changes

  6. What did you expect to happen?

No error. It should produce the manifest YAML regardless of whether a cluster with that name already exists. Ideally, it should not require the bucket to exist at all on a dry run.

I need this behavior in order to automate cluster (re)deploys. I want my script to always start by generating the desired manifest via kops create --dry-run .... Because of the current behavior, the second execution of my script fails (i.e. once the cluster has already been created).
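
As a rough sketch, the automation I have in mind looks like this (cluster name, bucket, and zone are placeholders):

# generate the desired manifest, whether or not the cluster already exists
kops create cluster --dry-run --output yaml \
    --name my-cluster.k8s.local \
    --state s3://my-cluster-state-store \
    --zones eu-west-1a > cluster.yaml

# upload the manifest and roll out the changes
kops replace -f cluster.yaml --state s3://my-cluster-state-store --force
kops update cluster my-cluster.k8s.local --state s3://my-cluster-state-store --yes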

I found a workaround though:

  1. Always point kops create --dry-run ... at an empty (scratch) bucket.
  2. After it produces the manifest, rewrite the configBase field in the YAML so that it points to the real S3 bucket that will hold the cluster state instead of the scratch bucket (see the sketch after this list).
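
A minimal sketch of that workaround, assuming a scratch bucket named s3://scratch-kops-state and a simple sed rewrite of configBase:

# 1. generate the manifest against an empty scratch bucket
kops create cluster --dry-run --output yaml \
    --name my-cluster.k8s.local \
    --state s3://scratch-kops-state \
    --zones eu-west-1a > cluster.yaml

# 2. point the generated spec at the real state store
sed -i 's|configBase: s3://scratch-kops-state/|configBase: s3://my-cluster-state-store/|' cluster.yaml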

Related issues: https://github.com/kubernetes/kops/issues/2603, https://github.com/kubernetes/kops/issues/1984, https://github.com/kubernetes/kops/issues/4287.

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 10
  • Comments: 38 (8 by maintainers)

Most upvoted comments

My initial reaction is that if I run kops create --dry-run, I would want all the same checks as a regular kops create, just without the actual execution. That would include the check for existing clusters, similar to the way terraform plan takes existing state into account. I think it could get confusing to allow a dry run to pass a check (like the existing-cluster check) that would break an actual creation.

That said, your automation seems pretty awesome. I personally don’t automate creation at that level; we just version our manifests in S3. I do have one concern with your constant kops create cluster. @chrislovecnm can keep me honest on this, but I think there have been times when kops assumed a change would only be applied automatically to new clusters, so by regenerating the manifest frequently you may eventually get a new flag defaulted in a way you would not expect.

For instance, at one point:

iam:
  legacy: true

was added to older kops clusters to ensure the change wasn’t automatically pushed to existing clusters. If you had such a cluster and then regenerated your cluster manifest from scratch, you’d get legacy: false and your cluster could see unexpected change. I’m not sure whether additional changes like that could occur down the road, but it is one concern that comes to mind with the method above.
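
One way to guard against that in such automation is to diff the regenerated manifest against the spec currently in the state store before applying anything; a rough sketch (names and file paths are placeholders):

# fetch the spec kops currently has, then compare it with the freshly generated one
kops get cluster my-cluster.k8s.local --state s3://my-cluster-state-store -o yaml > current.yaml
diff -u current.yaml cluster.yaml || echo "spec drift detected - review before kops replace/update"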

My preference would be an idempotent verb like set (the closest thing I could find in kubectl):

kops set cluster --dry-run --output yaml \
    --name $name \
    --state $state \
    --node-count $node-count \
    --ssh-public-key $ssh-public-key

Instead, this would set those values on an existing cluster, or fall back to creating a new one, so the dry-run / existing-cluster case would be idempotent.

I see the need for a solution, but I’m not sure changing kops create is the right way. I need to think a little more on this but that’s what I have so far!

I also need an apply mechanism. I’m working on a multi-tenant cluster provisioning system that works across cloud providers, so I need to automate everything. My current approach is to run kops export kubecfg and check whether stderr contains a “not found” string; if it doesn’t, I run kops get cluster -o json, otherwise I run kops create cluster --dry-run -o json. I then update the manifest and run a replace --force.
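
Roughly, that branching looks like the following sketch (cluster name, state bucket, and zone are placeholders for my setup):

if kops export kubecfg my-cluster.k8s.local --state s3://my-state-store 2>&1 | grep -q "not found"; then
    # no cluster yet: generate a manifest from scratch
    kops create cluster --dry-run -o json \
        --name my-cluster.k8s.local --state s3://my-state-store \
        --zones us-central1-a > cluster.json
else
    # cluster exists: start from its current spec
    kops get cluster my-cluster.k8s.local --state s3://my-state-store -o json > cluster.json
fi

# merge user config into cluster.json here, then:
kops replace -f cluster.json --state s3://my-state-store --force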

My desire is to have helpers used in create_cluster.go exported, so I can generate a mostly complete manifest, fill in the gaps with user config, then apply that manifest. For instance, I need the code that maps subnet regions for GCP, so I don’t have to roll that myself. Something like this would greatly reduce the complexity of my system.

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale