etcd: ETCD doesn't automatically load changes to ca bundles for peer-trusted-ca-file or trusted-ca-file

etcd cannot handle CA bundles in the peer-trusted-ca-file or trusted-ca-file settings. Without CA bundle support, a zero-downtime CA rotation is impossible: the only option is to re-sign all active client and server certs at once.

If CA bundles were allowed, a new CA could be created and made valid in all components in the first iteration. Client certs could then be re-signed with the new CA, since the server components would have both the new CA and the old CA in their trust bundle. Once all clients have been re-signed and have downloaded the old + new CA bundle, the server certs can be signed with the new CA, and the old CA can then be removed.
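To make the rotation argument concrete, here is a minimal Go sketch of the trust relationships involved. It has nothing to do with etcd's own code: the `ca` type and the helpers `sign`, `newCA`, `issueClient`, and `trusted` are hypothetical names invented for illustration, and the CAs are throwaway in-memory ones. It only shows that a pool holding old + new CA accepts client certs from either, while a single-CA pool rejects certs from the other.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// ca is a hypothetical helper type: a self-signed CA cert plus its key.
type ca struct {
	cert *x509.Certificate
	key  *ecdsa.PrivateKey
}

// sign creates a cert from tmpl, signed by parent (self-signed if parent is nil).
func sign(tmpl *x509.Certificate, parent *ca) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl.SerialNumber = big.NewInt(time.Now().UnixNano())
	tmpl.NotBefore = time.Now().Add(-time.Hour)
	tmpl.NotAfter = time.Now().Add(24 * time.Hour)
	signerCert, signerKey := tmpl, key
	if parent != nil {
		signerCert, signerKey = parent.cert, parent.key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signerCert, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// newCA creates a throwaway self-signed CA.
func newCA(name string) (ca, error) {
	cert, key, err := sign(&x509.Certificate{
		Subject:               pkix.Name{CommonName: name},
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}, nil)
	return ca{cert: cert, key: key}, err
}

// issueClient issues a client-auth cert signed by this CA.
func (c ca) issueClient(name string) (*x509.Certificate, error) {
	cert, _, err := sign(&x509.Certificate{
		Subject:     pkix.Name{CommonName: name},
		KeyUsage:    x509.KeyUsageDigitalSignature,
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	}, &c)
	return cert, err
}

// trusted reports whether leaf verifies against a pool built from the given CAs.
func trusted(leaf *x509.Certificate, cas ...ca) bool {
	pool := x509.NewCertPool()
	for _, c := range cas {
		pool.AddCert(c.cert)
	}
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return err == nil
}

func must[T any](v T, err error) T {
	if err != nil {
		panic(err)
	}
	return v
}

func main() {
	oldRoot, newRoot := must(newCA("old-ca")), must(newCA("new-ca"))
	oldLeaf := must(oldRoot.issueClient("client-old"))
	newLeaf := must(newRoot.issueClient("client-new"))

	fmt.Println("old CA only, new client cert trusted:", trusted(newLeaf, oldRoot))
	fmt.Println("bundle, old client cert trusted:", trusted(oldLeaf, oldRoot, newRoot))
	fmt.Println("bundle, new client cert trusted:", trusted(newLeaf, oldRoot, newRoot))
	fmt.Println("new CA only, old client cert trusted:", trusted(oldLeaf, newRoot))
}
```

The middle two checks correspond to the overlap window in the rotation: while both CAs are in the bundle, clients can be moved over one at a time without any of them being rejected.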

It appears this was meant to be fixed but I am able to replicate the issue in an etcd deployment today. I will expose all the certs and command line configurations in this issue so the exact steps can be replicated.

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 2
  • Comments: 30 (6 by maintainers)

Most upvoted comments

This is not stale

not stale, would love to see https://github.com/etcd-io/etcd/pull/13307 or a similar version merged.

Not stale

Ok… After a bit of poking around I think I see why CA certificates aren’t reloaded on new connections the same way that certs and client certs are.

The config object in crypto/tls allows GetCertificate and GetClientCertificate to be function-based callbacks: https://github.com/golang/go/blob/master/src/crypto/tls/common.go#L557

etcd implements those and uses them to get a fresh copy of the cert and key file from the filesystem each time a new client connection is initiated: https://github.com/etcd-io/etcd/blob/main/client/pkg/transport/listener.go#L408
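For anyone following along, the pattern looks roughly like this. This is a simplified sketch of the idea, not etcd's actual code; `loadPair`, `reloadingCertConfig`, and the file names are hypothetical. Only `tls.LoadX509KeyPair` and the two `tls.Config` callback fields are real standard-library API.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// loadPair reads the cert and key from disk; called once per handshake.
func loadPair(certFile, keyFile string) (*tls.Certificate, error) {
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	return &cert, nil
}

// reloadingCertConfig returns a tls.Config that re-reads the key pair for
// every new connection, so replacing the files on disk takes effect without
// a restart.
func reloadingCertConfig(certFile, keyFile string) *tls.Config {
	return &tls.Config{
		// Used when this process is the TLS server.
		GetCertificate: func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
			return loadPair(certFile, keyFile)
		},
		// Used when this process is the TLS client and the server asks for a cert.
		GetClientCertificate: func(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
			return loadPair(certFile, keyFile)
		},
	}
}

func main() {
	// Hypothetical paths; with missing files this just prints the load error.
	cfg := reloadingCertConfig("server.crt", "server.key")
	_, err := cfg.GetCertificate(nil)
	fmt.Println("reload attempted, err:", err)
}
```

Reading from disk on every handshake is the simplest way to pick up changes; a cached copy refreshed on file modification would be an alternative, at the cost of more moving parts.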

However, the config object does not expose a similar function-based callback for loading CA certificates. For these it only exposes a single attribute: https://github.com/golang/go/blob/master/src/crypto/tls/common.go#L638 that is set up when the config is created.

The only way that I can see to work around this without changing the flow completely would be to implement the GetConfigForClient callback (https://github.com/golang/go/blob/master/src/crypto/tls/common.go#L587), which would allow us to re-read the CAs from disk at the same time that we get the certs / client certs and re-initialize the CA certificate pool: https://github.com/etcd-io/etcd/blob/main/client/pkg/transport/listener.go#L486

@serathius would a change like this be acceptable to the project?

Never mind, I see that https://github.com/etcd-io/etcd/pull/13307 does exactly that and is already in the process of being reviewed.

I’m not sure how I missed that, or why it wasn’t pointed out as an almost-complete option instead of asking for a new contribution. Anyway, I’ll monitor the progress of that PR.