amphtml: I2I: Create stable release channel with slower release cadence
AMP Monthly Releases
Background
Today, a new version of the AMP runtime is released once a week. The release cycle involves cutting a new canary branch each Tuesday, releasing it to a subset of users in two modes (one with canary flags turned on, and one with them turned off), and then releasing the same version to production the following week, modulo cherry-picks.
In the near future, we are looking to introduce nightly builds into the mix, where each week’s canary release will be based on the previous week’s most stable nightly build. For the purpose of this document, it’s sufficient to note that the nightly build process is not going to change the weekly cadence of production releases.
For this document, lts will be used to refer to the approximately-monthly release flavor(s) so as not to tie it to any specific schedule. Flavors will be referred to as lts and lts-rc, the counterparts of prod and rc. This does not account for discussions about renaming current release channels (discussions which include renaming prod to stable) as that would confuse matters here.
Problem Statement
External partners of AMP are faced with the problem of having to ensure quality between releases and avoid regressions to their websites. As the AMP project has grown in size and user base, the need has arisen for a version-locking mechanism that has a slower cadence than the current weekly release cycle.
It’s worth contrasting a version-locking mechanism (which requires action on the part of a website) with our current canary / RC release mechanism (which requires opt-in by users, or is served randomly to a small percentage of requests). Today, the only way for a partner to lock their website to a specific version of AMP is to pin it to a specific RTV version. This requires regular updates on their side and makes the page invalid AMP, which in turn makes it ineligible for inclusion in AMP caches and costs it the preloading speed benefits.
Therefore, we need to do two things:
- Create a new release channel for AMP that is updated less frequently than our normal v0.js release
- Offer partners a simple way to control the AMP version used by their website regardless of AMP’s release schedule
Non-Goals
- Allowing use of the new channel for AMP emails, ads, Stories, or Actions
Current release cycle
Today, AMP releases are done in phases:
- [RELEASE] Compile the runtime and create four flavors of the same version, each with its own prefix: canary (00), rc (03), control (02), and prod (01). Write them to an RTV directory.
- [PROMOTE] Push the canary and rc flavors within 24 hours to opt-in users and a small percentage of all requests
- [PROMOTE] About a week later, push the prod (and control) flavors to all users.
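The [RELEASE] step above can be sketched as follows. This is a minimal illustration, not the actual build tooling: it assumes an RTV identifier is simply the two-digit flavor prefix followed by the version string, and the version value used in the usage note is made up.

```python
# Flavor prefixes as listed in the [RELEASE] step.
FLAVOR_PREFIXES = {
    "canary": "00",
    "prod": "01",
    "control": "02",
    "rc": "03",
}


def rtv_for(flavor: str, version: str) -> str:
    """Return the RTV identifier for one flavor.

    Assumption: the identifier is the flavor prefix concatenated with the
    version string.
    """
    return FLAVOR_PREFIXES[flavor] + version


def release_flavors(version: str) -> dict:
    """Return the RTV identifier of every flavor cut from a single version."""
    return {flavor: rtv_for(flavor, version) for flavor in FLAVOR_PREFIXES}
```

For a hypothetical version `1910012000000`, `release_flavors` yields four identifiers that differ only in their first two digits, which is what lets the same build be served as canary, rc, control, or prod.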
As a result, the release cycle is subject to the following constraints:
- Websites that use v0.js will get a new AMP version every week
- To avoid this, a publisher may pin their website to a specific RTV version
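To make the difference between the two constraints concrete, here is a small sketch of the script URL a page would load in each case. The `/rtv/<rtv>/v0.js` path layout and the helper name `runtime_url` are assumptions for illustration; as noted in the problem statement, a page pinned this way is no longer valid AMP.

```python
CDN_ORIGIN = "https://cdn.ampproject.org"


def runtime_url(rtv=None):
    """Return the runtime script URL for a page.

    With no RTV the page follows the weekly release; with an RTV it stays
    on that exact build until the publisher changes the URL by hand.
    Assumption: pinned builds are served under /rtv/<rtv>/.
    """
    if rtv is None:
        return f"{CDN_ORIGIN}/v0.js"
    return f"{CDN_ORIGIN}/rtv/{rtv}/v0.js"
```

The unpinned form changes behavior every week without any edit to the page, while the pinned form never changes until the publisher updates the RTV by hand.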
Things to consider
- AliExpress, one of AMP’s largest partners, would like to version-lock AMP to one of the last N versions. What is a good value for N? See: https://github.com/ampproject/amphtml/issues/20903
- N = 4
- We’d need to ensure that a website that opts into a version-locked release receives the same version of the core runtime, extensions, and all other build targets
- Runtime already handles this.
- The mechanism to create an lts release should ideally fit into our existing release mechanism (that includes the AMP_CANARY cookie, and the canary, rc, control, and prod releases)
- This may not be necessary, given that the lts release will be an exact mirror of a previous prod release. There will be no experiments or need to canary a build to a random segment of the population, so it’s possible most of these cookies could be ignored. The only exception would be opt-in, which would allow partners to test a new build on their site manually.
- Vendors want stable builds, but there are cases where we’ll need to change the build anyway, such as a security or privacy issue. We should make it clear that “stable builds” may change in some circumstances.
- This is true, but since the lts release would be over a month old, a cherry-pick would presumably be needed less often
- What is the bar for cherry picks to lts builds? (For weekly releases, the bar is a P0 issue. Consider using the same bar)
- See below
- Any cherry-pick into the weekly release that lts-rc mirrors would be included in lts-rc; cherry-picks should almost never be necessary for lts
- In cases where a cherry pick is needed only for the lts version, will we create a new cherry pick just for the lts?
- It’s not clear to me whether there’s any case where this would be relevant. It would mean that a P0 went unnoticed for over a month and was eventually resolved in the weekly release process, but not through a cherry-pick. The release cadence should guard against most cherry-pick needs.
- https://github.com/ampproject/amphtml/issues/25417 was a cherry pick for a behavior that’s existed for years
- In cases where cherry-picks are required, the process should be similar to a production cherry-pick
Potential solutions
Release Cadence
For the purposes of explaining the cadence, I’m going to adopt the convention of referring to the release period as a “month”, which is assumed to have exactly 4 “weeks”. Due to actual month/day counts this will instead equate to a rolling 4-week window, but explaining in terms of months will make the approach easier to grasp. I will refer to releases as release[month][week] where week is 1-4.
- During the [PROMOTE] phase of release[N][1], identify the “best” prod release from month N-1. Doing this at the [PROMOTE] phase of release[N][1] ensures that even the latest prod release (release[N-1][4]) has been in the wild for at least one full week.
- Promote lts-rc to lts.
- Assign lts-rc to the release chosen in the first step.
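The three steps above can be sketched as a single monthly promotion function. This is only an illustration under stated assumptions: the document does not define how the “best” release is chosen, so the `open_p0_issues` field and the tie-break on the newest week are hypothetical, as are all class and function names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProdRelease:
    month: int           # "month" in the 4-week sense used above
    week: int            # 1-4
    open_p0_issues: int  # hypothetical quality signal


@dataclass
class LtsChannels:
    lts: Optional[ProdRelease] = None
    lts_rc: Optional[ProdRelease] = None


def pick_best(candidates: List[ProdRelease]) -> ProdRelease:
    """Hypothetical criterion: fewest open P0s, then the newest week."""
    return min(candidates, key=lambda r: (r.open_p0_issues, -r.week))


def monthly_promotion(
    channels: LtsChannels, month_n: int, prod_history: List[ProdRelease]
) -> LtsChannels:
    """Run once, during the promotion of release[N][1]."""
    # Identify the best prod release from month N-1.
    candidates = [r for r in prod_history if r.month == month_n - 1]
    if not candidates:
        raise ValueError("no prod releases found for the previous month")
    best = pick_best(candidates)
    # Promote the current lts-rc to lts, then assign the chosen release
    # to lts-rc.
    return LtsChannels(lts=channels.lts_rc, lts_rc=best)
```

Returning a new `LtsChannels` value rather than mutating in place keeps both assignments atomic: lts always receives whatever lts-rc held before the new candidate was chosen.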
Some effects of this cadence, in no particular order:
- Actual lts cuts may be very close together (e.g. Nov-w4 and Dec-w1) or far apart (e.g. Dec-w1 and Jan-w3)
- Each lts cut will be lts-rc for a full “month” before reaching lts
- When lts-rc is assigned, it will be at least one week and at most four weeks since it was promoted to the weekly prod
- When lts is released, it will be at least five weeks and at most nine weeks since it was promoted to the weekly prod
LTS Release Candidate
There are two similar but subtly distinct problems third-party clients face which could be addressed by lts releases:
- There is too little time between releases for sites (ex. AliExpress) to fix new issues
- There is too little time between a canary being cut and its promotion to prod for sites to identify issues before they reach production
Releasing approximately monthly solves the first problem. By also providing an lts-rc, we give third parties a month-long window to validate the new version on their site. It would be possible to abandon the lts-rc channel entirely, and let lts operate in its place with a running delay of anywhere from 1-5 weeks.
Advantages
- Four-week window for third-parties to validate releases before they are deployed to users (and RTV lock if they see a pending breakage)
- Most or all P0s would be addressed long before a release reaches lts
- Clean integration with existing rc/canary cookies and splits (e.g. AliExpress automatically has users testing the next lts release candidate)
Disadvantages
- Possibly more complexity in request rewriting, caching, etc.
- More targets to cherry-pick into if bugs are found
- lts can be behind prod by as much as nine weeks
Proposed choice: Provide an lts-rc, reachable with an opt-in cookie (not with any sort of automatic Mendel splits). This will allow developers of big sites plenty of time to see how their site will be affected by an upcoming release, without affecting the experience for users.
If the delay between prod and lts is a significant issue, the lts-rc window could be shortened arbitrarily; lts-rc could serve lts except for N weeks before the next lts release. This lets us decide how far in advance vendors on the lts channel get a release candidate to test.
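A minimal sketch of the proposed serving rule, combining the opt-in requirement with an adjustable lts-rc window. The function name, its parameters, and the way the opt-in is detected are all hypothetical; the document does not specify a cookie value for lts-rc.

```python
def channel_for_request(
    opted_in_to_lts_rc: bool,
    weeks_until_next_lts: int,
    rc_window_weeks: int = 4,
) -> str:
    """Return which build ("lts" or "lts-rc") a request should receive.

    opted_in_to_lts_rc: whether the request carries the opt-in cookie
        (how that is detected is out of scope here).
    weeks_until_next_lts: weeks remaining before lts-rc is promoted to lts.
    rc_window_weeks: the N weeks before the next lts release during which
        lts-rc serves the candidate; outside that window it serves lts.
    """
    if rc_window_weeks < 0:
        raise ValueError("rc_window_weeks must be non-negative")
    # No automatic splits: without the opt-in, a request never sees lts-rc.
    if not opted_in_to_lts_rc:
        return "lts"
    if weeks_until_next_lts <= rc_window_weeks:
        return "lts-rc"
    return "lts"
```

With the default four-week window an opted-in developer always sees the candidate; setting `rc_window_weeks=1` would expose it only during the final week before promotion.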
About this issue
- State: closed
- Created 5 years ago
- Comments: 37 (32 by maintainers)
This was launched and posted about on the AMP Blog. Closing
Stories are AMP, so they would be affected by this. If we don’t want to support this for Stories then that’s a more involved process.
On Fri, Dec 6, 2019 at 9:56 AM Ryan Cebulko notifications@github.com wrote:
Was just going to write that email, ads and actions validator specs probably shouldn’t allow this. 👍
“Stories” is not a validator spec like the others though.
I think updating AMP’s release schedule documentation and an FYI (as you have already done) are good enough here.
@rcebulko the /sp/, /mp/, and /esm/ paths are all experiment buckets which other caches are not aware of. These paths are temporary and will be removed by Q1