amphtml: Bing.com AMP cache sets incorrect __amp_source_origin value on CORS requests

What’s the issue?

CORS requests made from AMP documents hosted by Bing.com’s AMP cache are failing due to incorrect domain being set in the __amp_source_origin query param. The value of this param is being set to the AMP cache domain rather than the publisher’s domain. This causes CORS validation to fail on the source origin’s server.

How do we reproduce the issue?

Yahoo Sports document hosted by Bing’s AMP cache and served via bing.com:

https://www.bing.com/amp/s/sports.yahoo.com/amphtml/fantasy-basketball-rookies-who-will-provide-the-most-future-value-211618398.html

This results in a CORS validation failure, as seen in the console:

Access to fetch at ‘https://www.yahoo.com/caas/sidekick/sidekick?appid=amp&lang=en-US&region=US&site=sports&__amp_source_origin=https%3A%2F%2Fsports-yahoo-com.bing-amp.com’ from origin ‘https://sports-yahoo-com.bing-amp.com’ has been blocked by CORS policy: No ‘Access-Control-Allow-Origin’ header is present on the requested resource. If an opaque response serves your needs, set the request’s mode to ‘no-cors’ to fetch the resource with CORS disabled.

__amp_source_origin is being set to https://sports-yahoo-com.bing-amp.com instead of https://sports.yahoo.com

There’s some suspicion that this is related the configuration of the cache, specifically the cdnProxyRegex property, which defaults to the Google AMP CDN domain if not set. See this bit of code, where cdnProxyRegex is configured, otherwise it falls back to Google’s CDN:

https://github.com/ampproject/amphtml/blob/2755c10b95d54b98a587b7b091db7d5697684903/src/config.js#L41-L42

Worth asking - if this value is required to be set by a cache to a regex for its own domain in order for AMP documents to function properly, why offer a fallback value at all?

cc @choumx @ssantosms

What browsers are affected?

All

Which AMP version is affected?

Noticed on 1901081935550, suspect it’s always been an issue however

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

Actually @rsimha I can’t close this bug. Would you mind closing it? It has been fixed on our side.

@src-code I noticed that yesterday. Team is looking into it with priority. I will close this bug

@jridgewell @ssantosms It looks to me like there is an extra . in the regex that shouldn’t be there, before ‘bing-amp.com’:

/^https://([a-zA-Z0-9_-]+.)?.bing-amp.com$/

There’s another dot inside the capture group, so it’s expecting two dots in the string it’s matching. Also, I assume these should be literal periods, and thus escaped?

That’s a good point. I will validate and fix if needed.

We can change this to be a required setting in the config. /to @rsimha