gatsby: [gatsby-source-medium] Plugin fails due to new Medium Cloudflare DDoS prevention
Description
Because Medium has integrated Cloudflare’s anti-DDoS protection (a reCAPTCHA challenge), the gatsby-source-medium plugin now works only sporadically and fails builds.
This is a new feature they’ve just implemented, so it’s nothing the plugin has done wrong, of course! It’s just something that needs to be worked around.
Possible way to fix
Cloudscraper
I’ve not come across an issue like this before, but a bit of research has led me to cloudscraper, which reports that it can bypass the Cloudflare screen from a Node process.
Happy to work on a PR to get this integrated, but I’m looking for advice/suggestions before starting, as getting around the Cloudflare screen is new to me!
UPDATE: On closer inspection I don’t think this will work, as the module doesn’t bypass reCAPTCHA. In the module’s issues the author suggests using a paid service, which isn’t something I think Gatsby would want to integrate.
Use RSS feed
Using a different endpoint, https://medium.com/feed/${nameOfBlog},
you can get an RSS export of the blog/publication. This is simple enough to parse to JSON (and then parse the HTML content), but it would require a bit more thought to match the plugin’s current implementation. Another issue with this potential fix is that the RSS feed only gives you the latest 10 posts from the blog, rather than the 100 that the Gatsby plugin requests.
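To make the RSS idea a bit more concrete, here is a minimal sketch of turning a feed document into plain objects. The helper names (feedUrl, tagText, parseRssItems) are hypothetical, the fields read are the common RSS 2.0 ones and are an assumption about what Medium's feed contains, and regex parsing is used only to keep the sketch dependency-free. A real implementation would likely use a proper XML parser.

```javascript
// Hypothetical helper: build the RSS feed URL for a blog or publication.
function feedUrl(nameOfBlog) {
  return `https://medium.com/feed/${nameOfBlog}`;
}

// Return the text of the first <tag>...</tag> in an XML fragment,
// unwrapping a CDATA section if one is present. Sketch only.
function tagText(fragment, tag) {
  const match = fragment.match(
    new RegExp(`<${tag}[^>]*>([\\s\\S]*?)</${tag}>`)
  );
  if (!match) return null;
  const cdata = match[1].match(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/);
  return (cdata ? cdata[1] : match[1]).trim();
}

// Convert every <item> in the feed into a plain object.
// The field names are assumptions based on standard RSS 2.0 feeds.
function parseRssItems(xml) {
  const items = xml.match(/<item>[\s\S]*?<\/item>/g) || [];
  return items.map((item) => ({
    title: tagText(item, "title"),
    link: tagText(item, "link"),
    pubDate: tagText(item, "pubDate"),
    html: tagText(item, "content:encoded"),
  }));
}
```

The feed body would still have to be fetched separately, and the 10-post limit described above applies regardless of how it is parsed.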
Steps to reproduce
- Add plugin to build process
- Use plugin in creating a page
- Run build
- See 403 errors in build from axios detailing the failed request to url,
https://medium.com/${nameOfBlogHere}/latest?format=json&limit=100
Expected result
Build should run without errors
Actual result
Build fails with 403 responses because the captcha is never completed.
Environment
Builds fail locally and on Netlify environment.
System:
OS: macOS 10.14.6
CPU: (12) x64 Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Shell: 3.0.2 - /usr/local/bin/fish
Binaries:
Node: 10.15.1 - /usr/local/bin/node
Yarn: 1.17.3 - ~/.yarn/bin/yarn
npm: 5.6.0 - /usr/local/bin/npm
Languages:
Python: 2.7.10 - /usr/bin/python
Browsers:
Chrome: 76.0.3809.132
Firefox: 68.0.2
Safari: 12.1.2
npmPackages:
gatsby: ^2.8.3 => 2.14.0
gatsby-image: 2.0.20 => 2.0.20
gatsby-plugin-env-variables: ^1.0.1 => 1.0.1
gatsby-plugin-favicon: 3.1.4 => 3.1.4
gatsby-plugin-google-analytics: 2.0.7 => 2.0.7
gatsby-plugin-google-tagmanager: ^2.0.6 => 2.1.7
gatsby-plugin-modal-routing: ^1.0.0 => 1.0.2
gatsby-plugin-netlify: ^2.1.3 => 2.1.10
gatsby-plugin-prefetch-google-fonts: ^1.4.2 => 1.4.3
gatsby-plugin-react-helmet: 3.0.2 => 3.0.2
gatsby-plugin-robots-txt: ^1.5.0 => 1.5.0
gatsby-plugin-sitemap: 2.0.2 => 2.0.2
gatsby-plugin-styled-components: 3.0.3 => 3.0.3
gatsby-source-contentful: 2.1.28 => 2.1.28
gatsby-source-filesystem: ^2.0.8 => 2.1.18
gatsby-source-lever: 2.0.9 => 2.0.9
gatsby-source-medium: 2.0.8 => 2.0.8
gatsby-transformer-sharp: ^2.1.8 => 2.2.10
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 10
- Comments: 46 (32 by maintainers)
Commits related to this issue
- Remove Medium integration because it no longer works https://github.com/gatsbyjs/gatsby/issues/17335 — committed to storybookjs/frontpage by domyen 5 years ago
- Temporary workaround for Medium CAPTCA issue: https://github.com/gatsbyjs/gatsby/issues/17335#issuecomment-529912619 — committed to CivicActions/civicactions.com by grugnog 5 years ago
- Merge pull request #74 from EmaSuriano/fix/medium-plugin-error fix/medium-plugin-quick-patch — committed to EmaSuriano/gatsby-starter-mate by EmaSuriano 5 years ago
- Updating deps and components * gatsby-source-medium 403 issue [https://github.com/gatsbyjs/gatsby/issues/17335](click here) * hardcode blog posts * update styling — committed to natalieolivo/natalieolivo.github.io by deleted user 4 years ago
- Remove Medium integration because it no longer works https://github.com/gatsbyjs/gatsby/issues/17335 — committed to jackwolfskin0302/frontend by jackwolfskin0302 5 years ago
I haven’t checked if it’s the same payload, and it’s probably not the most reliable solution, but if you try omitting /latest from the endpoint, the plugin works again (but again, it’s a temporary solution). So the URL becomes:
https://medium.com/${nameOfBlogHere}?format=json&limit=10
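For comparison, the two endpoint shapes mentioned in this thread can be sketched as one small URL builder. The function name and its options are hypothetical and are not part of the plugin. It only illustrates the difference between the original request and the workaround.

```javascript
// Hypothetical helper: build the JSON endpoint URL, with or without
// the "/latest" path segment discussed in this thread.
function mediumJsonUrl(nameOfBlog, { latest = true, limit = 100 } = {}) {
  const path = latest ? `${nameOfBlog}/latest` : nameOfBlog;
  return `https://medium.com/${path}?format=json&limit=${limit}`;
}

// Original request shape, as in the steps to reproduce:
//   mediumJsonUrl("my-blog")
// Workaround shape, without "latest":
//   mediumJsonUrl("my-blog", { latest: false, limit: 10 })
```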
IMO getting 10 posts is better than nothing. People might want to show their latest posts somewhere. Feel free to discuss the future of the plugin here or in a new issue (but please close this one here then) and if e.g. the JSON solution would be feasible. In the meantime the documentation should probably be updated to make people aware of these issues.
I came up with a similar solution - fetching the JSON file manually and using a local plugin: https://github.com/smartive/smartive.ch/commit/fcaf8d588d40a967035d3b8fec7e3a25f4a5f916
I’m still waiting to hear back from medium, they said they would speak to their platform team.
The workaround works but severely limits the plugin’s features. We’ve been able to get by using other methods.
I agree with @LekoArts, I’d love for this plugin to work even if only in a limited capacity. Right now it doesn’t work at all.
@EmaSuriano - here you go
Looking over the gatsby-starter-mate starter, I realise this is not the best solution, but we’ve not been able to find a different one… Oh, I just saw your reference; we use Contentful as well. I can share the script in a gist when I’m home.
@EmaSuriano - I’m still waiting to hear back from Medium about the Cloudflare protection. I will reach out again on Monday for a status update, but ultimately I think we should prepare to not rely on this plugin, as it depends on the JSON feed.
It was breaking our build as well; we weren’t able to deploy to production or even create previews. What we opted for in the end was to manually copy and paste the JSON into a local file, break apart the plugin’s code to grab the bits we needed, and then use the new code to upload the relevant data to our CMS. We’re in the middle of a blog migration, so we had some things in place already, but the concept should be fairly straightforward if you have a CMS with API write access or you can write to local files. Obviously, this might not be your best solution if you have a few hundred posts you want to collate.
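As a rough illustration of the "saved JSON file" approach, the sketch below parses the text of a manually saved response and collects the post objects from it. Everything about the payload shape here (payload.posts, payload.references.Post, an id on each post) is an assumption about what the saved file looks like, and the function names are hypothetical. It also assumes the saved text may begin with a non-JSON guard prefix, which is skipped by starting at the first opening brace.

```javascript
// Parse the raw text of a manually saved JSON response.
// Assumption: the text may start with a non-JSON guard prefix, so
// parsing begins at the first "{" rather than at position 0.
function parseSavedMediumJson(raw) {
  const start = raw.indexOf("{");
  if (start === -1) throw new Error("No JSON object found in saved file");
  return JSON.parse(raw.slice(start));
}

// Collect post objects from the parsed response, deduplicated by id.
// Assumption: posts may appear as an array under payload.posts and/or
// as an id-keyed map under payload.references.Post.
function extractPosts(json) {
  const payload = json.payload || {};
  const fromList = Array.isArray(payload.posts) ? payload.posts : [];
  const fromRefs = Object.values((payload.references || {}).Post || {});
  const byId = new Map();
  for (const post of [...fromList, ...fromRefs]) byId.set(post.id, post);
  return [...byId.values()];
}
```

The resulting array could then be written to local files or pushed to a CMS, depending on which of the two options described above applies.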
I don’t think rewriting the plugin to use the API will work, as it doesn’t give us the same data this JSON feed does; it only supports publishing posts.
If you’re really stuck and need this data, it is possible to get an export of your blog or publication from Medium in HTML or XML. From this export you can get all the data you need; for posts published after the export, you can then use the RSS feed to grab the 10 latest.
A bit of a ramble, but if you need any more help or have questions, let me know and I’ll try to assist!
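Combining a one-off export with the latest feed items could look something like the sketch below. It assumes both sources have already been converted into arrays of objects with link and pubDate fields, which is an assumption made for illustration; the function name is hypothetical.

```javascript
// Merge posts from a one-off export with the latest posts from the feed.
// Assumption: both inputs are arrays of { link, title, pubDate } objects.
// When the same link appears in both, the feed copy replaces the export copy.
function mergePosts(exported, latest) {
  const byLink = new Map();
  for (const post of exported) byLink.set(post.link, post);
  for (const post of latest) byLink.set(post.link, post);
  // Newest first, based on the parsed publication date.
  return [...byLink.values()].sort(
    (a, b) => new Date(b.pubDate) - new Date(a.pubDate)
  );
}
```

Deduplicating by link is just one choice; any identifier that is stable across the export and the feed would do.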
@grugnog - agreed, the RSS feed is an option, but as detailed in my original post it limits the number of returned posts to a maximum of 10. It lacks a few fields that the JSON gives us, but it does give you the entire post content.
From my understanding, the Medium API doesn’t give you any sort of post content, only the ability to create posts and list a user’s publications…
I have reached out to Medium support to ask for some clarification on this. It seems fairly strange to add DDoS protection that requires human interaction to a JSON feed, unless they’re trying to lock down access to post data, which would be understandable.