lychee: lychee shows network error: forbidden for valid links
Lychee has suddenly started reporting a 403 Forbidden error for the links below, even though they are valid and open fine in a browser.
❯ lychee --max-concurrency 1 --no-progress --verbose "work/ok.txt"
✗ [403] https://catboost.ai/ | Network error: Forbidden
✗ [403] https://catboost.ai/en/docs/concepts/python-reference_datasets_msrank | Network error: Forbidden
Issues found in 1 input. Find details below.
[work/ok.txt]:
✗ [403] https://catboost.ai/ | Network error: Forbidden
✗ [403] https://catboost.ai/en/docs/concepts/python-reference_datasets_msrank | Network error: Forbidden
🔍 2 Total ✅ 0 OK 🚫 2 Errors (HTTP:2)
Contents of work/ok.txt
https://catboost.ai/
https://catboost.ai/en/docs/concepts/python-reference_datasets_msrank
Lychee version
❯ lychee --version
lychee 0.10.1
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 41 (27 by maintainers)
Seems like there isn’t much upstream traction, and it’s not something we can fix on our side, so I’m gonna go ahead and close this. If the upstream issue gets fixed, we can reopen and integrate reqwest-impersonate. Apologies if this is not the outcome y’all were hoping for, but I think we need to find another way.
No updates, but if I find the time I will create a pull request to integrate reqwest-impersonate as a fallback backend. It will be an optional library feature, but it will be enabled by default in the binary. It’s a great match because I want to refactor the client code anyway soon. Thanks for the reminder.
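For readers wondering what a "fallback backend" could look like, here is a minimal sketch. It is not lychee's code: the `Backend` trait, both structs, and `check_with_fallback` are hypothetical names, and the hard-coded 403/200 responses only mimic the behaviour reported in this thread rather than doing any real network I/O.

```rust
// Hypothetical sketch: none of these names come from lychee or reqwest.

/// A backend that can check a URL and report the HTTP status code.
trait Backend {
    fn name(&self) -> &'static str;
    fn status(&self, url: &str) -> u16;
}

/// Stand-in for the default client, which the server rejects (assumed 403).
struct DefaultBackend;

/// Stand-in for a browser-impersonating client, which is accepted (assumed 200).
struct ImpersonatingBackend;

impl Backend for DefaultBackend {
    fn name(&self) -> &'static str {
        "default"
    }
    fn status(&self, _url: &str) -> u16 {
        403
    }
}

impl Backend for ImpersonatingBackend {
    fn name(&self) -> &'static str {
        "impersonate"
    }
    fn status(&self, _url: &str) -> u16 {
        200
    }
}

/// Try the primary backend; retry with the fallback only on 403 Forbidden.
/// Returns the final status and the name of the backend that produced it.
fn check_with_fallback(
    primary: &dyn Backend,
    fallback: &dyn Backend,
    url: &str,
) -> (u16, &'static str) {
    let status = primary.status(url);
    if status == 403 {
        (fallback.status(url), fallback.name())
    } else {
        (status, primary.name())
    }
}

fn main() {
    let (status, backend) =
        check_with_fallback(&DefaultBackend, &ImpersonatingBackend, "https://catboost.ai/");
    println!("{status} via {backend}");
}
```

Retrying only on 403 keeps the default client as the common path, so the heavier backend would be used only for servers that reject it.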
I won’t be able to test it for a while because I got covid (again).
On your thoughts,
Bad news. I wanted to integrate this, but I don’t think it’s possible right now.
reqwest-impersonate patches some dependencies (e.g. hyper) and therefore cannot be published on crates.io. If we integrate it into lychee, that means we couldn’t publish the library on crates.io either, even if we put reqwest-impersonate behind a feature flag that is disabled by default. See https://github.com/rust-lang/cargo/issues/6738. Is there a possibility that I don’t see right now?
@mre
I don’t know the answer for the second question.
For the first one, I suggest testing it on other related issues where browsers are able to open a URL but curl and lychee are not.
Dang. It works. 😞
That means if we integrate that backend into lychee it would solve your issue. Two questions (@lebensterben)
Okay thanks. The second one should not have failed. It’s an error on my end. However I do expect it to fail just like the first test with request. At least the results have always been consistent between them on my end.
The last one is the most promising, but you need to install boringssl for it first.
I’ve added support for it to getcurl-test and it indeed works:
Tested locally and inside a Github codespace. Can you both test it on your machines as well? Just clone the project and run the command above.
If it works I really don’t know if we should add reqwest-impersonate to the project. Might be a maintenance issue down the road as it could diverge from reqwest and is maintained by a single (yet awesome) person.
@mre
With my curl version
I’ve created a new git repository to make a mock test, and the result is negative.
Repository: https://github.com/Rizwan-Hasan/test-lychee-links
Logs: https://github.com/Rizwan-Hasan/test-lychee-links/runs/7826437363
Hi @mre ,
It originally happened on GitHub Actions. The same happened when I tried it on my local PC. I ran it several times on GitHub Actions too and the result is still the same. Here’s the recent GitHub Actions log: https://github.com/Rizwan-Hasan/clearml-docs/runs/7825833483 And this is from 3 days earlier: https://github.com/Rizwan-Hasan/clearml-docs/runs/7787255012
We need it to work on GitHub Actions.
Hum, looks like it’s an issue on your end. 🤔 At least it works over here:
Can you try from a different network? Or maybe reconnect to your wifi? Maybe it was just a temporary hiccup?