newspaper: Article `download()` failed with 404 Client Error
Hi,
I keep getting this error message: Article download() failed with 404 Client Error: Not Found for url: http://www.foxnews.com/2017/09/22/sheriff-clarke-trump-wins-either-way-luther-strange-roy-moore-alabama-senate-race on URL http://www.foxnews.com/2017/09/22/sheriff-clarke-trump-wins-either-way-luther-strange-roy-moore-alabama-senate-race
It happens for various article URLs.
Here is the code I am using:
`news_content = newspaper.build(url)
for eachArticle in news_content.articles:
    i = i + 1
    article = news_content.articles[i]
    article.download()  # now download and parse each article
    article.parse()
    article.nlp()
    backupfile.write("\n" + "--------------------------------------------------------------" + "\n")
    backupfile.write(str(article.keywords))
    datasetfile.write("\n" + "----SUMMARY ARTICLE-> No. " + str(i) + "\n")
    datasetfile.write(article.summary)  # only the summary of the article is written to the dataset directory
    backupfile.write("\n" + "----SUMMARY ARTICLE---" + "\n")
    backupfile.write(article.summary)
    backupfile.write("\n" + "----TEXT INSIDE ARTICLE---" + "\n")
    backupfile.write(article.text)
    time.sleep(2)`
Attached below is a screenshot of the error.

About this issue
- State: closed
- Created 7 years ago
- Comments: 17 (2 by maintainers)
I posted the solution here: https://stackoverflow.com/a/63060794/2414957
I just used a simple try/except structure. It seems to work just fine (at least for the 404 error I was seeing). Code below; don't mind the splitting and stuff 😃
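The commenter's original snippet is not preserved here, so the following is only a minimal sketch of the try/except idea, not their code. The helper name `summarize_articles` is hypothetical. It assumes each item behaves like a newspaper `Article` (has `url`, `download()`, `parse()`, `nlp()` and `summary`). As far as I know, newspaper3k reports a failed download as `newspaper.article.ArticleException` when `parse()` is called; the sketch catches the broader `Exception` so it does not depend on that detail.

```python
def summarize_articles(articles):
    """Process each article, skipping any that fail to download or parse.

    Returns (results, failures):
      results  -> list of (url, summary) for articles that worked
      failures -> list of (url, error message) for articles that did not
    """
    results, failures = [], []
    for article in articles:
        try:
            article.download()
            article.parse()   # a failed download (e.g. 404) typically surfaces here
            article.nlp()
        except Exception as exc:  # assumption: broad catch; narrow it if you prefer
            failures.append((article.url, str(exc)))
            continue              # move on to the next article
        results.append((article.url, article.summary))
    return results, failures
```

With newspaper installed, this could be called as `results, failures = summarize_articles(newspaper.build(url).articles)`. Recording the failures instead of silently ignoring them makes it easy to check afterwards which URLs were actually bad.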
url.strip() will not fix a bad URL. See URL above returned by cnn object. Click on it. It is a bad URL.
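To make the point about `url.strip()` concrete, here is a small sketch of what stripping does. The helper name `clean_urls` and the example.com URLs are made up for illustration. Stripping only removes leading and trailing whitespace such as a newline left over from reading a file; it cannot repair a URL whose path does not exist on the server.

```python
def clean_urls(lines):
    """Strip surrounding whitespace from each line and drop blank lines."""
    return [line.strip() for line in lines if line.strip()]

# Hypothetical input, e.g. lines read from a text file of URLs.
raw = ["http://example.com/a\n", "  http://example.com/b  ", "\n"]
cleaned = clean_urls(raw)

# The path part of each URL is untouched: if the server has no page
# at that path, the request will still return 404 after stripping.
print(cleaned)
```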
First you told me just to do “except:” now you are telling me there is no error handling?
One of my colleagues had the same problem. She stripped the newline characters from the URL strings using url.strip() and the error stopped.
If getting the text is all you want to do, and you already get the page from curl or Python requests, then use newspaper's fulltext.