ArchiveBox: Link parsing: Pinboard private feeds don't seem to get parsed properly
I would love to have the cron job that monitors my Pocket feed also monitor my private Pinboard feed. However, every method in the instructions for passing the feed to bookmark-archiver fails in its own way.
If I pass a public feed, like http://feeds.pinboard.in/rss/u:username/, it works fine. But if I pass a private feed, like https://feeds.pinboard.in/rss/secret:xxxx/u:username/private/, it errors out. I have tried the RSS, JSON, and Text feeds, and none work.
Examples follow (I’ve simply replaced the actual feed I used for testing with the demo URL Pinboard provides):
./archive "https://feeds.pinboard.in/rss/secret:xxxx/u:username/private/"
[*] [2018-10-18 21:14:03] Downloading https://feeds.pinboard.in/rss/secret:xxxx/u:username/private/ > output/sources/feeds.pinboard.in-1539897243.txt
[X] No links found :(
./archive "https://feeds.pinboard.in/json/secret:xxxx/u:username/private/"
[*] [2018-10-18 21:13:46] Downloading https://feeds.pinboard.in/json/secret:xxxx/u:username/private/ > output/sources/feeds.pinboard.in-1539897226.txt
Traceback (most recent call last):
  File "./archive", line 161, in <module>
    links = merge_links(archive_path=out_dir, import_path=source)
  File "./archive", line 53, in merge_links
    raw_links = parse_links(import_path)
  File "/home/USERNAME/datahoarding/bookmark-archiver/archiver/parse.py", line 54, in parse_links
    links += list(parser_func(file))
  File "/home/USERNAME/bookmark-archiver/archiver/parse.py", line 108, in parse_json_export
    url = erg['url']
KeyError: 'url'
./archive "https://feeds.pinboard.in/text/secret:xxxx/u:username/private/"
[*] [2018-10-18 21:17:57] Downloading https://feeds.pinboard.in/text/secret:xxxx/u:username/private/ > output/sources/feeds.pinboard.in-1539897477.txt
[X] No links found :(
Even though the script says that no links were found, they are definitely there, and simply pasting the URL into a browser outputs the feed in the proper format. I have used this script successfully with other methods: the Pinboard manual export, the Pocket manual export and RSS feed, and a browser export. Is this just not a supported method for importing/monitoring?
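For the JSON case, the `KeyError: 'url'` in the traceback suggests that the feed entries do not carry a `url` key, so the lookup fails on the first entry. Below is a minimal sketch of a more tolerant lookup. The function name `extract_urls` and the candidate key names `href` and `u` are assumptions for illustration, not taken from the ArchiveBox code or confirmed in this thread, so check them against the actual feed output.

```python
import json

# Candidate key names are assumptions for illustration only;
# inspect the real feed to see which key holds the bookmark URL.
URL_KEYS = ("url", "href", "u")


def extract_urls(json_text):
    """Return the URLs from a JSON list of bookmark entries.

    Entries that have none of the candidate keys are skipped
    instead of raising KeyError.
    """
    entries = json.loads(json_text)
    urls = []
    for entry in entries:
        for key in URL_KEYS:
            value = entry.get(key)
            if value:
                urls.append(value)
                break
    return urls
```

Using `dict.get` with a list of fallbacks means one unexpected entry shape does not abort the whole import, at the cost of silently skipping entries that have no recognizable URL field.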
About this issue
- State: closed
- Created 6 years ago
- Comments: 19 (16 by maintainers)
From the settings -> backup page:
Legacy HTML (seems to be broken HTML/XML?)
XML
JSON
Private RSS feed:
I’ve run into the same problem. I solved it with a little Go program that logs in to Pinboard and clicks the actual “backup my bookmarks in legacy Netscape format” button, which works fine for me.
Seems to work for me on the most recent master (ce257949b4468c77412c026b5987c3f37bad6443). 😃 Thanks a ton.
My original issue doesn’t seem to be the same problem that @f0086 is dealing with.
I am very sorry, but it does not work. You are using the wrong URLs. You need to use the URL in the <link></link> tag. I will have a look at this. #123 seems related to this 😃
EDIT: OK, I had a quick look at the code but did not find a proper solution. I think the xml.etree.ElementTree component is not working as expected, but I am not a Python guy, so I am not sure about that. My setup (see above) works great for me, so I have no interest in spending an evening debugging this for now, sorry 😦 Maybe it is not worth it anyway, because of #123?
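One possible cause, not confirmed in this thread, is XML namespaces: if the feed declares a default namespace, ElementTree reports tags as `{namespace}item` rather than `item`, so a plain lookup for `item` finds nothing. Here is a minimal sketch that reads the URL from each item's <link></link> tag regardless of namespace. The function names are hypothetical and this is not the ArchiveBox parser.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def extract_item_links(xml_text):
    """Return the text of the first <link> child of every <item> element."""
    root = ET.fromstring(xml_text)
    links = []
    for element in root.iter():
        if local_name(element.tag) != "item":
            continue
        for child in element:
            if local_name(child.tag) == "link" and child.text:
                links.append(child.text.strip())
                break
    return links
```

Comparing local names avoids hard-coding a namespace URI, so the same function handles both namespaced and plain feeds. The channel-level <link> is ignored because only children of <item> are inspected.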