ArchiveBox: Bugfix: Error: Search Backend only searching default admin search fields
Describe the bug
Bug:
The bug occurs when I attempt to search any query. An error message appears saying: "Error from the search backend, only showing results from default admin search fields -Error:[Errno -3] Temporary failure in name resolution."
If the search query is a word in the title of a snapshot, the search returns the snapshots with that word in the title. If the word appears only in the wget snapshot of the item, that item is not returned.
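For context, "[Errno -3] Temporary failure in name resolution" is the message Python attaches to a socket.gaierror when a hostname lookup fails, which suggests the web UI could not resolve the hostname of the search backend. The following is a minimal sketch for checking a lookup by hand, for example from a Python shell inside the container. The function name can_resolve is hypothetical, and the hostname "sonic" mentioned afterwards is an assumption about how the search backend is addressed in this setup, not something confirmed in this report.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if `hostname` resolves; print the resolver error otherwise."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror as err:
        # On Linux, errno -3 (EAI_AGAIN) corresponds to
        # "Temporary failure in name resolution".
        print(f"{hostname!r} did not resolve: [Errno {err.errno}] {err.strerror}")
        return False
```

Calling can_resolve("127.0.0.1") should return True, since a numeric address needs no lookup. If can_resolve("sonic") (assumed hostname) returns False with errno -3, the backend host is probably not reachable by name from the ArchiveBox container.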
Context:
I am running ArchiveBox on Windows 10 with docker-compose and have launched the web UI, which I am successfully accessing at http://127.0.0.1:8000. As far as I can tell, all the snapshots are functional and there are no pending links. The output directory is on an external hard drive, but there have been no issues reading/writing from this drive (except for speed, though I can’t tell if that’s just how the Django Web UI is or not).
Relevant Info: The bug seems similar to @jdcaballerov's comment about search being enabled while the backend failed in his testing (see screenshot 4).
Steps to reproduce
mkdir archivebox && cd archivebox
curl -O https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/docker-compose.yml
Edit the volumes section of docker-compose.yml to read:
volumes:
- ./data:/data
- D:\\archivebox:/mnt/d/archivebox
(unsure if external drive-specific setup needed to reproduce, so wanted to include)
Open a Windows Terminal in administrator mode, navigate to D:/archivebox/, open a Git Bash tab and run the following:
> docker-compose up -d
> docker-compose run archivebox init
> docker-compose run archivebox manage createsuperuser
> docker-compose run archivebox add 'https://www.dailydot.com/parsec/fandom/dieselpunk-steampunk-beginners-guide/'
Navigate to http://127.0.0.1:8000 and search “beginners” (see screenshot 1). Because the word is in the title, the snapshot will show up. The error message will also show up.
Search “biopunk” (see screenshot 2). Even though the word is in the wget file, the snapshot will not show up (see screenshot 3). The error message will show up. I have not done extensive testing on whether snapshots of other file types get searched, but I don’t think any of them are picked up if the search term is not in the title.
Screenshots or log output
Screenshot 1:
Screenshot 2:
Screenshot 3:
Screenshot 4:
ArchiveBox version
ArchiveBox v0.5.6
Cpython Linux Linux-4.19.128-microsoft-standard-x86_64-with-glibc2.28 x86_64 (in Docker)
[i] Dependency versions:
√ ARCHIVEBOX_BINARY v0.5.6 valid /usr/local/bin/archivebox
√ PYTHON_BINARY v3.9.1 valid /usr/local/bin/python3.9
√ DJANGO_BINARY v3.1.3 valid /usr/local/lib/python3.9/site-packages/django/bin/django-admin.py
√ CURL_BINARY v7.64.0 valid /usr/bin/curl
√ WGET_BINARY v1.20.1 valid /usr/bin/wget
√ NODE_BINARY v15.8.0 valid /usr/bin/node
√ SINGLEFILE_BINARY v0.1.14 valid /node/node_modules/single-file/cli/single-file
√ READABILITY_BINARY v0.1.0 valid /node/node_modules/readability-extractor/readability-extractor
√ MERCURY_BINARY v1.0.0 valid /node/node_modules/@postlight/mercury-parser/cli.js
√ GIT_BINARY v2.20.1 valid /usr/bin/git
√ YOUTUBEDL_BINARY v2021.02.04.1 valid /usr/local/bin/youtube-dl
√ CHROME_BINARY v88.0.4324.146 valid /usr/bin/chromium
√ RIPGREP_BINARY v0.10.0 valid /usr/bin/rg
[i] Source-code locations:
√ PACKAGE_DIR 22 files valid /app/archivebox
√ TEMPLATES_DIR 3 files valid /app/archivebox/templates
[i] Secrets locations:
- CHROME_USER_DATA_DIR - disabled
- COOKIES_FILE - disabled
>docker --version
Docker version 20.10.2, build 2291f61
>docker-compose --version
docker-compose version 1.27.4, build 40524192
About this issue
- State: closed
- Created 3 years ago
- Comments: 16 (9 by maintainers)
Well that’s that then I suppose. Is there a way to set a “timeout” per link on the docker-compose run archivebox update --index-only command, so that it skips any link it spends more than, say, a minute trying to index?
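One generic way to approximate a per-link timeout, independent of any option ArchiveBox itself may offer, is to run one command per link from a small wrapper script and give up on any run that exceeds a time limit. The sketch below uses Python's subprocess.run with its timeout parameter. The helper names run_with_timeout and index_each are hypothetical, and appending a URL to the base command to restrict a run to a single link is an assumption that should be checked against the command's own help output.

```python
import subprocess
from typing import List, Sequence


def run_with_timeout(cmd: Sequence[str], seconds: float) -> bool:
    """Run `cmd`; return True only if it exits with status 0 within `seconds`."""
    try:
        result = subprocess.run(list(cmd), timeout=seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process once the timeout expires.
        print(f"skipped after {seconds}s: {' '.join(cmd)}")
        return False
    return result.returncode == 0


def index_each(base_cmd: Sequence[str], urls: Sequence[str],
               seconds: float = 60.0) -> List[str]:
    """Run `base_cmd + [url]` for each URL; return the URLs that failed or timed out."""
    skipped = []
    for url in urls:
        # Assumption: the command accepts a single URL as its last argument.
        if not run_with_timeout(list(base_cmd) + [url], seconds):
            skipped.append(url)
    return skipped
```

With the command from this thread, base_cmd would be ["docker-compose", "run", "archivebox", "update", "--index-only"]. One caveat: the timeout kills the docker-compose client process, and the container it started may keep running, so skipped links may still need manual cleanup.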