ArchiveBox: Bugfix: Error: Search Backend only searching default admin search fields

Describe the bug

Bug: The bug occurs when I attempt to search any query. An error message appears saying: "Error from the search backend, only showing results from default admin search fields -Error:[Errno -3] Temporary failure in name resolution."
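The "[Errno -3] Temporary failure in name resolution" text is what Python reports when a hostname lookup fails (`socket.gaierror` with glibc's `EAI_AGAIN`), which suggests the web container cannot resolve the hostname of the search backend. Below is a minimal sketch for checking this from inside the container. The helper name `can_resolve` is hypothetical, and the hostname `sonic` and port `1491` in the usage note are assumptions about the search backend configuration, not values taken from this report.

```python
import socket


def can_resolve(host, port, resolver=socket.getaddrinfo):
    """Return (True, None) if `host` resolves, else (False, the lookup error).

    `resolver` is injectable so the check can be exercised without a network.
    """
    try:
        resolver(host, port)
        return True, None
    except socket.gaierror as err:
        # On glibc, err.errno == -3 corresponds to
        # "Temporary failure in name resolution".
        return False, err
```

For example, `can_resolve("sonic", 1491)` (assumed host and port) run inside the archivebox container would show whether the backend hostname is reachable by name. If it is not, the search backend service may not be running or may not be on the same Docker network.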

If the search query is a word in the title of a website, it will return results with that word in it. If it is only in the wget snapshot of the item, it will not return that item.

Context: I am running ArchiveBox on Windows 10 with docker-compose and have launched the web UI, which I am successfully accessing at http://127.0.0.1:8000. As far as I can tell, all the snapshots are functional and there are no pending links. The output directory is on an external hard drive, but there have been no issues reading from or writing to this drive (except for speed, though I can’t tell whether that’s just how the Django web UI behaves).

Relevant Info: The bug seems similar to what @jdcaballerov described in his comment, where search was enabled but the backend failed during his testing (see screenshot 4).

Steps to reproduce

mkdir archivebox && cd archivebox
curl -O https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/docker-compose.yml

Edit the volumes section of docker-compose.yml to read:

volumes:
  - ./data:/data
  - D:\\archivebox:/mnt/d/archivebox

(unsure if external drive-specific setup needed to reproduce, so wanted to include)

Open a Windows Terminal in administrator mode, navigate to D:/archivebox/, open a Git Bash tab and run the following:

> docker-compose up -d
> docker-compose run archivebox init
> docker-compose run archivebox manage createsuperuser
> docker-compose run archivebox add 'https://www.dailydot.com/parsec/fandom/dieselpunk-steampunk-beginners-guide/'

Navigate to http://127.0.0.1:8000 and search “beginners” (see screenshot 1). Because the word is in the title, the snapshot will show up. The error message will also show up.

Search “biopunk” (see screenshot 2). Even though the word is in the wget file, the snapshot will not show up (see screenshot 3). The error message will show up. I have not done extensive testing on whether snapshots of other file types get searched, but I don’t think any of them are picked up if the query is not in the title.
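The two searches above are consistent with a fallback pattern: title matching is always available, while full-text matches depend on a backend that is currently unreachable. The sketch below illustrates that pattern only. It is not ArchiveBox's actual code, and the names `search`, `snapshots`, and `backend_search` are hypothetical.

```python
def search(query, snapshots, backend_search):
    """Return (matching_urls, warning_or_None).

    snapshots:      list of dicts with "url" and "title" keys (hypothetical shape)
    backend_search: callable returning URLs whose archived content matches
    """
    q = query.lower()
    # Title matching needs no backend, so it always works.
    title_hits = {s["url"] for s in snapshots if q in s["title"].lower()}
    try:
        content_hits = set(backend_search(query))
    except OSError as err:
        # socket.gaierror is a subclass of OSError, so a failed hostname
        # lookup lands here and only the title matches are returned.
        return title_hits, f"Error from the search backend: {err}"
    return title_hits | content_hits, None
```

With a failing backend, a query such as “beginners” still matches through the title, while “biopunk”, which appears only in the archived page content, returns nothing.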

Screenshots or log output

Screenshot 1: image

Screenshot 2: image

Screenshot 3: image

Screenshot 4: image

ArchiveBox version

ArchiveBox v0.5.6
Cpython Linux Linux-4.19.128-microsoft-standard-x86_64-with-glibc2.28 x86_64 (in Docker)

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.5.6          valid     /usr/local/bin/archivebox
 √  PYTHON_BINARY         v3.9.1          valid     /usr/local/bin/python3.9
 √  DJANGO_BINARY         v3.1.3          valid     /usr/local/lib/python3.9/site-packages/django/bin/django-admin.py
 √  CURL_BINARY           v7.64.0         valid     /usr/bin/curl
 √  WGET_BINARY           v1.20.1         valid     /usr/bin/wget
 √  NODE_BINARY           v15.8.0         valid     /usr/bin/node
 √  SINGLEFILE_BINARY     v0.1.14         valid     /node/node_modules/single-file/cli/single-file
 √  READABILITY_BINARY    v0.1.0          valid     /node/node_modules/readability-extractor/readability-extractor
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js
 √  GIT_BINARY            v2.20.1         valid     /usr/bin/git
 √  YOUTUBEDL_BINARY      v2021.02.04.1   valid     /usr/local/bin/youtube-dl
 √  CHROME_BINARY         v88.0.4324.146  valid     /usr/bin/chromium
 √  RIPGREP_BINARY        v0.10.0         valid     /usr/bin/rg

[i] Source-code locations:
 √  PACKAGE_DIR           22 files        valid     /app/archivebox
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled
 -  COOKIES_FILE          -               disabled

>docker -version
Docker version 20.10.2, build 2291f61

>docker-compose --version
docker-compose version 1.27.4, build 40524192

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (9 by maintainers)

Most upvoted comments

Well that’s that then I suppose. Is there a way to set a “timeout” per link on the docker-compose run archivebox update --index-only command? So it will skip links it spends more than say, a minute trying to index?
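One generic way to get a per-link timeout is to run each link's indexing step as its own subprocess and skip any that run too long. The sketch below does not rely on any ArchiveBox option. The helper `run_with_timeout` is hypothetical, and the command that indexes a single link is left as a caller-supplied parameter because this thread does not establish one.

```python
import subprocess


def run_with_timeout(commands, timeout_s=60):
    """Run each command in turn; return the keys of those that timed out.

    commands: dict mapping a label (e.g. a URL) to an argv list.
    subprocess.run kills the child process when the timeout expires,
    so a stuck link does not block the remaining ones.
    """
    skipped = []
    for key, cmd in commands.items():
        try:
            subprocess.run(cmd, timeout=timeout_s, check=False)
        except subprocess.TimeoutExpired:
            skipped.append(key)
    return skipped
```

The returned list can then be retried later with a longer timeout or inspected by hand.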