ArchiveBox: Can't start a container using a named volume

Describe the bug

I can’t get ArchiveBox to run with its data in a named volume. If I map a regular host folder instead, it works.

Steps to reproduce

I’m trying to use docker-compose. My file:

version: "3"

services:
  archivebox:
    image: nikisweeting/archivebox
    volumes:
      - archivebox_files:/data
    ports:
      - 8000:8000

volumes:
  archivebox_files:

First I run docker-compose up --no-start to create the container and volume without starting anything.

Then, running docker-compose run archivebox init keeps failing with permission errors. I tried creating the folders manually inside the volume and setting everything in the volume to mode 777, but neither helped.
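One way to narrow down a permission error like this is to compare the user the process runs as with the owner of the mount point. The following is a minimal sketch, not part of ArchiveBox: `describe_dir` is a hypothetical helper, and it assumes it is run inside the container as the same user that ArchiveBox itself runs as.

```shell
# Hypothetical helper (not an ArchiveBox command): print who we are,
# who owns the directory, and whether we can write to it.
describe_dir() {
    dir="$1"
    echo "process uid:gid = $(id -u):$(id -g)"
    # ls -ldn prints numeric owner and group in columns 3 and 4
    echo "$dir owner uid:gid = $(ls -ldn "$dir" | awk '{print $3 ":" $4}')"
    if [ -w "$dir" ]; then
        echo "writable: yes"
    else
        echo "writable: no"
    fi
}

# Example (inside the container): describe_dir /data
```

If the owner differs from the process uid and the directory is reported as not writable, the volume's ownership is the likely culprit rather than the compose file.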

Screenshots or log output



Software versions

 - OS: Ubuntu Server 18 LTS
 - ArchiveBox version: latest Docker image

About this issue

State: closed
Created 4 years ago
Comments: 17 (8 by maintainers)

Most upvoted comments

@dohlin I would also recommend the incremental upgrade, although you don’t have to go through every intermediate version. v0.4.x was specifically designed to handle importing really old archives, so if you go from the old version to v0.4.24, then from there to v0.5.6, it should work in only 2 steps. 2k links is well within what it can handle; it should only start getting sketchy above ~25k links (and v0.6, coming soon, is tested to be stable up to 150k). v0.6 also has many fixes that improve performance overall. It hasn’t totally solved the db locking issue, but things should be much better once it comes out.

Also, as @mAAdhaTTah mentioned, make sure you have the webserver stopped and use only 1 CLI process to do the upgrade/import. There should be no concurrency / locking issues with only 1 process.
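A minimal sketch of the two-step path described above, under stated assumptions: it assumes a pip-based install (with Docker, the equivalent would be switching image tags), that the named versions are installable as `archivebox==<version>`, and that running `archivebox init` inside an existing data folder applies that version's migrations. `upgrade_in_steps`, `PIP`, and `ARCHIVEBOX` are hypothetical names introduced here, not ArchiveBox settings.

```shell
# Hypothetical helper: install each version in turn and run `init`
# in the data folder before moving on to the next version.
# Stop the web server first so only this one process touches the db.
upgrade_in_steps() {
    data_dir="$1"
    shift
    for version in "$@"; do
        echo "Upgrading to archivebox $version"
        # PIP / ARCHIVEBOX are override hooks for this sketch only
        ${PIP:-pip} install "archivebox==$version" || return 1
        (cd "$data_dir" && ${ARCHIVEBOX:-archivebox} init) || return 1
    done
}

# Example: upgrade_in_steps ~/archivebox/data 0.4.24 0.5.6
```

Stopping at the first failure keeps a half-migrated collection from being pushed on to the next version. Backing up the data folder before starting is a sensible precaution.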

pirate on Apr 2, 2021

@dohlin If you import the links one at a time, via the CLI, and keep the web server off during that time, you should only have the CLI process locking/using the db, which should minimize/eliminate the problem.
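The one-at-a-time import could look roughly like the sketch below. It assumes the URLs sit in a plain text file, one per line, and that the web server has already been stopped. `import_links` and the `ARCHIVEBOX` override are hypothetical names for this example; the only ArchiveBox command used is `archivebox add <url>`.

```shell
# Hypothetical helper: add URLs sequentially so that only one
# archivebox process uses the database at any time.
import_links() {
    url_file="$1"
    # Read the file on fd 3 so the archivebox process keeps the normal stdin
    while IFS= read -r url <&3 || [ -n "$url" ]; do
        # Skip blank lines and lines starting with '#'
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # Each add finishes before the next one starts
        ${ARCHIVEBOX:-archivebox} add "$url" || echo "failed: $url" >&2
    done 3< "$url_file"
}

# Example: import_links urls.txt
```

A failed URL is reported on stderr and the loop moves on, so one bad link does not abort the whole import.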

Alternatively, if you still have the results from the early build, I would update incrementally. That is, instead of going straight to the current version, you install each incremental version, upgrade the content, then install the next version. This might be safer than trying to go all the way at once.

mAAdhaTTah on Apr 2, 2021

@tonylaw7

Can you post the output of archivebox version and archivebox status?

@zblesk v0.5.0 has many speed improvements that should make multi-process archiving better, but it’s not finished yet. Give us a week or so for the final testing.

pirate on Dec 9, 2020

It can crash on low memory systems because there are some bottlenecks with big indexes. The upcoming 0.5 release should help with this. You can run archivebox update and it will retry failed links and extractors.

cdvv7788 on Sep 10, 2020