cog: [Cog v0.8.0 error after upgrading from v0.7.2] Error on Cog build: exec: /sbin/ldconfig.real: not found
Impact: I’m unable to build any image using Cog, and therefore cannot deploy any models to Replicate.
On both Lambdalabs and TensorDock:
sudo cog build
cog.yaml:
# Configuration for Cog ⚙️
# Reference: https://github.com/replicate/cog/blob/main/docs/yaml.md
build:
  # set to true if your model requires a GPU
  gpu: true
  cuda: "11.8"
  # python version in the form '3.8' or '3.8.12'
  python_version: "3.10"
  # a list of packages in the format <package-name>==<version>
  python_packages:
    - "torch==2.0.0"
    - "transformers==4.30.1"
    - "sentencepiece==0.1.97"
    - "accelerate==0.20.3"
    # https://github.com/oobabooga/text-generation-webui/blob/main/docs/LLaMA-model.md#option-2-convert-the-weights-yourself
    - "protobuf==3.20.1"
    - "auto-gptq==0.2.2"
# predict.py defines how predictions are run on your model
predict: "predict.py:Predictor"
I receive the following error logs:
=> ERROR [stage-1 3/11] RUN --mount=type=cache,target=/var/cache/apt apt-get update -qq && apt-get install -qqy --no-install-recom 16.4s
------
> [stage-1 3/11] RUN --mount=type=cache,target=/var/cache/apt apt-get update -qq && apt-get install -qqy --no-install-recommends make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev git ca-certificates && rm -rf /var/lib/apt/lists/*:
#0 13.37 debconf: delaying package configuration, since apt-utils is not installed
....
#0 20.67 Setting up tk-dev:amd64 (8.6.11+1build2) ...
#0 20.67 Processing triggers for libc-bin (2.35-0ubuntu3.1) ...
#0 20.67 /usr/sbin/ldconfig: 16: exec: /sbin/ldconfig.real: not found
#0 20.67 /usr/sbin/ldconfig: 16: exec: /sbin/ldconfig.real: not found
#0 20.67 dpkg: error processing package libc-bin (--configure):
#0 20.67 installed libc-bin package post-installation script subprocess returned error exit status 127
#0 20.68 Errors were encountered while processing:
#0 20.68 libc-bin
#0 20.69 E: Sub-process /usr/bin/dpkg returned an error code (1)
------
Dockerfile:13
--------------------
12 | ENV PATH="/root/.pyenv/shims:/root/.pyenv/bin:$PATH"
13 | >>> RUN --mount=type=cache,target=/var/cache/apt apt-get update -qq && apt-get install -qqy --no-install-recommends \
14 | >>> make \
15 | >>> build-essential \
16 | >>> libssl-dev \
17 | >>> zlib1g-dev \
18 | >>> libbz2-dev \
19 | >>> libreadline-dev \
20 | >>> libsqlite3-dev \
21 | >>> wget \
22 | >>> curl \
23 | >>> llvm \
24 | >>> libncurses5-dev \
25 | >>> libncursesw5-dev \
26 | >>> xz-utils \
27 | >>> tk-dev \
28 | >>> libffi-dev \
29 | >>> liblzma-dev \
30 | >>> git \
31 | >>> ca-certificates \
32 | >>> && rm -rf /var/lib/apt/lists/*
33 | RUN curl -s -S -L https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer | bash && \
--------------------
ERROR: failed to solve: process "/bin/sh -c apt-get update -qq && apt-get install -qqy --no-install-recommends \tmake \tbuild-essential \tlibssl-dev \tzlib1g-dev \tlibbz2-dev \tlibreadline-dev \tlibsqlite3-dev \twget \tcurl \tllvm \tlibncurses5-dev \tlibncursesw5-dev \txz-utils \ttk-dev \tlibffi-dev \tliblzma-dev \tgit \tca-certificates \t&& rm -rf /var/lib/apt/lists/*" did not complete successfully: exit code: 100
ⅹ Failed to build Docker image: exit status 1
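When a build log is long, it can help to pull out just the package that dpkg failed to configure. The following is a minimal sketch, not part of Cog: `failed_dpkg_packages` is a hypothetical helper name, and it assumes the log uses the same "dpkg: error processing package ... (--configure):" wording shown above.

```shell
# Hypothetical helper: read a build log on stdin and print the names of
# packages that dpkg failed to configure, one per line, without duplicates.
failed_dpkg_packages() {
  sed -n 's/.*dpkg: error processing package \([^ ]*\) (--configure):.*/\1/p' | sort -u
}

# Example (assumes the build output was saved to build.log):
#   sudo cog build 2>&1 | tee build.log
#   failed_dpkg_packages < build.log
```

For the log in this report, the only package it would list is libc-bin, whose post-installation script is the one that calls ldconfig.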
About this issue
- State: closed
- Created a year ago
- Comments: 22 (11 by maintainers)
Hi @Glavin001. Thanks for your help and patience as we try to debug this issue. I apologize for the inconvenience this caused.
We just released Cog v0.8.2. This release includes #1231, which reverts #1161, which we believe to be the cause of the regression you’re seeing.
Please give that a try when you have a chance and let us know if you’re still having this issue. Thanks! 🙏
Hi everyone, apologies - I pushed this change in hopes of making the image smaller and faster to build.
It seems like you might have an older version of the cuda base image. The current version of 11.8.0-cudnn8-devel-ubuntu22.04 already has libc-bin installed, and also has /sbin/ldconfig.real. My guess is that maybe the rm -rf /var/lib/apt/lists/* was important. Could you post the output of docker images --no-trunc | grep cuda, please?
Here’s the cog debug diff between v0.7.2 and v0.8.0:
v0.8.0 is the issue.
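One way to test the base-image hypothesis is to look for ldconfig.real directly in the image's filesystem. The sketch below is an illustration only: `ldconfig_real_status` is a hypothetical helper, and the two paths it checks are an assumption based on the path in the error message.

```shell
# Hypothetical helper: report whether ldconfig.real exists under a given
# root directory (pass / or nothing to inspect the current filesystem).
ldconfig_real_status() {
  root="${1:-}"
  root="${root%/}"   # strip a trailing slash so "/" becomes the empty prefix
  for path in /sbin/ldconfig.real /usr/sbin/ldconfig.real; do
    if [ -e "${root}${path}" ]; then
      echo "found ${path}"
      return 0
    fi
  done
  echo "missing ldconfig.real"
  return 1
}
```

The same check can be run against the base image without the helper, for example with docker run --rm nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 ls -l /sbin/ldconfig /sbin/ldconfig.real. The nvidia/cuda repository name is an assumption here; the thread only gives the tag.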
Workaround: Downgrading to v0.7.2 fixes the issues! 🎉 ✅
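For anyone applying the workaround, or trying v0.8.2 with the revert, a specific release can be pinned instead of installing the latest one. This is a sketch under assumptions: `cog_release_url` is a hypothetical helper, and the asset naming (cog_<uname -s>_<uname -m>) follows the install command in the Cog README, so check the release page if the download fails.

```shell
# Sketch: build the download URL for a pinned Cog release.
# Assumption: release assets are named cog_<uname -s>_<uname -m>.
cog_release_url() {
  version="$1"   # e.g. v0.7.2 (reported working) or v0.8.2 (contains the revert)
  printf 'https://github.com/replicate/cog/releases/download/%s/cog_%s_%s\n' \
    "$version" "$(uname -s)" "$(uname -m)"
}

# Example (not run here; needs network access and sudo):
#   sudo curl -o /usr/local/bin/cog -L "$(cog_release_url v0.7.2)"
#   sudo chmod +x /usr/local/bin/cog
#   cog --version
```

Keeping the version as a parameter makes it easy to switch between the known-good v0.7.2 and the fixed v0.8.2 when confirming which release resolves the build error.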