prometheus: 2.6.0: opening storage failed: mkdir data/: read-only file system
Bug Report
What did you do?
Upgrading the Docker image from 2.5.0 to 2.6.0 introduces a fatal error, “opening storage failed: mkdir data/: read-only file system”. Deleting everything and reverting to 2.5.0 with the exact same configuration does not have this problem.
What did you expect to see?
I don’t see that any changes are required for 2.6.0.
Environment
This is running in a vanilla minikube running k8s 1.11.6.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: monitor
  name: prometheus
spec:
  replicas: 1
  revisionHistoryLimit: 2
  strategy:
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
  template:
    metadata:
      name: prometheus
      labels:
        service: prometheus
        apiVersion: v2
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: quay.io/prometheus/prometheus:v2.5.0
          imagePullPolicy: Always
          ports:
            - name: web
              containerPort: 9090
          livenessProbe:
            tcpSocket:
              port: 9090
            initialDelaySeconds: 30
            timeoutSeconds: 5
          readinessProbe:
            tcpSocket:
              port: 9090
          resources:
            requests:
              cpu: 10m
              memory: 32Mi
            limits:
              memory: 64Mi
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: data
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus
        - name: data
          persistentVolumeClaim:
            claimName: prometheus
- System information: quay.io/prometheus/prometheus:v2.6.0
- Prometheus version: v2.6.0
- Logs:
level=info ts=2018-12-26T14:41:12.22879461Z caller=main.go:243 msg="Starting Prometheus" version="(version=2.6.0, branch=HEAD, revision=dbd1d58c894775c0788470944b818cc724f550fb)"
level=info ts=2018-12-26T14:41:12.228845931Z caller=main.go:244 build_context="(go=go1.11.3, user=root@bf5760470f13, date=20181217-15:14:46)"
level=info ts=2018-12-26T14:41:12.228864658Z caller=main.go:245 host_details="(Linux 4.15.0 #1 SMP Fri Dec 21 23:51:58 UTC 2018 x86_64 prometheus-85c56c84d8-chf77 (none))"
level=info ts=2018-12-26T14:41:12.228880573Z caller=main.go:246 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-12-26T14:41:12.228894018Z caller=main.go:247 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2018-12-26T14:41:12.229899898Z caller=main.go:561 msg="Starting TSDB ..."
level=info ts=2018-12-26T14:41:12.229984509Z caller=web.go:429 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-12-26T14:41:12.229996811Z caller=main.go:430 msg="Stopping scrape discovery manager..."
level=info ts=2018-12-26T14:41:12.230007845Z caller=main.go:444 msg="Stopping notify discovery manager..."
level=info ts=2018-12-26T14:41:12.23001292Z caller=main.go:466 msg="Stopping scrape manager..."
level=info ts=2018-12-26T14:41:12.230018683Z caller=main.go:440 msg="Notify discovery manager stopped"
level=info ts=2018-12-26T14:41:12.23008139Z caller=main.go:426 msg="Scrape discovery manager stopped"
level=info ts=2018-12-26T14:41:12.23009587Z caller=main.go:460 msg="Scrape manager stopped"
level=info ts=2018-12-26T14:41:12.230107221Z caller=manager.go:664 component="rule manager" msg="Stopping rule manager..."
level=info ts=2018-12-26T14:41:12.230125069Z caller=manager.go:670 component="rule manager" msg="Rule manager stopped"
level=info ts=2018-12-26T14:41:12.230134183Z caller=notifier.go:521 component=notifier msg="Stopping notification manager..."
level=info ts=2018-12-26T14:41:12.230144125Z caller=main.go:615 msg="Notifier manager stopped"
level=error ts=2018-12-26T14:41:12.230506795Z caller=main.go:624 err="opening storage failed: mkdir data/: read-only file system"
About this issue
- State: closed
- Created 6 years ago
- Reactions: 2
- Comments: 24 (13 by maintainers)
Commits related to this issue
- Do not pull latest Prometheus image As it has breaking changes (it expects the mounted location to be rw and store data in it) due to https://github.com/prometheus/prometheus/pull/4976 (https://githu... — committed to rhcs-dashboard/ceph-dev by epuertat 5 years ago
- Rollback Dockerfile to version @ 2.5.x Fixes https://github.com/prometheus/prometheus/issues/5043 Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> — committed to gouthamve/prometheus by gouthamve 5 years ago
- Rollback Dockerfile to version @ 2.5.x (#5122) Fixes https://github.com/prometheus/prometheus/issues/5043 Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> — committed to prometheus/prometheus by gouthamve 5 years ago
- revert Dockerfile to older settings See https://github.com/prometheus/prometheus/issues/5043 Also, this brings Dockerfile more in sync with Dockerfile.rhel — committed to pgier/prometheus by pgier 5 years ago
@Dravere The Dockerfile wasn’t explicitly mentioned in the stability guarantees in https://prometheus.io/blog/2016/07/18/prometheus-1-0-released/#what-does-1-0-mean-for-you, so we don’t have an official rule for that. Most likely we would not have made a breaking change intentionally though.
While rolling back would be another breaking change, you could also see a rollback as a bugfix of a previous unintentional breakage that should never have happened, especially if most users have yet to hit it.
Please take this as constructive criticism and not some rant on the team owing me or anyone else anything! You’re doing great work!
I’m at a loss to see how the symlink introduced in #4976 improves anything. I was trying to set up and test Prometheus and spent the better part of two days trying to figure out why I was getting this permission issue, as Google does not find this issue. And even once I did find it, without looking at the changes in the commit it’s still nearly impossible to understand what’s going on and why anything mentioned in this thread fixes things. “opening storage failed: mkdir data/: permission denied” is a relative path error and distinctly unhelpful in troubleshooting the issue.
Symlinking your data directory into /etc makes no sense to me and is against several decades of Unix file system convention. Configs go in /etc; data should be in /opt or /var, or in Docker’s case /prometheus, /data or /prometheus/data would be acceptable.
On top of all this, you’ve outdated and broken every Prometheus Docker tutorial out on the web with this change. I would urge you to reconsider and possibly revert the symlink, or at least add a section to the main README that addresses this change, the permission denied error it causes, the changes needed to fix it, and a well thought out explanation of why you needed to do this. I can only imagine that it’s going to affect your adoption/use rates for the next year, at least, as folks get frustrated following online Docker tutorials that no longer work.
In closing please understand that I’m not mad and understand changes happen and are often needed. I don’t have the time to dig into this change and understand why the symlink was needed, so I may in fact be the one that’s in the wrong!
@SuperQ that makes it a bit more clear, but I’m still not sure you’re going about it in the right way. I would humbly recommend you consider rolling back the change and spinning up a 2.0 branch that fixes the WORKDIR, which seems to be the real problem, instead of trying to mess with the existing version’s image file layout. The way it is now, you are breaking existing deployments when they upgrade and their current volume setups cause these issues. And you are invalidating any tutorials that were written by the community to this point.
I understand that the rollback has its own risks to folks who have already dealt with this. But that’s what README and CHANGELOG files are for. Make it clear what you are doing, why you are doing it and continue to update the docs/wiki with info on how to deal with the common issues that may come up from the revert.
https://semver.org/
We also had this issue. It works when changing the data path:
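The snippet that originally followed this comment is not preserved here. As a rough sketch of what changing the data path can look like (not the commenter's actual configuration), the container from the Deployment in this report could pass an absolute storage path through Prometheus's `--storage.tsdb.path` flag, so that the relative `data/` default is never resolved under the read-only config mount. Based on the discussion above, that relative path appears to be what fails in 2.6.0. The file name `prometheus.yml` is an assumption about the key in the `prometheus` ConfigMap.

```yaml
# Sketch only: fragment of spec.template.spec from the Deployment above.
containers:
  - name: prometheus
    image: quay.io/prometheus/prometheus:v2.6.0
    args:
      # Assumes the ConfigMap mounted at /etc/prometheus has a prometheus.yml key.
      - --config.file=/etc/prometheus/prometheus.yml
      # Absolute path matching the writable "data" volume mounted at /prometheus.
      - --storage.tsdb.path=/prometheus
    volumeMounts:
      - name: config
        mountPath: /etc/prometheus
      - name: data
        mountPath: /prometheus
```

Setting `args` replaces any default arguments baked into the image, so every flag the deployment relies on needs to be listed explicitly.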