rancher: Web service ip not resolving in nginx container
Rancher Versions: 1.3.0
Docker Version: 1.13.0
OS and where are the hosts located? cloud
Setup Details: DigitalOcean: 1 Rancher server + 1 host
Environment Type: Cattle
Steps to Reproduce:
When I deploy using the rancher-compose command, I get the following error from my nginx container:
2017/02/05 02:59:59 [emerg] 9#9: host not found in upstream "web" in /etc/nginx/conf.d/default.conf:13
This is similar to issue: https://github.com/rancher/rancher/issues/2628
The weird thing is that when I deploy the same images to a host created with docker-machine, everything works fine, no problem:
docker-compose -f staging.yml up -d # see staging.yml below
I’m not sure what difference between the Rancher-created host and my docker-machine-created host would cause this issue.
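For what it’s worth, a quick way to compare name resolution on the two hosts is to run something like the following inside each nginx container (the container names are placeholders, and getent ships with the stock Debian-based nginx image):

docker exec -it <nginx container on the docker-machine host> getent hosts web   # resolves here
docker exec -it <nginx container on the Rancher host> getent hosts web          # presumably fails here, matching the nginx error
docker exec -it <nginx container on the Rancher host> cat /etc/resolv.conf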
This is my rancher-compose.yml:
version: '2'
services:
  web:
    scale: 1
    start_on_create: true
  nginx:
    scale: 1
    start_on_create: true
  postgres:
    scale: 1
    start_on_create: true
This is my staging.yml docker compose file:
Note that I’m using envsubst so I can use environment variables in my nginx conf files, as recommended by the official library/nginx documentation on Docker Hub (see the envsubst sketch after the file).
version: "2"
services:
web:
image: myacct/web_staging:0.0.13
stdin_open: true
tty: true
restart: always
expose:
- "8000"
networks:
- backend
volumes:
- django-static:/usr/src/collectstatic
- backup:/backup
env_file: .env
environment:
DEBUG: 'false'
DB_PASS: secretepw
EMAIL_ENABLE_NOTIFICATION: 'true'
entrypoint: /usr/src/app/backend/docker-entrypoint.sh postgres 5432
command: /bin/bash /usr/src/app/backend/start.sh
labels:
io.rancher.container.pull_image: always
io.rancher.scheduler.affinity: myhostname=host01
nginx:
image: myacct/nginx_staging:0.0.18
restart: always
ports:
- "80:80"
volumes_from:
- web
env_file: .env
networks:
- backend
labels:
io.rancher.sidekicks: web
io.rancher.container.pull_image: always
io.rancher.scheduler.affinity: myhostname=host01
command: /bin/sh -c "envsubst < /etc/nginx/conf.d/django_project.template > /etc/nginx/conf.d/default.conf && nginx -g 'daemon off;'"
postgres:
restart: always
image: myacct/postgres_staging:0.0.6
stdin_open: true
tty: true
volumes:
- pgdata:/var/lib/postgresql/data/
networks:
- backend
environment:
POSTGRES_PASSWORD: secretpw
labels:
io.rancher.container.pull_image: always
io.rancher.scheduler.affinity: myhostname=host01
volumes:
django-static:
pgdata:
# used to backup and restore django data
backup:
networks:
backend:
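The django_project.template itself isn’t included in this issue; a minimal, purely hypothetical version of it and the corresponding envsubst step would look roughly like the sketch below (NGINX_SERVER_NAME is a made-up variable). Restricting envsubst to an explicit variable list keeps it from also rewriting nginx’s own $host, $remote_addr, etc.:

# hypothetical /etc/nginx/conf.d/django_project.template (not the real one)
server {
    listen 80;
    server_name ${NGINX_SERVER_NAME};
    location / {
        proxy_pass http://web:8000;
    }
}

# corresponding compose command ($$ escapes the $ so compose passes a literal
# '$NGINX_SERVER_NAME' to envsubst, which then substitutes only that variable)
command: /bin/sh -c "envsubst '$$NGINX_SERVER_NAME' < /etc/nginx/conf.d/django_project.template > /etc/nginx/conf.d/default.conf && nginx -g 'daemon off;'"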
This is my nginx.conf:
user nginx;
worker_processes 4;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
    worker_connections 1024;
}
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    sendfile on;
    #tcp_nopush on;
    keepalive_timeout 65;
    #gzip on;
    include /etc/nginx/conf.d/*.conf;
}
And this is my /etc/nginx/conf.d/default.conf:
server {
    listen 80;
    server_name example.org;
    charset utf-8;
    client_max_body_size 1000M;
    location /static {
        alias /usr/src/collectstatic;
    }
    location / {
        proxy_pass http://web:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
When I look at /etc/resolv.conf in my nginx container I get:
# cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 8.8.8.8
nameserver 8.8.4.4
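For comparison, containers on Rancher’s managed network normally get Rancher’s internal DNS server, 169.254.169.250, as their nameserver rather than Google’s public resolvers. A direct check against that server from inside the container (assuming dig is available, e.g. from the dnsutils package) would be something like:

dig @169.254.169.250 web
dig @169.254.169.250 web.<stackName>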
Is there any more information I can provide?
Thanks. -Paul
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 17 (5 by maintainers)
Just a general hint with Docker and nginx: nginx resolves the upstream hostname only once, at startup; if your backend container gets a new IP you have to restart the nginx container.
You can fix this by setting a DNS resolver: resolver 169.254.169.250 valid=5s ipv6=off;
and putting your DNS name into a variable: set $backendweb web; proxy_pass http://$backendweb:8000;
With this configuration nginx re-resolves the name every 5 seconds and does not fail at startup if your web container is not yet started or has no IP assigned.
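Applied to the default.conf above, the location block would look roughly like this (a sketch only; 169.254.169.250 is Rancher’s internal DNS, and valid=5s controls how long the answer is cached):

location / {
    resolver 169.254.169.250 valid=5s ipv6=off;
    set $backendweb web;                       # resolved per request, not at startup
    proxy_pass http://$backendweb:8000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}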
I can confirm that the problem still exists on Rancher 1.6.2.
@janeczku I know about those. The problem is Rancher not returning A records for some load balancers (<serviceName>). We think we can replicate it by removing the service (two Rancher LBs) and then redeploying it. After this is done, Rancher stops returning A records for <serviceName>; only <serviceName>.<stackName> etc. are successfully queried. Furthermore, restarting Rancher’s DNS container fixes it, but only in a single-host setup; it breaks DNS resolution in our two-host setup, i.e. if we restart DNS on host1, host2 can’t query and vice versa.
There are known issues with Alpine not playing nicely with domain search, because it’s not using the glibc standard C library. https://github.com/gliderlabs/docker-alpine/issues/8#issuecomment-255600445 In short:
<serviceName> does not resolve, but <serviceName>.<stackName> does.
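If that is what is happening here, the usual workaround is to reference the service by its stack-qualified name, which resolves even when the bare service name does not ("mystack" below is a placeholder for the actual stack name):

# plain form
proxy_pass http://web.mystack:8000;

# or combined with the resolver/variable approach above
set $backendweb web.mystack;
proxy_pass http://$backendweb:8000;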