azure-pipelines-agent: Problem on connecting to services

Agent Version and Platform

Version of your agent? 2.144.0/2.144.1/…

Agent name: ‘Hosted Agent’
Agent machine name: ‘fv-az712’
Current agent version: ‘2.164.8’
Current image version: ‘20200211.1’
Agent running as: ‘vsts’
Prepare build directory.
Set build variables.
Download all required tasks.

Azure DevOps Type and Version

dev.azure.com https://dev.azure.com/manpremo

What’s not working?

Please include error messages and screenshots.

A pipeline that used to work stopped working a few days ago. Inside my task I’m using a postgres service, and the service name is no longer resolved at the network level.

I created a simple pipeline that tries to connect to postgres, and I get this error:

psql: could not translate host name "postgres" to address: Name or service not known

resources:
  containers:
  - container: postgres
    image: postgres:latest
  - container: u18
    image: ubuntu:18.04
    options: '-v /usr/bin/sudo:/usr/bin/sudo -v /usr/lib/sudo/libsudo_util.so.0:/usr/lib/sudo/libsudo_util.so.0 -v /usr/lib/sudo/sudoers.so:/usr/lib/sudo/sudoers.so -v /etc/sudoers:/etc/sudoers'

stages:
- stage: xxx
  jobs:
  - job: yyy
    container: u18
    services:
      postgres: postgres
    steps:
      - script: |
          sudo apt update
          sudo apt install -y postgresql-client
          psql --host=postgres --username=postgres --command="SELECT 1;"

Agent and Worker’s Diagnostic Logs

logs_1209.zip

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 5
  • Comments: 24 (7 by maintainers)

Most upvoted comments

This issue still occurs with postgres:13, and I would say it is more a problem of readiness: the docker container is started only right before the first task is executed and is not yet ready. Why does Azure DevOps not make sure that the host is registered in the network and listening on the exposed port before continuing?
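If the cause really is readiness, one way to cope from inside the job is to poll before the first real use of the service. The following is only a sketch under that assumption: `retry` is a hypothetical helper, not something the agent provides, and the attempt count and delay are arbitrary.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS is used up.
# Hypothetical helper for illustration; sleeps DELAY seconds between attempts.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Possible use inside a script step, with the service alias from the report:
#   retry 30 2 getent hosts postgres
#   retry 30 2 pg_isready --host=postgres --username=postgres
```

`getent hosts` only checks that the name resolves, while `pg_isready` (shipped with the PostgreSQL client tools) checks that the server accepts connections. If the name never resolves even after a long wait, readiness is probably not the whole story.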

It could be that the container is crashing immediately.
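A crash at startup would explain why the name cannot be resolved at all, since a stopped container has no address on the job network. One possible cause, offered as an assumption and not as a diagnosis of this report: newer builds of the official postgres image exit at startup when neither `POSTGRES_PASSWORD` nor `POSTGRES_HOST_AUTH_METHOD` is set. A rough way to check from a script step that runs on the agent host, where `container_state` and `report_if_exited` are hypothetical helper names:

```shell
# container_state NAME: print "<status> <exit code>" for a container name or ID.
container_state() {
  docker inspect --format '{{.State.Status}} {{.State.ExitCode}}' "$1"
}

# report_if_exited NAME: if the container is not running, print its state
# and the tail of its logs to stderr, then return non-zero.
report_if_exited() {
  name=$1
  state=$(container_state "$name") || return 2
  case "$state" in
    running*)
      echo "$name is running"
      ;;
    *)
      echo "$name is not running (state: $state)" >&2
      docker logs --tail 50 "$name" >&2
      return 1
      ;;
  esac
}

# Possible use; the actual container name on the agent is an assumption,
# so list candidates first with: docker ps -a
#   report_if_exited postgres
```

If the state shows `exited` with a non-zero code, the log tail usually names the reason.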

Anyhow, this is how I got it working:

trigger: none

resources:
  containers:
    - container: python38
      image: "python:3.8"
      
    - container: pg12
      image: "postgres:12"
      env:
        POSTGRES_USER: <snip>
        POSTGRES_PASSWORD: <snip>
        PGDATA: "/data/postgres"
      ports:
      - 5432

jobs:
- job: "Run_Test_Suite"
  pool:
    vmImage: 'ubuntu-20.04'

  services:
    pg12: pg12

  steps:
    - script: |
        printenv
      displayName: "print environment variables"
      target:
        container: python38
    
    # workaround for pg12 container not being visible when running tests. stop and start again
    - task: Docker@2
      inputs:
        command: 'stop'
        container: 'pg12'
      displayName: "stop pg12 container"
    
    - task: Docker@2
      inputs:
        command: 'start'
        container: 'pg12'
      displayName: "start pg12 container again"

    - script: |
        getent hosts pg12
      displayName: "show ips of docker service containers"
      target:
        container: python38
    
    # this is here just for debugging purposes if the tests fail because of name resolution errors
    - script: |
        docker logs pg12
      displayName: "fetch pg12 logs before test"
      
    - script: |
         <snip>
      displayName: "running tests"
      target:
        container: python38


I’m running docker-compose inside a pipeline with a postgres image and have been hitting a similar issue. Switching from postgres:11.7-alpine to postgres:11.5-alpine appears to have solved it, at least temporarily.