swarmkit: 17.06 won't deploy stack anymore. no suitable node (unsupported platform on 3 nodes) on armhf docker cluster

Hello,

I have an armhf cluster, and since installing 17.06 I’m getting this error: my stacks don’t come up, and the tasks stay pending forever. There are no placement constraints. This worked perfectly on 17.03.

Inspecting the task, I’m getting:

        "Status": {
            "Timestamp": "2017-07-02T12:51:00.556959128Z",
            "State": "pending",
            "Message": "no suitable node (unsupported platform on 3 nodes)",
            "ContainerStatus": {},
            "PortStatus": {}
        },

docker info:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 9
Server Version: 17.06.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local nfs
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: chzqrk30d8aph7ikg60owdbjz
 Is Manager: true
 ClusterID: 1ucdfzovu4whdawkzv8wbfeb6
 Managers: 3
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Root Rotation In Progress: false
 Node Address: 192.168.178.6
 Manager Addresses:
  192.168.178.6:2377
  192.168.178.7:2377
  192.168.178.8:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.34-45
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: armv7l
CPUs: 8
Total Memory: 1.949GiB
Name: odroid01.casaams.wsartori.com
ID: DRWB:PHMO:GCXF:GDTK:QXDF:V2WM:6ZAM:MC5G:GYID:7YOH:PBQC:BOYE
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 registry:5000
 127.0.0.0/8
Live Restore Enabled: false

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 51 (27 by maintainers)

Most upvoted comments

@trunet apologies, yes for stacks you can use --resolve-image=never for now.

@trunet I’m not sure about that, but we’ll see what the best way to resolve this is. Until then, you can work around the issue by using the --no-resolve-image flag on service create/update, which avoids adding platform information to your service spec.

@tianon The image config is covered by the spec, unfortunately. For the cases where cross builds are happening, the image will have to be modified post-build. In the case where we just want to route a workload to a host that can run that image, we likely don’t have enough information, as the image will have to be wrapped in a manifest list (index) to provide the actual platform information in the ARM case.

I see the issue.

It seems like the node reports

OSType: linux
Architecture: armv7l

but the image reports

OSType: linux
Architecture: arm

and that causes scheduling to fail. We will need to add more normalization for architecture names. I’ll do that but it would be useful to have a full list of architecture name variations.

cc @thaJeztah @aaronlehmann currently we only do this for x86_64 and amd64: https://github.com/docker/swarmkit/blob/master/manager/scheduler/filter.go#L288
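To make the mismatch described above concrete, here is a minimal sketch of the kind of normalization being discussed. It is not the swarmkit implementation: `normalizeArch` and `platformMatches` are hypothetical names, and the alias list is an assumption covering only the variants mentioned in this thread (x86_64, armv6l, armv7l) plus aarch64.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeArch maps kernel-style architecture names (as a node might
// report them) to the short names that image configs carry.
// Hypothetical helper; the alias list is an assumption and is not exhaustive.
func normalizeArch(arch string) string {
	switch a := strings.ToLower(strings.TrimSpace(arch)); a {
	case "x86_64":
		return "amd64"
	case "aarch64":
		return "arm64"
	case "armv6l", "armv7l":
		// 32-bit ARM variants collapse to plain "arm".
		return "arm"
	default:
		// Unknown names pass through unchanged (lowercased, trimmed).
		return a
	}
}

// platformMatches reports whether a node can run an image, comparing
// OS and architecture after normalization. Hypothetical helper.
func platformMatches(nodeOS, nodeArch, imageOS, imageArch string) bool {
	return strings.EqualFold(nodeOS, imageOS) &&
		normalizeArch(nodeArch) == normalizeArch(imageArch)
}

func main() {
	// The case from this issue: the node reports armv7l, the image reports arm.
	fmt.Println(platformMatches("linux", "armv7l", "linux", "arm"))
	// A genuinely incompatible pair still fails to match.
	fmt.Println(platformMatches("linux", "armv7l", "linux", "amd64"))
}
```

Note that collapsing armv6l and armv7l into a single "arm" value loses the variant, so an armv7-only image could still be matched to an armv6 node. Distinguishing those would need variant information, which the image in this case does not provide.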

Looks like @luxas has a multi-arch image, cf.:

ed$ docker run --rm mplatform/mquery luxas/prometheus:v2.0.0-rc.0
Manifest List: Yes
Supported platforms:
 - amd64/linux
 - arm/linux (variant: undefined)
 - arm64/linux (variant: undefined)

also

ed$ docker run --rm mplatform/mquery luxas/prometheus:v1.7.1
Manifest List: Yes
Supported platforms:
 - amd64/linux
 - arm/linux (variant: undefined)
 - arm64/linux (variant: undefined)

I ran into the same issue today. I run a manager on amd64 and have a node that is armv6l (a Raspberry Pi Model B+). I can run the image on the node itself, but when I try to run it as a swarm service, the task reports: "Message": "no suitable node (unsupported platform on 4 nodes)"

Btw. I also have two amd64 nodes.

It seems the swarm scheduler can’t match the arm image to the armv6l node.

@nishanttotla Is there a way to build the image with the armv7l architecture instead of just arm? I didn’t find any option on commit, push, or build to set this. How is Docker Hub handling that? I suppose all images under https://hub.docker.com/r/arm32v7/ should use armv7l?