nomad-autoscaler: Failed to get policy (EOF)
Hi folks,
Seeing some more errors and Nomad jobs that should definitely be triggering scaling are not happening.
https://gist.github.com/evandam/4956199b01cd4303873b69cf7d3779c1
My first thought is that the Nomad agent is having issues, but it seems okay and I’m able to hit the same endpoints in the logs directly on the host 🤷♂️
As a side note, is there a decent way to debug things? There’s policy_id in logs but I’m not sure how that maps back to the actual policy or Nomad job in question. It seems like it would be very helpful to be able to trace these back.
We’re running Nomad 1.1.2 for what it’s worth.
Any thoughts?
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 10
- Comments: 16 (6 by maintainers)
We just found 6 different instances of nomad-autoscaler v0.3.6 in 6 different Nomad clusters all in this same “failed to read policy in time” state. Like @hntrmrrs mentioned, all 6 instances seem to start with receiving a 500 from the Nomad API:
I’m trying to confirm with some teammates, but I think this might have happened when we did an upgrade on the Nomad Servers.
Here you go!
2022-07-19T18:44:52.664Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=46c49638-8dfd-b3e2-9340-9a637ed8d8da 2022-07-19T18:44:52.664Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=ffbaf7d4-41c7-5703-fcee-a641f253a19f 2022-07-19T18:47:52.665Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=46c49638-8dfd-b3e2-9340-9a637ed8d8da 2022-07-19T18:47:52.665Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=ffbaf7d4-41c7-5703-fcee-a641f253a19f 2022-07-19T18:50:52.665Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=ffbaf7d4-41c7-5703-fcee-a641f253a19f 2022-07-19T18:50:52.665Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=46c49638-8dfd-b3e2-9340-9a637ed8d8da 2022-07-19T18:53:52.664Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=46c49638-8dfd-b3e2-9340-9a637ed8d8da 2022-07-19T18:53:52.664Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=ffbaf7d4-41c7-5703-fcee-a641f253a19f 2022-07-19T18:56:52.664Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=46c49638-8dfd-b3e2-9340-9a637ed8d8da 2022-07-19T18:56:52.664Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=ffbaf7d4-41c7-5703-fcee-a641f253a19f 2022-07-19T18:59:52.665Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=ffbaf7d4-41c7-5703-fcee-a641f253a19f 2022-07-19T18:59:52.665Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=46c49638-8dfd-b3e2-9340-9a637ed8d8da 2022-07-19T19:02:52.666Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=ffbaf7d4-41c7-5703-fcee-a641f253a19f 2022-07-19T19:02:52.666Z [ERROR] policy_manager.policy_handler: timeout: failed to read policy in time: policy_id=46c49638-8dfd-b3e2-9340-9a637ed8d8da SIGABRT: abort PC=0x46aac1 m=0 sigcode=0 goroutine 0 [idle]: runtime.futex() /usr/local/go/src/runtime/sys_linux_amd64.s:519 +0x21 runtime.futexsleep(0xffffffffffff8000, 0x2561be0, 0xc00004c800) /usr/local/go/src/runtime/os_linux.go:44 +0x36 runtime.notesleep(0x2561d30) /usr/local/go/src/runtime/lock_futex.go:160 +0x87 runtime.mPark() /usr/local/go/src/runtime/proc.go:1441 +0x2a runtime.stoplockedm() /usr/local/go/src/runtime/proc.go:2602 +0x65 runtime.schedule() /usr/local/go/src/runtime/proc.go:3299 +0x3d runtime.park_m(0xc000520680) /usr/local/go/src/runtime/proc.go:3516 +0x14d runtime.mcall() /usr/local/go/src/runtime/asm_amd64.s:307 +0x43 goroutine 1 [chan receive, 1262 minutes]: github.com/hashicorp/nomad-autoscaler/agent.(*Agent).handleSignals(0xc0003cdb20) /home/circleci/project/project/agent/agent.go:270 +0xe5 github.com/hashicorp/nomad-autoscaler/agent.(*Agent).Run(0xc0003cdb20) /home/circleci/project/project/agent/agent.go:100 +0x446 github.com/hashicorp/nomad-autoscaler/command.(*AgentCommand).Run(0xc000219a40, {0xc00003a0a0, 0x203000, 0x0}) /home/circleci/project/project/command/agent.go:333 +0x8e5 github.com/mitchellh/cli.(*CLI).Run(0xc0003a5400) /go/pkg/mod/github.com/mitchellh/cli@v1.1.2/cli.go:262 +0x5f8 main.main() /home/circleci/project/project/main.go:26 +0x24d goroutine 7 [select]: go.opencensus.io/stats/view.(*worker).start(0xc0000f4980) /go/pkg/mod/go.opencensus.io@v0.22.5/stats/view/worker.go:276 +0xb9 created by go.opencensus.io/stats/view.init.0 /go/pkg/mod/go.opencensus.io@v0.22.5/stats/view/worker.go:34 +0x92 goroutine 10 [IO wait]: internal/poll.runtime_pollWait(0x7f21944c2e58, 0x72) /usr/local/go/src/runtime/netpoll.go:234 +0x89 internal/poll.(*pollDesc).wait(0xc0000f5800, 0x416f46, 0x0) /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32 internal/poll.(*pollDesc).waitRead(...) /usr/local/go/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Accept(0xc0000f5800) /usr/local/go/src/internal/poll/fd_unix.go:402 +0x22c net.(*netFD).accept(0xc0000f5800) /usr/local/go/src/net/fd_unix.go:173 +0x35 net.(*TCPListener).accept(0xc00039d170) /usr/local/go/src/net/tcpsock_posix.go:140 +0x28 net.(*TCPListener).Accept(0xc00039d170) /usr/local/go/src/net/tcpsock.go:262 +0x3d net/http.(*Server).Serve(0xc0001e1340, {0x1a5bd18, 0xc00039d170}) /usr/local/go/src/net/http/server.go:3002 +0x394 github.com/hashicorp/nomad-autoscaler/agent/http.(*Server).Start(0xc0002df360) /home/circleci/project/project/agent/http/server.go:124 +0xd2 created by github.com/hashicorp/nomad-autoscaler/command.(*AgentCommand).Run /home/circleci/project/project/command/agent.go:330 +0x896 goroutine 34 [chan receive]: github.com/hashicorp/nomad-autoscaler/plugins/builtin/target/nomad/plugin.(*TargetPlugin).garbageCollectionLoop(0xc000469740) /home/circleci/project/project/plugins/builtin/target/nomad/plugin/plugin.go:213 +0xc9 created by github.com/hashicorp/nomad-autoscaler/plugins/builtin/target/nomad/plugin.(*TargetPlugin).SetConfig /home/circleci/project/project/plugins/builtin/target/nomad/plugin/plugin.go:94 +0xef goroutine 20 [syscall, 1263 minutes]: os/signal.signal_recv() /usr/local/go/src/runtime/sigqueue.go:169 +0x98 os/signal.loop() /usr/local/go/src/os/signal/signal_unix.go:24 +0x19 created by os/signal.Notify.func1.1 /usr/local/go/src/os/signal/signal.go:151 +0x2c goroutine 21 [select, 100962 minutes]: github.com/armon/go-metrics.(*InmemSignal).run(0xc00053e240) /go/pkg/mod/github.com/armon/go-metrics@v0.3.7/inmem_signal.go:64 +0x6c created by github.com/armon/go-metrics.NewInmemSignal /go/pkg/mod/github.com/armon/go-metrics@v0.3.7/inmem_signal.go:38 +0x174 goroutine 23 [sleep]: time.Sleep(0x3b9aca00) /usr/local/go/src/runtime/time.go:193 +0x12e github.com/armon/go-metrics.(*Metrics).collectStats(0xc0004f2000) /go/pkg/mod/github.com/armon/go-metrics@v0.3.7/metrics.go:230 +0x25 created by github.com/armon/go-metrics.New /go/pkg/mod/github.com/armon/go-metrics@v0.3.7/start.go:84 +0x17b goroutine 25 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a1e0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 26 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a230, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 27 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a280, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 28 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a2d0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 29 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a320, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 30 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a370, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 31 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a3c0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 32 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a410, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 49 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a460, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 50 [select, 100962 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x1843df3, 0xa}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a4b0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:128 +0x25f goroutine 51 [select, 59053 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a500, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 52 [select, 59052 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a550, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 53 [select, 59052 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a5a0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 54 [select, 59051 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a5f0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 55 [select, 59051 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a640, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 56 [select, 59055 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a690, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 57 [select, 59055 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a6e0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 58 [select, 59054 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a730, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 59 [select, 59055 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a780, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 60 [select, 59053 minutes]: github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).waitForWork(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:246 +0x1e5 github.com/hashicorp/nomad-autoscaler/policyeval.(*Broker).Dequeue(0xc000366d20, {0x1a6fb30, 0xc000400c00}, {0x183fb5a, 0x7}) /home/circleci/project/project/policyeval/broker.go:176 +0x19a github.com/hashicorp/nomad-autoscaler/policyeval.(*BaseWorker).Run(0xc00036a7d0, {0x1a6fb30, 0xc000400c00}) /home/circleci/project/project/policyeval/base_worker.go:60 +0xb6 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).initWorkers /home/circleci/project/project/agent/agent.go:134 +0x35b goroutine 61 [select, 59051 minutes]: github.com/hashicorp/nomad-autoscaler/agent.(*Agent).runEvalHandler(0xc0003cdb20, {0x1a6fb30, 0xc000400c00}, 0xc000366cc0) /home/circleci/project/project/agent/agent.go:106 +0xb2 created by github.com/hashicorp/nomad-autoscaler/agent.(*Agent).Run /home/circleci/project/project/agent/agent.go:97 +0x439 goroutine 5811358 [select, 5 minutes]: net/http.(*persistConn).roundTrip(0xc00053a000, 0xc0006f22c0) /usr/local/go/src/net/http/transport.go:2614 +0x97d net/http.(*Transport).roundTrip(0xc0001e48c0, 0xc000191100) /usr/local/go/src/net/http/transport.go:594 +0x7d1 net/http.(*Transport).RoundTrip(0x30, 0x1a3c1e0) /usr/local/go/src/net/http/roundtrip.go:18 +0x19 net/http.send(0xc000191100, {0x1a3c1e0, 0xc0001e48c0}, {0x17ff560, 0x40ce01, 0x0}) /usr/local/go/src/net/http/client.go:252 +0x5d8 net/http.(*Client).send(0xc000465e00, 0xc000191100, {0x203000, 0x203000, 0x0}) /usr/local/go/src/net/http/client.go:176 +0x9b net/http.(*Client).do(0xc000465e00, 0xc000191100) /usr/local/go/src/net/http/client.go:725 +0x908 net/http.(*Client).Do(...) /usr/local/go/src/net/http/client.go:593 github.com/hashicorp/nomad/api.(*Client).doRequest(0xc0003b21c0, 0xc0000e06c0) /go/pkg/mod/github.com/hashicorp/nomad/api@v0.0.0-20210503143957-4ccada7924cf/api.go:722 +0x65 github.com/hashicorp/nomad/api.(*Client).query(0xc0004c37b0, {0x184fced, 0xc0006920c0}, {0x1478d40, 0xc00043adc8}, 0x464a00) /go/pkg/mod/github.com/hashicorp/nomad/api@v0.0.0-20210503143957-4ccada7924cf/api.go:841 +0xb4 github.com/hashicorp/nomad/api.(*Scaling).ListPolicies(0xc0004c37b0, 0xc00043e760) /go/pkg/mod/github.com/hashicorp/nomad/api@v0.0.0-20210503143957-4ccada7924cf/scaling.go:20 +0x65 github.com/hashicorp/nomad-autoscaler/policy/nomad.(*Source).MonitorIDs.func1() /home/circleci/project/project/policy/nomad/source.go:109 +0xca created by github.com/hashicorp/nomad-autoscaler/policy/nomad.(*Source).MonitorIDs /home/circleci/project/project/policy/nomad/source.go:102 +0x1e6 goroutine 5206364 [select]: github.com/hashicorp/nomad-autoscaler/policy.(*Handler).Run(0xc00021de00, {0x1a6fb30, 0xc000400c00}, 0xc0002a8fd0) /home/circleci/project/project/policy/handler.go:118 +0x411 github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run.func1({0xc0000406f0, 0x24}) /home/circleci/project/project/policy/manager.go:116 +0x46 created by github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run /home/circleci/project/project/policy/manager.go:115 +0xed3 goroutine 5206401 [select, 1263 minutes]: github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run(0xc00036a140, {0x1a6fb30, 0xc000400c00}, 0xc000366cc0) /home/circleci/project/project/policy/manager.go:72 +0x475 created by github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run /home/circleci/project/project/policy/manager.go:154 +0x847 goroutine 5206403 [select, 1263 minutes]: github.com/hashicorp/nomad-autoscaler/policy/file.(*Source).MonitorIDs(0xc0004b6100, {0x1a6fb30, 0xc000400340}, {0xc00035e360, 0xc00035e1e0}) /home/circleci/project/project/policy/file/source.go:84 +0x131 created by github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run /home/circleci/project/project/policy/manager.go:67 +0x1f3 goroutine 5206404 [select, 5 minutes]: github.com/hashicorp/nomad-autoscaler/policy/nomad.(*Source).MonitorIDs(0xc00037a240, {0x1a6fb30, 0xc000400340}, {0xc00035e360, 0xc00035e1e0}) /home/circleci/project/project/policy/nomad/source.go:113 +0x26d created by github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run /home/circleci/project/project/policy/manager.go:67 +0x1f3 goroutine 5206410 [select, 1263 minutes]: github.com/hashicorp/nomad-autoscaler/policy/file.(*Source).MonitorPolicy(0xc0004b6100, {0x1a6fb30, 0xc0004005c0}, {{0xc0000406f0, 0x24}, 0xc00007e720, 0xc00007e840, 0xc00007e6c0}) /home/circleci/project/project/policy/file/source.go:149 +0x7bc created by github.com/hashicorp/nomad-autoscaler/policy.(*Handler).Run /home/circleci/project/project/policy/handler.go:115 +0x2e5 goroutine 5206402 [select]: github.com/hashicorp/nomad-autoscaler/policy.(*Manager).periodicMetricsReporter(0xc00036a140, {0x1a6fb30, 0xc000400340}, 0xc000366cc0) /home/circleci/project/project/policy/manager.go:224 +0xdf created by github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run /home/circleci/project/project/policy/manager.go:61 +0x196 goroutine 5811386 [IO wait, 5 minutes]: internal/poll.runtime_pollWait(0x7f21944c2c88, 0x72) /usr/local/go/src/runtime/netpoll.go:234 +0x89 internal/poll.(*pollDesc).wait(0xc000466f00, 0xc000131800, 0x0) /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32 internal/poll.(*pollDesc).waitRead(...) /usr/local/go/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0xc000466f00, {0xc000131800, 0xb09, 0xb09}) /usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc000466f00, {0xc000131800, 0xc00008d758, 0xc00008d758}) /usr/local/go/src/net/fd_posix.go:56 +0x29 net.(*conn).Read(0xc0003720e0, {0xc000131800, 0xc000315800, 0xc00008d7f8}) /usr/local/go/src/net/net.go:183 +0x45 crypto/tls.(*atLeastReader).Read(0xc00043ae70, {0xc000131800, 0x0, 0x409b8d}) /usr/local/go/src/crypto/tls/conn.go:777 +0x3d bytes.(*Buffer).ReadFrom(0xc0001c4978, {0x1a39800, 0xc00043ae70}) /usr/local/go/src/bytes/buffer.go:204 +0x98 crypto/tls.(*Conn).readFromUntil(0xc0001c4700, {0x1a3c0c0, 0xc0003720e0}, 0x183be26) /usr/local/go/src/crypto/tls/conn.go:799 +0xe5 crypto/tls.(*Conn).readRecordOrCCS(0xc0001c4700, 0x0) /usr/local/go/src/crypto/tls/conn.go:606 +0x112 crypto/tls.(*Conn).readRecord(...) /usr/local/go/src/crypto/tls/conn.go:574 crypto/tls.(*Conn).Read(0xc0001c4700, {0xc000356000, 0x1000, 0x10100c0004ad800}) /usr/local/go/src/crypto/tls/conn.go:1277 +0x16f net/http.(*persistConn).Read(0xc00053a000, {0xc000356000, 0x40505d, 0x60}) /usr/local/go/src/net/http/transport.go:1926 +0x4e bufio.(*Reader).fill(0xc0002b0a80) /usr/local/go/src/bufio/bufio.go:101 +0x103 bufio.(*Reader).Peek(0xc0002b0a80, 0x1) /usr/local/go/src/bufio/bufio.go:139 +0x5d net/http.(*persistConn).readLoop(0xc00053a000) /usr/local/go/src/net/http/transport.go:2087 +0x1ac created by net/http.(*Transport).dialConn /usr/local/go/src/net/http/transport.go:1747 +0x1e05 goroutine 5811243 [IO wait]: internal/poll.runtime_pollWait(0x7f21944c28e8, 0x72) /usr/local/go/src/runtime/netpoll.go:234 +0x89 internal/poll.(*pollDesc).wait(0xc000466e80, 0xc000122000, 0x0) /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32 internal/poll.(*pollDesc).waitRead(...) /usr/local/go/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0xc000466e80, {0xc000122000, 0x1000, 0x1000}) /usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc000466e80, {0xc000122000, 0x4cee66, 0x2560c80}) /usr/local/go/src/net/fd_posix.go:56 +0x29 net.(*conn).Read(0xc00000ed40, {0xc000122000, 0x0, 0xc000732758}) /usr/local/go/src/net/net.go:183 +0x45 net/http.(*connReader).Read(0xc000732750, {0xc000122000, 0x1000, 0x1000}) /usr/local/go/src/net/http/server.go:780 +0x16d bufio.(*Reader).fill(0xc000692060) /usr/local/go/src/bufio/bufio.go:101 +0x103 bufio.(*Reader).Peek(0xc000692060, 0x4) /usr/local/go/src/bufio/bufio.go:139 +0x5d net/http.(*conn).serve(0xc000422000, {0x1a6fbd8, 0xc00042ede0}) /usr/local/go/src/net/http/server.go:1955 +0xc36 created by net/http.(*Server).Serve /usr/local/go/src/net/http/server.go:3034 +0x4e8 goroutine 5811387 [select, 5 minutes]: net/http.(*persistConn).writeLoop(0xc00053a000) /usr/local/go/src/net/http/transport.go:2386 +0xfb created by net/http.(*Transport).dialConn /usr/local/go/src/net/http/transport.go:1748 +0x1e65 goroutine 5206365 [select]: github.com/hashicorp/nomad-autoscaler/policy.(*Handler).Run(0xc00021df40, {0x1a6fb30, 0xc000400c00}, 0x0) /home/circleci/project/project/policy/handler.go:118 +0x411 github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run.func1({0xc000040720, 0x24}) /home/circleci/project/project/policy/manager.go:116 +0x46 created by github.com/hashicorp/nomad-autoscaler/policy.(*Manager).Run /home/circleci/project/project/policy/manager.go:115 +0xed3 goroutine 5206366 [select, 1262 minutes]: github.com/hashicorp/nomad-autoscaler/policy/file.(*Source).MonitorPolicy(0xc0004b6100, {0x1a6fb30, 0xc0006f35c0}, {{0xc000040720, 0x24}, 0xc00007e900, 0xc00007ea80, 0xc00007e8a0}) /home/circleci/project/project/policy/file/source.go:149 +0x7bc created by github.com/hashicorp/nomad-autoscaler/policy.(*Handler).Run /home/circleci/project/project/policy/handler.go:115 +0x2e5 rax 0xca rbx 0x0 rcx 0xffffffffffffffff rdx 0x0 rdi 0x2561d30 rsi 0x80 rbp 0x7fffd8ad9100 rsp 0x7fffd8ad90b8 r8 0x0 r9 0x0 r10 0x0 r11 0x286 r12 0x43e700 r13 0x0 r14 0x2561400 r15 0x0 rip 0x46aac1 rflags 0x286 cs 0x33 fs 0x0 gs 0x0Maybe it would make sense to have the health check start to fail when there’s any kind failure to read policies?
For us, this problem appears when the nomad server node restarts (for instance, manually restarting it with
systemctl restart nomad). I was able to reproduce it multiple times and the outcome was always the following:A possible quick workaround was inspired by atulsingh0:
Disclaimer: I came out with this fast-and-furious solution to restart the autoscaler job but it’s far from perfect and it works only if the autoscaler logs are in the file
autoscaler.stderr.0. Use it at your own riskHum…it’s the second time this is being reported, so there may be something wrong with our blocking query handling.
Do you happen to have anything sitting between Nomad and the Autoscaler? Like a load balancer or reverse proxy?
Yeah, it’s not ideal but you can query Nomad for the scaling policies, either using the
nomad scaling policy listcommand or the GET/v1/scaling/policiesendpoint, to get the policy ID.