apm-agent-ruby: Couldn't establish connection to APM Server: "#"

Describe the bug

The APM agent log says:

Couldn't establish connection to APM Server: followed by one of two messages: #<NoMethodError: undefined method 'flush' for nil:NilClass> and, sometimes, #<Errno::ESPIPE: Invalid seek>
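For reference, Errno::ESPIPE is what Ruby raises when something tries to seek on a stream that isn't seekable, such as a pipe. Here is a minimal sketch of that, independent of the agent. It is only an assumption that something similar happens to the request body inside the agent; I have not confirmed it.

```ruby
# Minimal sketch, unrelated to the agent's internals: seeking on a pipe
# raises Errno::ESPIPE because a pipe has no file position.
reader, writer = IO.pipe
writer.write("event\n")
writer.close

espipe_raised =
  begin
    reader.seek(0) # pipes cannot be rewound or repositioned
    false
  rescue Errno::ESPIPE
    true
  end

reader.close
puts "ESPIPE raised: #{espipe_raised}"
```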

With log_level 0 I also see "Closing request with reason timeout" and "Closing writer with reason timeout".

I have had this problem on and off for some time. It works on our staging environment. It used to work in production as well but stopped a few days ago, just like that. In Kibana I can see the graph drop to zero and then silence.

Edit: I just noticed our staging machine’s APM data also took a nosedive yesterday. I’ll spare you the screenshots, unless you want them.

Environment

OS: Alpine on Docker on Debian 9
Ruby version: 2.5.8
Rails: 4.2.11
APM Server version: 7.6.1
Agent version: 3.6.0 (tried with several others too)
Http Gem version: 4.4.1

Additional context

The resources on the server look just fine.
I am able to reach the apm_server by curl from anywhere I try. I even tried sending a GET with HTTParty from a controller to apm_server.
A strange thing is that if I change the service_name in elastic_apm.yml and start the app (in Docker), the new service name shows up in Kibana. So something comes through.
I see three spikes in yesterday's Requests per minute graph where a single transaction seems to have made it through. This could have happened after app restarts.
Currently a few other Rails (5) apps and a Sinatra app report to the apm_server with no issues.
I got the full queue message and remedied that by increasing the pool size. This did not fix the failing connection issue, however. Now I’m seeing the Queue is full (256 items) message again though. Maybe that’s because it doesn’t connect to the server and empty the queue. I don’t know.
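To illustrate why I suspect the two are related, here is a minimal sketch of a bounded queue that nobody drains. It uses Ruby's SizedQueue as a stand-in and is not the agent's actual queue implementation; the limit of 256 is taken from the log message and the event count of 300 is made up.

```ruby
# Hypothetical illustration, not the agent's code: a bounded queue with
# no consumer fills up and starts rejecting new items.
queue_limit = 256 # from the "Queue is full (256 items)" message
queue = SizedQueue.new(queue_limit)
dropped = 0

# No consumer thread ever pops, mimicking a worker that cannot reach the server.
300.times do |i|
  begin
    queue.push({ id: i }, true) # non_block = true raises ThreadError when full
  rescue ThreadError
    dropped += 1
  end
end

puts "queued=#{queue.size} dropped=#{dropped}"
```

If that is what is going on, increasing the pool size would only delay the message, not remove its cause.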

I think it’s the flush call in lib/elastic_apm/transport/connection/http.rb in the method request(method, url, body: nil, headers: nil) that throws the NoMethodError. On my local machine I added a try for the flush call. That stopped the error, but of course didn’t fix the issue.
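A minimal sketch of what that local change amounts to, using plain Ruby's safe navigation operator, which behaves like Rails' try when the receiver is nil. The writer variable and the two lambdas are hypothetical stand-ins, not names from the agent's source.

```ruby
# Hypothetical stand-ins; nothing here is taken from the agent's source.
writer = nil # simulates a writer that was never opened

strict_flush  = -> { writer.flush }  # raises NoMethodError on nil
guarded_flush = -> { writer&.flush } # returns nil instead of raising

strict_error =
  begin
    strict_flush.call
    nil
  rescue NoMethodError => e
    e
  end

guarded_result = guarded_flush.call

puts strict_error.class
puts guarded_result.inspect
```

The guarded call only hides the symptom: the writer is still nil, so nothing gets flushed to the server.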

I don’t know if this helps, but I wanted to be thorough:

   39:         def request(method, url, body: nil, headers: nil)
   40:           byebug
=> 41:           @client.send(
   42:             method,
   43:             url,
   44:             body: body,
   45:             headers: (headers ? @headers.merge(headers) : @headers).to_h,
(byebug) method
:post
(byebug) url
"http://myserver:8200/intake/v2/events"
(byebug) body
#<IO:fd 10>
(byebug) headers
{:"User-Agent"=>"elastic-apm-ruby/3.6.0 http.rb/4.4.1 ruby/2.5.1", "Content-Type"=>"application/x-ndjson", "Transfer-Encoding"=>"chunked", "Content-Encoding"=>"gzip"}

Agent config options


service_name: 'MyApp'

server_url: "http://myserver:8200"

breakdown_metrics: true

capture_body: 'all'

instrumented_rake_tasks: ['mytask:task1']

log_level: 1
log_path: 'log/elastic_apm.log'
logger: <%= Logger.new('log/elastic_apm.log') %>

pool_size: 4

About this issue

State: closed
Created 4 years ago
Comments: 23 (14 by maintainers)

Most upvoted comments

@gowthamgts That’s not an error. That’s the expected behaviour. However, you are not the first person to read it as an error, so we’ll change the wording of the message in a future version. Sorry for the confusion.

mikker on Nov 24, 2020

Hi @Ingstrup! Sounds to me like it could be an issue with Rails 4.2? I’ll investigate a bit and see if I can dig something up.

What server lib are you using? Puma? Passenger?

mikker on May 20, 2020