jmx_exporter: Won't run on machine with more than 20 CPU cores

I’m using the JMX Exporter as a Java agent to monitor Apache Spark. This works fine until I try to run it on a server with more than 20 cores. In the Spark logs I see the following line as Spark is starting up:

2017-03-15 21:35:50.606:WARN:ipjsoejs.AbstractConnector:insufficient threads configured for SelectChannelConnector@0.0.0.0:19105

Then, once Spark is running, the log gets spammed with lines like the following:

2017-03-15 21:35:53.858:WARN:ipjsoeji.nio:Dispatched Failed! SCEP@3712cd1e{l(/192.168.92.52:44556)<->r(/192.168.76.1:19105),d=false,open=true,ishut=false,oshut=false,rb=false,wb=false,w=true,i=0}-{AsyncHttpConnection@905c2db,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-14,l=0,c=0},r=0} to io.prometheus.jmx.shaded.org.eclipse.jetty.server.nio.SelectChannelConnector$ConnectorSelectorManager@5f9d5ea6
2017-03-15 21:35:53.859:WARN:ipjsoeji.nio:Dispatched Failed! SCEP@3712cd1e{l(/192.168.92.52:44556)<->r(/192.168.76.1:19105),d=false,open=true,ishut=false,oshut=false,rb=false,wb=false,w=true,i=1r}-{AsyncHttpConnection@905c2db,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-14,l=0,c=0},r=0} to io.prometheus.jmx.shaded.org.eclipse.jetty.server.nio.SelectChannelConnector$ConnectorSelectorManager@5f9d5ea6
2017-03-15 21:35:53.859:WARN:ipjsoeji.nio:Dispatched Failed! SCEP@3712cd1e{l(/192.168.92.52:44556)<->r(/192.168.76.1:19105),d=false,open=true,ishut=false,oshut=false,rb=false,wb=false,w=true,i=1r}-{AsyncHttpConnection@905c2db,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-14,l=0,c=0},r=0} to io.prometheus.jmx.shaded.org.eclipse.jetty.server.nio.SelectChannelConnector$ConnectorSelectorManager@5f9d5ea6

Not only is the exporter not working in this situation, but Spark isn’t running correctly either, and my disk is filling up very quickly!

This is simple to reproduce: use a machine with more than 20 cores, add the JMX Exporter as a Java agent to any Java process, and watch the logs. You can confirm the 20-core threshold by disabling cores until 20 or fewer remain online, by running the following as root for each core X (on Linux):

echo 0 > /sys/devices/system/cpu/cpuX/online
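To save typing that command once per core, here is a rough sketch of a loop that prints the same command for every core above a chosen limit. The function name `offline_cmds` and its two arguments are made up for illustration, and it assumes cores are numbered contiguously from 0. It only prints the commands (a dry run), so nothing changes until you run the output yourself as root.

```shell
#!/bin/sh
# Hypothetical helper (not part of jmx_exporter): print the sysfs commands
# that would take every CPU numbered >= keep offline.
# Assumes CPUs are numbered 0..total-1 with no gaps.
offline_cmds() {
    keep=$1    # number of cores to leave online
    total=$2   # total number of cores, e.g. from "nproc --all"
    i=$keep
    while [ "$i" -lt "$total" ]; do
        echo "echo 0 > /sys/devices/system/cpu/cpu$i/online"
        i=$((i + 1))
    done
}

# Example: on a 24-core machine, print the commands that leave 20 cores online.
offline_cmds 20 24
```

Writing `1` instead of `0` to the same file should bring a core back online once you are done testing.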

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 19 (5 by maintainers)

Most upvoted comments

Hi, we are also experiencing this issue at random moments with javaagent 0.9 on machines with 10 cores.

Am I correct in saying that the switch to HTTPServer in version 0.10 should fix this issue?