influxdb-client-go: Client is leaking TCP connections

I discovered this issue with the Grafana Flux datasource, but traced it back to the client. It seems TCP connections are not closed after calling client.Close().

Output of the program below:

...
Query returned 181 rows
Open TCP connections:       59
Query returned 181 rows
Open TCP connections:       60
Query returned 181 rows
Open TCP connections:       61
Query returned 181 rows
Open TCP connections:       62
...

Please run this Go program to see for yourself (on Linux, change ".8086" to ":8086" in the netstat command).

package main

import (
	"context"
	"fmt"
	"log"
	"os/exec"

	influxdb2 "github.com/influxdata/influxdb-client-go"
)

func main() {
	for {
		client := influxdb2.NewClient("https://demo.factry.io:8086", "flux:xulf")
		queryAPI := client.QueryAPI("factry")

		query := `from(bucket: "simulation_tanksystem")
			|> range(start: -2h, stop: -30m)
			|> filter(fn: (r) => r._measurement == "WP1_fill_level")
			|> aggregateWindow(every: 30s, timeSrc: "_start", fn: mean, createEmpty: false)`

		// Get the parsed Flux query result
		result, err := queryAPI.Query(context.Background(), query)
		if err != nil {
			fmt.Println(err)
			return
		}
		defer result.Close()

		count := 0
		// Use Next() to iterate over query result lines
		for result.Next() {
			// Count the returned rows
			count++
		}
		fmt.Printf("Query returned %v rows\n", count)
		if result.Err() != nil {
			fmt.Printf("Query error: %s\n", result.Err().Error())
		}

		// Ensure background processes finish
		client.Close()

		getOpenConns()
	}
}

func getOpenConns() {
	out, err := exec.Command("sh", "-c", `netstat -atn | grep ".8086" | wc -l`).Output()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Open TCP connections: %s", out)
}

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 31 (10 by maintainers)

Most upvoted comments

@coussej @aknuds1 thank you for the discussion here. I would sum it up with the following conclusions:

  1. If an application frequently connects to any other server (such as a database), a connection pool must be used. Every influx client gets its own connection pool out of the box, so reusing a single client is the solution (see the sketch after this list). This should be documented to make things clear.
  2. Not reusing connections can exhaust system/networking resources, i.e. leave too many sockets in the TIME_WAIT state.
  3. If an influxdb client internally creates a new HTTP client (the default case), it should close all idle connections when it is closed. We might also tune the default client to not keep any idle connections at all.
  4. The influxdb client must be closed, because the WriteAPI implementation has an asynchronous retry mechanism that must be stopped on close.
  5. Users of this library are free to customize the connection pool and other HTTP/HTTPS-specific parameters via https://github.com/influxdata/influxdb-client-go/blob/fb49b499f9a53778bf4cc9ce62d15b0db3806473/api/http/options.go#L50.
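
A minimal sketch of conclusion (1), built from the reproduction program at the top of this issue: the client (and therefore its connection pool) is created once and shared by all queries, and each result is closed so its connection can return to the pool. The URL, token, org and bucket are the ones from the reproduction and are purely illustrative.

package main

import (
	"context"
	"fmt"

	influxdb2 "github.com/influxdata/influxdb-client-go"
)

func main() {
	// One client for the whole program: its connection pool is reused,
	// so repeated queries do not keep opening new sockets.
	client := influxdb2.NewClient("https://demo.factry.io:8086", "flux:xulf")
	// Close stops background machinery such as the WriteAPI retries, see (4).
	defer client.Close()

	queryAPI := client.QueryAPI("factry")
	query := `from(bucket: "simulation_tanksystem")
		|> range(start: -2h, stop: -30m)
		|> filter(fn: (r) => r._measurement == "WP1_fill_level")
		|> aggregateWindow(every: 30s, timeSrc: "_start", fn: mean, createEmpty: false)`

	for i := 0; i < 10; i++ {
		result, err := queryAPI.Query(context.Background(), query)
		if err != nil {
			fmt.Println(err)
			return
		}
		count := 0
		for result.Next() {
			count++
		}
		if result.Err() != nil {
			fmt.Printf("Query error: %s\n", result.Err().Error())
		}
		// Close each result so its connection goes back to the pool.
		result.Close()
		fmt.Printf("Query returned %v rows\n", count)
	}
}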

Thank you @coussej, the ESTABLISHED connections are caused by (3) in https://github.com/influxdata/influxdb-client-go/issues/183#issuecomment-675394837. That is the reason for keeping this issue open; it can be fixed.
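
For illustration, this is roughly the kind of change (3) describes, sketched with the standard library only. The httpClient below is a hypothetical stand-in for the HTTP client the library creates internally; this is not the library's actual code.

package main

import (
	"net/http"
	"time"
)

func main() {
	// Hypothetical stand-in for the transport/client the library creates
	// internally when no custom HTTP client is supplied (illustration only).
	transport := &http.Transport{IdleConnTimeout: 90 * time.Second}
	httpClient := &http.Client{Transport: transport, Timeout: 20 * time.Second}

	// ... requests would go through httpClient here ...

	// The gist of fix (3): when the influx client is closed, also close the
	// idle keep-alive connections of the HTTP client it owns, so no
	// ESTABLISHED sockets are left behind (http.Client.CloseIdleConnections,
	// available since Go 1.12).
	httpClient.CloseIdleConnections()
}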

@aknuds1, yes. The library does everything needed for proper handling of connections, as mentioned in the net/http docs:

The client must close the response body when finished with it:
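
As a standalone illustration of that requirement in plain net/http (the URL reuses the demo server from the reproduction, and the /ping endpoint is assumed here only for the example):

package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	resp, err := http.Get("https://demo.factry.io:8086/ping")
	if err != nil {
		fmt.Println(err)
		return
	}
	// Drain and close the body; only then can the underlying TCP connection
	// be returned to the pool and reused instead of being torn down.
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		fmt.Println(err)
	}
	fmt.Println("status:", resp.Status)
}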

@aknuds1 yes, it was the root cause. Your code in Grafana handles the situation in the best way possible. The code in this client seems to close everything properly from Go's point of view; I don't think we can force-close the socket.

@aknuds1 I am able to reproduce this on Ubuntu; the extra sockets that keep increasing the count are in the TIME_WAIT state:

$ netstat -atn | grep ":8086" 
tcp        0      0 127.0.0.1:35098         127.0.0.1:8086          TIME_WAIT  
tcp        0      0 127.0.0.1:34814         127.0.0.1:8086          TIME_WAIT  
tcp        0      0 127.0.0.1:35022         127.0.0.1:8086          TIME_WAIT  
tcp        0      0 127.0.0.1:34782         127.0.0.1:8086          TIME_WAIT  
tcp        0      0 172.17.0.1:37026        172.17.0.2:8086         TIME_WAIT

This is AFAIK a normal state that depends on the OS (https://knowledgebase.progress.com/articles/Article/Can-the-time-a-socket-spends-in-TIMED-WAIT-state-be-reduced). In my case, the sockets remain in the TIME_WAIT state for an additional 60 seconds:

$ cat /proc/sys/net/ipv4/tcp_fin_timeout
60

After 60 seconds, the socket count returns to its initial value.

The code should reuse the client to avoid this issue. https://stackoverflow.com/questions/39813587/go-client-program-generates-a-lot-a-sockets-in-time-wait-state might also point you to a solution.
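
For completeness, a sketch of conclusion (5): a shared http.Client with an explicit, bounded connection pool, built with the standard library only. The pool sizes and timeouts are illustrative assumptions, not recommendations, and how the client is handed to the influx client depends on the HTTP options linked in (5) above, so that wiring is not shown here.

package main

import (
	"net"
	"net/http"
	"time"
)

// newPooledHTTPClient builds an http.Client with an explicit, bounded
// connection pool. The values are illustrative, not recommendations.
func newPooledHTTPClient() *http.Client {
	return &http.Client{
		Timeout: 20 * time.Second,
		Transport: &http.Transport{
			DialContext: (&net.Dialer{
				Timeout:   5 * time.Second,
				KeepAlive: 30 * time.Second,
			}).DialContext,
			MaxIdleConns:        10,
			MaxIdleConnsPerHost: 10,
			IdleConnTimeout:     90 * time.Second,
		},
	}
}

func main() {
	// Share one pooled client for the whole process; pass it to the influx
	// client through the HTTP options linked in (5) above.
	_ = newPooledHTTPClient()
}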

I will fix that; however, I'm on Windows 10 and I'm not able to reproduce it (Go 1.4, InfluxDB 1.8.1, InfluxDB 2 beta 16).