memcached: `getpeername: Transport endpoint is not connected` with Extstore
Describe the bug
Hey, I have Memcached 1.6.12 deployed on AWS EC2 instances (specifically r5d.2xlarge; more details under System Information) with Extstore enabled. Occasionally, the Memcached process ends up with too many open files and hangs. When this happens, there are more than 50,000 entries under the /proc/[pid]/fd/ directory, whereas normally there are only about 10,000. There are also a lot of log lines like these whenever the issue is triggered:
getpeername: Transport endpoint is not connected
accept4(): Too many open files
getpeername: Transport endpoint is not connected
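To see what the extra descriptors actually are when the count climbs from about 10,000 to over 50,000, the entries under /proc/[pid]/fd/ can be grouped by what they point to. The following is a minimal sketch, not something taken from the issue thread: `summarize_fds` is a hypothetical helper name, and it assumes a Linux /proc layout where each entry is a symlink such as `socket:[12345]` or an absolute file path.

```python
import os
from collections import Counter


def summarize_fds(fd_dir):
    """Count the entries of a /proc/<pid>/fd directory, grouped by target kind."""
    kinds = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            kinds["socket"] += 1   # client connections and listeners
        elif target.startswith("/"):
            kinds["file"] += 1     # regular files, e.g. the log or the extstore file
        else:
            kinds["other"] += 1    # pipes, anon_inode entries (epoll, eventfd), ...
    return kinds
```

Calling `summarize_fds("/proc/<pid>/fd")` with the Memcached PID (as the same user or as root) once during normal operation and once during an incident would show whether the growth is in sockets or in something else.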
To Reproduce
This seems to happen randomly and is hard to reproduce.
System Information
- OS/Distro: Ubuntu 16.04
- Version of OS/distro: xenial
- Version of Memcached: 1.6.12
- Hardware detail: Amazon EC2 r5d.2xlarge – 64G memory, 8CPU, 300G SSD
Additionally, I tried it on different hardware, and the same errors were reported:
- OS/Distro: Ubuntu 20.04
- Version of OS/distro: focal
- Version of Memcached: 1.6.12
- Hardware detail: Amazon EC2 r6gd.2xlarge – 64G memory, 8CPU, 474G SSD
Detail (please include!)
Here’s the memcached.conf:
-d
logfile /var/log/memcached.log
-m 48000
-p 11211
-u nobody
-l 0.0.0.0
-c 100000
-t 8
-I 10m
-o ext_path=/mnt/memcached/extstore:225G
-o ext_wbuf_size=16
-o ext_low_ttl=14400
-o modern
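The configuration asks for up to 100,000 connections (`-c 100000`), yet `accept4(): Too many open files` appeared at around 50,000 descriptors. One thing that may be worth checking is whether the running process's open-file limit actually matches the configured connection count. This is an assumption about a possible diagnostic step, not a finding from the thread. The sketch below uses two hypothetical helpers: one reads `-c` from memcached.conf-style text (only the short `-c N` form is handled, falling back to memcached's default of 1024), and the other reads the soft "Max open files" value from the text of /proc/[pid]/limits.

```python
import re


def configured_max_conns(conf_text, default=1024):
    """Return the -c value from memcached.conf-style text, or the default."""
    for line in conf_text.splitlines():
        match = re.match(r"\s*-c\s+(\d+)\s*$", line)
        if match:
            return int(match.group(1))
    return default


def soft_nofile_limit(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units.
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None
```

If the soft limit turns out to be lower than the configured connection count plus the descriptors Memcached needs for itself, the process can hit `EMFILE` before reaching the `-c` value.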
About this issue
- State: closed
- Created 3 years ago
- Comments: 47 (25 by maintainers)
Alright I’ll close this out; the upstreamed patch should be in an official version within a week or two. Thanks for your patience in narrowing this down.
If it comes back or you have further details please open a new issue but reference this one.