pymemcache: Bad file descriptor

I have a ton of traffic hitting ~5 servers that connect to the same memcached server.

Memcached config:

      PORT="11211"
      USER="memcached"
      MAXCONN="10240"
      CACHESIZE="1024"
      OPTIONS=""
      LOGFILE="/var/log/memcached.log"

pymemcache client setup:

      Client(('127.0.0.1', 11211), serializer=json_serializer, deserializer=json_deserializer, connect_timeout=3, timeout=1, ignore_exc=True, no_delay=True)
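
For reference, json_serializer and json_deserializer are JSON helpers roughly like the ones in the pymemcache docs (sketched here, not my exact code):

      import json
      from pymemcache.client.base import Client

      # Sketch of the JSON serde helpers (assumed to follow the pymemcache docs example).
      def json_serializer(key, value):
          # plain strings pass through unchanged; everything else is JSON-encoded
          if type(value) == str:
              return value, 1
          return json.dumps(value), 2

      def json_deserializer(key, value, flags):
          if flags == 1:
              return value
          if flags == 2:
              return json.loads(value)
          raise Exception("Unknown serialization format")

      client = Client(('127.0.0.1', 11211),
                      serializer=json_serializer,
                      deserializer=json_deserializer,
                      connect_timeout=3, timeout=1,
                      ignore_exc=True, no_delay=True)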

The problem is that sometimes I get OSError: [Errno 9] Bad file descriptor on the servers. The code breaks in

pymemcache/client/base.py in _store_cmd at line 797

      self.sock.sendall(b''.join(cmds))

and in pymemcache/client/base.py in set at line 290

      return self._store_cmd(b'set', {key: value}, expire, noreply)[key]

After the error was detected I checked the servers, and they still had about 4 GB of free RAM and plenty of disk space.

I switched from this library to python-memcached just to test, and I didn't get any errors, but the response time was slower than with pymemcache.

I only open a connection to the memcached server when the application server starts; could it be that I need to refresh the connection after a while?
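
By "refresh" I mean something like dropping the socket and retrying once when this error shows up, a rough sketch of the idea rather than a confirmed fix:

      # Hypothetical wrapper, not my production code: close the socket on
      # OSError and retry once. Client.close() drops the socket and the next
      # command should reconnect.
      def safe_set(client, key, value, expire=0):
          try:
              return client.set(key, value, expire=expire)
          except OSError:
              client.close()
              return client.set(key, value, expire=expire)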

Thanks

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 31

Most upvoted comments

I refactored the pymemcache part of my application to be thread-safe, and I haven't seen the error again.

Thanks for the help
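
For anyone hitting the same thing: a straightforward thread-safe setup is pymemcache's PooledClient, which checks a connection out of an internal pool for each command instead of sharing one socket between threads. A sketch (not necessarily the exact refactor described above):

      from pymemcache.client.base import PooledClient

      # PooledClient is safe to share across threads: each command checks a
      # connection out of an internal pool instead of reusing a single socket.
      client = PooledClient(('127.0.0.1', 11211),
                            connect_timeout=3, timeout=1,
                            ignore_exc=True, no_delay=True,
                            max_pool_size=32)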

Any update on this? For my project, I've seen a similar issue running pymemcache in a Kubernetes cluster.

The error was reported on return sock.recv(size) in the _recv function. I managed to reproduce it easily by putting heavy load onto the same Python process. I later switched to python-memcached as a workaround and never saw that error again. But python-memcached is far slower and not as feature-rich as pymemcache.

Sorry, I don't have any more detail to give; I had to get the project running quickly. It would be good to see these socket issues fixed.
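
Something like this should reproduce the symptom: many threads sharing one non-pooled Client in the same process (a repro sketch, not the actual project code):

      import threading
      from pymemcache.client.base import Client

      # One shared, non-pooled Client: concurrent calls interleave on the same
      # socket, which is the kind of heavy load described above.
      shared = Client(('127.0.0.1', 11211), connect_timeout=3, timeout=1)

      def hammer(worker):
          for i in range(10000):
              shared.set('k-%d-%d' % (worker, i), b'value')
              shared.get('k-%d-%d' % (worker, i))

      threads = [threading.Thread(target=hammer, args=(w,)) for w in range(16)]
      for t in threads:
          t.start()
      for t in threads:
          t.join()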

I might only be able to give more feedback in 1 or 2 weeks, once I've tested everything and pushed to production.

I suggest the following trick (it will require you to modify your pymemcache package locally; see also the sketch after this list):

  • Add self._ti = threading.get_ident() to Client.__init__()
  • Add if self._ti != threading.get_ident(): raise RuntimeError('wrong thread') to Client._store_cmd(), Client._fetch_cmd(), Client._misc_cmd(), and Client._connect()

If you don’t observe a RuntimeError after doing that, we can be at least somewhat certain that this bug isn’t caused by a race.
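
If you'd rather not edit the installed package, the same check can be written as a wrapper subclass (a sketch; ThreadCheckingClient is just an illustrative name, and the private methods are the ones listed above):

      import threading
      from pymemcache.client.base import Client

      class ThreadCheckingClient(Client):
          # Raises RuntimeError if any low-level command runs on a thread
          # other than the one that created the client.

          def __init__(self, *args, **kwargs):
              self._ti = threading.get_ident()
              super(ThreadCheckingClient, self).__init__(*args, **kwargs)

          def _check_thread(self):
              if self._ti != threading.get_ident():
                  raise RuntimeError('wrong thread')

          def _connect(self):
              self._check_thread()
              return super(ThreadCheckingClient, self)._connect()

          def _store_cmd(self, *args, **kwargs):
              self._check_thread()
              return super(ThreadCheckingClient, self)._store_cmd(*args, **kwargs)

          def _fetch_cmd(self, *args, **kwargs):
              self._check_thread()
              return super(ThreadCheckingClient, self)._fetch_cmd(*args, **kwargs)

          def _misc_cmd(self, *args, **kwargs):
              self._check_thread()
              return super(ThreadCheckingClient, self)._misc_cmd(*args, **kwargs)

Instantiate ThreadCheckingClient in place of Client; if the RuntimeError fires, the client is being shared across threads.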