redis-plus-plus: Connection Broken when ConnectionOptions::socket_timeout is set

Hi, I’m have a problem with the Subscriber class when setting a ConnectionOptions::socket_timeout. Basic example:

int main(int argc, char const *argv[])
{
	using namespace sw::redis;
	using namespace std::chrono_literals;
	ConnectionOptions connection_option;
	connection_option.host = "127.0.0.1";
	connection_option.port = 6379;
	// connection_option.socket_timeout = 10ms; /* comment out to obtain the error */

	auto redis = Redis(connection_option);
	auto sub = redis.subscriber();
	sub.on_message([](std::string channel, std::string msg) {
		std::cout << "Get " << msg << " on " << channel << std::endl;
	});
	sub.subscribe("topic_1");
	while (redis.ping() == "PONG")
	{
		try
		{
			sub.consume();
		}
		catch (const TimeoutError &timeout)
		{
			std::cout << "TIMEOUT" << std::endl;
			continue;
		}
		catch (const Error &err)
		{
			std::cout << err.what() << std::endl;
		}
		std::cout << "Pass" << std::endl;
		std::this_thread::sleep_for(100ms);
	}
	std::cout << "Exit" << std::endl;
	return 0;
}

All test are done voluntarily with no one publishing on that topics. Without the explicit set of socket_timeout, all works as expected:

  • a first pass is write
  • connection is blocked since there is no incoming messages.

With the explicit set of socket_timeout (comment out the specific line), the output is:

  • Pass
  • Failed to get reply: Unknown error code
  • Pass
  • Connection is broken
  • Pass
  • Connection is broken [so on and so forth]

NOTE: the connection is broken after some seconds In this case I would expect to see a TIMEOUT print, but no one is printed. In my case, this behavior is 100% repeatable. This error could be related to this modification? Am I missing something?

Thanks in advance!

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 30 (16 by maintainers)

Most upvoted comments

Just to keep trace: now hiredis will be update soon fixing this bug (he simply removed lines 320, 321, 322, 323 from net.c) He also add some WIndows test too! Which are always welcome!

Hi @sewenew , We are discussing about too many arguments in this issue. Let me clean this up and move the MINGW Test in another “issues”. Thanks for the response about subscriber callback, it’s all clear now!

Damn! Yesterday I look in that specific code and I miss the bug! Your definition of DWORD as unsigned long seems correct. But, as you clearly highlight, the error is not in the conversion from DWORD to timeval, but the cast from a DWORD* to timeval*!

Let me summarize the problem so that I can open the Issue on the hiredis repo:

net.c

316    int redisContextSetTimeout(redisContext *c, const struct timeval tv) {
317        const void *to_ptr = &tv;
318        size_t to_sz = sizeof(tv);
319    #ifdef _WIN32
320        DWORD timeout_msec = tv.tv_sec * 1000 + tv.tv_usec / 1000;
321        to_ptr = &timeout_msec;
322       to_sz = sizeof(timeout_msec);
323    #endif
324        if (setsockopt(c->fd,SOL_SOCKET,SO_RCVTIMEO,to_ptr,to_sz) == -1) {
325            __redisSetErrorFromErrno(c,REDIS_ERR_IO,"setsockopt(SO_RCVTIMEO)");
326            return REDIS_ERR;
327        }
328        if (setsockopt(c->fd,SOL_SOCKET,SO_SNDTIMEO,to_ptr,to_sz) == -1) {
329            __redisSetErrorFromErrno(c,REDIS_ERR_IO,"setsockopt(SO_SNDTIMEO)");
330            return REDIS_ERR;
331        }
332       return REDIS_OK;
333    }

Line 320 a DWORD timeout_msec is correctly computed from the timeval tv Line 321 to_ptr is assign to the address of timeout_msec (to_ptr is di facto a DWORD*) Line 324 to_ptr is pass to setsockopt which is win32_setsockoption

sockcompat.c

212    int win32_setsockopt(SOCKET sockfd, int level, int optname, const void *optval, socklen_t optlen) {
213        int ret = 0;
214        if ((level == SOL_SOCKET) && ((optname == SO_RCVTIMEO) || (optname == SO_SNDTIMEO))) {
215            struct timeval *tv = optval;
216            DWORD timeout = tv->tv_sec * 1000 + tv->tv_usec / 1000;
217            ret = setsockopt(sockfd, level, optname, (const char*)&timeout, sizeof(DWORD));
218        } else {
219            ret = setsockopt(sockfd, level, optname, (const char*)optval, optlen);
220        }
221        _updateErrno(ret != SOCKET_ERROR);
222        return ret != SOCKET_ERROR ? ret : -1;
223    }

Line 215 casts optval (which is a DWORD*) to timeval* and this obviously causes the error.

I don’t understand why the #ifdef _WIN32 was introduced at line 320, In fact, commenting out line 320, 321, 322, 323 all seems work as expected! Now the timeout is called after the correct amount of milliseconds!