grpc: c++ gRPC Server unresponsive after opening too many files

What version of gRPC and what language are you using?

1.23.0, c++

What operating system (Linux, Windows,…) and version?

RHEL 7.6

What runtime / compiler are you using (e.g. Python version or version of gcc)?

clang++

What did you do?

I limited the number of open files for the server process, then started more clients than the server could serve within that limit.
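
The descriptor limit described above can also be reproduced inside a single process, without a shell. The following is a minimal sketch, assuming a POSIX system. `SetSoftFdLimit` and `ExhaustFds` are hypothetical helper names used for illustration only and are not part of gRPC.

```cpp
#include <cassert>
#include <cerrno>
#include <vector>

#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

// Hypothetical helper: lowers the soft RLIMIT_NOFILE of the calling process,
// similar to what `ulimit -n` does for a shell. The hard limit is untouched.
bool SetSoftFdLimit(rlim_t soft_limit) {
  struct rlimit lim;
  if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return false;
  if (soft_limit > lim.rlim_max) return false;  // cannot exceed the hard limit
  lim.rlim_cur = soft_limit;
  return setrlimit(RLIMIT_NOFILE, &lim) == 0;
}

// Hypothetical helper: opens /dev/null until open() fails (or a safety cap is
// reached), stores how many opens succeeded, closes them all again, and
// returns the errno of the failing open() (0 if the cap was reached first).
int ExhaustFds(int* opened_count) {
  assert(opened_count != nullptr);
  constexpr int kSafetyCap = 100000;
  std::vector<int> fds;
  int err = 0;
  while (static_cast<int>(fds.size()) < kSafetyCap) {
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0) {
      err = errno;
      break;
    }
    fds.push_back(fd);
  }
  *opened_count = static_cast<int>(fds.size());
  for (int fd : fds) close(fd);
  return err;
}
```

With the soft limit set to 20, `ExhaustFds` should report `EMFILE` (errno 24, the "Too many open files" in the server log) after fewer than 20 successful opens, because stdin, stdout and stderr already count against the limit. Since the helper closes its descriptors again, later `open()` calls can succeed, so the condition is transient at the operating-system level.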

  1. I compiled the unmodified examples/cpp/helloworld/greeter_server.cc.
  2. I executed:
$ bash
$ ulimit -n 20
$ ./HelloWorldServer
  3. I modified greeter_client.cc to call the server once per second until a file named stop exists:
/*
 *
 * Copyright 2015 gRPC authors.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */

#include <chrono>
#include <thread>
#include <unistd.h>

#include <iostream>
#include <memory>
#include <string>

#include <grpcpp/grpcpp.h>

#ifdef BAZEL_BUILD
#include "examples/protos/helloworld.grpc.pb.h"
#else
#include "helloworld.grpc.pb.h"
#endif

using grpc::Channel;
using grpc::ClientContext;
using grpc::Status;
using helloworld::HelloRequest;
using helloworld::HelloReply;
using helloworld::Greeter;

class GreeterClient {
 public:
  GreeterClient(std::shared_ptr<Channel> channel)
      : stub_(Greeter::NewStub(channel)) {}

  // Assembles the client's payload, sends it and presents the response back
  // from the server.
  std::string SayHello(const std::string& user) {
    // Data we are sending to the server.
    HelloRequest request;
    request.set_name(user);

    // Container for the data we expect from the server.
    HelloReply reply;

    // Context for the client. It could be used to convey extra information to
    // the server and/or tweak certain RPC behaviors.
    ClientContext context;

    // The actual RPC.
    Status status = stub_->SayHello(&context, request, &reply);

    // Act upon its status.
    if (status.ok()) {
      return reply.message();
    } else {
      std::cout << status.error_code() << ": " << status.error_message()
                << std::endl;
      return "RPC failed";
    }
  }

 private:
  std::unique_ptr<Greeter::Stub> stub_;
};

int main(int argc, char** argv) {
  // Instantiate the client. It requires a channel, out of which the actual RPCs
  // are created. This channel models a connection to an endpoint (in this case,
  // localhost at port 50051). We indicate that the channel isn't authenticated
  // (use of InsecureChannelCredentials()).
  GreeterClient greeter(grpc::CreateChannel(
      "localhost:50051", grpc::InsecureChannelCredentials()));
  std::string user("world");
  // Keep calling the server once per second until a file named "stop" exists.
  while (access("stop", F_OK) == -1) {
    std::string reply = greeter.SayHello(user);
    std::cout << "Greeter received: " << reply << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }

  return 0;
}

  4. I executed test.sh, which starts 10 clients in the background:
$ more test.sh
#!/bin/bash
count=10
for (( i=0; i<$count; i++ )); do
  ./HelloWorldClient &
done
wait
  5. Output on the server side:
Server listening on 0.0.0.0:50051
E1001 09:54:20.166869652   24475 tcp_server_posix.cc:213]    Failed accept4: Too many open files
E1001 09:54:20.167309619   24475 ev_epollex_linux.cc:1313]   pollset_add_fd: {"created":"@1569923660.167265010","description":"pollset_transition_pollable_from_empty_to_fd","file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":307,"referenced_errors":[{"created":"@1569923660.167264164","description":"get_fd_pollable","file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":307,"referenced_errors":[{"created":"@1569923660.167261401","description":"Too many open files","errno":24,"file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":541,"os_error":"Too many open files","syscall":"epoll_create1"}]}]}
E1001 09:54:21.167986508   24475 ev_epollex_linux.cc:1313]   pollset_add_fd: {"created":"@1569923661.167923575","description":"pollset_transition_pollable_from_empty_to_fd","file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":307,"referenced_errors":[{"created":"@1569923661.167922434","description":"get_fd_pollable","file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":307,"referenced_errors":[{"created":"@1569923661.167918409","description":"Too many open files","errno":24,"file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":541,"os_error":"Too many open files","syscall":"epoll_create1"}]}]}
E1001 09:54:22.168806789   24475 ev_epollex_linux.cc:1313]   pollset_add_fd: {"created":"@1569923662.168744095","description":"pollset_transition_pollable_from_empty_to_fd","file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":307,"referenced_errors":[{"created":"@1569923662.168743040","description":"get_fd_pollable","file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":307,"referenced_errors":[{"created":"@1569923662.168739697","description":"Too many open files","errno":24,"file":"src/core/lib/iomgr/ev_epollex_linux.cc","file_line":541,"os_error":"Too many open files","syscall":"epoll_create1"}]}]}
  6. Initial output on the client side:
Greeter received: Hello world
Greeter received: Hello world
Greeter received: Hello world
Greeter received: Hello world
...

  7. Later, all subsequent client calls fail (even when only a single client process is running):
14: Connection reset by peer
14: Connection reset by peer
14: Connection reset by peer
14: Connection reset by peer
14: Connection reset by peer
Greeter received: RPC failed
Greeter received: RPC failed
Greeter received: RPC failed
14: Connection reset by peer
Greeter received: RPC failed
Greeter received: RPC failed
Greeter received: RPC failed
14: failed to connect to all addresses
14: failed to connect to all addresses
14: failed to connect to all addresses
14: failed to connect to all addresses
14: failed to connect to all addresses

What did you expect to see?

Resource-exhausted errors while the limit is hit, followed by recovery of the server.
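
For reading the client output quoted earlier: the number before the colon is the numeric gRPC status code printed by `status.error_code()`. The lookup below is a small sketch based on the canonical gRPC status code values. `StatusCodeName` and `CheckIssueCodes` are hypothetical helpers, not gRPC API; real client code would normally compare against the `grpc::StatusCode` enum directly.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a numeric gRPC status code to its canonical name.
std::string StatusCodeName(int code) {
  static const char* const kNames[] = {
      "OK",                  // 0
      "CANCELLED",           // 1
      "UNKNOWN",             // 2
      "INVALID_ARGUMENT",    // 3
      "DEADLINE_EXCEEDED",   // 4
      "NOT_FOUND",           // 5
      "ALREADY_EXISTS",      // 6
      "PERMISSION_DENIED",   // 7
      "RESOURCE_EXHAUSTED",  // 8
      "FAILED_PRECONDITION", // 9
      "ABORTED",             // 10
      "OUT_OF_RANGE",        // 11
      "UNIMPLEMENTED",       // 12
      "INTERNAL",            // 13
      "UNAVAILABLE",         // 14
      "DATA_LOSS",           // 15
      "UNAUTHENTICATED",     // 16
  };
  constexpr int kCount = sizeof(kNames) / sizeof(kNames[0]);
  if (code < 0 || code >= kCount) return "UNRECOGNIZED";
  return kNames[code];
}

// Ties the table to this report: the clients printed code 14, while the
// expected behavior refers to code 8.
void CheckIssueCodes() {
  assert(StatusCodeName(14) == "UNAVAILABLE");
  assert(StatusCodeName(8) == "RESOURCE_EXHAUSTED");
}
```

In other words, the clients in this report saw `UNAVAILABLE` (14) with the messages "Connection reset by peer" and "failed to connect to all addresses", not the `RESOURCE_EXHAUSTED` (8) status that the expectation above refers to.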

What did you see instead?

The server becomes unresponsive and does not recover: every later call fails, even from a single client.

Anything else we should know about your project / environment?

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 21 (10 by maintainers)

Most upvoted comments

Thanks for fixing this! I will test as soon as possible and hopefully this will allow us to raise our thread limit. Many thanks for tackling the problem. 😃

Looks like this issue got lost in the shuffle when we had some team changes. It’s on my radar now, I’ll look into it.