grpc: Interop clients should not override authority by default, for proper SNI testing

You have to specify GRPC_SSL_TARGET_NAME_OVERRIDE_ARG to get SNI. That’s dangerous as it may prevent server operators from presenting correct certificates, as is the case for grpc-test.sandbox.gogole.com.

Reproduction case:

ejona@bruteforce:~/clients/grpc-c$ docker_image=gcr.io/grpc-testing/grpc_interop_python:v1.15.0
ejona@bruteforce:~/clients/grpc-c$ docker run -i --rm=true -e PYTHONPATH=/var/local/git/grpc/src/python/gens -e LD_LIBRARY_PATH=/var/local/git/grpc/libs/opt -w /var/local/git/grpc --net=host $docker_image bash -c "py27_native/bin/python src/python/grpcio_tests/setup.py run_interop --client --args=\"--server_host=grpc-test.sandbox.googleapis.com --server_host_override=grpc-test.sandbox.googleapis.com --server_port=443 --use_tls=true --test_case=empty_unary\""
running run_interop
ejona@bruteforce:~/clients/grpc-c$ docker run -i --rm=true -e PYTHONPATH=/var/local/git/grpc/src/python/gens -e LD_LIBRARY_PATH=/var/local/git/grpc/libs/opt -w /var/local/git/grpc --net=host $docker_image bash -c "py27_native/bin/python src/python/grpcio_tests/setup.py run_interop --client --args=\"--server_host=grpc-test.sandbox.googleapis.com --server_port=443 --use_tls=true --test_case=empty_unary\""
running run_interop
Traceback (most recent call last):
  File "src/python/grpcio_tests/setup.py", line 107, in <module>
    test_runner=TEST_RUNNER,
  File "/var/local/git/grpc/py27_native/local/lib/python2.7/site-packages/setuptools/__init__.py", line 140, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/var/local/git/grpc/src/python/grpcio_tests/commands.py", line 188, in run
    self.run_client()
  File "/var/local/git/grpc/src/python/grpcio_tests/commands.py", line 204, in run_client
    client.test_interoperability()
  File "/var/local/git/grpc/src/python/grpcio_tests/tests/interop/client.py", line 127, in test_interoperability
    test_case.test_interoperability(stub, args)
  File "/var/local/git/grpc/src/python/grpcio_tests/tests/interop/methods.py", line 482, in test_interoperability
    _empty_unary(stub)
  File "/var/local/git/grpc/src/python/grpcio_tests/tests/interop/methods.py", line 145, in _empty_unary
    response = stub.EmptyCall(empty_pb2.Empty())
  File "/var/local/git/grpc/py27_native/local/lib/python2.7/site-packages/grpc/_channel.py", line 532, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/var/local/git/grpc/py27_native/local/lib/python2.7/site-packages/grpc/_channel.py", line 466, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connect Failed"
	debug_error_string = "{"created":"@1538496521.874311362","description":"Failed to create subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":2636,"referenced_errors":[{"created":"@1538496521.874124955","description":"Pick Cancelled","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":241,"referenced_errors":[{"created":"@1538496521.874069460","description":"Connect Failed","file":"src/core/ext/filters/client_channel/subchannel.cc","file_line":663,"grpc_status":14,"referenced_errors":[{"created":"@1538496521.873898314","description":"Peer name foo.test.google.fr is not in peer certificate","file":"src/core/lib/security/security_connector/security_connector.cc","file_line":880}]}]}]}"

I did confirm with wireshark that SNI is used with the override and not used without.

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 24 (14 by maintainers)

Most upvoted comments

Had a discussion with @jiangtaoli2016. Agreed that SNI should be enabled by default.

Most use case does not require SNI. If user wants to connects to ‘prod.abc.com’, but server only has a certificate on ‘test.bcd.com’, client can use SNI to configure ‘test.bcd.com’ using grpc.ssl_target_name_override.

SNI != testing. In fact, our usage is the only time I’ve seen SNI necessary for testing. The client generally doesn’t know if it needs to use SNI. Also, it may not need to use SNI today, but needs to use SNI tomorrow as the server infrastructure changes. It is wrong to require the client to do additional configuration to enable SNI.

I’d agree that “most service deployments” do not require SNI, because SNI isn’t needed with microservices. But that isn’t the case for hosted services; things like googleapis.com. It’s very normal to need SNI in those environments. Any place virtual hosting takes place (which means an L7 LB, and maybe more rarely L4 LB) SNI should be used by clients.

Especially we have SPIFFE certificate coming out soon that user can have full control on server name check with a callback function (mainly for non-https semantics).

“Here is a use case that doesn’t need SNI” is not an argument against “we need SNI.”

It appears a bug in python, that a user could not use SNI properly.

This bug is not for the reported Python problem. I tested the interop client and saw SNI working. So something else is at play there.

Interop-client. In c++, php, python, interop-client will always use SNI. And if there is no hostname overrides from command line, “foo.test.google.fr” will be used as hostname_overrides.

Which is just wrong and why I was not aware of this bug. The interop client is testing interopability, which includes using SNI. The fact that the interop tests are not testing the default configuration is wrong and broken. It sounds like they always specify the override because SNI is not on by default; that is, they are working around the broken SNI in the interop client instead of fixing the gRPC library.

The interop client is important to this fix because it is the “test” that SNI is working correctly. This shouldn’t be considered “fixed” without a test, and I expect that the interop test should be that test. I’m fine if the one client is fixed (say, C++) and then other issues opened for the other clients. But I’m going to assume that implementations are broken until their interop client (which is something I can easily run) shows that it is working correctly.