opentelemetry-collector: Regression: crash when TLS insecure is set on authenticators

Component(s)

extension/oauth2clientauth

What happened?

Description

Trying to have two OpenTelemetry Collectors talk to each other using the oauth2client extension (agent <-> server).

Steps to Reproduce

Docker compose file

services:
  otel-agent:
    image: otel/opentelemetry-collector-contrib:0.64.1
    container_name: otel-agent
    hostname: secure-otel
    restart: unless-stopped
    ports:
      - 5317:4317
    environment:
      - TZ=Europe/Brussels
    volumes:
      - type: bind
        source: /var/lib/docker/volumes/secure-otel/otel-agent.yml
        target: /etc/otel-collector-config.yaml
        read_only: true
    command: ["--config=/etc/otel-collector-config.yaml"]
    depends_on:
      - otel-server
      
  otel-server:
    image: otel/opentelemetry-collector-contrib:0.64.1
    container_name: otel-server
    hostname: secure-otel
    restart: unless-stopped
    ports:
      - 4317:4317
    environment:
      - TZ=Europe/Brussels
    volumes:
      - type: bind
        source: /var/lib/docker/volumes/secure-otel/otel-server.yml
        target: /etc/otel-collector-config.yaml
        read_only: true
    command: ["--config=/etc/otel-collector-config.yaml"]
    
  tempo:
    image: grafana/tempo:1.5.0
    container_name: tempo
    hostname: secure-otel
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/lib/docker/volumes/secure-otel/tempo.yml:/etc/tempo.yaml
      - /var/lib/docker/volumes/secure-otel/tempo:/tmp/tempo
    restart: unless-stopped
    ports:
      - 3200:3200  # tempo
      - 4007:4317  # otlp grpc
    depends_on:
      - otel-server

With the two collector configs defined below (see "OpenTelemetry Collector configuration").

Expected Result

Correct data transfer from the agent collector to the server collector.

Actual Result

The OpenTelemetry Collector configured as the agent keeps crashing at startup.

Collector version

v0.64.1

Environment information

Environment

OS: Windows with WSL2 Ubuntu

OpenTelemetry Collector configuration

-- agent

extensions:
  oauth2client:
    client_id: ***
    client_secret: ***
    token_url: https://login.microsoftonline.com/***/oauth2/v2.0/token
    scopes: ["api://***/.default"]

receivers:
  otlp:
    protocols:
      grpc:

exporters:
  otlp/auth:
    endpoint: otel-server:4317
    tls:      
      insecure: true
    auth:
      authenticator: oauth2client

service:
  extensions:
    - oauth2client
  pipelines:
    traces:
      receivers:
        - otlp
      exporters:
        - otlp/auth

-- server
extensions:
  oidc:
    issuer_url: https://login.microsoftonline.com/***/v2.0
    audience: ***

receivers:
  otlp/auth:
    protocols:
      grpc:
        auth:
          authenticator: oidc

exporters:
  otlp:
    endpoint: tempo:4007
    tls:
      insecure: true

service:
  extensions:
    - oidc
  pipelines:
    traces:
      receivers:
        - otlp/auth
      exporters:
        - otlp

Log output

2022-11-17T14:46:34.893Z	info	service/telemetry.go:110	Setting up own telemetry...
2022-11-17T14:46:34.893Z	info	service/telemetry.go:140	Serving Prometheus metrics	{"address": ":8888", "level": "basic"}
2022-11-17T14:46:34.894Z	info	service/service.go:89	Starting otelcol-contrib...	{"Version": "0.64.1", "NumCPU": 12}
2022-11-17T14:46:34.894Z	info	extensions/extensions.go:41	Starting extensions...
2022-11-17T14:46:34.894Z	info	extensions/extensions.go:44	Extension is starting...	{"kind": "extension", "name": "oauth2client"}
2022-11-17T14:46:34.894Z	info	extensions/extensions.go:48	Extension started.	{"kind": "extension", "name": "oauth2client"}
2022-11-17T14:46:34.894Z	info	pipelines/pipelines.go:74	Starting exporters...
2022-11-17T14:46:34.894Z	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "traces", "name": "otlp/auth"}
2022-11-17T14:46:34.894Z	info	service/service.go:115	Starting shutdown...
2022-11-17T14:46:34.894Z	info	pipelines/pipelines.go:118	Stopping receivers...
2022-11-17T14:46:34.894Z	info	pipelines/pipelines.go:125	Stopping processors...
2022-11-17T14:46:34.894Z	info	pipelines/pipelines.go:132	Stopping exporters...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x9c012b]
goroutine 1 [running]:
google.golang.org/grpc.(*ClientConn).Close(0x0)
	google.golang.org/grpc@v1.50.1/clientconn.go:1016 +0x4b
go.opentelemetry.io/collector/exporter/otlpexporter.(*exporter).shutdown(0xc000997210?, {0x9?, 0x8ce8401?})
	go.opentelemetry.io/collector/exporter/otlpexporter@v0.64.1/otlp.go:93 +0x1d
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
	go.opentelemetry.io/collector@v0.64.1/component/component.go:91
go.opentelemetry.io/collector/exporter/exporterhelper.newBaseExporter.func2({0x74b9640, 0xc0000bc018})
	go.opentelemetry.io/collector@v0.64.1/exporter/exporterhelper/common.go:177 +0x5a
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
	go.opentelemetry.io/collector@v0.64.1/component/component.go:91
go.opentelemetry.io/collector/service/internal/pipelines.(*Pipelines).ShutdownAll(0xc000d36050, {0x74b9640, 0xc0000bc018})
	go.opentelemetry.io/collector@v0.64.1/service/internal/pipelines/pipelines.go:135 +0x36b
go.opentelemetry.io/collector/service.(*service).Shutdown(0xc00039b200, {0x74b9640, 0xc0000bc018})
	go.opentelemetry.io/collector@v0.64.1/service/service.go:121 +0xd4
go.opentelemetry.io/collector/service.(*Collector).shutdownServiceAndTelemetry(0xc0013c5a88, {0x74b9640?, 0xc0000bc018?})
	go.opentelemetry.io/collector@v0.64.1/service/collector.go:234 +0x36
go.opentelemetry.io/collector/service.(*Collector).setupConfigurationComponents(0xc0013c5a88, {0x74b9640, 0xc0000bc018})
	go.opentelemetry.io/collector@v0.64.1/service/collector.go:155 +0x286
go.opentelemetry.io/collector/service.(*Collector).Run(0xc0013c5a88, {0x74b9640, 0xc0000bc018})
	go.opentelemetry.io/collector@v0.64.1/service/collector.go:164 +0x46
go.opentelemetry.io/collector/service.NewCommand.func1(0xc00058cf00, {0x680c260?, 0x1?, 0x1?})
	go.opentelemetry.io/collector@v0.64.1/service/command.go:53 +0x479
github.com/spf13/cobra.(*Command).execute(0xc00058cf00, {0xc0000b4050, 0x1, 0x1})
	github.com/spf13/cobra@v1.6.1/command.go:916 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc00058cf00)
	github.com/spf13/cobra@v1.6.1/command.go:1044 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.6.1/command.go:968
main.runInteractive({{0xc000991860, 0xc0009b8a50, 0xc000991c80, 0xc000991500}, {{0x6833f67, 0xf}, {0x68af92d, 0x1f}, {0x6805d6e, 0x6}}, ...})
	github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:32 +0x5d
main.run(...)
	github.com/open-telemetry/opentelemetry-collector-releases/contrib/main_others.go:11
main.main()
	github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:25 +0x1d8

Additional context

No response

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 22 (13 by maintainers)

Most upvoted comments

Thanks @Depechie, I think I can reproduce the error and need to get to the bottom of it. I am guessing recent changes to the core collector's service lifecycle management are causing this issue.

Just FYI, I was able to reproduce this directly on my local machine. Steps: I downloaded version 0.64.1 of the otel-collector and started it with the following collector config YAML:

./otelcol-contrib --config otel-agent.yml 

extensions:
  oauth2client:
    client_id: agent
    client_secret: ******
    token_url: http://localhost:8080/auth/realms/opentelemetry/protocol/openid-connect/token

receivers:
  otlp:
    protocols:
      grpc:

exporters:
  otlp/auth:
    endpoint: myserver:5000
    tls:
      insecure: true
    auth:
      authenticator: oauth2client

service:
  telemetry:
    logs:
      level: "debug"
  extensions:
    - oauth2client
  pipelines:
    traces:
      receivers:
        - otlp
      exporters:
        - otlp/auth

This caused the entire process to crash:

./otelcol-contrib --config otel-agent.yml 
2022/11/18 01:51:18 proto: duplicate proto type registered: jaeger.api_v2.PostSpansRequest
2022/11/18 01:51:18 proto: duplicate proto type registered: jaeger.api_v2.PostSpansResponse
2022-11-18T01:51:19.294-0800	info	service/telemetry.go:110	Setting up own telemetry...
2022-11-18T01:51:19.296-0800	info	service/telemetry.go:140	Serving Prometheus metrics	{"address": ":8888", "level": "basic"}
2022-11-18T01:51:19.297-0800	debug	components/components.go:28	Stable component.	{"kind": "exporter", "data_type": "traces", "name": "otlp/auth", "stability": "stable"}
2022-11-18T01:51:19.298-0800	debug	components/components.go:28	Stable component.	{"kind": "receiver", "name": "otlp", "pipeline": "traces", "stability": "stable"}
2022-11-18T01:51:19.299-0800	info	service/service.go:89	Starting otelcol-contrib...	{"Version": "0.64.1", "NumCPU": 12}
2022-11-18T01:51:19.299-0800	info	extensions/extensions.go:41	Starting extensions...
2022-11-18T01:51:19.299-0800	info	extensions/extensions.go:44	Extension is starting...	{"kind": "extension", "name": "oauth2client"}
2022-11-18T01:51:19.299-0800	info	extensions/extensions.go:48	Extension started.	{"kind": "extension", "name": "oauth2client"}
2022-11-18T01:51:19.299-0800	info	pipelines/pipelines.go:74	Starting exporters...
2022-11-18T01:51:19.299-0800	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "traces", "name": "otlp/auth"}
2022-11-18T01:51:19.299-0800	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel created	{"grpc_log": true}
2022-11-18T01:51:19.300-0800	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel Connectivity change to SHUTDOWN	{"grpc_log": true}
2022-11-18T01:51:19.300-0800	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel deleted	{"grpc_log": true}
2022-11-18T01:51:19.300-0800	info	service/service.go:115	Starting shutdown...
2022-11-18T01:51:19.300-0800	info	pipelines/pipelines.go:118	Stopping receivers...
2022-11-18T01:51:19.300-0800	info	pipelines/pipelines.go:125	Stopping processors...
2022-11-18T01:51:19.300-0800	info	pipelines/pipelines.go:132	Stopping exporters...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x15bc0ab]

goroutine 1 [running]:
google.golang.org/grpc.(*ClientConn).Close(0x0)
	google.golang.org/grpc@v1.50.1/clientconn.go:1016 +0x4b
go.opentelemetry.io/collector/exporter/otlpexporter.(*exporter).shutdown(0xc000e487d0?, {0x9?, 0x962cd01?})
	go.opentelemetry.io/collector/exporter/otlpexporter@v0.64.1/otlp.go:93 +0x1d
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
	go.opentelemetry.io/collector@v0.64.1/component/component.go:91
go.opentelemetry.io/collector/exporter/exporterhelper.newBaseExporter.func2({0x7e544c0, 0xc00019e000})
	go.opentelemetry.io/collector@v0.64.1/exporter/exporterhelper/common.go:177 +0x5a
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
	go.opentelemetry.io/collector@v0.64.1/component/component.go:91
go.opentelemetry.io/collector/service/internal/pipelines.(*Pipelines).ShutdownAll(0xc0000c18b0, {0x7e544c0, 0xc00019e000})
	go.opentelemetry.io/collector@v0.64.1/service/internal/pipelines/pipelines.go:135 +0x36b
go.opentelemetry.io/collector/service.(*service).Shutdown(0xc000633800, {0x7e544c0, 0xc00019e000})
	go.opentelemetry.io/collector@v0.64.1/service/service.go:121 +0xd4
go.opentelemetry.io/collector/service.(*Collector).shutdownServiceAndTelemetry(0xc0015fba88, {0x7e544c0?, 0xc00019e000?})
	go.opentelemetry.io/collector@v0.64.1/service/collector.go:234 +0x36
go.opentelemetry.io/collector/service.(*Collector).setupConfigurationComponents(0xc0015fba88, {0x7e544c0, 0xc00019e000})
	go.opentelemetry.io/collector@v0.64.1/service/collector.go:155 +0x286
go.opentelemetry.io/collector/service.(*Collector).Run(0xc0015fba88, {0x7e544c0, 0xc00019e000})
	go.opentelemetry.io/collector@v0.64.1/service/collector.go:164 +0x46
go.opentelemetry.io/collector/service.NewCommand.func1(0xc00063d200, {0x71e628b?, 0x2?, 0x2?})
	go.opentelemetry.io/collector@v0.64.1/service/command.go:53 +0x479
github.com/spf13/cobra.(*Command).execute(0xc00063d200, {0xc00019a190, 0x2, 0x2})
	github.com/spf13/cobra@v1.6.1/command.go:916 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc00063d200)
	github.com/spf13/cobra@v1.6.1/command.go:1044 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.6.1/command.go:968
main.runInteractive({{0xc00107a4e0, 0xc00107b6b0, 0xc00107a900, 0xc0004199e0}, {{0x720c2dc, 0xf}, {0x7283ffd, 0x1f}, {0x71e023a, 0x6}}, ...})
	github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:32 +0x5d
main.run(...)
	github.com/open-telemetry/opentelemetry-collector-releases/contrib/main_others.go:11
main.main()
	github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:25 +0x1d8

cc: @jpkrohling

I moved this to the collector core repository. I don't think this warrants a patch release, as this is triggered only by invalid configuration options.
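
To make the invalid combination concrete: on the agent it is the pairing of a plaintext exporter channel with an authenticator. A minimal sketch of that fragment, with annotations that reflect a reading of the behaviour rather than confirmed wording from the code:

exporters:
  otlp/auth:
    endpoint: otel-server:4317
    tls:
      insecure: true               # plaintext gRPC channel
    auth:
      authenticator: oauth2client  # OAuth2 per-RPC credentials expect a secure channel
    # In v0.64.1 this pairing appears to leave the exporter without a client
    # connection, and shutting it down then panics on the nil connection
    # (see the stack trace above).

Changing either setting (enabling TLS on the channel, or removing the authenticator for testing) avoids the startup failure.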

@jpkrohling and @pavankrish123, is there any possibility you could help out in the meantime with the local sample I'm trying to work out? The example blog post by @jpkrohling seems to do a local install of OpenTelemetry outside Docker?

If not, no worries…

Will try a few things on my end and get back to you soon, @Depechie. I've been busy the last couple of days.

@Depechie, please refer to this blog post written by our friend @jpkrohling; the TLS setup section shows how to create some dummy certs. It's fairly easy with the cfssl tool.

You see, the collector refuses to connect over a plain-text channel.
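
For completeness, a sketch of what the agent exporter could look like once dummy certs like the ones from that blog post are in place. The file paths are placeholders, and ca_file/cert_file/key_file are the standard exporter TLS client settings, not values taken from the blog:

exporters:
  otlp/auth:
    endpoint: otel-server:4317
    tls:
      insecure: false
      # Placeholder paths to certs generated with cfssl; adjust to wherever
      # they are mounted inside the container.
      ca_file: /certs/ca.pem
      cert_file: /certs/agent.pem
      key_file: /certs/agent-key.pem
    auth:
      authenticator: oauth2client

The server's otlp/auth receiver would also need matching server-side TLS settings (cert_file/key_file under its grpc block) so the agent can actually connect.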