ClickHouse: Kafka Engine returns 401 Authentication issue when connecting to Confluent schema registry with a URL-encoded basic authentication URL

Describe the unexpected behavior

I was using the Kafka engine to consume a Kafka topic, and everything works with a JSON topic. When I switch to Avro and need to connect to the Confluent Cloud schema registry using a URL that carries URL-encoded basic authentication credentials, I get a 401 authentication error.

How to reproduce

  • Which ClickHouse server version to use: 21.5.5.12 (I also tried 21.12.2.17, and it did not work either)

  • Non-default settings, if any: format_avro_schema_registry_url

  • CREATE TABLE statements for all tables involved: I used the CREATE TABLE command below. Note that in format_avro_schema_registry_url I had to URL-encode the secret because it contains /. The raw URL provided in format_avro_schema_registry_url works in the browser.

CREATE TABLE demo
(
    session_id String,
    event_time DateTime('US/Pacific') Codec(DoubleDelta, LZ4)
)
ENGINE = Kafka()
SETTINGS
    kafka_broker_list = 'pkc-pgq85.us-west-2.aws.confluent.cloud:9092',
    kafka_topic_list = 'topic',
    kafka_group_name = 'cgname',
    kafka_format = 'AvroConfluent',
    format_avro_schema_registry_url = 'https://OZIHXXXXXXXXXX:WIs%2F8x2BngyrahGkl%2B%2FR%2ByOX8a95pi%XXXXXXXXXXXXXX@psrc-gn6wr.us-east-2.aws.confluent.cloud';

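The percent-encoding of the secret described in the CREATE TABLE item above can be sketched as follows. This is only an illustration: the key, secret, and host are hypothetical placeholders, and `build_registry_url` is a helper name made up for this example, not part of ClickHouse or Confluent tooling.

```python
from urllib.parse import quote


def build_registry_url(api_key: str, api_secret: str, host: str) -> str:
    """Embed basic-auth credentials in a URL, percent-encoding reserved characters.

    safe="" makes quote() encode "/" as well, which it leaves alone by default.
    """
    user = quote(api_key, safe="")
    password = quote(api_secret, safe="")
    return f"https://{user}:{password}@{host}"


# Hypothetical credentials; the secret contains "/" and "+", like the one in the report.
url = build_registry_url("MYKEY", "ab/cd+ef", "registry.example.com")
print(url)  # https://MYKEY:ab%2Fcd%2Bef@registry.example.com
```

A URL built this way is what the report passes to format_avro_schema_registry_url; whether the consumer of that URL actually decodes and uses the credentials is a separate question.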

Expected behavior

I would expect either a separate configuration option for providing authentication information, or for the URL with URL-encoded basic authentication credentials to work.

Error message and/or stacktrace

I noticed the following in the logs:

 <Error> void DB::StorageKafka::threadFunc(size_t): Code: 86, e.displayText() = DB::Exception: Received error from remote server /schemas/ids/100017. HTTP status code: 401 Unauthorized, body: 
{
    "error_code": 401,
    "message": "Unauthorized"
}
: while fetching schema id = 100017: while parsing Kafka message (topic: predictions-keystroke-demo, partition: 1, offset: 9025)', Stack trace (when copying this message, always include the lines below):

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 17 (7 by maintainers)

Most upvoted comments

Hi,

This problem is related only to schema registry authentication, not to the encoding of the URL. I tested locally with an nginx reverse proxy, with and without auth enabled: without auth it works regardless of the URL encoding, and with auth enabled it does not work.

format_avro_schema_registry_url = 'http://SDDE%2hjkdkk:DAj%2kl09hasERd@localhost:8088/'

ClickHouse does not expect a URL with auth credentials in format_avro_schema_registry_url; it expects a plain URL, because it does not perform any authentication challenge:

https://github.com/ClickHouse/ClickHouse/blob/7448cd2110c1f5684415bfc064e4e7aa75db12a8/src/Processors/Formats/Impl/AvroRowInputFormat.cpp#L777

It just treats base_url as a Poco::URI; AFAIK, no authentication is involved.
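To make the gap concrete, here is a minimal sketch of what a client would have to do for a URL with embedded credentials to authenticate: pull the user info out of the URL, percent-decode it, and send it as a Basic Authorization header. This is written in Python purely for illustration and is not the ClickHouse implementation; `split_credentials` is a hypothetical helper name, and the credentials in the usage line are made up.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit


def split_credentials(url: str):
    """Return (url_without_userinfo, headers) for a URL that may embed credentials."""
    parts = urlsplit(url)
    if parts.username is None:
        return url, {}
    # urlsplit() does not percent-decode the user info, so decode it explicitly.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # Rebuild the authority without the "user:password@" prefix
    # (simplification: IPv6 literals are not handled here).
    host = parts.hostname or ""
    netloc = f"{host}:{parts.port}" if parts.port else host
    clean = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return clean, {"Authorization": f"Basic {token}"}


clean_url, headers = split_credentials("http://user:pa%2Fss@localhost:8088/")
print(clean_url)  # http://localhost:8088/
```

A client that skips this step sends the request with no Authorization header at all, which would be consistent with the 401 seen in the logs regardless of how the credentials were encoded.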

So a temporary workaround is to use nginx as a reverse proxy with auth disabled. It would be nice if this could be fast-tracked and authentication added to the format_avro_schema_registry_url logic.
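One reading of this workaround is a local proxy that accepts unauthenticated requests and adds the credentials before forwarding them to the real registry. The comment uses nginx for this; the sketch below swaps in a small Python standard-library server only to show the idea. The upstream host, key, secret, and port are hypothetical placeholders, and error handling is deliberately minimal (connection failures to the upstream are not handled).

```python
import base64
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical placeholders, not values from the issue.
UPSTREAM = "https://registry.example.com"
API_KEY = "MYKEY"
API_SECRET = "ab/cd+ef"


def upstream_request(path: str) -> urllib.request.Request:
    """Build the upstream request with a Basic Authorization header added."""
    token = base64.b64encode(f"{API_KEY}:{API_SECRET}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(UPSTREAM + path, headers={"Authorization": f"Basic {token}"})


class RegistryProxy(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Forward the GET upstream and relay status, content type, and body.
        try:
            with urllib.request.urlopen(upstream_request(self.path)) as resp:
                status, body = resp.status, resp.read()
                content_type = resp.headers.get("Content-Type", "application/json")
        except urllib.error.HTTPError as err:
            status, body = err.code, err.read()
            content_type = err.headers.get("Content-Type", "application/json")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8088) -> None:
    """Listen on localhost only, since the proxy itself is unauthenticated."""
    HTTPServer(("127.0.0.1", port), RegistryProxy).serve_forever()
```

Calling `serve()` starts the proxy; under these assumptions, format_avro_schema_registry_url would then point at 'http://localhost:8088' with no credentials in it. Binding to 127.0.0.1 is a deliberate choice here, because anything that can reach the proxy can read schemas without authenticating.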

Thanks for your clarification. I got it! I will fix it ASAP.