ClickHouse: Kafka Engine returns 401 Authentication issue when connecting to Confluent schema registry with a URL-encoded basic authentication URL
Describe the unexpected behavior
I was using the Kafka engine to consume a Kafka topic, and everything worked with a JSON topic. When I switched to Avro and needed to connect to the Confluent Cloud schema registry using a URL with URL-encoded basic authentication credentials, I got a 401 authentication error.
How to reproduce
- Which ClickHouse server version to use: 21.5.5.12 (I also tried 21.12.2.17 and it didn’t work either)
- Non-default settings, if any: format_avro_schema_registry_url
- CREATE TABLE statements for all tables involved: I used the CREATE TABLE command below. Please note that in format_avro_schema_registry_url I had to URL-encode the secret because it contains /. The raw URL provided in format_avro_schema_registry_url works in the browser.
CREATE TABLE demo
(
    session_id String,
    event_time DateTime('US/Pacific') Codec(DoubleDelta, LZ4)
)
ENGINE = Kafka()
SETTINGS
    kafka_broker_list = 'pkc-pgq85.us-west-2.aws.confluent.cloud:9092',
    kafka_topic_list = 'topic',
    kafka_group_name = 'cgname',
    kafka_format = 'AvroConfluent',
    format_avro_schema_registry_url = 'https://OZIHXXXXXXXXXX:WIs%2F8x2BngyrahGkl%2B%2FR%2ByOX8a95pi%XXXXXXXXXXXXXX@psrc-gn6wr.us-east-2.aws.confluent.cloud';
- Sample data for all these tables, use clickhouse-obfuscator if necessary
- Queries to run that lead to unexpected result
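The URL-encoding step described in the report (escaping a secret that contains characters such as / or +) can be sketched in Python as follows. This is only an illustration of percent-encoding the userinfo part of a URL. The key, secret, and host are made-up placeholders and the helper name `encode_userinfo` is hypothetical; none of them come from the report.

```python
from urllib.parse import quote, unquote


def encode_userinfo(value: str) -> str:
    """Percent-encode a username or password for use in a URL.

    safe="" makes quote() escape "/" as well, which it leaves
    untouched by default.
    """
    return quote(value, safe="")


# Placeholder credentials and host (assumptions, not real values).
api_key = "EXAMPLEKEY"
api_secret = "abc/def+ghi"
host = "registry.example.com"

url = f"https://{encode_userinfo(api_key)}:{encode_userinfo(api_secret)}@{host}"
print(url)

# Decoding the encoded secret gives back the original value.
assert unquote(encode_userinfo(api_secret)) == api_secret
```

Encoding only makes the URL syntactically valid; whether the client then turns the userinfo into an authentication header is a separate question.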
Expected behavior
I would expect either a separate setting for providing authentication credentials, or for the URL with URL-encoded basic authentication credentials to work.
Error message and/or stacktrace
I noticed the following in the logs:
<Error> void DB::StorageKafka::threadFunc(size_t): Code: 86, e.displayText() = DB::Exception: Received error from remote server /schemas/ids/100017. HTTP status code: 401 Unauthorized, body:
{
"error_code": 401,
"message": "Unauthorized"
}
: while fetching schema id = 100017: while parsing Kafka message (topic: predictions-keystroke-demo, partition: 1, offset: 9025)', Stack trace (when copying this message, always include the lines below):
About this issue
- State: closed
- Created 3 years ago
- Reactions: 2
- Comments: 17 (7 by maintainers)
Should be fixed in https://github.com/ClickHouse/ClickHouse/pull/49664
Hi,
This problem is related only to schema registry authentication, not to the encoding of the URL. I tested locally with an nginx reverse proxy with and without auth enabled: without auth it works regardless of the URL encoding, and with auth enabled it does not work.
format_avro_schema_registry_url = 'http://SDDE%2hjkdkk:DAj%2kl09hasERd@localhost:8088/'
ClickHouse does not expect a URL with auth credentials in format_avro_schema_registry_url. It expects a plain URL, because it will not perform any authentication challenge: https://github.com/ClickHouse/ClickHouse/blob/7448cd2110c1f5684415bfc064e4e7aa75db12a8/src/Processors/Formats/Impl/AvroRowInputFormat.cpp#L777
It just treats base_url as a Poco::URI; AFAIK no authentication is involved. So a temporary workaround is to use nginx as a reverse proxy, disabling auth. It would be nice if this could be fast-tracked and authentication could be added to the format_avro_schema_registry_url logic.
Thanks for your clarification. I got it! I will fix it ASAP.
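For readers following the thread, the step identified as missing (turning credentials embedded in the URL into an HTTP Basic authentication header) can be sketched in Python. This is a minimal illustration under stated assumptions, not ClickHouse's implementation and not necessarily what the linked pull request does. The function name `split_basic_auth` and the example URL are hypothetical.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit


def split_basic_auth(url: str):
    """Return (url_without_credentials, headers) for a URL that may
    carry percent-encoded `user:password@` userinfo."""
    parts = urlsplit(url)

    # Keep only the host[:port] part of the authority.
    host = parts.netloc.rpartition("@")[2]
    bare_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )

    if parts.username is None:
        # No credentials in the URL: nothing to add.
        return bare_url, {}

    # urlsplit() does not decode userinfo, so undo the percent-encoding
    # before building the header value.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return bare_url, {"Authorization": f"Basic {token}"}


# Hypothetical example: a local registry behind basic auth.
bare, headers = split_basic_auth(
    "http://user%2Fname:pa%2Bss@localhost:8088/schemas/ids/1"
)
print(bare)
print(headers)
```

A client that sends the original URL without a header derived this way would receive the same 401 response shown in the logs above.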