MQTTnet: AWS IoT Core disconnects with an AUTHORIZATION_ERROR randomly between 30 seconds and 5 minutes
An example project that reproduces the issue and automates the deployment of the AWS resources can be found here: https://github.com/TCROC/aws-iot-custom-auth
And for Windows 10 users who may have issues cross compiling for ARM64 Linux, a precompiled zip is here: https://github.com/TCROC/aws-iot-custom-auth/releases/download/precompiled-arm64-lambda/aws-iot-auth-issues.zip
For ease of reading, I’ll copy the README from that project in here as it explains the issue at hand:
README
aws-iot-custom-auth
Dependencies
Tested on Ubuntu 22.04 and Windows 10.
Windows 10 requires WSL Ubuntu 22.04 for cross compiling to ARM64 processors.
- Install git: https://git-scm.com/downloads
- NOTE: Reproduced with version:
git version 2.40.1
- Install the rust toolset: https://www.rust-lang.org/tools/install
- NOTE: Reproduced with version:
rustup 1.26.0 (5af9b9484 2023-04-05), cargo 1.69.0 (6e9a83356 2023-04-12), rustc 1.69.0 (84c898d65 2023-04-16)
- Install cargo lambda: https://github.com/awslabs/aws-lambda-rust-runtime
- NOTE: Reproduced with version:
cargo-lambda 0.19.0 (e7a2b99 2023-04-07Z)
- Install aws cli v2: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
- NOTE: Reproduced with version:
aws-cli/2.11.15 Python/3.11.3 Linux/6.2.6-76060206-generic exe/x86_64.pop.22 prompt/off
- Install dotnet 7.0: https://dotnet.microsoft.com/en-us/download/dotnet/7.0
- NOTE: Reproduced with version:
7.0.203
- Clone:
git clone https://github.com/TCROC/aws-iot-custom-auth.git --recurse-submodules
NOTE: When running the scripts, you can ignore the AWS CLI errors that are logged. The scripts do things such as checking whether the lambda function is deployed by calling aws lambda get-function. If the command errors, the script assumes the function doesn’t exist in the cloud and attempts to create it.
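The check-then-create pattern described in the note can be sketched roughly as follows. This is a Python illustration only, not the repository's bash code: `function_exists` and its `run` parameter are hypothetical names, and the only real CLI call assumed is `aws lambda get-function --function-name <name>`.

```python
import subprocess


def function_exists(name, run=subprocess.run):
    """Return True if `aws lambda get-function` succeeds for `name`.

    A non-zero exit code is treated as "not deployed yet". The scripts let
    the CLI error reach the log; this sketch captures the output instead.
    `run` is injectable (hypothetical parameter) so the check can be
    exercised without the AWS CLI installed.
    """
    result = run(
        ["aws", "lambda", "get-function", "--function-name", name],
        capture_output=True,
    )
    return result.returncode == 0
```

A caller would then create the function only when `function_exists(...)` returns False, which is why the logged error is harmless.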
Create Lambda Authorizer
Run in a bash shell:
./create-lambda.sh
Create certificate
Run in a bash shell:
./create-cert.sh
Test Lambda Authorizer
Run in a bash shell:
./run-client-lambda.sh
Expected result: The MQTT client stays connected and sends keep-alive packets for 24 hours, as specified in the policy returned from the lambda function.
Actual result: The MQTT client is disconnected after anywhere between 30 seconds and 5 minutes.
Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/config-custom-auth.html
You can test the response of the authorizer in the console: https://docs.aws.amazon.com/lambda/latest/dg/testing-functions.html
Example test event:
NOTE: The password is testpassword base64 encoded
{
"token": "aToken",
"signatureVerified": false,
"protocols": [
"tls",
"http",
"mqtt"
],
"protocolData": {
"tls": {
"serverName": "serverName"
},
"http": {
"headers": {
"#{name}": "#{value}"
},
"queryString": "?#{name}=#{value}"
},
"mqtt": {
"username": "test",
"password": "dGVzdHBhc3N3b3Jk",
"clientId": "testid"
}
},
"connectionMetadata": {
"id": "UUID"
}
}
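As a quick sanity check of the note about the password, the `password` value in the test event can be reproduced with a few lines of Python. Python is used here only for illustration; it is not part of the project.

```python
import base64

# Encode the plain-text password the way it appears in the test event.
encoded = base64.b64encode(b"testpassword").decode("ascii")

# Decode the value from the event back to plain text.
decoded = base64.b64decode("dGVzdHBhc3N3b3Jk").decode("utf-8")

print(encoded)
print(decoded)
```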
Example result:
{
"isAuthenticated": true,
"principalId": "testid",
"disconnectAfterInSeconds": 86400,
"refreshAfterInSeconds": 86400,
"policyDocuments": [
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"iot:Connect"
],
"Resource": [
"arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
],
"Condition": {
"ArnEquals": {
"iot:LastWillTopic": [
"arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
]
}
}
},
{
"Effect": "Allow",
"Action": [
"iot:Receive"
],
"Resource": [
"arn:aws:iot:us-east-1:144868213084:topic/open/*"
],
"Condition": {}
},
{
"Effect": "Allow",
"Action": [
"iot:Publish"
],
"Resource": [
"arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
"arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
"arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
],
"Condition": {}
},
{
"Effect": "Allow",
"Action": [
"iot:Subscribe"
],
"Resource": [
"arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
"arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
"arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
"arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
],
"Condition": {}
}
]
}
]
}
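To make the relationship between the test event and the result concrete, here is a minimal sketch of an authorizer handler written in Python. It is not the project's implementation (the repository builds its lambda with cargo lambda), and the constants `REGION`, `ACCOUNT_ID` and `EXPECTED_PASSWORD` are placeholder assumptions. Only the `iot:Connect` statement is reproduced; the other statements would be added the same way.

```python
import base64

# Placeholder values, assumed for illustration only.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"
EXPECTED_PASSWORD = "testpassword"
SESSION_SECONDS = 86400  # 24 hours, as in the example result


def handler(event, context=None):
    """Return an authorizer response shaped like the example result."""
    mqtt = event.get("protocolData", {}).get("mqtt", {})
    client_id = mqtt.get("clientId", "")
    try:
        # The MQTT password arrives base64 encoded in the event.
        password = base64.b64decode(mqtt.get("password", "")).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        password = None

    if not client_id or password != EXPECTED_PASSWORD:
        # Assumption: a bare rejection is sufficient for this sketch.
        return {"isAuthenticated": False}

    prefix = "arn:aws:iot:" + REGION + ":" + ACCOUNT_ID
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["iot:Connect"],
                "Resource": [prefix + ":client/${iot:ClientId}"],
            }
        ],
    }
    return {
        "isAuthenticated": True,
        "principalId": client_id,
        "disconnectAfterInSeconds": SESSION_SECONDS,
        "refreshAfterInSeconds": SESSION_SECONDS,
        "policyDocuments": [policy],
    }
```

Calling `handler` with the example test event yields a response with `disconnectAfterInSeconds` set to 86400, which is why the client is expected to stay connected for 24 hours.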
Test Lambda Certificates
Run in a bash shell:
./run-client-cert.sh
Expected result: The MQTT client authenticates and connects to IoT.
Actual result: The client is immediately disconnected due to an authorization error.
Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/x509-client-certs.html
Cleanup
This deletes the lambda functions, authorizers, and certificates created in AWS.
Run in a bash shell:
./clean-aws.sh
Other Information
I’ve tested this in Unity / Mono and dotnet 7 with websocket4net, dotnet, tcp, and websocket transports. The issue reproduces in all of them.
I am also in discussions with AWS on this particular issue. At the moment, we have not been able to determine whether it is a client-side issue with MQTTnet or an issue on AWS’s side.
I’ve been in an email chain with AWS Support and have recently opened up a ticket here: https://repost.aws/questions/QU-cOKeWC1TACCu_LVjS5BWw/iot-custom-authorizer-not-respecting-refreshafterinseconds-or-disconnectafterinseconds-in-returned-policy
I’m hoping we can narrow down whose end the issue is on and get this resolved soon, as it is currently holding up the mobile release of our project: https://store.steampowered.com/app/1343040/Blocky_Ball/
We are using IoT with custom authorizers to bridge AWS and Microsoft Azure PlayFab.
If you would like me to organize a debugging session between this team and the AWS team, let me know and I can see if I can pull some people together.
About this issue
- State: closed
- Created a year ago
- Comments: 47 (7 by maintainers)
Glad to hear that. I will also create a PR with the changes I made because they are useful for others.
Thank you 😃
I wouldn’t mind having another look at the Last Will condition sometime towards the end of the week, so I’d suggest keeping the issue open for a bit longer.
Ok cool. I just updated my authorizer to “allow all”. 🤞 it stays connected.
These were both run with dotnet 7 on Linux. I’ll go boot up my Windows machine and check real quick, one sec.
What makes me wonder, after just a quick look over the logs, is that there is the following line (in the last log):
How can it be that the connection is accepted AND the reason code is “not authorized” at the same time?
I will check the code…