hono: Finding credentials by tenant-id, auth-id and type does not use MongoDb index

The MongoDbDocumentBuilder class uses MongoDb’s $elemMatch operator for filtering credentials by tenant-id, auth-id and type. For this purpose an index on those fields was created. When using MongoDb’s explain it turned out that this index is in fact not used:

db.getCollection('device-credentials').find({
    "tenant-id": "foobar",
    "credentials": {
      "$elemMatch": {
        "auth-id": "spamegg",
        "type": "hashed-password"
      }
    }
  }).explain()
  
# RETURNS
{
    "queryPlanner" : {
        ...
        },
        "winningPlan" : {
            ...
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "tenant-id" : 1,
                    "device-id" : 1
                },
                "indexName" : "tenant-id_1_device-id_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "tenant-id" : [],
                    "device-id" : []
                },
                "isUnique" : true,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "tenant-id" : [ 
                        "[\"foobar\", \"foobar\"]"
                    ],
                    "device-id" : [ 
                        "[MinKey, MaxKey]"
                    ]
                }
            }
        },
        "rejectedPlans" : []
    },
    ...
}

Actually an index on tenant-id and device-id is used and a scan is done for all device ids (see the “[MinKey, MaxKey]” value for “device-id”.

As far as I can see there is no need to the $elemMatch operator for this query as it is only a simple match of two values and not more complicated match resorting to other MongoDb funtions ($gt, $and, $or,…).

If the query is done without the $elemMatch operator the index is used as expected:

db.getCollection('device-credentials').find({
    "tenant-id": "foobar",
    "credentials.auth-id": "spamegg",
    "credentials.type": "hashed-password"
  }).explain()
  
# RETURNS
{
    "queryPlanner" : {
        ...
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "credentials.type" : {
                    "$eq" : "hashed-password"
                }
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "tenant-id" : 1,
                    "credentials.auth-id" : 1,
                    "credentials.type" : 1
                },
                "indexName" : "tenant-id_1_credentials.auth-id_1_credentials.type_1",
                "isMultiKey" : true,
                "multiKeyPaths" : {
                    "tenant-id" : [],
                    "credentials.auth-id" : [ 
                        "credentials"
                    ],
                    "credentials.type" : [ 
                        "credentials"
                    ]
                },
                "isUnique" : true,
                "isSparse" : false,
                "isPartial" : true,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "tenant-id" : [ 
                        "[\"foobar\", \"foobar\"]"
                    ],
                    "credentials.auth-id" : [ 
                        "[\"spamegg\", \"spamegg\"]"
                    ],
                    "credentials.type" : [ 
                        "[MinKey, MaxKey]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ 
            ...
        ]
    },
    ...
}

From https://github.com/eclipse/hono/commit/d75a28e7276a055fd5dc72264f8016355cb80223 I understand that the $elemMatch operator was used since the credentials to be returned should not contain sensitive information. Hence MongoDb’s projection operator was used. This did not work with a “plain” filter which is why the $elemMatch operator was introduced.

Since the result set returned by the query should usually be only one credential (consequently there’s no big advantage using the projection so that traffic is reduced), I think that commit could be reverted and the sensitive fields could be removed “in code”.

WDYT?

About this issue

Original URL
State: closed
Created 3 years ago
Comments: 19 (19 by maintainers)

Commits related to this issue

[#2792] Use index when finding credentials by type and auth-id The MongoDB based Credentials service implementation does not use the existing index for finding credentials by tenant, auth-id and type... — committed to bosch-io/hono by sophokles73 3 years ago
[#2792] Use index when finding credentials by type and auth-id The MongoDB based Credentials service implementation does not use the existing index for finding credentials by tenant, auth-id and type... — committed to bosch-io/hono by sophokles73 3 years ago
[#2792] Use index when finding credentials by type and auth-id The MongoDB based Credentials service implementation does not use the existing index for finding credentials by tenant, auth-id and type... — committed to bosch-io/hono by sophokles73 3 years ago
[#2792] Use index when finding credentials by type and auth-id The MongoDB based Credentials service implementation does not use the existing index for finding credentials by tenant, auth-id and type... — committed to eclipse-hono/hono by sophokles73 3 years ago
[#2792] Use index when finding credentials by type and auth-id The MongoDB based Credentials service implementation does not use the existing index for finding credentials by tenant, auth-id and type... — committed to eclipse-hono/hono by sophokles73 3 years ago
[#2792] Use index when finding credentials by type and auth-id The MongoDB based Credentials service implementation does not use the existing index for finding credentials by tenant, auth-id and type... — committed to eclipse-hono/hono by sophokles73 3 years ago

Most upvoted comments

Ok, it is not a partial index. So, we need the unique index on (tenant-id,auth-id,type) in order to enforce the existing constraints when adding credentials and we can use your new index to speed up querying 😃

sophokles73 on Jul 27, 2021

Nice idea. Execution time will mostly depend on the number of tenants that contain credentials with the same (auth-id,type) tuple then. But IMHO we can assume that to be not that many.

sophokles73 on Jul 27, 2021

Since the result set returned by the query should usually be only one credential (consequently there’s no big advantage using the projection so that traffic is reduced), I think that commit could be reverted and the sensitive fields could be removed “in code”.

If I am not mistaken then the result set will contain all of the device’s credentials if the projection is omitted. We can, of course, manually filter the result set before returning it to the client …

sophokles73 on Jul 26, 2021