trino: Fetching all hive metadata failed issues
Hi all. I’m trying to find a solution to a failure when fetching metadata for all Hive tables. Here are the details.
Environment
Trino version - 419
Hive version - 2.3.3 or 3.1.3
hive.metastore-timeout=5m
Problem
When I execute the query below, it fails with a Hive metastore timeout error. On Trino 417, the same query completes within 3 to 4 minutes.
select * from hive.information_schema.tables
Root Cause
https://github.com/trinodb/trino/pull/17127
Since version 418, the logic for fetching all Hive metadata has changed.
Versions before 418 do the following:
- get all schemas
- get all tables in each schema
- concatenate the results and return them
Versions 418 and later do the following:
- get all tables at once
- concatenate the results and return them
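To make the difference between the two strategies concrete, here is a minimal Python sketch. It is only an illustration under assumptions: `FakeMetastore` and its methods are hypothetical in-memory stand-ins, not the real Trino code or the real metastore Thrift API, and the schema and table counts are made up. It only shows how the number of calls and the size of the largest single response differ between the two approaches.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMetastore:
    """Hypothetical in-memory stand-in for a Hive metastore (not a real client API)."""

    tables_by_schema: dict[str, list[str]]
    # Number of table names returned by each table-listing call, in call order.
    response_sizes: list[int] = field(default_factory=list)

    def get_all_schemas(self) -> list[str]:
        return list(self.tables_by_schema)

    def get_tables(self, schema: str) -> list[str]:
        tables = list(self.tables_by_schema[schema])
        self.response_sizes.append(len(tables))
        return tables

    def get_all_tables(self) -> list[tuple[str, str]]:
        result = [
            (schema, table)
            for schema, tables in self.tables_by_schema.items()
            for table in tables
        ]
        self.response_sizes.append(len(result))
        return result


def list_tables_per_schema(ms: FakeMetastore) -> list[tuple[str, str]]:
    """Strategy described for versions before 418: one listing call per schema."""
    result = []
    for schema in ms.get_all_schemas():
        result.extend((schema, table) for table in ms.get_tables(schema))
    return result


def list_tables_at_once(ms: FakeMetastore) -> list[tuple[str, str]]:
    """Strategy described for versions 418 and later: a single call for everything."""
    return ms.get_all_tables()


def make_metastore() -> FakeMetastore:
    # Made-up sizes for illustration: 50 schemas with 100 tables each.
    return FakeMetastore(
        {f"schema_{i}": [f"table_{j}" for j in range(100)] for i in range(50)}
    )


old_style = make_metastore()
new_style = make_metastore()
per_schema_result = list_tables_per_schema(old_style)
at_once_result = list_tables_at_once(new_style)

# Both strategies return the same set of tables.
assert sorted(per_schema_result) == sorted(at_once_result)

# Many small responses versus one large response.
print(len(old_style.response_sizes), max(old_style.response_sizes))  # 50 100
print(len(new_style.response_sizes), max(new_style.response_sizes))  # 1 5000
```

In this toy model the per-schema strategy makes many calls that each return a small response, while the all-at-once strategy makes a single call whose response holds every table. With a very large number of tables, that single response is the part that would have to be built and returned within one metastore timeout.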
This change may put too much load on the Hive metastore, which then needs much more memory than before. In my case there are around 500,000 tables, so it definitely puts too much stress on the Hive metastore.
Is there any solution for this?
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 21 (13 by maintainers)
So this is kind of a bug or issue in HMS. However, since we now expose the issue, should we have a kill switch for https://github.com/trinodb/trino/pull/17127? @findepi @huberty89
Hi @kokosing @huberty89. I’ve tested #18274 and confirmed that the query executes successfully. Thanks for the support!
@huberty89 would you like to post a PR and create a kill switch?