trino: Column type mismatch between partition schema and table schema (table schema is string, partition schema is decimal(4,0))

A query fails when the table schema declares a column as string but the partition schema declares it as decimal. The error message is below:

Query failed (#1111): There is a mismatch between the table and partition schemas. The types are incompatible and cannot be coerced. The column 'aaaa' in table 'ttttt' is declared as type 'string', but partition 'pppp' declared column 'aaaa' as type 'decimal(4,0)'. [DB Errorcode=16777224]

The Hive Metastore has column aaaa defined as type string, but older Parquet partitions (like pppp) have that column defined as type decimal. Hive is able to read from the older partitions.

We ended up in this situation because the field type had to be changed from decimal to string for business reasons.

Can Presto cast decimal to string in the above situation and return a result? (I am not asking for string-to-decimal casting.)
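To make the request concrete: a decimal value can be thought of as an unscaled integer plus a scale, so reading a decimal(4,0) partition column through a string table column amounts to rendering that number as text. The following is a minimal sketch in plain Java, not Presto code. The class name `DecimalToVarchar` and the method `render` are hypothetical, and whether Hive formats every decimal exactly this way (for example, trailing zeros at non-zero scales) is an assumption that would need checking against Hive.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of Presto or Hive: renders a decimal given as
// (unscaled value, scale) as plain text, which is roughly what a
// decimal-to-string coercion would have to produce for each row.
public final class DecimalToVarchar {
    private DecimalToVarchar() {}

    public static String render(long unscaledValue, int scale) {
        // toPlainString avoids scientific notation, unlike toString
        return new BigDecimal(BigInteger.valueOf(unscaledValue), scale).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(render(1234L, 0)); // a decimal(4,0) value
        System.out.println(render(1234L, 2)); // the same digits with scale 2
    }
}
```

The reverse direction (string to decimal) can fail on non-numeric text, which is why only decimal to string is being asked for here.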

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16 (11 by maintainers)

Most upvoted comments

To set up Hive, you can use the following.

Hive 1.2

./presto-product-tests-launcher/bin/run-launcher env up --environment singlenode --without-presto

Hive 3.1

./presto-product-tests-launcher/bin/run-launcher env up --environment singlenode-hdp3 --without-presto

Our general rule for the Hive connector is that it should have the same behavior as Hive. If Hive can read the data in this evolution case, then we should as well (returning the same result).

I’m willing to pick this up starting the coming weekend. This is a pain point for me too.

The existing workaround is to create a new external table with the problematic column removed, but it would be nice to support such conversions where possible.

The Hive connector should support any conversion that Hive supports consistently (certain conversions work differently for different file formats, which is harder). Any numeric type to string should work. The change starts with HiveCoercionPolicy, and the conversion itself likely needs to be implemented elsewhere.

This commit is a good starting point to see what needs to be changed and where the tests go: https://github.com/prestosql/presto/commit/f0eb4f32f34d2e12eb1bdc0913b3a0602bbac642
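The rule described above (any numeric partition type may be read as string) can be sketched as a small type check. This is only an illustration in plain Java under stated assumptions: `CoercionPolicySketch`, `canCoerce`, and `baseName` are hypothetical names, types are modeled as Hive type strings, and none of this is the actual HiveCoercionPolicy API. The sketch covers only the numeric-to-string direction discussed in this issue and rejects everything else.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a coercion policy check; NOT the real
// HiveCoercionPolicy class. Types are plain Hive type strings here.
public final class CoercionPolicySketch {
    // Base names of Hive numeric types considered in this sketch
    private static final Set<String> NUMERIC_BASES =
            Set.of("tinyint", "smallint", "int", "bigint", "float", "double", "decimal");

    private CoercionPolicySketch() {}

    // Strips type parameters and normalizes case: "DECIMAL(4,0)" -> "decimal"
    static String baseName(String hiveType) {
        String lower = hiveType.trim().toLowerCase(Locale.ROOT);
        int paren = lower.indexOf('(');
        return paren < 0 ? lower : lower.substring(0, paren);
    }

    // Returns true when a partition column of partitionType may be read
    // through a table column of tableType.
    public static boolean canCoerce(String partitionType, String tableType) {
        if (partitionType.trim().equalsIgnoreCase(tableType.trim())) {
            return true; // identical types need no coercion
        }
        // The case from this issue: numeric partition column, string table column
        return NUMERIC_BASES.contains(baseName(partitionType))
                && baseName(tableType).equals("string");
    }

    public static void main(String[] args) {
        System.out.println(canCoerce("decimal(4,0)", "string"));
        System.out.println(canCoerce("string", "decimal(4,0)"));
    }
}
```

Allowing the check is only half of the work; the reader for each file format would also have to convert the values, which is the part that "likely needs to actually be implemented elsewhere".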