Posted to dev@tez.apache.org by "Mustafa Iman (Jira)" <ji...@apache.org> on 2020/05/12 21:57:00 UTC

[jira] [Created] (TEZ-4177) Improve error message for external orc table

Mustafa Iman created TEZ-4177:
---------------------------------

             Summary: Improve error message for external orc table
                 Key: TEZ-4177
                 URL: https://issues.apache.org/jira/browse/TEZ-4177
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Mustafa Iman
            Assignee: Mustafa Iman


Since there is no schema validation for external tables, users may face various errors if their ORC data and external table schema do not match. If the ORC schema has fewer columns than the projection, OrcEncodedDataConsumer may receive an incomplete TypeDescription array, which later manifests itself as a NullPointerException.

We can at least verify that OrcEncodedDataConsumer gets enough TypeDescriptions. If the assertion fails, the user sees that something is wrong with the schema and can hopefully resolve the problem quickly. If there are enough columns in the file but the schema of the query does not match, the user generally sees a ClassCastException. If there are enough columns and the types accidentally match, there is nothing we can do, as this is an external table.
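
A rough sketch of the kind of check described above, in plain Java. It is only an illustration, not the actual patch: the class name SchemaCheckSketch, the method verifyEnoughTypes and the wording of the message are hypothetical, and a list of type names stands in for the ORC TypeDescription array that OrcEncodedDataConsumer receives.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the check; not code from the Hive/Tez source tree.
class SchemaCheckSketch {

    /**
     * Fails fast with a descriptive message when the projection refers to a
     * column index that has no type in the file schema. Without such a check,
     * the missing entry would only surface later as a NullPointerException.
     */
    public static void verifyEnoughTypes(List<String> fileColumnTypes, int[] projectedColumnIds) {
        for (int columnId : projectedColumnIds) {
            if (columnId < 0 || columnId >= fileColumnTypes.size()) {
                throw new IllegalStateException(
                    "ORC file schema has " + fileColumnTypes.size()
                    + " column(s) but the query projects column index " + columnId
                    + "; the external table schema probably does not match the data files");
            }
        }
    }

    public static void main(String[] args) {
        // Assumed example: the file has two columns.
        List<String> fileTypes = Arrays.asList("int", "string");

        // The projection fits the file schema, so nothing is thrown.
        verifyEnoughTypes(fileTypes, new int[] {0, 1});

        // The projection asks for a third column that the file does not have.
        try {
            verifyEnoughTypes(fileTypes, new int[] {0, 1, 2});
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check covers only the "too few columns" case. A file with enough columns but mismatched types would pass it and still fail later, typically with a ClassCastException.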

We have seen this when trying to use a managed table as the location of an external table. Although the user-facing schemas are the same, the managed table has ACID-related metadata. I am adding a q file that demonstrates the NullPointerException with TestMiniLlapLocalCliDriver, along with the output after the fix. I haven't added this to the precommit tests, as it is hard to assert the exception message from the mini driver framework and the change effectively only alters the error.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)