Posted to issues@drill.apache.org by "Paul Rogers (Jira)" <ji...@apache.org> on 2020/01/27 01:06:00 UTC
[jira] [Comment Edited] (DRILL-7551) Improve Error Reporting

    [ https://issues.apache.org/jira/browse/DRILL-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023322#comment-17023322 ] 

Paul Rogers edited comment on DRILL-7551 at 1/27/20 1:05 AM:
-------------------------------------------------------------

Fixing errors has a number of dimensions:
 # Inconsistent use of exceptions at runtime. We have {{UserException}}, which creates some structure, but we also throw random other unchecked exceptions. {{UserException}}s do not, however, provide a mapping into SQL errors of the type understood by xDBC drivers.
 # Inconsistent error context. A low-level bit of code (a file open call, say) only knows that it failed, and that is what it tends to report ("IO Error 10"). At the next level up, the surrounding code might know a bit more ("Error reading HDFS:/foo/bar1234.parquet"). What we need is a bit of synthesis to say, "Too many network timeouts reading block 17 from the file bar1234.parquet of the `foo` table stored in the HDFS system `sales`."
 # Errors are exceptions, and we are overly generous in showing every last bit of stack trace on the client, the server and so on. Even those of us who live in the code find that the few lines we care about (NPE in such-and-such call stack) are lost in hundreds of lines that, frankly, I've never personally looked at.
 # The client API is a bit of a mess in error reporting: returning unchecked {{UserException}}s rather than a well-structured {{DrillException}} (say) designed for client use. (This is probably because the Drill client was a quick short-term solution based on Drill's internal Drillbit-to-Drillbit RPC.)
 # Catch errors as early as possible. Examples: plan-time type checking (eventually), storage plugin validation in the UI (see comment below).
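
To make item 2 concrete, here is a minimal sketch of how each layer could add the context that only it knows, so the client sees one synthesized message rather than a stack trace. Everything here is hypothetical: {{ContextualException}}, {{addContext}} and {{userMessage}} are illustrative names, not Drill's actual API, and the file, table and plugin names are the made-up ones from the example above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these classes are part of Drill.
class ContextualException extends RuntimeException {
  private final List<String> context = new ArrayList<>();

  ContextualException(String message, Throwable cause) {
    super(message, cause);
  }

  // Each layer that handles the exception adds what only it knows.
  ContextualException addContext(String key, Object value) {
    context.add(key + ": " + value);
    return this;
  }

  // One user-facing message: summary first, then the accumulated context.
  // The stack trace stays on the exception for the server log.
  String userMessage() {
    StringBuilder sb = new StringBuilder(getMessage());
    for (String entry : context) {
      sb.append('\n').append("  ").append(entry);
    }
    return sb.toString();
  }
}

class ErrorContextSketch {
  // Lowest level: only knows that the I/O call failed.
  static void readBlock(int block) throws IOException {
    throw new IOException("IO Error 10");
  }

  // File level: knows which file and block were being read.
  static void readFile(String file) {
    try {
      readBlock(17);
    } catch (IOException e) {
      throw new ContextualException("Failed to read data", e)
          .addContext("Block", 17)
          .addContext("File", file);
    }
  }

  // Scan level: knows the table and the storage system.
  static void scanTable(String storageSystem, String table) {
    try {
      readFile("bar1234.parquet");
    } catch (ContextualException e) {
      throw e.addContext("Table", table)
          .addContext("Storage system", storageSystem);
    }
  }

  public static void main(String[] args) {
    try {
      scanTable("sales", "foo");
    } catch (ContextualException e) {
      System.out.println(e.userMessage());
    }
  }
}
```

The design choice in this sketch is that context is attached as the exception unwinds, so no single layer has to know the whole story.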

In addition to the above execution-focused items, it would be good to look at the SQL parser/planner errors as well. Not sure that returning 20-30 lines of possible tokens is super-helpful when I make a SQL typo. Probably fine to say, "Didn't understand the SQL at line 10, position 3."
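
As a rough illustration of the shorter parse error suggested above, the sketch below formats a message from a 1-based line and column and points at the offending spot. The class and method names are hypothetical, and it assumes the parser can already supply the line and column; it is not how Drill's parser reports errors today.

```java
// Hypothetical helper, not part of Drill or its SQL parser.
class ParseErrorFormatter {
  // line and column are 1-based positions of the unexpected token.
  static String format(String sql, int line, int column) {
    String[] lines = sql.split("\n", -1);
    StringBuilder sb = new StringBuilder();
    sb.append("Didn't understand the SQL at line ").append(line)
        .append(", position ").append(column).append('.');
    if (line >= 1 && line <= lines.length) {
      // Echo the offending line and put a caret under the position.
      sb.append('\n').append(lines[line - 1]).append('\n');
      for (int i = 1; i < column; i++) {
        sb.append(' ');
      }
      sb.append('^');
    }
    return sb.toString();
  }
}
```

For instance, {{ParseErrorFormatter.format("SELECT a\nFRM t", 2, 1)}} yields the one-line message followed by the line {{FRM t}} and a caret under its first character.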

To clean up our error act, we must move forward on each of these fronts.

For my part, I've been chipping away at item 1: trying to convert all code to throw {{UserException}}. EVF provides an "error context" that helps (but does not solve) item 2. I've also made a pass on items 3 & 4, but have been hesitant to make any changes to the client API for fear of breaking the two JDBC drivers and our (currently unstaffed) C++ client.

Would be great to get some help. For example, how can we provide user-meaningful context in our errors (item 2)? How can we map errors into standard SQL error and warning codes (part of item 1)? Maybe someone can help us figure out how to achieve item 4 with minimal client impact. And, of course, once we set the pattern we want to use, everyone can help by improving each of the many places where we raise exceptions.
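
One possible shape for the SQL error code mapping, as a sketch only: classify each error into a small set of kinds and attach a SQLSTATE class code when building the {{java.sql.SQLException}} that a JDBC driver would surface. The {{ErrorKind}} enum and {{SqlErrorMapper}} class are hypothetical, and the choice of categories is an assumption, not a proposal for Drill's actual error types. The SQLSTATE values are the standard class codes ({{HY000}} is the ODBC/CLI "general error" code).

```java
import java.sql.SQLException;

// Hypothetical error categories; Drill's real error types may differ.
enum ErrorKind {
  SYNTAX("42000"),       // syntax error or access rule violation
  CONNECTION("08000"),   // connection exception
  UNSUPPORTED("0A000"),  // feature not supported
  DATA("22000"),         // data exception
  INTERNAL("HY000");     // general error

  private final String sqlState;

  ErrorKind(String sqlState) {
    this.sqlState = sqlState;
  }

  String sqlState() {
    return sqlState;
  }
}

// Hypothetical mapper from an internal error to a driver-level exception.
final class SqlErrorMapper {
  private SqlErrorMapper() { }

  static SQLException toSqlException(ErrorKind kind, String message, Throwable cause) {
    // SQLException(reason, SQLState, cause) is a standard JDBC constructor.
    return new SQLException(message, kind.sqlState(), cause);
  }
}
```

A client could then branch on {{getSQLState()}} instead of parsing message text.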

Item 5 can be done independently of other tasks.


was (Author: paul.rogers):
Fixing errors has a number of dimensions:
 # Inconsistent use of exceptions at runtime. We have {{UserException}}, which creates some structure, but we also throw random other unchecked exceptions. {{UserException}}s do not, however, provide a mapping into SQL errors of the type understood by xDBC drivers.
 # Inconsistent error context. A low-level bit of code (a file open call, say) only knows that it failed, and that is what it tends to report ("IO Error 10"). At the next level up, the surrounding code might know a bit more ("Error reading HDFS:/foo/bar1234.parquet"). What we need is a bit of synthesis to say, "Too many network timeouts reading block 17 from the file bar1234.parquet of the `foo` table stored in the HDFS system `sales`."
 # Errors are exceptions, and we are overly generous in showing every last bit of stack trace on the client, the server and so on. Even those of us who live in the code find that the few lines we care about (NPE in such-and-such call stack) are lost in hundreds of lines that, frankly, I've never personally looked at.
 # The client API is a bit of a mess in error reporting: returning unchecked {{UserException}}s rather than a well-structured {{DrillException}} (say) designed for client use. (This is probably because the Drill client was a quick short-term solution based on Drill's internal Drillbit-to-Drillbit RPC.)

In addition to the above execution-focused items, it would be good to look at the SQL parser/planner errors as well. Not sure that returning 20-30 lines of possible tokens is super-helpful when I make a SQL typo. Probably fine to say, "Didn't understand the SQL at line 10, position 3."

To clean up our error act, we must move forward on each of these fronts.

For my part, I've been chipping away at item 1: trying to convert all code to throw {{UserException}}. EVF provides an "error context" that helps (but does not solve) item 2. I've also made a pass on items 3 & 4, but have been hesitant to make any changes to the client API for fear of breaking the two JDBC drivers and our (currently unstaffed) C++ client.

Would be great to get some help. For example, how can we provide user-meaningful context in our errors (item 2)? How can we map errors into standard SQL error and warning codes (part of item 1)? Maybe someone can help us figure out how to achieve item 4 with minimal client impact. And, of course, once we set the pattern we want to use, everyone can help by improving each of the many places where we raise exceptions.

> Improve Error Reporting
> -----------------------
>
>                 Key: DRILL-7551
>                 URL: https://issues.apache.org/jira/browse/DRILL-7551
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.17.0
>            Reporter: Charles Givre
>            Priority: Major
>             Fix For: 1.18.0
>
>
> This Jira is to serve as a master Jira issue to improve the usability of error messages. Instead of dumping stack traces, the overall goal is to give the user something that can actually explain:
>  # What went wrong
>  # How to fix it
> Work that relates to this should be created as subtasks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)