You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by HanumathRao <gi...@git.apache.org> on 2017/10/16 21:01:38 UTC

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

GitHub user HanumathRao opened a pull request:

    https://github.com/apache/drill/pull/996

    DRILL-5878: TableNotFound exception is being reported for a wrong sto…

    …rage plugin.
    
    @paul-rogers  @chunhui-shi Please review these changes. These changes are for reporting a meaningful error in case of wrong storage plugin is specified in the query. Currently Drill reports TableNotFound exception for
    1) if no storage plugin exists.
    2) if no table with the tablename exists.
    These changes report SchemaNotFound exception for case 1) and TableNotFound exception for 2).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HanumathRao/drill DRILL-5878

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/996.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #996
    
----
commit 1eba3a7234fbcf61659a1667aebc5034081b4984
Author: Hanumath Rao Maduri <hm...@maprtech.com>
Date:   2017-09-16T23:54:00Z

    DRILL-5878: TableNotFound exception is being reported for a wrong storage plugin.

----


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r147818548
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -77,6 +77,22 @@ public static SchemaPlus findSchema(final SchemaPlus defaultSchema, final String
         return findSchema(defaultSchema, schemaPathAsList);
       }
     
    +  /**
    +   * Utility function to get the commonPrefix schema between two supplied schemas.
    --- End diff --
    
    done.


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r145565621
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -149,6 +168,16 @@ public static void throwSchemaNotFoundException(final SchemaPlus defaultSchema,
             .build(logger);
       }
     
    +  /** Utility method to throw {@link UserException} with context information */
    +  public static void throwSchemaNotFoundException(final String defaultSchema, final String givenSchemaPath) {
    +    throw UserException.validationError()
    +        .message("Schema [%s] is not valid with respect to either root schema or current default schema.",
    --- End diff --
    
    Because names can contain dots, we must more careful format the schema. The schema should consists of a list (array) of name parts. For each name:
    
    * If the name contains a dot, wrap it in back ticks: `` `a.b` ``
    * Otherwise, use the name as is: `c`
    * Concatenate the name parts with dots: `` `a.b`.c ``
    
    Code to do this might already exist. @vvysotskyi may know.


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on the issue:

    https://github.com/apache/drill/pull/996
  
    @arina-ielchiieva I think this check shouldn't cause much of a performance impact as it is in parser code and also it is checked right now by the DRILL's custom overload of getTable. If this function throws an exception then it is caught and reported to the user. I think there can be improvement to not check this by every code path(i.e valid code path as well) and only check when super.getTable returns null, in this way we can report error only when we couldn't find a table. Please let me know if this approach seems reasonable, so that I can go ahead and change the code accordingly.


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r146118431
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -77,6 +77,22 @@ public static SchemaPlus findSchema(final SchemaPlus defaultSchema, final String
         return findSchema(defaultSchema, schemaPathAsList);
       }
     
    +  /**
    +   * Utility function to get the commonPrefix schema between two supplied schemas.
    +   * @param defaultSchema default schema
    +   * @param schemaPath current schema path
    +   * @param isCaseSensitive true if caseSensitive comparision is required.
    +   * @return common prefix schemaPath
    +   */
    +  public static String getPrefixSchemaPath(final String defaultSchema, final String schemaPath, final boolean isCaseSensitive) {
    +    if (!isCaseSensitive) {
    +      return Strings.commonPrefix(defaultSchema.toLowerCase(), schemaPath.toLowerCase());
    --- End diff --
    
    @paul-rogers  Thank you for this catch. I did try out some examples which were throwing schemanotfound exception. I have changed the code to handle it. getDefaultSchemaPath of session object returns the defaultSchema as String, hence I converted the SchemaPath to String by SchemaPathJOINER. As pointed out by your comment, this will loose some context information if the "." is part of the name. However, I also think that there is a bug in the defaultSchema code path as it is also loosing this information by converting it to String. For now I am not fixing that bug (which I am planning to fix in another JIRA) as I am not sure   about the changes in setting defaultSchema.
    
    Coming on to the specific valid use case which was throwing schemaNotFound exception is
    in dfs I have introduced "tmp.tmp1" in place of tmp.
    
    And if I query select * from dfs.`tmp.tmp1`.`employee.json` limit 1; this query reports SchemaNotFound exception even though a workspace with name tmp.tmp1 exists.
    
    This is currently working with new changes.


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao closed the pull request at:

    https://github.com/apache/drill/pull/996


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/996
  
    @HanumathRao thanks for making the changes. 
    +1, LGTM.


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/996
  
    So does it mean that schema validation will be done twice: first in Drill and then in Calcite? Will it influence query parsing performance?


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r145231215
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SqlConverter.java ---
    @@ -481,6 +485,19 @@ public RelOptTableImpl getTable(final List<String> names) {
                 .message("Temporary tables usage is disallowed. Used temporary table name: %s.", names)
                 .build(logger);
           }
    +
    +      // Check the schema and throw a valid SchemaNotFound exception instead of TableNotFound exception.
    --- End diff --
    
    Does it mean that Calcite instead of returning schema not found exception returns table not found exception?
    Per my understanding this PR customizes Drill but what if we go different path and enhance Calcite (or may be this is already done in newer Calcite versions)?


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r146816584
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -77,6 +77,22 @@ public static SchemaPlus findSchema(final SchemaPlus defaultSchema, final String
         return findSchema(defaultSchema, schemaPathAsList);
       }
     
    +  /**
    +   * Utility function to get the commonPrefix schema between two supplied schemas.
    --- End diff --
    
    Could you please add example in Java doc?


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r147896927
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/dfs/TestFileSelection.java ---
    @@ -63,4 +63,17 @@ public void testEmptyFolderThrowsTableNotFound() throws Exception {
         }
       }
     
    --- End diff --
    
    done


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r146818260
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/dfs/TestFileSelection.java ---
    @@ -63,4 +63,17 @@ public void testEmptyFolderThrowsTableNotFound() throws Exception {
         }
       }
     
    --- End diff --
    
    Maybe it makes sense to move new unit tests and `testEmptyFolderThrowsTableNotFound` into separate class which will test behavior when incorrect object is defined (i.e. schema, workspace, table)?


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r146816101
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/dfs/TestFileSelection.java ---
    @@ -63,4 +63,17 @@ public void testEmptyFolderThrowsTableNotFound() throws Exception {
         }
       }
     
    +  @Test(expected = Exception.class)
    +  public void testWrongSchemaThrowsSchemaNotFound() throws Exception {
    +    final String table = String.format("%s/empty", TestTools.getTestResourcesPath());
    +    final String query = String.format("select * from dfs1.`%s`", table);
    +    try {
    +      testNoResult(query);
    +    } catch (Exception ex) {
    +      final String pattern = String.format("[[dfs1]] is not valid with respect to either root schema or current default schema").toLowerCase();
    +      final boolean isSchemaNotFound = ex.getMessage().toLowerCase().contains(pattern);
    +      assertTrue(isSchemaNotFound);
    +      throw ex;
    +    }
    +  }
    --- End diff --
    
    Can you please add test case for the incorrect workspace, 
    a. `select * from dfs.incorrect_wk.table;`
    b. 
    ```
    use dfs; 
    select * from incorrect_wk.table;
    ```
    I assume it will return incorrect schema exception as well? 


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on the issue:

    https://github.com/apache/drill/pull/996
  
    @arina-ielchiieva Thank you for the review comments. I have modified the code accordingly. Please let me know if anything needs to be changed.


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r145565023
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -121,6 +137,9 @@ public static String getSchemaPath(List<String> schemaPath) {
         return SCHEMA_PATH_JOINER.join(schemaPath);
       }
     
    +  /** Utility method to get the schema path from one or more strings. */
    +  public static String getSchemaPath(String s1, String s2, String... rest) { return SCHEMA_PATH_JOINER.join(s1,s2,rest); }
    --- End diff --
    
    Can't join the names for reasons cited above.


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/996
  
    @HanumathRao please close the pull request.


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r145268123
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SqlConverter.java ---
    @@ -481,6 +485,19 @@ public RelOptTableImpl getTable(final List<String> names) {
                 .message("Temporary tables usage is disallowed. Used temporary table name: %s.", names)
                 .build(logger);
           }
    +
    +      // Check the schema and throw a valid SchemaNotFound exception instead of TableNotFound exception.
    --- End diff --
    
    Thank you for the review. I agree that this should ideally be handled at Calcite layer. I also think that even after Calcite providing this functionality there should be some customization that needs to be done as we understand the context better than Calcite. Once the calcite fixes this issue then we can always change the code accordingly. 


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r145564952
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -77,6 +77,22 @@ public static SchemaPlus findSchema(final SchemaPlus defaultSchema, final String
         return findSchema(defaultSchema, schemaPathAsList);
       }
     
    +  /**
    +   * Utility function to get the commonPrefix schema between two supplied schemas.
    +   * @param defaultSchema default schema
    +   * @param schemaPath current schema path
    +   * @param isCaseSensitive true if caseSensitive comparision is required.
    +   * @return common prefix schemaPath
    +   */
    +  public static String getPrefixSchemaPath(final String defaultSchema, final String schemaPath, final boolean isCaseSensitive) {
    +    if (!isCaseSensitive) {
    +      return Strings.commonPrefix(defaultSchema.toLowerCase(), schemaPath.toLowerCase());
    --- End diff --
    
    Is the schema a single name? Probably not since we are talking about common prefixes.
    
    Drill supports names with dots. This is done by using an object per name part. (A `NamePart` in a `SchemaPath`.) Here are we concatenating the names? If so, won't e have false matches?
    
    ```
    ['a', 'b.c'] --> "a.b.c"
    ['a.b', 'c'] --> "a.b.c"
    ```
    The above are two distinct paths, but they will (wrongly) compare equals when concatenated.
    
    Instead, do we want a list of name parts, and want to compare the names part by part? Does Calcite represent the path as a collection of objects? (If not, we've got larger problems...)


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/996
  
    Well, with the short-term solution I am afraid that eventually we'll forget to revert it when Calcite checks take over and we'll be checking for schema twice.
    
    @HanumathRao before going any further with this pull request, could you please investigate the following:
    1. Does newer Calcite versions handle this case?
    2. If not, can this be handled in Calcite? If yes, please create Jira in Calcite. If possible create fix and check if Calcite community will accept it.
    



---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r146815141
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SqlConverter.java ---
    @@ -481,6 +485,19 @@ public RelOptTableImpl getTable(final List<String> names) {
                 .message("Temporary tables usage is disallowed. Used temporary table name: %s.", names)
                 .build(logger);
           }
    +
    +      // Check the schema and throw a valid SchemaNotFound exception instead of TableNotFound exception.
    --- End diff --
    
    Could you please factor out this logic in a separate method?


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/996
  
    Merged in 7a2fc87ee20f706d85cb5c90cc441e6b44b71592.  @HanumathRao  pls close the PR. 


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r145564522
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -77,6 +77,22 @@ public static SchemaPlus findSchema(final SchemaPlus defaultSchema, final String
         return findSchema(defaultSchema, schemaPathAsList);
       }
     
    +  /**
    +   * Utility function to get the commonPrefix schema between two supplied schemas.
    +   * @param defaultSchema default schema
    +   * @param schemaPath current schema path
    +   * @param isCaseSensitive true if caseSensitive comparision is required.
    +   * @return common prefix schemaPath
    +   */
    +  public static String getPrefixSchemaPath(final String defaultSchema, final String schemaPath, final boolean isCaseSensitive) {
    --- End diff --
    
    Isn't Drill always case insensitive since it follows SQL naming rules? We've often discussed how to handle case insensitive data sources, but we have no uniform, workable design. Is there a pattern we are following here?


---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on the issue:

    https://github.com/apache/drill/pull/996
  
    @arina-ielchiieva Thank your for the comments. There is some work that went into calcite to handle meaningful error messages. This is the checkin that has those changes.
    https://github.com/apache/calcite/commit/5f9c019080c7231acaf3df80732d915351051d93#diff-0c11f3f4d738e3fa55968eb19f1c8050
    
    It reports following errors when a table cannot be resolved.
    {code}
    select empid from "hr".emps;
    Object 'EMPS' not found within 'hr'; did you mean 'emps'?
    !error
    {code}
    
    However, I think the error logic should be customized to particular software(in this case DRILL) so as to report semantically meaningful error messages. Drill knows more about the context and hence can provide more customized error messages to the end user. 



---

[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/996
  
    @paul-rogers thanks for approving pull request but actually before merging these changes in Drill (since I am not still convinced we need them), I was expecting @HanumathRao to do investigation if newer Calcite versions handle this case, if yes, maybe postpone this changes till the Calcite upgrade or cherry-pick them in Drill Calcite (if urgent). If not, create Jira in Calcite project and provide fix, and maybe cherry-pick this very fix once done (again if urgent). Since if all these checks can be done in Calcite we don't need to customize Drill.


---

[GitHub] drill pull request #996: DRILL-5878: TableNotFound exception is being report...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/996#discussion_r146118332
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java ---
    @@ -77,6 +77,22 @@ public static SchemaPlus findSchema(final SchemaPlus defaultSchema, final String
         return findSchema(defaultSchema, schemaPathAsList);
       }
     
    +  /**
    +   * Utility function to get the commonPrefix schema between two supplied schemas.
    +   * @param defaultSchema default schema
    +   * @param schemaPath current schema path
    +   * @param isCaseSensitive true if caseSensitive comparision is required.
    +   * @return common prefix schemaPath
    +   */
    +  public static String getPrefixSchemaPath(final String defaultSchema, final String schemaPath, final boolean isCaseSensitive) {
    --- End diff --
    
    @paul-rogers  Thanks for the review. For this API I am assuming the caller will supply the information whether it needs to be a case sensitive or case insensitive comparison. The caller for this function is passing the parserConfig case sensitive parameter.


---