You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Peter Vary <pv...@cloudera.com> on 2017/04/05 10:35:45 UTC

Review Request 58203: HIVE-16345 BeeLineDriver should be able to run qtest files which are using default database tables

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58203/
-----------------------------------------------------------

Review request for hive, Aihua Xu, Zoltan Haindrich, Yongzhi Chen, and Barna Zsombor Klara.


Bugs: HIVE-16345
    https://issues.apache.org/jira/browse/HIVE-16345


Repository: hive-git


Description
-------

The goal of the change is to run qtest files which contain queries on tables created by the init scripts.
It adds the possibility to rewrite the src table references to default.src

This patch contains the following changes:
- Added new parameter to the driver, to control weather the rewrite the table names or not (test.rewrite.source.tables) - default is true
- Made QTestUtil.getSrcTables() available for QFile class
- Run the QFile not with "!run testfile.q", but reading the file, and assembling the commands - enable us to parse the queries, and provide better feedback about the failing queries
- QFile rewrites the source tables, if it is required
- Used 9 qtest files from the CliDriver, and added them to BeeLine tests
- Added new filters, and removed redundant ones - I was able to remove every QFile specific filter, and corresponding setter methods as well
- Moved QFile classes to org.apache.hive.beeline package, so it can use package private methods from BeeLine, and Commands
- Refactored needsContinuation method in BeeLine, so it can be called from a static context as well

And one important change is:
- In Utilities.setMapRedWork, change the INPUT_NAME value in the conf to a mapreduce task specific value. This one is used by the IOContextMap to cache the IOContext objects. Using the same value for every mapred task prevented them to run in the same JVM. The test were running sequencially, but failed randomly in parallel


Diffs
-----

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 11526a7 
  itests/src/test/resources/testconfiguration.properties 7a70c9c 
  itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java 0d63f5d 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 2abf252 
  itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java ae5a349 
  itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java 760fde6 
  itests/util/src/main/java/org/apache/hive/beeline/qfile/package-info.java fcd50ec 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 79955e9 
  ql/src/test/results/clientpositive/beeline/drop_with_concurrency.q.out 385f9b7 
  ql/src/test/results/clientpositive/beeline/escape_comments.q.out abc0fee 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_10.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_11.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_12.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_13.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_16.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/58203/diff/1/


Testing
-------

Run the test multiple times with the various combinations of the following parameters:
- test.rewrite.source.tables - runs with true, or without it, fails when set to false
- junit.parallel.threads - runs with 1, or without this parameter


Thanks,

Peter Vary


Re: Review Request 58203: HIVE-16345 BeeLineDriver should be able to run qtest files which are using default database tables

Posted by Peter Vary <pv...@cloudera.com>.

> On April 5, 2017, 9:34 p.m., Yongzhi Chen wrote:
> >

Thanks for the fast review Yongzhi!


> On April 5, 2017, 9:34 p.m., Yongzhi Chen wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 130 (patched)
> > <https://reviews.apache.org/r/58203/diff/1/?file=1685056#file1685056line130>
> >
> >     How do you handle the case command has comment following ';' and new command start after ;  ? Do these cases matters?
> >     For example:
> >     show tables; --comment
> >     
> >     show tables; select * from
> >     src;
> >     
> >     The beeline.Commands class has code similar to getCommands:
> >     handleMultiLineCmd, logic in execute
> >     Could you figure out a way to use the some of the code there?

Previously I was thinking about this, but dropped the idea because I found some differences between the BeeLine query parsing and CLI query parsing.

After your comment I reconsidered, and decided to use the BeeLine version. It makes more sense to use the same parsing than the actual BeeLine, and if there are differences handle the appropriately.

Thanks for pointing this out!


> On April 5, 2017, 9:34 p.m., Yongzhi Chen wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
> > Lines 160 (patched)
> > <https://reviews.apache.org/r/58203/diff/1/?file=1685056#file1685056line160>
> >
> >     Is that possible the table belong to other database?
> >     For example:
> >     use foo;
> >     select * from tableinfoo;

You are right, that this can cause a problem. To highlight it, I added a warning message.
Luckily this is only a problem, if in the new database there are tables which are in the source table list.

The warning message looks like this:

The query file /Users/petervary/dev/upstream/hive/ql/src/test/queries/clientpositive/escape_comments.q contains "use escape_comments_db;" command
The source table name rewrite is turned on, so this might cause problems when the used database contains tables named any of the following: [src_cbo, cbo_t1, lineitem, src, cbo_t2, cbo_t3, part, src_thrift, alltypesorc, srcbucket, src_json, srcpart, src_hbase, src_sequencefile, src1, srcbucket2]
To turn off the table name rewrite use -Dtest.rewrite.source.tables=false


> On April 5, 2017, 9:34 p.m., Yongzhi Chen wrote:
> > itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java
> > Line 92 (original), 90 (patched)
> > <https://reviews.apache.org/r/58203/diff/1/?file=1685057#file1685057line92>
> >
> >     Why we need to replace the tablename with default.tablename? Could you just add use default ?

When running multiple tests in parallel there is a possibility that two different test try to create tables with the same name.
So we archive the separation by using different databases, and we have put extra effort into running the tests with the default tables.


- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58203/#review171166
-----------------------------------------------------------


On April 6, 2017, 9:25 a.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58203/
> -----------------------------------------------------------
> 
> (Updated April 6, 2017, 9:25 a.m.)
> 
> 
> Review request for hive, Aihua Xu, Zoltan Haindrich, Yongzhi Chen, and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-16345
>     https://issues.apache.org/jira/browse/HIVE-16345
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The goal of the change is to run qtest files which contain queries on tables created by the init scripts.
> It adds the possibility to rewrite the src table references to default.src
> 
> This patch contains the following changes:
> - Added new parameter to the driver, to control weather the rewrite the table names or not (test.rewrite.source.tables) - default is true
> - Made QTestUtil.getSrcTables() available for QFile class
> - Run the QFile not with "!run testfile.q", but reading the file, and assembling the commands - enable us to parse the queries, and provide better feedback about the failing queries
> - QFile rewrites the source tables, if it is required
> - Used 9 qtest files from the CliDriver, and added them to BeeLine tests
> - Added new filters, and removed redundant ones - I was able to remove every QFile specific filter, and corresponding setter methods as well
> - Moved QFile classes to org.apache.hive.beeline package, so it can use package private methods from BeeLine, and Commands
> - Refactored needsContinuation method in BeeLine, so it can be called from a static context as well
> 
> And one important change is:
> - In Utilities.setMapRedWork, change the INPUT_NAME value in the conf to a mapreduce task specific value. This one is used by the IOContextMap to cache the IOContext objects. Using the same value for every mapred task prevented them to run in the same JVM. The test were running sequencially, but failed randomly in parallel
> 
> 
> Diffs
> -----
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 11526a7 
>   beeline/src/java/org/apache/hive/beeline/Commands.java 2578728 
>   itests/src/test/resources/testconfiguration.properties 7a70c9c 
>   itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java 0d63f5d 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 2abf252 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java ae5a349 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java 760fde6 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/package-info.java fcd50ec 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 79955e9 
>   ql/src/test/results/clientpositive/beeline/drop_with_concurrency.q.out 385f9b7 
>   ql/src/test/results/clientpositive/beeline/escape_comments.q.out abc0fee 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_10.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_11.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_12.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_13.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_16.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_3.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/58203/diff/2/
> 
> 
> Testing
> -------
> 
> Run the test multiple times with the various combinations of the following parameters:
> - test.rewrite.source.tables - runs with true, or without it, fails when set to false
> - junit.parallel.threads - runs with 1, or without this parameter
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 58203: HIVE-16345 BeeLineDriver should be able to run qtest files which are using default database tables

Posted by Yongzhi Chen <yc...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58203/#review171166
-----------------------------------------------------------




itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
Lines 130 (patched)
<https://reviews.apache.org/r/58203/#comment244045>

    How do you handle the case command has comment following ';' and new command start after ;  ? Do these cases matters?
    For example:
    show tables; --comment
    
    show tables; select * from
    src;
    
    The beeline.Commands class has code similar to getCommands:
    handleMultiLineCmd, logic in execute
    Could you figure out a way to use the some of the code there?



itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java
Lines 160 (patched)
<https://reviews.apache.org/r/58203/#comment244048>

    Is that possible the table belong to other database?
    For example:
    use foo;
    select * from tableinfoo;



itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java
Line 92 (original), 90 (patched)
<https://reviews.apache.org/r/58203/#comment244047>

    Why we need to replace the tablename with default.tablename? Could you just add use default ?


- Yongzhi Chen


On April 5, 2017, 10:35 a.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58203/
> -----------------------------------------------------------
> 
> (Updated April 5, 2017, 10:35 a.m.)
> 
> 
> Review request for hive, Aihua Xu, Zoltan Haindrich, Yongzhi Chen, and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-16345
>     https://issues.apache.org/jira/browse/HIVE-16345
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The goal of the change is to run qtest files which contain queries on tables created by the init scripts.
> It adds the possibility to rewrite the src table references to default.src
> 
> This patch contains the following changes:
> - Added new parameter to the driver, to control weather the rewrite the table names or not (test.rewrite.source.tables) - default is true
> - Made QTestUtil.getSrcTables() available for QFile class
> - Run the QFile not with "!run testfile.q", but reading the file, and assembling the commands - enable us to parse the queries, and provide better feedback about the failing queries
> - QFile rewrites the source tables, if it is required
> - Used 9 qtest files from the CliDriver, and added them to BeeLine tests
> - Added new filters, and removed redundant ones - I was able to remove every QFile specific filter, and corresponding setter methods as well
> - Moved QFile classes to org.apache.hive.beeline package, so it can use package private methods from BeeLine, and Commands
> - Refactored needsContinuation method in BeeLine, so it can be called from a static context as well
> 
> And one important change is:
> - In Utilities.setMapRedWork, change the INPUT_NAME value in the conf to a mapreduce task specific value. This one is used by the IOContextMap to cache the IOContext objects. Using the same value for every mapred task prevented them to run in the same JVM. The test were running sequencially, but failed randomly in parallel
> 
> 
> Diffs
> -----
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 11526a7 
>   itests/src/test/resources/testconfiguration.properties 7a70c9c 
>   itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java 0d63f5d 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 2abf252 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java ae5a349 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java 760fde6 
>   itests/util/src/main/java/org/apache/hive/beeline/qfile/package-info.java fcd50ec 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 79955e9 
>   ql/src/test/results/clientpositive/beeline/drop_with_concurrency.q.out 385f9b7 
>   ql/src/test/results/clientpositive/beeline/escape_comments.q.out abc0fee 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_10.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_11.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_12.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_13.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_16.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_3.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/58203/diff/1/
> 
> 
> Testing
> -------
> 
> Run the test multiple times with the various combinations of the following parameters:
> - test.rewrite.source.tables - runs with true, or without it, fails when set to false
> - junit.parallel.threads - runs with 1, or without this parameter
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 58203: HIVE-16345 BeeLineDriver should be able to run qtest files which are using default database tables

Posted by Peter Vary <pv...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58203/
-----------------------------------------------------------

(Updated April 6, 2017, 9:25 a.m.)


Review request for hive, Aihua Xu, Zoltan Haindrich, Yongzhi Chen, and Barna Zsombor Klara.


Changes
-------

Instead of the needsContinuation method refactor moved the getCommands to BeeLine, so we can reuse the whole file parsing algorithm in the tests.
Modified the test classes to use the BeeLine.getCommands method
Added a warning message, when use database command is used in the test script


Bugs: HIVE-16345
    https://issues.apache.org/jira/browse/HIVE-16345


Repository: hive-git


Description
-------

The goal of the change is to run qtest files which contain queries on tables created by the init scripts.
It adds the possibility to rewrite the src table references to default.src

This patch contains the following changes:
- Added new parameter to the driver, to control weather the rewrite the table names or not (test.rewrite.source.tables) - default is true
- Made QTestUtil.getSrcTables() available for QFile class
- Run the QFile not with "!run testfile.q", but reading the file, and assembling the commands - enable us to parse the queries, and provide better feedback about the failing queries
- QFile rewrites the source tables, if it is required
- Used 9 qtest files from the CliDriver, and added them to BeeLine tests
- Added new filters, and removed redundant ones - I was able to remove every QFile specific filter, and corresponding setter methods as well
- Moved QFile classes to org.apache.hive.beeline package, so it can use package private methods from BeeLine, and Commands
- Refactored needsContinuation method in BeeLine, so it can be called from a static context as well

And one important change is:
- In Utilities.setMapRedWork, change the INPUT_NAME value in the conf to a mapreduce task specific value. This one is used by the IOContextMap to cache the IOContext objects. Using the same value for every mapred task prevented them to run in the same JVM. The test were running sequencially, but failed randomly in parallel


Diffs (updated)
-----

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 11526a7 
  beeline/src/java/org/apache/hive/beeline/Commands.java 2578728 
  itests/src/test/resources/testconfiguration.properties 7a70c9c 
  itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java 0d63f5d 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 2abf252 
  itests/util/src/main/java/org/apache/hive/beeline/qfile/QFile.java ae5a349 
  itests/util/src/main/java/org/apache/hive/beeline/qfile/QFileBeeLineClient.java 760fde6 
  itests/util/src/main/java/org/apache/hive/beeline/qfile/package-info.java fcd50ec 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 79955e9 
  ql/src/test/results/clientpositive/beeline/drop_with_concurrency.q.out 385f9b7 
  ql/src/test/results/clientpositive/beeline/escape_comments.q.out abc0fee 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_10.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_11.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_12.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_13.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_16.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/58203/diff/2/

Changes: https://reviews.apache.org/r/58203/diff/1-2/


Testing
-------

Run the test multiple times with the various combinations of the following parameters:
- test.rewrite.source.tables - runs with true, or without it, fails when set to false
- junit.parallel.threads - runs with 1, or without this parameter


Thanks,

Peter Vary