Posted to dev@sqoop.apache.org by Venkat Ranganathan <n....@live.com> on 2015/08/08 06:09:06 UTC

Review Request 37251: SQOOP-2457: Add option to automatically compute statistics after loading data into a hive table

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37251/
-----------------------------------------------------------

Review request for Sqoop.


Bugs: SQOOP-2457
    https://issues.apache.org/jira/browse/SQOOP-2457


Repository: sqoop-trunk


Description
-------

With the cost-based optimizer (CBO) and execution engines such as Tez depending heavily on statistics like row count, it is important that we provide an option to update stats on data loaded into Hive as part of the --hive-import option.  Ideally these stats should be managed by Hive itself, but there are use cases where this is not automatic, and this option will help in those cases.

Added a new option, --hive-compute-stats, which adds a compute statistics statement for the loaded table or partition, as the case may be, when --hive-import is used.
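
For illustration only, here is a minimal sketch of the kind of statement such an option could append to the generated Hive script. It assumes the statement is Hive's ANALYZE TABLE ... COMPUTE STATISTICS, with a PARTITION clause when a partition was loaded. The class and method names are hypothetical and are not taken from the patch.

```java
// Hypothetical sketch: not the code from the patch under review.
public class ComputeStatsSketch {

    /**
     * Builds a Hive statement that computes statistics for a table, or for a
     * single partition when both partitionKey and partitionValue are given.
     */
    static String buildComputeStatsStatement(String table,
                                             String partitionKey,
                                             String partitionValue) {
        StringBuilder sb = new StringBuilder();
        sb.append("ANALYZE TABLE `").append(table).append("`");
        if (partitionKey != null && partitionValue != null) {
            // Restrict the statistics run to the partition that was loaded.
            sb.append(" PARTITION (")
              .append(partitionKey)
              .append("='")
              .append(partitionValue)
              .append("')");
        }
        sb.append(" COMPUTE STATISTICS");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unpartitioned table (table name is an example value).
        System.out.println(buildComputeStatsStatement("employees", null, null));
        // Single partition (partition key and value are example values).
        System.out.println(buildComputeStatsStatement("employees", "ds", "2015-08-08"));
    }
}
```

Limiting the statement to the loaded partition keeps the statistics run proportional to the data just imported rather than to the whole table.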


Diffs
-----

  src/java/org/apache/sqoop/SqoopOptions.java 9405605 
  src/java/org/apache/sqoop/hive/HiveImport.java e03d33c 
  src/java/org/apache/sqoop/hive/TableDefWriter.java c9962e9 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 4e2e66d 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java b626964 
  src/test/com/cloudera/sqoop/hive/TestTableDefWriter.java 55e572e 
  testdata/hive/scripts/normalWithStatsImport.q PRE-CREATION 
  testdata/hive/scripts/partitionWithStatsImport.q PRE-CREATION 

Diff: https://reviews.apache.org/r/37251/diff/


Testing
-------

Added new tests; all tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request 37251: SQOOP-2457: Add option to automatically compute statistics after loading data into a hive table

Posted by Szabolcs Vasas <va...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37251/#review142346
-----------------------------------------------------------


Ship It!

- Szabolcs Vasas




Re: Review Request 37251: SQOOP-2457: Add option to automatically compute statistics after loading data into a hive table

Posted by Erzsebet Szilagyi <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37251/#review143518
-----------------------------------------------------------


Fix it, then Ship it!





src/java/org/apache/sqoop/tool/BaseSqoopTool.java (line 499)
<https://reviews.apache.org/r/37251/#comment209330>

    This description seems to have been copied from a previous one. Could you please update it?



src/java/org/apache/sqoop/tool/BaseSqoopTool.java (lines 499 - 501)
<https://reviews.apache.org/r/37251/#comment209331>

    Indentation seems to be slightly off here.


- Erzsebet Szilagyi

