Posted to dev@sqoop.apache.org by Venkat Ranganathan <n....@live.com> on 2013/04/21 07:51:39 UTC

Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

Review request for Sqoop and Jarek Cecho.


Description
-------

This patch implements the integration of HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs
-----

  build.xml 1c33fee 
  ivy.xml 1fa4dd1 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/TestHCatalogBasic.java PRE-CREATION 
  src/test/com/cloudera/sqoop/hcat/HCatalogExportManualTest.java PRE-CREATION 
  src/test/com/cloudera/sqoop/hcat/HCatalogImportManualTest.java PRE-CREATION 
  src/test/com/cloudera/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.

> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > build.xml, line 54
> > <https://reviews.apache.org/r/10688/diff/5/?file=288026#file288026line54>
> >
> >     I'm not feeling entirely comfortable about depending on SNAPSHOTS. Is there a particular feature that we're taking advantage of in 0.6.0 that is not in 0.5.0?

No, the functionality (from the contract point of view) is, I think, compatible even with 0.4.0. I could not successfully resolve the earlier versions from the Maven repos, and hence I had to switch to the snapshot. I tried the build again just now and found that only 0.11.0 is readily available at repos.maven.org. That was the reason. I will update and switch to 0.5.0 if that version is available in the repos. But given that we want readily available Hadoop 2 and Hadoop 1 artifacts, we may have to settle on 0.11.0, assuming that is the version the HCatalog team decides to publish to the repositories.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/SqoopOptions.java, line 160
> > <https://reviews.apache.org/r/10688/diff/5/?file=288029#file288029line160>
> >
> >     Out of curiosity what the "stanza" stands for?

Stanza means paragraph :) We used this a lot earlier in my work to describe the SQL snippets when we wrote essays to describe what we wanted from the database. Maybe "clause" is a more general DB term.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/config/ConfigurationConstants.java, lines 66-67
> > <https://reviews.apache.org/r/10688/diff/5/?file=288030#file288030line66>
> >
> >     Does the new property make sense when it's valid only on Hadoop2 that actually do not have any JobTracker address at all? We already had issues with that on Sqoop2 side in SQOOP-1002.

I just wanted to consolidate all constants.  If we can remove it, that would be OK


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/config/ConfigurationConstants.java, lines 90-91
> > <https://reviews.apache.org/r/10688/diff/5/?file=288030#file288030line90>
> >
> >     Nit: Please put extra empty line between the property name and private constructor.

Thanks, will fix.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java, lines 377-379
> > <https://reviews.apache.org/r/10688/diff/5/?file=288052#file288052line377>
> >
> >     Nit: Incorrect indentation.

Thanks, will fix.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java, lines 368-370
> > <https://reviews.apache.org/r/10688/diff/5/?file=288052#file288052line368>
> >
> >     Nit: Incorrect indentation.

Thanks, will fix.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 849
> > <https://reviews.apache.org/r/10688/diff/5/?file=288043#file288043line849>
> >
> >     I think that we also want to skip invoking the output committer in case of hadoop 2.

Good catch. I will modify the version check to simply test whether it is Hadoop 2.

Thanks
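
To make the version check discussed above concrete, here is a minimal sketch. It is not the code from the patch: the class and method names are hypothetical, the version string is passed in rather than read from Hadoop, and the policy "invoke the committer unless this is Hadoop 2" is an assumption drawn only from this comment. It also looks at the major number only, so a 0.23.x release would not be treated as Hadoop 2 here.

```java
/** Hypothetical helper; names are illustrative and not taken from the patch. */
final class HadoopVersionSketch {
    private HadoopVersionSketch() { }

    /** Returns the leading major number of a version string such as "2.0.4-alpha". */
    static int majorVersion(String version) {
        int dot = version.indexOf('.');
        String major = dot < 0 ? version : version.substring(0, dot);
        return Integer.parseInt(major.trim());
    }

    /** True when the major number of the version string is 2. */
    static boolean isHadoop2(String version) {
        return majorVersion(version) == 2;
    }

    /** Assumed policy: skip the explicit output committer call on Hadoop 2. */
    static boolean shouldInvokeOutputCommitter(String version) {
        return !isHadoop2(version);
    }
}
```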


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java, lines 76-77
> > <https://reviews.apache.org/r/10688/diff/5/?file=288039#file288039line76>
> >
> >     Nit: Those extra lines seems unnecessary here because the next row also have type constant.

Will remove the extra lines - thanks


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, line 389
> > <https://reviews.apache.org/r/10688/diff/5/?file=288034#file288034line389>
> >
> >     This comment seems to be artifact from development. I would suggest to improve the message and move it into "debug" state in case that we would like to have it around.

Will remove


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java, lines 68-74
> > <https://reviews.apache.org/r/10688/diff/5/?file=288033#file288033line68>
> >
> >     I'm thinking if having subclass of the DataDrivenImportJob for HCat specific things that would override this and couple of other methods would be cleaner than having multiple if-else statements. What do you think Venkat?

Yes, good point. It makes sense (and that is what I started with), but since I had to change the ExportJob in place, I used a similar scheme for consistency and followed what we did for other storage formats like Avro. Also, this makes sure that, irrespective of changes to the import job (on what it gets its feed from), we will be able to send it to HCat.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, lines 202-204
> > <https://reviews.apache.org/r/10688/diff/5/?file=288034#file288034line202>
> >
> >     Similarly as in the import. Would having dedicated classes for HCatalog make sense/would be cleaner that having one class for everything and having multiple if-else statements?

Good point, Jarek. Actually, I had that implementation first, but then we would not be able to support update/upsert, and call by procedure would need to be modified to handle the HCat format. Since we were using HCat more as a storage format like Avro, I decided to implement it in place, and I followed similar logic for imports as well.
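
The trade-off described above can be sketched roughly as follows. This is an illustration only, not the real ExportJobBase: all class names, method names, and the string values are hypothetical. The point it tries to show is that a single HCatalog branch in the base class is inherited by every export variant (for example an update-mode job), whereas a dedicated HCatalog subclass would sit beside those variants and would not combine with them.

```java
/** Illustrative stand-in for an export job base class; not the real ExportJobBase. */
class ExportJobSketch {
    protected final boolean isHCatJob;

    ExportJobSketch(boolean isHCatJob) {
        this.isHCatJob = isHCatJob;
    }

    /** The only HCatalog-specific branch: which input format feeds the mappers. */
    String inputFormatName() {
        return isHCatJob ? "hcat-input-format" : "text-input-format";
    }

    /** Subclasses decide how rows are written to the database. */
    String writeMode() {
        return "insert";
    }
}

/** Update-mode export; it inherits the HCatalog branch without any changes. */
class UpdateExportJobSketch extends ExportJobSketch {
    UpdateExportJobSketch(boolean isHCatJob) {
        super(isHCatJob);
    }

    @Override
    String writeMode() {
        return "update";
    }
}
```

With this shape, `new UpdateExportJobSketch(true)` reads from the HCatalog source and still writes in update mode, which is the combination a separate HCatalog-only subclass would not provide for free.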


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/manager/ConnManager.java, line 197
> > <https://reviews.apache.org/r/10688/diff/5/?file=288032#file288032line197>
> >
> >     Is the timestamp mapped to String from similar reason as mentioned above with SMALLINT?

Timestamp is currently not a supported datatype in HCat (even though Hive supports it). I will create a JIRA issue on HCat to support it, now that HCatalog is a subproject of Hive.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > ivy.xml, lines 185-193
> > <https://reviews.apache.org/r/10688/diff/5/?file=288027#file288027line185>
> >
> >     Shouldn't those two dependencies be transitively propagated from HCatalog/Hive?

I had an issue building without the explicit dependencies listed, maybe because the repos did not have all the artifacts and DataNucleus was only available from the DataNucleus repository. I will try to remove the dependencies and retry.
Thanks


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review20756
-----------------------------------------------------------


On May 4, 2013, 11:46 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated May 4, 2013, 11:46 p.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the integration of HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
>
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 1c33fee 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.

> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > build.xml, line 54
> > <https://reviews.apache.org/r/10688/diff/5/?file=288026#file288026line54>
> >
> >     I'm not feeling entirely comfortable about depending on SNAPSHOTS. Is there a particular feature that we're taking advantage of in 0.6.0 that is not in 0.5.0?
> 
> Venkat Ranganathan wrote:
>     No, the functionality (from the contract point of view) is, I think, compatible even with 0.4.0. I could not successfully resolve the earlier versions from the Maven repos, and hence I had to switch to the snapshot. I tried the build again just now and found that only 0.11.0 is readily available at repos.maven.org. That was the reason. I will update and switch to 0.5.0 if that version is available in the repos. But given that we want readily available Hadoop 2 and Hadoop 1 artifacts, we may have to settle on 0.11.0, assuming that is the version the HCatalog team decides to publish to the repositories.

Using 0.11.0 is completely fine with me, or any other released version.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > ivy.xml, lines 185-193
> > <https://reviews.apache.org/r/10688/diff/5/?file=288027#file288027line185>
> >
> >     Shouldn't those two dependencies be transitively propagated from HCatalog/Hive?
> 
> Venkat Ranganathan wrote:
>     I had an issue building without the explicit dependencies listed, maybe because the repos did not have all the artifacts and DataNucleus was only available from the DataNucleus repository. I will try to remove the dependencies and retry.
>     Thanks

Thank you sir, appreciated!


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/SqoopOptions.java, line 160
> > <https://reviews.apache.org/r/10688/diff/5/?file=288029#file288029line160>
> >
> >     Out of curiosity what the "stanza" stands for?
> 
> Venkat Ranganathan wrote:
>     Stanza means paragraph :) We used this a lot earlier in my work to describe the SQL snippets when we wrote essays to describe what we wanted from the database. Maybe "clause" is a more general DB term.

Oh, thank you :-) I think that it's fine, I was just curious...


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/manager/ConnManager.java, line 197
> > <https://reviews.apache.org/r/10688/diff/5/?file=288032#file288032line197>
> >
> >     Is the timestamp mapped to String from similar reason as mentioned above with SMALLINT?
> 
> Venkat Ranganathan wrote:
>     Timestamp is currently not a supported datatype in HCat (even though Hive supports it). I will create a JIRA issue on HCat to support it, now that HCatalog is a subproject of Hive.

I see, thank you for the explanation sir.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, lines 202-204
> > <https://reviews.apache.org/r/10688/diff/5/?file=288034#file288034line202>
> >
> >     Similarly as in the import. Would having dedicated classes for HCatalog make sense/would be cleaner that having one class for everything and having multiple if-else statements?
> 
> Venkat Ranganathan wrote:
>     Good point, Jarek. Actually, I had that implementation first, but then we would not be able to support update/upsert, and call by procedure would need to be modified to handle the HCat format. Since we were using HCat more as a storage format like Avro, I decided to implement it in place, and I followed similar logic for imports as well.

Thank you for your feedback. Your explanation makes complete sense to me. I believe that even the AVRO implementation is currently a bit hacky, but that will be cleaned up in Sqoop2, so I don't have any further comments.


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review20756
-----------------------------------------------------------




Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review20756
-----------------------------------------------------------


Hi Venkat,
thank you very much for working on the HCatalog support, and please accept my deep apologies for the delay in review. I've done a first pass today and I'm going to continue tomorrow. I do have a couple of notes below:


build.xml
<https://reviews.apache.org/r/10688/#comment42837>

    I'm not feeling entirely comfortable about depending on SNAPSHOTS. Is there a particular feature that we're taking advantage of in 0.6.0 that is not in 0.5.0?



ivy.xml
<https://reviews.apache.org/r/10688/#comment42838>

    Shouldn't those two dependencies be transitively propagated from HCatalog/Hive?



src/java/org/apache/sqoop/SqoopOptions.java
<https://reviews.apache.org/r/10688/#comment42839>

    Out of curiosity what the "stanza" stands for?



src/java/org/apache/sqoop/config/ConfigurationConstants.java
<https://reviews.apache.org/r/10688/#comment42841>

    Does the new property make sense when it's valid only on Hadoop2 that actually do not have any JobTracker address at all? We already had issues with that on Sqoop2 side in SQOOP-1002.



src/java/org/apache/sqoop/config/ConfigurationConstants.java
<https://reviews.apache.org/r/10688/#comment42842>

    Nit: Please put extra empty line between the property name and private constructor.



src/java/org/apache/sqoop/manager/ConnManager.java
<https://reviews.apache.org/r/10688/#comment42844>

    Is the timestamp mapped to String from similar reason as mentioned above with SMALLINT?



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
<https://reviews.apache.org/r/10688/#comment42845>

    I'm thinking if having subclass of the DataDrivenImportJob for HCat specific things that would override this and couple of other methods would be cleaner than having multiple if-else statements. What do you think Venkat?



src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10688/#comment42846>

    Similarly as in the import. Would having dedicated classes for HCatalog make sense/would be cleaner that having one class for everything and having multiple if-else statements?



src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10688/#comment42847>

    This comment seems to be artifact from development. I would suggest to improve the message and move it into "debug" state in case that we would like to have it around.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java
<https://reviews.apache.org/r/10688/#comment42849>

    Nit: Those extra lines seems unnecessary here because the next row also have type constant.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment42850>

    I think that we also want to skip invoking the output committer in case of hadoop 2.



src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java
<https://reviews.apache.org/r/10688/#comment42851>

    Nit: Incorrect indentation.



src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java
<https://reviews.apache.org/r/10688/#comment42852>

    Nit: Incorrect indentation.


Jarcec

- Jarek Cecho




Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.

> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > Hi Venkat,
> > thank you very much for incorporating all my suggestions. I've taken a deeper look and the changes seem great to me. I do have a couple of high-level notes:
> >
> > 1) Tests for profiles 200 and 23 are failing with the obligatory "java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected". I was looking into Maven Central and it seems that only Hadoop 1.x jars were published for HCatalog 0.11.0. Do you think that we can ask the HCatalog/Hive team to also publish Hadoop 2 compatible jars (via a different classifier, for example)?
> >
> > 2) Would you mind updating the user guide?
> >
> > 3) I'll update the old-pom.xml file after this gets in, as I'm using it for bootstrapping my IntelliJ project and we're missing the new dependencies on HCatalog.
> >
> > I'm still missing running the patch on a real cluster, but otherwise I feel that we are very close to getting it in!
> >

Thanks for reviewing.  
1) HIVE-4660 has been created to track that. I will check to see when it will be taken up. It is painful (and similar to the HBase situation). I built HCatalog locally against Hadoop 2.x to test this.
2) I have a separate JIRA for that. I was waiting for the review of this patch. I will update the documentation patch with the doc update.

Thanks

Venkat


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, line 467
> > <https://reviews.apache.org/r/10688/diff/6/?file=296857#file296857line467>
> >
> >     Nit: Do you think that it would make sense to introduce a new FileType "HCATALOG"?

Yes, good idea.  Let me add that
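
A rough sketch of what such an enum value could look like is below. The enum and class names are hypothetical, and how the export job actually decides on a file type is not stated in this thread; the header-based detection is an assumption added for illustration. The magic bytes themselves are the standard ones: Hadoop SequenceFiles begin with "SEQ" and Avro object container files begin with "Obj" followed by the byte 1.

```java
/** Hypothetical file type enum with the suggested HCATALOG value added. */
enum ExportFileTypeSketch { SEQUENCE_FILE, AVRO_DATA_FILE, HCATALOG, UNKNOWN }

/** Illustrative detector; not the logic used in ExportJobBase. */
final class ExportFileTypeDetector {
    private ExportFileTypeDetector() { }

    /** An HCatalog job short-circuits detection; otherwise the file header is inspected. */
    static ExportFileTypeSketch detect(boolean isHCatJob, byte[] header) {
        if (isHCatJob) {
            return ExportFileTypeSketch.HCATALOG;
        }
        if (startsWith(header, "SEQ")) {
            return ExportFileTypeSketch.SEQUENCE_FILE;
        }
        if (startsWith(header, "Obj")) {
            return ExportFileTypeSketch.AVRO_DATA_FILE;
        }
        return ExportFileTypeSketch.UNKNOWN;
    }

    /** Compares the first bytes of data against an ASCII magic string. */
    private static boolean startsWith(byte[] data, String magic) {
        if (data == null || data.length < magic.length()) {
            return false;
        }
        for (int i = 0; i < magic.length(); i++) {
            if (data[i] != (byte) magic.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

Having a dedicated enum value lets callers switch on the file type instead of checking a separate boolean next to it.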


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java, lines 131-136
> > <https://reviews.apache.org/r/10688/diff/6/?file=296861#file296861line131>
> >
> >     Is this method necessary? It seems to be only calling the parent method without any additional logic.

I had a debug log message to track the record reader. I might have deleted it by mistake. Will fix it. Thanks for the catch.


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, lines 243-260
> > <https://reviews.apache.org/r/10688/diff/6/?file=296866#file296866line243>
> >
> >     The "home" variable seems to be unused after the assignment, so I'm assuming that the assignments are not necessary?

Yes.  When I refactored the code, I missed this part.  Let me fix it


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 268
> > <https://reviews.apache.org/r/10688/diff/6/?file=296866#file296866line268>
> >
> >     I do understand that Hive/HCatalog requires to have lowercase table names, but I'm a bit concerned about doing it without user knowledge. Do you think that it would make sense to detect if user specified uppercase letters and print out a warning?

Sounds reasonable.   If we detect that the table or database is not all in lowercase, we can issue a warning
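
For illustration, the warning could be implemented along these lines. This is only a sketch: the class and method names are hypothetical, and it writes to standard error rather than to whatever logger the patch uses.

```java
import java.util.Locale;

/** Hypothetical helper; not the actual SqoopHCatUtilities code. */
final class HCatNameNormalizer {
    private HCatNameNormalizer() { }

    /** Returns true if lowercasing the name would change it. */
    static boolean needsLowercasing(String name) {
        return !name.equals(name.toLowerCase(Locale.ROOT));
    }

    /** Lowercases the name and warns on stderr when it had to be changed. */
    static String normalize(String kind, String name) {
        if (needsLowercasing(name)) {
            System.err.println("WARN: " + kind + " name '" + name
                + "' contains uppercase letters and will be converted to lowercase");
        }
        return name.toLowerCase(Locale.ROOT);
    }
}
```

Using `Locale.ROOT` keeps the conversion independent of the JVM's default locale.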


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, lines 975-976
> > <https://reviews.apache.org/r/10688/diff/6/?file=296866#file296866line975>
> >
> >     Nit: This seems to be doing the same as SqoopOptions.getHCatHomeDefault().

Yes.   Will change to use that


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/tool/BaseSqoopTool.java, line 1238
> > <https://reviews.apache.org/r/10688/diff/6/?file=296867#file296867line1238>
> >
> >     It seems to me that the --as-avrodatafile and --as-sequencefile are also not compatible with the HCatalog import/export so we might add the validations here.

Will add the validation
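
A minimal sketch of such a validation follows. The class name, method signature, and messages are hypothetical, and it throws IllegalArgumentException purely to stay self-contained; the real tool may report invalid options through its own exception type.

```java
/** Hypothetical option check; not the validation code in BaseSqoopTool. */
final class HCatOptionCheckSketch {
    private HCatOptionCheckSketch() { }

    /**
     * Rejects file-format flags that do not combine with an HCatalog job.
     * A null table name means HCatalog is not in use, so nothing is checked.
     */
    static void validate(String hCatTableName, boolean asAvroDataFile, boolean asSequenceFile) {
        if (hCatTableName == null) {
            return;
        }
        if (asAvroDataFile) {
            throw new IllegalArgumentException(
                "--as-avrodatafile cannot be used with HCatalog import/export");
        }
        if (asSequenceFile) {
            throw new IllegalArgumentException(
                "--as-sequencefile cannot be used with HCatalog import/export");
        }
    }
}
```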


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, line 84
> > <https://reviews.apache.org/r/10688/diff/6/?file=296857#file296857line84>
> >
> >     I can see the same variable isHCatJob in ImportJobBase and ExportJobBase. Do you think that it would make sense to put it into JobBase class instead?

Yes, we can push it into JobBase.


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/manager/ConnManager.java, line 227
> > <https://reviews.apache.org/r/10688/diff/6/?file=296855#file296855line227>
> >
> >     Nit: Can we throw here IllegalArgumentException similarly as in case of toAvroType() method? Dying fast seems to me as a better option that getting NPE somewhere later. I know that the toJavaType() is returning null as well at the moment, but  we can "fix" it in follow up JIRA.

Good point. Even though I handled it in one place (with an explicit exception), it may be better to throw this early.
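
The "fail fast" idea can be sketched as follows. The class name, the method name toHCatType, and the tiny mapping table are all illustrative assumptions rather than the code in ConnManager; the TIMESTAMP-to-string entry mirrors the earlier comment in this review that timestamps are mapped to String.

```java
import java.sql.Types;

/** Hypothetical type mapper; the mapping is deliberately small and illustrative. */
final class HCatTypeMapperSketch {
    private HCatTypeMapperSketch() { }

    /** Maps a java.sql.Types constant to a type name, throwing for unknown types. */
    static String toHCatType(int sqlType) {
        switch (sqlType) {
            case Types.INTEGER:
                return "int";
            case Types.VARCHAR:
                return "string";
            case Types.TIMESTAMP:
                // Assumed from the review discussion: timestamps are carried as strings.
                return "string";
            default:
                // Throw here instead of returning null, so the failure is
                // reported at the mapping site and not as a later NPE.
                throw new IllegalArgumentException(
                    "Cannot convert SQL type " + sqlType + " to an HCatalog type");
        }
    }
}
```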


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/manager/ConnManager.java, line 223
> > <https://reviews.apache.org/r/10688/diff/6/?file=296855#file296855line223>
> >
> >     Nit: The indent seems to be off.

Will fix


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/config/ConfigurationConstants.java, line 96
> > <https://reviews.apache.org/r/10688/diff/6/?file=296853#file296853line96>
> >
> >     Introducing this property is awesome idea, I would suggest to also use it across entire code base (for example in JobBase class). Considering the size of this patch already, I'm completely fine with doing that in follow up JIRA.

Yes, we should fix the other uses of this string. We can address it in a follow-on JIRA.


> On May 28, 2013, 9:33 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/SqoopOptions.java, line 1313
> > <https://reviews.apache.org/r/10688/diff/6/?file=296852#file296852line1313>
> >
> >     Nit: It seems that the SqoopOptions class is preferring get methods returning boolean to be called with either "is" or "do", e.g. something like isCreateHCatalogTable().

Sure. Will change.


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21079
-----------------------------------------------------------


On May 24, 2013, 11:18 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated May 24, 2013, 11:18 p.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration.  A unit test to test the option management is also added.   All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21079
-----------------------------------------------------------


Hi Venkat,
thank you very much for incorporating all my suggestions. I've taken a deeper look and the changes seem great to me. I do have a couple of high-level notes:

1) Tests for profiles 200 and 23 are failing with the obligatory "java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected". I looked in Maven Central and it seems that only Hadoop 1.x jars were published for HCatalog 0.11.0. Do you think we could ask the HCatalog/Hive team to also publish Hadoop 2 compatible jars (via a different classifier, for example)?

2) Would you mind updating user guide?

3) I'll update the old-pom.xml file after this gets in, as I'm using it for bootstrapping my IntelliJ project and we're missing the new dependencies on HCatalog.

I still need to run the patch on a real cluster, but otherwise I feel that we are very close to getting it in!



src/java/org/apache/sqoop/SqoopOptions.java
<https://reviews.apache.org/r/10688/#comment43598>

    Nit: It seems that the SqoopOptions class prefers getters that return a boolean to be named with either an "is" or a "do" prefix, e.g. something like isCreateHCatalogTable().



src/java/org/apache/sqoop/config/ConfigurationConstants.java
<https://reviews.apache.org/r/10688/#comment43599>

    Introducing this property is an awesome idea. I would suggest also using it across the entire code base (for example in the JobBase class). Considering the size of this patch already, I'm completely fine with doing that in a follow-up JIRA.



src/java/org/apache/sqoop/manager/ConnManager.java
<https://reviews.apache.org/r/10688/#comment43600>

    Nit: The indent seems to be off.



src/java/org/apache/sqoop/manager/ConnManager.java
<https://reviews.apache.org/r/10688/#comment43601>

    Nit: Can we throw an IllegalArgumentException here, as in the toAvroType() method? Dying fast seems to me a better option than getting an NPE somewhere later. I know that toJavaType() is returning null as well at the moment, but we can "fix" that in a follow-up JIRA.



src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10688/#comment43611>

    I can see the same variable isHCatJob in ImportJobBase and ExportJobBase. Do you think that it would make sense to put it into JobBase class instead?



src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10688/#comment43621>

    Nit: Do you think that it would make sense to introduce a new FileType "HCATALOG"?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java
<https://reviews.apache.org/r/10688/#comment43602>

    Is this method necessary? It seems to be only calling the parent method without any additional logic.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment43603>

    The "home" variable seems to be unused after the assignment, so I'm assuming that the assignments are not necessary?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment43604>

    I do understand that Hive/HCatalog requires lowercase table names, but I'm a bit concerned about lowercasing them without the user's knowledge. Do you think it would make sense to detect whether the user specified uppercase letters and print out a warning?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment43605>

    Nit: This seems more like a debug-level line to me, what do you think?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment43606>

    Nit: This seems to be doing the same as SqoopOptions.getHCatHomeDefault().



src/java/org/apache/sqoop/tool/BaseSqoopTool.java
<https://reviews.apache.org/r/10688/#comment43620>

    It seems to me that the --as-avrodatafile and --as-sequencefile are also not compatible with the HCatalog import/export so we might add the validations here.


Jarcec

- Jarek Cecho




Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.

> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > Hi Venkat,
> > Thank you for incorporating my comments, greatly appreciated. I've taken a deep look again and I do have the following additional comments:
> > 
> > 1) Can we add the HCatalog tests into ThirdPartyTest suite? https://github.com/apache/sqoop/blob/trunk/src/test/com/cloudera/sqoop/ThirdPartyTests.java
> > 
> > 2) It seems that using --create-hcatalog-table will create the table and exit Sqoop without doing the import:
> > 
> > [root@bousa-hcat ~]# sqoop import --connect jdbc:mysql://mysql.ent.cloudera.com/sqoop --username sqoop --password sqoop --table text --hcatalog-table text --create-hcatalog-table
> > 13/06/04 15:44:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> > 13/06/04 15:44:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
> > 13/06/04 15:44:39 INFO tool.CodeGenTool: Beginning code generation
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
> > 13/06/04 15:44:39 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar
> > Note: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.java uses or overrides a deprecated API.
> > Note: Recompile with -Xlint:deprecation for details.
> > 13/06/04 15:44:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.jar
> > 13/06/04 15:44:42 WARN manager.MySQLManager: It looks like you are importing from mysql.
> > 13/06/04 15:44:42 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
> > 13/06/04 15:44:42 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
> > 13/06/04 15:44:42 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
> > 13/06/04 15:44:42 INFO mapreduce.ImportJobBase: Beginning import of text
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog for import job
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific details for job
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: Hive home is not set. job may fail if needed jar files are not found correctly.  Please set HIVE_HOME in sqoop-env.sh or provide --hive-home option.  Setting HIVE_HOME  to /usr/lib/hive
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: HCatalog home is not set. job may fail if needed jar files are not found correctly.  Please set HCAT_HOME in sqoop-env.sh or provide --hcatalog-home option.   Setting HCAT_HOME to /usr/lib/hcatalog
> > 13/06/04 15:44:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column names projected : [id, txt]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column name - type map :
> >         Names: [id, txt]
> >         Types : [4, 12]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Creating HCatalog table default.text for import
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: HCatalog Create table statement: 
> > 
> > create table default.text (
> >         id int,
> >         txt string)
> > stored as rcfile
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Executing HCatalog CLI in-process.
> > Hive history file=/tmp/root/hive_job_log_65f4f145-0b1e-4e09-8e40-b7edcfc15f83_2077084453.txt
> > OK
> > Time taken: 25.121 seconds
> > [root@bousa-hcat ~]#
> > 
> >
> 
> Venkat Ranganathan wrote:
>     Sure, I can add it to that.
>     
>     --create-hcatalog-table -  It seems to work by chance - That is, after creating the table a bunch of stuff is done that is not needed.   I will add additional checks there

Sorry, I misunderstood your observation. There is even a test case for this. What I thought you said was that using just --create-hcatalog-table also works like the --create-hive-table option without a Hive import. Let me recheck this.

Thanks


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21420
-----------------------------------------------------------


On June 3, 2013, 4:16 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 3, 2013, 4:16 a.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration.  A unit test to test the option management is also added.   All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.

> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > Hi Venkat,
> > Thank you for incorporating my comments, greatly appreciated. I've taken a deep look again and I do have the following additional comments:
> > 
> > 1) Can we add the HCatalog tests into ThirdPartyTest suite? https://github.com/apache/sqoop/blob/trunk/src/test/com/cloudera/sqoop/ThirdPartyTests.java
> > 
> > 2) It seems that using --create-hcatalog-table will create the table and exit Sqoop without doing the import:
> > 
> > [root@bousa-hcat ~]# sqoop import --connect jdbc:mysql://mysql.ent.cloudera.com/sqoop --username sqoop --password sqoop --table text --hcatalog-table text --create-hcatalog-table
> > 13/06/04 15:44:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> > 13/06/04 15:44:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
> > 13/06/04 15:44:39 INFO tool.CodeGenTool: Beginning code generation
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
> > 13/06/04 15:44:39 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar
> > Note: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.java uses or overrides a deprecated API.
> > Note: Recompile with -Xlint:deprecation for details.
> > 13/06/04 15:44:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.jar
> > 13/06/04 15:44:42 WARN manager.MySQLManager: It looks like you are importing from mysql.
> > 13/06/04 15:44:42 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
> > 13/06/04 15:44:42 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
> > 13/06/04 15:44:42 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
> > 13/06/04 15:44:42 INFO mapreduce.ImportJobBase: Beginning import of text
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog for import job
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific details for job
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: Hive home is not set. job may fail if needed jar files are not found correctly.  Please set HIVE_HOME in sqoop-env.sh or provide --hive-home option.  Setting HIVE_HOME  to /usr/lib/hive
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: HCatalog home is not set. job may fail if needed jar files are not found correctly.  Please set HCAT_HOME in sqoop-env.sh or provide --hcatalog-home option.   Setting HCAT_HOME to /usr/lib/hcatalog
> > 13/06/04 15:44:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column names projected : [id, txt]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column name - type map :
> >         Names: [id, txt]
> >         Types : [4, 12]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Creating HCatalog table default.text for import
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: HCatalog Create table statement: 
> > 
> > create table default.text (
> >         id int,
> >         txt string)
> > stored as rcfile
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Executing HCatalog CLI in-process.
> > Hive history file=/tmp/root/hive_job_log_65f4f145-0b1e-4e09-8e40-b7edcfc15f83_2077084453.txt
> > OK
> > Time taken: 25.121 seconds
> > [root@bousa-hcat ~]#
> > 
> >
> 
> Venkat Ranganathan wrote:
>     Sure, I can add it to that.
>     
>     --create-hcatalog-table -  It seems to work by chance - That is, after creating the table a bunch of stuff is done that is not needed.   I will add additional checks there
> 
> Venkat Ranganathan wrote:
>     Sorry I misunderstood your observation - There is even a test case to test this.   What I thought you said was just using --create-hcatalog-table also works like the --create-hive-table option without hive import.   Let me recheck this.
>     
>     Thanks
> 
> Jarek Cecho wrote:
>     Hi Venkat,
>     please accept my apology for the confusion and let me explain a bit better. I've noticed that when I'm using the parameter --create-hcatalog-table, the logger gets reconfigured and there is no Sqoop log available after the table is created. Notice that there is no log after the "Time taken...".

Yes. That is what I am debugging. My system tests on a real cluster passed, but that was by comparing the results of the action. There are a few issues. Hive does not have a logging configuration in place: a template is provided, and until the user creates a logger configuration, it is not helpful. I tried to pass in the Hive logging configuration on the command line, but then we have to pass in a whole lot of things on the command line. So I have decided to disable in-line execution of HCat scripts in real usage mode and to support in-line usage for tests only, where the configuration files I have already checked in should help.

BTW, I think this is also an issue with HiveImport. There it is the last part of the import, so it is OK, but we will still have issues with any output there.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 491
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line491>
> >
> >     Both Hive and HBase are idempotent when creating tables, so It might make sense to add "IF NOT EXISTS" in order to remain consistent.
> 
> Venkat Ranganathan wrote:
>     Good point.  I think we will otherwise earlier, but for consistency I think we should do this.   Will change

I went through this, and we use --create-hcatalog-table to mean that the table has to be created, and the assumption is that the table is not there. We will fail if the table is there. So I have decided to leave this as it is, but I have added a test case to test this scenario specifically.
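
To make the two options concrete, the sketch below builds a create-table statement in the shape shown in the log quoted earlier in this thread, with an optional "if not exists" clause. The class, method and parameter names are hypothetical and the real SqoopHCatUtilities code may be structured quite differently:

```java
import java.util.Map;

// Hypothetical statement builder for illustration only.
final class HCatCreateTableSketch {
  private HCatCreateTableSketch() { }

  /**
   * Builds a create-table statement for the given columns.
   *
   * @param columns     column name to HCatalog type, in declaration order
   * @param ifNotExists when true the statement is idempotent; when false
   *                    execution is expected to fail if the table exists
   */
  static String createTableStatement(String db, String table,
      Map<String, String> columns, boolean ifNotExists) {
    StringBuilder sb = new StringBuilder("create table ");
    if (ifNotExists) {
      sb.append("if not exists ");
    }
    sb.append(db).append('.').append(table).append(" (\n");
    boolean first = true;
    for (Map.Entry<String, String> col : columns.entrySet()) {
      if (!first) {
        sb.append(",\n");
      }
      sb.append('\t').append(col.getKey()).append(' ').append(col.getValue());
      first = false;
    }
    sb.append(")\nstored as rcfile");
    return sb.toString();
  }
}
```

Passing false for ifNotExists corresponds to the behavior described above, where an existing table is treated as an error instead of being reused silently.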


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21420
-----------------------------------------------------------




Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.

> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > Hi Venkat,
> > Thank you for incorporating my comments, greatly appreciated. I've taken a deep look again and I do have the following additional comments:
> > 
> > 1) Can we add the HCatalog tests into ThirdPartyTest suite? https://github.com/apache/sqoop/blob/trunk/src/test/com/cloudera/sqoop/ThirdPartyTests.java
> > 
> > 2) It seems that using --create-hcatalog-table will create the table and exit Sqoop without doing the import:
> > 
> > [root@bousa-hcat ~]# sqoop import --connect jdbc:mysql://mysql.ent.cloudera.com/sqoop --username sqoop --password sqoop --table text --hcatalog-table text --create-hcatalog-table
> > 13/06/04 15:44:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> > 13/06/04 15:44:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
> > 13/06/04 15:44:39 INFO tool.CodeGenTool: Beginning code generation
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
> > 13/06/04 15:44:39 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar
> > Note: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.java uses or overrides a deprecated API.
> > Note: Recompile with -Xlint:deprecation for details.
> > 13/06/04 15:44:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.jar
> > 13/06/04 15:44:42 WARN manager.MySQLManager: It looks like you are importing from mysql.
> > 13/06/04 15:44:42 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
> > 13/06/04 15:44:42 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
> > 13/06/04 15:44:42 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
> > 13/06/04 15:44:42 INFO mapreduce.ImportJobBase: Beginning import of text
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog for import job
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific details for job
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: Hive home is not set. job may fail if needed jar files are not found correctly.  Please set HIVE_HOME in sqoop-env.sh or provide --hive-home option.  Setting HIVE_HOME  to /usr/lib/hive
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: HCatalog home is not set. job may fail if needed jar files are not found correctly.  Please set HCAT_HOME in sqoop-env.sh or provide --hcatalog-home option.   Setting HCAT_HOME to /usr/lib/hcatalog
> > 13/06/04 15:44:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column names projected : [id, txt]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column name - type map :
> >         Names: [id, txt]
> >         Types : [4, 12]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Creating HCatalog table default.text for import
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: HCatalog Create table statement: 
> > 
> > create table default.text (
> >         id int,
> >         txt string)
> > stored as rcfile
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Executing HCatalog CLI in-process.
> > Hive history file=/tmp/root/hive_job_log_65f4f145-0b1e-4e09-8e40-b7edcfc15f83_2077084453.txt
> > OK
> > Time taken: 25.121 seconds
> > [root@bousa-hcat ~]#
> > 
> >
> 
> Venkat Ranganathan wrote:
>     Sure, I can add it to that.
>     
>     --create-hcatalog-table: it seems to work by chance. That is, after creating the table, a bunch of work is done that is not needed. I will add additional checks there.
> 
> Venkat Ranganathan wrote:
>     Sorry, I misunderstood your observation. There is even a test case for this. What I thought you said was that using just --create-hcatalog-table also works like the --create-hive-table option without a Hive import. Let me recheck this.
>     
>     Thanks

Hi Venkat,
please accept my apology for the confusion and let me explain a bit better. I've noticed that when I use the --create-hcatalog-table parameter, the logger gets reconfigured and no Sqoop log output is available after the table is created. Notice that there is no log output after the "Time taken..." line.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java, lines 131-137
> > <https://reviews.apache.org/r/10688/diff/9/?file=299874#file299874line131>
> >
> >     This method seems to be required only for the debug message. Is it the only reason or did I miss something?
> 
> Venkat Ranganathan wrote:
>     Yes, it is needed for debugging purposes, when we want to know when the sub record reader or the main record reader is called.

I see, thank you.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 523
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line523>
> >
> >     It seems that at this point we are not reading the Hive configuration files, yet we are executing the in-process Hive CLI. As a result, it will not pick up the configuration files and will use defaults, which is inconsistent with the executed MapReduce job that will use the proper configuration files. Consequently, the table will be created in a different metastore than the one into which we are importing data.
> 
> Venkat Ranganathan wrote:
>     The Hive and HCatalog configuration files and jars have to be in the classpath brought in by hcat -classpath. Do you think that is not always the case? When I update the configure-sqoop script, I will make sure the Hive conf is added.

Yeah, it seems that HCatalog 0.5.0 is not putting the Hive configuration directory on the classpath, at least in my environment.
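
One way to detect this situation up front is to check whether the configuration file is visible as a classpath resource before running the in-process CLI. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes that Hive locates hive-site.xml through the class loader.

```java
final class ClasspathCheckSketch {
    // Hypothetical helper, not part of the patch: returns true when the
    // named resource can be found through the current class loader.
    static boolean isOnClasspath(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the system class loader if no context loader is set.
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(resourceName) != null;
    }

    public static void main(String[] args) {
        // Assumption: hive-site.xml is the file name of interest.
        System.out.println("hive-site.xml on classpath: "
            + isOnClasspath("hive-site.xml"));
    }
}
```

If the check returns false, a warning could be logged before the table is created, so that a mismatch between metastores is easier to diagnose.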


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21420
-----------------------------------------------------------


On June 3, 2013, 4:16 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 3, 2013, 4:16 a.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.

> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > Hi Venkat,
> > Thank you for incorporating my comments, greatly appreciated. I've taken a deep look again and I have the following additional comments:
> > 
> > 1) Can we add the HCatalog tests into ThirdPartyTest suite? https://github.com/apache/sqoop/blob/trunk/src/test/com/cloudera/sqoop/ThirdPartyTests.java
> > 
> > 2) It seems that using --create-hcatalog-table will create the table and exit Sqoop without doing the import:
> > 
> > [root@bousa-hcat ~]# sqoop import --connect jdbc:mysql://mysql.ent.cloudera.com/sqoop --username sqoop --password sqoop --table text --hcatalog-table text --create-hcatalog-table
> > 13/06/04 15:44:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> > 13/06/04 15:44:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
> > 13/06/04 15:44:39 INFO tool.CodeGenTool: Beginning code generation
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
> > 13/06/04 15:44:39 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar
> > Note: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.java uses or overrides a deprecated API.
> > Note: Recompile with -Xlint:deprecation for details.
> > 13/06/04 15:44:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.jar
> > 13/06/04 15:44:42 WARN manager.MySQLManager: It looks like you are importing from mysql.
> > 13/06/04 15:44:42 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
> > 13/06/04 15:44:42 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
> > 13/06/04 15:44:42 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
> > 13/06/04 15:44:42 INFO mapreduce.ImportJobBase: Beginning import of text
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog for import job
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific details for job
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: Hive home is not set. job may fail if needed jar files are not found correctly.  Please set HIVE_HOME in sqoop-env.sh or provide --hive-home option.  Setting HIVE_HOME  to /usr/lib/hive
> > 13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: HCatalog home is not set. job may fail if needed jar files are not found correctly.  Please set HCAT_HOME in sqoop-env.sh or provide --hcatalog-home option.   Setting HCAT_HOME to /usr/lib/hcatalog
> > 13/06/04 15:44:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column names projected : [id, txt]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column name - type map :
> >         Names: [id, txt]
> >         Types : [4, 12]
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Creating HCatalog table default.text for import
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: HCatalog Create table statement: 
> > 
> > create table default.text (
> >         id int,
> >         txt string)
> > stored as rcfile
> > 13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Executing HCatalog CLI in-process.
> > Hive history file=/tmp/root/hive_job_log_65f4f145-0b1e-4e09-8e40-b7edcfc15f83_2077084453.txt
> > OK
> > Time taken: 25.121 seconds
> > [root@bousa-hcat ~]#
> > 
> >

Sure, I can add it to that.

--create-hcatalog-table: it seems to work by chance. That is, after creating the table, a bunch of work is done that is not needed. I will add additional checks there.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/docs/user/hcatalog.txt, lines 284-285
> > <https://reviews.apache.org/r/10688/diff/9/?file=299864#file299864line284>
> >
> >     This seems unnecessary; can we tweak the bash scripts to do this automatically if the hcat command is present?

Good point. Since I modified the Hive unit tests to function correctly in the presence of a real Hive environment, this can be done easily.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, line 95
> > <https://reviews.apache.org/r/10688/diff/9/?file=299870#file299870line95>
> >
> >     Nit: I think that this line can also be refactored into the parent class, right?

Yes. One thing to note is that by moving isHCatJob to the parent class we lost the ability to mark it as final. Let me rework it.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ImportJobBase.java, line 85
> > <https://reviews.apache.org/r/10688/diff/9/?file=299871#file299871line85>
> >
> >     Nit: I think that this line can also be refactored into the parent class, right?

Please see above


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java, lines 131-137
> > <https://reviews.apache.org/r/10688/diff/9/?file=299874#file299874line131>
> >
> >     This method seems to be required only for the debug message. Is it the only reason or did I miss something?

Yes, it is needed for debugging purposes, when we want to know when the sub record reader or the main record reader is called.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, lines 237-241
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line237>
> >
> >     Nit: It seems that we are doing options = opts; in all cases, so maybe it would be worth putting this line before the "if" statement?

Sure


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 249
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line249>
> >
> >     Nit: Shouldn't the default Hive home come from SqoopOptions.getDefaultHiveHome()?

Yes.   The message needs fixing


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 257
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line257>
> >
> >     Shouldn't the default HCatalog home come from SqoopOptions.getDefaultHcatHome()?

Yes.  As above


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 491
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line491>
> >
> >     Both Hive and HBase are idempotent when creating tables, so it might make sense to add "IF NOT EXISTS" in order to remain consistent.

Good point. I think we would fail earlier otherwise, but for consistency I think we should do this. Will change.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 523
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line523>
> >
> >     It seems that at this point we are not reading the Hive configuration files, yet we are executing the in-process Hive CLI. As a result, it will not pick up the configuration files and will use defaults, which is inconsistent with the executed MapReduce job that will use the proper configuration files. Consequently, the table will be created in a different metastore than the one into which we are importing data.

The Hive and HCatalog configuration files and jars have to be in the classpath brought in by hcat -classpath. Do you think that is not always the case? When I update the configure-sqoop script, I will make sure the Hive conf is added.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, lines 749-750
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line749>
> >
> >     Shouldn't we use here the SqoopOptions.getDefaultHiveHome()?

Yes. Will fix.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, lines 871-875
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line871>
> >
> >     Nit: Those lines seem to be unused.

Good catch - earlier I had the ability to execute a command line but removed it in favor of a simpler model.  Will remove it


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 876
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line876>
> >
> >     Can we write the file to a temporary directory rather than the current working directory (which might not be writable)?

Sure will change


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/docs/user/hcatalog.txt, line 160
> > <https://reviews.apache.org/r/10688/diff/9/?file=299864#file299864line160>
> >
> >     Can we add information here about what will happen if the table already exists and this parameter is specified?

Sure.   Will do.


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, lines 898-899
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line898>
> >
> >     I would suggest changing this to a single line:
> >     
> >     LOG.error("Error writing HCatalog load-in script: ", ioe);
> >     
> >     That will also print the stack trace.

Sure will do


> On June 4, 2013, 11:15 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, lines 906-907
> > <https://reviews.apache.org/r/10688/diff/9/?file=299879#file299879line906>
> >
> >     I would suggest changing this line to:
> >     
> >     LOG.warn("IOException closing stream to HCatalog script: ", ioe);
> >     
> >     That will also print out the stack trace.

Sure will do


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21420
-----------------------------------------------------------


On June 3, 2013, 4:16 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 3, 2013, 4:16 a.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21420
-----------------------------------------------------------


Hi Venkat,
Thank you for incorporating my comments, greatly appreciated. I've taken a deep look again and I have the following additional comments:

1) Can we add the HCatalog tests into ThirdPartyTest suite? https://github.com/apache/sqoop/blob/trunk/src/test/com/cloudera/sqoop/ThirdPartyTests.java

2) It seems that using --create-hcatalog-table will create the table and exit Sqoop without doing the import:

[root@bousa-hcat ~]# sqoop import --connect jdbc:mysql://mysql.ent.cloudera.com/sqoop --username sqoop --password sqoop --table text --hcatalog-table text --create-hcatalog-table
13/06/04 15:44:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/06/04 15:44:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
13/06/04 15:44:39 INFO tool.CodeGenTool: Beginning code generation
13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
13/06/04 15:44:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
13/06/04 15:44:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
13/06/04 15:44:39 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar
Note: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/06/04 15:44:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/f726ee2a04cf955e797a4932d94668f7/text.jar
13/06/04 15:44:42 WARN manager.MySQLManager: It looks like you are importing from mysql.
13/06/04 15:44:42 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
13/06/04 15:44:42 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
13/06/04 15:44:42 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
13/06/04 15:44:42 INFO mapreduce.ImportJobBase: Beginning import of text
13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog for import job
13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific details for job
13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: Hive home is not set. job may fail if needed jar files are not found correctly.  Please set HIVE_HOME in sqoop-env.sh or provide --hive-home option.  Setting HIVE_HOME  to /usr/lib/hive
13/06/04 15:44:42 WARN hcat.SqoopHCatUtilities: HCatalog home is not set. job may fail if needed jar files are not found correctly.  Please set HCAT_HOME in sqoop-env.sh or provide --hcatalog-home option.   Setting HCAT_HOME to /usr/lib/hcatalog
13/06/04 15:44:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `text` AS t LIMIT 1
13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column names projected : [id, txt]
13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Database column name - type map :
        Names: [id, txt]
        Types : [4, 12]
13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Creating HCatalog table default.text for import
13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: HCatalog Create table statement: 

create table default.text (
        id int,
        txt string)
stored as rcfile
13/06/04 15:44:42 INFO hcat.SqoopHCatUtilities: Executing HCatalog CLI in-process.
Hive history file=/tmp/root/hive_job_log_65f4f145-0b1e-4e09-8e40-b7edcfc15f83_2077084453.txt
OK
Time taken: 25.121 seconds
[root@bousa-hcat ~]#




src/docs/user/hcatalog.txt
<https://reviews.apache.org/r/10688/#comment44346>

    Can we add information here about what will happen if the table already exists and this parameter is specified?



src/docs/user/hcatalog.txt
<https://reviews.apache.org/r/10688/#comment44347>

    This seems unnecessary; can we tweak the bash scripts to do this automatically if the hcat command is present?



src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10688/#comment44350>

    Nit: I think that this line can also be refactored into the parent class, right?



src/java/org/apache/sqoop/mapreduce/ImportJobBase.java
<https://reviews.apache.org/r/10688/#comment44349>

    Nit: I think that this line can also be refactored into the parent class, right?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java
<https://reviews.apache.org/r/10688/#comment44351>

    This method seems to be required only for the debug message. Is it the only reason or did I miss something?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44369>

    Nit: It seems that we are doing options = opts; in all cases, so maybe it would be worth putting this line before the "if" statement?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44371>

    Nit: Shouldn't the default Hive home come from SqoopOptions.getDefaultHiveHome()?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44372>

    Shouldn't the default HCatalog home come from SqoopOptions.getDefaultHcatHome()?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44397>

    Both Hive and HBase are idempotent when creating tables, so it might make sense to add "IF NOT EXISTS" in order to remain consistent.
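
To illustrate the suggestion, here is a rough sketch of building the create-table statement with the "if not exists" clause. The class, method, and parameter names are hypothetical and are not taken from the patch; the column layout mirrors the statement shown in the log output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CreateTableSketch {
    // Hypothetical helper: builds an idempotent create-table statement.
    // Columns are passed as an ordered name -> type map.
    static String createTableStatement(String db, String table,
            Map<String, String> columns, String storageStanza) {
        StringBuilder sb = new StringBuilder();
        sb.append("create table if not exists ")
          .append(db).append('.').append(table).append(" (\n");
        boolean first = true;
        for (Map.Entry<String, String> col : columns.entrySet()) {
            if (!first) {
                sb.append(",\n");
            }
            sb.append('\t').append(col.getKey()).append(' ').append(col.getValue());
            first = false;
        }
        sb.append(")\n").append(storageStanza);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> cols = new LinkedHashMap<>();
        cols.put("id", "int");
        cols.put("txt", "string");
        System.out.println(
            createTableStatement("default", "text", cols, "stored as rcfile"));
    }
}
```

With this form, running the import twice against the same target would not fail at the table-creation step merely because the table already exists.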



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44395>

    It seems that at this point we are not reading the Hive configuration files, yet we are executing the in-process Hive CLI. As a result, it will not pick up the configuration files and will use defaults, which is inconsistent with the executed MapReduce job that will use the proper configuration files. Consequently, the table will be created in a different metastore than the one into which we are importing data.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44378>

    Shouldn't we use here the SqoopOptions.getDefaultHiveHome()?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44379>

    Shouldn't we use here the SqoopOptions.getDefaultHCatHome()?



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44382>

    Nit: Considering that there might be a Hadoop 3 in the future, would it be simpler to change the condition to (isLocalMode and isHadoop1) instead of enumerating all other possible Hadoop versions?
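
A small sketch of the difference between the two styles of check follows. Everything in it is hypothetical: the method names, the version prefixes used to recognize each Hadoop line, and the shape of the existing condition are assumptions made for illustration, not the code under review.

```java
final class HadoopVersionCheckSketch {
    // Assumption: these prefixes identify the Hadoop 1 line.
    static boolean isHadoop1(String version) {
        return version.startsWith("1.") || version.startsWith("0.20.");
    }

    // Enumerating style: lists the newer versions known today and treats
    // everything else as Hadoop 1. An unlisted future version slips through.
    static boolean enumeratingCheck(boolean isLocalMode, String version) {
        boolean isKnownNewer =
            version.startsWith("2.") || version.startsWith("0.23.");
        return isLocalMode && !isKnownNewer;
    }

    // Positive style: only matches when the version is recognized as Hadoop 1.
    static boolean positiveCheck(boolean isLocalMode, String version) {
        return isLocalMode && isHadoop1(version);
    }

    public static void main(String[] args) {
        String future = "3.0.0";
        System.out.println("enumerating: " + enumeratingCheck(true, future));
        System.out.println("positive:    " + positiveCheck(true, future));
    }
}
```

The positive check needs no change when a new major version appears, which is the point of the nit.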



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44386>

    Nit: Those lines seem to be unused.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44387>

    Can we write the file to a temporary directory rather than the current working directory (which might not be writable)?
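
As a minimal sketch of what this could look like, the standard java.nio.file API can create the script in the JVM temporary directory. The class name, method name, and file prefix below are hypothetical and not taken from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempScriptWriterSketch {
    // Hypothetical helper: writes the script contents to a new file in the
    // default temporary directory (java.io.tmpdir) rather than the current
    // working directory, and returns the path of that file.
    static Path writeScript(String contents) {
        try {
            Path script = Files.createTempFile("hcat-script-", ".txt");
            // Ask the JVM to remove the file on normal exit.
            script.toFile().deleteOnExit();
            Files.write(script, contents.getBytes(StandardCharsets.UTF_8));
            return script;
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
    }

    public static void main(String[] args) {
        Path p = writeScript("create table default.text (id int, txt string)");
        System.out.println("Script written to " + p);
    }
}
```

Files.createTempFile also generates a unique file name, so concurrent Sqoop invocations would not overwrite each other's scripts.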



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44388>

    I would suggest changing this to a single line:
    
    LOG.error("Error writing HCatalog load-in script: ", ioe);
    
    That will also print the stack trace.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment44389>

    I would suggest changing this line to:
    
    LOG.warn("IOException closing stream to HCatalog script: ", ioe);
    
    That will also print out the stack trace.
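
The difference between concatenating the exception into the message and passing it as a separate argument can be sketched as follows. To keep the example self-contained, java.util.logging stands in for the logging API used in the patch, and the class and handler names are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LogWithCauseSketch {
    // Collects log records in memory so they can be inspected afterwards.
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();
        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    static CapturingHandler logBothWays(Logger log, IOException ioe) {
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);
        try {
            // Concatenation: only the exception's toString() reaches the log.
            log.log(Level.SEVERE, "Error writing HCatalog load-in script: " + ioe);
            // Separate argument: the throwable is attached to the record,
            // so a formatter can print the full stack trace.
            log.log(Level.SEVERE, "Error writing HCatalog load-in script: ", ioe);
        } finally {
            log.removeHandler(handler);
        }
        return handler;
    }

    public static void main(String[] args) {
        Logger log = Logger.getAnonymousLogger();
        CapturingHandler h = logBothWays(log, new IOException("disk full"));
        System.out.println("throwable attached to first record:  "
            + (h.records.get(0).getThrown() != null));
        System.out.println("throwable attached to second record: "
            + (h.records.get(1).getThrown() != null));
    }
}
```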


Jarcec

- Jarek Cecho


On June 3, 2013, 4:16 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 3, 2013, 4:16 a.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21577
-----------------------------------------------------------

Ship it!


- Jarek Cecho


On June 7, 2013, 2:03 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 7, 2013, 2:03 a.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   bin/configure-sqoop 61ff3f2 
>   bin/configure-sqoop.cmd f5fd608 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated June 7, 2013, 2:03 a.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Latest changes, with the identified issue fixed.

Thanks, Jarek, for a thorough review; it is very much appreciated. I will upload the patch to JIRA as well.


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  bin/configure-sqoop 61ff3f2 
  bin/configure-sqoop.cmd f5fd608 
  build.xml 636c103 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/docs/user/SqoopUserGuide.txt 01ac1cf 
  src/docs/user/hcatalog.txt PRE-CREATION 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.

> On June 7, 2013, 1:03 a.m., Jarek Cecho wrote:
> > bin/configure-sqoop, line 61
> > <https://reviews.apache.org/r/10688/diff/10-11/?file=301227#file301227line61>
> >
> >     Nit: there seems to be an extra "`" that is breaking the script.

My apologies. After testing, I made sure the merged branch was all right, and I might have introduced this by mistake in the process. I also noticed that I had not made the corresponding change in configure-sqoop.cmd for Windows; I have added that now.

Thanks

Venkat


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21552
-----------------------------------------------------------


On June 6, 2013, 10:55 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 6, 2013, 10:55 p.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   bin/configure-sqoop 61ff3f2 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21552
-----------------------------------------------------------

Ship it!


Hi Venkat,
please fix this one last typo and attach the patch to JIRA; I'll go ahead and commit it!


bin/configure-sqoop
<https://reviews.apache.org/r/10688/#comment44592>

    Nit: there seems to be an extra "`" that is breaking the script.


Jarcec

- Jarek Cecho


On June 6, 2013, 10:55 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 6, 2013, 10:55 p.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   bin/configure-sqoop 61ff3f2 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated June 6, 2013, 10:55 p.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Updated the patch to address review comments: use a backwards-compatible API for joining Strings.
Fixed the script so that it works correctly with HCatalog jobs on BigTop.
Other fixes.
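
For illustration, one backwards-compatible way to join a String array is to avoid version-dependent overloads altogether and build the result with core Java only. The sketch below is not the code from the patch; the class name JoinSketch and its join method are hypothetical, and the argArray name is borrowed from the compiler error quoted elsewhere in this thread.

```java
final class JoinSketch {
  // Hypothetical helper: joins the parts with the separator using only
  // java.lang classes, so it does not depend on which join overloads a
  // given library version happens to provide.
  static String join(String separator, String[] parts) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) {
        sb.append(separator);
      }
      sb.append(parts[i]);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String[] argArray = {"a", "b", "c"};
    System.out.println(join(",", argArray));
  }
}
```

Wrapping the array with Arrays.asList before calling StringUtils.join, as suggested in the review, is another option wherever the Iterable overload is available.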


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  bin/configure-sqoop 61ff3f2 
  build.xml 636c103 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/docs/user/SqoopUserGuide.txt 01ac1cf 
  src/docs/user/hcatalog.txt PRE-CREATION 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.

> On June 6, 2013, 6:34 p.m., Jarek Cecho wrote:
> > Hi Venkat,
> > thank you very much for incorporating all my suggestions. I believe that we are almost at the end. I did some more testing and noticed a few issues (some of them created by my own suggestions):
> > 
> > 1) I see compilation failure
> >     [javac] /home/jarcec/apache/repos/sqoop/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java:877: join(java.lang.CharSequence,java.lang.Iterable<?>) in org.apache.hadoop.util.StringUtils cannot be applied to (java.lang.String,java.lang.String[])
> >     [javac]     String argLine = StringUtils.join(",", argArray);
> > 
> > I've fixed that by changing the line to String argLine = StringUtils.join(",", Arrays.asList(argArray)) to unblock the review; however, the proper solution is up to you :-)
> > 
> > 2) We've changed the hardcoded paths to the Hive and HCatalog homes to SqoopOptions.getHiveHomeDefault() (or the HCatalog equivalent); however, those two methods can actually return null, which causes ClassNotFoundExceptions later in the code. What about improving them in a similar fashion:
> > 
> >   public static String getHiveHomeDefault() {
> >     // Set this with $HIVE_HOME, but -Dhive.home can override.
> >     // System.getenv takes no default argument, so fall back explicitly.
> >     String hiveHome = System.getenv("HIVE_HOME");
> >     if (hiveHome == null) {
> >       hiveHome = "/usr/lib/hive";
> >     }
> >     return System.getProperty("hive.home", hiveHome);
> >   }

Thanks for the review

1) I did run all the tests with the hadoop100 profile, but it looks like StringUtils.join(String, String[]) is a newer addition. Unfortunately, there is no @since tag in the javadocs :( Sorry about that.
2) Good catch. I will fix it and use the default values I was using before for these two.
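
The fallback order discussed in point 2 can be sketched as follows, assuming the intended precedence is the -Dhive.home system property first, then $HIVE_HOME, then a built-in default. The HomeDefaults class and its resolve helper are hypothetical names used only for illustration; /usr/lib/hive is the default suggested in the review, and the defaults actually used in the patch may differ.

```java
final class HomeDefaults {
  // Hypothetical helper: returns the first non-null value in precedence
  // order (system property, environment variable, built-in fallback), so
  // the result is never null as long as the fallback is non-null.
  static String resolve(String propertyValue, String envValue, String fallback) {
    if (propertyValue != null) {
      return propertyValue;
    }
    if (envValue != null) {
      return envValue;
    }
    return fallback;
  }

  // Assumed default location; taken from the suggestion in the review.
  static String getHiveHomeDefault() {
    return resolve(System.getProperty("hive.home"),
        System.getenv("HIVE_HOME"),
        "/usr/lib/hive");
  }

  public static void main(String[] args) {
    System.out.println(getHiveHomeDefault());
  }
}
```

Keeping the precedence logic in a helper that takes plain arguments makes it possible to test the null handling without modifying the process environment.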


> On June 6, 2013, 6:34 p.m., Jarek Cecho wrote:
> > bin/configure-sqoop, line 118
> > <https://reviews.apache.org/r/10688/diff/10/?file=301227#file301227line118>
> >
> >     Nit: Add HCatalog to dependency list

Will fix


> On June 6, 2013, 6:34 p.m., Jarek Cecho wrote:
> > bin/configure-sqoop, line 118
> > <https://reviews.apache.org/r/10688/diff/10/?file=301227#file301227line118>
> >
> >     Nit: Add HCatalog to dependency list

Will fix


> On June 6, 2013, 6:34 p.m., Jarek Cecho wrote:
> > bin/configure-sqoop, line 120
> > <https://reviews.apache.org/r/10688/diff/10/?file=301227#file301227line120>
> >
> >     The rest of Sqoop expects the variable HADOOP_COMMON_HOME, whereas the underlying hcat script expects HADOOP_HOME, so on BigTop this line ends with:
> >     
> >     Hadoop not found.
> >     
> >     I was able to work around it by adding the following line before the highlighted line:
> >     
> >     export HADOOP_HOME=$HADOOP_COMMON_HOME
> >     
> >     However, I'm not sure whether this is the best solution or not :-/

I think that sounds like a good fix. Thanks for that. Let me add it, along with a comment so that it is not accidentally removed in the future.


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21527
-----------------------------------------------------------


On June 6, 2013, midnight, Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 6, 2013, midnight)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   bin/configure-sqoop 61ff3f2 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites, with more than 20 tests in total, have been added to cover various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review21527
-----------------------------------------------------------


Hi Venkat,
thank you very much for incorporating all my suggestions. I believe that we are almost at the end. I did some more testing and noticed a few issues (some of them created by my own suggestions):

1) I see compilation failure
    [javac] /home/jarcec/apache/repos/sqoop/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java:877: join(java.lang.CharSequence,java.lang.Iterable<?>) in org.apache.hadoop.util.StringUtils cannot be applied to (java.lang.String,java.lang.String[])
    [javac]     String argLine = StringUtils.join(",", argArray);

I've fixed that by changing the line to String argLine = StringUtils.join(",", Arrays.asList(argArray)) to unblock the review; however, the proper solution is up to you :-)

2) We've changed the hardcoded paths to the Hive and HCatalog homes to SqoopOptions.getHiveHomeDefault() (or the HCatalog equivalent); however, those two methods can actually return null, which causes ClassNotFoundExceptions later in the code. What about improving them in a similar fashion:

  public static String getHiveHomeDefault() {
    // Set this with $HIVE_HOME, but -Dhive.home can override.
    // System.getenv takes no default argument, so fall back explicitly.
    String hiveHome = System.getenv("HIVE_HOME");
    if (hiveHome == null) {
      hiveHome = "/usr/lib/hive";
    }
    return System.getProperty("hive.home", hiveHome);
  }


bin/configure-sqoop
<https://reviews.apache.org/r/10688/#comment44563>

    Nit: Add HCatalog to dependency list



bin/configure-sqoop
<https://reviews.apache.org/r/10688/#comment44566>

    Nit: Add HCatalog to dependency list



bin/configure-sqoop
<https://reviews.apache.org/r/10688/#comment44573>

    The rest of Sqoop expects the variable HADOOP_COMMON_HOME, whereas the underlying hcat script expects HADOOP_HOME, so on BigTop this line ends with:
    
    Hadoop not found.
    
    I was able to work around it by adding the following line before the highlighted line:
    
    export HADOOP_HOME=$HADOOP_COMMON_HOME
    
    However, I'm not sure whether this is the best solution or not :-/


Jarcec

- Jarek Cecho


On June 6, 2013, midnight, Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated June 6, 2013, midnight)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.
> 
> With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   bin/configure-sqoop 61ff3f2 
>   build.xml 636c103 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/docs/user/SqoopUserGuide.txt 01ac1cf 
>   src/docs/user/hcatalog.txt PRE-CREATION 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated June 6, 2013, midnight)


Review request for Sqoop and Jarek Cecho.


Changes
-------

New review changes. Fixed the documentation and added a new test to validate that we fail when create-hcatalog-table is provided with a preexisting table.

Removed inline hcat client execution. It causes issues with the logger configuration being reset by Hive, and specifying the Hive configuration on the command line would entail more significant changes.

For tests we still use inline hcat execution.


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  bin/configure-sqoop 61ff3f2 
  build.xml 636c103 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/docs/user/SqoopUserGuide.txt 01ac1cf 
  src/docs/user/hcatalog.txt PRE-CREATION 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/ThirdPartyTests.java 06f7122 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated June 3, 2013, 4:16 a.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Same as the previous one, except that it fixes the trailing blanks in the hcatalog.txt documentation. No real change to the rendered HTML. Sorry for one more minor update.


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 636c103 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/docs/user/SqoopUserGuide.txt 01ac1cf 
  src/docs/user/hcatalog.txt PRE-CREATION 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated June 2, 2013, 8:33 p.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

The following are the changes in this version:

All review comments addressed
Two new tests to check for invalid options (--as-avrofile and --as-sequencefile) with HCatalog jobs
Moved HCatalog tests to integration tests temporarily, pending the release of HCatalog artifacts for Hadoop 2.x
Added HCatalog docs to the user guide
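
For readers following along, the kind of check that the two invalid-option tests exercise can be sketched roughly as below. This is only an illustration, not the code from the patch: the class HCatOptionCheckSketch and its validate method are hypothetical, and the sketch assumes that an HCatalog job is recognised by the presence of the --hcatalog-table option.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for option validation; not the patch's code.
class HCatOptionCheckSketch {

    /** Returns an error message if the arguments are incompatible, or null if they are fine. */
    static String validate(List<String> args) {
        // Assumption: --hcatalog-table marks the job as an HCatalog job.
        boolean hcatJob = args.contains("--hcatalog-table");
        if (!hcatJob) {
            return null; // the restriction only applies to HCatalog jobs
        }
        for (String layout : Arrays.asList("--as-avrofile", "--as-sequencefile")) {
            if (args.contains(layout)) {
                return layout + " cannot be combined with --hcatalog-table";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Rejected: a file layout option on an HCatalog job.
        System.out.println(validate(Arrays.asList("--hcatalog-table", "orders", "--as-avrofile")));
        // Accepted: prints null because nothing is wrong.
        System.out.println(validate(Arrays.asList("--hcatalog-table", "orders")));
    }
}
```

Returning a message instead of throwing keeps the sketch easy to run; a real tool would more likely raise an exception at option-parsing time.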


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 636c103 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/docs/user/SqoopUserGuide.txt 01ac1cf 
  src/docs/user/hcatalog.txt PRE-CREATION 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogExportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated May 29, 2013, 8:54 p.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Updated patch with fixes for the latest review comments.

Thanks, Jarek, for a thorough review. We are still blocked on HIVE-4460.


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 636c103 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated May 24, 2013, 11:18 p.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Thanks, Jarek. I have updated the patch to address your comments and rebased it onto trunk. Verified with Hadoop 1.x and 2.x.


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 636c103 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 42f521f 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 2627726 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated May 4, 2013, 11:46 p.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Used consistent case for all column names internally in the HCatalog code. Also included Checkstyle fixes.
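
As a rough illustration of what keeping column case consistent can look like, here is a minimal sketch. It is not taken from the patch: the class ColumnCaseSketch and its methods are hypothetical, and the choice of lower case as the internal convention is an assumption made only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration; the patch's actual handling may differ.
class ColumnCaseSketch {

    /** Returns a copy of the map whose keys are all lower case (an assumed convention). */
    static Map<String, String> normalizeKeys(Map<String, String> columnTypes) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : columnTypes.entrySet()) {
            // Locale.ROOT avoids locale-specific surprises when lower-casing identifiers.
            out.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        return out;
    }

    /** Looks a column up regardless of how the caller capitalised its name. */
    static String typeOf(Map<String, String> normalized, String column) {
        return normalized.get(column.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> dbColumns = new LinkedHashMap<>();
        dbColumns.put("ORDER_ID", "INTEGER");
        dbColumns.put("Customer", "VARCHAR");
        Map<String, String> normalized = normalizeKeys(dbColumns);
        System.out.println(typeOf(normalized, "order_id"));
        System.out.println(typeOf(normalized, "CUSTOMER"));
    }
}
```

Normalising once at the boundary and again at lookup means the rest of the code never has to reason about the case a database or a user happened to supply.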


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 1c33fee 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated April 30, 2013, 6:56 a.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Fixed a bug where the position of the dynamic partition key caused partitioning keys to be wrongly inferred, and added one additional test case.


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 1c33fee 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated April 29, 2013, 11:21 p.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Incorporated the review changes: creating the HCatalog table automatically based on command-line options, Hive delimiter processing, and running the tests as part of the ant test target.


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 1c33fee 
  ivy.xml 1fa4dd1 
  ivy/ivysettings.xml c4cc561 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
  src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
  src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
  src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
  testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
  testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
  testdata/hcatalog/conf/log4j.properties PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan


Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/
-----------------------------------------------------------

(Updated April 24, 2013, 5:12 a.m.)


Review request for Sqoop and Jarek Cecho.


Changes
-------

Changed the mappers' key type to WritableComparable instead of a specific key type so that we can handle tables with different storage formats. Also added new tests for sequence and text format files.
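
The idea behind the key type change can be shown with a small self-contained sketch. None of the types below come from Hadoop or from the patch: Key is a hypothetical stand-in that plays the role of WritableComparable, and OffsetKey and NameKey are made-up examples of keys that different storage formats might hand to a mapper.

```java
// Hypothetical stand-ins only; none of these types come from Hadoop or from the patch.
class MapperKeySketch {

    /** Plays the role of a general key interface such as WritableComparable. */
    interface Key { }

    /** A key that one storage format might produce (assumption: a byte offset). */
    static final class OffsetKey implements Key {
        final long offset;
        OffsetKey(long offset) { this.offset = offset; }
    }

    /** A key that another storage format might produce (assumption: a name). */
    static final class NameKey implements Key {
        final String name;
        NameKey(String name) { this.name = name; }
    }

    /** Declared against the interface, so any key type is accepted; the key itself is ignored. */
    static String map(Key key, String record) {
        return record.trim();
    }

    public static void main(String[] args) {
        // The same map method handles records no matter which key type arrives with them.
        System.out.println(map(new OffsetKey(0L), " row-1 "));
        System.out.println(map(new NameKey("part-0"), " row-2 "));
    }
}
```

Had map been declared with OffsetKey as its parameter, the second call would not compile, which is the sketch's analogue of a mapper tied to one storage format.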


Description
-------

This patch implements the new feature of integrating HCatalog and Sqoop. With this feature, it is possible to import and export data between Sqoop and HCatalog tables. The document attached to the SQOOP-931 JIRA issue discusses the high-level approaches.

With this integration, more fidelity can be brought to the process of moving data between enterprise data stores and the Hadoop ecosystem.


Diffs (updated)
-----

  build.xml 1c33fee 
  ivy.xml 1fa4dd1 
  src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
  src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
  src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
  src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
  src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
  src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
  src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
  src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
  src/perftest/ExportStressTest.java 0a41408 
  src/test/com/cloudera/sqoop/TestHCatalogBasic.java PRE-CREATION 
  src/test/com/cloudera/sqoop/hcat/HCatalogExportManualTest.java PRE-CREATION 
  src/test/com/cloudera/sqoop/hcat/HCatalogImportManualTest.java PRE-CREATION 
  src/test/com/cloudera/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 

Diff: https://reviews.apache.org/r/10688/diff/


Testing
-------

Two new integration test suites with more than 20 tests in total have been added to test various aspects of the integration. A unit test for option management has also been added. All tests pass.


Thanks,

Venkat Ranganathan