You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by daniel voros <da...@gmail.com> on 2018/03/02 15:39:17 UTC
Review Request 65884: SqoopJobDataPublisher is invoked before
Hive/HCat imports succeed
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65884/
-----------------------------------------------------------
Review request for Sqoop.
Bugs: SQOOP-3291
https://issues.apache.org/jira/browse/SQOOP-3291
Repository: sqoop-trunk
Description
-------
Job data is published to listeners (defined via sqoop.job.data.publish.class) in case of Hive and HCat imports. Currently this happens before the Hive import completes, so it gets reported even if Hive import fails.
Diffs
-----
src/java/org/apache/sqoop/hive/HiveImport.java c272911
src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 6529bd2
src/java/org/apache/sqoop/mapreduce/ImportJobBase.java fb5d054
src/java/org/apache/sqoop/mapreduce/PublishJobData.java fc18188
src/java/org/apache/sqoop/tool/ImportTool.java e992005
src/test/org/apache/sqoop/TestSqoopJobDataPublisher.java b3579ac
src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java a5f85a0
Diff: https://reviews.apache.org/r/65884/diff/1/
Testing
-------
- created unit test
- tested on a cluster with Atlas SqoopHook in place
Thanks,
daniel voros
Re: Review Request 65884: SqoopJobDataPublisher is invoked before
Hive/HCat imports succeed
Posted by daniel voros <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65884/#review198530
-----------------------------------------------------------
Patch #1 changes the following:
- moves the reporting after Hive import
- creats isolated classloader for Hive import, since Hive was polluting its classloader that caused trouble in Atlas' SqoopHook
- changes PublishJobData#publishJobData to accept class name as its first argument instead of unnecessary Configuration object
- daniel voros
On March 2, 2018, 3:39 p.m., daniel voros wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65884/
> -----------------------------------------------------------
>
> (Updated March 2, 2018, 3:39 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-3291
> https://issues.apache.org/jira/browse/SQOOP-3291
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Job data is published to listeners (defined via sqoop.job.data.publish.class) in case of Hive and HCat imports. Currently this happens before the Hive import completes, so it gets reported even if Hive import fails.
>
>
> Diffs
> -----
>
> src/java/org/apache/sqoop/hive/HiveImport.java c272911
> src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 6529bd2
> src/java/org/apache/sqoop/mapreduce/ImportJobBase.java fb5d054
> src/java/org/apache/sqoop/mapreduce/PublishJobData.java fc18188
> src/java/org/apache/sqoop/tool/ImportTool.java e992005
> src/test/org/apache/sqoop/TestSqoopJobDataPublisher.java b3579ac
> src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java a5f85a0
>
>
> Diff: https://reviews.apache.org/r/65884/diff/1/
>
>
> Testing
> -------
>
> - created unit test
> - tested on a cluster with Atlas SqoopHook in place
>
>
> Thanks,
>
> daniel voros
>
>
Re: Review Request 65884: SqoopJobDataPublisher is invoked before
Hive/HCat imports succeed
Posted by Fero Szabo via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65884/#review198933
-----------------------------------------------------------
src/java/org/apache/sqoop/hive/HiveImport.java
Lines 312-316 (patched)
<https://reviews.apache.org/r/65884/#comment279212>
Hi Dani,
I believe the original design that includes the testMode boolean flag, that determines whether we are in a test or not, is quite flawed. One thing you could consider is removing it and use some kind of mocking instead.
But, as it's used in 5 tests, this might be an issue with a bigger scope and best captured and solved in a different Jira? (Though that might just get forgotten and lost...)
- Fero Szabo
On March 2, 2018, 3:39 p.m., daniel voros wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65884/
> -----------------------------------------------------------
>
> (Updated March 2, 2018, 3:39 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-3291
> https://issues.apache.org/jira/browse/SQOOP-3291
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Job data is published to listeners (defined via sqoop.job.data.publish.class) in case of Hive and HCat imports. Currently this happens before the Hive import completes, so it gets reported even if Hive import fails.
>
>
> Diffs
> -----
>
> src/java/org/apache/sqoop/hive/HiveImport.java c272911
> src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 6529bd2
> src/java/org/apache/sqoop/mapreduce/ImportJobBase.java fb5d054
> src/java/org/apache/sqoop/mapreduce/PublishJobData.java fc18188
> src/java/org/apache/sqoop/tool/ImportTool.java e992005
> src/test/org/apache/sqoop/TestSqoopJobDataPublisher.java b3579ac
> src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java a5f85a0
>
>
> Diff: https://reviews.apache.org/r/65884/diff/1/
>
>
> Testing
> -------
>
> - created unit test
> - tested on a cluster with Atlas SqoopHook in place
>
>
> Thanks,
>
> daniel voros
>
>
Re: Review Request 65884: SqoopJobDataPublisher is invoked before
Hive/HCat imports succeed
Posted by Boglarka Egyed <bo...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65884/#review198887
-----------------------------------------------------------
Hi Daniel,
Nice catch! I took a look at your change and I have couple of comments, please find them below.
Also, PostgresqlExternalTableImportTest third party test fails for me with your patch:
[ERROR - org.apache.sqoop.tool.ImportTool.run(ImportTool.java:653)] Import failed: java.io.IOException: Hive exited with status 1
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:398)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:351)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:248)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:549)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:647)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.testutil.ImportJobTestCase.runImport(ImportJobTestCase.java:227)
at org.apache.sqoop.testutil.ImportJobTestCase.runImport(ImportJobTestCase.java:242)
at org.apache.sqoop.manager.postgresql.PostgresqlExternalTableImportTest.doImportAndVerify(PostgresqlExternalTableImportTest.java:251)
at org.apache.sqoop.manager.postgresql.PostgresqlExternalTableImportTest.testJdbcBasedImport(PostgresqlExternalTableImportTest.java:285)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:38)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:535)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1182)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:1033)
Could you please take a look?
Thank you,
Bogi
src/java/org/apache/sqoop/hive/HiveImport.java
Lines 312-316 (patched)
<https://reviews.apache.org/r/65884/#comment279159>
I'm wondering of you could maybe mock an exception instead of introducing this test-only logic in the implementation?
src/test/org/apache/sqoop/TestSqoopJobDataPublisher.java
Lines 115-116 (patched)
<https://reviews.apache.org/r/65884/#comment279137>
It seema to be like some debugging lines have been left here.
src/test/org/apache/sqoop/TestSqoopJobDataPublisher.java
Lines 173-178 (patched)
<https://reviews.apache.org/r/65884/#comment279138>
Instead of this try-catch logic could you please use the ExpectedException rule like in TestIncrementalImport for example?
- Boglarka Egyed
On March 2, 2018, 3:39 p.m., daniel voros wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65884/
> -----------------------------------------------------------
>
> (Updated March 2, 2018, 3:39 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-3291
> https://issues.apache.org/jira/browse/SQOOP-3291
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Job data is published to listeners (defined via sqoop.job.data.publish.class) in case of Hive and HCat imports. Currently this happens before the Hive import completes, so it gets reported even if Hive import fails.
>
>
> Diffs
> -----
>
> src/java/org/apache/sqoop/hive/HiveImport.java c272911
> src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 6529bd2
> src/java/org/apache/sqoop/mapreduce/ImportJobBase.java fb5d054
> src/java/org/apache/sqoop/mapreduce/PublishJobData.java fc18188
> src/java/org/apache/sqoop/tool/ImportTool.java e992005
> src/test/org/apache/sqoop/TestSqoopJobDataPublisher.java b3579ac
> src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java a5f85a0
>
>
> Diff: https://reviews.apache.org/r/65884/diff/1/
>
>
> Testing
> -------
>
> - created unit test
> - tested on a cluster with Atlas SqoopHook in place
>
>
> Thanks,
>
> daniel voros
>
>