Posted to commits@beam.apache.org by "Wesley Tanaka (JIRA)" <ji...@apache.org> on 2017/04/03 05:06:41 UTC

[jira] [Comment Edited] (BEAM-1859) sorter extension depends on hadoop but does not declare as such in repository artifact

    [ https://issues.apache.org/jira/browse/BEAM-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952999#comment-15952999 ] 

Wesley Tanaka edited comment on BEAM-1859 at 4/3/17 5:06 AM:
-------------------------------------------------------------

Explicitly adding org.apache.hadoop:hadoop-core:0.20.2 as a dependency does resolve the issue; thanks, I'll just do that. I didn't know it was considered best practice to assume Hadoop was already installed.
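For reference, the explicit declaration described above would look roughly like this in a Maven pom.xml (coordinates taken verbatim from the comment; whether a newer Hadoop artifact would be a better choice is left open):

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>0.20.2</version>
</dependency>
```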

In case it helps, my use case is learning the Beam API rather than accomplishing anything real with it:

* I am trying to learn the Beam API,
* so I am building various toy composite PTransforms,
* and I'd like a faster code/test/debug cycle than uploading code to a cluster allows,
* so, although this is nonsensical w.r.t. real Beam usage, I am hacking together code that makes DirectRunner read lines from stdin and write lines to stdout, so I can run the same code against different inputs and observe its behavior.
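The stdin half of that toy harness can be sketched with the plain JDK; everything here is an illustrative assumption, not code from the report (the `StdinLines` class name and `readLines` helper are hypothetical, and the collected list would then be handed to something like Beam's `Create.of(...)` so the same composite PTransform runs locally under DirectRunner):

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class StdinLines {
    // Drain every line from a reader into a list. With
    // new BufferedReader(new InputStreamReader(System.in)) this captures
    // stdin, giving an in-memory input suitable for Create.of(...).
    static List<String> readLines(BufferedReader in) throws Exception {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for System.in so the sketch is self-contained:
        BufferedReader demo = new BufferedReader(new StringReader("a\nb\nc"));
        System.out.println(readLines(demo)); // prints [a, b, c]
    }
}
```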

For what it's worth, in my actual setup I don't have Hadoop installed at all; I'm using Beam with only Flink and Kafka at the moment.



> sorter extension depends on hadoop but does not declare as such in repository artifact
> --------------------------------------------------------------------------------------
>
>                 Key: BEAM-1859
>                 URL: https://issues.apache.org/jira/browse/BEAM-1859
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>    Affects Versions: 0.6.0
>            Reporter: Wesley Tanaka
>            Assignee: Davor Bonaci
>             Fix For: Not applicable
>
>
> When SortValues is used via {{org.apache.beam:beam-sdks-java-extensions-sorter:0.6.0}}, this exception is raised:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
> 	at org.apache.beam.sdk.extensions.sorter.BufferedExternalSorter.create(BufferedExternalSorter.java:98)
> 	at org.apache.beam.sdk.extensions.sorter.SortValues$SortValuesDoFn.processElement(SortValues.java:153)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> 	at org.apache.beam.sdk.extensions.sorter.BufferedExternalSorter.create(BufferedExternalSorter.java:98)
> 	at org.apache.beam.sdk.extensions.sorter.SortValues$SortValuesDoFn.processElement(SortValues.java:153)
> 	at org.apache.beam.sdk.extensions.sorter.SortValues$SortValuesDoFn$auxiliary$uK25yOmK.invokeProcessElement(Unknown Source)
> 	at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:198)
> 	at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:159)
> 	at org.apache.beam.runners.core.PushbackSideInputDoFnRunner.processElement(PushbackSideInputDoFnRunner.java:111)
> 	at org.apache.beam.runners.core.PushbackSideInputDoFnRunner.processElementInReadyWindows(PushbackSideInputDoFnRunner.java:77)
> 	at org.apache.beam.runners.direct.ParDoEvaluator.processElement(ParDoEvaluator.java:134)
> 	at org.apache.beam.runners.direct.DoFnLifecycleManagerRemovingTransformEvaluator.processElement(DoFnLifecycleManagerRemovingTransformEvaluator.java:51)
> 	at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
> 	at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I believe the issue is that beam-sdks-java-extensions-sorter depends on this Hadoop library but does not declare the dependency in its published artifact.
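The missing-class failure quoted above can be reproduced without Beam at all. A minimal probe (the class name comes from the stack trace; `HadoopProbe` is a hypothetical name): when no Hadoop jar is on the classpath, the reflective lookup throws `ClassNotFoundException`, which the JVM surfaces as `NoClassDefFoundError` when a compiled reference to the class is resolved, as in the trace.

```java
public class HadoopProbe {
    public static void main(String[] args) {
        try {
            // Same class the sorter's BufferedExternalSorter references:
            Class.forName("org.apache.hadoop.conf.Configuration");
            System.out.println("hadoop present on classpath");
        } catch (ClassNotFoundException e) {
            // This branch is what BEAM-1859 reports, minus the Beam layers.
            System.out.println("hadoop missing: " + e.getMessage());
        }
    }
}
```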



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)