Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:12:35 UTC

[jira] [Resolved] (SPARK-20573) --packages fails when transitive dependency can only be resolved from repository specified in POM's <repositories> tag

     [ https://issues.apache.org/jira/browse/SPARK-20573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-20573.
----------------------------------
    Resolution: Incomplete

> --packages fails when transitive dependency can only be resolved from repository specified in POM's <repositories> tag
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-20573
>                 URL: https://issues.apache.org/jira/browse/SPARK-20573
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Josh Rosen
>            Priority: Major
>              Labels: bulk-closed
>
> With a clean Ivy cache, run the following command:
> {code}
> ./bin/spark-shell --packages com.twitter.elephantbird:elephant-bird-core:4.4
> {code}
> This will fail with {{unresolved dependency: com.hadoop.gplcompression#hadoop-lzo;0.4.16: not found}}.
> If you look at the elephant-bird-core POM (at http://central.maven.org/maven2/com/twitter/elephantbird/elephant-bird-core/4.4/elephant-bird-core-4.4.pom) you'll see a direct dependency on hadoop-lzo. This library is only present in Twitter's public Maven repository, hosted at http://maven.twttr.com. The elephant-bird-core POM does not directly declare Twitter's external repository. Instead, that external repository is inherited from elephant-bird-core's parent POM (at http://central.maven.org/maven2/com/twitter/elephantbird/elephant-bird/4.4/elephant-bird-4.4.pom).
> From the Ivy output, it looks like it didn't even attempt to resolve from the Twitter repo:
> {code}
> :: problems summary ::
> :::: WARNINGS
> 		module not found: com.hadoop.gplcompression#hadoop-lzo;0.4.16
> 	==== local-m2-cache: tried
> 	  file:/Users/joshrosen/.m2/repository/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom
> 	  -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
> 	  file:/Users/joshrosen/.m2/repository/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar
> 	==== local-ivy-cache: tried
> 	  /Users/joshrosen/.ivy2/local/com.hadoop.gplcompression/hadoop-lzo/0.4.16/ivys/ivy.xml
> 	  -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
> 	  /Users/joshrosen/.ivy2/local/com.hadoop.gplcompression/hadoop-lzo/0.4.16/jars/hadoop-lzo.jar
> 	==== central: tried
> 	  https://repo1.maven.org/maven2/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom
> 	  -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
> 	  https://repo1.maven.org/maven2/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar
> 	==== spark-packages: tried
> 	  http://dl.bintray.com/spark-packages/maven/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom
> 	  -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
> 	  http://dl.bintray.com/spark-packages/maven/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar
> 		::::::::::::::::::::::::::::::::::::::::::::::
> 		::          UNRESOLVED DEPENDENCIES         ::
> 		::::::::::::::::::::::::::::::::::::::::::::::
> 		:: com.hadoop.gplcompression#hadoop-lzo;0.4.16: not found
> 		::::::::::::::::::::::::::::::::::::::::::::::
> {code}
> If you manually specify the Twitter repository as an additional external repository then everything works fine.
> This is somewhat frustrating behavior from an end user's point of view because, unless they dig through the POMs themselves, it is not obvious why things are broken or how to fix them. When Maven resolves this coordinate, it properly fetches the transitive dependencies from the additional repositories specified in the referencing POMs. My hunch is that this behavior is caused by either a bug in Ivy itself or a bug in Spark's usage / configuration of the embedded Ivy resolver.
> It would be great to see if we can find other test cases to narrow down the scope of the bug. I'm wondering whether POM-specified repositories will work if they're specified in the POM of the top-level dependency being resolved. It would also be useful to determine whether Ivy handles additional repositories declared at the top level of transitive dependencies' POMs: maybe the problem is the specific combination of transitive dep + repository inherited from that dep's parent POM.
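
The workaround mentioned in the description (manually specifying the Twitter repository as an additional external repository) can be sketched with spark-submit's --repositories option, which spark-shell also accepts. This is a minimal sketch and not part of the original report: the repository URL and package coordinate are copied from the description above, the variable names are illustrative, and it assumes the command is run from the Spark home directory and that the repository is reachable.

```shell
# Repository URL and package coordinate are taken from the report above.
# REPO, PKG and CMD are illustrative names, not anything defined by Spark.
REPO="http://maven.twttr.com"
PKG="com.twitter.elephantbird:elephant-bird-core:4.4"

# --repositories takes a comma-separated list of additional remote
# repositories to search when resolving the coordinates given to --packages.
CMD="./bin/spark-shell --repositories $REPO --packages $PKG"

# Print the command instead of running it, so it can be inspected first.
echo "$CMD"
```

Running the printed command from a Spark checkout should let Ivy look up hadoop-lzo in the extra repository in addition to the default resolvers listed in the output above.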


