Posted to dev@spark.apache.org by Yi Tian <ti...@gmail.com> on 2014/12/12 02:47:10 UTC

Is there any document to explain how to build the hive jars for spark?

Hi, all

We found some bugs in hive-0.12, but we cannot wait for the Hive
community to fix them.

We want to fix these bugs in our lab and build a new release that
Spark can recognize.

As we know, Spark depends on a special release of Hive, for example:

<dependency>
  <groupId>org.spark-project.hive</groupId>
  <artifactId>hive-metastore</artifactId>
  <version>${hive.version}</version>
</dependency>

The difference between org.spark-project.hive and org.apache.hive was
described by Patrick:

There are two differences:

1. We publish hive with a shaded protobuf dependency to avoid
conflicts with some Hadoop versions.
2. We publish a proper hive-exec jar that only includes hive packages.
The upstream version of hive-exec bundles a bunch of other random
dependencies in it which makes it really hard for third-party projects
to use it.
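For illustration, difference 1 is the kind of relocation the maven-shade-plugin
performs, and difference 2 is a matter of restricting which artifacts get bundled.
A minimal sketch of such a pom configuration follows; the relocation target
org.spark-project.protobuf and the include pattern are assumptions for the sake
of the example, not necessarily what the published jars actually use:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- difference 1 (sketch): rewrite protobuf package names inside the
             jar so they cannot clash with the protobuf shipped by Hadoop -->
        <relocations>
          <relocation>
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>org.spark-project.protobuf</shadedPattern>
          </relocation>
        </relocations>
        <!-- difference 2 (sketch): only bundle Hive's own classes, instead of
             pulling third-party dependencies into hive-exec -->
        <artifactSet>
          <includes>
            <include>org.apache.hive:*</include>
          </includes>
        </artifactSet>
      </configuration>
    </execution>
  </executions>
</plugin>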

Is there any document to guide us on how to build the hive jars for Spark?

Any help would be greatly appreciated.


Re: Is there any document to explain how to build the hive jars for spark?

Posted by Michael Armbrust <mi...@databricks.com>.
The modified version of hive can be found here:
https://github.com/pwendell/hive
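
For what it's worth, here is a minimal sketch of wiring a rebuilt fork into the
Spark build, assuming the patched Hive artifacts are installed into the local
Maven repository under a hypothetical version such as 0.12.0-myfix; Spark
already resolves Hive through the ${hive.version} property shown above:

<!-- hypothetical override: once the rebuilt Hive artifacts are installed
     locally (e.g. via `mvn install`) under version 0.12.0-myfix, point the
     existing ${hive.version} property at that release when building Spark -->
<properties>
  <hive.version>0.12.0-myfix</hive.version>
</properties>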
