Posted to user@spark.apache.org by Deenar Toraskar <de...@gmail.com> on 2016/01/25 16:22:59 UTC

Sharing HiveContext in Spark JobServer / getOrCreate

Hi

I am using a shared SparkContext for all of my Spark jobs. Some of the jobs
use HiveContext, but there isn't a getOrCreate method on HiveContext that
would allow reuse of an existing HiveContext. Such a method exists only on
SQLContext (def getOrCreate(sparkContext: SparkContext): SQLContext).

Is there any reason that a HiveContext cannot be shared amongst multiple
threads within the same Spark driver process?
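
For context, the sharing I have in mind is a single lazily-created
HiveContext held by a driver-wide object and handed to every job. A minimal
sketch of such a holder (the object and its name are purely illustrative):

  import org.apache.spark.SparkContext
  import org.apache.spark.sql.hive.HiveContext

  // Illustrative holder: one HiveContext per driver process, created on
  // first use and then reused by every job/thread.
  object SharedHiveContext {
    @volatile private var instance: HiveContext = _

    def getOrCreate(sc: SparkContext): HiveContext = {
      if (instance == null) {
        synchronized {
          // double-checked locking so concurrent jobs create it only once
          if (instance == null) instance = new HiveContext(sc)
        }
      }
      instance
    }
  }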

In addition, I cannot seem to cast a HiveContext to a SQLContext, yet this
works fine in the spark-shell. Am I doing something wrong here?

scala> sqlContext

res19: org.apache.spark.sql.SQLContext =
org.apache.spark.sql.hive.HiveContext@383b3357

scala> import org.apache.spark.sql.SQLContext

import org.apache.spark.sql.SQLContext

scala> SQLContext.getOrCreate(sc)

res18: org.apache.spark.sql.SQLContext =
org.apache.spark.sql.hive.HiveContext@383b3357
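
As the output above shows, SQLContext.getOrCreate hands back the existing
HiveContext, just statically typed as SQLContext. A minimal sketch of one
way to reuse it as a HiveContext, by matching on the runtime type (assuming
a HiveContext was created on this SparkContext first; sc is the shell's
SparkContext):

  import org.apache.spark.sql.SQLContext
  import org.apache.spark.sql.hive.HiveContext

  // Returns whatever context is already registered for this SparkContext;
  // if a HiveContext was created first, this is that same instance.
  val ctx = SQLContext.getOrCreate(sc)

  val hiveCtx: HiveContext = ctx match {
    case h: HiveContext => h                   // reuse the existing one
    case _              => new HiveContext(sc) // fall back if none exists yet
  }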



Regards
Deenar

Re: Sharing HiveContext in Spark JobServer / getOrCreate

Posted by Deenar Toraskar <de...@gmail.com>.
On 25 January 2016 at 21:09, Deenar Toraskar
<deenar.toraskar@thinkreactive.co.uk> wrote:

> No I hadn't. This is useful, but in some cases we do want to share the
> same temporary tables between jobs, so we really wanted a getOrCreate
> equivalent on HiveContext (see the sketch at the end of this message).
>
> Deenar
>
>
>
> On 25 January 2016 at 18:10, Ted Yu <yu...@gmail.com> wrote:
>
>> Have you noticed the following method of HiveContext?
>>
>>   /**
>>    * Returns a new HiveContext as new session, which will have separated
>>    * SQLConf, UDF/UDAF, temporary tables and SessionState, but sharing
>>    * the same CacheManager, IsolatedClientLoader and Hive client (both
>>    * of execution and metadata) with existing HiveContext.
>>    */
>>   override def newSession(): HiveContext = {
>>
>> Cheers
>>
>
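
Here is the sketch referred to above: sharing temporary tables between jobs
works as long as the jobs hold the same HiveContext instance. Table, column,
and path names below are just placeholders, and sharedHiveCtx stands in for
the one shared HiveContext:

  // Job A: register a temporary table on the shared context.
  val trades = sharedHiveCtx.read.parquet("/data/trades")   // placeholder path
  trades.registerTempTable("trades")

  // Job B, later, on the same HiveContext: the temp table is visible.
  val totals = sharedHiveCtx.sql(
    "SELECT counterparty, sum(notional) FROM trades GROUP BY counterparty")

  // A context obtained via newSession() would NOT see "trades" -- each
  // session keeps its own temporary tables.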

Re: Sharing HiveContext in Spark JobServer / getOrCreate

Posted by Ted Yu <yu...@gmail.com>.
Have you noticed the following method of HiveContext?

  /**
   * Returns a new HiveContext as new session, which will have separated
   * SQLConf, UDF/UDAF, temporary tables and SessionState, but sharing
   * the same CacheManager, IsolatedClientLoader and Hive client (both
   * of execution and metadata) with existing HiveContext.
   */
  override def newSession(): HiveContext = {
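
For example, a rough sketch of how a job could use this when its temporary
tables should stay isolated (path and table names below are placeholders):

  // One HiveContext shared across the driver...
  val shared = new HiveContext(sc)

  // ...and a separate session for a job whose temp tables must not leak.
  val session = shared.newSession()
  val events = session.read.json("/tmp/events.json")   // placeholder input
  events.registerTempTable("events")   // visible only within this session

  // Other sessions (and the parent context) will not see "events", but they
  // still share the same CacheManager and Hive client.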

Cheers
