Posted to dev@spark.apache.org by Michael Armbrust <mi...@databricks.com> on 2016/07/18 19:16:18 UTC

Re: transition SQLContext to SparkSession

+ dev, reynold

Yeah, that's a good point. I wonder if SparkSession.sqlContext should be
public/deprecated?
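For illustration, the "public but deprecated" pattern being proposed could look like the sketch below. This is a hypothetical mock-up, not Spark source: the class and field names are stand-ins, and the deprecation message is invented.

```scala
// Hypothetical sketch of a public-but-deprecated accessor for migration.
// `Session` and `LegacyContext` are illustrative stand-ins, not Spark classes.
class Session {
  @deprecated("Kept only to ease migration; use Session directly", "2.0.0")
  lazy val legacyContext: LegacyContext = new LegacyContext(this)
}

// The old wrapper type, now just delegating to the new Session.
class LegacyContext(val session: Session)
```

Old call sites keep compiling (with a deprecation warning), while both old and new code share one underlying session.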

On Mon, Jul 18, 2016 at 8:37 AM, Koert Kuipers <ko...@tresata.com> wrote:

> In my codebase I would like to gradually transition to SparkSession, so
> while I start using SparkSession I also want a SQLContext to be available
> as before (but with a deprecation warning when I use it). This should be
> easy since SQLContext is now a wrapper for SparkSession.
>
> So basically:
> val session = SparkSession.builder.set(..., ...).getOrCreate()
> val sqlc = new SQLContext(session)
>
> However, this doesn't work: the SQLContext constructor I am trying to use is
> private. SparkSession.sqlContext is also private.
>
> Am I missing something?
>
> A non-gradual switch is not very realistic in any significant codebase,
> and I do not want to create SparkSession and SQLContext independently (both
> from the same SparkContext) since that can only lead to confusion and
> inconsistent settings.
>
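A sketch of the gradual-migration pattern being asked for, written against the Spark 2.0 API in which `SparkSession.sqlContext` ended up public. Note that the builder's setter is `config`, not `set`; the app name and config option below are illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Build one session; the builder method is `config`, not `set`.
val session = SparkSession.builder
  .appName("gradual-migration")                 // illustrative app name
  .config("spark.sql.shuffle.partitions", "8")  // illustrative setting
  .getOrCreate()

// Legacy code keeps receiving a SQLContext that wraps the *same* session,
// so settings stay consistent between old and new call sites.
val sqlc = session.sqlContext

// New code uses the session directly.
val df = session.range(10)
```

Because `sqlc` is a wrapper over `session`, there is no risk of the two diverging, which is exactly the concern raised about creating them independently.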

Re: transition SQLContext to SparkSession

Posted by Michael Allman <mi...@videoamp.com>.
Hi Reynold,

So far we've been able to transition everything to `SparkSession`. I was just following up on behalf of Maciej.

Michael



Re: transition SQLContext to SparkSession

Posted by Reynold Xin <rx...@databricks.com>.
dropping user list

Yup I just took a look -- you are right.

What's the reason you'd need a HiveContext? The only method that
HiveContext has and SQLContext does not have is refreshTable. Given this is
meant for helping code transition, it might be easier to just use
SQLContext and change the places that use refreshTable?

In order for SparkSession.sqlContext to return an actual HiveContext, we'd
need to use reflection to create a HiveContext, which is pretty hacky.
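A sketch of the suggested rewrite: the one `HiveContext`-specific call, `refreshTable`, has a `SparkSession` equivalent via the catalog, so call sites can be changed instead of keeping a `HiveContext` around. The table name is illustrative.

```scala
import org.apache.spark.sql.SparkSession

// A Hive-enabled session; requires a Spark 2.0 build with Hive support.
val spark = SparkSession.builder.enableHiveSupport().getOrCreate()

// Before (Spark 1.x, HiveContext-only):
//   hiveContext.refreshTable("events")
// After (Spark 2.0): the same operation through the session's catalog.
spark.catalog.refreshTable("events")
```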




Re: transition SQLContext to SparkSession

Posted by Michael Allman <mi...@videoamp.com>.
Sorry Reynold, I want to triple-check this with you. I'm looking at the `SparkSession.sqlContext` field in the latest 2.0 branch, and it appears that val is set specifically to an instance of the `SQLContext` class. A cast to `HiveContext` will fail. Maybe there's a misunderstanding here. This is what I'm looking at:

https://github.com/apache/spark/blob/24ea875198ffcef4a4c3ba28aba128d6d7d9a395/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L122

Michael

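Concretely, because the val is constructed as a plain `SQLContext` and never as a `HiveContext`, a downcast fails at runtime. The sketch below is illustrative only; it assumes a Spark 2.0 build where the deprecated `org.apache.spark.sql.hive.HiveContext` is still on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.HiveContext

val spark = SparkSession.builder.enableHiveSupport().getOrCreate()

// The field is built as `new SQLContext(...)`, so this downcast
// throws ClassCastException even with Hive support enabled.
val hc = spark.sqlContext.asInstanceOf[HiveContext]
```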


Re: transition SQLContext to SparkSession

Posted by Reynold Xin <rx...@databricks.com>.
Yes. But in order to access methods available only in HiveContext, a user
cast is required.
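The cast being referred to is the usual Spark 1.x pattern, sketched below. This is illustrative migration-era code: it assumes the old `HiveContext` type is available, and the table name is a placeholder.

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext

// Spark 1.x-style pattern: code needing Hive-only methods downcast the
// shared SQLContext. The match avoids a ClassCastException when the
// context is a plain SQLContext.
def refreshIfHive(sqlContext: SQLContext): Unit =
  sqlContext match {
    case hc: HiveContext => hc.refreshTable("events") // table name illustrative
    case _               => () // plain SQLContext: nothing Hive-specific to do
  }
```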



Re: transition SQLContext to SparkSession

Posted by Maciej Bryński <ma...@brynski.pl>.
@Reynold Xin,
How will this work with Hive support?
Will SparkSession.sqlContext return a HiveContext?

-- 
Maciek Bryński

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org




Re: transition SQLContext to SparkSession

Posted by Reynold Xin <rx...@databricks.com>.
Good idea.

https://github.com/apache/spark/pull/14252



On Mon, Jul 18, 2016 at 12:16 PM, Michael Armbrust <mi...@databricks.com>
wrote:

> + dev, reynold
>
> Yeah, thats a good point.  I wonder if SparkSession.sqlContext should be
> public/deprecated?
>
> On Mon, Jul 18, 2016 at 8:37 AM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> in my codebase i would like to gradually transition to SparkSession, so
>> while i start using SparkSession i also want a SQLContext to be available
>> as before (but with a deprecated warning when i use it). this should be
>> easy since SQLContext is now a wrapper for SparkSession.
>>
>> so basically:
>> val session = SparkSession.builder.set(..., ...).getOrCreate()
>> val sqlc = new SQLContext(session)
>>
>> however this doesnt work, the SQLContext constructor i am trying to use
>> is private. SparkSession.sqlContext is also private.
>>
>> am i missing something?
>>
>> a non-gradual switch is not very realistic in any significant codebase,
>> and i do not want to create SparkSession and SQLContext independendly (both
>> from same SparkContext) since that can only lead to confusion and
>> inconsistent settings.
>>
>
>
