Posted to user@spark.apache.org by Hemanth Gudela <he...@qvantel.com> on 2017/04/22 07:56:51 UTC

Spark SQL - Global Temporary View is not behaving as expected

Hi,

According to documentation<http://spark.apache.org/docs/latest/sql-programming-guide.html#global-temporary-view>, global temporary views are cross-session accessible.

But when I try to query a global temporary view from another spark shell like this -->
Instance 1 of spark-shell
----------------------------------
scala> spark.sql("select 1 as col1").createGlobalTempView("gView1")

Instance 2 of spark-shell (while Instance 1 of spark-shell is still alive)
---------------------------------
scala> spark.sql("select * from global_temp.gView1").show()
org.apache.spark.sql.AnalysisException: Table or view not found: `global_temp`.`gView1`
'Project [*]
+- 'UnresolvedRelation `global_temp`.`gView1`

I expected the global temporary view created in shell 1 to be accessible in shell 2, but it isn't.
Please correct me if I am missing something here.

Thanks (in advance),
Hemanth

Re: Spark SQL - Global Temporary View is not behaving as expected

Posted by vincent gromakowski <vi...@gmail.com>.
Look at Spark Jobserver's named RDDs, which are supposed to be thread-safe...


Re: Spark SQL - Global Temporary View is not behaving as expected

Posted by Hemanth Gudela <he...@qvantel.com>.
Hello Gene,

Thanks, but Alluxio did not solve my Spark Streaming use case, because the source Parquet files in Alluxio memory are not "appended" to but are periodically "overwritten", due to the nature of the business need.
Spark jobs fail when they try to read Parquet files at the same time another job is writing Parquet files in Alluxio.

Could you suggest a way to synchronize Parquet reads and writes in Alluxio memory? That is, when one Spark job is writing a DataFrame as a Parquet file in Alluxio, the other Spark jobs trying to read it should wait until the write is finished.
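
For illustration, a minimal sketch of one way to avoid the read/write clash (the alluxio:// paths below are hypothetical, and the Alluxio client jar is assumed to be on the Spark classpath): write each refresh to a staging directory first, and only swap it into the path that readers use once the write has completed, so readers never see a half-written Parquet directory.

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.DataFrame

// df is the DataFrame produced for one refresh; paths and master are hypothetical.
def publishSnapshot(df: DataFrame): Unit = {
  val staging = new Path("alluxio://alluxio-master:19998/data/snapshot_staging")
  val live    = new Path("alluxio://alluxio-master:19998/data/snapshot")

  // 1) Write the new snapshot completely before touching the live path.
  df.write.mode("overwrite").parquet(staging.toString)

  // 2) Swap it in. There is still a short window between delete and rename,
  //    so a more robust variant writes versioned directories and lets readers
  //    pick the newest complete one.
  val fs = FileSystem.get(staging.toUri, df.sparkSession.sparkContext.hadoopConfiguration)
  fs.delete(live, true)
  fs.rename(staging, live)
}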

Thanks,
Hemanth




Re: Spark SQL - Global Temporary View is not behaving as expected

Posted by Gene Pang <ge...@gmail.com>.
As Vincent mentioned, Alluxio helps with sharing data across different
Spark contexts. This blog post about Spark dataframes and Alluxio discusses
that use case
<https://alluxio.com/blog/effective-spark-dataframes-with-alluxio>.
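
For illustration, a minimal sketch of that pattern (the alluxio:// path below is hypothetical, and the Alluxio client jar needs to be on the Spark classpath): one application writes the DataFrame as Parquet into Alluxio, and a separate application, with its own SparkContext, reads it back.

// spark-shell / application 1: write the DataFrame as Parquet into Alluxio.
val df = spark.sql("select 1 as col1")
df.write.mode("overwrite").parquet("alluxio://alluxio-master:19998/shared/gView1")

// spark-shell / application 2 (its own SparkContext): read it back.
val shared = spark.read.parquet("alluxio://alluxio-master:19998/shared/gView1")
shared.show()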

Thanks,
Gene


Re: Spark SQL - Global Temporary View is not behaving as expected

Posted by vincent gromakowski <vi...@gmail.com>.
Look at Alluxio for sharing across drivers, or at Spark Jobserver.


Re: Spark SQL - Global Temporary View is not behaving as expected

Posted by Hemanth Gudela <he...@qvantel.com>.
Thanks for your reply.

Creating a table is an option, but such an approach slows down reads and writes for the real-time analytics streaming use case that I'm currently working on.
If global temporary views could be accessed across sessions/Spark contexts, that would have simplified my use case a lot.

But yeah, thanks for explaining the behavior of global temporary view, now it’s clear ☺

-Hemanth



Re: Spark SQL - Global Temporary View is not behaving as expected

Posted by Felix Cheung <fe...@hotmail.com>.
Cross-session in this context means multiple Spark sessions from the same Spark context. Since you are running two shells, you have two different Spark contexts.

Do you have to use a temp view? Could you create a table?
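
To make the distinction concrete, a small sketch using the standard Spark 2.x API (the table name below is only illustrative, and sharing a table across shells assumes they point at a shared metastore/warehouse, which the default local Derby metastore does not provide):

// In one spark-shell (a single SparkContext):
spark.sql("select 1 as col1").createGlobalTempView("gView1")

// A second SparkSession created on the SAME context does see the view:
val session2 = spark.newSession()
session2.sql("select * from global_temp.gView1").show()

// Sharing across separate shells (separate SparkContexts) needs something
// persisted outside the context, for example a table:
spark.sql("select 1 as col1").write.saveAsTable("gview1_tbl")
// ...and in the other shell:
// spark.table("gview1_tbl").show()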
