Posted to user@spark.apache.org by ThomasThomas <th...@gmail.com> on 2018/05/12 14:57:33 UTC

Spark Structured Streaming is giving error “org.apache.spark.sql.AnalysisException: Inner join between two streaming DataFrames/Datasets is not supported;”

Hi There,

Our use case is like this.

We have a nested (multiple) JSON message flowing through a Kafka queue. We
read the message from Kafka using Spark Structured Streaming (SSS), explode
the data, flatten it all into a single record using DataFrame joins, and
land it in a relational database table (DB2).

However, we get the following error when we write to the database using JDBC:

“org.apache.spark.sql.AnalysisException: Inner join between two streaming
DataFrames/Datasets is not supported;”

Any help would be greatly appreciated.
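
A possible reading of the use case above: if every piece being joined comes
from the same Kafka message, the flattening can be done record by record,
with no join between two streams. The sketch below is plain Python rather
than Spark code, and the message shape (order_id, customer, items) is a
hypothetical stand-in for the real payload.

```python
import json
from typing import Any, Dict, List


def flatten_message(raw: str) -> List[Dict[str, Any]]:
    """Flatten one nested JSON message into one flat record per array element.

    The field names used here are hypothetical; they only illustrate the shape.
    """
    message = json.loads(raw)

    # Scalar fields of the nested "customer" object become prefixed columns.
    customer = message.get("customer", {})
    parent = {
        "order_id": message["order_id"],
        "customer_name": customer.get("name"),
    }

    # One output row per element of the "items" array (the "explode" step).
    rows = []
    for item in message.get("items", []):
        row = dict(parent)
        row["sku"] = item["sku"]
        row["qty"] = item["qty"]
        rows.append(row)
    return rows


# Hypothetical sample message with one nested object and one nested array.
SAMPLE = json.dumps(
    {
        "order_id": 1,
        "customer": {"name": "Ann"},
        "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 5}],
    }
)

if __name__ == "__main__":
    for flat_row in flatten_message(SAMPLE):
        print(flat_row)
```

In Structured Streaming the analogous step would likely be from_json
followed by explode and a select on a single streaming DataFrame. That is
an assumption about the pipeline, not something the post confirms.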


Thanks,
Thomas Thomas
Mastermind Solutions LLC.



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Spark Structured Streaming is giving error “org.apache.spark.sql.AnalysisException: Inner join between two streaming DataFrames/Datasets is not supported;”

Posted by ThomasThomas <th...@gmail.com>.
Thanks for the quick response. I'm able to inner join the DataFrames with a
regular Spark session; the issue occurs only with the Spark streaming session.
BTW, I'm using Spark 2.2.0.





Re: Spark Structured Streaming is giving error “org.apache.spark.sql.AnalysisException: Inner join between two streaming DataFrames/Datasets is not supported;”

Posted by "☼ R Nair (रविशंकर नायर)" <ra...@gmail.com>.
Perhaps this link might help you.

https://stackoverflow.com/questions/48699445/inner-join-not-working-in-dataframe-using-spark-2-1

Best,
Passion


Re: Spark Structured Streaming is giving error “org.apache.spark.sql.AnalysisException: Inner join between two streaming DataFrames/Datasets is not supported;”

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

Once you leave Spark Structured Streaming, right after you generate RDDs
(for your streaming queries), you can do any kind of "joins". You're back
in the good old days of RDD programming (with all the bells and whistles).
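
As a rough illustration of what a key-based join means at the RDD level,
here is a minimal sketch in plain Python, not Spark code. It models an
inner join of two collections of (key, value) pairs within a single
micro-batch; the function name and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")
W = TypeVar("W")


def inner_join_by_key(
    left: Iterable[Tuple[K, V]], right: Iterable[Tuple[K, W]]
) -> List[Tuple[K, Tuple[V, W]]]:
    """Inner join two collections of (key, value) pairs on the key.

    Every left value is paired with every right value that shares its key;
    keys present on only one side produce no output.
    """
    # Index the right side by key so each left record needs one lookup.
    index: Dict[K, List[W]] = defaultdict(list)
    for key, value in right:
        index[key].append(value)

    joined: List[Tuple[K, Tuple[V, W]]] = []
    for key, value in left:
        # .get avoids inserting empty lists for keys missing on the right.
        for other in index.get(key, []):
            joined.append((key, (value, other)))
    return joined


if __name__ == "__main__":
    batch_left = [("a", 1), ("b", 2), ("a", 3)]
    batch_right = [("a", "x"), ("c", "y")]
    print(inner_join_by_key(batch_left, batch_right))
```

In the DStream API a join is applied batch by batch, so records that
arrive in different batch intervals are not matched unless the streams are
windowed first.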

Please note that Spark Structured Streaming != Spark Streaming: the
former uses the Dataset API, while the latter uses the RDD API.

Don't touch RDD API and Spark Streaming unless you know what you're doing :)

Pozdrawiam,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski

On Tue, May 15, 2018 at 5:36 PM, ☼ R Nair (रविशंकर नायर) <
ravishankar.nair@gmail.com> wrote:

> Hi Jacek,
>
> If we use RDDs instead of DataFrames, can we accomplish the same? I mean, is
> joining between RDDs allowed in Spark Streaming?
>
> Best,
> Ravi
>

Re: Spark Structured Streaming is giving error “org.apache.spark.sql.AnalysisException: Inner join between two streaming DataFrames/Datasets is not supported;”

Posted by "☼ R Nair (रविशंकर नायर)" <ra...@gmail.com>.
Hi Jacek,

If we use RDDs instead of DataFrames, can we accomplish the same? I mean, is
joining between RDDs allowed in Spark Streaming?

Best,
Ravi

On Sun, May 13, 2018 at 11:18 AM Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> The exception message should be self-explanatory and says that you cannot
> join two streaming Datasets. This feature was added in 2.3 if I'm not
> mistaken.
>
> Just to be sure that you work with two streaming Datasets, can you show
> the query plan of the join query?
>
> Jacek
>

Re: Spark Structured Streaming is giving error “org.apache.spark.sql.AnalysisException: Inner join between two streaming DataFrames/Datasets is not supported;”

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

The exception message should be self-explanatory: it says that you cannot
join two streaming Datasets in your Spark version. Support for stream-stream
joins was added in Spark 2.3, if I'm not mistaken.

Just to be sure that you work with two streaming Datasets, can you show the
query plan of the join query?
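
As a small aid for the version point above, here is a minimal sketch of a
guard that decides from a version string whether stream-stream inner joins
should be available. It is plain Python; the helper names are hypothetical,
and the 2.3 threshold is taken from the statement in this message.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as "2.2.0"."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supports_stream_stream_inner_join(version: str) -> bool:
    """True if the version is 2.3 or later (threshold assumed from the thread)."""
    return parse_major_minor(version) >= (2, 3)


if __name__ == "__main__":
    for candidate in ("2.2.0", "2.3.0"):
        print(candidate, supports_stream_stream_inner_join(candidate))
```

With a live session, the version string would come from spark.version, and
checking isStreaming on each DataFrame plus calling explain() on the joined
DataFrame would show whether both sides of the join are really streaming.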

Jacek
