
Posted to dev@phoenix.apache.org by Suhas H M <ma...@gmail.com> on 2018/02/23 01:08:55 UTC

Phoenix-Spark driver structured streaming support?

Hi,

Is Spark Structured Streaming supported by the Phoenix-Spark driver?
When the phoenix-spark driver is used to write structured streaming data,
we get the following exception:

Exception in thread "main" java.lang.UnsupportedOperationException: Data source org.apache.phoenix.spark does not support streamed writing
    at org.apache.spark.sql.execution.datasources.DataSource.createSink(DataSource.scala:287)
    at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:266)



Code:

Dataset<Row> inputDF = sparkSession
    .readStream()
    .schema(jsonSchema)
    .json(inputPath);

StreamingQuery query = inputDF
    .writeStream()
    .format("org.apache.phoenix.spark")
    .outputMode(OutputMode.Complete())
    .option("zkUrl", "localhost:2181")
    .option("table", "SHM2")
    .start();

query.awaitTermination();


Jira - https://issues.apache.org/jira/browse/PHOENIX-4627

Re: Phoenix-Spark driver structured streaming support?

Posted by Suhas H M <ma...@gmail.com>.

Thanks, but this is just a sample test I was doing. My use case is to read
data from Kafka in a streaming fashion and then write it to Phoenix.


On Thu, Feb 22, 2018 at 5:15 PM, Pedro Boado <pe...@gmail.com> wrote:

> No, it's not supported.
>
> Why don't you just run your example as a Spark batch job and save the
> DataFrame/RDD to Phoenix? Your data is coming from a JSON file (which in
> the end is a static source, not a stream).

Re: Phoenix-Spark driver structured streaming support?

Posted by Pedro Boado <pe...@gmail.com>.

No, it's not supported.

Why don't you just run your example as a Spark batch job and save the
DataFrame/RDD to Phoenix? Your data is coming from a JSON file (which in
the end is a static source, not a stream).
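
For the Kafka use case mentioned earlier in the thread, where a plain batch
job may not fit, one possible workaround is to bypass the connector and write
rows through the Phoenix JDBC driver with UPSERT statements. The sketch below
only illustrates that idea and is not something the phoenix-spark module
provides. It assumes the Phoenix client jar is on the classpath at runtime.
The class and method names and the columns ID and NAME are hypothetical; the
table SHM2 and the ZooKeeper quorum localhost:2181 come from the original
post.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: writes rows to Phoenix over plain JDBC.
public class PhoenixUpsertSketch {

    // Builds a parameterized statement such as
    // UPSERT INTO SHM2 (ID, NAME) VALUES (?, ?)
    public static String buildUpsertSql(String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String cols = String.join(", ", columns);
        String params = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table + " (" + cols + ") VALUES (" + params + ")";
    }

    // Upserts all rows in one transaction. jdbcUrl is expected to look like
    // "jdbc:phoenix:localhost:2181" (assumption based on the zkUrl in the post).
    public static void upsertRows(String jdbcUrl, String table,
                                  List<String> columns, List<Object[]> rows)
            throws SQLException {
        String sql = buildUpsertSql(table, columns);
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            conn.setAutoCommit(false);
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    stmt.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                stmt.executeUpdate();
            }
            conn.commit(); // the upserts become visible only after the commit
        }
    }

    public static void main(String[] args) {
        // Prints the generated SQL only; no Phoenix connection is opened here.
        System.out.println(buildUpsertSql("SHM2", Arrays.asList("ID", "NAME")));
    }
}
```

In a Structured Streaming job, a helper like upsertRows could be called from
the process method of a Spark ForeachWriter, ideally with the connection opened
once per partition in open and closed in close, rather than once per row.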


