Posted to user@spark.apache.org by Rajesh Katkar <ka...@gmail.com> on 2023/04/06 07:22:35 UTC

spark streaming and kinesis integration

Hi Spark Team,

We need to read from and write to Kinesis streams using Spark Streaming.

We checked the official documentation -
https://spark.apache.org/docs/latest/streaming-kinesis-integration.html

It does not mention a Kinesis connector. An alternative is
https://github.com/qubole/kinesis-sql, which is no longer active. It has now
been handed over to https://github.com/roncemer/spark-sql-kinesis

Also, according to SPARK-18165
<https://issues.apache.org/jira/browse/SPARK-18165>, Spark does not
officially have a Kinesis connector.

We have a few questions below. It would be great if you could answer them.

   1. Does Spark officially provide a Kinesis connector that supports
   readStream/writeStream, and does it endorse any connector for production
   use cases?
   2. https://spark.apache.org/docs/latest/streaming-kinesis-integration.html
   This documentation does not mention how to write to Kinesis. This method
   uses DynamoDB as the default checkpoint store; can we override it?
   3. We use RocksDB as the state store, but when we ran an application using
   the official
   https://spark.apache.org/docs/latest/streaming-kinesis-integration.html
   the RocksDB configurations had no effect. Can you please confirm whether
   RocksDB is not applicable in this case?
   4. RocksDB does, however, work with the qubole connector. Do you have any
   plans to release a Kinesis connector?
   5. Please recommend a good, stable Kinesis connector, or give us some
   pointers.
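
A note on question 3, offered as a sketch rather than an answer from the
thread: the page linked above documents the DStream-based Kinesis receiver,
while the RocksDB state store is selected through a Structured Streaming
setting, spark.sql.streaming.stateStore.providerClass (available since Spark
3.2, to the best of my knowledge). If that is the cause, the setting would only
take effect in a Structured Streaming query, such as one built on the qubole
connector. The helper names below are hypothetical; only the configuration key
and the provider class name are taken from the Spark documentation.

```python
# Sketch only: builds the spark-submit flags that select the RocksDB state
# store provider for a Structured Streaming job. Helper names are made up
# for illustration.

ROCKSDB_PROVIDER = (
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider"
)


def rocksdb_state_store_conf():
    """Return the Spark SQL setting that selects the RocksDB state store."""
    return {"spark.sql.streaming.stateStore.providerClass": ROCKSDB_PROVIDER}


def as_submit_flags(conf):
    """Render a settings dict as spark-submit --conf arguments, sorted by key."""
    flags = []
    for key in sorted(conf):
        flags.extend(["--conf", f"{key}={conf[key]}"])
    return flags


if __name__ == "__main__":
    # Prints the flags on one line so they can be pasted into a command.
    print(" ".join(as_submit_flags(rocksdb_state_store_conf())))
```

The same key can also be set on the session with spark.conf.set(...) before
the streaming query is started, instead of being passed on the command line.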

RE: spark streaming and kinesis integration

Posted by "Jonske, Kurt" <kj...@alvarezandmarsal.com.INVALID>.
unsubscribe


From: Mich Talebzadeh <mi...@gmail.com>
Sent: Thursday, April 06, 2023 11:45 AM
To: Rajesh Katkar <ka...@gmail.com>
Cc: user@spark.incubator.apache.org
Subject: Re: spark streaming and kinesis integration



Do you have a high level diagram of the proposed solution?

In so far as I know k8s does not support spark structured streaming?

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies
London
United Kingdom





Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.




On Thu, 6 Apr 2023 at 16:40, Rajesh Katkar <ka...@gmail.com> wrote:
The use case is that we want to read from and write to Kinesis streams using k8s.
I could not find an official connector or reader for Kinesis in Spark like the one it has for Kafka.

Checking here whether anyone has used the Kinesis and Spark Streaming combination.

On Thu, 6 Apr, 2023, 7:23 pm Mich Talebzadeh, <mi...@gmail.com> wrote:
Hi Rajesh,

What is the use case for Kinesis here? I have not used it personally. Which use case does it concern?

https://aws.amazon.com/kinesis/

Can you use something else instead?

HTH


Re: spark streaming and kinesis integration

Posted by Mich Talebzadeh <mi...@gmail.com>.
Just to clarify, a major benefit of k8s in this case is that it hosts your
Spark applications as containers in an automated fashion, so that one can
easily deploy as many instances of the application as required
(autoscaling). From the document below:

https://price2meet.com/gcp/docs/dataproc_docs_concepts_configuring-clusters_autoscaling.pdf

"Autoscaling does not support Spark Structured Streaming (
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html)
(see Autoscaling and Spark Structured Streaming)."

By the same token, k8s is (as of now) more suitable for batch jobs than for
Spark Structured Streaming.
https://issues.apache.org/jira/browse/SPARK-12133

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies
London
United Kingdom






On Mon, 10 Apr 2023 at 19:06, Mich Talebzadeh <mi...@gmail.com>
wrote:

> What I said was this:
> "In so far as I know k8s does not support spark structured streaming?"
>
> So it is an open question. I just recalled it. I have not tested it myself. I
> know Structured Streaming works on a Google Dataproc cluster, but I have not
> seen any official link that says Spark Structured Streaming is supported on
> k8s.
>
> HTH
>
> On Mon, 10 Apr 2023 at 06:31, Rajesh Katkar <ka...@gmail.com>
> wrote:
>
>> Do you have any link or ticket showing that k8s does not support
>> Spark Streaming?
>>

Re: Re: spark streaming and kinesis integration

Posted by 孙令哲 <li...@hirain.com>.
Hi Rajesh,


It's working fine, at least for now, but you'll need to build your own Spark image using later versions.


Lingzhe Sun
Hirain Technologies

Original:
From: Rajesh Katkar <ka...@gmail.com>
Date: 2023-04-12 21:36:52
To: Lingzhe Sun <li...@hirain.com>
Cc: Mich Talebzadeh <mi...@gmail.com>, user <us...@spark.incubator.apache.org>
Subject: Re: Re: spark streaming and kinesis integration

Hi Lingzhe,

We have also started using this operator.
Do you see any issues with it?




On Wed, 12 Apr, 2023, 7:25 am Lingzhe Sun, <li...@hirain.com> wrote:

Hi Mich,


FYI, we have been using the Spark operator (https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) to run stateful Structured Streaming on k8s for a year. We haven't tested it without the operator.


Besides that, the main contributor to the Spark operator, Yinan Li, has been inactive for quite a long time. I'm somewhat worried that this project might eventually become outdated as k8s evolves. So if anyone is interested, please support the project.


Lingzhe Sun
Hirain Technologies

 

Re: Re: spark streaming and kinesis integration

Posted by Yi Huang <hu...@gmail.com>.
unsubscribe


Re: Re: spark streaming and kinesis integration

Posted by Rajesh Katkar <ka...@gmail.com>.
Hi Lingzhe,

We have also started using this operator.
Do you see any issues with it?



Re: Re: spark streaming and kinesis integration

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Lingzhe Sun,

Thanks for your comments. I am afraid I won't be able to take part in this
project and contribute.

HTH

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom






On Wed, 12 Apr 2023 at 02:55, Lingzhe Sun <li...@hirain.com> wrote:

> Hi Mich,
>
> FYI, we have been using the Spark operator (
> https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) to run
> stateful Structured Streaming jobs on k8s for a year. We haven't tested it
> without the operator.
>
> Besides that, the main contributor to the Spark operator, Yinan Li, has
> been inactive for quite a long time. I am somewhat worried that the project
> might eventually become outdated as k8s evolves, so if anyone is
> interested, please support the project.
>
> ------------------------------
> Lingzhe Sun
> Hirain Technologies
>

Re: Re: spark streaming and kinesis integration

Posted by Lingzhe Sun <li...@hirain.com>.
Hi Mich,

FYI, we have been using the Spark operator (https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) to run stateful Structured Streaming jobs on k8s for a year. We haven't tested it without the operator.

Besides that, the main contributor to the Spark operator, Yinan Li, has been inactive for quite a long time. I am somewhat worried that the project might eventually become outdated as k8s evolves, so if anyone is interested, please support the project.



Lingzhe Sun
Hirain Technologies
 
From: Mich Talebzadeh
Date: 2023-04-11 02:06
To: Rajesh Katkar
CC: user
Subject: Re: spark streaming and kinesis integration
What I said was this:
"In so far as I know k8s does not support spark structured streaming?"

So it is an open question. I just recalled it. I have not tested it myself. I know Structured Streaming works on a Google Dataproc cluster, but I have not seen any official link that says Spark Structured Streaming is supported on k8s.

HTH


Re: spark streaming and kinesis integration

Posted by Mich Talebzadeh <mi...@gmail.com>.
What I said was this:
"In so far as I know k8s does not support spark structured streaming?"

So it is an open question. I just recalled it. I have not tested it myself. I
know Structured Streaming works on a Google Dataproc cluster, but I have not
seen any official link that says Spark Structured Streaming is supported on
k8s.

HTH

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies
London
United Kingdom


On Mon, 10 Apr 2023 at 06:31, Rajesh Katkar <ka...@gmail.com> wrote:

> Do you have any link or ticket that shows k8s does not support
> Spark streaming?
>

Re: spark streaming and kinesis integration

Posted by Rajesh Katkar <ka...@gmail.com>.
Do you have any link or ticket that shows k8s does not support
Spark streaming?

On Thu, 6 Apr, 2023, 9:15 pm Mich Talebzadeh, <mi...@gmail.com>
wrote:

> Do you have a high-level diagram of the proposed solution?
>
> In so far as I know k8s does not support spark structured streaming?
>

Re: spark streaming and kinesis integration

Posted by Mich Talebzadeh <mi...@gmail.com>.
Do you have a high-level diagram of the proposed solution?

In so far as I know k8s does not support spark structured streaming?

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies
London
United Kingdom


On Thu, 6 Apr 2023 at 16:40, Rajesh Katkar <ka...@gmail.com> wrote:

> The use case is that we want to read/write Kinesis streams using k8s.
> I could not find an official connector or reader for Kinesis in Spark
> like the one it has for Kafka.
>
> Has anyone here used Kinesis and Spark streaming together?
>

Re: spark streaming and kinesis integration

Posted by Rajesh Katkar <ka...@gmail.com>.
The use case is that we want to read/write Kinesis streams using k8s.
I could not find an official connector or reader for Kinesis in Spark
like the one it has for Kafka.

Has anyone here used Kinesis and Spark streaming together?

On Thu, 6 Apr, 2023, 7:23 pm Mich Talebzadeh, <mi...@gmail.com>
wrote:

> Hi Rajesh,
>
> What is the use case for Kinesis here? I have not used it personally.
> Which use case does it concern?
>
> https://aws.amazon.com/kinesis/
>
> Can you use something else instead?
>
> HTH
>

Re: spark streaming and kinesis integration

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Rajesh,

What is the use case for Kinesis here? I have not used it personally.
Which use case does it concern?

https://aws.amazon.com/kinesis/

Can you use something else instead?

HTH

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies
London
United Kingdom


On Thu, 6 Apr 2023 at 13:08, Rajesh Katkar <ka...@gmail.com> wrote:

> Hi Spark Team,
>
> We need to read/write Kinesis streams using Spark streaming.
>
> We checked the official documentation -
> https://spark.apache.org/docs/latest/streaming-kinesis-integration.html
>
> It does not mention a Kinesis connector. The alternative is
> https://github.com/qubole/kinesis-sql, which is no longer active. It has
> now been handed over to https://github.com/roncemer/spark-sql-kinesis
>
> Also, according to SPARK-18165
> <https://issues.apache.org/jira/browse/SPARK-18165>, Spark does not
> officially have any Kinesis connector.
>
> We have a few questions below. It would be great if you could answer them:
>
>    1. Does Spark officially provide any Kinesis connector that supports
>    readStream/writeStream, and does it endorse any connector for production
>    use cases?
>    2. The documentation at
>    https://spark.apache.org/docs/latest/streaming-kinesis-integration.html
>    does not mention how to write to Kinesis. This method uses DynamoDB as
>    the default checkpoint store; can we override it?
>    3. We use RocksDB as a state store, but when we ran an application using
>    the official integration
>    (https://spark.apache.org/docs/latest/streaming-kinesis-integration.html),
>    the RocksDB configurations were not effective. Can you please confirm
>    whether RocksDB is not applicable in these cases?
>    4. RocksDB does, however, work with the Qubole connector. Do you have
>    any plan to release a Kinesis connector?
>    5. Please recommend a good, stable Kinesis connector, or give us some
>    pointers.
>
>