Posted to dev@spark.apache.org by Jungtaek Lim <ka...@gmail.com> on 2023/01/11 03:20:56 UTC

[DISCUSS] Deprecate DStream in 3.4

Hi dev,

I'd like to propose the deprecation of DStream in Spark 3.4, in favor of
promoting Structured Streaming.
(Sorry for the late proposal; if we don't make the change in 3.4, we will
have to wait another 6 months.)

We have been focusing on Structured Streaming for years (across multiple
major and minor versions), and during that time we haven't made any
improvements to DStream. Furthermore, we recently updated the DStream doc
to explicitly say DStream is a legacy project.
https://spark.apache.org/docs/latest/streaming-programming-guide.html#note

The baseline for deprecation is that we don't see any particular use case
that only DStream solves. This is a different story from GraphX and MLlib,
as we don't have replacements for those.

The proposal does not mean we will remove the API soon; the Spark project
has deprecated public APIs before without removing them right away. I don't
intend to propose a target version for removal. The goal is to guide users
away from building new workloads with DStream. We might want to pursue
removal in the future, but that would require a new discussion thread at
that time.

What do you think?

Thanks,
Jungtaek Lim (HeartSaVioR)

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Reynold Xin <rx...@databricks.com.INVALID>.
+1

On Thu, Jan 12, 2023 at 9:46 PM, Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:

> +1 for the proposal (guiding only without any code change).
>
> Thanks,
> Dongjoon.
>
> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zsxwing@gmail.com> wrote:
>
>> +1
>>
>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <
>> tathagata.das1565@gmail.com> wrote:
>>
>>> +1
>>>
>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <
>>>> kabhwan.opensource@gmail.com> wrote:
>>>>
>>>>> bump for more visibility.
>>>>>
>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <
>>>>> kabhwan.opensource@gmail.com> wrote:
>>>>>
>>>>>> Hi dev,
>>>>>>
>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4, in favor of
>>>>>> promoting Structured Streaming.
>>>>>> (Sorry for the late proposal, if we don't make the change in 3.4, we will
>>>>>> have to wait for another 6 months.)
>>>>>>
>>>>>> We have been focusing on Structured Streaming for years (across multiple
>>>>>> major and minor versions), and during the time we haven't made any
>>>>>> improvements for DStream. Furthermore, recently we updated the DStream doc
>>>>>> to explicitly say DStream is a legacy project.
>>>>>> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>>>>>>
>>>>>> The baseline of deprecation is that we don't see a particular use case
>>>>>> which only DStream solves. This is a different story with GraphX and
>>>>>> MLLIB, as we don't have replacements for that.
>>>>>>
>>>>>> The proposal does not mean we will remove the API soon, as the Spark
>>>>>> project has been making deprecation against public API. I don't intend to
>>>>>> propose the target version for removal. The goal is to guide users to
>>>>>> refrain from constructing a new workload with DStream. We might want to go
>>>>>> with this in future, but it would require a new discussion thread at that
>>>>>> time.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> Thanks,
>>>>>> Jungtaek Lim (HeartSaVioR)

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jungtaek Lim <ka...@gmail.com>.
There might be terminology differences, so let me spell out the action
items from the proposal explicitly:

- Add a "deprecation" annotation to the user-facing public API in the
streaming directory (DStream).
- Write a release note that explicitly mentions the deprecation. (Maybe
promote again that users are encouraged to move to SS.)

This is not an action item from the proposal:

- Add a (tentative) target version for removing the API to the deprecation
message.

Hope this makes the proposal crystal clear.
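
To make the first action item concrete, here is a minimal sketch in Python
of a deprecation marker that names a "since" version and a replacement but
deliberately says nothing about a removal version. The decorator name
`deprecated` and the class `LegacyStreamingEntryPoint` are hypothetical
stand-ins for illustration, not Spark code; only the standard `warnings`
and `functools` modules are used.

```python
import functools
import warnings


def deprecated(since, alternative):
    """Mark a class as deprecated without committing to a removal version."""

    def decorate(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def init_with_warning(self, *args, **kwargs):
            # The message states when the deprecation started and where to
            # migrate, but intentionally omits any target removal version.
            warnings.warn(
                f"{cls.__name__} is deprecated as of {since}. "
                f"Migrate to {alternative}.",
                FutureWarning,
                stacklevel=2,
            )
            original_init(self, *args, **kwargs)

        cls.__init__ = init_with_warning
        return cls

    return decorate


@deprecated(since="3.4.0", alternative="Structured Streaming")
class LegacyStreamingEntryPoint:  # hypothetical stand-in, not a Spark class
    def __init__(self, batch_seconds):
        self.batch_seconds = batch_seconds
```

Constructing `LegacyStreamingEntryPoint(5)` still works as before; the only
visible change for existing users is the warning, which matches the intent
of guiding rather than removing.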

On Fri, Jan 13, 2023 at 3:05 PM Jungtaek Lim <ka...@gmail.com>
wrote:

> Maybe I need to clarify - my proposal is to "explicitly" deprecate it,
> which requires a code change for sure. Guidance on the Spark website is
> already done, as I mentioned - we updated the DStream doc page to mention
> that DStream is a "legacy" project and users should move to SS. I don't
> feel this is sufficient to discourage users from using it, hence this
> proposal.
>
> Sorry for the confusion. I just wanted to make sure the goal of the
> proposal is not read as "removing" the API. Discussions on removing an API
> don't tend to go well, so I wanted to make clear that I don't mean that.
>
> On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <do...@gmail.com>
> wrote:
>
>> +1 for the proposal (guiding only without any code change).
>>
>> Thanks,
>> Dongjoon.
>>
>> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com> wrote:
>>
>>> +1
>>>
>>>
>>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <
>>> tathagata.das1565@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <
>>>>> kabhwan.opensource@gmail.com> wrote:
>>>>>
>>>>>> bump for more visibility.
>>>>>>
>>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <
>>>>>> kabhwan.opensource@gmail.com> wrote:
>>>>>>
>>>>>>> Hi dev,
>>>>>>>
>>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4, in
>>>>>>> favor of promoting Structured Streaming.
>>>>>>> (Sorry for the late proposal, if we don't make the change in 3.4, we
>>>>>>> will have to wait for another 6 months.)
>>>>>>>
>>>>>>> We have been focusing on Structured Streaming for years (across
>>>>>>> multiple major and minor versions), and during the time we haven't made any
>>>>>>> improvements for DStream. Furthermore, recently we updated the DStream doc
>>>>>>> to explicitly say DStream is a legacy project.
>>>>>>>
>>>>>>> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>>>>>>>
>>>>>>> The baseline of deprecation is that we don't see a particular use
>>>>>>> case which only DStream solves. This is a different story with GraphX and
>>>>>>> MLLIB, as we don't have replacements for that.
>>>>>>>
>>>>>>> The proposal does not mean we will remove the API soon, as the Spark
>>>>>>> project has been making deprecation against public API. I don't intend to
>>>>>>> propose the target version for removal. The goal is to guide users to
>>>>>>> refrain from constructing a new workload with DStream. We might want to go
>>>>>>> with this in future, but it would require a new discussion thread at that
>>>>>>> time.
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>>
>>>>>>

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jungtaek Lim <ka...@gmail.com>.
Heads-up: this has been addressed via
https://issues.apache.org/jira/browse/SPARK-42075. We marked only the
entry point of DStream, StreamingContext, as deprecated. Marking every
class in the DStream module is not pragmatic, and users will see the
warning message anyway.
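
As a rough illustration of why marking only the entry point can be enough,
the Python sketch below uses two hypothetical classes, `LegacyContext` and
`LegacyStream` (neither is a Spark class). Only the context warns; the
stream class carries no marker, yet every stream is reachable only through
the context, so any program that uses the API passes through the warning.

```python
import warnings


class LegacyContext:  # hypothetical entry point; the only class that warns
    def __init__(self):
        warnings.warn(
            "LegacyContext is deprecated. Migrate to the newer API.",
            FutureWarning,
            stacklevel=2,
        )

    def text_stream(self, lines):
        # Streams are only ever created through the context.
        return LegacyStream(self, list(lines))


class LegacyStream:  # hypothetical stream class; carries no deprecation marker
    def __init__(self, context, items):
        self.context = context
        self.items = items

    def map(self, fn):
        # Derived streams reuse the same context and emit no new warning.
        return LegacyStream(self.context, [fn(x) for x in self.items])
```

Under these assumptions a whole pipeline emits a single warning, at the
point where the context is created, instead of one per class touched.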

On Mon, Jan 16, 2023 at 8:26 AM Jungtaek Lim <ka...@gmail.com>
wrote:

> Given that I got positive votes from more than 3 PMC members as well as
> several active contributors, I will proceed with the actual work.
> (It may take a couple more days, as folks in the US will help me and
> there's a holiday in the US.)
>
> Please let me know if we want to have an official vote thread before
> moving forward.
>
> Thanks all for providing your voices on this!
>
> On Sat, Jan 14, 2023 at 3:56 AM Anish Shrigondekar <
> anish.shrigondekar@databricks.com> wrote:
>
>> +1 on the Dstreams deprecation proposal
>>
>> On Fri, Jan 13, 2023 at 10:47 AM Jerry Peng <je...@gmail.com>
>> wrote:
>>
>>> +1 in general for marking the DStreams API as deprecated
>>>
>>> Jungtaek, can you please provide / elaborate on the concrete actions you
>>> intend to take for the deprecation process?
>>>
>>> Best,
>>>
>>> Jerry
>>>
>>> On Thu, Jan 12, 2023 at 11:16 PM L. C. Hsieh <vi...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> On Thu, Jan 12, 2023 at 10:39 PM Jungtaek Lim
>>>> <ka...@gmail.com> wrote:
>>>> >
>>>> > Yes, exactly. I'm sorry to bring confusion - should have clarified
>>>> action items on the proposal.
>>>> >
>>>> > On Fri, Jan 13, 2023 at 3:31 PM Dongjoon Hyun <
>>>> dongjoon.hyun@gmail.com> wrote:
>>>> >>
>>>> >> Then, could you elaborate `the proposed code change` specifically?
>>>> >> Maybe, usual deprecation warning logs and annotation on the API?
>>>> >>
>>>> >>
>>>> >> On Thu, Jan 12, 2023 at 10:05 PM Jungtaek Lim <
>>>> kabhwan.opensource@gmail.com> wrote:
>>>> >>>
>>>> >>> Maybe I need to clarify - my proposal is "explicitly" deprecating
>>>> it, which incurs code change for sure. Guidance on the Spark website is
>>>> done already as I mentioned - we updated the DStream doc page to mention
>>>> that DStream is a "legacy" project and users should move to SS. I don't
>>>> feel this is sufficient to refrain users from using it, hence initiating
>>>> this proposal.
>>>> >>>
>>>> >>> Sorry to make confusion. I just wanted to make sure the goal of the
>>>> proposal is not "removing" the API. The discussion on the removal of API
>>>> doesn't tend to go well, so I wanted to make sure I don't mean that.
>>>> >>>
>>>> >>> On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <
>>>> dongjoon.hyun@gmail.com> wrote:
>>>> >>>>
>>>> >>>> +1 for the proposal (guiding only without any code change).
>>>> >>>>
>>>> >>>> Thanks,
>>>> >>>> Dongjoon.
>>>> >>>>
>>>> >>>> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com>
>>>> wrote:
>>>> >>>>>
>>>> >>>>> +1
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <
>>>> tathagata.das1565@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> +1
>>>> >>>>>>
>>>> >>>>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <
>>>> gurwls223@gmail.com> wrote:
>>>> >>>>>>>
>>>> >>>>>>> +1
>>>> >>>>>>>
>>>> >>>>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <
>>>> kabhwan.opensource@gmail.com> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> bump for more visibility.
>>>> >>>>>>>>
>>>> >>>>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <
>>>> kabhwan.opensource@gmail.com> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>> Hi dev,
>>>> >>>>>>>>>
>>>> >>>>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4,
>>>> in favor of promoting Structured Streaming.
>>>> >>>>>>>>> (Sorry for the late proposal, if we don't make the change in
>>>> 3.4, we will have to wait for another 6 months.)
>>>> >>>>>>>>>
>>>> >>>>>>>>> We have been focusing on Structured Streaming for years
>>>> (across multiple major and minor versions), and during the time we haven't
>>>> made any improvements for DStream. Furthermore, recently we updated the
>>>> DStream doc to explicitly say DStream is a legacy project.
>>>> >>>>>>>>>
>>>> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>>>> >>>>>>>>>
>>>> >>>>>>>>> The baseline of deprecation is that we don't see a particular
>>>> use case which only DStream solves. This is a different story with GraphX
>>>> and MLLIB, as we don't have replacements for that.
>>>> >>>>>>>>>
>>>> >>>>>>>>> The proposal does not mean we will remove the API soon, as
>>>> the Spark project has been making deprecation against public API. I don't
>>>> intend to propose the target version for removal. The goal is to guide
>>>> users to refrain from constructing a new workload with DStream. We might
>>>> want to go with this in future, but it would require a new discussion
>>>> thread at that time.
>>>> >>>>>>>>>
>>>> >>>>>>>>> What do you think?
>>>> >>>>>>>>>
>>>> >>>>>>>>> Thanks,
>>>> >>>>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>
>>>>

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jungtaek Lim <ka...@gmail.com>.
Given that I got positive votes from more than 3 PMC members as well as
several active contributors, I will proceed with the actual work.
(It may take a couple more days, as folks in the US will help me and
there's a holiday in the US.)

Please let me know if we want to have an official vote thread before moving
forward.

Thanks all for providing your voices on this!

On Sat, Jan 14, 2023 at 3:56 AM Anish Shrigondekar <
anish.shrigondekar@databricks.com> wrote:

> +1 on the Dstreams deprecation proposal
>
> On Fri, Jan 13, 2023 at 10:47 AM Jerry Peng <je...@gmail.com>
> wrote:
>
>> +1 in general for marking the DStreams API as deprecated
>>
>> Jungtaek, can you please provide / elaborate on the concrete actions you
>> intend to take for the deprecation process?
>>
>> Best,
>>
>> Jerry
>>
>> On Thu, Jan 12, 2023 at 11:16 PM L. C. Hsieh <vi...@gmail.com> wrote:
>>
>>> +1
>>>
>>> On Thu, Jan 12, 2023 at 10:39 PM Jungtaek Lim
>>> <ka...@gmail.com> wrote:
>>> >
>>> > Yes, exactly. I'm sorry to bring confusion - should have clarified
>>> action items on the proposal.
>>> >
>>> > On Fri, Jan 13, 2023 at 3:31 PM Dongjoon Hyun <do...@gmail.com>
>>> wrote:
>>> >>
>>> >> Then, could you elaborate `the proposed code change` specifically?
>>> >> Maybe, usual deprecation warning logs and annotation on the API?
>>> >>
>>> >>
>>> >> On Thu, Jan 12, 2023 at 10:05 PM Jungtaek Lim <
>>> kabhwan.opensource@gmail.com> wrote:
>>> >>>
>>> >>> Maybe I need to clarify - my proposal is "explicitly" deprecating
>>> it, which incurs code change for sure. Guidance on the Spark website is
>>> done already as I mentioned - we updated the DStream doc page to mention
>>> that DStream is a "legacy" project and users should move to SS. I don't
>>> feel this is sufficient to refrain users from using it, hence initiating
>>> this proposal.
>>> >>>
>>> >>> Sorry to make confusion. I just wanted to make sure the goal of the
>>> proposal is not "removing" the API. The discussion on the removal of API
>>> doesn't tend to go well, so I wanted to make sure I don't mean that.
>>> >>>
>>> >>> On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <
>>> dongjoon.hyun@gmail.com> wrote:
>>> >>>>
>>> >>>> +1 for the proposal (guiding only without any code change).
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Dongjoon.
>>> >>>>
>>> >>>> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com>
>>> wrote:
>>> >>>>>
>>> >>>>> +1
>>> >>>>>
>>> >>>>>
>>> >>>>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <
>>> tathagata.das1565@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> +1
>>> >>>>>>
>>> >>>>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>> >>>>>>>
>>> >>>>>>> +1
>>> >>>>>>>
>>> >>>>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <
>>> kabhwan.opensource@gmail.com> wrote:
>>> >>>>>>>>
>>> >>>>>>>> bump for more visibility.
>>> >>>>>>>>
>>> >>>>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <
>>> kabhwan.opensource@gmail.com> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>> Hi dev,
>>> >>>>>>>>>
>>> >>>>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4,
>>> in favor of promoting Structured Streaming.
>>> >>>>>>>>> (Sorry for the late proposal, if we don't make the change in
>>> 3.4, we will have to wait for another 6 months.)
>>> >>>>>>>>>
>>> >>>>>>>>> We have been focusing on Structured Streaming for years
>>> (across multiple major and minor versions), and during the time we haven't
>>> made any improvements for DStream. Furthermore, recently we updated the
>>> DStream doc to explicitly say DStream is a legacy project.
>>> >>>>>>>>>
>>> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>>> >>>>>>>>>
>>> >>>>>>>>> The baseline of deprecation is that we don't see a particular
>>> use case which only DStream solves. This is a different story with GraphX
>>> and MLLIB, as we don't have replacements for that.
>>> >>>>>>>>>
>>> >>>>>>>>> The proposal does not mean we will remove the API soon, as the
>>> Spark project has been making deprecation against public API. I don't
>>> intend to propose the target version for removal. The goal is to guide
>>> users to refrain from constructing a new workload with DStream. We might
>>> want to go with this in future, but it would require a new discussion
>>> thread at that time.
>>> >>>>>>>>>
>>> >>>>>>>>> What do you think?
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks,
>>> >>>>>>>>> Jungtaek Lim (HeartSaVioR)

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Anish Shrigondekar <an...@databricks.com.INVALID>.
+1 on the Dstreams deprecation proposal

On Fri, Jan 13, 2023 at 10:47 AM Jerry Peng <je...@gmail.com>
wrote:

> +1 in general for marking the DStreams API as deprecated
>
> Jungtaek, can you please provide / elaborate on the concrete actions you
> intend to take for the deprecation process?
>
> Best,
>
> Jerry
>
> On Thu, Jan 12, 2023 at 11:16 PM L. C. Hsieh <vi...@gmail.com> wrote:
>
>> +1
>>
>> On Thu, Jan 12, 2023 at 10:39 PM Jungtaek Lim
>> <ka...@gmail.com> wrote:
>> >
>> > Yes, exactly. I'm sorry to bring confusion - should have clarified
>> action items on the proposal.
>> >
>> > On Fri, Jan 13, 2023 at 3:31 PM Dongjoon Hyun <do...@gmail.com>
>> wrote:
>> >>
>> >> Then, could you elaborate `the proposed code change` specifically?
>> >> Maybe, usual deprecation warning logs and annotation on the API?
>> >>
>> >>
>> >> On Thu, Jan 12, 2023 at 10:05 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>
>> >>> Maybe I need to clarify - my proposal is "explicitly" deprecating it,
>> which incurs code change for sure. Guidance on the Spark website is done
>> already as I mentioned - we updated the DStream doc page to mention that
>> DStream is a "legacy" project and users should move to SS. I don't feel
>> this is sufficient to refrain users from using it, hence initiating this
>> proposal.
>> >>>
>> >>> Sorry to make confusion. I just wanted to make sure the goal of the
>> proposal is not "removing" the API. The discussion on the removal of API
>> doesn't tend to go well, so I wanted to make sure I don't mean that.
>> >>>
>> >>> On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <
>> dongjoon.hyun@gmail.com> wrote:
>> >>>>
>> >>>> +1 for the proposal (guiding only without any code change).
>> >>>>
>> >>>> Thanks,
>> >>>> Dongjoon.
>> >>>>
>> >>>> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> +1
>> >>>>>
>> >>>>>
>> >>>>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <
>> tathagata.das1565@gmail.com> wrote:
>> >>>>>>
>> >>>>>> +1
>> >>>>>>
>> >>>>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>> >>>>>>>
>> >>>>>>> +1
>> >>>>>>>
>> >>>>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> bump for more visibility.
>> >>>>>>>>
>> >>>>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hi dev,
>> >>>>>>>>>
>> >>>>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4, in
>> favor of promoting Structured Streaming.
>> >>>>>>>>> (Sorry for the late proposal, if we don't make the change in
>> 3.4, we will have to wait for another 6 months.)
>> >>>>>>>>>
>> >>>>>>>>> We have been focusing on Structured Streaming for years (across
>> multiple major and minor versions), and during the time we haven't made any
>> improvements for DStream. Furthermore, recently we updated the DStream doc
>> to explicitly say DStream is a legacy project.
>> >>>>>>>>>
>> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>> >>>>>>>>>
>> >>>>>>>>> The baseline of deprecation is that we don't see a particular
>> use case which only DStream solves. This is a different story with GraphX
>> and MLLIB, as we don't have replacements for that.
>> >>>>>>>>>
>> >>>>>>>>> The proposal does not mean we will remove the API soon, as the
>> Spark project has been making deprecation against public API. I don't
>> intend to propose the target version for removal. The goal is to guide
>> users to refrain from constructing a new workload with DStream. We might
>> want to go with this in future, but it would require a new discussion
>> thread at that time.
>> >>>>>>>>>
>> >>>>>>>>> What do you think?
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Jungtaek Lim (HeartSaVioR)

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jungtaek Lim <ka...@gmail.com>.
I described it in the thread - I had to add it in a reply, so it's not
easy to find. Sorry for the inconvenience.

https://lists.apache.org/thread/d9yg7w9pnb9rw7c2yglp4qk6jt43y0kw


On Sat, Jan 14, 2023 at 3:46 AM Jerry Peng <je...@gmail.com>
wrote:

> +1 in general for marking the DStreams API as deprecated
>
> Jungtaek, can you please provide / elaborate on the concrete actions you
> intend to take for the deprecation process?
>
> Best,
>
> Jerry
>
> On Thu, Jan 12, 2023 at 11:16 PM L. C. Hsieh <vi...@gmail.com> wrote:
>
>> +1
>>
>> On Thu, Jan 12, 2023 at 10:39 PM Jungtaek Lim
>> <ka...@gmail.com> wrote:
>> >
>> > Yes, exactly. I'm sorry to bring confusion - should have clarified
>> action items on the proposal.
>> >
>> > On Fri, Jan 13, 2023 at 3:31 PM Dongjoon Hyun <do...@gmail.com>
>> wrote:
>> >>
>> >> Then, could you elaborate `the proposed code change` specifically?
>> >> Maybe, usual deprecation warning logs and annotation on the API?
>> >>
>> >>
>> >> On Thu, Jan 12, 2023 at 10:05 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>
>> >>> Maybe I need to clarify - my proposal is "explicitly" deprecating it,
>> which incurs code change for sure. Guidance on the Spark website is done
>> already as I mentioned - we updated the DStream doc page to mention that
>> DStream is a "legacy" project and users should move to SS. I don't feel
>> this is sufficient to refrain users from using it, hence initiating this
>> proposal.
>> >>>
>> >>> Sorry to make confusion. I just wanted to make sure the goal of the
>> proposal is not "removing" the API. The discussion on the removal of API
>> doesn't tend to go well, so I wanted to make sure I don't mean that.
>> >>>
>> >>> On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <
>> dongjoon.hyun@gmail.com> wrote:
>> >>>>
>> >>>> +1 for the proposal (guiding only without any code change).
>> >>>>
>> >>>> Thanks,
>> >>>> Dongjoon.
>> >>>>
>> >>>> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> +1
>> >>>>>
>> >>>>>
>> >>>>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <
>> tathagata.das1565@gmail.com> wrote:
>> >>>>>>
>> >>>>>> +1
>> >>>>>>
>> >>>>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>> >>>>>>>
>> >>>>>>> +1
>> >>>>>>>
>> >>>>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> bump for more visibility.
>> >>>>>>>>
>> >>>>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <
>> kabhwan.opensource@gmail.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hi dev,
>> >>>>>>>>>
>> >>>>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4, in
>> favor of promoting Structured Streaming.
>> >>>>>>>>> (Sorry for the late proposal, if we don't make the change in
>> 3.4, we will have to wait for another 6 months.)
>> >>>>>>>>>
>> >>>>>>>>> We have been focusing on Structured Streaming for years (across
>> multiple major and minor versions), and during the time we haven't made any
>> improvements for DStream. Furthermore, recently we updated the DStream doc
>> to explicitly say DStream is a legacy project.
>> >>>>>>>>>
>> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>> >>>>>>>>>
>> >>>>>>>>> The baseline of deprecation is that we don't see a particular
>> use case which only DStream solves. This is a different story with GraphX
>> and MLLIB, as we don't have replacements for that.
>> >>>>>>>>>
>> >>>>>>>>> The proposal does not mean we will remove the API soon, as the
>> Spark project has been making deprecation against public API. I don't
>> intend to propose the target version for removal. The goal is to guide
>> users to refrain from constructing a new workload with DStream. We might
>> want to go with this in future, but it would require a new discussion
>> thread at that time.
>> >>>>>>>>>
>> >>>>>>>>> What do you think?
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Jungtaek Lim (HeartSaVioR)

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jerry Peng <je...@gmail.com>.
+1 in general for marking the DStreams API as deprecated

Jungtaek, can you please provide / elaborate on the concrete actions you
intend to take for the deprecation process?

Best,

Jerry

On Thu, Jan 12, 2023 at 11:16 PM L. C. Hsieh <vi...@gmail.com> wrote:

> +1
>
> On Thu, Jan 12, 2023 at 10:39 PM Jungtaek Lim
> <ka...@gmail.com> wrote:
> >
> > Yes, exactly. I'm sorry to bring confusion - should have clarified
> action items on the proposal.
> >
> > On Fri, Jan 13, 2023 at 3:31 PM Dongjoon Hyun <do...@gmail.com>
> wrote:
> >>
> >> Then, could you elaborate `the proposed code change` specifically?
> >> Maybe, usual deprecation warning logs and annotation on the API?
> >>
> >>
> >> On Thu, Jan 12, 2023 at 10:05 PM Jungtaek Lim <
> kabhwan.opensource@gmail.com> wrote:
> >>>
> >>> Maybe I need to clarify - my proposal is "explicitly" deprecating it,
> which incurs code change for sure. Guidance on the Spark website is done
> already as I mentioned - we updated the DStream doc page to mention that
> DStream is a "legacy" project and users should move to SS. I don't feel
> this is sufficient to refrain users from using it, hence initiating this
> proposal.
> >>>
> >>> Sorry to make confusion. I just wanted to make sure the goal of the
> proposal is not "removing" the API. The discussion on the removal of API
> doesn't tend to go well, so I wanted to make sure I don't mean that.
> >>>
> >>> On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <do...@gmail.com>
> wrote:
> >>>>
> >>>> +1 for the proposal (guiding only without any code change).
> >>>>
> >>>> Thanks,
> >>>> Dongjoon.
> >>>>
> >>>> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com>
> wrote:
> >>>>>
> >>>>> +1
> >>>>>
> >>>>>
> >>>>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <
> tathagata.das1565@gmail.com> wrote:
> >>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gu...@gmail.com>
> wrote:
> >>>>>>>
> >>>>>>> +1
> >>>>>>>
> >>>>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <
> kabhwan.opensource@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> bump for more visibility.
> >>>>>>>>
> >>>>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <
> kabhwan.opensource@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi dev,
> >>>>>>>>>
> >>>>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4, in
> favor of promoting Structured Streaming.
> >>>>>>>>> (Sorry for the late proposal, if we don't make the change in
> 3.4, we will have to wait for another 6 months.)
> >>>>>>>>>
> >>>>>>>>> We have been focusing on Structured Streaming for years (across
> multiple major and minor versions), and during the time we haven't made any
> improvements for DStream. Furthermore, recently we updated the DStream doc
> to explicitly say DStream is a legacy project.
> >>>>>>>>>
> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
> >>>>>>>>>
> >>>>>>>>> The baseline of deprecation is that we don't see a particular
> use case which only DStream solves. This is a different story with GraphX
> and MLLIB, as we don't have replacements for that.
> >>>>>>>>>
> >>>>>>>>> The proposal does not mean we will remove the API soon, as the
> Spark project has been making deprecation against public API. I don't
> intend to propose the target version for removal. The goal is to guide
> users to refrain from constructing a new workload with DStream. We might
> want to go with this in future, but it would require a new discussion
> thread at that time.
> >>>>>>>>>
> >>>>>>>>> What do you think?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Jungtaek Lim (HeartSaVioR)

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by "L. C. Hsieh" <vi...@gmail.com>.
+1

On Thu, Jan 12, 2023 at 10:39 PM Jungtaek Lim
<ka...@gmail.com> wrote:
>
> Yes, exactly. I'm sorry to bring confusion - should have clarified action items on the proposal.
>
> On Fri, Jan 13, 2023 at 3:31 PM Dongjoon Hyun <do...@gmail.com> wrote:
>>
>> Then, could you elaborate `the proposed code change` specifically?
>> Maybe, usual deprecation warning logs and annotation on the API?
>>
>>
>> On Thu, Jan 12, 2023 at 10:05 PM Jungtaek Lim <ka...@gmail.com> wrote:
>>>
>>> Maybe I need to clarify - my proposal is "explicitly" deprecating it, which incurs code change for sure. Guidance on the Spark website is done already as I mentioned - we updated the DStream doc page to mention that DStream is a "legacy" project and users should move to SS. I don't feel this is sufficient to refrain users from using it, hence initiating this proposal.
>>>
>>> Sorry to make confusion. I just wanted to make sure the goal of the proposal is not "removing" the API. The discussion on the removal of API doesn't tend to go well, so I wanted to make sure I don't mean that.
>>>
>>> On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <do...@gmail.com> wrote:
>>>>
>>>> +1 for the proposal (guiding only without any code change).
>>>>
>>>> Thanks,
>>>> Dongjoon.
>>>>
>>>> On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com> wrote:
>>>>>
>>>>> +1
>>>>>
>>>>>
>>>>> On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <ta...@gmail.com> wrote:
>>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <ka...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> bump for more visibility.
>>>>>>>>
>>>>>>>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <ka...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi dev,
>>>>>>>>>
>>>>>>>>> I'd like to propose the deprecation of DStream in Spark 3.4, in favor of promoting Structured Streaming.
>>>>>>>>> (Sorry for the late proposal, if we don't make the change in 3.4, we will have to wait for another 6 months.)
>>>>>>>>>
>>>>>>>>> We have been focusing on Structured Streaming for years (across multiple major and minor versions), and during the time we haven't made any improvements for DStream. Furthermore, recently we updated the DStream doc to explicitly say DStream is a legacy project.
>>>>>>>>> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>>>>>>>>>
>>>>>>>>> The baseline of deprecation is that we don't see a particular use case which only DStream solves. This is a different story with GraphX and MLLIB, as we don't have replacements for that.
>>>>>>>>>
>>>>>>>>> The proposal does not mean we will remove the API soon, as the Spark project has been making deprecation against public API. I don't intend to propose the target version for removal. The goal is to guide users to refrain from constructing a new workload with DStream. We might want to go with this in future, but it would require a new discussion thread at that time.
>>>>>>>>>
>>>>>>>>> What do you think?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jungtaek Lim (HeartSaVioR)


Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jungtaek Lim <ka...@gmail.com>.
Yes, exactly. Sorry for the confusion - I should have clarified the action
items in the proposal.

On Fri, Jan 13, 2023 at 3:31 PM Dongjoon Hyun <do...@gmail.com>
wrote:

> Then, could you elaborate on `the proposed code change` specifically?
> Maybe the usual deprecation warning logs and annotations on the API?

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Dongjoon Hyun <do...@gmail.com>.
Then, could you elaborate on `the proposed code change` specifically?
Maybe the usual deprecation warning logs and annotations on the API?


On Thu, Jan 12, 2023 at 10:05 PM Jungtaek Lim <ka...@gmail.com>
wrote:

> Maybe I need to clarify - my proposal is to "explicitly" deprecate it,
> which incurs a code change for sure. Guidance on the Spark website is
> already done, as I mentioned - we updated the DStream doc page to mention
> that DStream is a "legacy" project and users should move to Structured
> Streaming (SS). I don't feel this is sufficient to discourage users from
> using it, hence this proposal.
>
> Sorry for the confusion. I just wanted to make clear that the goal of the
> proposal is not "removing" the API. Discussions on removing an API don't
> tend to go well, so I wanted to make sure it's clear I don't mean that.

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jungtaek Lim <ka...@gmail.com>.
Maybe I need to clarify - my proposal is to "explicitly" deprecate it,
which incurs a code change for sure. Guidance on the Spark website is
already done, as I mentioned - we updated the DStream doc page to mention
that DStream is a "legacy" project and users should move to Structured
Streaming (SS). I don't feel this is sufficient to discourage users from
using it, hence this proposal.

Sorry for the confusion. I just wanted to make clear that the goal of the
proposal is not "removing" the API. Discussions on removing an API don't
tend to go well, so I wanted to make sure it's clear I don't mean that.

On Fri, Jan 13, 2023 at 2:46 PM Dongjoon Hyun <do...@gmail.com>
wrote:

> +1 for the proposal (guiding only without any code change).
>
> Thanks,
> Dongjoon.

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Dongjoon Hyun <do...@gmail.com>.
+1 for the proposal (guiding only without any code change).

Thanks,
Dongjoon.

On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu <zs...@gmail.com> wrote:

> +1

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Shixiong Zhu <zs...@gmail.com>.
+1


On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das <ta...@gmail.com>
wrote:

> +1

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Tathagata Das <ta...@gmail.com>.
+1

On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon <gu...@gmail.com> wrote:

> +1

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Hyukjin Kwon <gu...@gmail.com>.
+1

On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim <ka...@gmail.com>
wrote:

> bump for more visibility.

Re: [DISCUSS] Deprecate DStream in 3.4

Posted by Jungtaek Lim <ka...@gmail.com>.
bump for more visibility.

On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim <ka...@gmail.com>
wrote:

> Hi dev,
>
> I'd like to propose the deprecation of DStream in Spark 3.4, in favor of
> promoting Structured Streaming.
> (Sorry for the late proposal, if we don't make the change in 3.4, we will
> have to wait for another 6 months.)
>
> We have been focusing on Structured Streaming for years (across multiple
> major and minor versions), and during the time we haven't made any
> improvements for DStream. Furthermore, recently we updated the DStream doc
> to explicitly say DStream is a legacy project.
> https://spark.apache.org/docs/latest/streaming-programming-guide.html#note
>
> The baseline of deprecation is that we don't see a particular use case
> which only DStream solves. This is a different story with GraphX and MLLIB,
> as we don't have replacements for that.
>
> The proposal does not mean we will remove the API soon, as the Spark
> project has been making deprecation against public API. I don't intend to
> propose the target version for removal. The goal is to guide users to
> refrain from constructing a new workload with DStream. We might want to go
> with this in future, but it would require a new discussion thread at that
> time.
>
> What do you think?
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>