Posted to user@flink.apache.org by Noel OConnor <no...@gmail.com> on 2022/09/14 13:22:00 UTC

DataStream and DataStreamSource

Hi,
I'm new to Flink and I'm trying to integrate it with Apache Pulsar.
I've gone through the demos and I get how they work but one aspect
that I can't figure out is what's the difference between a DataStream
and a DataStreamSource.
When would you use one over the other?

cheers
Noel

Re: DataStream and DataStreamSource

Posted by Noel OConnor <no...@gmail.com>.
Awesome, thanks for the info! Much appreciated.

On Wed, Sep 14, 2022 at 5:04 PM Jing Ge <ji...@ververica.com> wrote:
>
> Hi,
>
> Welcome to the Flink community!
>
> A DataStreamSource is a DataStream. It is normally used as the starting point of a DataStream. All methods in StreamExecutionEnvironment that create a DataStream actually return a DataStreamSource, because that is where a DataStream starts.
>
> Usually you don't need to care about DataStreamSource; just use DataStream, even though the methods in StreamExecutionEnvironment return a DataStreamSource [1]. A DataStreamSource created by those methods uses a built-in SourceTransformation. If you want to modify the configuration of that transformation, you can declare the variable as DataStreamSource (instead of DataStream), which provides some setter methods inherited from SingleOutputStreamOperator.
>
> Best regards,
> Jing
>
> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/overview/

Re: DataStream and DataStreamSource

Posted by Jing Ge <ji...@ververica.com>.
Hi,

Welcome to the Flink community!

A DataStreamSource is a DataStream. It is normally used as the starting
point of a DataStream. All methods in StreamExecutionEnvironment that
create a DataStream actually return a DataStreamSource, because that is
where a DataStream starts.

Usually you don't need to care about DataStreamSource; just use DataStream,
even though the methods in StreamExecutionEnvironment return a
DataStreamSource [1]. A DataStreamSource created by those methods uses a
built-in SourceTransformation. If you want to modify the configuration of
that transformation, you can declare the variable as DataStreamSource
(instead of DataStream), which provides some setter methods inherited from
SingleOutputStreamOperator.
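
As a rough illustration of the relationship described above, here is a
minimal sketch in plain Java. The classes below are simplified stand-ins,
not Flink's real classes: only the class names and the setParallelism /
getParallelism method names are borrowed, and the fromElements factory here
is a hypothetical placeholder for the source-creating methods on
StreamExecutionEnvironment. It only shows why the returned value can be
held as either type, and why the setters are visible on just one of them.

```java
import java.util.List;

// Simplified stand-ins mirroring the inheritance chain described above.
// These are NOT Flink's real classes; they only illustrate the typing.
public class StreamTypesSketch {

    // Base type: what most application code declares and works with.
    static class DataStream<T> {
        protected final List<T> elements;
        protected int parallelism = 1;

        DataStream(List<T> elements) { this.elements = elements; }

        int count() { return elements.size(); }
        int getParallelism() { return parallelism; }
    }

    // Adds configuration setters on top of DataStream.
    static class SingleOutputStreamOperator<T> extends DataStream<T> {
        SingleOutputStreamOperator(List<T> elements) { super(elements); }

        SingleOutputStreamOperator<T> setParallelism(int p) {
            this.parallelism = p;
            return this;
        }
    }

    // The concrete type handed back by the factory method below.
    static class DataStreamSource<T> extends SingleOutputStreamOperator<T> {
        DataStreamSource(List<T> elements) { super(elements); }

        @Override
        DataStreamSource<T> setParallelism(int p) {
            super.setParallelism(p);
            return this;
        }
    }

    // Hypothetical factory standing in for the environment's
    // source-creating methods; it returns the most specific type.
    @SafeVarargs
    static <T> DataStreamSource<T> fromElements(T... values) {
        return new DataStreamSource<>(List.of(values));
    }

    public static void main(String[] args) {
        // Common case: hold the result as the base type and move on.
        DataStream<String> stream = fromElements("a", "b", "c");

        // Only when the source itself needs configuring: keep the
        // specific type so the setter is visible to the compiler.
        DataStreamSource<String> source =
                fromElements("a", "b", "c").setParallelism(2);

        System.out.println(stream.count() + " " + source.getParallelism());
    }
}
```

In the sketch, calling stream.setParallelism(2) on the DataStream-typed
variable would not compile, which is the practical difference between the
two declarations.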

Best regards,
Jing

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/overview/
