Posted to user@flink.apache.org by Georgi Stoyanov <gs...@live.com> on 2018/05/09 09:55:35 UTC

Using two different configurations for StreamExecutionEnvironment

Hi, folks


We have a util that creates a StreamExecutionEnvironment for us with some checkpoint configurations. The settings for externalized checkpoints and the state backend make the job fail when it is run locally from IntelliJ. As a workaround we currently comment out those two configurations, but I don’t like that.

So far I have found that I can just use the ‘createLocalEnvironment’ method, but I don’t yet have many ideas on how to place it nicely in the existing logic (in a way that doesn’t look like a hack).
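For illustration, one shape such a util can take is to guard the cluster-only settings behind a flag. This is a sketch only: the class name, the flag, and the HDFS path are hypothetical, and a Flink 1.4-era API is assumed.

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical util: cluster-only settings are skipped for local IDE runs.
public final class EnvironmentFactory {

    public static StreamExecutionEnvironment create(boolean localRun) throws Exception {
        StreamExecutionEnvironment env = localRun
                ? StreamExecutionEnvironment.createLocalEnvironment()
                : StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing itself works locally as well.
        env.enableCheckpointing(60_000);

        if (!localRun) {
            // These two settings are the ones that fail in the IDE.
            env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints")); // placeholder path
            env.getCheckpointConfig().enableExternalizedCheckpoints(
                    CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
        }
        return env;
    }
}
```

The job’s main method then only decides which flavor it wants and never touches the checkpoint details directly.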

If you have some ideas or know-how from your projects, please share them 😊

Kind Regards,
Georgi

Re: Using two different configurations for StreamExecutionEnvironment

Posted by Chesnay Schepler <ch...@apache.org>.
Ah my bad, the StreamingMultipleProgramsTestBase doesn't allow setting 
the configuration...

I went ahead and wrote you a utility class that your test class should 
extend. The configuration for the cluster is passed through the constructor.

public class MyMultipleProgramsTestBase extends AbstractTestBase {

    private LocalFlinkMiniCluster cluster;

    public MyMultipleProgramsTestBase(Configuration config) {
        super(config);
    }

    @Before
    public void setup() throws Exception {
        // Start a mini cluster with the configuration passed to the constructor
        // and register it as the context for getExecutionEnvironment().
        cluster = TestBaseUtils.startCluster(config, true);
        int numTm = config.getInteger(
            ConfigConstants.LOCAL_NUMBER_TASK_MANAGER,
            ConfigConstants.DEFAULT_LOCAL_NUMBER_TASK_MANAGER);
        int numSlots = config.getInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 1);
        TestStreamEnvironment.setAsContext(cluster, numTm * numSlots);
    }

    @After
    public void teardown() throws Exception {
        // Unregister the context and shut the cluster down.
        TestStreamEnvironment.unsetAsContext();
        stopCluster(cluster, TestBaseUtils.DEFAULT_TIMEOUT);
    }
}
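Usage might then look like the sketch below. The test class name, the pipeline in the test method, and the checkpoint path are hypothetical; Flink 1.4-era APIs are assumed.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.junit.Test;

// Hypothetical test class: the cluster configuration, including the
// checkpoint directory, is supplied to the base class up front.
public class MyJobITCase extends MyMultipleProgramsTestBase {

    public MyJobITCase() {
        super(createConfig());
    }

    private static Configuration createConfig() {
        Configuration config = new Configuration();
        // Placeholder directory; on a real cluster this would be an HDFS path.
        config.setString("state.checkpoints.dir", "file:///tmp/flink-checkpoints");
        return config;
    }

    @Test
    public void runsAgainstConfiguredCluster() throws Exception {
        // Stands in for calling your job's main method; getExecutionEnvironment()
        // picks up the test cluster registered in setup().
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3).print();
        env.execute();
    }
}
```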




RE: Using two different configurations for StreamExecutionEnvironment

Posted by Georgi Stoyanov <gs...@live.com>.
Sorry for the lame question – but how can I achieve that?

I’m on 1.4; I created a new class that extends StreamingMultipleProgramsTestBase, called super() and then called the main method.

The thing is that in my util class I create everything related to checkpoints (including setting the HDFS folder and the other things described below), and I still can’t run it because of “CheckpointConfig says to persist periodic checkpoints, but no checkpoint directory has been configured. You can configure configure one via key 'state.checkpoints.dir'”.



Regards

Georgi




Re: Using two different configurations for StreamExecutionEnvironment

Posted by Chesnay Schepler <ch...@apache.org>.
The jobs are still executed in a local JVM alongside the cluster, so
local debugging is still possible.



RE: Using two different configurations for StreamExecutionEnvironment

Posted by Georgi Stoyanov <gs...@live.com>.
Hi Chesnay

Thanks for the suggestion, but this doesn’t sound like a good option, since I prefer local to remote debugging.
My question sounds like a really common thing, and the people behind Flink should have thought about it.
Of course, I’m really new to the field of stream processing, and maybe I don’t completely understand the processes here.





Re: Using two different configurations for StreamExecutionEnvironment

Posted by Chesnay Schepler <ch...@apache.org>.
I suggest running jobs in the IDE only as tests. This allows you to use
various Flink utilities to set up a cluster with your desired
configuration, while keeping these details out of the actual job code.

If you are using 1.5-SNAPSHOT, have a look at the MiniClusterResource
class. The class can be used as a JUnit Rule and takes care of
everything needed to run any job against the cluster it sets up internally.

For previous versions, have your test class extend
StreamingMultipleProgramsTestBase.

The test method would simply call the main method of your job.
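For reference, a sketch of that Rule-based pattern. The 1.5-SNAPSHOT API was still in flux at the time; the builder shown here matches the later stable class name, MiniClusterWithClientResource, and the test pipeline and checkpoint path are placeholders.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.test.util.MiniClusterWithClientResource;
import org.junit.ClassRule;
import org.junit.Test;

// Sketch only: a mini cluster is started once per test class; jobs submitted
// via the regular execution environment run against it.
public class MyJobITCase {

    @ClassRule
    public static final MiniClusterWithClientResource CLUSTER =
            new MiniClusterWithClientResource(
                    new MiniClusterResourceConfiguration.Builder()
                            .setConfiguration(clusterConfig())
                            .setNumberTaskManagers(1)
                            .setNumberSlotsPerTaskManager(2)
                            .build());

    private static Configuration clusterConfig() {
        Configuration config = new Configuration();
        // Placeholder directory; on a real cluster this would be an HDFS path.
        config.setString("state.checkpoints.dir", "file:///tmp/flink-checkpoints");
        return config;
    }

    @Test
    public void runsAgainstMiniCluster() throws Exception {
        // Stands in for calling your job's main method.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3).print();
        env.execute();
    }
}
```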
