Posted to user@flink.apache.org by Bart Kastermans <fl...@kasterma.net> on 2017/08/24 12:12:57 UTC

Database connection from job

I am using the Scala API for Flink, and am trying to set up a JDBC
database connection in my job (on every incoming event I want to query
the database to get some data to enrich the event).  Because the code
is serialized and deserialized as it is sent from the Flink master to
the Flink workers, I cannot just open the connection in my main method.
Can someone give me a pointer to the lifecycle methods that are called
by the worker to do local initialization of the job?  I have not yet
been able to find any references or examples of this in the
documentation.

Thanks!

Best,
Bart

Re: Database connection from job

Posted by Aljoscha Krettek <al...@apache.org>.
Hi Bart,

I think you might be interested in the (admittedly short) section of the doc about RichFunctions: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/api_concepts.html#rich-functions

If you make your user function a RichFunction, you can implement the lifecycle methods open() and close(), which allow you to set up, for example, a database connection that you want to reuse for the lifetime of your user function.
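
For example, a minimal sketch in Scala of such a RichMapFunction (the event types, table, query, and JDBC URL below are just illustrative placeholders, assuming the JDBC driver is on the classpath):

import java.sql.{Connection, DriverManager, PreparedStatement, ResultSet}
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration

// Hypothetical event types used only for this example.
case class Event(userId: String, payload: String)
case class EnrichedEvent(event: Event, userName: String)

class EnrichingMapper extends RichMapFunction[Event, EnrichedEvent] {

  // Set up in open() on the worker, so nothing here has to be serializable.
  @transient private var connection: Connection = _
  @transient private var statement: PreparedStatement = _

  override def open(parameters: Configuration): Unit = {
    // Called once per parallel task instance before any element is processed.
    connection = DriverManager.getConnection(
      "jdbc:postgresql://db-host:5432/mydb", "user", "password")
    statement = connection.prepareStatement(
      "SELECT name FROM users WHERE id = ?")
  }

  override def map(event: Event): EnrichedEvent = {
    // Runs on the worker for every incoming event.
    statement.setString(1, event.userId)
    val rs: ResultSet = statement.executeQuery()
    val name = if (rs.next()) rs.getString("name") else "unknown"
    rs.close()
    EnrichedEvent(event, name)
  }

  override def close(): Unit = {
    // Called when the task shuts down; release external resources here.
    if (statement != null) statement.close()
    if (connection != null) connection.close()
  }
}

You would then use it as, e.g., stream.map(new EnrichingMapper()). Because open() runs on the worker, the connection itself is never part of the serialized job graph.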

Best,
Aljoscha

> On 24. Aug 2017, at 17:42, Stefan Richter <s....@data-artisans.com> wrote:
> 
> Hi,
> 
> the lifecycle is described here: https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/task_lifecycle.html
> 
> Best,
> Stefan
> 
>> On 24.08.2017 at 14:12, Bart Kastermans <flink@kasterma.net> wrote:
>> 
>> I am using the Scala API for Flink, and am trying to set up a JDBC
>> database connection in my job (on every incoming event I want to query
>> the database to get some data to enrich the event).  Because the code
>> is serialized and deserialized as it is sent from the Flink master to
>> the Flink workers, I cannot just open the connection in my main method.
>> Can someone give me a pointer to the lifecycle methods that are called
>> by the worker to do local initialization of the job?  I have not yet
>> been able to find any references or examples of this in the
>> documentation.
>> 
>> Thanks!
>> 
>> Best,
>> Bart
> 


Re: Database connection from job

Posted by Stefan Richter <s....@data-artisans.com>.
Hi,

the lifecycle is described here: https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/task_lifecycle.html

Best,
Stefan

> On 24.08.2017 at 14:12, Bart Kastermans <fl...@kasterma.net> wrote:
> 
> I am using the Scala API for Flink, and am trying to set up a JDBC
> database connection in my job (on every incoming event I want to query
> the database to get some data to enrich the event).  Because the code
> is serialized and deserialized as it is sent from the Flink master to
> the Flink workers, I cannot just open the connection in my main method.
> Can someone give me a pointer to the lifecycle methods that are called
> by the worker to do local initialization of the job?  I have not yet
> been able to find any references or examples of this in the
> documentation.
> 
> Thanks!
> 
> Best,
> Bart