Posted to user@flink.apache.org by Maximilian Bode <ma...@tngtech.com> on 2016/01/19 17:35:57 UTC

JDBCInputFormat GC overhead limit exceeded error

Hi everyone,

I am facing a problem using the JDBCInputFormat which occurred in a larger Flink job. As a minimal example I can reproduce it when just writing data into a csv after having read it from a database, i.e.

DataSet<Tuple1<String>> existingData = env.createInput(
	JDBCInputFormat.buildJDBCInputFormat()
		.setDrivername("oracle.jdbc.driver.OracleDriver")
		.setUsername(…)
		.setPassword(…)
		.setDBUrl(…)
		.setQuery("select DATA from TABLENAME")
		.finish(),
	new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
existingData.writeAsCsv(…);

where DATA is a column containing strings of length ~25 characters and TABLENAME contains 20 million rows.

After starting the job on a YARN cluster (using -tm 3072 and leaving the other memory settings at default values), Flink happily goes along at first but then fails after something like three million records have been sent by the JDBCInputFormat. The Exception reads "The slot in which the task was executed has been released. Probably loss of TaskManager …". The local taskmanager.log in the affected container reads
"java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)"
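
For reference, the YARN session was brought up roughly along these lines (only a sketch; the -tm value is the one mentioned above, the container count is just an example):

# -n = number of TaskManager containers (placeholder), -tm = memory per TaskManager in MB
./bin/yarn-session.sh -n 4 -tm 3072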

Any ideas what is going wrong here?

Cheers,
Max

— 
Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082


Re: JDBCInputFormat GC overhead limit exceeded error

Posted by Robert Metzger <rm...@apache.org>.
@Max: If you're too busy right now to open a PR for this, I can also do it
... just let me know.

On Wed, Jan 20, 2016 at 7:28 PM, Stephan Ewen <se...@apache.org> wrote:

> Super, thanks for finding this. Makes a lot of sense to have a result set
> that does not hold onto data.
>
> Would be great if you could open a pull request with this fix, as other
> users will benefit from that as well!
>
> Cheers,
> Stephan
>
>
> On Wed, Jan 20, 2016 at 6:03 PM, Maximilian Bode <
> maximilian.bode@tngtech.com> wrote:
>
>> Hi Stephan,
>>
>> thanks for your fast answer. Just setting the Flink-managed memory to a
>> low value would not have worked for us, as we need joins etc. in the same
>> job.
>>
>> After investigating the JDBCInputFormat, we found the line
>> statement = dbConn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
>> ResultSet.CONCUR_READ_ONLY);
>> to be the culprit; to be more exact, the scrollable result set. When
>> replaced with TYPE_FORWARD_ONLY, some changes have to be made to
>> nextRecord() and reachedEnd(), but this does the job without memory leak.
>>
>> Another change that might be useful (as far as performance is concerned)
>> is disabling autocommits and letting users decide the fetchSize (somewhat
>> in parallel to batchInterval in JDBCOutputFormat).
>>
>> Cheers,
>> Max
>> —
>> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com * 0176
>> 1000 75 50
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>
>> On 19.01.2016 at 21:26, Stephan Ewen <se...@apache.org> wrote:
>>
>> Hi!
>>
>> This kind of error (GC overhead exceeded) usually means that the system
>> is reaching a state where it has very many still living objects and frees
>> little memory during each collection. As a consequence, it is basically
>> busy with only garbage collection.
>>
>> Your job probably has about 500-600 MB of free memory; at that memory
>> size, the rest is reserved for JVM overhead and Flink's worker memory.
>> Now, since your job actually does not keep any objects or rows around,
>> this should be plenty. I can only suspect that the Oracle JDBC driver is
>> very memory hungry, thus pushing the system to the limit (see, for
>> example, the resources listed below).
>>
>> What you can do:
>>  For this kind of job, you can simply tell Flink to not reserve as much
>> memory, by using the option "taskmanager.memory.size=1". If the JDBC driver
>> has no leak, but is simply super hungry, this should solve it.
>>
>> Greetings,
>> Stephan
>>
>>
>> I also found these resources concerning Oracle JDBC memory:
>>
>>  -
>> http://stackoverflow.com/questions/2876895/oracle-t4cpreparedstatement-memory-leaks
>> (bottom answer)
>>  - https://community.oracle.com/thread/2220078?tstart=0
>>
>>
>> On Tue, Jan 19, 2016 at 5:44 PM, Maximilian Bode <
>> maximilian.bode@tngtech.com> wrote:
>>
>>> Hi Robert,
>>>
>>> I am using 0.10.1.
>>>
>>>
>>> On 19.01.2016 at 17:42, Robert Metzger <rm...@apache.org> wrote:
>>>
>>> Hi Max,
>>>
>>> which version of Flink are you using?
>>>
>>> On Tue, Jan 19, 2016 at 5:35 PM, Maximilian Bode <
>>> maximilian.bode@tngtech.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I am facing a problem using the JDBCInputFormat which occurred in a
>>>> larger Flink job. As a minimal example I can reproduce it when just writing
>>>> data into a csv after having read it from a database, i.e.
>>>>
>>>> DataSet<Tuple1<String>> existingData = env.createInput(
>>>> JDBCInputFormat.buildJDBCInputFormat()
>>>> .setDrivername("oracle.jdbc.driver.OracleDriver")
>>>> .setUsername(…)
>>>> .setPassword(…)
>>>> .setDBUrl(…)
>>>> .setQuery("select DATA from TABLENAME")
>>>> .finish(),
>>>> new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
>>>> existingData.writeAsCsv(…);
>>>>
>>>> where DATA is a column containing strings of length ~25 characters and
>>>> TABLENAME contains 20 million rows.
>>>>
>>>> After starting the job on a YARN cluster (using -tm 3072 and leaving
>>>> the other memory settings at default values), Flink happily goes along at
>>>> first but then fails after something like three million records have been
>>>> sent by the JDBCInputFormat. The Exception reads "The slot in which the
>>>> task was executed has been released. Probably loss of TaskManager …". The
>>>> local taskmanager.log in the affected container reads
>>>> "java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>>         at
>>>> java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
>>>>
>>>> at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
>>>>         at
>>>> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
>>>>         at
>>>> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>>>>         at
>>>> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>>>>         at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>         at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>         at java.lang.Thread.run(Thread.java:744)"
>>>>
>>>> Any ideas what is going wrong here?
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>> —
>>>> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
>>>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>>>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>>>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>>>
>>>>
>>>
>>>
>>
>>
>

Re: JDBCInputFormat GC overhead limit exceeded error

Posted by Stephan Ewen <se...@apache.org>.
Super, thanks for finding this. Makes a lot of sense to have a result set
that does not hold onto data.

Would be great if you could open a pull request with this fix, as other
users will benefit from that as well!

Cheers,
Stephan


On Wed, Jan 20, 2016 at 6:03 PM, Maximilian Bode <
maximilian.bode@tngtech.com> wrote:

> Hi Stephan,
>
> thanks for your fast answer. Just setting the Flink-managed memory to a
> low value would not have worked for us, as we need joins etc. in the same
> job.
>
> After investigating the JDBCInputFormat, we found the line
> statement = dbConn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
> ResultSet.CONCUR_READ_ONLY);
> to be the culprit; to be more exact, the scrollable result set. When
> replaced with TYPE_FORWARD_ONLY, some changes have to be made to
> nextRecord() and reachedEnd(), but this does the job without memory leak.
>
> Another change that might be useful (as far as performance is concerned)
> is disabling autocommits and letting users decide the fetchSize (somewhat
> in parallel to batchInterval in JDBCOutputFormat).
>
> Cheers,
> Max
> —
> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com * 0176
> 1000 75 50
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>
> On 19.01.2016 at 21:26, Stephan Ewen <se...@apache.org> wrote:
>
> Hi!
>
> This kind of error (GC overhead exceeded) usually means that the system is
> reaching a state where it has very many still living objects and frees
> little memory during each collection. As a consequence, it is basically
> busy with only garbage collection.
>
> Your job probably has about 500-600 MB of free memory; at that memory size,
> the rest is reserved for JVM overhead and Flink's worker memory.
> Now, since your job actually does not keep any objects or rows around,
> this should be plenty. I can only suspect that the Oracle JDBC driver is
> very memory hungry, thus pushing the system to the limit (see, for example,
> the resources listed below).
>
> What you can do:
>  For this kind of job, you can simply tell Flink to not reserve as much
> memory, by using the option "taskmanager.memory.size=1". If the JDBC driver
> has no leak, but is simply super hungry, this should solve it.
>
> Greetings,
> Stephan
>
>
> I also found these resources concerning Oracle JDBC memory:
>
>  -
> http://stackoverflow.com/questions/2876895/oracle-t4cpreparedstatement-memory-leaks
> (bottom answer)
>  - https://community.oracle.com/thread/2220078?tstart=0
>
>
> On Tue, Jan 19, 2016 at 5:44 PM, Maximilian Bode <
> maximilian.bode@tngtech.com> wrote:
>
>> Hi Robert,
>>
>> I am using 0.10.1.
>>
>>
>> On 19.01.2016 at 17:42, Robert Metzger <rm...@apache.org> wrote:
>>
>> Hi Max,
>>
>> which version of Flink are you using?
>>
>> On Tue, Jan 19, 2016 at 5:35 PM, Maximilian Bode <
>> maximilian.bode@tngtech.com> wrote:
>>
>>> Hi everyone,
>>>
>>> I am facing a problem using the JDBCInputFormat which occurred in a
>>> larger Flink job. As a minimal example I can reproduce it when just writing
>>> data into a csv after having read it from a database, i.e.
>>>
>>> DataSet<Tuple1<String>> existingData = env.createInput(
>>> JDBCInputFormat.buildJDBCInputFormat()
>>> .setDrivername("oracle.jdbc.driver.OracleDriver")
>>> .setUsername(…)
>>> .setPassword(…)
>>> .setDBUrl(…)
>>> .setQuery("select DATA from TABLENAME")
>>> .finish(),
>>> new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
>>> existingData.writeAsCsv(…);
>>>
>>> where DATA is a column containing strings of length ~25 characters and
>>> TABLENAME contains 20 million rows.
>>>
>>> After starting the job on a YARN cluster (using -tm 3072 and leaving the
>>> other memory settings at default values), Flink happily goes along at first
>>> but then fails after something like three million records have been sent by
>>> the JDBCInputFormat. The Exception reads "The slot in which the task was
>>> executed has been released. Probably loss of TaskManager …". The local
>>> taskmanager.log in the affected container reads
>>> "java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>         at
>>> java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
>>>
>>> at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:744)"
>>>
>>> Any ideas what is going wrong here?
>>>
>>> Cheers,
>>> Max
>>>
>>> —
>>> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
>>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>>
>>>
>>
>>
>
>

Re: JDBCInputFormat GC overhead limit exceeded error

Posted by Maximilian Bode <ma...@tngtech.com>.
Hi Stephan,

thanks for your fast answer. Just setting the Flink-managed memory to a low value would not have worked for us, as we need joins etc. in the same job.

After investigating the JDBCInputFormat, we found the line 
statement = dbConn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
to be the culprit; to be more exact, the scrollable result set. When it is replaced with TYPE_FORWARD_ONLY, some changes have to be made to nextRecord() and reachedEnd(), but this does the job without a memory leak.

Another change that might be useful (as far as performance is concerned) is disabling autocommits and letting users decide the fetchSize (somewhat in parallel to batchInterval in JDBCOutputFormat).
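
To illustrate the general pattern (only a sketch, not the actual patch; the class name, connection details and the fetch size below are placeholders), a plain-JDBC read with a forward-only, read-only result set and an explicit fetch size looks roughly like this:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ForwardOnlyReadSketch {
	public static void main(String[] args) throws Exception {
		// Connection URL and credentials are placeholders.
		try (Connection conn = DriverManager.getConnection(
				"jdbc:oracle:thin:@//dbhost:1521/service", "user", "password")) {
			// Disable autocommit, as suggested above.
			conn.setAutoCommit(false);
			try (Statement stmt = conn.createStatement(
					ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
				// Fetch rows in batches instead of materializing the whole result;
				// 1000 is an arbitrary example value.
				stmt.setFetchSize(1000);
				try (ResultSet rs = stmt.executeQuery("select DATA from TABLENAME")) {
					while (rs.next()) {
						String data = rs.getString(1);
						// hand 'data' on (e.g. emit it as a Tuple1<String>);
						// nothing is retained between iterations
					}
				}
			}
		}
	}
}

Inside the JDBCInputFormat, reachedEnd() can then no longer rely on cursor scrolling, so the end-of-input check has to be driven by the return value of resultSet.next() instead.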

Cheers,
Max
— 
Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com * 0176 1000 75 50
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082

> On 19.01.2016 at 21:26, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> This kind of error (GC overhead exceeded) usually means that the system is reaching a state where it has very many still living objects and frees little memory during each collection. As a consequence, it is basically busy with only garbage collection.
> 
> Your job probably has about 500-600 MB of free memory; at that memory size, the rest is reserved for JVM overhead and Flink's worker memory.
> Now, since your job actually does not keep any objects or rows around, this should be plenty. I can only suspect that the Oracle JDBC driver is very memory hungry, thus pushing the system to the limit (see, for example, the resources listed below).
> 
> What you can do:
>  For this kind of job, you can simply tell Flink to not reserve as much memory, by using the option "taskmanager.memory.size=1". If the JDBC driver has no leak, but is simply super hungry, this should solve it.
> 
> Greetings,
> Stephan
> 
> 
> I also found these resources concerning Oracle JDBC memory:
> 
>  - http://stackoverflow.com/questions/2876895/oracle-t4cpreparedstatement-memory-leaks (bottom answer)
>  - https://community.oracle.com/thread/2220078?tstart=0
> 
> 
> On Tue, Jan 19, 2016 at 5:44 PM, Maximilian Bode <maximilian.bode@tngtech.com> wrote:
> Hi Robert,
> 
> I am using 0.10.1.
> 
> 
>> On 19.01.2016 at 17:42, Robert Metzger <rmetzger@apache.org> wrote:
>> 
>> Hi Max,
>> 
>> which version of Flink are you using?
>> 
>> On Tue, Jan 19, 2016 at 5:35 PM, Maximilian Bode <maximilian.bode@tngtech.com> wrote:
>> Hi everyone,
>> 
>> I am facing a problem using the JDBCInputFormat which occurred in a larger Flink job. As a minimal example I can reproduce it when just writing data into a csv after having read it from a database, i.e.
>> 
>> DataSet<Tuple1<String>> existingData = env.createInput(
>> 	JDBCInputFormat.buildJDBCInputFormat()
>> 		.setDrivername("oracle.jdbc.driver.OracleDriver")
>> 		.setUsername(…)
>> 		.setPassword(…)
>> 		.setDBUrl(…)
>> 		.setQuery("select DATA from TABLENAME")
>> 		.finish(),
>> 	new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
>> existingData.writeAsCsv(…);
>> 
>> where DATA is a column containing strings of length ~25 characters and TABLENAME contains 20 million rows.
>> 
>> After starting the job on a YARN cluster (using -tm 3072 and leaving the other memory settings at default values), Flink happily goes along at first but then fails after something like three million records have been sent by the JDBCInputFormat. The Exception reads "The slot in which the task was executed has been released. Probably loss of TaskManager …". The local taskmanager.log in the affected container reads
>> "java.lang.OutOfMemoryError: GC overhead limit exceeded
>>         at java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
>>         at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
>>         at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
>>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>>         at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:744)"
>> 
>> Any ideas what is going wrong here?
>> 
>> Cheers,
>> Max
>> 
>> — 
>> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>> 
>> 
> 
> 


Re: JDBCInputFormat GC overhead limit exceeded error

Posted by Stephan Ewen <se...@apache.org>.
Hi!

This kind of error (GC overhead exceeded) usually means that the system is
reaching a state where it has very many still living objects and frees
little memory during each collection. As a consequence, it is basically
busy with only garbage collection.

Your job probably has about 500-600 MB of free memory; at that memory size,
the rest is reserved for JVM overhead and Flink's worker memory.
Now, since your job actually does not keep any objects or rows around, this
should be plenty. I can only suspect that the Oracle JDBC driver is very
memory hungry, thus pushing the system to the limit (see, for example, the
resources listed below).

What you can do:
 For this kind of job, you can simply tell Flink to not reserve as much
memory, by using the option "taskmanager.memory.size=1". If the JDBC driver
has no leak, but is simply super hungry, this should solve it.
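
In flink-conf.yaml that would look like the following (a sketch; the value is interpreted in megabytes, so 1 leaves Flink's managed memory at a minimum and frees the rest of the heap for user code and the JDBC driver):

# flink-conf.yaml (sketch)
taskmanager.memory.size: 1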

Greetings,
Stephan


I also found these resources concerning Oracle JDBC memory:

 -
http://stackoverflow.com/questions/2876895/oracle-t4cpreparedstatement-memory-leaks
(bottom answer)
 - https://community.oracle.com/thread/2220078?tstart=0


On Tue, Jan 19, 2016 at 5:44 PM, Maximilian Bode <
maximilian.bode@tngtech.com> wrote:

> Hi Robert,
>
> I am using 0.10.1.
>
>
> On 19.01.2016 at 17:42, Robert Metzger <rm...@apache.org> wrote:
>
> Hi Max,
>
> which version of Flink are you using?
>
> On Tue, Jan 19, 2016 at 5:35 PM, Maximilian Bode <
> maximilian.bode@tngtech.com> wrote:
>
>> Hi everyone,
>>
>> I am facing a problem using the JDBCInputFormat which occurred in a
>> larger Flink job. As a minimal example I can reproduce it when just writing
>> data into a csv after having read it from a database, i.e.
>>
>> DataSet<Tuple1<String>> existingData = env.createInput(
>> JDBCInputFormat.buildJDBCInputFormat()
>> .setDrivername("oracle.jdbc.driver.OracleDriver")
>> .setUsername(…)
>> .setPassword(…)
>> .setDBUrl(…)
>> .setQuery("select DATA from TABLENAME")
>> .finish(),
>> new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
>> existingData.writeAsCsv(…);
>>
>> where DATA is a column containing strings of length ~25 characters and
>> TABLENAME contains 20 million rows.
>>
>> After starting the job on a YARN cluster (using -tm 3072 and leaving the
>> other memory settings at default values), Flink happily goes along at first
>> but then fails after something like three million records have been sent by
>> the JDBCInputFormat. The Exception reads "The slot in which the task was
>> executed has been released. Probably loss of TaskManager …". The local
>> taskmanager.log in the affected container reads
>> "java.lang.OutOfMemoryError: GC overhead limit exceeded
>>         at
>> java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
>>
>> at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
>>         at
>> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
>>         at
>> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>>         at
>> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:744)"
>>
>> Any ideas what is going wrong here?
>>
>> Cheers,
>> Max
>>
>> —
>> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>
>>
>
>

Re: JDBCInputFormat GC overhead limit exceeded error

Posted by Maximilian Bode <ma...@tngtech.com>.
Hi Robert,

I am using 0.10.1.

> On 19.01.2016 at 17:42, Robert Metzger <rm...@apache.org> wrote:
> 
> Hi Max,
> 
> which version of Flink are you using?
> 
> On Tue, Jan 19, 2016 at 5:35 PM, Maximilian Bode <maximilian.bode@tngtech.com> wrote:
> Hi everyone,
> 
> I am facing a problem using the JDBCInputFormat which occurred in a larger Flink job. As a minimal example I can reproduce it when just writing data into a csv after having read it from a database, i.e.
> 
> DataSet<Tuple1<String>> existingData = env.createInput(
> 	JDBCInputFormat.buildJDBCInputFormat()
> 		.setDrivername("oracle.jdbc.driver.OracleDriver")
> 		.setUsername(…)
> 		.setPassword(…)
> 		.setDBUrl(…)
> 		.setQuery("select DATA from TABLENAME")
> 		.finish(),
> 	new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
> existingData.writeAsCsv(…);
> 
> where DATA is a column containing strings of length ~25 characters and TABLENAME contains 20 million rows.
> 
> After starting the job on a YARN cluster (using -tm 3072 and leaving the other memory settings at default values), Flink happily goes along at first but then fails after something like three million records have been sent by the JDBCInputFormat. The Exception reads "The slot in which the task was executed has been released. Probably loss of TaskManager …". The local taskmanager.log in the affected container reads
> "java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
>         at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
>         at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>         at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)"
> 
> Any ideas what is going wrong here?
> 
> Cheers,
> Max
> 
> —
> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> Sitz: Unterföhring * Amtsgericht München * HRB 135082
> 
> 


Re: JDBCInputFormat GC overhead limit exceeded error

Posted by Robert Metzger <rm...@apache.org>.
Hi Max,

which version of Flink are you using?

On Tue, Jan 19, 2016 at 5:35 PM, Maximilian Bode <
maximilian.bode@tngtech.com> wrote:

> Hi everyone,
>
> I am facing a problem using the JDBCInputFormat which occurred in a larger
> Flink job. As a minimal example I can reproduce it when just writing data
> into a csv after having read it from a database, i.e.
>
> DataSet<Tuple1<String>> existingData = env.createInput(
> JDBCInputFormat.buildJDBCInputFormat()
> .setDrivername("oracle.jdbc.driver.OracleDriver")
> .setUsername(…)
> .setPassword(…)
> .setDBUrl(…)
> .setQuery("select DATA from TABLENAME")
> .finish(),
> new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
> existingData.writeAsCsv(…);
>
> where DATA is a column containing strings of length ~25 characters and
> TABLENAME contains 20 million rows.
>
> After starting the job on a YARN cluster (using -tm 3072 and leaving the
> other memory settings at default values), Flink happily goes along at first
> but then fails after something like three million records have been sent by
> the JDBCInputFormat. The Exception reads "The slot in which the task was
> executed has been released. Probably loss of TaskManager …". The local
> taskmanager.log in the affected container reads
> "java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at
> java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
>
> at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
>         at
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)"
>
> Any ideas what is going wrong here?
>
> Cheers,
> Max
>
> —
> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>
>