Posted to user@flink.apache.org by Puneet Duggal <pu...@gmail.com> on 2021/09/13 11:49:37 UTC

JVM Metaspace capacity planning

Hi,

After going through several resources, my basic understanding is that the JVM Metaspace is used by the Flink class loader to hold class metadata, which is then used to create objects on the heap. I also understood this to be a one-time activity, since all objects of a given class need only a single class metadata object in the Metaspace.

But while deploying multiple jobs on a TaskManager, I saw an almost linear increase in Metaspace consumption (irrespective of parallelism), even when those jobs have exactly the same implementation. So I wanted to confirm whether each job in Flink has its own class loader, which loads the required classes into the TaskManager's JVM Metaspace.

PS: Any documentation for this will be of great help.

Thanks,
Puneet
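
Editor's sketch: the linear growth described above can be checked from inside a JVM with the standard java.lang.management API. The class name MetaspaceProbe is hypothetical, and the pool name "Metaspace" is the one HotSpot-based JVMs report; other JVMs may name it differently, in which case the probe returns -1. It measures the JVM it runs in, so it would have to run inside the TaskManager process to say anything about that process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Flink.
public class MetaspaceProbe {

    /** Used Metaspace in bytes, or -1 if this JVM exposes no pool named "Metaspace". */
    public static long usedMetaspaceBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    /** Number of classes currently loaded in this JVM, across all class loaders. */
    public static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Metaspace used (bytes): " + usedMetaspaceBytes());
        System.out.println("Loaded classes:         " + loadedClassCount());
    }
}
```

Sampling both numbers before and after each job submission would show whether the loaded class count grows by roughly the same amount per job.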

Re: JVM Metaspace capacity planning

Posted by Puneet Duggal <pu...@gmail.com>.

Thank you guys, this documentation lists exactly the issues that I am facing.

> On 14-Sep-2021, at 2:14 PM, Guowei Ma <gu...@gmail.com> wrote:
> 
> Hi, Puneet
> In general, every job has its own class loader. You can find more detailed information in the docs [1].
> You can put common JARs into the "/lib" folder so that they are loaded once rather than per job [2].
> 
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/debugging/debugging_classloading/
> [2] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code
> 
> Best,
> Guowei

Re: JVM Metaspace capacity planning

Posted by Guowei Ma <gu...@gmail.com>.

Hi, Puneet
In general, every job has its own class loader. You can find more detailed
information in the docs [1].
You can put common JARs into the "/lib" folder so that they are loaded once
rather than per job [2].

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/debugging/debugging_classloading/
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code

Best,
Guowei
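
Editor's sketch: the effect of one class loader per job can be reproduced with plain JDK code. The names below (ClassLoaderIsolationDemo, UserFunction, JobClassLoader, loadForJob) are hypothetical, and JobClassLoader is a simplified stand-in, not Flink's actual user-code class loader. It assumes the compiled .class files are reachable as resources on the classpath, as they are after a normal javac build.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderIsolationDemo {

    /** Stand-in for a class shipped inside a job JAR. */
    public static class UserFunction {}

    /** Defines the target class itself instead of delegating it to the parent loader. */
    static class JobClassLoader extends ClassLoader {
        private final String target;

        JobClassLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve); // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readClassBytes(name);
                    loaded = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads the named class through a fresh loader, as if a new job had been submitted. */
    public static Class<?> loadForJob(String className) throws ClassNotFoundException {
        ClassLoader parent = ClassLoaderIsolationDemo.class.getClassLoader();
        return new JobClassLoader(className, parent).loadClass(className);
    }

    public static void main(String[] args) throws Exception {
        String name = UserFunction.class.getName();
        Class<?> jobA = loadForJob(name);
        Class<?> jobB = loadForJob(name);
        System.out.println("same name:  " + jobA.getName().equals(jobB.getName()));
        System.out.println("same class: " + (jobA == jobB));
    }
}
```

Compiled with javac and run, the two Class objects should share a name but not be identical, because the JVM identifies a class by its name together with its defining class loader. Each of them carries its own metadata in Metaspace, which is consistent with identical jobs still adding to Metaspace usage. Classes that come from a shared parent loader, such as JARs placed in "/lib", are defined only once.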


On Mon, Sep 13, 2021 at 10:06 PM Puneet Duggal <pu...@gmail.com>
wrote:

> Hi,
>
> Thank you for the quick reply. In my case I am using the DataStream API.
> Each job is a real-time processing engine that consumes data from Kafka and
> processes it before writing to a sink.
>
> The JVM Metaspace size was previously set to around 256 MB (the default),
> which I had to increase to 3 GB so that ~30 jobs can run simultaneously on
> a single TaskManager.
>
> Regards,
> Puneet

Re: JVM Metaspace capacity planning

Posted by Puneet Duggal <pu...@gmail.com>.

Hi,

Thank you for the quick reply. In my case I am using the DataStream API. Each job is a real-time processing engine that consumes data from Kafka and processes it before writing to a sink.

The JVM Metaspace size was previously set to around 256 MB (the default), which I had to increase to 3 GB so that ~30 jobs can run simultaneously on a single TaskManager.

Regards,
Puneet
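
Editor's sketch: the figures in this message give a rough per-job bound, assuming Metaspace grows linearly with the number of jobs as reported in the original question. The class and method names are hypothetical, and the 100 MiB baseline in main is an illustrative placeholder, not a measured value. A result of this kind would be applied through the Flink option taskmanager.memory.jvm-metaspace.size.

```java
// Hypothetical helper for back-of-the-envelope Metaspace sizing.
public class MetaspaceEstimate {

    /** Linear model: a shared baseline plus a fixed cost per deployed job, in MiB. */
    public static long requiredMiB(long baselineMiB, long perJobMiB, int jobs) {
        return baselineMiB + perJobMiB * jobs;
    }

    /** Upper bound on the per-job cost implied by a limit that fits the given number of jobs. */
    public static long perJobUpperBoundMiB(long limitMiB, int jobs) {
        return limitMiB / jobs; // integer division, rounds down
    }

    public static void main(String[] args) {
        long limitMiB = 3L * 1024; // the 3 GB limit from the message
        int jobs = 30;             // the ~30 jobs from the message
        long perJob = perJobUpperBoundMiB(limitMiB, jobs);
        System.out.println("Per-job upper bound (MiB): " + perJob);

        long baselineMiB = 100;    // placeholder assumption, measure on a real cluster
        System.out.println("Estimate for 40 jobs (MiB): " + requiredMiB(baselineMiB, perJob, 40));
    }
}
```

The bound is deliberately coarse. The real per-job cost depends on how many classes each job JAR brings in and should be measured rather than derived from the limit.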

> On 13-Sep-2021, at 5:46 PM, Caizhi Weng <ts...@gmail.com> wrote:
> 
> Hi!
> 
> Which API are you using: the DataStream API or the Table / SQL API? If it is the Table / SQL API, Java classes for some operators (for example aggregations, projections, and filters) are generated when user code is compiled into executable Java code. These classes are new to the JVM, so if you run too many jobs in the same Flink cluster, a Metaspace OOM might occur. There is already a JIRA ticket for this [1].
> 
> I don't know much about the behavior of class loaders, so I'll wait for others to reply on that aspect.
> 
> [1] https://issues.apache.org/jira/browse/FLINK-15024

Re: JVM Metaspace capacity planning

Posted by Caizhi Weng <ts...@gmail.com>.

Hi!

Which API are you using: the DataStream API or the Table / SQL API? If it
is the Table / SQL API, Java classes for some operators (for example
aggregations, projections, and filters) are generated when user code is
compiled into executable Java code. These classes are new to the JVM, so if
you run too many jobs in the same Flink cluster, a Metaspace OOM might
occur. There is already a JIRA ticket for this [1].

I don't know much about the behavior of class loaders, so I'll wait for
others to reply on that aspect.

[1] https://issues.apache.org/jira/browse/FLINK-15024

Puneet Duggal <pu...@gmail.com> wrote on Mon, Sep 13, 2021 at 7:49 PM:

> Hi,
>
> After going through several resources, my basic understanding is that the
> JVM Metaspace is used by the Flink class loader to hold class metadata,
> which is then used to create objects on the heap. I also understood this
> to be a one-time activity, since all objects of a given class need only a
> single class metadata object in the Metaspace.
>
> But while deploying multiple jobs on a TaskManager, I saw an almost linear
> increase in Metaspace consumption (irrespective of parallelism), even when
> those jobs have exactly the same implementation. So I wanted to confirm
> whether each job in Flink has its own class loader, which loads the
> required classes into the TaskManager's JVM Metaspace.
>
> PS: Any documentation for this will be of great help.
>
> Thanks,
> Puneet