Posted to user@flink.apache.org by Matt Fysh <ma...@gmail.com> on 2022/10/31 03:47:56 UTC

OutOfMemoryError (java heap space) on small, local test

Hi there,

I am running a local test with:
* source = env.from_collection
* sink = datastream.execute_and_collect
with a map function between, and two very small data points in the
collection
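
[Editor's sketch] The setup described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the linked repository: the map function `expand`, its output shape, and the input values `[1, 2]` are hypothetical, chosen only to match the description of a map function that returns a list of objects. It is not claimed to reproduce the error.

```python
def expand(value):
    """Hypothetical map function: returns a small list of objects per input."""
    return [{"id": value, "part": i} for i in range(2)]


def run_local_test():
    # Imported lazily so that expand() can be unit-tested without a Flink install.
    from pyflink.datastream import StreamExecutionEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    env.set_parallelism(1)

    source = env.from_collection([1, 2])   # two very small data points (assumed values)
    mapped = source.map(expand)            # map function between source and sink

    # Sink: collect the results back into the local Python process.
    with mapped.execute_and_collect() as results:
        return list(results)
```

Calling `run_local_test()` requires a working PyFlink installation; `expand` on its own is plain Python and can be checked in isolation.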

I'm able to generate an OutOfMemoryError. Because this test uses a simple
source and sink and has no large data size requirements, I suspect this is
due to a bug.

I'm running v1.13.2 and have created a docker-based reproduction repository
here: https://github.com/mattfysh/pyflink-oom

Please take a look and let me know what you think

Thanks!
Matt

Re: OutOfMemoryError (java heap space) on small, local test

Posted by Matt Fysh <ma...@gmail.com>.
Thanks Leonard for taking a look. It seems odd that returning a list of
objects can cause a fatal error such as this. Since I am new to Flink
and also relatively new to Python, I assume that I am doing
something wrong, because returning a list of objects is a fairly common data
modelling scenario.

Please let me know which sections of the docs, or which areas of Python, I
should read to learn how to find a solution to this problem

Thanks

On Mon, 31 Oct 2022 at 18:49, Leonard Xu <xb...@gmail.com> wrote:

> Hi, Matt
>
> I've checked your job and it is pretty simple. I've CC'd Xingbo, who is a
> PyFlink expert, to help take a quick look.
>
>
> Best,
> Leonard

Re: OutOfMemoryError (java heap space) on small, local test

Posted by Leonard Xu <xb...@gmail.com>.
Hi, Matt

I've checked your job and it is pretty simple. I've CC'd Xingbo, who is a PyFlink expert, to help take a quick look.


Best,
Leonard

> On 31 Oct 2022, at 11:47 AM, Matt Fysh <ma...@gmail.com> wrote:
> 
> Hi there,
> 
> I am running a local test with:
> * source = env.from_collection
> * sink = datastream.execute_and_collect
> with a map function between, and two very small data points in the collection
> 
> I'm able to generate an OutOfMemoryError, and due to the nature of this test using simple source and sink, plus not having large data size requirements, I suspect this is due to a bug.
> 
> I'm running v1.13.2 and have created a docker-based reproduction repository here: https://github.com/mattfysh/pyflink-oom
> 
> Please take a look and let me know what you think
> 
> Thanks!
> Matt