You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Sebastian Neef <ge...@mailbox.tu-berlin.de> on 2017/04/29 11:26:58 UTC

Weird serialization bug?

Hello Apache Flink users,

I implemented a FilterFunction some months ago that worked quite well
back then. However, I wanted to check it out right now and it somehow
broke in the sense that Flink can't serialize it anymore. I might be
mistaken, but afaik I didn't touch the code at all.

I think that I've tracked down the problem to the following minimal
working PoC:

- A simple interface:

>     interface testFunc extends Serializable {
>         boolean filter();
>     }

- A TestFilterFunction which is applied on a DataSet:

>  public void doSomeFiltering() {
>         class fooo implements testFunc {
>             public boolean filter() {
>                 return false;
>             }
>         }
> 
>         class TestFilterFunction implements FilterFunction<IPage> {
> 
>             testFunc filter;
> 
>             class fooo2 implements testFunc {
>                 public boolean filter() {
>                     return false;
>                 }
>             }
> 
>             TestFilterFunction() {
>                 // WORKS with fooo2()
>                 // DOES NOT WORK with fooo()
>                 this.filter = new fooo2();
>             }
>             @Override
>             public boolean filter(IPage iPage) throws Exception {
>                 return filter.filter();
>             }
>         }
> 	filteredDataSet = DataSet.filter(new TestFilterFunction(null))> }

Flink will work fine when the "fooo2" class is used. However, when using
the "fooo()" class, I get the following error:

> ------------------------------------------------------------
>  The program finished with the following exception:
> 
> The implementation of the FilterFunction is not serializable. The object probably contains or references non serializable fields.
> 	org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:100)
> 	org.apache.flink.api.java.DataSet.clean(DataSet.java:186)
> 	org.apache.flink.api.java.DataSet.filter(DataSet.java:287)
> 	testpackage.testclass.applyFilters(testclass.java:105)

I'm a little bit confused, why Flink manages to serialize the "fooo2"
class, but not the "fooo" class. Is this is a bug or do I miss something
here?

Cheers,
Sebastian


Re: Weird serialization bug?

Posted by Sebastian Neef <ge...@mailbox.tu-berlin.de>.
Hi,

thanks for the help!

Making the class fooo static did the trick.

I was just a bit confused, because I'm using a similar contruction
somewhere else in the code and it works flawlessy.

Best regards,
Sebastian

Re: Weird serialization bug?

Posted by Aljoscha Krettek <al...@apache.org>.
To elaborate on what Ted said: fooo is defined inside a method and probably has references to outer (non serialisable) classes.

> On 30. Apr 2017, at 01:15, Ted Yu <yu...@gmail.com> wrote:
> 
> Have you tried making fooo static ?
> 
> Cheers
> 
> On Sat, Apr 29, 2017 at 4:26 AM, Sebastian Neef <gehaxelt@mailbox.tu-berlin.de <ma...@mailbox.tu-berlin.de>> wrote:
> Hello Apache Flink users,
> 
> I implemented a FilterFunction some months ago that worked quite well
> back then. However, I wanted to check it out right now and it somehow
> broke in the sense that Flink can't serialize it anymore. I might be
> mistaken, but afaik I didn't touch the code at all.
> 
> I think that I've tracked down the problem to the following minimal
> working PoC:
> 
> - A simple interface:
> 
> >     interface testFunc extends Serializable {
> >         boolean filter();
> >     }
> 
> - A TestFilterFunction which is applied on a DataSet:
> 
> >  public void doSomeFiltering() {
> >         class fooo implements testFunc {
> >             public boolean filter() {
> >                 return false;
> >             }
> >         }
> >
> >         class TestFilterFunction implements FilterFunction<IPage> {
> >
> >             testFunc filter;
> >
> >             class fooo2 implements testFunc {
> >                 public boolean filter() {
> >                     return false;
> >                 }
> >             }
> >
> >             TestFilterFunction() {
> >                 // WORKS with fooo2()
> >                 // DOES NOT WORK with fooo()
> >                 this.filter = new fooo2();
> >             }
> >             @Override
> >             public boolean filter(IPage iPage) throws Exception {
> >                 return filter.filter();
> >             }
> >         }
> >       filteredDataSet = DataSet.filter(new TestFilterFunction(null))> }
> 
> Flink will work fine when the "fooo2" class is used. However, when using
> the "fooo()" class, I get the following error:
> 
> > ------------------------------------------------------------
> >  The program finished with the following exception:
> >
> > The implementation of the FilterFunction is not serializable. The object probably contains or references non serializable fields.
> >       org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:100)
> >       org.apache.flink.api.java.DataSet.clean(DataSet.java:186)
> >       org.apache.flink.api.java.DataSet.filter(DataSet.java:287)
> >       testpackage.testclass.applyFilters(testclass.java:105)
> 
> I'm a little bit confused, why Flink manages to serialize the "fooo2"
> class, but not the "fooo" class. Is this is a bug or do I miss something
> here?
> 
> Cheers,
> Sebastian
> 
> 


Re: Weird serialization bug?

Posted by Ted Yu <yu...@gmail.com>.
Have you tried making fooo static ?

Cheers

On Sat, Apr 29, 2017 at 4:26 AM, Sebastian Neef <
gehaxelt@mailbox.tu-berlin.de> wrote:

> Hello Apache Flink users,
>
> I implemented a FilterFunction some months ago that worked quite well
> back then. However, I wanted to check it out right now and it somehow
> broke in the sense that Flink can't serialize it anymore. I might be
> mistaken, but afaik I didn't touch the code at all.
>
> I think that I've tracked down the problem to the following minimal
> working PoC:
>
> - A simple interface:
>
> >     interface testFunc extends Serializable {
> >         boolean filter();
> >     }
>
> - A TestFilterFunction which is applied on a DataSet:
>
> >  public void doSomeFiltering() {
> >         class fooo implements testFunc {
> >             public boolean filter() {
> >                 return false;
> >             }
> >         }
> >
> >         class TestFilterFunction implements FilterFunction<IPage> {
> >
> >             testFunc filter;
> >
> >             class fooo2 implements testFunc {
> >                 public boolean filter() {
> >                     return false;
> >                 }
> >             }
> >
> >             TestFilterFunction() {
> >                 // WORKS with fooo2()
> >                 // DOES NOT WORK with fooo()
> >                 this.filter = new fooo2();
> >             }
> >             @Override
> >             public boolean filter(IPage iPage) throws Exception {
> >                 return filter.filter();
> >             }
> >         }
> >       filteredDataSet = DataSet.filter(new TestFilterFunction(null))> }
>
> Flink will work fine when the "fooo2" class is used. However, when using
> the "fooo()" class, I get the following error:
>
> > ------------------------------------------------------------
> >  The program finished with the following exception:
> >
> > The implementation of the FilterFunction is not serializable. The object
> probably contains or references non serializable fields.
> >       org.apache.flink.api.java.ClosureCleaner.clean(
> ClosureCleaner.java:100)
> >       org.apache.flink.api.java.DataSet.clean(DataSet.java:186)
> >       org.apache.flink.api.java.DataSet.filter(DataSet.java:287)
> >       testpackage.testclass.applyFilters(testclass.java:105)
>
> I'm a little bit confused, why Flink manages to serialize the "fooo2"
> class, but not the "fooo" class. Is this is a bug or do I miss something
> here?
>
> Cheers,
> Sebastian
>
>