Posted to user@flink.apache.org by Joern Kottmann <ko...@gmail.com> on 2017/12/07 14:19:39 UTC

Flink 1.4.0 RC3 and Avro objects with maps have null values

Hello,

After running into an Avro version mismatch with Flink 1.3.2, I decided
to see how things work with Flink 1.4.0.

The pipeline I am building now runs, deployed as standalone on YARN
with Flink 1.3.2 and with the user jar placed "FIRST" on the classpath
(to use Avro 1.8.2 instead of a 1.7.x version).

The default setting for yarn.per-job-cluster.include-user-jar is
ORDERED, which in my case was a bit tricky. I first deployed a
pipeline whose jar name starts with "A", so it was placed before the
Flink jars. Later I worked on a mostly identical pipeline with a name
a bit further down in the alphabet, and that one was placed behind the
Flink jars, causing various issues. I would recommend changing the
default to LAST. Then the jar name has no impact on how things are
loaded.
Should I open a Jira for this?
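
As a rough illustration of why the jar name matters under ORDERED, here is
a minimal Java sketch. It assumes the classpath entries are simply sorted
by file name; the jar names and the JarOrderSketch class are made up for
this example and are not Flink code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class JarOrderSketch {

    // Returns the given jar file names in plain lexicographic order. That
    // ordering is the assumption made here about what ORDERED does.
    public static List<String> ordered(List<String> jarNames) {
        List<String> sorted = new ArrayList<>(jarNames);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical jar names: only the user jar's name differs.
        System.out.println(ordered(Arrays.asList("flink-dist.jar", "analytics-pipeline.jar")));
        System.out.println(ordered(Arrays.asList("flink-dist.jar", "sessions-pipeline.jar")));
    }
}
```

In the first call the user jar sorts before flink-dist.jar, in the second
one it sorts after it, so the same code can see different versions of a
shared dependency depending only on the file name.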

Now, after updating to Flink 1.4.0 and again running on YARN as
standalone, I am getting null values from a map field that I set in
earlier steps of the pipeline on Avro objects. It is a bit
weird, because avroObject.toString shows the values just fine,
but I get null on retrieval.

I have my own TableInputFormat to read from HBase and output a Tuple2
containing an Avro object.
Next comes a map which tries to access it.

The failing piece looks like this:
ao.getTheMap().get("theKey") (returns null)
Doing an ao.toString would show the "theMap" and the value of "theKey".

Is this a bug in Flink 1.4.0 or is there something wrong with my dependencies?

Thanks,
Jörn

Re: Flink 1.4.0 RC3 and Avro objects with maps have null values

Posted by Stephan Ewen <se...@apache.org>.
One thing to be aware of when testing Flink 1.4 is the changes to
dependencies and classloading.

By default, Flink 1.4 now uses inverted classloading to allow users to use
their own copies of dependencies, irrespective of what the underlying
classpath is polluted with. You can, for example, use a different Avro
version than the one Hadoop pulls in, even without shading, or even
different Akka / Jackson / etc. versions.

That is a nice improvement, but it may have some impact on tools that were
built before. When you see ClassCastExceptions (like "X cannot be cast
to X"), that is probably caused by the fact that the classloader duplicates
a dependency from the JVM classpath in user space, while objects/classes
move between the two domains.

You can always switch back to the old model by adding
"classloader.resolve-order: parent-first" to the configuration.
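
To make "inverted classloading" a bit more concrete, here is a minimal
Java sketch of a child-first classloader. It is only an illustration of
the idea under simplifying assumptions, not Flink's actual implementation;
the class name ChildFirstSketch is made up for this example.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ChildFirstSketch extends URLClassLoader {

    public ChildFirstSketch(URL[] userJars, ClassLoader parent) {
        super(userJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (name.startsWith("java.")) {
                    // JDK classes always come from the parent.
                    loaded = super.loadClass(name, false);
                } else {
                    try {
                        // Inverted order: look in the user jars first ...
                        loaded = findClass(name);
                    } catch (ClassNotFoundException notInUserJars) {
                        // ... and only then fall back to the parent classpath.
                        loaded = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    public static void main(String[] args) throws Exception {
        // No user jars are given here, so every lookup falls through to the parent.
        try (ChildFirstSketch loader =
                 new ChildFirstSketch(new URL[0], ChildFirstSketch.class.getClassLoader())) {
            System.out.println(loader.loadClass("java.util.HashMap") == java.util.HashMap.class);
        }
    }
}
```

If a class X exists both in the user jars and on the parent classpath,
a loader like this defines its own copy of X. Instances created from the
parent's X are then not instances of the child's X, which is where an
"X cannot be cast to X" error comes from.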

On Fri, Dec 8, 2017 at 2:13 PM, Aljoscha Krettek <al...@apache.org>
wrote:

> I think this Avro issue might be related:
> https://issues.apache.org/jira/browse/AVRO-803

Re: Flink 1.4.0 RC3 and Avro objects with maps have null values

Posted by Aljoscha Krettek <al...@apache.org>.
I think this Avro issue might be related: https://issues.apache.org/jira/browse/AVRO-803

> On 7. Dec 2017, at 17:19, Timo Walther <tw...@apache.org> wrote:
> 
> Hi Jörn,
> 
> thanks for the little example. Maybe Avro changed its behavior for maps between the old version we used in Flink 1.3 and the newer version in Flink 1.4.0.
> 
> I will investigate this and might open an issue for it.
> 
> Regards,
> Timo


Re: Flink 1.4.0 RC3 and Avro objects with maps have null values

Posted by Timo Walther <tw...@apache.org>.
Hi Jörn,

thanks for the little example. Maybe Avro changed its behavior for
maps between the old version we used in Flink 1.3 and the newer
version in Flink 1.4.0.

I will investigate this and might open an issue for it.

Regards,
Timo


On 12/7/17 at 5:07 PM, Joern Kottmann wrote:
> Hello Timo,
>
> thanks for your quick response. I can't share the code of that pipeline here.
>
> The Flink version I am using is this one:
> http://people.apache.org/~aljoscha/flink-1.4.0-rc3/flink-1.4.0-src.tgz
>
> I compiled it myself with mvn install -DskipTests (for some reason
> the tests failed; I can share that too, maybe a firewall issue on
> Fedora 27).
>
> I managed to isolate the problem and it can be reproduced with a few
> lines running in the IDE:
> https://github.com/kottmann/flink-avro-issue
>
> And it is indeed as you suspected: the key is a Utf8, and I pass
> a String to the get.
>
> But why does that break with Flink 1.4.0 when it runs on Flink 1.3.2?
>
> Thanks again!
> Jörn


Re: Flink 1.4.0 RC3 and Avro objects with maps have null values

Posted by Joern Kottmann <ko...@gmail.com>.
Hello Timo,

thanks for your quick response. I can't share the code of that pipeline here.

The Flink version I am using is this one:
http://people.apache.org/~aljoscha/flink-1.4.0-rc3/flink-1.4.0-src.tgz

I compiled it myself with mvn install -DskipTests (for some reason
the tests failed; I can share that too, maybe a firewall issue on
Fedora 27).

I managed to isolate the problem and it can be reproduced with a few
lines running in the IDE:
https://github.com/kottmann/flink-avro-issue

And it is indeed as you suspected: the key is a Utf8, and I pass
a String to the get.
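
The mismatch can be sketched in plain Java without Avro on the classpath.
Utf8Like below is a made-up stand-in for org.apache.avro.util.Utf8 (a
CharSequence that never equals a java.lang.String), and getByContent is a
hypothetical helper, not an Avro or Flink API.

```java
import java.util.HashMap;
import java.util.Map;

public class Utf8KeySketch {

    // Stand-in for org.apache.avro.util.Utf8: a CharSequence that is only
    // equal to other instances of its own class, never to java.lang.String.
    static final class Utf8Like implements CharSequence {
        private final String value;
        Utf8Like(String value) { this.value = value; }
        @Override public int length() { return value.length(); }
        @Override public char charAt(int index) { return value.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return new Utf8Like(value.substring(start, end));
        }
        @Override public boolean equals(Object other) {
            return other instanceof Utf8Like && ((Utf8Like) other).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
        @Override public String toString() { return value; }
    }

    // Content-based lookup that works no matter which CharSequence type the keys have.
    public static <V> V getByContent(Map<? extends CharSequence, V> map, String key) {
        for (Map.Entry<? extends CharSequence, V> entry : map.entrySet()) {
            if (entry.getKey().toString().equals(key)) {
                return entry.getValue();
            }
        }
        return null;
    }

    // Builds a map whose single key is the stand-in type rather than a String.
    public static Map<CharSequence, String> sampleMap() {
        Map<CharSequence, String> theMap = new HashMap<>();
        theMap.put(new Utf8Like("theKey"), "theValue");
        return theMap;
    }

    public static void main(String[] args) {
        Map<CharSequence, String> theMap = sampleMap();
        System.out.println(theMap);                             // prints like a String-keyed map
        System.out.println(theMap.get("theKey"));               // String lookup misses
        System.out.println(theMap.get(new Utf8Like("theKey"))); // same-type lookup hits
        System.out.println(getByContent(theMap, "theKey"));     // content-based lookup hits
    }
}
```

With real Avro objects the same-type lookup would be
get(new Utf8("theKey")). Another option is to generate the classes with
java.lang.String as the string type (the "avro.java.string" schema
property, or stringType in the Avro Maven plugin).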

But why does that break with Flink 1.4.0 when it runs on Flink 1.3.2?

Thanks again!
Jörn

> On Thu, Dec 7, 2017 at 3:57 PM, Timo Walther <tw...@apache.org> wrote:
>> Can you also check the type of the keys in your map? Avro distinguishes
>> between the String and Utf8 classes. Maybe this is why your key cannot be found.
>>
>> Regards,
>> Timo

Re: Flink 1.4.0 RC3 and Avro objects with maps have null values

Posted by Timo Walther <tw...@apache.org>.
Can you also check the type of the keys in your map? Avro distinguishes
between the String and Utf8 classes. Maybe this is why your key cannot be found.
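
A quick way to check this is to print the runtime class of every key. The
following is a minimal sketch in plain Java; KeyTypeCheck and describeKeys
are made-up names, and StringBuilder only stands in for a non-String
CharSequence such as Avro's Utf8.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyTypeCheck {

    // Lists each key together with its runtime class name.
    public static List<String> describeKeys(Map<?, ?> map) {
        List<String> lines = new ArrayList<>();
        for (Object key : map.keySet()) {
            String type = (key == null) ? "null" : key.getClass().getName();
            lines.add(key + " -> " + type);
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<CharSequence, String> theMap = new LinkedHashMap<>();
        theMap.put("plainKey", "v1");                     // a java.lang.String key
        theMap.put(new StringBuilder("otherKey"), "v2");  // stand-in for a Utf8 key
        for (String line : describeKeys(theMap)) {
            System.out.println(line);
        }
    }
}
```

Calling describeKeys(ao.getTheMap()) inside the failing map function
would show whether the keys are String or Utf8 instances.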

Regards,
Timo


On 12/7/17 at 3:54 PM, Timo Walther wrote:
> Hi Jörn,
>
> could you tell us a bit more about your job? Did you import the
> flink-avro module? What does the Flink TypeInformation for your Avro
> type look like when using println(ds.getType)? It sounds like a very
> weird error if the toString() method shows the key. Can you reproduce
> the error in your IDE and step into the `get("theKey")` method using a
> debugger? I will loop in Aljoscha, who worked on Avro recently.
>
> Regards,
> Timo


Re: Flink 1.4.0 RC3 and Avro objects with maps have null values

Posted by Timo Walther <tw...@apache.org>.
Hi Jörn,

could you tell us a bit more about your job? Did you import the
flink-avro module? What does the Flink TypeInformation for your Avro type
look like when using println(ds.getType)? It sounds like a very weird error
if the toString() method shows the key. Can you reproduce the error in
your IDE and step into the `get("theKey")` method using a debugger? I will
loop in Aljoscha, who worked on Avro recently.

Regards,
Timo


