Posted to user@avro.apache.org by Johannes Schulte <jo...@gmail.com> on 2013/01/30 23:32:44 UTC

AvroMultipleOutput ignores schemas (other than default)

Hi!

Maybe I'm the only one who has ever used this :D.

Adding named outputs with AvroMultipleOutputs.addNamedOutput just puts their
schemas into a static map, which of course is not available on the cluster
during reduce execution. The unit tests pass anyway, since there the
AvroMultipleOutputs instance is the same in the Reducer as in the Job's main
class, so the schemas added there are still present.
A fix would be to add the namedOutput schemas to the job configuration so
they can be parsed in the reducers. Example patch for the new mapreduce API
here:

https://gist.github.com/4677875
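
For readers who cannot open the gist, here is a minimal sketch of the idea
only, not the actual patch. It assumes nothing about the real Avro or Hadoop
classes: java.util.Properties stands in for the job configuration, and the
class name, method names, and key prefix are all hypothetical. The point is
that a static map stays in the JVM that filled it, while string entries in
the configuration are serialized and travel to the tasks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class NamedOutputSchemaSketch {
    // Hypothetical key prefix; the real patch may use a different one.
    static final String KEY_PREFIX = "sketch.namedoutput.schema.";

    // The problematic pattern: only visible in the JVM that populated it.
    static final Map<String, String> STATIC_SCHEMAS = new HashMap<>();

    static void addNamedOutputStatic(String name, String schemaJson) {
        STATIC_SCHEMAS.put(name, schemaJson);
    }

    // The proposed pattern: keep the schema JSON in the configuration itself.
    static void addNamedOutput(Properties conf, String name, String schemaJson) {
        conf.setProperty(KEY_PREFIX + name, schemaJson);
    }

    // What a reducer would call; returns null if the output was never registered.
    static String getNamedOutputSchema(Properties conf, String name) {
        return conf.getProperty(KEY_PREFIX + name);
    }

    // Stand-in for shipping the configuration to a task: write it out as text
    // and read it back into a fresh object, so nothing in memory is shared.
    static Properties shipToTask(Properties conf) {
        try {
            StringWriter out = new StringWriter();
            conf.store(out, null);
            Properties remote = new Properties();
            remote.load(new StringReader(out.toString()));
            return remote;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        addNamedOutput(conf, "avro1", "{\"type\":\"string\"}");
        Properties remote = shipToTask(conf);
        System.out.println(getNamedOutputSchema(remote, "avro1"));
    }
}
```

In a real job the reducer would then parse the returned JSON into a schema
before creating the writer for that named output.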

Have a nice evening,

Johannes

Re: AvroMultipleOutput ignores schemas (other than default)

Posted by Doug Cutting <cu...@apache.org>.

Johannes,

Can you (or someone) please file an issue in Jira for this?

Thanks!

Doug
