Posted to hdfs-user@hadoop.apache.org by Li Li <fa...@gmail.com> on 2014/09/12 05:05:07 UTC

MultipleTextOutputFormat in new api of 1.2.1?

I want to write different key ranges to different directories.
In the old API there is a MultipleTextOutputFormat; I only need to
override generateFileNameForKeyValue.
But I can't find it in the new API.
There is MultipleOutputs, but it is not a good fit because it requires
the named outputs to be predefined with
MultipleOutputs.addNamedOutput,
and before I run the job I don't know how many keys there will be.
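
To make the goal concrete, here is a minimal sketch of the key-to-path mapping
described above, written as plain Java so it runs without Hadoop on the
classpath. The class name KeyRangePaths, the helper fileNameFor, and the range
width of 1000 are all hypothetical and chosen only for illustration. In the old
mapred API, the idea would be to return a string like this from an override of
generateFileNameForKeyValue(key, value, name), where name is the default leaf
file name such as part-00000.

```java
public final class KeyRangePaths {
    private KeyRangePaths() {}

    /** Width of each key range. An assumed value, not from the original post. */
    public static final long RANGE_WIDTH = 1000L;

    /**
     * Builds a relative output path that places the record in a directory
     * named after the key range it falls into.
     *
     * @param key      numeric record key (assumed numeric for this sketch)
     * @param leafName default leaf file name, for example "part-00000"
     */
    public static String fileNameFor(long key, String leafName) {
        // floorDiv rounds toward negative infinity, so each range starts
        // at a multiple of RANGE_WIDTH.
        long start = Math.floorDiv(key, RANGE_WIDTH) * RANGE_WIDTH;
        long end = start + RANGE_WIDTH - 1;
        return "range-" + start + "-" + end + "/" + leafName;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor(42L, "part-00000"));
        System.out.println(fileNameFor(1500L, "part-00000"));
    }
}
```

Because the directory name is computed from each key at write time, nothing
has to be declared before the job starts, which is the property that
addNamedOutput lacks.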

Re: MultipleTextOutputFormat in new api of 1.2.1?

Posted by Adam Kawa <ka...@gmail.com>.
Afaik, dynamic partitions in the new mapreduce API are actually not
supported (please read http://grepalex.com/2013/07/16/multipleoutputs-part2/
and
http://stackoverflow.com/questions/25503034/dynamic-key-based-names-of-output-files-in-new-hadoop-api
).

If you don't want to use the old mapred API, then dynamic partitioning in Hive
might be an alternative.

2014-09-12 5:05 GMT+02:00 Li Li <fa...@gmail.com>:

> I want to output different key ranges to different directory.
> As of old api, there is a MultipleTextOutputFormat. I just need
> rewrite generateFileNameForKeyValue.
> But I can't find it in new api.
> There is a MultipleOutputs. But it's not that good because it need
> predefine keys by
> MultipleOutputs.addNamedOutput
> But before I run it, I don't know how many keys.
>
