Posted to dev@sqoop.apache.org by Sharad Agarwal <sh...@gmail.com> on 2011/11/30 05:29:22 UTC
Map/Reduce being idempotent for export
How do we ensure that map/reduce tasks are idempotent? To be clearer: if
there is more than one map/reduce task attempt for an export job, how do we
make sure that data from only a single task attempt is committed?
Normally this is handled by the OutputCommitter. However, I see that export
uses NullOutputCommitter, which doesn't do anything.
Thanks
Sharad
Re: Map/Reduce being idempotent for export
Posted by Kate Ting <ka...@cloudera.com>.
Sharad,
Sqoop handles this with the following export arguments:
--staging-table <staging-table-name>
The table in which data will be staged before being inserted into the
destination table.
--clear-staging-table
Indicates that any data present in the staging table can be deleted.
For more information:
http://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html#_syntax_3
Note that only some connectors support staging tables.
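To make the idea concrete, here is a minimal sketch of the general staging-table pattern in Python with sqlite3. It is not Sqoop's implementation: the table names (dest, staging), the helper functions, and the fail_after parameter are all hypothetical, chosen only for illustration. Rows are written to the staging table in several small transactions, and only once all of them have landed is the data moved to the destination table in a single transaction, so a failed run leaves the destination untouched.

```python
import sqlite3


def stage_rows(conn, rows, fail_after=None):
    """Write rows to the staging table, committing after each row.

    The per-row commits stand in for the multiple independent
    transactions of an export job. fail_after is a hypothetical knob
    that simulates a task dying partway through.
    """
    for i, row in enumerate(rows):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated task failure")
        conn.execute("INSERT INTO staging (id, val) VALUES (?, ?)", row)
        conn.commit()


def export(conn, rows, fail_after=None):
    # Analogous in spirit to --clear-staging-table: discard leftovers
    # from an earlier failed run before staging new data.
    conn.execute("DELETE FROM staging")
    conn.commit()

    stage_rows(conn, rows, fail_after)

    # Move the staged rows to the destination in ONE transaction.
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        conn.execute("INSERT INTO dest (id, val) SELECT id, val FROM staging")
        conn.execute("DELETE FROM staging")


def count(conn, table):
    # table is one of the fixed, hypothetical names used in this sketch.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, val TEXT)")
rows = [(1, "a"), (2, "b"), (3, "c")]

# First run fails after two rows have been staged.
try:
    export(conn, rows, fail_after=2)
except RuntimeError:
    pass
dest_after_failure = count(conn, "dest")
staging_after_failure = count(conn, "staging")

# The retry clears the staging table, stages everything, then moves it.
export(conn, rows)
dest_after_retry = count(conn, "dest")
staging_after_retry = count(conn, "staging")
```

In this sketch the partial writes of the failed run are visible only in the staging table, and the retry does not collide with them because the staging table is cleared first. Note that this illustrates protection against a partially failed job; it does not by itself deduplicate rows written by two attempts that both run to completion.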
Regards, Kate