Posted to dev@sqoop.apache.org by Sharad Agarwal <sh...@gmail.com> on 2011/11/30 05:29:22 UTC

Map/Reduce being idempotent for export

How do we ensure that map/reduce tasks are idempotent? To be clearer: if
there is more than one map/reduce attempt for an export job, how do we make
sure that data from only a single task attempt is committed?

Normally this is handled by an OutputCommitter. However, I see that export
uses NullOutputCommitter, which doesn't do anything.

Thanks
Sharad

Re: Map/Reduce being idempotent for export

Posted by Kate Ting <ka...@cloudera.com>.
Sharad,

Sqoop handles this with the following export arguments:

--staging-table <staging-table-name>
The table in which data will be staged before being inserted into the
destination table.

--clear-staging-table
Indicates that any data present in the staging table can be deleted.

For more information:
http://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html#_syntax_3

Note that only some connectors support staging tables.
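
To illustrate the idea behind the staging table, here is a minimal Python
sketch of the general pattern. It is not Sqoop's implementation: sqlite3
stands in for the target database, and the table names (dest, staging) and
the function names (make_db, stage_rows, promote, run_export) are
hypothetical. Rows are first written to the staging table, possibly across
several commits, and only reach the destination table in one final
transaction. The clear_staging flag plays the role of --clear-staging-table.

```python
import sqlite3


def make_db():
    """Create an in-memory database with structurally identical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dest (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def stage_rows(conn, rows, fail_after=None):
    """Write rows to staging, one commit per row, optionally failing midway."""
    for i, row in enumerate(rows):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated task failure")
        conn.execute("INSERT INTO staging (id, name) VALUES (?, ?)", row)
        conn.commit()  # partial data is visible in staging only


def promote(conn):
    """Move all staged rows to the destination in a single transaction."""
    with conn:  # commits on success, rolls back on any exception
        conn.execute("INSERT INTO dest SELECT id, name FROM staging")
        conn.execute("DELETE FROM staging")


def run_export(conn, rows, clear_staging=False, fail_after=None):
    """Stage rows, then promote them; dest changes only if staging succeeds."""
    if clear_staging:
        conn.execute("DELETE FROM staging")
        conn.commit()
    elif conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]:
        raise RuntimeError("staging table is not empty")
    stage_rows(conn, rows, fail_after)
    promote(conn)


def count(conn, table):
    """Return the number of rows in one of the two known tables."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = make_db()
rows = [(1, "a"), (2, "b"), (3, "c")]

# First run fails partway: leftovers stay in staging, dest is untouched.
try:
    run_export(conn, rows, fail_after=2)
except RuntimeError:
    pass
after_failure = (count(conn, "dest"), count(conn, "staging"))

# Rerun with the staging table cleared first: dest gets each row once.
run_export(conn, rows, clear_staging=True)
after_retry = (count(conn, "dest"), count(conn, "staging"))
```

In this sketch a failed run leaves partial data only in the staging table, so
rerunning the export with the staging table cleared does not produce
duplicates or insert collisions in the destination table.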

Regards, Kate

On Tue, Nov 29, 2011 at 8:29 PM, Sharad Agarwal <sh...@gmail.com> wrote:

> How do we ensure that map/reduce tasks are idempotent. To be more clear: if
> there are more than 1 map/reduce attempts for an export job, how do we make
> sure that data from only single task attempt is committed ?
>
> Normally this is handled by OutputCommitter, however I see
> NullOutputCommitter which doesn't do anything.
>
> Thanks
> Sharad
>