Posted to common-user@hadoop.apache.org by Ravikant Dindokar <ra...@gmail.com> on 2015/06/28 12:45:14 UTC

Static variable in reducer

Hi Hadoop user,

I have a graph data file in the form of an edge list:
<Source Vertex_id> <Sink Vertex_id>

I want to assign each edge a unique ID. In the map function I emit
(key, value) as (<Source Vertex_id>, <Sink Vertex_id>).

In the reducer, for each value, I use a combination of a static count
variable and the task id (context.getTaskAttemptID().getTaskID().getId()) to
generate a unique ID:

edgeId = (localcount << 16) | (taskId << 55);

I am able to generate unique IDs.

My question is: if a reducer fails, will this still work?

What exactly happens when a reducer fails and is recomputed?

Please find attached the source code for the mapper and reducer.

Thanks
Ravikant
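
A minimal sketch of the approach described above, for reference (the attached
source is not part of this archive; the class names UniqueEdgeIdJob,
EdgeListMapper, and EdgeIdReducer are illustrative, not the actual attachment):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UniqueEdgeIdJob {

    // Mapper: parses "<Source Vertex_id> <Sink Vertex_id>" lines and emits (source, sink).
    // Assumes numeric vertex ids.
    public static class EdgeListMapper
            extends Mapper<LongWritable, Text, LongWritable, LongWritable> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] parts = line.toString().trim().split("\\s+");
            if (parts.length < 2) {
                return; // skip malformed lines
            }
            context.write(new LongWritable(Long.parseLong(parts[0])),
                          new LongWritable(Long.parseLong(parts[1])));
        }
    }

    // Reducer: composes an edge id from a static per-JVM counter and the reduce task id.
    public static class EdgeIdReducer
            extends Reducer<LongWritable, LongWritable, LongWritable, Text> {
        private static long localcount = 0; // starts from 0 again in a fresh JVM

        @Override
        protected void reduce(LongWritable source, Iterable<LongWritable> sinks, Context context)
                throws IOException, InterruptedException {
            long taskId = context.getTaskAttemptID().getTaskID().getId();
            for (LongWritable sink : sinks) {
                long edgeId = (localcount << 16) | (taskId << 55);
                localcount++;
                context.write(new LongWritable(edgeId),
                              new Text(source.get() + "\t" + sink.get()));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "unique edge ids");
        job.setJarByClass(UniqueEdgeIdJob.class);
        job.setMapperClass(EdgeListMapper.class);
        job.setReducerClass(EdgeIdReducer.class);
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}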

Re: Static variable in reducer

Posted by Ravikant Dindokar <ra...@gmail.com>.
Thanks a lot, Shahab.

On Sun, Jun 28, 2015 at 8:38 PM, Shahab Yunus <sh...@gmail.com>
wrote:

> You asked a similar question earlier, so I will copy here what I replied
> then:
>
> http://hadoop-common.472056.n3.nabble.com/how-to-assign-unique-ID-Long-Value-in-mapper-td4078062.html
> Basically, to summarize, you shouldn't rely on shared state across
> reducers. You need to rethink your design.
>
> Moving on, if you still want to do this, then in your scenario: if a
> reducer fails (it runs out of memory, a disk crashes, etc.), or in the case
> of speculative execution, it will be given a new attempt number with the
> old task id when it is recomputed/retried, and your custom counter variable
> (the static one) will be reinitialized (as it will run in a new JVM).
>
> Regards,
> Shahab
>
> On Sun, Jun 28, 2015 at 6:45 AM, Ravikant Dindokar <
> ravikant.iisc@gmail.com> wrote:
>
>> Hi Hadoop user,
>>
>> I have a graph data file in the form of an edge list:
>> <Source Vertex_id> <Sink Vertex_id>
>>
>> I want to assign each edge a unique ID. In the map function I emit
>> (key, value) as (<Source Vertex_id>, <Sink Vertex_id>).
>>
>> In the reducer, for each value, I use a combination of a static count
>> variable and the task id (context.getTaskAttemptID().getTaskID().getId()) to
>> generate a unique ID:
>>
>> edgeId = (localcount << 16) | (taskId << 55);
>>
>> I am able to generate unique IDs.
>>
>> My question is: if a reducer fails, will this still work?
>>
>> What exactly happens when a reducer fails and is recomputed?
>>
>> Please find attached the source code for the mapper and reducer.
>>
>> Thanks
>> Ravikant
>>
>
>

Re: Static variable in reducer

Posted by Shahab Yunus <sh...@gmail.com>.
You asked a similar question earlier, so I will copy here what I replied then:
http://hadoop-common.472056.n3.nabble.com/how-to-assign-unique-ID-Long-Value-in-mapper-td4078062.html
Basically, to summarize, you shouldn't rely on shared state across reducers.
You need to rethink your design.

Moving on, if you still want to do this, then in your scenario: if a reducer
fails (it runs out of memory, a disk crashes, etc.), or in the case of
speculative execution, it will be given a new attempt number with the old
task id when it is recomputed/retried, and your custom counter variable (the
static one) will be reinitialized (as it will run in a new JVM).

Regards,
Shahab

On Sun, Jun 28, 2015 at 6:45 AM, Ravikant Dindokar <ra...@gmail.com>
wrote:

> Hi Hadoop user,
>
> I have a graph data file in the form of an edge list:
> <Source Vertex_id> <Sink Vertex_id>
>
> I want to assign each edge a unique ID. In the map function I emit
> (key, value) as (<Source Vertex_id>, <Sink Vertex_id>).
>
> In the reducer, for each value, I use a combination of a static count
> variable and the task id (context.getTaskAttemptID().getTaskID().getId()) to
> generate a unique ID:
>
> edgeId = (localcount << 16) | (taskId << 55);
>
> I am able to generate unique IDs.
>
> My question is: if a reducer fails, will this still work?
>
> What exactly happens when a reducer fails and is recomputed?
>
> Please find attached the source code for the mapper and reducer.
>
> Thanks
> Ravikant
>
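
A minimal sketch of the behaviour described above (the class name
AttemptAwareReducer is illustrative): logging the task id, the attempt
number, and the static counter from setup() is one way to observe that,
across a retry or a speculative duplicate, the task id stays the same, the
attempt number increments, and the static field starts over in the new JVM.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.TaskAttemptID;

public class AttemptAwareReducer
        extends Reducer<LongWritable, LongWritable, LongWritable, LongWritable> {

    // Reinitialized to 0 whenever a task attempt starts in a fresh JVM,
    // e.g. after a failure or for a speculative duplicate attempt.
    private static long localcount = 0;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        TaskAttemptID attempt = context.getTaskAttemptID();
        int taskId = attempt.getTaskID().getId(); // same value on every retry of this task
        int attemptNumber = attempt.getId();      // 0 for the first attempt; 1, 2, ... on retries
        System.err.println("reduce task " + taskId + ", attempt " + attemptNumber
                + ", counter starts at " + localcount);
    }
}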
