Posted to common-user@hadoop.apache.org by Ravikant Dindokar <ra...@gmail.com> on 2015/06/28 21:10:21 UTC

Reducer called twice for same key

Hi Hadoop user,

I have two map classes processing two different input files. Both map
functions emit the same key/value format.

But the reducer is called twice for the same key: once for the value from the
first map and once for the value from the other map.

I am printing (key, value) pairs in the reducer:
./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1

Both maps emit a LongWritable key and a Text value.


Any idea why this is happening?
Is there any way to get the hash values generated by Hadoop for the keys
emitted by the mapper?

Thanks
Ravikant
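
On the hash question: as far as I know, when no custom partitioner is set,
Hadoop's HashPartitioner picks the reduce partition as
(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. Below is a minimal
plain-Java sketch of that arithmetic. The class and method names are made up
for illustration, the reducer count of 4 is hypothetical, and Long.hashCode
is only a stand-in for the key's own hashCode() (an assumption; check
LongWritable.hashCode() in your Hadoop version).

```java
public class PartitionSketch {
    // Same arithmetic as the default hash partitioner: clear the sign bit,
    // then take the remainder by the number of reduce tasks.
    static int partitionFor(int keyHash, int numReduceTasks) {
        return (keyHash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4; // hypothetical job setting
        for (long key : new long[] {391L, 591L, 2391L, 3291L}) {
            int hash = Long.hashCode(key); // stand-in for the key's hashCode()
            System.out.println(key + " -> hash " + hash
                    + " -> partition " + partitionFor(hash, numReduceTasks));
        }
    }
}
```

The partition depends only on the key and the number of reduce tasks, not on
which mapper emitted the record, so equal keys from both mappers should land
in the same reduce task.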

Re: Reducer called twice for same key

Posted by Ravikant Dindokar <ra...@gmail.com>.
Hi Shahab,

It was my mistake. I misplaced the SOP (System.out.println) statements and
drew the wrong conclusion. Sorry for the misleading question.
Harshit figured it out earlier. It seems that the last two messages were not
addressed to the mailing list.

Thanks
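
The reducer code is not shown in this thread, so the following is only a
guess at how a misplaced print can look like a second reduce call. It is a
plain-Java sketch with hypothetical names (SopPlacementSketch, reduce), not
the actual PALReducer: a print inside the loop over values yields one line
per value, even though reduce() ran once for the key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SopPlacementSketch {
    // Stand-in for a reduce() call: invoked once per key with all of its values.
    static List<String> reduce(long key, List<String> values) {
        List<String> lines = new ArrayList<>();
        for (String value : values) {
            // A print placed here emits one line per VALUE, not one per call.
            lines.add("Reduce:" + key + ":" + value);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Both values for key 391, as they would arrive in a single reduce() call.
        List<String> values = Arrays.asList("-1#11", "3278620528725786624:5352454#-1");
        for (String line : reduce(391L, values)) {
            System.out.println(line);
        }
    }
}
```

Printing the key once before the loop, and the values inside it, makes it
easier to tell one call with two values apart from two calls.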


On Mon, Jun 29, 2015 at 7:17 PM, Shahab Yunus <sh...@gmail.com>
wrote:

> Ravikant,
>
> How does the output that you sent in the email map to the one you are
> printing in the code (using SOP statements)?
>
> Where do you see the reducer being called again for the same key? Maybe I am
> missing something, but the output statements in the code look different.
>
> Regards,
> Shahab
>
> On Mon, Jun 29, 2015 at 2:10 AM, Ravikant Dindokar <
> ravikant.iisc@gmail.com> wrote:
>
>> Hi Harshit,
>>
>> PFA
>>
>> Thanks
>> Ravikant
>>
>> On Mon, Jun 29, 2015 at 11:31 AM, Harshit Mathur <ma...@gmail.com>
>> wrote:
>>
>>> Can you share PALReducer also?
>>>
>>> On Mon, Jun 29, 2015 at 11:21 AM, Ravikant Dindokar <
>>> ravikant.iisc@gmail.com> wrote:
>>>
>>>> Adding source code for more clarity.
>>>>
>>>> The problem statement is simple:
>>>>
>>>> PartitionFileMapper: takes an input file with tab-separated values
>>>> V, P
>>>> and emits (V, -1#P)
>>>>
>>>> ALFileMapper: takes an input file with tab-separated values V, EL
>>>> and emits (V, E#-1)
>>>>
>>>> In the reducer I want to emit
>>>> (V, E#P)
>>>>
>>>> Thanks
>>>> Ravikant
>>>>
>>>> On Mon, Jun 29, 2015 at 11:04 AM, Ravikant Dindokar <
>>>> ravikant.iisc@gmail.com> wrote:
>>>>
>>>>> By custom key, did you mean some class object? Then no.
>>>>>
>>>>> I have two map methods, each with a different file as input. Both
>>>>> map methods emit a *LongWritable* key. But in the stdout of the
>>>>> container I can see the following,
>>>>>
>>>>> with key and value separated by ':'
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1
>>>>>
>>>>> For key 391 the reducer is called twice: once for the value from the
>>>>> first map and once for the value from the other map.
>>>>>
>>>>> In the map method I parse the string from the input file as a long and
>>>>> then emit it as a LongWritable.
>>>>>
>>>>> Is there something I am missing when I use MultipleInputs
>>>>> (org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?
>>>>>
>>>>> Thanks
>>>>> Ravikant
>>>>>
>>>>> On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <mathursharp@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> As per MapReduce, it is not possible for two different reducers
>>>>>> to get the same key.
>>>>>> I think you have created some custom key type? If that is the case,
>>>>>> then there may be an issue with the comparator.
>>>>>>
>>>>>> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
>>>>>> ravikant.iisc@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Hadoop user,
>>>>>>>
>>>>>>> I have two map classes processing two different input files. Both
>>>>>>> map functions emit the same key/value format.
>>>>>>>
>>>>>>> But the reducer is called twice for the same key: once for the value
>>>>>>> from the first map and once for the value from the other map.
>>>>>>>
>>>>>>> I am printing (key, value) pairs in the reducer:
>>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>>>>>>
>>>>>>> Both maps emit a LongWritable key and a Text value.
>>>>>>>
>>>>>>>
>>>>>>> Any idea why this is happening?
>>>>>>> Is there any way to get the hash values generated by Hadoop for the keys
>>>>>>> emitted by the mapper?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Ravikant
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harshit Mathur
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Harshit Mathur
>>>
>>
>>
>
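
For the join described in the quoted problem statement (one mapper emits
(V, -1#P), the other emits (V, E#-1), and the reducer should emit (V, E#P)),
here is a minimal plain-Java sketch of the merge step. It is not the
PALReducer from the thread, which was only sent as an attachment. The class
name JoinSketch, the method merge, and the choice to split on the last '#'
are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class JoinSketch {
    static final String MISSING = "-1"; // placeholder used by both mappers

    // Merge values of the form "-1#P" and "E#-1" into a single "E#P".
    static String merge(List<String> values) {
        String e = MISSING;
        String p = MISSING;
        for (String value : values) {
            int sep = value.lastIndexOf('#');
            if (sep < 0) {
                continue; // skip values without a '#' separator
            }
            String left = value.substring(0, sep);
            String right = value.substring(sep + 1);
            if (!left.equals(MISSING)) {
                e = left;
            }
            if (!right.equals(MISSING)) {
                p = right;
            }
        }
        return e + "#" + p;
    }

    public static void main(String[] args) {
        // The two values logged for key 391 in the original post.
        System.out.println(merge(Arrays.asList("-1#11", "3278620528725786624:5352454#-1")));
        // prints 3278620528725786624:5352454#11
    }
}
```

In a real reducer the same loop would run over the Iterable of Text values
passed to reduce(), with one output record written per key after the loop.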

Re: Reducer called twice for same key

Posted by Shahab Yunus <sh...@gmail.com>.
Ravikant,

How does the output that you sent in the email map to the one you are
printing in the code (using SOP statements)?

Where do you see the reducer being called again for the same key? Maybe I am
missing something, but the output statements in the code look different.

Regards,
Shahab

On Mon, Jun 29, 2015 at 2:10 AM, Ravikant Dindokar <ra...@gmail.com>
wrote:

> Hi Harshit,
>
> PFA
>
> Thanks
> Ravikant
>
> On Mon, Jun 29, 2015 at 11:31 AM, Harshit Mathur <ma...@gmail.com>
> wrote:
>
>> Can you share PALReducer also?
>>
>> On Mon, Jun 29, 2015 at 11:21 AM, Ravikant Dindokar <
>> ravikant.iisc@gmail.com> wrote:
>>
>>> Adding source code for more clarity.
>>>
>>> The problem statement is simple:
>>>
>>> PartitionFileMapper: takes an input file with tab-separated values
>>> V, P
>>> and emits (V, -1#P)
>>>
>>> ALFileMapper: takes an input file with tab-separated values V, EL
>>> and emits (V, E#-1)
>>>
>>> In the reducer I want to emit
>>> (V, E#P)
>>>
>>> Thanks
>>> Ravikant
>>>
>>> On Mon, Jun 29, 2015 at 11:04 AM, Ravikant Dindokar <
>>> ravikant.iisc@gmail.com> wrote:
>>>
>>>> By custom key, did you mean some class object? Then no.
>>>>
>>>> I have two map methods, each with a different file as input. Both
>>>> map methods emit a *LongWritable* key. But in the stdout of the
>>>> container I can see the following,
>>>>
>>>> with key and value separated by ':'
>>>>
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1
>>>>
>>>> For key 391 the reducer is called twice: once for the value from the
>>>> first map and once for the value from the other map.
>>>>
>>>> In the map method I parse the string from the input file as a long and
>>>> then emit it as a LongWritable.
>>>>
>>>> Is there something I am missing when I use MultipleInputs
>>>> (org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?
>>>>
>>>> Thanks
>>>> Ravikant
>>>>
>>>> On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <ma...@gmail.com>
>>>> wrote:
>>>>
>>>>> As per MapReduce, it is not possible for two different reducers to
>>>>> get the same key.
>>>>> I think you have created some custom key type? If that is the case,
>>>>> then there may be an issue with the comparator.
>>>>>
>>>>> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
>>>>> ravikant.iisc@gmail.com> wrote:
>>>>>
>>>>>> Hi Hadoop user,
>>>>>>
>>>>>> I have two map classes processing two different input files. Both map
>>>>>> functions emit the same key/value format.
>>>>>>
>>>>>> But the reducer is called twice for the same key: once for the value
>>>>>> from the first map and once for the value from the other map.
>>>>>>
>>>>>> I am printing (key, value) pairs in the reducer:
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>>>>>
>>>>>> Both maps emit a LongWritable key and a Text value.
>>>>>>
>>>>>>
>>>>>> Any idea why this is happening?
>>>>>> Is there any way to get the hash values generated by Hadoop for the keys
>>>>>> emitted by the mapper?
>>>>>>
>>>>>> Thanks
>>>>>> Ravikant
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Harshit Mathur
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Harshit Mathur
>>
>
>

Re: Reducer called twice for same key

Posted by Ravikant Dindokar <ra...@gmail.com>.
Hi Harshit,

PFA

Thanks
Ravikant

On Mon, Jun 29, 2015 at 11:31 AM, Harshit Mathur <ma...@gmail.com>
wrote:

> Can you share PALReducer also?
>
> On Mon, Jun 29, 2015 at 11:21 AM, Ravikant Dindokar <
> ravikant.iisc@gmail.com> wrote:
>
>> Adding source code for more clarity
>>
>> Problem statement is simple
>>
>> PartitionFileMapper : it takes input file which has tab separated value V
>> , P
>> It emits (V, -1#P)
>>
>> ALFileMapper : It takes input file which has tab separated values V, EL
>> It emits (V, E#-1)
>>
>> in reducer I want to emit
>> (V,E#P)
>>
>> Thanks
>> Ravikant
>>
>> On Mon, Jun 29, 2015 at 11:04 AM, Ravikant Dindokar <
>> ravikant.iisc@gmail.com> wrote:
>>
>>> By custom key, did you meant some class object ? then no.
>>>
>>> I have two map methods each having different file as input. And both map
>>> methods emit *Longwritable key* type. But As in stdout of container
>>> file I can see,
>>>
>>> key & value separated by ':'
>>>
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1
>>>
>>> for key 391 reducer is called twice. , one for value from first map
>>> while one for value from other map.
>>>
>>> In map method I parse the string from input file as Long variable and
>>> then emit it as LongWritable.
>>>
>>> Is there something I am missing when I use multipleInput
>>> (org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?
>>>
>>> Thanks
>>> Ravikant
>>>
>>> On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <ma...@gmail.com>
>>> wrote:
>>>
>>>> As per Map Reduce, it is not possible that two different reducers will
>>>> get same keys.
>>>> I think you have created some custom key type? If that is the case then
>>>> there should be some issue with the comparator.
>>>>
>>>> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
>>>> ravikant.iisc@gmail.com> wrote:
>>>>
>>>>> Hi Hadoop user,
>>>>>
>>>>> I have two map classes processing two different input files. Both map
>>>>> functions have same key,value format to emit.
>>>>>
>>>>> But Reducer called twice for same key , one for value from first map
>>>>> while one for value from other map.
>>>>>
>>>>> I am printing (key ,value) pairs in reducer  :
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>>>>
>>>>> both maps emit Longwritable key and Text value.
>>>>>
>>>>>
>>>>> Any idea why this is happening?
>>>>> Is there any way to get hash values generated by hadoop for keys
>>>>> emitted by mapper?
>>>>>
>>>>> Thanks
>>>>> Ravikant
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harshit Mathur
>>>>
>>>
>>>
>>
>
>
> --
> Harshit Mathur
>

Re: Reducer called twice for same key

Posted by Harshit Mathur <ma...@gmail.com>.
Can you share PALReducer also?

On Mon, Jun 29, 2015 at 11:21 AM, Ravikant Dindokar <ravikant.iisc@gmail.com> wrote:

> Adding source code for more clarity
>
> Problem statement is simple
>
> PartitionFileMapper : it takes input file which has tab separated value V
> , P
> It emits (V, -1#P)
>
> ALFileMapper : It takes input file which has tab separated values V, EL
> It emits (V, E#-1)
>
> in reducer I want to emit
> (V,E#P)
>
> Thanks
> Ravikant
>
> On Mon, Jun 29, 2015 at 11:04 AM, Ravikant Dindokar <
> ravikant.iisc@gmail.com> wrote:
>
>> By custom key, did you meant some class object ? then no.
>>
>> I have two map methods each having different file as input. And both map
>> methods emit *Longwritable key* type. But As in stdout of container file
>> I can see,
>>
>> key & value separated by ':'
>>
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1
>>
>> for key 391 reducer is called twice. , one for value from first map while
>> one for value from other map.
>>
>> In map method I parse the string from input file as Long variable and
>> then emit it as LongWritable.
>>
>> Is there something I am missing when I use multipleInput
>> (org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?
>>
>> Thanks
>> Ravikant
>>
>> On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <ma...@gmail.com>
>> wrote:
>>
>>> As per Map Reduce, it is not possible that two different reducers will
>>> get same keys.
>>> I think you have created some custom key type? If that is the case then
>>> there should be some issue with the comparator.
>>>
>>> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
>>> ravikant.iisc@gmail.com> wrote:
>>>
>>>> Hi Hadoop user,
>>>>
>>>> I have two map classes processing two different input files. Both map
>>>> functions have same key,value format to emit.
>>>>
>>>> But Reducer called twice for same key , one for value from first map
>>>> while one for value from other map.
>>>>
>>>> I am printing (key ,value) pairs in reducer  :
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>>>
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>>>
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>>>
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>>>
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>>>
>>>> both maps emit Longwritable key and Text value.
>>>>
>>>>
>>>> Any idea why this is happening?
>>>> Is there any way to get hash values generated by hadoop for keys
>>>> emitted by mapper?
>>>>
>>>> Thanks
>>>> Ravikant
>>>>
>>>
>>>
>>>
>>> --
>>> Harshit Mathur
>>>
>>
>>
>


-- 
Harshit Mathur

Re: Reducer called twice for same key

Posted by Ravikant Dindokar <ra...@gmail.com>.
Adding source code for more clarity.

The problem statement is simple:

PartitionFileMapper: takes an input file with tab-separated values V, P.
It emits (V, -1#P).

ALFileMapper: takes an input file with tab-separated values V, EL.
It emits (V, E#-1).

In the reducer I want to emit
(V,E#P)
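
To make the intended merge concrete, here is a minimal sketch in plain Java
(no Hadoop dependency, so it is not the attached PALReducer). It assumes that
"-1" is the placeholder both mappers use for the field they do not know, and
that all values for a key V reach the reducer in a single reduce() call, which
is how MapReduce groups values by key. The class and method names
(TaggedJoinSketch, merge) are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class TaggedJoinSketch {
    // Assumed placeholder for the field a mapper does not know.
    static final String MISSING = "-1";

    // Merge all tagged values seen for one key V into a single "E#P" string.
    // Each value has the form "E#P", where either side may be the placeholder.
    static String merge(Iterable<String> values) {
        String eField = MISSING;
        String pField = MISSING;
        for (String value : values) {
            int sep = value.lastIndexOf('#');
            if (sep < 0) {
                continue; // malformed record; simply skipped in this sketch
            }
            String left = value.substring(0, sep);
            String right = value.substring(sep + 1);
            if (!MISSING.equals(left)) {
                eField = left;   // came from the ALFileMapper side
            }
            if (!MISSING.equals(right)) {
                pField = right;  // came from the PartitionFileMapper side
            }
        }
        return eField + "#" + pField;
    }

    public static void main(String[] args) {
        // The two values logged for key 391 in the container stdout.
        List<String> valuesForKey391 =
            Arrays.asList("-1#11", "3278620528725786624:5352454#-1");
        System.out.println("391\t" + merge(valuesForKey391));
    }
}
```

In a real job the same loop would sit inside reduce(), iterating over the
Iterable<Text> values. Printing inside that loop gives one line per value, so
two lines for the same key do not by themselves mean two reduce() calls.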

Thanks
Ravikant

On Mon, Jun 29, 2015 at 11:04 AM, Ravikant Dindokar <ravikant.iisc@gmail.com> wrote:

> By custom key, did you mean some class object? Then no.
>
> I have two map methods, each with a different file as input, and both map
> methods emit a *LongWritable* key. But in the stdout of the container file
> I can see (key and value separated by ':'):
>
> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1
>
> For key 391 the reducer is called twice: once for the value from the first
> map and once for the value from the other map.
>
> In the map method I parse the string from the input file as a long and then
> emit it as a LongWritable.
>
> Is there something I am missing when I use MultipleInputs
> (org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?
>
> Thanks
> Ravikant
>
> On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <ma...@gmail.com>
> wrote:
>
>> As per Map Reduce, it is not possible that two different reducers will
>> get same keys.
>> I think you have created some custom key type? If that is the case then
>> there should be some issue with the comparator.
>>
>> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
>> ravikant.iisc@gmail.com> wrote:
>>
>>> Hi Hadoop user,
>>>
>>> I have two map classes processing two different input files. Both map
>>> functions have same key,value format to emit.
>>>
>>> But Reducer called twice for same key , one for value from first map
>>> while one for value from other map.
>>>
>>> I am printing (key ,value) pairs in reducer  :
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>>
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>>
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>>
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>>
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>>
>>> both maps emit Longwritable key and Text value.
>>>
>>>
>>> Any idea why this is happening?
>>> Is there any way to get hash values generated by hadoop for keys emitted
>>> by mapper?
>>>
>>> Thanks
>>> Ravikant
>>>
>>
>>
>>
>> --
>> Harshit Mathur
>>
>
>

Re: Reducer called twice for same key

Posted by Ravikant Dindokar <ra...@gmail.com>.
By custom key, did you mean some class object? If so, no.

I have two map methods, each with a different file as input, and both map
methods emit a *LongWritable* key. But in the stdout of the container I
can see the following,

with key and value separated by ':':

./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1

For key 391 the reducer is called twice: once for the value from the first
map and once for the value from the other map.

In the map method I parse the string from the input file into a long and
then emit it as a LongWritable.

Is there something I am missing when I use MultipleInputs
(org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?

Thanks
Ravikant

On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <ma...@gmail.com>
wrote:

> As per Map Reduce, it is not possible that two different reducers will get
> same keys.
> I think you have created some custom key type? If that is the case then
> there should be some issue with the comparator.
>
> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
> ravikant.iisc@gmail.com> wrote:
>
>> Hi Hadoop user,
>>
>> I have two map classes processing two different input files. Both map
>> functions have same key,value format to emit.
>>
>> But Reducer called twice for same key , one for value from first map
>> while one for value from other map.
>>
>> I am printing (key ,value) pairs in reducer  :
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>
>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>
>> both maps emit Longwritable key and Text value.
>>
>>
>> Any idea why this is happening?
>> Is there any way to get hash values generated by hadoop for keys emitted
>> by mapper?
>>
>> Thanks
>> Ravikant
>>
>
>
>
> --
> Harshit Mathur
>

Re: Reducer called twice for same key

Posted by Harshit Mathur <ma...@gmail.com>.
In MapReduce, two different reducers cannot receive the same key.
Have you created a custom key type? If so, there is probably an issue
with the comparator.
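
To illustrate why equal keys land on the same reducer: Hadoop's default HashPartitioner picks the partition as (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, so the result depends only on the key's hash and the reducer count. The sketch below reproduces that arithmetic without any Hadoop dependency. The class name PartitionSketch and the reducer count of 4 are hypothetical, and Long.hashCode is used here as a stand-in for the key's own hashCode(), which is an assumption.

```java
public class PartitionSketch {

    /**
     * Same arithmetic as the default hash partitioner:
     * (hash & Integer.MAX_VALUE) % numReduceTasks.
     */
    static int partitionFor(long key, int numReduceTasks) {
        // Assumption: Long.hashCode stands in for the key's hashCode().
        int hash = Long.hashCode(key);
        // Masking with Integer.MAX_VALUE clears the sign bit, so the
        // result is always in [0, numReduceTasks).
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4; // hypothetical reducer count
        for (long key : new long[] {391L, 591L, 2391L, 3291L}) {
            System.out.println(key + " -> partition "
                    + partitionFor(key, reducers));
        }
    }
}
```

Because the function is deterministic, the same key emitted by either mapper maps to the same partition. With a custom key type, a hashCode() or comparator that treats equal keys as different is what breaks this.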

On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <ravikant.iisc@gmail.com
> wrote:

> Hi Hadoop user,
>
> I have two map classes processing two different input files. Both map
> functions have same key,value format to emit.
>
> But Reducer called twice for same key , one for value from first map while
> one for value from other map.
>
> I am printing (key ,value) pairs in reducer  :
> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>
> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>
> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>
> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>
> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>
> both maps emit Longwritable key and Text value.
>
>
> Any idea why this is happening?
> Is there any way to get hash values generated by hadoop for keys emitted
> by mapper?
>
> Thanks
> Ravikant
>



-- 
Harshit Mathur
