Posted to mapreduce-user@hadoop.apache.org by Aseem Anand <as...@gmail.com> on 2012/10/15 18:56:11 UTC

PriorityQueueWritable

Hi,
Is anyone familiar with a PriorityQueueWritable that can be used to pass data
from mappers to reducers?

Regards,
Aseem

Re: PriorityQueueWritable

Posted by Chris Nauroth <cn...@hortonworks.com>.
One more advantage of leaning on the shuffle/sort is that your sorted data
can grow beyond the size of memory.  If you pack everything into a sorted
ArrayWritable instead, you risk the list growing too large to fit in a single
task's heap.
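
To make that concrete, here is a rough, untested sketch of a reducer that
simply streams through its input in the order the shuffle delivers it.  The
key and value types are placeholders: it assumes the map output key is a
DoubleWritable holding the negated priority (so higher priorities sort
first) and the values are Text items.

    import java.io.IOException;

    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Sketch only: the shuffle has already sorted the keys, so the reducer can
    // consume items in priority order without ever buffering them in memory.
    public class PriorityOrderReducer
        extends Reducer<DoubleWritable, Text, DoubleWritable, Text> {

      @Override
      protected void reduce(DoubleWritable negatedPriority, Iterable<Text> items,
          Context context) throws IOException, InterruptedException {
        for (Text item : items) {
          // Process each item here; nothing accumulates in a big in-memory list.
          context.write(new DoubleWritable(-negatedPriority.get()), item);
        }
      }
    }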

Thanks,
--Chris

On Mon, Oct 15, 2012 at 11:37 AM, Chris Nauroth <cn...@hortonworks.com> wrote:

> I think it would work, but I'm wondering if it would be easier for your
> application to restructure the keys emitted from the mapper tasks so that
> you can take advantage of the sorting inherently done during the shuffle.
>
> For each reduce task, your reducer code will receive keys emitted from
> mappers in sorted order.  Therefore, if the keys emitted from your mapper
> contain the item's priority, then the shuffle would provide the sort order
> that you need.  This might lead you down the path of writing a custom
> WritableComparable to use as the map output key, but this is usually pretty
> trivial.
>
> Also, keep in mind that if you run multiple reduce tasks, then each
> reducer receives a subset of the keys emitted from the mapper.  Depending
> on your application logic, this may or may not be a problem.
>
> Thanks,
> --Chris
>
>
> On Mon, Oct 15, 2012 at 11:07 AM, Aseem Anand <as...@gmail.com> wrote:
>
>> Hi Chris,
>> I had a few PriorityQueue's at the mappers which I wished to send to some
>> reducers. After this each reducer(receiving PriorityQueues from each
>> mapper) would perform some operations on these by removing the top and
>> hence accessing the elements in sorted order(which is very essential to my
>> application). Even I thought of pushing them in an ArrayWritable but was
>> wondering if there would be an existing implementation of PriorityQueue.
>> Would it be advisable to insert elements into ArrayWritable in sorted
>> order and reconstruction of merged PriorityQueues at the other end now ?
>>
>> Thanks,
>> Aseem
>>
>>
>> On Mon, Oct 15, 2012 at 11:07 PM, Chris Nauroth <cnauroth@hortonworks.com> wrote:
>>
>>> Hello Aseem,
>>>
>>> I'm aware of nothing in Hadoop or related projects that provides a
>>> PriorityQueueWritable.  You could achieve this by taking some existing
>>> priority queue class and subclassing it or wrapping it to implement the
>>> Writable.write and Writable.readFields methods.
>>>
>>> If you could give us some additional context around what you want to
>>> solve, then we might be able to offer some other suggestions.  For example,
>>> depending on the problem, maybe you could sort values and wrap them in
>>> ArrayWritable (which already exists), which would save you the trouble of
>>> coding your own custom Writable.
>>>
>>> Thank you,
>>> --Chris
>>>
>>> On Mon, Oct 15, 2012 at 9:56 AM, Aseem Anand <as...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> Is anyone familiar with a PriorityQueueWritable to be used to pass data
>>>> from mapper to reducers ?
>>>>
>>>> Regards,
>>>> Aseem
>>>>
>>>
>>>
>>
>

Re: PriorityQueueWritable

Posted by Chris Nauroth <cn...@hortonworks.com>.
I think it would work, but I'm wondering if it would be easier for your
application to restructure the keys emitted from the mapper tasks so that
you can take advantage of the sorting inherently done during the shuffle.

For each reduce task, your reducer code will receive keys emitted from
mappers in sorted order.  Therefore, if the keys emitted from your mapper
contain the item's priority, then the shuffle would provide the sort order
that you need.  This might lead you down the path of writing a custom
WritableComparable to use as the map output key, but this is usually pretty
trivial.
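
For instance, a bare-bones map output key carrying the priority might look
roughly like this (untested sketch; the field names and the descending sort
are just illustrative):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.WritableComparable;

    // Illustrative key type: sorts highest priority first, with the item id as
    // a tie-breaker so that equal priorities still compare deterministically.
    public class PriorityKey implements WritableComparable<PriorityKey> {

      private double priority;
      private long itemId;

      public PriorityKey() {
      }

      public PriorityKey(double priority, long itemId) {
        this.priority = priority;
        this.itemId = itemId;
      }

      public void write(DataOutput out) throws IOException {
        out.writeDouble(priority);
        out.writeLong(itemId);
      }

      public void readFields(DataInput in) throws IOException {
        priority = in.readDouble();
        itemId = in.readLong();
      }

      public int compareTo(PriorityKey other) {
        int cmp = Double.compare(other.priority, this.priority); // descending
        return cmp != 0 ? cmp : Long.compare(this.itemId, other.itemId);
      }

      public int hashCode() {
        long bits = Double.doubleToLongBits(priority) ^ itemId;
        return (int) (bits ^ (bits >>> 32));
      }

      public boolean equals(Object o) {
        if (!(o instanceof PriorityKey)) {
          return false;
        }
        PriorityKey other = (PriorityKey) o;
        return priority == other.priority && itemId == other.itemId;
      }
    }

The driver would then just declare it with
job.setMapOutputKeyClass(PriorityKey.class).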

Also, keep in mind that if you run multiple reduce tasks, then each reducer
receives only the subset of keys assigned to it by the partitioner, so the
global sort order is split across reducers.  Depending on your application
logic, this may or may not be a problem.

Thanks,
--Chris


On Mon, Oct 15, 2012 at 11:07 AM, Aseem Anand <as...@gmail.com> wrote:

> Hi Chris,
> I had a few PriorityQueue's at the mappers which I wished to send to some
> reducers. After this each reducer(receiving PriorityQueues from each
> mapper) would perform some operations on these by removing the top and
> hence accessing the elements in sorted order(which is very essential to my
> application). Even I thought of pushing them in an ArrayWritable but was
> wondering if there would be an existing implementation of PriorityQueue.
> Would it be advisable to insert elements into ArrayWritable in sorted
> order and reconstruction of merged PriorityQueues at the other end now ?
>
> Thanks,
> Aseem
>
>
> On Mon, Oct 15, 2012 at 11:07 PM, Chris Nauroth <cn...@hortonworks.com> wrote:
>
>> Hello Aseem,
>>
>> I'm aware of nothing in Hadoop or related projects that provides a
>> PriorityQueueWritable.  You could achieve this by taking some existing
>> priority queue class and subclassing it or wrapping it to implement the
>> Writable.write and Writable.readFields methods.
>>
>> If you could give us some additional context around what you want to
>> solve, then we might be able to offer some other suggestions.  For example,
>> depending on the problem, maybe you could sort values and wrap them in
>> ArrayWritable (which already exists), which would save you the trouble of
>> coding your own custom Writable.
>>
>> Thank you,
>> --Chris
>>
>> On Mon, Oct 15, 2012 at 9:56 AM, Aseem Anand <as...@gmail.com> wrote:
>>
>>> Hi,
>>> Is anyone familiar with a PriorityQueueWritable to be used to pass data
>>> from mapper to reducers ?
>>>
>>> Regards,
>>> Aseem
>>>
>>
>>
>

Re: PriorityQueueWritable

Posted by Aseem Anand <as...@gmail.com>.
Hi Chris,
I have a few PriorityQueues at the mappers that I wish to send to some
reducers. Each reducer (receiving a PriorityQueue from each mapper) would
then perform some operations on them by repeatedly removing the top element,
which gives it the elements in sorted order (essential to my application).
I also thought of pushing them into an ArrayWritable, but I was wondering
whether an existing PriorityQueue implementation of Writable is available.
Would it be advisable to insert the elements into an ArrayWritable in sorted
order and reconstruct the merged PriorityQueues at the other end?
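
Concretely, the reconstruction I have in mind at the reducer would be roughly
this (just plain-Java pseudologic with placeholder types):

    import java.util.List;
    import java.util.PriorityQueue;

    // Sketch of "reconstruction at the other end": each incoming list holds the
    // sorted contents of one mapper's queue; re-inserting everything into a
    // fresh PriorityQueue yields a single merged queue to pop from.
    public final class QueueMerge {

      public static <T extends Comparable<T>> PriorityQueue<T> merge(List<List<T>> sortedParts) {
        PriorityQueue<T> merged = new PriorityQueue<T>();
        for (List<T> part : sortedParts) {
          merged.addAll(part); // the heap property holds regardless of insertion order
        }
        return merged;
      }
    }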

Thanks,
Aseem

On Mon, Oct 15, 2012 at 11:07 PM, Chris Nauroth <cn...@hortonworks.com> wrote:

> Hello Aseem,
>
> I'm aware of nothing in Hadoop or related projects that provides a
> PriorityQueueWritable.  You could achieve this by taking some existing
> priority queue class and subclassing it or wrapping it to implement the
> Writable.write and Writable.readFields methods.
>
> If you could give us some additional context around what you want to
> solve, then we might be able to offer some other suggestions.  For example,
> depending on the problem, maybe you could sort values and wrap them in
> ArrayWritable (which already exists), which would save you the trouble of
> coding your own custom Writable.
>
> Thank you,
> --Chris
>
> On Mon, Oct 15, 2012 at 9:56 AM, Aseem Anand <as...@gmail.com> wrote:
>
>> Hi,
>> Is anyone familiar with a PriorityQueueWritable to be used to pass data
>> from mapper to reducers ?
>>
>> Regards,
>> Aseem
>>
>
>

Re: PriorityQueueWritable

Posted by Chris Nauroth <cn...@hortonworks.com>.
Hello Aseem,

I'm aware of nothing in Hadoop or related projects that provides a
PriorityQueueWritable.  You could achieve this by taking some existing
priority queue class and subclassing it or wrapping it to implement the
Writable.write and Writable.readFields methods.
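
To sketch the wrapping approach (untested, and assuming Text elements purely
as a placeholder):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.PriorityQueue;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.io.WritableUtils;

    // Hypothetical wrapper around java.util.PriorityQueue. The iteration order
    // of a PriorityQueue is unspecified, which is fine here: the heap is rebuilt
    // by re-inserting every element in readFields.
    public class PriorityQueueWritable implements Writable {

      private final PriorityQueue<Text> queue = new PriorityQueue<Text>();

      public PriorityQueue<Text> get() {
        return queue;
      }

      public void write(DataOutput out) throws IOException {
        WritableUtils.writeVInt(out, queue.size());
        for (Text element : queue) {
          element.write(out);
        }
      }

      public void readFields(DataInput in) throws IOException {
        queue.clear();
        int size = WritableUtils.readVInt(in);
        for (int i = 0; i < size; i++) {
          Text element = new Text();
          element.readFields(in);
          queue.add(element);
        }
      }
    }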

If you could give us some additional context around what you want to solve,
then we might be able to offer some other suggestions.  For example,
depending on the problem, maybe you could sort values and wrap them in
ArrayWritable (which already exists), which would save you the trouble of
coding your own custom Writable.
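
As a rough illustration of that route (again just a sketch): ArrayWritable
needs to know its element class, so the usual pattern is a tiny subclass, and
the sorting happens before you wrap the array:

    import java.util.Collections;
    import java.util.List;

    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Text;

    // ArrayWritable cannot deserialize without knowing the element class, so a
    // small subclass like this is the standard idiom.
    public class TextArrayWritable extends ArrayWritable {

      public TextArrayWritable() {
        super(Text.class);
      }

      // Convenience factory: sort the items, then wrap them.
      public static TextArrayWritable fromSorted(List<Text> items) {
        Collections.sort(items);
        TextArrayWritable wrapped = new TextArrayWritable();
        wrapped.set(items.toArray(new Text[items.size()]));
        return wrapped;
      }
    }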

Thank you,
--Chris

On Mon, Oct 15, 2012 at 9:56 AM, Aseem Anand <as...@gmail.com> wrote:

> Hi,
> Is anyone familiar with a PriorityQueueWritable to be used to pass data
> from mapper to reducers ?
>
> Regards,
> Aseem
>
