Posted to user@hama.apache.org by Diptanshu Kakwani <di...@gmail.com> on 2017/03/10 10:55:39 UTC

Apache Hama scalability

Hey everyone, I am trying to scale my application to larger datasets, but
memory has become a bottleneck. I wanted to know whether there are
parameters that can be changed to reduce the framework's memory
footprint.

For example, I noticed this comment in AbstractMessageManager.java:

/**
 * Returns a new queue implementation based on what was configured. If nothing
 * has been configured for "hama.messenger.queue.class" then the
 * {@link MemoryQueue} is used. If you have scalability issues, then better
 * use {@link DiskQueue}.
 *
 * @return a <b>new</b> queue implementation.
 */


But I couldn't find DiskQueue in Hama. Why has it been removed from the
latest version of Hama?


-- 
Diptanshu Kakwani
https://keybase.io/dipkakwani

Re: Apache Hama scalability

Posted by "Edward J. Yoon" <ed...@apache.org>.
Unfortunately, we don't support a disk-based queue at the moment.
Please consider optimizing your algorithm or your messages. For graph
job messages, we serialize/deserialize objects to reduce the memory
overhead. You can do something like this.

1. https://github.com/apache/hama/blob/master/graph/src/main/java/org/apache/hama/graph/GraphJobMessage.java
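
As a rough illustration of the serialization idea for a plain BSP job, the
sketch below packs many small (id, value) messages into a single byte array
instead of keeping one Java object per message. It is not taken from
GraphJobMessage; the class CompactMessageBuffer, its Visitor interface, and
the (int, double) message layout are hypothetical, and it uses only the JDK so
it can be run on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, NOT part of Hama: stores (targetId, value) messages
// as raw bytes so that each message costs a fixed 12 bytes in the buffer.
final class CompactMessageBuffer {

    // Callback used when reading messages back out of the byte array.
    interface Visitor {
        void visit(int targetId, double value);
    }

    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);

    // Appends one message: 4 bytes for the int id, 8 bytes for the double.
    void add(int targetId, double value) {
        try {
            out.writeInt(targetId);
            out.writeDouble(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns a copy of everything written so far.
    byte[] toBytes() {
        try {
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Streams the messages back one at a time, without materializing a
    // list of message objects. Returns the number of messages read.
    static int decode(byte[] data, Visitor visitor) {
        int count = 0;
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int targetId = in.readInt();
                double value = in.readDouble();
                visitor.visit(targetId, value);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        CompactMessageBuffer buffer = new CompactMessageBuffer();
        buffer.add(1, 0.5);
        buffer.add(2, 1.5);
        buffer.add(3, 2.0);

        byte[] packed = buffer.toBytes();
        System.out.println("bytes used: " + packed.length); // prints 36

        int n = decode(packed, (id, v) ->
            System.out.println("message for " + id + ": " + v));
        System.out.println("messages read: " + n); // prints 3
    }
}
```

In a real job the packed bytes would still have to be wrapped in whatever
message type the job sends, and whether this actually helps depends on the
message sizes and on how the messages are consumed, so it is worth measuring
before and after.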

On Tue, Mar 14, 2017 at 5:52 PM, Diptanshu Kakwani <di...@gmail.com> wrote:
> No, it is a normal BSP Job.
>
> --
> Diptanshu Kakwani
> https://keybase.io/dipkakwani
>
> On Tue, Mar 14, 2017 at 5:44 AM, Edward J. Yoon <ed...@apache.org>
> wrote:
>
>> Hi, is your job a graph job?
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>>



-- 
Best Regards, Edward J. Yoon

Re: Apache Hama scalability

Posted by Diptanshu Kakwani <di...@gmail.com>.
No, it is a normal BSP Job.

-- 
Diptanshu Kakwani
https://keybase.io/dipkakwani

On Tue, Mar 14, 2017 at 5:44 AM, Edward J. Yoon <ed...@apache.org>
wrote:

> Hi, is your job a graph job?
>
>
>
> --
> Best Regards, Edward J. Yoon
>

Re: Apache Hama scalability

Posted by "Edward J. Yoon" <ed...@apache.org>.
Hi, is your job a graph job?




-- 
Best Regards, Edward J. Yoon