Posted to user@hama.apache.org by ik...@csd.auth.gr on 2013/09/28 19:32:28 UTC

Selective Aggregator execution

Hello there,

   I wanted to ask whether there is a way to choose after which
supersteps the aggregation phase is executed. My current way of cutting
some of the work is to store a status id in the vertex value; if it
doesn't match the status number the aggregator expects, the aggregator
skips the rest of its code.
   This partially solves the problem, but is there a way to skip the
execution entirely so there is no additional delay?

Thanks!
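
The workaround described above can be sketched roughly as follows. This is only an illustration, not Hama code: StatusValue, GuardedSumAggregator and wantedStatus are hypothetical names. It shows why the delay does not fully disappear: the aggregator is still invoked once per vertex, and the guard only lets it return early.

```java
// Hypothetical vertex value that carries a status id next to the payload.
final class StatusValue {
    final int status;
    final double payload;

    StatusValue(int status, double payload) {
        this.status = status;
        this.payload = payload;
    }
}

// Hypothetical aggregator (not the Hama API) that ignores values
// whose status id is not the one it is waiting for.
final class GuardedSumAggregator {
    private final int wantedStatus;
    private double sum = 0.0;
    private int invocations = 0;

    GuardedSumAggregator(int wantedStatus) {
        this.wantedStatus = wantedStatus;
    }

    void aggregate(StatusValue value) {
        invocations++; // the call itself still happens for every vertex
        if (value.status != wantedStatus) {
            return; // guard: skip the rest of the aggregation code
        }
        sum += value.payload;
    }

    double getSum() {
        return sum;
    }

    int getInvocations() {
        return invocations;
    }
}
```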


Re: Selective Aggregator execution

Posted by ik...@csd.auth.gr.
Great job Anastasis!

    I'll use it for sure and inform you if anything comes up! Keep going!






Re: Selective Aggregator execution

Posted by Anastasis Andronidis <an...@hotmail.com>.
Hello again,

the feature has been added and will be released in version 0.6.3. Thanks for joining!

If you find anything that is not working or can be improved, don't hesitate to send an email or a patch.

Cheers,
Anastasis

Re: Selective Aggregator execution

Posted by Ηλίας Καπουράνης <ik...@csd.auth.gr>.
If there is some sort of distributed cache, we could keep the list there.




Re: Selective Aggregator execution

Posted by Anastasis Andronidis <an...@hotmail.com>.
I quote from the JIRA issue:

> Ilias Kapouranis added a comment
> I don't think it would be much of an issue.
> 	• We have the List where we keep all the aggregators.
> 	• When executeAggregator(int aggrIndex) is called, we move the aggrIndex to a new List (say tempList) which keeps a pair (aggrIndex,aggrClass).
> 	• At the end of the superstep, if tempList is empty then all the aggregators will be executed, else only those which are in it.
> 	• When all aggregators have finished, we move the pairs from tempList to the main List and we put the aggregators to their previous indexes.
> Hope this helps.
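
The steps quoted above can be sketched for a single peer as follows. This is a minimal sketch under assumptions, not the HAMA-807 implementation: AggregatorSelection, register and finishSuperstep are hypothetical names, aggregators are modelled as plain Runnable objects, and instead of moving (aggrIndex, aggrClass) pairs between two lists it simply records the selected indices, which keeps every aggregator at its original index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical per-peer bookkeeping; none of these names are Hama classes.
final class AggregatorSelection {
    private final List<Runnable> aggregators = new ArrayList<>();
    private final SortedSet<Integer> selected = new TreeSet<>();

    // Registers an aggregator and returns its index in the main list.
    int register(Runnable aggregator) {
        aggregators.add(aggregator);
        return aggregators.size() - 1;
    }

    // Mirrors the executeAggregator(int aggrIndex) call from the comment:
    // marks one aggregator to run at the end of the current superstep.
    void executeAggregator(int aggrIndex) {
        if (aggrIndex < 0 || aggrIndex >= aggregators.size()) {
            throw new IndexOutOfBoundsException("no aggregator at " + aggrIndex);
        }
        selected.add(aggrIndex);
    }

    // End of superstep: run all aggregators if nothing was selected,
    // otherwise only the selected ones, then reset the selection.
    List<Integer> finishSuperstep() {
        List<Integer> executed = new ArrayList<>();
        if (selected.isEmpty()) {
            for (int i = 0; i < aggregators.size(); i++) {
                executed.add(i);
            }
        } else {
            executed.addAll(selected);
        }
        for (int index : executed) {
            aggregators.get(index).run();
        }
        selected.clear();
        return executed;
    }
}
```

Note that this covers only the local view of one peer; it says nothing about how the selection is agreed on across peers.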

I totally agree that this is how it works at a high level. The problem is that the implementation is not that simple.

Every node (say, a machine) running in the distributed environment has a BSP peer that runs as a local instance, and vertices execute their code inside these peers. So when a vertex asks for an aggregator not to run, that invocation happens on only one node. You then need to sync all the other nodes so that they skip the same aggregator, and in the end the master aggregator has to be skipped as well. This is a bit tricky, because it depends heavily on the implementation of the software you use (in this case Hama).

Of course, if your code is exactly the same in every vertex, every peer will make its own local call to skip its aggregators and no sync is needed. But since that is not always the case, we need to plan for the first scenario as well.
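
As a rough illustration of the syncing problem: each peer only sees the skip requests made by its own vertices, so the per-peer sets have to be combined before any peer, or the master, can decide what to skip. The sketch below assumes the rule is a simple union (an aggregator is skipped everywhere if any peer asked for it). SkipRequestMerge is a hypothetical name, and how the sets would actually travel between Hama peers is deliberately left out.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Hama: combines the aggregator indexes
// that each peer was asked to skip during one superstep.
final class SkipRequestMerge {
    // Assumed rule: union of all local requests.
    static Set<Integer> merge(List<Set<Integer>> perPeerRequests) {
        Set<Integer> merged = new TreeSet<>();
        for (Set<Integer> local : perPeerRequests) {
            merged.addAll(local);
        }
        return merged;
    }
}
```

If every vertex runs the same code, all per-peer sets are already identical and the merge changes nothing, which matches the no-sync case above.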

If you have any questions or something is not clear, please reply.

Cheers,
Anastasis

Re: Selective Aggregator execution

Posted by Anastasis Andronidis <an...@hotmail.com>.
Hello Ilias,

you can see the progress of the feature here: https://issues.apache.org/jira/browse/HAMA-807

I would really appreciate any review or comments :)

Thanks,
Anastasis



Re: Selective Aggregator execution

Posted by ik...@csd.auth.gr.
Sure! Then I'll be waiting for news!






Re: Selective Aggregator execution

Posted by Anastasis Andronidis <an...@hotmail.com>.
Hey again,

I was thinking that we can rush this feature. I will create a new JIRA issue (probably tomorrow) just for this, and you can comment there so we can work on it. Maybe we can add it to the next release :)

Cheers,
Anastasis



Re: Selective Aggregator execution

Posted by ik...@csd.auth.gr.
Hey Anastasis, thanks for the reply. If there is any way I can help,
just contact me!







Re: Selective Aggregator execution

Posted by Anastasis Andronidis <an...@hotmail.com>.
Hello!

Currently, there is no way to skip the aggregation step. We do plan to develop such functionality in the next releases, as we refactor the aggregators and the Graph API in general.

Your approach seems to be the only solution at the moment.

Cheers,
Anastasis
