Posted to dev@mahout.apache.org by deneche abdelhakim <a_...@yahoo.fr> on 2008/04/11 10:48:34 UTC

About the Mahout.GA Comment

Hi Grant, 

You wrote the following comment on my GSoC proposal:

> Could someone w/ a little more GA knowledge comment on the use of 
> WatchMaker?  What I wonder is if it is possible to distribute some of the 
> watchmaker functionality? 

Do you want to know whether there are other ways to distribute a GA?

> May not be needed for this proposal, but I am curious as to how much work is 
> done in Watchmaker vs. the actual fitness function.

I don't understand...

Abdel Hakim Deneche

       
---------------------------------
 Sent with Yahoo! Mail.
A smarter mailbox.

RE : About the Mahout.GA Comment

Posted by deneche abdelhakim <a_...@yahoo.fr>.
I don't know the exact term, but maybe I should have said "computing process": each processor (or computing node) can run many "computing processes".

Ted Dunning <td...@veoh.com> wrote:
How is "computing node" not a processor?





       

Re: RE : About the Mahout.GA Comment

Posted by Ted Dunning <td...@veoh.com>.
How is "computing node" not a processor?


On 4/12/08 9:26 PM, "deneche abdelhakim" <a_...@yahoo.fr> wrote:

> The number of running algorithms doesn't depend on the number of processors.
> In fact, this kind of algorithm is used even when there is only a single
> processor, because of its good search properties. You can think of it as a
> single big GA with a distributed population, in which each individual can
> have its own set of operators.
> 
> Abdel Hakim
> 


RE : About the Mahout.GA Comment

Posted by deneche abdelhakim <a_...@yahoo.fr>.
The number of running algorithms doesn't depend on the number of processors. In fact, this kind of algorithm is used even when there is only a single processor, because of its good search properties. You can think of it as a single big GA with a distributed population, in which each individual can have its own set of operators.

Abdel Hakim

Ted Dunning wrote:

> I think it is a very bad idea to tie the algorithm to the number of
> processors being used in this way.  A program should produce identical
> results on any machine, subject only to PRNG seeding issues.
>
> On 4/11/08 8:52 PM, "deneche abdelhakim" wrote:
>
>> And there are other reasons to distribute a GA: for example, you may want to
>> run a different version of the algorithm (a different population and perhaps
>> a different set of operators) on each computing node, with some individuals
>> migrating from one node to another from time to time. This kind of
>> distribution has proven to be more effective because it searches a larger
>> space.



       

Re: About the Mahout.GA Comment

Posted by Ted Dunning <td...@veoh.com>.

I think it is a very bad idea to tie the algorithm to the number of
processors being used in this way.  A program should produce identical
results on any machine, subject only to PRNG seeding issues.
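
One way to get results that are independent of the hardware is to fix the number of logical sub-populations and give each one its own seeded PRNG, so the outcome is the same however many workers run them. The following is a minimal Python sketch under those assumptions. `evolve_island`, `run`, the toy bit-counting fitness, and all parameter values are hypothetical and are not taken from Watchmaker or Mahout. Migration between sub-populations is left out to keep the example short.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def evolve_island(island_seed, generations=30, pop_size=10, genome_len=16):
    # Each logical island owns a PRNG seeded from its own id, so its result
    # does not depend on which worker runs it or in what order.
    rng = random.Random(island_seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Toy scheme (an assumption): clone the best individual with mutation.
        parent = max(pop, key=sum)
        pop = [[b ^ 1 if rng.random() < 0.05 else b for b in parent]
               for _ in range(pop_size)]
    return max(sum(ind) for ind in pop)


def run(num_islands, num_workers, base_seed=1234):
    # The number of islands is an algorithm parameter; the number of
    # workers only controls how the islands are scheduled.
    seeds = [base_seed + i for i in range(num_islands)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(evolve_island, seeds))
```

With this layout, `run(8, 1)` and `run(8, 4)` return the same list, because only the seeds and the island count influence the search.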


On 4/11/08 8:52 PM, "deneche abdelhakim" <a_...@yahoo.fr> wrote:

> And there are other reasons to distribute a GA: for example, you may want to
> run a different version of the algorithm (a different population and perhaps a
> different set of operators) on each computing node, with some individuals
> migrating from one node to another from time to time. This kind of distribution
> has proven to be more effective because it searches a larger space.


Re: About the Mahout.GA Comment

Posted by deneche abdelhakim <a_...@yahoo.fr>.
Hi, Grant

> I don't have a sense for how long the fitness function takes versus
> the other operations that WatchMaker does.  In other words, do the
> other pieces of running a GA need to be distributed?  Just
> curious, and I don't think it is a big deal as far as your proposal goes.

This is problem dependent. Generally the fitness function is the most expensive part, but some problems have very complex mutation operators (or other operators).
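
For a given problem, the split can be measured directly. Below is a minimal Python sketch that times one generation in two buckets: fitness evaluation, and everything else (selection and mutation). The function names, the toy fitness and mutation, and the truncation-selection scheme are all assumptions made for illustration; this is not Watchmaker code.

```python
import random
import time


def toy_fitness(individual):
    # Hypothetical fitness: number of 1 bits.
    return sum(individual)


def toy_mutate(individual, rng):
    # Hypothetical operator: flip one randomly chosen bit.
    child = list(individual)
    i = rng.randrange(len(child))
    child[i] ^= 1
    return child


def profile_generation(population, fitness, mutate, rng):
    timings = {"fitness": 0.0, "operators": 0.0}

    # Time spent evaluating every individual.
    start = time.perf_counter()
    scored = [(fitness(ind), ind) for ind in population]
    timings["fitness"] += time.perf_counter() - start

    # Time spent on selection (keep the best half) and mutation.
    start = time.perf_counter()
    scored.sort(key=lambda pair: pair[0], reverse=True)
    parents = [ind for _, ind in scored[: max(1, len(scored) // 2)]]
    children = [mutate(rng.choice(parents), rng) for _ in population]
    timings["operators"] += time.perf_counter() - start

    return children, timings
```

Comparing the two entries of `timings` over a few generations gives a rough idea of which part is worth distributing for that problem.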

And there are other reasons to distribute a GA: for example, you may want to run a different version of the algorithm (a different population and perhaps a different set of operators) on each computing node, with some individuals migrating from one node to another from time to time. This kind of distribution has proven to be more effective because it searches a larger space.
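
This arrangement is often called an island model. As a rough illustration, here is a minimal Python sketch in which each island has its own population and its own mutation rate, and the best individual of each island periodically migrates to the next one. The OneMax fitness, the ring migration scheme, and every name and parameter value are assumptions made for the example; nothing here is Watchmaker or Mahout code.

```python
import random


def fitness(bits):
    # Toy OneMax fitness (an assumption): count of 1 bits.
    return sum(bits)


def mutate(bits, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def step(population, rate, rng):
    # One generation: binary tournament selection followed by mutation.
    next_gen = []
    for _ in population:
        a, b = rng.choice(population), rng.choice(population)
        parent = a if fitness(a) >= fitness(b) else b
        next_gen.append(mutate(parent, rate, rng))
    return next_gen


def run_islands(num_islands=4, pop_size=20, genome_len=32,
                generations=40, migrate_every=5, seed=42):
    rng = random.Random(seed)
    # Each island gets its own population and its own mutation rate.
    islands = [[[rng.randint(0, 1) for _ in range(genome_len)]
                for _ in range(pop_size)] for _ in range(num_islands)]
    rates = [0.01 * (i + 1) for i in range(num_islands)]
    for gen in range(1, generations + 1):
        islands = [step(pop, rate, rng) for pop, rate in zip(islands, rates)]
        if gen % migrate_every == 0:
            # Ring migration: each island's best replaces the next island's worst.
            bests = [max(pop, key=fitness) for pop in islands]
            for i, best in enumerate(bests):
                target = islands[(i + 1) % num_islands]
                worst = min(range(pop_size), key=lambda j: fitness(target[j]))
                target[worst] = list(best)
    # Best fitness found across all islands in the final generation.
    return max(fitness(ind) for pop in islands for ind in pop)
```

The islands here run in one process for simplicity; in a distributed setting each island would live on its own node and only the migrants would cross the network.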

Abdel Hakim

       

Re: About the Mahout.GA Comment

Posted by Ted Dunning <td...@veoh.com>.
I can't comment on Watchmaker, but in the EP implementation that I posted,
the control aspects are easily parallelized using multiple reduces.
Moreover, the cost of the framework is just the cost of keeping a priority
queue of the top items.  This can be implemented using a sort on all
candidates (as I did, for simplicity) or as a size-limited heap or priority
queue.  In any case, the overhead of the framework is really small.  This is
even more true in real applications of EPs, where the evaluation often
involves some truly non-trivial computation.
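
The two ways of keeping the top items can be sketched as follows. This is a minimal Python illustration, not the EP implementation referred to above; the function names and the `score` callback are assumptions. The sort version costs O(n log n), while the bounded heap keeps at most k entries and costs O(n log k).

```python
import heapq


def top_k_by_sort(candidates, k, score):
    # Simple version: sort everything, keep the first k.
    return sorted(candidates, key=score, reverse=True)[:k]


def top_k_by_heap(candidates, k, score):
    # Size-limited min-heap: the root is always the weakest survivor.
    if k <= 0:
        return []
    heap = []  # entries are (score, index, candidate)
    for i, c in enumerate(candidates):
        entry = (score(c), i, c)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            # New candidate beats the weakest survivor, so swap it in.
            heapq.heapreplace(heap, entry)
    # Return survivors from best to worst.
    return [c for _, _, c in sorted(heap, reverse=True)]
```

The index in each heap entry only breaks ties, so candidates themselves never need to be comparable. When scores are tied, the two functions may order or choose the tied candidates differently.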


On 4/11/08 4:38 AM, "Grant Ingersoll" <gs...@apache.org> wrote:

> I don't have a sense for how long the fitness function takes versus
> the other operations that WatchMaker does.  In other words, do the
> other pieces of running a GA need to be distributed?  Just
> curious, and I don't think it is a big deal as far as your proposal goes.
> 
> -Grant
> 
> On Apr 11, 2008, at 10:48 AM, deneche abdelhakim wrote:
> 
>> Hi Grant,
>> 
>> You wrote the following comment on my GSoC proposal:
>> 
>>> Could someone w/ a little more GA knowledge comment on the use of
>>> WatchMaker?  What I wonder is if it is possible to distribute some of
>>> the watchmaker functionality?
>> 
>> Do you want to know whether there are other ways to distribute a GA?
>> 
>>> May not be needed for this proposal, but I am curious as to how much
>>> work is done in Watchmaker vs. the actual fitness function.
>> 
>> I don't understand...
>> 
>> Abdel Hakim Deneche
> 


Re: About the Mahout.GA Comment

Posted by Grant Ingersoll <gs...@apache.org>.
I don't have a sense for how long the fitness function takes versus
the other operations that WatchMaker does.  In other words, do the
other pieces of running a GA need to be distributed?  Just
curious, and I don't think it is a big deal as far as your proposal goes.

-Grant

On Apr 11, 2008, at 10:48 AM, deneche abdelhakim wrote:

> Hi Grant,
>
> You wrote the following comment on my GSoC proposal:
>
>> Could someone w/ a little more GA knowledge comment on the use of
>> WatchMaker?  What I wonder is if it is possible to distribute some of
>> the watchmaker functionality?
>
> Do you want to know whether there are other ways to distribute a GA?
>
>> May not be needed for this proposal, but I am curious as to how much
>> work is done in Watchmaker vs. the actual fitness function.
>
> I don't understand...
>
> Abdel Hakim Deneche