Posted to dev@commons.apache.org by Семенов Кирилл <se...@gmail.com> on 2016/03/17 17:11:16 UTC

[math] questions on GA

Hi,

I've been using the genetic algorithm (GA) code for some pet projects, and
I'd like to shed some light on a number of topics.

1. Am I correct in thinking that the GA currently runs in a single thread?
In that case, were there any discussions on the subject (I didn't find any
in a quick check of Jira)? If not, could you provide some API reference? The
subject is important because the ability to be distributed is one of the
key features of a GA.

2. Were there talks about implementing a pool for chromosomes? I found the
enhancement proposal https://issues.apache.org/jira/browse/MATH-1219,
which aims to solve the same problem: an enormous number of chromosomes
is created in each generation, and after each generation they hang in
the heap waiting for GC. An object pool could also be implemented, supposing
that a chromosome would consist of List<? extends PooledObject>.

3. The examples that use the getRepresentation method of
AbstractListChromosome seem misleading, because getRepresentation is a
protected method, so classes that implement MutationPolicy/CrossoverPolicy
can't use it. For rapid development one could add a public overriding
method, but couldn't it be declared public in AbstractListChromosome? To
write a policy for one particular chromosome, one must override
getRepresentation in CustomChromosome. But anyone who wants to write a
common genetic policy (e.g., some reordering crossover) runs into the
obstacle mentioned.
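
The workaround from point 3 can be sketched as follows. This is only an
illustration and does not use Commons Math: ListChromosome is a hypothetical
stand-in for AbstractListChromosome, and reverse is a hypothetical policy.
Because everything sits in one file (one package), the protected accessor
would be reachable here anyway; the access problem only shows up when the
policy lives in a different package from the chromosome base class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class RepresentationAccessSketch {

    /** Hypothetical stand-in for AbstractListChromosome: the accessor is protected. */
    abstract static class ListChromosome<T> {
        private final List<T> representation;

        ListChromosome(List<T> representation) {
            // Defensive copy, exposed as a read-only view.
            this.representation =
                Collections.unmodifiableList(new ArrayList<>(representation));
        }

        protected List<T> getRepresentation() {
            return representation;
        }
    }

    /** User chromosome that widens the accessor to public (a legal override in Java). */
    static final class CustomChromosome extends ListChromosome<Integer> {
        CustomChromosome(List<Integer> genes) {
            super(genes);
        }

        @Override
        public List<Integer> getRepresentation() {
            return super.getRepresentation();
        }
    }

    /**
     * Hypothetical reordering policy. From another package it could only be
     * written against CustomChromosome, not against the generic base class.
     */
    static CustomChromosome reverse(CustomChromosome original) {
        List<Integer> genes = new ArrayList<>(original.getRepresentation());
        Collections.reverse(genes);
        return new CustomChromosome(genes);
    }

    public static void main(String[] args) {
        CustomChromosome c = new CustomChromosome(Arrays.asList(1, 2, 3));
        System.out.println(reverse(c).getRepresentation()); // prints [3, 2, 1]
    }
}
```

The policy is tied to CustomChromosome, which is why a generic policy for
all list-based chromosomes cannot be written this way.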

I'd like to create Jira tasks for these. I just want to make sure that
these topics would be useful, and to gather some information and other
developers' opinions on the matter.


-- 
Regards,
Kirill

Re: [math] questions on GA

Posted by Thomas Neidhart <th...@gmail.com>.
On 03/18/2016 02:12 PM, Семенов Кирилл wrote:
>>
>> In effect, some time ago we evoked the possibility to drop GA support
>> altogether since the code seemed little used and a lot of work was
>> anticipated for making it useful beyond demo applications.
> 
> 
> It is rather surprising, that ASF doesn't have any complex and extensive AI
> instruments. I was looking for ANN some time ago and found only Mahout,
> which seemed rather idle.
> But Commons projects, IMO, are a good place for GA, while there is no
> activity on AI merging/separating.

Take a look at http://jenetics.io, which already contains everything you
were asking for and is a very mature and well-designed library. On top
of that, it is also under the Apache license and actually uses CM for
testing.

Thomas



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [math] questions on GA

Posted by Gilles <gi...@harfang.homelinux.org>.
Hi.

On Fri, 18 Mar 2016 16:12:11 +0300, Семенов Кирилл wrote:
>>
>> In effect, some time ago we evoked the possibility to drop GA support
>> altogether since the code seemed little used and a lot of work was
>> anticipated for making it useful beyond demo applications.
>
>
> It is rather surprising that the ASF doesn't have any complex and extensive
> AI instruments. I was looking for ANN some time ago and found only Mahout,
> which seemed rather idle.

Projects depend on people that want to create and expand them.
In Commons Math, there is the "o.a.c.m.ml.neuralnet" package, but it only
provides the "Self-Organizing Feature Map" network and learning algorithm.

> But Commons projects, IMO, are a good place for GA,

Certainly; but the point of discussion is whether we have sufficient
human resources to "compete" with other, already existing, packages.

Are there features missing in the software referred to by Thomas, which
would compel you to start a major enhancement of the code in CM?

If you still want to improve the CM "o.a.c.m.genetics" package, you are of
course free to do so.
But I fear there might be an issue of timely update of the repository if
no committer is convinced that the intermediate changes do bring more value
than the effort needed to integrate them.
What I mean is that you might need to do the whole refactoring before
being able to show the net improvement.
Perhaps others here will have another opinion...

> while there is no
> activity on AI merging/separating.

I don't understand that sentence.

>> but lacking human resources
>> it's unlikely to become a reality any time soon.
>>
>> You can always point to lacking features by opening JIRA reports, but
>> unless you intend to work on them yourself, I wouldn't bet on having
>> them fixed rapidly.
>>
>> If someone would want to start a large overhaul of the GA code, that is
>> worth considering.
>
> There surely can be a big scope. Btw, I'm still a student (last year of
> BS).
> So, I could do the things listed, and some more, as a GSoC project (I was
> planning to participate anyway).
> https://docs.google.com/document/d/1RT_zNfBdf8rX2p5Qo0bIQ28SqwWUVPwfKmjStwK8SCw/edit#
> - I wrote up my vision and am ready to refine it,
> if you care to comment.

It's really a nice offer.
The multi-threading feature is quite important, since it could help in
making Commons Math more attractive.
But that is a general remark, and a (relatively) recent discussion was about
how to make CM code ready for parallel execution.  E.g. should parallelism
be implemented within CM, or should it be handled by a more general
framework (that would require *thread-safe* CM classes)?

Anyway, it would be nice if you could work on CM in order to clarify the
issue: which algorithms need which approach, etc.
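
As a rough illustration of the second option (parallelism handled outside
the library), the sketch below evaluates the fitness of a population on a
thread pool from the JDK. Nothing here is Commons Math API:
ParallelFitnessSketch, evaluate and the toy fitness function are
hypothetical names. The approach is only safe if the fitness function is
thread-safe, which is exactly the requirement mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

final class ParallelFitnessSketch {

    /**
     * Evaluates every individual on a fixed-size thread pool and returns the
     * scores in population order. The fitness function must be thread-safe.
     */
    static <C> double[] evaluate(List<C> population,
                                 ToDoubleFunction<C> fitness,
                                 int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (C individual : population) {
                tasks.add(() -> fitness.applyAsDouble(individual));
            }
            // invokeAll blocks until all tasks finish; futures keep task order.
            List<Future<Double>> futures = executor.invokeAll(tasks);
            double[] scores = new double[futures.size()];
            for (int i = 0; i < scores.length; i++) {
                scores[i] = futures.get(i).get();
            }
            return scores;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during evaluation", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Fitness evaluation failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<int[]> population = new ArrayList<>();
        population.add(new int[] {1, 0, 1, 1});
        population.add(new int[] {0, 0, 1, 0});
        // Toy "count the ones" fitness; stateless, hence thread-safe.
        double[] scores = evaluate(population, genes -> {
            int ones = 0;
            for (int g : genes) {
                ones += g;
            }
            return ones;
        }, 2);
        System.out.println(Arrays.toString(scores)); // prints [3.0, 1.0]
    }
}
```

Only the fitness evaluation is parallelized here; selection, crossover and
mutation would still run in the calling thread.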

> I find it strange that the ASF list for GSoC doesn't include any Commons
> projects or widely known ones (Cassandra, Solr, Kafka, etc.). Is it even
> possible to participate on behalf of Commons Math this year?

Good question. [Which I personally cannot answer.]

Since Commons Math is (still) a part of the "Commons" project, I'd suggest
that you send a message to this list with "[ALL]" as a subject prefix, in
the hope that it will attract attention from more knowledgeable people.

>> Do you mean using an existing library, or do you suggest implementing the
>> functionality specifically for CM?
>
> I'm not sure which way is preferred. There is Commons Pool, which could
> become a dependency, but is it okay to add any dependencies besides JUnit?

The issue of dependencies is sensitive...
[JUnit is only a dependency for running the tests, not for using the
library.]

> And implementing it would definitely enlarge the codebase. Which is the
> lesser of two evils?

An interesting point.

A related discussion was about modularizing the CM code.  It's clear that GA
could be developed in its own "module".  Thus that module could (and should,
probably) reuse external tools rather than reimplement them.
But the decision to add such dependencies would not affect other modules
where dependencies are less desirable.
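
To make the trade-off concrete: reimplementing pooling inside a GA module
would mean maintaining something like the sketch below. All names
(GenePoolSketch, SimplePool, borrow, giveBack) are hypothetical, and gene
storage is assumed to be a plain double[]. The sketch is not thread-safe and
has none of the eviction or validation features that a dedicated library
such as Commons Pool provides, which is the argument for reusing an
external tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

final class GenePoolSketch {

    /** Minimal, non-thread-safe object pool; all names are hypothetical. */
    static final class SimplePool<T> {
        private final Deque<T> idle = new ArrayDeque<>();
        private final Supplier<T> factory;
        private int created;

        SimplePool(Supplier<T> factory) {
            this.factory = factory;
        }

        /** Returns an idle instance if one exists, otherwise creates a new one. */
        T borrow() {
            T instance = idle.pollFirst();
            if (instance == null) {
                instance = factory.get();
                created++;
            }
            return instance;
        }

        /** Hands an instance back so that a later generation can reuse it. */
        void giveBack(T instance) {
            idle.addFirst(instance);
        }

        /** Number of instances the factory had to create so far. */
        int createdCount() {
            return created;
        }
    }

    public static void main(String[] args) {
        // Gene storage is modelled as a double[] of fixed length (an assumption).
        SimplePool<double[]> pool = new SimplePool<>(() -> new double[8]);
        for (int generation = 0; generation < 1000; generation++) {
            double[] genes = pool.borrow();
            genes[0] = generation; // stand-in for real work on the buffer
            pool.giveBack(genes);
        }
        System.out.println(pool.createdCount()); // prints 1
    }
}
```

Note that a returned buffer still holds its old contents, so callers would
have to reset it before reuse.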


Best regards,
Gilles





Re: [math] questions on GA

Posted by Семенов Кирилл <se...@gmail.com>.
>
> In effect, some time ago we evoked the possibility to drop GA support
> altogether since the code seemed little used and a lot of work was
> anticipated for making it useful beyond demo applications.


It is rather surprising that the ASF doesn't have any complex and extensive
AI instruments. I was looking for ANN some time ago and found only Mahout,
which seemed rather idle.
But Commons projects, IMO, are a good place for GA, while there is no
activity on AI merging/separating.

> but lacking human resources
> it's unlikely to become a reality any time soon.
>
> You can always point to lacking features by opening JIRA reports, but
> unless you intend to work on them yourself, I wouldn't bet on having
> them fixed rapidly.
>
> If someone would want to start a large overhaul of the GA code, that is
> worth considering.

There surely can be a big scope. Btw, I'm still a student (last year of
BS).
So, I could do the things listed, and some more, as a GSoC project (I was
planning to participate anyway).
https://docs.google.com/document/d/1RT_zNfBdf8rX2p5Qo0bIQ28SqwWUVPwfKmjStwK8SCw/edit#
- I wrote up my vision and am ready to refine it,
if you care to comment.

I find it strange that the ASF list for GSoC doesn't include any Commons
projects or widely known ones (Cassandra, Solr, Kafka, etc.). Is it even
possible to participate on behalf of Commons Math this year?

> Do you mean using an existing library, or do you suggest implementing the
> functionality specifically for CM?

I'm not sure which way is preferred. There is Commons Pool, which could
become a dependency, but is it okay to add any dependencies besides JUnit?
And implementing it would definitely enlarge the codebase. Which is the
lesser of two evils?




-- 
Best regards,
Семенов Кирилл Павлович
8-905-275-88-72

Re: [math] questions on GA

Posted by Gilles <gi...@harfang.homelinux.org>.
Hello.

On Thu, 17 Mar 2016 19:11:16 +0300, Семенов Кирилл wrote:
> Hi,
>
> I've been using the genetic algorithm (GA) code for some pet projects, and
> I'd like to shed some light on a number of topics.

Thanks for your interest.

Given that there exist Java software packages that seem to provide a more
complete feature set, I'd be interested to know a user's opinion on how the
CM implementation compares with those.
In effect, some time ago we evoked the possibility to drop GA support
altogether since the code seemed little used and a lot of work was
anticipated for making it useful beyond demo applications.

> 1. Am I correct in thinking that the GA currently runs in a single thread?

Certainly.

Very little CM code is multi-thread ready.  It was one of the tasks to
be tackled for future versions of the library, but lacking human resources
it's unlikely to become a reality any time soon.

> In that case, were there any discussions on the subject (I didn't find any
> in a quick check of Jira)?

There were discussions (cf. "dev" ML archive).

> If not, could you provide some API reference? The subject is important
> because the ability to be distributed is one of the key features of a GA.
>
> 2. Were there talks about implementing a pool for chromosomes? I found the
> enhancement proposal https://issues.apache.org/jira/browse/MATH-1219,
> which aims to solve the same problem: an enormous number of chromosomes
> is created in each generation, and after each generation they hang in
> the heap waiting for GC. An object pool could also be implemented,
> supposing that a chromosome would consist of List<? extends PooledObject>.

If someone would want to start a large overhaul of the GA code, that is
worth considering.
Do you mean using an existing library, or do you suggest implementing the
functionality specifically for CM?

> 3. The examples that use the getRepresentation method of
> AbstractListChromosome seem misleading, because getRepresentation is a
> protected method, so classes that implement MutationPolicy/CrossoverPolicy
> can't use it. For rapid development one could add a public overriding
> method, but couldn't it be declared public in AbstractListChromosome? To
> write a policy for one particular chromosome, one must override
> getRepresentation in CustomChromosome. But anyone who wants to write a
> common genetic policy (e.g., some reordering crossover) runs into the
> obstacle mentioned.
>
> I'd like to create Jira tasks for these. I just want to make sure that
> these topics would be useful, and to gather some information and other
> developers' opinions on the matter.

You can always point to lacking features by opening JIRA reports, but
unless you intend to work on them yourself, I wouldn't bet on having them
fixed rapidly.


Best regards,
Gilles

