Posted to dev@commons.apache.org by Gilles Sadowski <gi...@harfang.homelinux.org> on 2012/07/21 15:17:52 UTC

[Math] Little thought about multi-threading

Hi.

My previous post (with subject "Synchronisation") made me think (again) that
it might be useful to start considering how to take advantage of
multi-threading in Commons Math.
Indeed, it seems that some parts of the library might end up not being used
anymore because their performance simply cannot match competing
implementations that do benefit from parallelization. [The recent example
that comes to mind is the FFT.]


Best regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [Math] Little thought about multi-threading

Posted by Luc Maisonobe <Lu...@free.fr>.
On 22/07/2012 02:29, Gilles Sadowski wrote:
> On Sat, Jul 21, 2012 at 11:41:45AM -0700, Ted Dunning wrote:
>> The easy way to get much of this benefit is to simply use multi-threaded
>> versions of Atlas via jblas.  Probably not viable given the no dependency
>> posture of commons math.
>>
> 
> When referring to multi-threading, I was not specifically and not only
> referring to linear algebra.

+1.

> Moreover, I don't see the interest of CM being yet another layer above those
> Fortran codes. [If so, why would we limit ourselves to matrices? There are
> other libraries which could be wrapped...]

+1

> Personally, I consider that the no-dependency should be confined to the core
> business of CM, i.e. it's a distinct feature of CM to provide pure Java,
> from scratch, implementations of numerical tools.[1]
> [As I've already stated, CM could still benefit from other (pure Java)
> projects (i.e. depend on them), e.g. for things like logging.]

-1, but we could discuss this again for 4.0. The development team
has evolved a lot and the use of [math] has increased a lot, so it would
be worth checking whether we are still in line with everybody's
expectations or whether we should change this principle at some point.

best regards,
Luc

> 
> 
> Regards,
> Gilles
> 
> [1] For some projects, pure Java is a requirement.
> 
> 
>>> Hi.
>>>
>>> My previous post (with subject "Synchronisation") made me think (again)
>>> that
>>> it might be useful to start considering how to take advantage of
>>> multi-threading in Commons Math.
>>> Indeed, it seems that some parts of the library might end up not being used
>>> anymore because their performance simply cannot match competing
>>> implementations that do benefit from parallelization. [The recent example
>>> that comes to mind is the FFT.]
>>>
>>>
>>> Best regards,
>>> Gilles
>>>
> 
> 
> 




Re: [Math] Little thought about multi-threading

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Sat, Jul 21, 2012 at 11:41:45AM -0700, Ted Dunning wrote:
> The easy way to get much of this benefit is to simply use multi-threaded
> versions of Atlas via jblas.  Probably not viable given the no dependency
> posture of commons math.
> 

When referring to multi-threading, I was not referring specifically, or
only, to linear algebra.
Moreover, I don't see the point of CM being yet another layer on top of those
Fortran codes. [If so, why would we limit ourselves to matrices? There are
other libraries which could be wrapped...]
Personally, I consider that the no-dependency rule should be confined to the
core business of CM, i.e. it's a distinctive feature of CM to provide pure
Java, from-scratch implementations of numerical tools.[1]
[As I've already stated, CM could still benefit from other (pure Java)
projects (i.e. depend on them), e.g. for things like logging.]


Regards,
Gilles

[1] For some projects, pure Java is a requirement.


> > Hi.
> >
> > My previous post (with subject "Synchronisation") made me think (again)
> > that
> > it might be useful to start considering how to take advantage of
> > multi-threading in Commons Math.
> > Indeed, it seems that some parts of the library might end up not being used
> > anymore because their performance simply cannot match competing
> > implementations that do benefit from parallelization. [The recent example
> > that comes to mind is the FFT.]
> >
> >
> > Best regards,
> > Gilles
> >



Re: [Math] Little thought about multi-threading

Posted by Ted Dunning <te...@gmail.com>.
The easy way to get much of this benefit is to simply use multi-threaded
versions of Atlas via jblas.  Probably not viable given the no dependency
posture of commons math.

On Sat, Jul 21, 2012 at 6:17 AM, Gilles Sadowski <
gilles@harfang.homelinux.org> wrote:

> Hi.
>
> My previous post (with subject "Synchronisation") made me think (again)
> that
> it might be useful to start considering how to take advantage of
> multi-threading in Commons Math.
> Indeed, it seems that some parts of the library might end up not being used
> anymore because their performance simply cannot match competing
> implementations that do benefit from parallelization. [The recent example
> that comes to mind is the FFT.]
>
>
> Best regards,
> Gilles
>
>
>

Re: [Math] Little thought about multi-threading

Posted by Sébastien Brisard <se...@m4x.org>.
Hi,

2012/7/30 Gilles Sadowski <gi...@harfang.homelinux.org>:
>> > [...]
>> >>> Well, you probably don't want to switch to Java 7 now, [...]
>> >>
>> >> Oh, yes, please! 8-P
>> >
>> > I would be in favor of this too, but we could also target it for the 4.0
>> > release together with the parallelization stuff.
>> >
>> > Thomas
>> >
>> I would also be in favor of Java 7, but I understand that the switch
>> might be difficult in some professional environments. The good news is
>> that you actually do not need Java 7 to run the F/J framework. Package
>> jsr166y.jar does the trick [1]. It is even available from a Maven repository:
>>     <dependency>
>>       <groupId>org.codehaus.jsr166-mirror</groupId>
>>       <artifactId>jsr166y</artifactId>
>>       <version>1.7.0</version>
>>     </dependency>
>>
>> Best regards,
>> Sébastien
>>
>> [1] http://g.oswego.edu/dl/concurrency-interest/
>>
>
> This page indicates that Java6 is required.
>
> But even if Java5 was fine, that JAR would be a dependency...
>
Yes, that does not really help, does it?

Best regards,
Sébastien
>
> Regards,
> Gilles
>
>




Re: [Math] Little thought about multi-threading

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
> > [...]
> >>> Well, you probably don't want to switch to Java 7 now, [...]
> >>
> >> Oh, yes, please! 8-P
> >
> > I would be in favor of this too, but we could also target it for the 4.0
> > release together with the parallelization stuff.
> >
> > Thomas
> >
> I would also be in favor of Java 7, but I understand that the switch
> might be difficult in some professional environments. The good news is
> that you actually do not need Java 7 to run the F/J framework. Package
> jsr166y.jar does the trick [1]. It is even available from a Maven repository:
>     <dependency>
>       <groupId>org.codehaus.jsr166-mirror</groupId>
>       <artifactId>jsr166y</artifactId>
>       <version>1.7.0</version>
>     </dependency>
> 
> Best regards,
> Sébastien
> 
> [1] http://g.oswego.edu/dl/concurrency-interest/
> 

This page indicates that Java 6 is required.

But even if Java 5 were fine, that JAR would be a dependency...


Regards,
Gilles



Re: [Math] Little thought about multi-threading

Posted by Sébastien Brisard <se...@m4x.org>.
Hello,

2012/7/23 Thomas Neidhart <th...@gmail.com>:
> On 07/22/2012 11:25 PM, Gilles Sadowski wrote:
>>
>>>> [...]
>>>> Threaded execution, on the other hand, can be very, very helpful for a
>>>> number of math algorithms and thread management inside commons math is a
>>>> very reasonable option in those cases.  This would provide a performance
>>>> boost with very little complexity for the user of math.  Managing these
>>>> threads is really pretty simple as well.
>>>>
>>>>
>>> How about the Fork-Join framework of Java 7 as an alternative?
>>
>> +1
>
> I would also be interested in performance improvements of certain
> algorithms by taking advantage of the multi-core processors we have
> nowadays. The fork-join framework of Java 7 looks clean and simple to
> use, so I guess it is a logical target for a library such as CM.
>
> +1
>
I have been playing around with the F/J framework for image analysis
applications. Implementation is indeed simple (I am brand new to
multithreading). I was thinking of implementing a
ForkJoinArrayRealVector, to see if there is some potential here.
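
As a rough illustration of what such an experiment could look like, here
is a minimal fork/join sketch for a single vector operation (a dot
product). The class name ForkJoinDotProduct, the THRESHOLD value and the
dot() helper are made up for this example; this is not an existing
Commons Math API, and it works on plain double[] arrays rather than on
ArrayRealVector. It only relies on ForkJoinPool and RecursiveTask from
java.util.concurrent (Java 7).

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Hypothetical sketch: dot product of two arrays with the fork/join framework. */
final class ForkJoinDotProduct extends RecursiveTask<Double> {
    /** Ranges of at most this size are summed sequentially (arbitrary, untuned value). */
    private static final int THRESHOLD = 10_000;

    private final double[] a;
    private final double[] b;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinDotProduct(double[] a, double[] b, int from, int to) {
        this.a = a;
        this.b = b;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Double compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: plain sequential loop.
            double sum = 0;
            for (int i = from; i < to; i++) {
                sum += a[i] * b[i];
            }
            return sum;
        }
        // Otherwise split the range in two halves.
        int mid = from + (to - from) / 2;
        ForkJoinDotProduct left = new ForkJoinDotProduct(a, b, from, mid);
        ForkJoinDotProduct right = new ForkJoinDotProduct(a, b, mid, to);
        left.fork();                       // schedule the left half asynchronously
        double rightSum = right.compute(); // compute the right half in this thread
        return left.join() + rightSum;     // wait for the left half and combine
    }

    /** Convenience entry point; the pool is supplied (and owned) by the caller. */
    static double dot(ForkJoinPool pool, double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        return pool.invoke(new ForkJoinDotProduct(a, b, 0, a.length));
    }
}
```

Because the partial sums are combined in a different order than in a
sequential loop, the result may differ from the sequential one in the
last bits. Whether the splitting pays off for a memory-bound operation
like this one would have to be measured.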

>>> Well, you probably don't want to switch to Java 7 now, [...]
>>
>> Oh, yes, please! 8-P
>
> I would be in favor of this too, but we could also target it for the 4.0
> release together with the parallelization stuff.
>
> Thomas
>
I would also be in favor of Java 7, but I understand that the switch
might be difficult in some professional environments. The good news is
that you actually do not need Java 7 to run the F/J framework. Package
jsr166y.jar does the trick [1]. It is even available from a Maven repository:
    <dependency>
      <groupId>org.codehaus.jsr166-mirror</groupId>
      <artifactId>jsr166y</artifactId>
      <version>1.7.0</version>
    </dependency>

Best regards,
Sébastien

[1] http://g.oswego.edu/dl/concurrency-interest/




Re: [Math] Little thought about multi-threading

Posted by Thomas Neidhart <th...@gmail.com>.
On 07/22/2012 11:25 PM, Gilles Sadowski wrote:
> 
>>> [...]
>>> Threaded execution, on the other hand, can be very, very helpful for a
>>> number of math algorithms and thread management inside commons math is a
>>> very reasonable option in those cases.  This would provide a performance
>>> boost with very little complexity for the user of math.  Managing these
>>> threads is really pretty simple as well.
>>>
>>>
>> How about the Fork-Join framework of Java 7 as an alternative?
> 
> +1

I would also be interested in performance improvements of certain
algorithms by taking advantage of the multi-core processors we have
nowadays. The fork-join framework of Java 7 looks clean and simple to
use, so I guess it is a logical target for a library such as CM.

+1

>> Well, you probably don't want to switch to Java 7 now, [...]
> 
> Oh, yes, please! 8-P

I would be in favor of this too, but we could also target it for the 4.0
release together with the parallelization stuff.

Thomas



Re: [Math] Little thought about multi-threading

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
> > [...]
> >Threaded execution, on the other hand, can be very, very helpful for a
> >number of math algorithms and thread management inside commons math is a
> >very reasonable option in those cases.  This would provide a performance
> >boost with very little complexity for the user of math.  Managing these
> >threads is really pretty simple as well.
> >
> >
> How about the Fork-Join framework of Java 7 as an alternative?

+1

> 
> Well, you probably don't want to switch to Java 7 now, [...]

Oh, yes, please! 8-P


Gilles



Re: [Math] Little thought about multi-threading

Posted by Oliver Heger <ol...@oliver-heger.de>.
Am 22.07.2012 21:01, schrieb Ted Dunning:
> I don't believe that there are any commons math algorithms that would
> benefit from execution in a Hadoop map-reduce style.  The issue is that
> iterative algorithms are essentially incompatible with the very large
> startup costs of map-reduce programs under Hadoop.
>
> Some algorithms can be recast to make use of an all-reduce operator which
> can be implemented in a map-only job.  EM algorithms often have this
> structure.
>
> Otherwise, massive algorithmic change is usually necessary.  For instance,
> partial SVD can be done using a fixed and small number of map-reduce
> operations by using stochastic projection.
>
> Threaded execution, on the other hand, can be very, very helpful for a
> number of math algorithms and thread management inside commons math is a
> very reasonable option in those cases.  This would provide a performance
> boost with very little complexity for the user of math.  Managing these
> threads is really pretty simple as well.
>
>
How about the Fork-Join framework of Java 7 as an alternative?

Well, you probably don't want to switch to Java 7 now, but maybe in a
later version? And I think there are back-ports for earlier Java versions.

Oliver

>
> On Sun, Jul 22, 2012 at 9:27 AM, Phil Steitz <ph...@gmail.com> wrote:
>
>> On 7/21/12 6:17 AM, Gilles Sadowski wrote:
>>> Hi.
>>>
>>> My previous post (with subject "Synchronisation") made me think (again)
>> that
>>> it might be useful to start considering how to take advantage of
>>> multi-threading in Commons Math.
>>> Indeed, it seems that some parts of the library might end up not being
>> used
>>> anymore because their performance simply cannot match competing
>>> implementations that do benefit from parallelization. [The recent example
>>> that comes to mind is the FFT.]
>>
>> This is an interesting question.  I am also -1 on adding
>> dependencies, but it would be a good idea to look at how others have
>> solved the problem of how to support parallel execution by multiple
>> threads without managing threads directly.  Lots of [math]
>> algorithms could be parallelized.  The question is how to
>> effectively coordinate the work without owning or creating the
>> workers.  I would be -0 to any suggestion that involved [math]
>> itself spawning threads, since that 0) creates management headaches
>> 1) may violate some container contracts and 2) forces execution
>> threads to be in the same process.  I think it is worth thinking
>> about how we might support parallel execution by externally managed
>> workers.  An obvious thing to look at is how to break our
>> parallelizable algorithms into pieces that could be executed in
>> Hadoop Map/Reduce jobs.  Step 0) is the breaking up part.  Then step
>> 1) might be either some examples added to the user guide or custom
>> Pig functions (or examples of how to code them).
>>
>> Phil
>>>
>>>
>>> Best regards,
>>> Gilles
>>>
>>>
>>>
>>
>>
>>
>>
>




Re: [Math] Little thought about multi-threading

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Am 22.07.2012 23:19, schrieb Gilles Sadowski:
> I agree. I.e. let's make a list of the algorithms that would certainly
> benefit from parallelization, and for which the parallelization would be
> pretty simple (the devilish details notwithstanding...).

Integration, root solving or minimizing a function whose evaluation
is itself time-consuming. On a multi-core machine you can get several
function values in the time a single-core machine produces one.
Adapting numerical integration is probably quite easy, while root
solvers and optimizers will probably require new algorithms that get
higher-order convergence from the multiple function values computed
in parallel, in order to minimize the iteration count.
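
To make the integration case concrete, here is a minimal sketch of a
composite midpoint rule in which the (supposedly expensive) function
evaluations are handed to an ExecutorService. The names
ParallelMidpointRule, Integrand and integrate() are hypothetical and are
not part of Commons Math; the midpoint rule is only chosen because it is
short, not because it is what an adapted CM integrator would use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch: midpoint rule with function values computed in parallel. */
final class ParallelMidpointRule {
    /** Stand-in for a function whose evaluation is time-consuming. */
    interface Integrand {
        double value(double x);
    }

    /** Approximates the integral of f over [a, b] using n sub-intervals. */
    static double integrate(final Integrand f, double a, double b, int n,
                            ExecutorService executor) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive");
        }
        final double h = (b - a) / n;
        // One task per evaluation point; the points are independent of each other.
        List<Callable<Double>> tasks = new ArrayList<Callable<Double>>(n);
        for (int i = 0; i < n; i++) {
            final double x = a + (i + 0.5) * h;
            tasks.add(new Callable<Double>() {
                public Double call() {
                    return f.value(x);
                }
            });
        }
        try {
            // invokeAll blocks until every task has completed; the values are
            // then added in submission order, so the sum is reproducible.
            double sum = 0;
            for (Future<Double> value : executor.invokeAll(tasks)) {
                sum += value.get();
            }
            return h * sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while integrating", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("function evaluation failed", e.getCause());
        }
    }
}
```

The iteration count is unchanged here; only the wall-clock time per
iteration can go down. This only pays off when a single evaluation costs
much more than scheduling a task.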

J.Pietschmann



Re: [Math] Little thought about multi-threading

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Sun, Jul 22, 2012 at 12:01:01PM -0700, Ted Dunning wrote:
> I don't believe that there are any commons math algorithms that would
> benefit from execution in a Hadoop map-reduce style.  The issue is that
> iterative algorithms are essentially incompatible with the very large
> startup costs of map-reduce programs under Hadoop.
> 
> Some algorithms can be recast to make use of an all-reduce operator which
> can be implemented in a map-only job.  EM algorithms often have this
> structure.
> 
> Otherwise, massive algorithmic change is usually necessary.  For instance,
> partial SVD can be done using a fixed and small number of map-reduce
> operations by using stochastic projection.
> 
> Threaded execution, on the other hand, can be very, very helpful for a
> number of math algorithms and thread management inside commons math is a
> very reasonable option in those cases.  This would provide a performance
> boost with very little complexity for the user of math.  Managing these
> threads is really pretty simple as well.

I agree. I.e. let's make a list of the algorithms that would certainly
benefit from parallelization, and for which the parallelization would be
pretty simple (the devilish details notwithstanding...).

Suggestions, in order of simplicity, welcome.


Gilles

> 
> 
> 
> On Sun, Jul 22, 2012 at 9:27 AM, Phil Steitz <ph...@gmail.com> wrote:
> 
> > On 7/21/12 6:17 AM, Gilles Sadowski wrote:
> > > Hi.
> > >
> > > My previous post (with subject "Synchronisation") made me think (again)
> > that
> > > it might be useful to start considering how to take advantage of
> > > multi-threading in Commons Math.
> > > Indeed, it seems that some parts of the library might end up not being
> > used
> > > anymore because their performance simply cannot match competing
> > > implementations that do benefit from parallelization. [The recent example
> > > that comes to mind is the FFT.]
> >
> > This is an interesting question.  I am also -1 on adding
> > dependencies, but it would be a good idea to look at how others have
> > solved the problem of how to support parallel execution by multiple
> > threads without managing threads directly.  Lots of [math]
> > algorithms could be parallelized.  The question is how to
> > effectively coordinate the work without owning or creating the
> > workers.  I would be -0 to any suggestion that involved [math]
> > itself spawning threads, since that 0) creates management headaches
> > 1) may violate some container contracts and 2) forces execution
> > threads to be in the same process.  I think it is worth thinking
> > about how we might support parallel execution by externally managed
> > workers.  An obvious thing to look at is how to break our
> > parallelizable algorithms into pieces that could be executed in
> > Hadoop Map/Reduce jobs.  Step 0) is the breaking up part.  Then step
> > 1) might be either some examples added to the user guide or custom
> > Pig functions (or examples of how to code them).
> >
> > Phil
> > >
> > >
> > > Best regards,
> > > Gilles
> > >
> > >
> > >
> >
> >
> >
> >



Re: [Math] Little thought about multi-threading

Posted by Ted Dunning <te...@gmail.com>.
I don't believe that there are any commons math algorithms that would
benefit from execution in a Hadoop map-reduce style.  The issue is that
iterative algorithms are essentially incompatible with the very large
startup costs of map-reduce programs under Hadoop.

Some algorithms can be recast to make use of an all-reduce operator which
can be implemented in a map-only job.  EM algorithms often have this
structure.

Otherwise, massive algorithmic change is usually necessary.  For instance,
partial SVD can be done using a fixed and small number of map-reduce
operations by using stochastic projection.

Threaded execution, on the other hand, can be very, very helpful for a
number of math algorithms and thread management inside commons math is a
very reasonable option in those cases.  This would provide a performance
boost with very little complexity for the user of math.  Managing these
threads is really pretty simple as well.



On Sun, Jul 22, 2012 at 9:27 AM, Phil Steitz <ph...@gmail.com> wrote:

> On 7/21/12 6:17 AM, Gilles Sadowski wrote:
> > Hi.
> >
> > My previous post (with subject "Synchronisation") made me think (again)
> that
> > it might be useful to start considering how to take advantage of
> > multi-threading in Commons Math.
> > Indeed, it seems that some parts of the library might end up not being
> used
> > anymore because their performance simply cannot match competing
> > implementations that do benefit from parallelization. [The recent example
> > that comes to mind is the FFT.]
>
> This is an interesting question.  I am also -1 on adding
> dependencies, but it would be a good idea to look at how others have
> solved the problem of how to support parallel execution by multiple
> threads without managing threads directly.  Lots of [math]
> algorithms could be parallelized.  The question is how to
> effectively coordinate the work without owning or creating the
> workers.  I would be -0 to any suggestion that involved [math]
> itself spawning threads, since that 0) creates management headaches
> 1) may violate some container contracts and 2) forces execution
> threads to be in the same process.  I think it is worth thinking
> about how we might support parallel execution by externally managed
> workers.  An obvious thing to look at is how to break our
> parallelizable algorithms into pieces that could be executed in
> Hadoop Map/Reduce jobs.  Step 0) is the breaking up part.  Then step
> 1) might be either some examples added to the user guide or custom
> Pig functions (or examples of how to code them).
>
> Phil
> >
> >
> > Best regards,
> > Gilles
> >
> >
> >
>
>
>
>

Re: [Math] Little thought about multi-threading

Posted by Phil Steitz <ph...@gmail.com>.
On 7/22/12 2:14 PM, Gilles Sadowski wrote:
> On Sun, Jul 22, 2012 at 09:27:17AM -0700, Phil Steitz wrote:
>> On 7/21/12 6:17 AM, Gilles Sadowski wrote:
>>> Hi.
>>>
>>> My previous post (with subject "Synchronisation") made me think (again) that
>>> it might be useful to start considering how to take advantage of
>>> multi-threading in Commons Math.
>>> Indeed, it seems that some parts of the library might end up not being used
>>> anymore because their performance simply cannot match competing
>>> implementations that do benefit from parallelization. [The recent example
>>> that comes to mind is the FFT.]
>> This is an interesting question.  I am also -1 on adding
>> dependencies, but it would be a good idea to look at how others have
>> solved the problem of how to support parallel execution by multiple
>> threads without managing threads directly.  Lots of [math]
>> algorithms could be parallelized.  The question is how to
>> effectively coordinate the work without owning or creating the
>> workers.  I would be -0 to any suggestion that involved [math]
>> itself spawning threads,
> I certainly do mean that, although threads are to be managed by the
> utilities in package "java.util.concurrent".
>
>> since that 0) creates management headaches
> If it does, then it's too complex for CM. But it shouldn't for readily
> parallelizable tasks (i.e. processing that can be cut into independent
> sub-tasks).

It is "easy" to spawn a lot of threads in applications.  It is not
as easy to make sure they are all cleaned up on all execution paths.
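
A minimal sketch of what "cleaned up on all execution paths" could mean
in practice, assuming the pool is created and destroyed within a single
call: the shutdown sits in a finally block so that it also runs when the
computation throws. ScopedPool, PoolTask and withPool() are hypothetical
names used only for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: a thread pool whose lifetime is bounded by one call. */
final class ScopedPool {
    /** Work that needs a pool for its duration only. */
    interface PoolTask<T> {
        T run(ExecutorService pool);
    }

    /** Creates a pool, runs the task, and always shuts the pool down afterwards. */
    static <T> T withPool(int threads, PoolTask<T> task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            return task.run(pool);
        } finally {
            // Runs on normal return and on exceptions alike. shutdownNow()
            // rejects new tasks and interrupts running ones; it does not
            // wait for them to terminate.
            pool.shutdownNow();
        }
    }
}
```

Worker code that ignores interruption can still outlive the call, which
is one of the details that make this harder than it looks.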
>
>> 1) may violate some container contracts 
> The usage of multiple cores would be a user setting (i.e. how many tasks can
> run in parallel).

Some container contracts forbid spawning threads, so it would have
to be possible to disable this.
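
One way this could be arranged, sketched under the assumption that the
library only splits the work and the caller supplies the Executor: with
a same-thread executor no thread is ever spawned, while a container can
pass in its own managed executor. ChunkedSum, SAME_THREAD and
sumOfSquares() are hypothetical names, not an existing [math] API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;

/** Hypothetical sketch: the library splits the work, the caller decides where it runs. */
final class ChunkedSum {
    /** Runs every task on the calling thread, so no thread is spawned. */
    static final Executor SAME_THREAD = new Executor() {
        public void execute(Runnable command) {
            command.run();
        }
    };

    /** Sums data[i]^2 over at most 'chunks' independent slices handed to 'executor'. */
    static double sumOfSquares(final double[] data, int chunks, Executor executor) {
        if (chunks < 1) {
            throw new IllegalArgumentException("chunks must be positive");
        }
        List<FutureTask<Double>> tasks = new ArrayList<FutureTask<Double>>();
        int size = (data.length + chunks - 1) / chunks; // ceiling division
        for (int start = 0; start < data.length; start += size) {
            final int from = start;
            final int to = Math.min(start + size, data.length);
            FutureTask<Double> task = new FutureTask<Double>(new Callable<Double>() {
                public Double call() {
                    double s = 0;
                    for (int i = from; i < to; i++) {
                        s += data[i] * data[i];
                    }
                    return s;
                }
            });
            tasks.add(task);
            executor.execute(task); // the library never creates a thread itself
        }
        double total = 0;
        try {
            for (FutureTask<Double> task : tasks) {
                total += task.get(); // aggregate partial sums in slice order
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("slice failed", e.getCause());
        }
        return total;
    }
}
```

Calling ChunkedSum.sumOfSquares(data, 4, ChunkedSum.SAME_THREAD) is then
the "disabled" mode, and passing a pool from java.util.concurrent
enables parallel execution without any change to the algorithm.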

>
>> and 2) forces execution
>> threads to be in the same process.
> I don't understand that.

If [math] is managing the threads, they have to all be in the same
jvm process.  If, on the other hand, we allow [math] algorithm
subtasks to be executed in parallel by other programs / frameworks
(such as, for example, Hadoop), the computation could be spread
across multiple processes or even physical hosts.

>
>>  I think it is worth thinking
>> about how we might support parallel execution by externally managed
>> workers.  An obvious thing to look at is how to break our
>> parallelizable algorithms into pieces that could be executed in
>> Hadoop Map/Reduce jobs.
> I don't know what that is.

Have a look at the Hadoop docs, or Pig.  Both are Apache projects. 
There are also other parallel execution frameworks out there.
>
>> Step 0) is the breaking up part. Then step
>> 1) might be either some examples added to the user guide or custom
>> Pig functions (or examples of how to code them).
> I don't know about that either.
>
> I was rather thinking of using the utilities readily available in the
> Java language standard e.g.:
>   http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html

That is the internally-managed threads approach, which could be
done, but has the limitations mentioned above.

For either approach - managing threads internally, or letting an
external execution framework do it - the first step, as you have
mentioned above, is to identify which algorithms can be
parallelized, how to go about dividing up the work, what data needs
to be shared and how to aggregate the results.
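
As an example of the "aggregate the results" step, here is a minimal
sketch of mean/variance statistics that can be accumulated on separate
chunks of data (in other threads, processes or hosts) and merged
afterwards. It uses the standard pairwise update for the sum of squared
deviations; the class PartialMoments and its methods are hypothetical
and only meant to show that no data needs to be shared between workers
apart from the three fields.

```java
/** Hypothetical sketch: per-chunk statistics that can be merged after the fact. */
final class PartialMoments {
    private long n;      // number of values seen
    private double mean; // running mean
    private double m2;   // sum of squared deviations from the mean

    /** Adds one value (Welford's one-pass update). */
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** Folds the statistics of another chunk into this one. */
    void merge(PartialMoments other) {
        if (other.n == 0) {
            return;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        // Both updates use the counts and mean from before the merge.
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
    }

    long getN() {
        return n;
    }

    double getMean() {
        return mean;
    }

    /** Unbiased sample variance; NaN when fewer than two values were seen. */
    double getVariance() {
        return n > 1 ? m2 / (n - 1) : Double.NaN;
    }
}
```

Each worker fills its own instance with add(), and whoever collects the
results calls merge() once per worker, regardless of whether the workers
were threads in the same JVM or jobs run by an external framework.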

Phil
>
>
> Regards,
> Gilles
>
>
>




Re: [Math] Little thought about multi-threading

Posted by Gilles Sadowski <gi...@harfang.homelinux.org>.
On Sun, Jul 22, 2012 at 09:27:17AM -0700, Phil Steitz wrote:
> On 7/21/12 6:17 AM, Gilles Sadowski wrote:
> > Hi.
> >
> > My previous post (with subject "Synchronisation") made me think (again) that
> > it might be useful to start considering how to take advantage of
> > multi-threading in Commons Math.
> > Indeed, it seems that some parts of the library might end up not being used
> > anymore because their performance simply cannot match competing
> > implementations that do benefit from parallelization. [The recent example
> > that comes to mind is the FFT.]
> 
> This is an interesting question.  I am also -1 on adding
> dependencies, but it would be a good idea to look at how others have
> solved the problem of how to support parallel execution by multiple
> threads without managing threads directly.  Lots of [math]
> algorithms could be parallelized.  The question is how to
> effectively coordinate the work without owning or creating the
> workers.  I would be -0 to any suggestion that involved [math]
> itself spawning threads,

I certainly do mean that, although threads are to be managed by the
utilities in package "java.util.concurrent".

> since that 0) creates management headaches

If it does, then it's too complex for CM. But it shouldn't for readily
parallelizable tasks (i.e. processing that can be cut into independent
sub-tasks).

> 1) may violate some container contracts 

The usage of multiple cores would be a user setting (i.e. how many tasks can
run in parallel).

> and 2) forces execution
> threads to be in the same process.

I don't understand that.

>  I think it is worth thinking
> about how we might support parallel execution by externally managed
> workers.  An obvious thing to look at is how to break our
> parallelizable algorithms into pieces that could be executed in
> Hadoop Map/Reduce jobs.

I don't know what that is.

> Step 0) is the breaking up part. Then step
> 1) might be either some examples added to the user guide or custom
> Pig functions (or examples of how to code them).

I don't know about that either.

I was rather thinking of using the utilities readily available in the
Java language standard e.g.:
  http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html


Regards,
Gilles



Re: [Math] Little thought about multi-threading

Posted by Phil Steitz <ph...@gmail.com>.
On 7/21/12 6:17 AM, Gilles Sadowski wrote:
> Hi.
>
> My previous post (with subject "Synchronisation") made me think (again) that
> it might be useful to start considering how to take advantage of
> multi-threading in Commons Math.
> Indeed, it seems that some parts of the library might end up not being used
> anymore because their performance simply cannot match competing
> implementations that do benefit from parallelization. [The recent example
> that comes to mind is the FFT.]

This is an interesting question.  I am also -1 on adding
dependencies, but it would be a good idea to look at how others have
solved the problem of how to support parallel execution by multiple
threads without managing threads directly.  Lots of [math]
algorithms could be parallelized.  The question is how to
effectively coordinate the work without owning or creating the
workers.  I would be -0 to any suggestion that involved [math]
itself spawning threads, since that 0) creates management headaches
1) may violate some container contracts and 2) forces execution
threads to be in the same process.  I think it is worth thinking
about how we might support parallel execution by externally managed
workers.  An obvious thing to look at is how to break our
parallelizable algorithms into pieces that could be executed in
Hadoop Map/Reduce jobs.  Step 0) is the breaking up part.  Then step
1) might be either some examples added to the user guide or custom
Pig functions (or examples of how to code them).

Phil
>
>
> Best regards,
> Gilles
>
>
>

