Posted to dev@commons.apache.org by Arun Mohan <st...@gmail.com> on 2017/06/12 21:26:10 UTC

Commons sub project for parallel method execution

Hi All,

Good afternoon.

I have been working on a java generic parallel execution library which will
allow clients to execute methods in parallel irrespective of the number of
method arguments, type of method arguments, return type of the method etc.

Here is the link to the source code:
https://github.com/striderarun/parallel-execution-engine

The project is in a nascent state and I am the only contributor so far. I
am new to the Apache community and I would like to bring this project into
Apache and improve, expand and build a developer community around it.

I think this project can be a sub project of Apache Commons since it
provides generic components for parallelizing any kind of methods.

Can somebody please guide me or suggest what other options I can explore?

Thanks,
Arun
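
A minimal sketch of the general idea, using only the JDK and not the API of
the linked project: each call is wrapped in a lambda, so the number of
arguments, their types and the return type do not matter to the runner. The
class name ParallelCalls and the two service methods are hypothetical
stand-ins invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class ParallelCalls {

    // Hypothetical stand-ins for a client's service methods.
    static int getStudentMarks(long id) { return 90; }
    static String fullName(String first, String last) { return first + " " + last; }

    /** Starts every call on the common pool, then collects results in submission order. */
    static List<Object> runAll(List<Supplier<?>> calls) {
        List<CompletableFuture<Object>> futures = new ArrayList<>();
        for (Supplier<?> call : calls) {
            // All tasks are submitted before any result is awaited.
            futures.add(CompletableFuture.<Object>supplyAsync(call::get));
        }
        List<Object> results = new ArrayList<>();
        for (CompletableFuture<Object> future : futures) {
            results.add(future.join()); // blocks until this task has finished
        }
        return results;
    }

    public static void main(String[] args) {
        List<Supplier<?>> calls = Arrays.<Supplier<?>>asList(
                () -> getStudentMarks(1L),
                () -> fullName("Kate", "Williams"));
        System.out.println(runAll(calls));
    }
}
```

Because the lambdas capture their own arguments, no reflection is needed to
dispatch the calls. The trade-off is that results come back as Object unless
the caller keeps typed futures.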

Re: Commons sub project for parallel method execution

Posted by Bill Igoe <bi...@gmail.com>.
A java CUDA jni library would be excellent.

On Jun 12, 2017 7:08 PM, "Gary Gregory" <ga...@gmail.com> wrote:

> On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <bo...@gmail.com> wrote:
>
> > So wouldn't something like ASM or Javassist or one of the zillion other
> > bytecode libraries be a better alternative to using reflection for
> > performance? Also, using the Java 7 reflections API improvements helps
> > speed things up quite a bit.
> >
>
> IMO, unless you are doing scripting, reflection should be used as a
> workaround, but that's just me. For example, like we do in Commons IO's
> Java7Support class.
>
> But I digress ;-)
>
> This is clearly an interesting topic. My concern is that there is a LOT of
> code out there that does stuff like this at the low and high level from the
> JRE's fork/join to Apache Spark and so on as I've stated.
>
> IMO something new would have to be both unique and since this is Commons,
> potentially pluggable into other frameworks.
>
> Gary
>
>
> > On 12 June 2017 at 20:37, Paul King <pa...@gmail.com> wrote:
> >
> > > My go-to library for such tasks would be GPars. It has both Java and
> > > Groovy support for most things (actors/dataflow) but less so for
> > > asynchronous task execution. It's one of the things that would be good
> > > to explore in light of Java 8. Groovy is now Apache, GPars not at this
> > > stage.
> > >
> > > So with adding two jars (GPars + Groovy), you can use Groovy like this:
> > >
> > > @Grab('org.codehaus.gpars:gpars:1.2.1')
> > > import com.arun.student.StudentService
> > > import groovyx.gpars.GParsExecutorsPool
> > >
> > > long startTime = System.nanoTime()
> > > def service = new StudentService()
> > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
> > > "Harry Potter": 7]
> > >
> > > def tasks = [
> > >         { println service.findStudent("john@gmail.com", 11, false) },
> > >         { println service.getStudentMarks(1L) },
> > >         { println service.getStudentsByFirstNames(["John","Alice"]) },
> > >         { println service.getRandomLastName() },
> > >         { println service.findStudentIdByName("Kate", "Williams") },
> > >         { service.printMapValues(bookSeries) }
> > > ]
> > >
> > > GParsExecutorsPool.withPool {
> > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
> > > //    tasks.eachParallel{ it() } // one of numerous alternatives
> > > }
> > >
> > > long executionTime = (System.nanoTime() - startTime) / 1000000
> > > println "\nTotal elapsed time is $executionTime\n\n"
> > >
> > >
> > > Cheers, Paul.
> > >
> > >
> > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <bo...@gmail.com> wrote:
> > > > I'd be interested to see where this leads to. It could end up as a
> > > > sort of Commons Parallel library. Besides providing an execution API,
> > > > there could be plenty of support utilities that tend to be found in
> > > > all the *Util(s)/*Helper classes in projects like all the ones I
> > > > mentioned earlier (basically all sorts of Hadoop-related projects and
> > > > other distributed systems here).
> > > >
> > > > Really, there are so many ways that such a project could head, I'd
> > > > like to hear more ideas on what to focus on.
> > > >
> > > > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
> > > >
> > > >> The upshot is that there has to be a way to do this with some custom
> > > >> code to at least have the ability to 'fast path' the code without
> > > >> reflection. Using lambdas should make this fairly syntactically
> > > >> unobtrusive.
> > > >>
> > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <strider90arun@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > Yes, reflection is not very performant but I don't think I have any
> > > >> > other choice since the library has to inspect the object supplied by
> > > >> > the client at runtime to pick out the methods to be invoked using
> > > >> > CompletableFuture. But the performance penalty paid for using
> > > >> > reflection will be more than offset by the savings of parallel method
> > > >> > execution, more so as the number of methods executed in parallel
> > > >> > increases.
> > > >> >
> > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <garydgregory@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> > > At a lower level, if you want to use this for lower-level services
> > > >> > > (where there is no network latency for example), you will need to
> > > >> > > avoid using reflection to get the best performance.
> > > >> > >
> > > >> > > Gary
> > > >> > >
> > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <strider90arun@gmail.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hi Gary,
> > > >> > > >
> > > >> > > > Thanks for your response. You have some valid and interesting
> > > >> > > > points :-)
> > > >> > > > Of course you are right that Spark is much more mature. Thanks for
> > > >> > > > your insight.
> > > >> > > > It will be interesting indeed to find out if the core
> > > >> > > > parallelization engine of Spark can be isolated like you suggest.
> > > >> > > >
> > > >> > > > I started working on this project because I felt that there was no
> > > >> > > > good library for parallelizing method calls which can be plugged in
> > > >> > > > easily into an existing java project. Ultimately, if such a solution
> > > >> > > > can be incorporated in the Apache Commons, it would be a useful
> > > >> > > > addition to the Commons repository.
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Arun
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <garydgregory@gmail.com>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > > > Hi Arun,
> > > >> > > > >
> > > >> > > > > Sure, and that is to be expected, Spark is more mature than a
> > > >> > > > > four-class prototype. What I am trying to get to is that in order
> > > >> > > > > for the library to be useful, you will end up with more in a first
> > > >> > > > > release, and after a couple more releases, there will be more and
> > > >> > > > > more. Would Spark not have in its guts the same kind of code you
> > > >> > > > > are proposing here? By extension, will you not end up with more
> > > >> > > > > framework-like (Spark-like) code and solutions as found in Spark?
> > > >> > > > > I am just playing devil's advocate here ;-)
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > What would be interesting would be to find out if there is a core
> > > >> > > > > part of Spark that is separable and extractable into a Commons
> > > >> > > > > component. Since Spark has a proven track record, it is more
> > > >> > > > > likely that such a library would be generally useful than one
> > > >> > > > > created from scratch that does not integrate with anything else.
> > > >> > > > > Again, please do not take any of this personally, I am just
> > > >> > > > > playing here :-)
> > > >> > > > >
> > > >> > > > > Gary
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <boards@gmail.com>
> > > >> > > > > wrote:
> > > >> > > > >
> > > >> > > > > > I already see a huge difference here: Spark requires a bunch of
> > > >> > > > > > infrastructure to be set up, while this library is just a
> > > >> > > > > > library. Similar to Kafka Streams versus Spark Streaming or Flink
> > > >> > > > > > or Storm or Samza or the others.
> > > >> > > > > >
> > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <garydgregory@gmail.com>
> > > >> > > > > > wrote:
> > > >> > > > > >
> > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan
> > > >> > > > > > > <strider90arun@gmail.com> wrote:
> > > >> > > > > > >
> > > >> > > > > > > > Hi All,
> > > >> > > > > > > >
> > > >> > > > > > > > Good afternoon.
> > > >> > > > > > > >
> > > >> > > > > > > > I have been working on a java generic parallel execution
> > > >> > > > > > > > library which will allow clients to execute methods in
> > > >> > > > > > > > parallel irrespective of the number of method arguments,
> > > >> > > > > > > > type of method arguments, return type of the method etc.
> > > >> > > > > > > >
> > > >> > > > > > > > Here is the link to the source code:
> > > >> > > > > > > > https://github.com/striderarun/parallel-execution-engine
> > > >> > > > > > > >
> > > >> > > > > > > > The project is in a nascent state and I am the only
> > > >> > > > > > > > contributor so far. I am new to the Apache community and I
> > > >> > > > > > > > would like to bring this project into Apache and improve,
> > > >> > > > > > > > expand and build a developer community around it.
> > > >> > > > > > > >
> > > >> > > > > > > > I think this project can be a sub project of Apache Commons
> > > >> > > > > > > > since it provides generic components for parallelizing any
> > > >> > > > > > > > kind of methods.
> > > >> > > > > > > >
> > > >> > > > > > > > Can somebody please guide me or suggest what other options
> > > >> > > > > > > > I can explore?
> > > >> > > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > Hi Arun,
> > > >> > > > > > >
> > > >> > > > > > > Thank you for your proposal.
> > > >> > > > > > >
> > > >> > > > > > > How would this be different from Apache Spark?
> > > >> > > > > > >
> > > >> > > > > > > Thank you,
> > > >> > > > > > > Gary
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > >
> > > >> > > > > > > > Thanks,
> > > >> > > > > > > > Arun
> > > >> > > > > > > >
> > > >> > > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > > --
> > > >> > > > > > Matt Sicker <bo...@gmail.com>
> > > >> > > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Matt Sicker <bo...@gmail.com>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > For additional commands, e-mail: dev-help@commons.apache.org
> > >
> > >
> >
> >
> > --
> > Matt Sicker <bo...@gmail.com>
> >
>

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
Hi All,

Further building on the code generation approach, I have committed some
changes which will now allow clients to specify methods as if calling them
directly. The client API is highly simplified now and is almost equivalent
to direct invocation. Argument types and return types are all automatically
inferred now thanks to code generation.

Also, now the compiler will enforce that clients cannot supply wrong
argument values when building the signature.

E.g., for parallelizing methods, the client API will now look like:

Signature.build(StudentService_.getStudentMarks(1L));
Signature.build(StudentService_.findStudentIdByName("Kate", "Williams"));
Signature.build(StudentService_.findStudent("bob@gmail.com", 14, false));
Signature.build(StudentService_.printMapValues(bookSeries));

Thanks for the ideas everyone! Would be interesting to see what else could
be achieved here.
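
For readers unfamiliar with the method handles mentioned in the quoted
message below, here is a minimal sketch using only the JDK. It is an
illustration under assumptions, not the project's generated code: the class
MethodHandleSketch, the helper names marksHandle and callAsync, and the
nested StudentService with its fixed return value are all hypothetical. A
generated companion class could plausibly hold a lookup like this, so that
clients never type the method name themselves.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

class MethodHandleSketch {

    // Hypothetical stand-in for the StudentService used in this thread.
    public static class StudentService {
        public Integer getStudentMarks(Long id) { return 90; }
    }

    /** Looks up getStudentMarks(Long) and binds it to one service instance. */
    static MethodHandle marksHandle(StudentService service) throws ReflectiveOperationException {
        MethodHandle handle = MethodHandles.lookup().findVirtual(
                StudentService.class,
                "getStudentMarks",
                MethodType.methodType(Integer.class, Long.class)); // (Long) -> Integer
        return handle.bindTo(service);
    }

    /** Invokes an already bound handle on the common pool. */
    static CompletableFuture<Object> callAsync(MethodHandle bound, Object... args) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return bound.invokeWithArguments(args);
            } catch (Throwable t) {
                // invokeWithArguments declares Throwable, so wrap it for the future.
                throw new CompletionException(t);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        MethodHandle marks = marksHandle(new StudentService());
        System.out.println(callAsync(marks, 1L).join());
    }
}
```

The lookup checks the method's type once, when the handle is created, which
is the part a code generator can do ahead of time.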


On Wed, Jun 28, 2017 at 6:00 PM, Arun Mohan <st...@gmail.com> wrote:

> Hi All,
>
> I found some time recently to work on the suggestions and ideas that came
> up while discussing this.
>
> Specifically, I reworked two major points that were called out -
>
> 1. Removed usage of Reflection API.
> Replaced reflection with MethodHandles, introduced in Java 7, which provide
> a typed, directly executable reference to an underlying method on an object.
>
> 2. Annotation based code generation for providing the clients type safety
> while building the method signatures to be parallelized. No more hardcoded
> method strings.
>
> https://github.com/striderarun/parallel-execution-engine
>
>
>
> On Wed, Jun 14, 2017 at 12:41 PM, Arun Mohan <st...@gmail.com>
> wrote:
>
>> Thanks for the tip Gary. Will give it a try.
>>
>> On Wed, Jun 14, 2017 at 12:13 PM, Gary Gregory <ga...@gmail.com>
>> wrote:
>>
>>> Briefly: If you are considering code generation, then you can do away
>>> with using reflection.
>>>
>>> G
>>>
>>> On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
>>> wrote:
>>>
>>> > I was exploring ways on how to substitute the typing of method names in
>>> > the api with something that's more clean and maintainable.
>>> > Using annotations, how can I provide clients the ability to specify which
>>> > method needs to be specified? Any ideas? Sort of stuck on this now.
>>> >
>>> > Right now I am thinking of something similar to the Hibernate JPA
>>> > Metamodel Generator, where a new class will be generated via byte code
>>> > manipulation which will contain static string variables corresponding to
>>> > all annotated method names. Then the client can refer to the String
>>> > variables in the generated class instead of typing the method names.
>>> >
>>> > Also, I don't have much experience playing with ASM or Javassist. As it
>>> > currently stands, is this project a good fit for further exploration in
>>> > the Sandbox? I would like to see if there are interested folks with
>>> > experience in byte code manipulation who can contribute to this.
>>> >
>>> > On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
>>> > wrote:
>>> >
>>> > > I was checking out how the library would plug into Spring and other
>>> > > frameworks. I created a sample Spring project with a couple of auto
>>> > > wired service classes. To fetch and combine data from multiple service
>>> > > classes in parallel, the Spring injected service dependencies are passed
>>> > > to the library.
>>> > >
>>> > > Since the library is framework agnostic, it deals with the spring
>>> > > injected dependency as a normal object.
>>> > >
>>> > > You can see it here:
>>> > > https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java
>>> > >
>>> > > I think the idea here is that clients can parallelize method calls
>>> > > irrespective of whether they are part of Spring beans or implemented as
>>> > > part of any other framework. Clients don't have to modify or wrap their
>>> > > methods into an ExecutorService, Runnable or any other low level apis to
>>> > > do so. Methods can be submitted as-is to the library.
>>> > >
>>> > > The library can serve as a higher level abstraction that completely
>>> > > hides concurrency apis from the client.
>>> > >
>>> > >
>>> > > On Mon, Jun 12, 2017 at 7:38 PM, Matt Sicker <bo...@gmail.com> wrote:
>>> > >
>>> > >> There are also some interesting execution APIs available in the Scala
>>> > >> standard library. Those are built on top of ForkJoinPool and such
>>> > >> nowadays, but the idea is there for a nicer API on top of
>>> > >> ExecutorService and other low level details.
>>> > >>
>>> > >> In the interests of concurrency, there are other thread-like models that
>>> > >> can be explored. For example: http://docs.paralleluniverse.co/quasar/
>>> > >>
>>> > >> On 12 June 2017 at 21:22, Bruno P. Kinoshita <
>>> > >> brunodepaulak@yahoo.com.br.invalid> wrote:
>>> > >>
>>> > >> > Interesting idea. And great discussion. Can't really say I'd have a
>>> > >> > use case for that right now, so abstaining from the discussion around
>>> > >> > the implementation.
>>> > >> >
>>> > >> > I believe if we decide to explore this idea in Commons, we will
>>> > >> > probably move it to sandbox? Even if we do not move that to Commons or
>>> > >> > to sandbox, I intend to find some time in the next days to try Apache
>>> > >> > Commons Javaflow with this library.
>>> > >> >
>>> > >> > Jenkins implemented pipelines + continuations with code that, when it
>>> > >> > started, looked a lot like Javaflow. The execution in parallel is taken
>>> > >> > care of in some internal modules in Jenkins, but I would like to see if
>>> > >> > a simpler implementation like this one would work.
>>> > >> >
>>> > >> > Ideally, this utility would execute in parallel, say, 20 tasks each
>>> > >> > taking 5 minutes (haven't looked if it supports fork/join). Then I
>>> > >> > would be able to have checkpoints during the execution and if the
>>> > >> > whole workflow fails, I would be able to restart it from the last
>>> > >> > checkpoint.
>>> > >> >
>>> > >> >
>>> > >> > I use Java7+ concurrent classes when I need to execute tasks in
>>> > >> > parallel (though I'm adding a flag to Paul King's message in this
>>> > >> > thread to give GPars a try too!), but I am unaware of any way to have
>>> > >> > persistentable (?) continuation workflows as in Jenkins, but with
>>> > >> > simple Java code.
>>> > >> >
>>> > >> > Cheers
>>> > >> > Bruno
>>> > >> >
>>> > >> > ________________________________
>>> > >> > From: Gary Gregory <ga...@gmail.com>
>>> > >> > To: Commons Developers List <de...@commons.apache.org>
>>> > >> > Sent: Tuesday, 13 June 2017 2:08 PM
>>> > >> > Subject: Re: Commons sub project for parallel method execution
>>> > >> >
>>> > >> >
>>> > >> >
>>> > >> > On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <bo...@gmail.com> wrote:
>>> > >> >
>>> > >> > > So wouldn't something like ASM or Javassist or one of the zillion
>>> > >> > > other bytecode libraries be a better alternative to using reflection
>>> > >> > > for performance? Also, using the Java 7 reflections API improvements
>>> > >> > > helps speed things up quite a bit.
>>> > >> > >
>>> > >> >
>>> > >> > IMO, unless you are doing scripting, reflection should be used as a
>>> > >> > workaround, but that's just me. For example, like we do in Commons
>>> > >> > IO's Java7Support class.
>>> > >> >
>>> > >> > But I digress ;-)
>>> > >> >
>>> > >> > This is clearly an interesting topic. My concern is that there is a
>>> > >> > LOT of code out there that does stuff like this at the low and high
>>> > >> > level from the JRE's fork/join to Apache Spark and so on as I've
>>> > >> > stated.
>>> > >> >
>>> > >> > IMO something new would have to be both unique and since this is
>>> > >> > Commons, potentially pluggable into other frameworks.
>>> > >> >
>>> > >> > Gary
>>> > >> >
>>> > >> >
>>> > >> >
>>> > >> > > On 12 June 2017 at 20:37, Paul King <pa...@gmail.com>
>>> > >> wrote:
>>> > >> > >
>>> > >> > > > My goto library for such tasks would be GPars. It has both
>>> Java
>>> > and
>>> > >> > > > Groovy support for most things (actors/dataflow) but less so
>>> for
>>> > >> > > > asynchronous task execution. It's one of the things that
>>> would be
>>> > >> good
>>> > >> > > > to explore in light of Java 8. Groovy is now Apache, GPars
>>> not at
>>> > >> this
>>> > >> > > > stage.
>>> > >> > > >
>>> > >> > > > So with adding two jars (GPars + Groovy), you can use Groovy
>>> like
>>> > >> this:
>>> > >> > > >
>>> > >> > > > @Grab('org.codehaus.gpars:gpars:1.2.1')
>>> > >> > > > import com.arun.student.StudentService
>>> > >> > > > import groovyx.gpars.GParsExecutorsPool
>>> > >> > > >
>>> > >> > > > long startTime = System.nanoTime()
>>> > >> > > > def service = new StudentService()
>>> > >> > > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of
>>> Time":
>>> > 14,
>>> > >> > > > "Harry Potter": 7]
>>> > >> > > >
>>> > >> > > > def tasks = [
>>> > >> > > >         { println service.findStudent("john@gmail.com", 11,
>>> > false)
>>> > >> },
>>> > >> > > >         { println service.getStudentMarks(1L) },
>>> > >> > > >         { println service.getStudentsByFirstNames(["
>>> > John","Alice"])
>>> > >> },
>>> > >> > > >         { println service.getRandomLastName() },
>>> > >> > > >         { println service.findStudentIdByName("Kate",
>>> "Williams")
>>> > >> },
>>> > >> > > >         { service.printMapValues(bookSeries) }
>>> > >> > > > ]
>>> > >> > > >
>>> > >> > > > GParsExecutorsPool.withPool {
>>> > >> > > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
>>> > >> > > > //    tasks.eachParallel{ it() } // one of numerous
>>> alternatives
>>> > >> > > > }
>>> > >> > > >
>>> > >> > > > long executionTime = (System.nanoTime() - startTime) / 1000000
>>> > >> > > > println "\nTotal elapsed time is $executionTime\n\n"
>>> > >> > > >
>>> > >> > > >
>>> > >> > > > Cheers, Paul.
>>> > >> > > >
>>> > >> > > >
>>> > >> > > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <
>>> boards@gmail.com>
>>> > >> wrote:
>>> > >> > > > > I'd be interested to see where this leads to. It could end
>>> up
>>> > as a
>>> > >> > sort
>>> > >> > > > of
>>> > >> > > > > Commons Parallel library. Besides providing an execution
>>> API,
>>> > >> there
>>> > >> > > could
>>> > >> > > > > be plenty of support utilities that tend to be found in all
>>> the
>>> > >> > > > > *Util(s)/*Helper classes in projects like all the ones I
>>> > mentioned
>>> > >> > > > earlier
>>> > >> > > > > (basically all sorts of Hadoop-related projects and other
>>> > >> distributed
>>> > >> > > > > systems here).
>>> > >> > > > >
>>> > >> > > > > Really, there's so many ways that such a project could
>>> head, I'd
>>> > >> like
>>> > >> > > to
>>> > >> > > > > hear more ideas on what to focus on.
>>> > >> > > > >
>>> > >> > > > > On 12 June 2017 at 18:19, Gary Gregory <
>>> garydgregory@gmail.com>
>>> > >> > wrote:
>>> > >> > > > >
>>> > >> > > > >> The upshot is that there has to be a way to do this with
>>> some
>>> > >> custom
>>> > >> > > > code
>>> > >> > > > >> to at least have the ability to 'fast path' the code
>>> without
>>> > >> > > reflection.
>>> > >> > > > >> Using lambdas should make this fairly syntactically
>>> > unobtrusive.
>>> > >> > > > >>
>>> > >> > > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <
>>> > >> > strider90arun@gmail.com>
>>> > >> > > > >> wrote:
>>> > >> > > > >>
>>> > >> > > > >> > Yes, reflection is not very performant but I don't think
>>> I
>>> > have
>>> > >> > any
>>> > >> > > > other
>>> > >> > > > >> > choice since the library has to inspect the object
>>> supplied
>>> > by
>>> > >> the
>>> > >> > > > client
>>> > >> > > > >> > at runtime to pick out the methods to be invoked using
>>> > >> > > > CompletableFuture.
>>> > >> > > > >> > But the performance penalty paid for using reflection
>>> will be
>>> > >> more
>>> > >> > > > than
>>> > >> > > > >> > offset by the savings of parallel method execution, more
>>> so
>>> > as
>>> > >> the
>>> > >> > > no
>>> > >> > > > of
>>> > >> > > > >> > methods executed in parallel increases.
>>> > >> > > > >> >
>>> > >> > > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
>>> > >> > > garydgregory@gmail.com
>>> > >> > > > >
>>> > >> > > > >> > wrote:
>>> > >> > > > >> >
>>> > >> > > > >> > > On a lower-level, if you want to use this for
>>> lower-level
>>> > >> > services
>>> > >> > > > >> (where
>>> > >> > > > >> > > there is no network latency for example), you will
>>> need to
>>> > >> avoid
>>> > >> > > > using
>>> > >> > > > >> > > reflection to get the best performance.
>>> > >> > > > >> > >
>>> > >> > > > >> > > Gary
>>> > >> > > > >> > >
>>> > >> > > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
>>> > >> > > > strider90arun@gmail.com>
>>> > >> > > > >> > > wrote:
>>> > >> > > > >> > >
>>> > >> > > > >> > > > Hi Gary,
>>> > >> > > > >> > > >
>>> > >> > > > >> > > > Thanks for your response. You have some valid and
>>> > >> interesting
>>> > >> > > > points
>>> > >> > > > >> > :-)
>>> > >> > > > >> > > > Of course you are right that Spark is much more
>>> mature.
>>> > >> Thanks
>>> > >> > > for
>>> > >> > > > >> your
>>> > >> > > > >> > > > insight.
>>> > >> > > > >> > > > It will be interesting indeed to find out if the core
>>> > >> > > > parallelization
>>> > >> > > > >> > > > engine of Spark can be isolated like you suggest.
>>> > >> > > > >> > > >
>>> > >> > > > >> > > > I started working on this project because I felt that
>>> > there
>>> > >> > was
>>> > >> > > no
>>> > >> > > > >> good
>>> > >> > > > >> > > > library for parallelizing method calls which can be
>>> > >> plugged in
>>> > >> > > > easily
>>> > >> > > > >> > > into
>>> > >> > > > >> > > > an existing java project. Ultimately, if such a
>>> solution
>>> > >> can
>>> > >> > be
>>> > >> > > > >> > > > incorporated in the Apache Commons, it would be a
>>> useful
>>> > >> > > addition
>>> > >> > > > to
>>> > >> > > > >> > the
>>> > >> > > > >> > > > Commons repository.
>>> > >> > > > >> > > >
>>> > >> > > > >> > > > Thanks,
>>> > >> > > > >> > > > Arun
>>> > >> > > > >> > > >
>>> > >> > > > >> > > >
>>> > >> > > > >> > > >
>>> > >> > > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
>>> > >> > > > >> garydgregory@gmail.com>
>>> > >> > > > >> > > > wrote:
>>> > >> > > > >> > > >
>>> > >> > > > >> > > > > Hi Arun,
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > > Sure, and that is to be expected, Spark is more
>>> mature
>>> > >> than
>>> > >> > a
>>> > >> > > > four
>>> > >> > > > >> > > class
>>> > >> > > > >> > > > > prototype. What I am trying to get to is that in
>>> order
>>> > >> for
>>> > >> > the
>>> > >> > > > >> > library
>>> > >> > > > >> > > to
>>> > >> > > > >> > > > > be useful, you will end up with more in a first
>>> > release,
>>> > >> and
>>> > >> > > > after
>>> > >> > > > >> a
>>> > >> > > > >> > > > couple
>>> > >> > > > >> > > > > more releases, there will be more and more. Would
>>> Spark
>>> > >> not
>>> > >> > > > have in
>>> > >> > > > >> > its
>>> > >> > > > >> > > > > guts the same kind of code your are proposing
>>> here? By
>>> > >> > > > extension,
>>> > >> > > > >> > will
>>> > >> > > > >> > > > you
>>> > >> > > > >> > > > > not end up with more framework-like (Spark-like)
>>> code
>>> > and
>>> > >> > > > solutions
>>> > >> > > > >> > as
>>> > >> > > > >> > > > > found in Spark? I am just playing devil's advocate
>>> here
>>> > >> ;-)
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > > What would be interesting would be to find out if
>>> there
>>> > >> is a
>>> > >> > > > core
>>> > >> > > > >> > part
>>> > >> > > > >> > > of
>>> > >> > > > >> > > > > Spark that is separable and ex tractable into a
>>> Commons
>>> > >> > > > component.
>>> > >> > > > >> > > Since
>>> > >> > > > >> > > > > Spark has a proven track record, it is more likely,
>>> > that
>>> > >> > such
>>> > >> > > a
>>> > >> > > > >> > library
>>> > >> > > > >> > > > > would be generally useful than one created from
>>> scratch
>>> > >> that
>>> > >> > > > does
>>> > >> > > > >> not
>>> > >> > > > >> > > > > integrate with anything else. Again, please do not
>>> take
>>> > >> any
>>> > >> > of
>>> > >> > > > this
>>> > >> > > > >> > > > > personally, I am just playing here :-)
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > > Gary
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
>>> > >> > > boards@gmail.com>
>>> > >> > > > >> > wrote:
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > > > I already see a huge difference here: Spark
>>> requires
>>> > a
>>> > >> > bunch
>>> > >> > > > of
>>> > >> > > > >> > > > > > infrastructure to be set up, while this library
>>> is
>>> > >> just a
>>> > >> > > > >> library.
>>> > >> > > > >> > > > > Similar
>>> > >> > > > >> > > > > > to Kafka Streams versus Spark Streaming or Flink
>>> or
>>> > >> Storm
>>> > >> > or
>>> > >> > > > >> Samza
>>> > >> > > > >> > or
>>> > >> > > > >> > > > the
>>> > >> > > > >> > > > > > others.
>>> > >> > > > >> > > > > >
>>> > >> > > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
>>> > >> > > > garydgregory@gmail.com>
>>> > >> > > > >> > > wrote:
>>> > >> > > > >> > > > > >
>>> > >> > > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
>>> > >> > > > >> > > strider90arun@gmail.com
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > > > > > wrote:
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > > > > Hi All,
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > > > Good afternoon.
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > > > I have been working on a java generic
>>> parallel
>>> > >> > execution
>>> > >> > > > >> > library
>>> > >> > > > >> > > > > which
>>> > >> > > > >> > > > > > > will
>>> > >> > > > >> > > > > > > > allow clients to execute methods in parallel
>>> > >> > > irrespective
>>> > >> > > > of
>>> > >> > > > >> > the
>>> > >> > > > >> > > > > number
>>> > >> > > > >> > > > > > > of
>>> > >> > > > >> > > > > > > > method arguments, type of method arguments,
>>> > return
>>> > >> > type
>>> > >> > > of
>>> > >> > > > >> the
>>> > >> > > > >> > > > method
>>> > >> > > > >> > > > > > > etc.
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > > > Here is the link to the source code:
>>> > >> > > > >> > > > > > > > https://github.com/striderarun/parallel-
>>> > >> > > execution-engine
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > > > The project is in a nascent state and I am
>>> the
>>> > only
>>> > >> > > > >> contributor
>>> > >> > > > >> > > so
>>> > >> > > > >> > > > > > far. I
>>> > >> > > > >> > > > > > > > am new to the Apache community and I would
>>> like
>>> > to
>>> > >> > bring
>>> > >> > > > this
>>> > >> > > > >> > > > project
>>> > >> > > > >> > > > > > > into
>>> > >> > > > >> > > > > > > > Apache and improve, expand and build a
>>> developer
>>> > >> > > community
>>> > >> > > > >> > around
>>> > >> > > > >> > > > it.
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > > > I think this project can be a sub project of
>>> > Apache
>>> > >> > > > Commons
>>> > >> > > > >> > since
>>> > >> > > > >> > > > it
>>> > >> > > > >> > > > > > > > provides generic components for
>>> parallelizing any
>>> > >> kind
>>> > >> > > of
>>> > >> > > > >> > > methods.
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > > > Can somebody please guide me or suggest what
>>> > other
>>> > >> > > > options I
>>> > >> > > > >> > can
>>> > >> > > > >> > > > > > explore
>>> > >> > > > >> > > > > > > ?
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > > > Hi Arun,
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > > > Thank you for your proposal.
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > > > How would this be different from Apache Spark?
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > > > Thank you,
>>> > >> > > > >> > > > > > > Gary
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > > > Thanks,
>>> > >> > > > >> > > > > > > > Arun
>>> > >> > > > >> > > > > > > >
>>> > >> > > > >> > > > > > >
>>> > >> > > > >> > > > > >
>>> > >> > > > >> > > > > >
>>> > >> > > > >> > > > > >
>>> > >> > > > >> > > > > > --
>>> > >> > > > >> > > > > > Matt Sicker <bo...@gmail.com>
>>> > >> > > > >> > > > > >
>>> > >> > > > >> > > > >
>>> > >> > > > >> > > >
>>> > >> > > > >> > >
>>> > >> > > > >> >
>>> > >> > > > >>
>>> > >> > > > >
>>> > >> > > > >
>>> > >> > > > >
>>> > >> > > > > --
>>> > >> > > > > Matt Sicker <bo...@gmail.com>
>>> > >> > > >
>>> > >> > > > ------------------------------------------------------------
>>> > >> ---------
>>> > >> > > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> > >> > > > For additional commands, e-mail: dev-help@commons.apache.org
>>> > >> > > >
>>> > >> > > >
>>> > >> > >
>>> > >> > >
>>> > >> > > --
>>> > >> > > Matt Sicker <bo...@gmail.com>
>>> > >> > >
>>> > >> >
>>> > >> >
>>> > >> >
>>> > >>
>>> > >>
>>> > >> --
>>> > >> Matt Sicker <bo...@gmail.com>
>>> > >>
>>> > >
>>> > >
>>> >
>>>
>>
>>
>

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
Hi All,

I recently found some time to work on the suggestions and ideas that came
up during this discussion.

Specifically, I reworked the two major points that were called out:

1. Removed usage of the Reflection API.
Replaced reflection with MethodHandles, introduced in Java 7, which provide a
typed, directly executable reference to an underlying method on an object.

2. Added annotation-based code generation to give clients type safety
while building the method signatures to be parallelized. No more hardcoded
method name strings.

https://github.com/striderarun/parallel-execution-engine
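
As a rough illustration of the two changes listed above (a sketch only, not the
library's actual API): the code below resolves methods through MethodHandles
rather than java.lang.reflect and runs them in parallel with CompletableFuture.
GreetingService, GreetingServiceMethods and MethodHandleParallelDemo are
hypothetical names. GreetingServiceMethods is written by hand here and merely
stands in for a constants class that annotation-based code generation might emit.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.stream.Collectors;

// Hypothetical service, used only for illustration.
class GreetingService {
    public String greet(String name) { return "Hello, " + name; }
    public int add(int a, int b) { return a + b; }
}

// Assumption: stand-in for a generated class holding method names as constants,
// so callers never type method names as string literals.
final class GreetingServiceMethods {
    static final String GREET = "greet";
    static final String ADD = "add";
    private GreetingServiceMethods() {}
}

final class MethodHandleParallelDemo {

    // Invokes a bound method handle asynchronously on the default async executor.
    static CompletableFuture<Object> callAsync(MethodHandle handle, Object... args) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return handle.invokeWithArguments(args);
            } catch (Throwable t) {
                // MethodHandle invocation declares Throwable; wrap it for the future.
                throw new CompletionException(t);
            }
        });
    }

    static List<Object> runAll() throws ReflectiveOperationException {
        GreetingService service = new GreetingService();
        MethodHandles.Lookup lookup = MethodHandles.lookup();

        // The MethodType states return and parameter types explicitly, so a
        // signature mismatch fails at lookup time rather than at call time.
        MethodHandle greet = lookup.findVirtual(GreetingService.class,
                GreetingServiceMethods.GREET,
                MethodType.methodType(String.class, String.class)).bindTo(service);
        MethodHandle add = lookup.findVirtual(GreetingService.class,
                GreetingServiceMethods.ADD,
                MethodType.methodType(int.class, int.class, int.class)).bindTo(service);

        // Start both calls first, then wait for the results in submission order.
        List<CompletableFuture<Object>> futures =
                Arrays.asList(callAsync(greet, "Arun"), callAsync(add, 2, 3));
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll()); // prints [Hello, Arun, 5]
    }
}
```

Binding the handle to the service instance with bindTo keeps the call site free
of the receiver, which is one plausible way to pass "an object plus a method"
around as a single value.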



On Wed, Jun 14, 2017 at 12:41 PM, Arun Mohan <st...@gmail.com>
wrote:

> Thanks for the tip Gary. Will give it a try.
>
> On Wed, Jun 14, 2017 at 12:13 PM, Gary Gregory <ga...@gmail.com>
> wrote:
>
>> Briefly: If you are considering code generation, then you can do away with
>> using reflection.
>>
>> G
>>
>> On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
>> wrote:
>>
>> > I was exploring ways to replace the hand-typed method names in the API
>> > with something that's cleaner and more maintainable.
>> > Using annotations, how can I let clients specify which method is to be
>> > parallelized? Any ideas? Sort of stuck on this now.
>> >
>> > Right now I am thinking of something similar to the Hibernate JPA Metamodel
>> > Generator, where a new class will be generated via bytecode manipulation
>> > which will contain static String variables corresponding to all annotated
>> > method names. Then the client can refer to the String variables in the
>> > generated class instead of typing the method names.
>> >
>> > Also, I don't have much experience with ASM or Javassist. As it
>> > currently stands, is this project a good fit for further exploration in the
>> > Sandbox? I would like to see if there are interested folks with experience
>> > in bytecode manipulation who can contribute to this.
>> >
>> > On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
>> > wrote:
>> >
>> > > I was checking out how the library would plug into Spring and other
>> > > frameworks. I created a sample Spring project with a couple of auto
>> wired
>> > > service classes. To fetch and combine data from multiple service
>> classes
>> > in
>> > > parallel, the Spring injected service dependencies are passed to the
>> > > library.
>> > >
>> > > Since the library is framework agnostic, it deals with the spring
>> > injected
>> > > dependency as a normal object.
>> > >
>> > > You can see it here: https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java
>> > >
>> > > I think the idea here is that clients can parallelize method calls
>> > > irrespective of whether they are part of Spring beans or implemented
>> as
>> > > part of any other framework. Clients don't have to modify or wrap
>> their
>> > > methods into an ExecutorService, Runnable or any other low level apis
>> to
>> > do
>> > > so. Methods can be submitted as-is to the library.
>> > >
>> > > The library can serve as a higher level abstraction that completely
>> hides
>> > > concurrency apis from the client.
>> > >
>> > >
>> > > On Mon, Jun 12, 2017 at 7:38 PM, Matt Sicker <bo...@gmail.com>
>> wrote:
>> > >
>> > >> There's also some interesting execution APIs available in the Scala
>> > >> standard library. Those are built on top of ForkJoinPool and such
>> > >> nowadays,
>> > >> but the idea is there for a nicer API on top of ExecutorService and
>> > other
>> > >> low level details.
>> > >>
>> > >> In the interests of concurrency, there are other thread-like models that
>> > >> can be explored. For example: http://docs.paralleluniverse.co/quasar/
>> > >>
>> > >> On 12 June 2017 at 21:22, Bruno P. Kinoshita <
>> > >> brunodepaulak@yahoo.com.br.invalid> wrote:
>> > >>
>> > >> > Interesting idea. And great discussion. Can't really say I'd have a
>> > use
>> > >> > case for that right now, so abstaining from the discussion around
>> the
>> > >> > implementation.
>> > >> >
>> > >> > I believe if we decide to explore this idea in Commons, we will
>> > probably
>> > >> > move it to sandbox? Even if we do not move that to Commons or to
>> > >> sandbox, I
>> > >> > intend to find some time in the next days to try Apache Commons
>> > Javaflow
>> > >> > with this library.
>> > >> >
>> > >> > Jenkins implemented pipelines + continuations with code that, when it
>> > >> > started, looked a lot like Javaflow. The parallel execution is taken care
>> > >> > of in some internal modules in Jenkins, but I would like to see whether a
>> > >> > simpler implementation like this one would work.
>> > >> >
>> > >> > Ideally, this utility would execute in parallel, say, 20 tasks each
>> > >> taking
>> > >> > 5 minutes (haven't looked if it supports fork/join). Then I would
>> be
>> > >> able
>> > >> > to have checkpoints during the execution and if the whole workflow
>> > >> fails, I
>> > >> > would be able to restart it from the last checkpoint.
>> > >> >
>> > >> >
>> > >> > I use Java7+ concurrent classes when I need to execute tasks in
>> > parallel
>> > >> > (though I'm adding a flag to Paul King's message in this thread to
>> > give
>> > >> > GPars a try too!), but I am unaware of any way to have
>> persistentable
>> > >> (?)
>> > >> > continuation workflows as in Jenkins, but with simple Java code.
>> > >> >
>> > >> > Cheers
>> > >> > Bruno
>> > >> >
>> > >> > ________________________________
>> > >> > From: Gary Gregory <ga...@gmail.com>
>> > >> > To: Commons Developers List <de...@commons.apache.org>
>> > >> > Sent: Tuesday, 13 June 2017 2:08 PM
>> > >> > Subject: Re: Commons sub project for parallel method execution
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <bo...@gmail.com>
>> > wrote:
>> > >> >
>> > >> > > So wouldn't something like ASM or Javassist or one of the zillion
>> > >> other
>> > >> > > bytecode libraries be a better alternative to using reflection
>> for
>> > >> > > performance? Also, using the Java 7 reflections API improvements
>> > helps
>> > >> > > speed things up quite a bit.
>> > >> > >
>> > >> >
>> > >> > IMO, unless you are doing scripting, reflection should be used as a
>> > >> > workaround, but that's just me. For example, like we do in Commons IO's
>> > >> > Java7Support class.
>> > >> >
>> > >> > But I digress ;-)
>> > >> >
>> > >> > This is clearly an interesting topic. My concern is that there is a
>> > LOT
>> > >> of
>> > >> > code out there that does stuff like this at the low and high level
>> > from
>> > >> the
>> > >> > JRE's fork/join to Apache Spark and so on as I've stated.
>> > >> >
>> > >> > IMO something new would have to be both unique and since this is
>> > >> Commons,
>> > >> > potentially pluggable into other frameworks.
>> > >> >
>> > >> > Gary
>> > >> >
>> > >> >
>> > >> >
>> > >> > > On 12 June 2017 at 20:37, Paul King <pa...@gmail.com>
>> > >> wrote:
>> > >> > >
>> > >> > > > My goto library for such tasks would be GPars. It has both Java
>> > and
>> > >> > > > Groovy support for most things (actors/dataflow) but less so
>> for
>> > >> > > > asynchronous task execution. It's one of the things that would
>> be
>> > >> good
>> > >> > > > to explore in light of Java 8. Groovy is now Apache, GPars not
>> at
>> > >> this
>> > >> > > > stage.
>> > >> > > >
>> > >> > > > So with adding two jars (GPars + Groovy), you can use Groovy
>> like
>> > >> this:
>> > >> > > >
>> > >> > > > @Grab('org.codehaus.gpars:gpars:1.2.1')
>> > >> > > > import com.arun.student.StudentService
>> > >> > > > import groovyx.gpars.GParsExecutorsPool
>> > >> > > >
>> > >> > > > long startTime = System.nanoTime()
>> > >> > > > def service = new StudentService()
>> > >> > > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14, "Harry Potter": 7]
>> > >> > > >
>> > >> > > > def tasks = [
>> > >> > > >         { println service.findStudent("john@gmail.com", 11, false) },
>> > >> > > >         { println service.getStudentMarks(1L) },
>> > >> > > >         { println service.getStudentsByFirstNames(["John","Alice"]) },
>> > >> > > >         { println service.getRandomLastName() },
>> > >> > > >         { println service.findStudentIdByName("Kate", "Williams") },
>> > >> > > >         { service.printMapValues(bookSeries) }
>> > >> > > > ]
>> > >> > > >
>> > >> > > > GParsExecutorsPool.withPool {
>> > >> > > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
>> > >> > > > //    tasks.eachParallel{ it() } // one of numerous alternatives
>> > >> > > > }
>> > >> > > >
>> > >> > > > long executionTime = (System.nanoTime() - startTime) / 1000000
>> > >> > > > println "\nTotal elapsed time is $executionTime\n\n"
>> > >> > > >
>> > >> > > >
>> > >> > > > Cheers, Paul.
>> > >> > > >
>> > >> > > >
>> > >> > > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <boards@gmail.com
>> >
>> > >> wrote:
>> > >> > > > > I'd be interested to see where this leads to. It could end up
>> > as a
>> > >> > sort
>> > >> > > > of
>> > >> > > > > Commons Parallel library. Besides providing an execution API,
>> > >> there
>> > >> > > could
>> > >> > > > > be plenty of support utilities that tend to be found in all
>> the
>> > >> > > > > *Util(s)/*Helper classes in projects like all the ones I
>> > mentioned
>> > >> > > > earlier
>> > >> > > > > (basically all sorts of Hadoop-related projects and other
>> > >> distributed
>> > >> > > > > systems here).
>> > >> > > > >
>> > >> > > > > Really, there's so many ways that such a project could head,
>> I'd
>> > >> like
>> > >> > > to
>> > >> > > > > hear more ideas on what to focus on.
>> > >> > > > >
>> > >> > > > > On 12 June 2017 at 18:19, Gary Gregory <
>> garydgregory@gmail.com>
>> > >> > wrote:
>> > >> > > > >
>> > >> > > > >> The upshot is that there has to be a way to do this with
>> some
>> > >> custom
>> > >> > > > code
>> > >> > > > >> to at least have the ability to 'fast path' the code without
>> > >> > > reflection.
>> > >> > > > >> Using lambdas should make this fairly syntactically
>> > unobtrusive.
>> > >> > > > >>
>> > >> > > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <
>> > >> > strider90arun@gmail.com>
>> > >> > > > >> wrote:
>> > >> > > > >>
>> > >> > > > >> > Yes, reflection is not very performant but I don't think I
>> > have
>> > >> > any
>> > >> > > > other
>> > >> > > > >> > choice since the library has to inspect the object
>> supplied
>> > by
>> > >> the
>> > >> > > > client
>> > >> > > > >> > at runtime to pick out the methods to be invoked using
>> > >> > > > CompletableFuture.
>> > >> > > > >> > But the performance penalty paid for using reflection
>> will be
>> > >> more
>> > >> > > > than
>> > >> > > > >> > offset by the savings of parallel method execution, more
>> so
>> > as
>> > >> the
>> > >> > > no
>> > >> > > > of
>> > >> > > > >> > methods executed in parallel increases.
>> > >> > > > >> >
>> > >> > > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
>> > >> > > garydgregory@gmail.com
>> > >> > > > >
>> > >> > > > >> > wrote:
>> > >> > > > >> >
>> > >> > > > >> > > On a lower-level, if you want to use this for
>> lower-level
>> > >> > services
>> > >> > > > >> (where
>> > >> > > > >> > > there is no network latency for example), you will need
>> to
>> > >> avoid
>> > >> > > > using
>> > >> > > > >> > > reflection to get the best performance.
>> > >> > > > >> > >
>> > >> > > > >> > > Gary
>> > >> > > > >> > >
>> > >> > > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
>> > >> > > > strider90arun@gmail.com>
>> > >> > > > >> > > wrote:
>> > >> > > > >> > >
>> > >> > > > >> > > > Hi Gary,
>> > >> > > > >> > > >
>> > >> > > > >> > > > Thanks for your response. You have some valid and
>> > >> interesting
>> > >> > > > points
>> > >> > > > >> > :-)
>> > >> > > > >> > > > Of course you are right that Spark is much more
>> mature.
>> > >> Thanks
>> > >> > > for
>> > >> > > > >> your
>> > >> > > > >> > > > insight.
>> > >> > > > >> > > > It will be interesting indeed to find out if the core
>> > >> > > > parallelization
>> > >> > > > >> > > > engine of Spark can be isolated like you suggest.
>> > >> > > > >> > > >
>> > >> > > > >> > > > I started working on this project because I felt that
>> > there
>> > >> > was
>> > >> > > no
>> > >> > > > >> good
>> > >> > > > >> > > > library for parallelizing method calls which can be
>> > >> plugged in
>> > >> > > > easily
>> > >> > > > >> > > into
>> > >> > > > >> > > > an existing java project. Ultimately, if such a
>> solution
>> > >> can
>> > >> > be
>> > >> > > > >> > > > incorporated in the Apache Commons, it would be a
>> useful
>> > >> > > addition
>> > >> > > > to
>> > >> > > > >> > the
>> > >> > > > >> > > > Commons repository.
>> > >> > > > >> > > >
>> > >> > > > >> > > > Thanks,
>> > >> > > > >> > > > Arun
>> > >> > > > >> > > >
>> > >> > > > >> > > >
>> > >> > > > >> > > >
>> > >> > > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
>> > >> > > > >> garydgregory@gmail.com>
>> > >> > > > >> > > > wrote:
>> > >> > > > >> > > >
>> > >> > > > >> > > > > Hi Arun,
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > Sure, and that is to be expected, Spark is more
>> mature
>> > >> than
>> > >> > a
>> > >> > > > four
>> > >> > > > >> > > class
>> > >> > > > >> > > > > prototype. What I am trying to get to is that in
>> order
>> > >> for
>> > >> > the
>> > >> > > > >> > library
>> > >> > > > >> > > to
>> > >> > > > >> > > > > be useful, you will end up with more in a first
>> > release,
>> > >> and
>> > >> > > > after
>> > >> > > > >> a
>> > >> > > > >> > > > couple
>> > >> > > > >> > > > > more releases, there will be more and more. Would
>> Spark
>> > >> not
>> > >> > > > have in
>> > >> > > > >> > its
>> > >> > > > >> > > > > guts the same kind of code you are proposing here?
>> By
>> > >> > > > extension,
>> > >> > > > >> > will
>> > >> > > > >> > > > you
>> > >> > > > >> > > > > not end up with more framework-like (Spark-like)
>> code
>> > and
>> > >> > > > solutions
>> > >> > > > >> > as
>> > >> > > > >> > > > > found in Spark? I am just playing devil's advocate
>> here
>> > >> ;-)
>> > >> > > > >> > > > >
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > What would be interesting would be to find out if
>> there
>> > >> is a
>> > >> > > > core
>> > >> > > > >> > part
>> > >> > > > >> > > of
>> > >> > > > >> > > > > Spark that is separable and extractable into a
>> Commons
>> > >> > > > component.
>> > >> > > > >> > > Since
>> > >> > > > >> > > > > Spark has a proven track record, it is more likely,
>> > that
>> > >> > such
>> > >> > > a
>> > >> > > > >> > library
>> > >> > > > >> > > > > would be generally useful than one created from
>> scratch
>> > >> that
>> > >> > > > does
>> > >> > > > >> not
>> > >> > > > >> > > > > integrate with anything else. Again, please do not
>> take
>> > >> any
>> > >> > of
>> > >> > > > this
>> > >> > > > >> > > > > personally, I am just playing here :-)
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > Gary
>> > >> > > > >> > > > >
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
>> > >> > > boards@gmail.com>
>> > >> > > > >> > wrote:
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > > I already see a huge difference here: Spark
>> requires
>> > a
>> > >> > bunch
>> > >> > > > of
>> > >> > > > >> > > > > > infrastructure to be set up, while this library is
>> > >> just a
>> > >> > > > >> library.
>> > >> > > > >> > > > > Similar
>> > >> > > > >> > > > > > to Kafka Streams versus Spark Streaming or Flink
>> or
>> > >> Storm
>> > >> > or
>> > >> > > > >> Samza
>> > >> > > > >> > or
>> > >> > > > >> > > > the
>> > >> > > > >> > > > > > others.
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
>> > >> > > > garydgregory@gmail.com>
>> > >> > > > >> > > wrote:
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
>> > >> > > > >> > > strider90arun@gmail.com
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > > > wrote:
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > > Hi All,
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Good afternoon.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > I have been working on a java generic parallel
>> > >> > execution
>> > >> > > > >> > library
>> > >> > > > >> > > > > which
>> > >> > > > >> > > > > > > will
>> > >> > > > >> > > > > > > > allow clients to execute methods in parallel
>> > >> > > irrespective
>> > >> > > > of
>> > >> > > > >> > the
>> > >> > > > >> > > > > number
>> > >> > > > >> > > > > > > of
>> > >> > > > >> > > > > > > > method arguments, type of method arguments,
>> > return
>> > >> > type
>> > >> > > of
>> > >> > > > >> the
>> > >> > > > >> > > > method
>> > >> > > > >> > > > > > > etc.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Here is the link to the source code:
>> > >> > > > >> > > > > > > > https://github.com/striderarun/parallel-execution-engine
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > The project is in a nascent state and I am the
>> > only
>> > >> > > > >> contributor
>> > >> > > > >> > > so
>> > >> > > > >> > > > > > far. I
>> > >> > > > >> > > > > > > > am new to the Apache community and I would
>> like
>> > to
>> > >> > bring
>> > >> > > > this
>> > >> > > > >> > > > project
>> > >> > > > >> > > > > > > into
>> > >> > > > >> > > > > > > > Apache and improve, expand and build a
>> developer
>> > >> > > community
>> > >> > > > >> > around
>> > >> > > > >> > > > it.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > I think this project can be a sub project of
>> > Apache
>> > >> > > > Commons
>> > >> > > > >> > since
>> > >> > > > >> > > > it
>> > >> > > > >> > > > > > > > provides generic components for parallelizing
>> any
>> > >> kind
>> > >> > > of
>> > >> > > > >> > > methods.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Can somebody please guide me or suggest what
>> > other
>> > >> > > > options I
>> > >> > > > >> > can
>> > >> > > > >> > > > > > explore
>> > >> > > > >> > > > > > > ?
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > Hi Arun,
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > Thank you for your proposal.
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > How would this be different from Apache Spark?
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > Thank you,
>> > >> > > > >> > > > > > > Gary
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Thanks,
>> > >> > > > >> > > > > > > > Arun
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > > --
>> > >> > > > >> > > > > > Matt Sicker <bo...@gmail.com>
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > >
>> > >> > > > >> > > >
>> > >> > > > >> > >
>> > >> > > > >> >
>> > >> > > > >>
>> > >> > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > > --
>> > >> > > > > Matt Sicker <bo...@gmail.com>
>> > >> > > >
>> > >> > > >
>> > >> > > >
>> > >> > >
>> > >> > >
>> > >> > > --
>> > >> > > Matt Sicker <bo...@gmail.com>
>> > >> > >
>> > >> >
>> > >> >
>> > >> >
>> > >>
>> > >>
>> > >> --
>> > >> Matt Sicker <bo...@gmail.com>
>> > >>
>> > >
>> > >
>> >
>>
>
>

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
Thanks for the tip Gary. Will give it a try.
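
For context, here is a minimal sketch of the reflection-free direction raised in
this thread: if callers hand over lambdas, the executing code needs neither
reflection nor method-name strings. LambdaParallelDemo and runInParallel are
hypothetical names chosen for illustration and are not part of the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical helper, not part of the project under discussion.
final class LambdaParallelDemo {

    // Starts every task on CompletableFuture's default async executor, then
    // collects the results in submission order. join() rethrows a task
    // failure wrapped in CompletionException.
    static List<Object> runInParallel(List<Supplier<?>> tasks) {
        List<CompletableFuture<Object>> futures = new ArrayList<>();
        for (Supplier<?> task : tasks) {
            futures.add(CompletableFuture.supplyAsync(() -> (Object) task.get()));
        }
        List<Object> results = new ArrayList<>();
        for (CompletableFuture<Object> future : futures) {
            results.add(future.join());
        }
        return results;
    }

    static List<Object> demo() {
        // Each lambda is an ordinary, compiler-checked method call with its
        // own argument list and return type.
        List<Supplier<?>> tasks = new ArrayList<>();
        tasks.add(() -> "abc".toUpperCase());
        tasks.add(() -> Math.max(3, 7));
        return runInParallel(tasks);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [ABC, 7]
    }
}
```

The trade-off is that the caller writes one lambda per call, but the compiler
checks every method name and argument type.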

On Wed, Jun 14, 2017 at 12:13 PM, Gary Gregory <ga...@gmail.com>
wrote:

> Briefly: If you are considering code generation, then you can do away with
> using reflection.
>
> G
>
> On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
> wrote:
>
> > I was exploring ways to replace the hand-typed method names in the API
> > with something that's cleaner and more maintainable.
> > Using annotations, how can I let clients specify which method is to be
> > parallelized? Any ideas? Sort of stuck on this now.
> >
> > Right now I am thinking of something similar to the Hibernate JPA Metamodel
> > Generator, where a new class will be generated via bytecode manipulation
> > which will contain static String variables corresponding to all annotated
> > method names. Then the client can refer to the String variables in the
> > generated class instead of typing the method names.
> >
> > Also, I don't have much experience with ASM or Javassist. As it
> > currently stands, is this project a good fit for further exploration in the
> > Sandbox? I would like to see if there are interested folks with experience
> > in bytecode manipulation who can contribute to this.
> >
> > On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
> > wrote:
> >
> > > I was checking out how the library would plug into Spring and other
> > > frameworks. I created a sample Spring project with a couple of auto
> wired
> > > service classes. To fetch and combine data from multiple service
> classes
> > in
> > > parallel, the Spring injected service dependencies are passed to the
> > > library.
> > >
> > > Since the library is framework agnostic, it deals with the spring
> > injected
> > > dependency as a normal object.
> > >
> > > > You can see it here: https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java
> > >
> > > I think the idea here is that clients can parallelize method calls
> > > irrespective of whether they are part of Spring beans or implemented as
> > > part of any other framework. Clients don't have to modify or wrap their
> > > methods into an ExecutorService, Runnable or any other low level apis
> to
> > do
> > > so. Methods can be submitted as-is to the library.
> > >
> > > The library can serve as a higher level abstraction that completely
> hides
> > > concurrency apis from the client.
> > >
> > >
> > > On Mon, Jun 12, 2017 at 7:38 PM, Matt Sicker <bo...@gmail.com> wrote:
> > >
> > >> There's also some interesting execution APIs available in the Scala
> > >> standard library. Those are built on top of ForkJoinPool and such
> > >> nowadays,
> > >> but the idea is there for a nicer API on top of ExecutorService and
> > other
> > >> low level details.
> > >>
> > >> In the interests of concurrency, there are other thread-like models
> that
> > >> can be explored. For example: http://docs.paralleluniverse.co/quasar/
> > >>
> > >> On 12 June 2017 at 21:22, Bruno P. Kinoshita <
> > >> brunodepaulak@yahoo.com.br.invalid> wrote:
> > >>
> > >> > Interesting idea. And great discussion. Can't really say I'd have a
> > use
> > >> > case for that right now, so abstaining from the discussion around
> the
> > >> > implementation.
> > >> >
> > >> > I believe if we decide to explore this idea in Commons, we will
> > probably
> > >> > move it to sandbox? Even if we do not move that to Commons or to
> > >> sandbox, I
> > >> > intend to find some time in the next days to try Apache Commons
> > Javaflow
> > >> > with this library.
> > >> >
> > >> > Jenkins implemented pipelines + continuations with code that, when it
> > >> > started, looked a lot like Javaflow. The parallel execution is taken care
> > >> > of in some internal modules in Jenkins, but I would like to see whether a
> > >> > simpler implementation like this one would work.
> > >> >
> > >> > Ideally, this utility would execute in parallel, say, 20 tasks each
> > >> taking
> > >> > 5 minutes (haven't looked if it supports fork/join). Then I would be
> > >> able
> > >> > to have checkpoints during the execution and if the whole workflow
> > >> fails, I
> > >> > would be able to restart it from the last checkpoint.
> > >> >
> > >> >
> > >> > I use Java7+ concurrent classes when I need to execute tasks in
> > parallel
> > >> > (though I'm adding a flag to Paul King's message in this thread to
> > give
> > >> > GPars a try too!), but I am unaware of any way to have
> persistentable
> > >> (?)
> > >> > continuation workflows as in Jenkins, but with simple Java code.
> > >> >
> > >> > Cheers
> > >> > Bruno
> > >> >
> > >> > ________________________________
> > >> > From: Gary Gregory <ga...@gmail.com>
> > >> > To: Commons Developers List <de...@commons.apache.org>
> > >> > Sent: Tuesday, 13 June 2017 2:08 PM
> > >> > Subject: Re: Commons sub project for parallel method execution
> > >> >
> > >> >
> > >> >
> > >> > On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <bo...@gmail.com>
> > wrote:
> > >> >
> > >> > > So wouldn't something like ASM or Javassist or one of the zillion
> > >> other
> > >> > > bytecode libraries be a better alternative to using reflection for
> > >> > > performance? Also, using the Java 7 reflections API improvements
> > helps
> > >> > > speed things up quite a bit.
> > >> > >
> > >> >
> > >> > IMO, unless you are doing scripting, reflection should be used as a
> > >> > workaround, but that's just me. For example, like we do in Commons IO's
> > >> > Java7Support class.
> > >> >
> > >> > But I digress ;-)
> > >> >
> > >> > This is clearly an interesting topic. My concern is that there is a
> > LOT
> > >> of
> > >> > code out there that does stuff like this at the low and high level
> > from
> > >> the
> > >> > JRE's fork/join to Apache Spark and so on as I've stated.
> > >> >
> > >> > IMO something new would have to be both unique and since this is
> > >> Commons,
> > >> > potentially pluggable into other frameworks.
> > >> >
> > >> > Gary
> > >> >
> > >> >
> > >> >
> > >> > > On 12 June 2017 at 20:37, Paul King <pa...@gmail.com>
> > >> wrote:
> > >> > >
> > >> > > > My goto library for such tasks would be GPars. It has both Java
> > and
> > >> > > > Groovy support for most things (actors/dataflow) but less so for
> > >> > > > asynchronous task execution. It's one of the things that would
> be
> > >> good
> > >> > > > to explore in light of Java 8. Groovy is now Apache, GPars not
> at
> > >> this
> > >> > > > stage.
> > >> > > >
> > >> > > > So with adding two jars (GPars + Groovy), you can use Groovy
> like
> > >> this:
> > >> > > >
> > >> > > > @Grab('org.codehaus.gpars:gpars:1.2.1')
> > >> > > > import com.arun.student.StudentService
> > >> > > > import groovyx.gpars.GParsExecutorsPool
> > >> > > >
> > >> > > > long startTime = System.nanoTime()
> > >> > > > def service = new StudentService()
> > >> > > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14, "Harry Potter": 7]
> > >> > > >
> > >> > > > def tasks = [
> > >> > > >         { println service.findStudent("john@gmail.com", 11, false) },
> > >> > > >         { println service.getStudentMarks(1L) },
> > >> > > >         { println service.getStudentsByFirstNames(["John","Alice"]) },
> > >> > > >         { println service.getRandomLastName() },
> > >> > > >         { println service.findStudentIdByName("Kate", "Williams") },
> > >> > > >         { service.printMapValues(bookSeries) }
> > >> > > > ]
> > >> > > >
> > >> > > > GParsExecutorsPool.withPool {
> > >> > > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
> > >> > > > //    tasks.eachParallel{ it() } // one of numerous alternatives
> > >> > > > }
> > >> > > >
> > >> > > > long executionTime = (System.nanoTime() - startTime) / 1000000
> > >> > > > println "\nTotal elapsed time is $executionTime\n\n"
> > >> > > >
> > >> > > >
> > >> > > > Cheers, Paul.
> > >> > > >
> > >> > > >

Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
Briefly: If you are considering code generation, then you can do away with
using reflection.
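
A minimal sketch of the reflection-free route, assuming the client passes
each call as a lambda instead of a method name. StudentService and runAll
are hypothetical names made up for illustration, not the project's API; the
only library calls used are the JDK's CompletableFuture.supplyAsync and join.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;
import java.util.stream.Collectors;

final class LambdaParallelSketch {

    // Hypothetical stand-in for a client service; not part of the project.
    static class StudentService {
        String findStudent(String email) { return "student:" + email; }
        Integer getStudentMarks(long id) { return 90; }
    }

    // Starts every task on the common ForkJoinPool, then waits for all of them.
    // Results come back in submission order. No reflection is involved: the
    // compiler has already bound each lambda to a concrete method call.
    static List<Object> runAll(List<Supplier<?>> tasks) {
        List<CompletableFuture<Object>> futures = tasks.stream()
                .map(t -> CompletableFuture.<Object>supplyAsync(() -> t.get()))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        StudentService service = new StudentService();
        List<Supplier<?>> tasks = new ArrayList<>();
        // Different argument lists and return types fit behind the same Supplier.
        tasks.add(() -> service.findStudent("john@example.com"));
        tasks.add(() -> service.getStudentMarks(1L));
        runAll(tasks).forEach(System.out::println);
    }
}
```

The trade-off is that the caller writes one lambda per call, but method
names and argument types are then checked at compile time.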

G

On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
wrote:

> I was exploring ways to replace the typing of method names in the
> API with something cleaner and more maintainable.
> Using annotations, how can I give clients the ability to specify which
> methods should be executed? Any ideas? I am sort of stuck on this now.
>
> Right now I am thinking of something similar to the Hibernate JPA Metamodel
> Generator, where a new class is generated via bytecode manipulation
> and contains static String constants corresponding to all annotated
> method names. The client can then refer to those String constants in the
> generated class instead of typing the method names.
>
> Also, I don't have much experience with ASM or Javassist. As it
> currently stands, is this project a good fit for further exploration in the
> Sandbox? I would like to see if there are interested folks with experience
> in bytecode manipulation who can contribute to this.

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
I was exploring ways to replace the typing of method names in the
API with something cleaner and more maintainable.
Using annotations, how can I give clients the ability to specify which
methods should be executed? Any ideas? I am sort of stuck on this now.

Right now I am thinking of something similar to the Hibernate JPA Metamodel
Generator, where a new class is generated via bytecode manipulation
and contains static String constants corresponding to all annotated
method names. The client can then refer to those String constants in the
generated class instead of typing the method names.
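
A minimal sketch of that idea, with every name hypothetical (the
@Parallelizable annotation, StudentService, and the StudentService_
constants class are assumptions for illustration, not the project's code).
The constants class is written by hand here to show what a generator might
emit. As far as I know, the Hibernate generator is an annotation processor
that writes source at compile time rather than rewriting bytecode, so the
same result may be reachable with javax.annotation.processing and no ASM.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;

final class MethodNameConstantsSketch {

    // Hypothetical marker annotation for methods a client wants to expose.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Parallelizable { }

    // Hypothetical client service with two annotated methods and one plain one.
    static class StudentService {
        @Parallelizable
        public String findStudent(String email) { return "student:" + email; }

        @Parallelizable
        public int getStudentMarks(long id) { return 90; }

        public void notExposed() { }
    }

    // What a generator might emit for StudentService: one String constant per
    // annotated method. Hand-written here; a real generator would create it
    // at build time so the constants cannot drift from the method names.
    static final class StudentService_ {
        static final String FIND_STUDENT = "findStudent";
        static final String GET_STUDENT_MARKS = "getStudentMarks";
        private StudentService_() { }
    }

    // Looks up a method by name and accepts it only if it is @Parallelizable.
    static Method resolve(Class<?> type, String methodName) {
        return Arrays.stream(type.getDeclaredMethods())
                .filter(m -> m.getName().equals(methodName))
                .filter(m -> m.isAnnotationPresent(Parallelizable.class))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException(
                        "No @Parallelizable method named " + methodName));
    }

    public static void main(String[] args) throws Exception {
        StudentService service = new StudentService();
        // The client refers to a constant instead of typing "findStudent".
        Method m = resolve(StudentService.class, StudentService_.FIND_STUDENT);
        System.out.println(m.invoke(service, "john@example.com"));
    }
}
```

A renamed method would then break the build when the constants are
regenerated, instead of failing at runtime on a stale string.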

Also, I don't have much experience with ASM or Javassist. As it
currently stands, is this project a good fit for further exploration in the
Sandbox? I would like to see if there are interested folks with experience
in bytecode manipulation who can contribute to this.

On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <st...@gmail.com>
wrote:

> I was checking out how the library would plug into Spring and other
> frameworks. I created a sample Spring project with a couple of autowired
> service classes. To fetch and combine data from multiple service classes in
> parallel, the Spring-injected service dependencies are passed to the
> library.
>
> Since the library is framework-agnostic, it treats a Spring-injected
> dependency as a normal object.
>
> You can see it here:
> https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java
>
> I think the idea here is that clients can parallelize method calls
> irrespective of whether they are part of Spring beans or implemented as
> part of any other framework. Clients don't have to modify their methods or
> wrap them in an ExecutorService, Runnable or any other low-level API to do
> so. Methods can be submitted as-is to the library.
>
> The library can serve as a higher-level abstraction that completely hides
> concurrency APIs from the client.
>
>> > > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <bo...@gmail.com>
>> wrote:
>> > > > > I'd be interested to see where this leads to. It could end up as a
>> > sort
>> > > > of
>> > > > > Commons Parallel library. Besides providing an execution API,
>> there
>> > > could
>> > > > > be plenty of support utilities that tend to be found in all the
>> > > > > *Util(s)/*Helper classes in projects like all the ones I mentioned
>> > > > earlier
>> > > > > (basically all sorts of Hadoop-related projects and other
>> distributed
>> > > > > systems here).
>> > > > >
>> > > > > Really, there's so many ways that such a project could head, I'd
>> like
>> > > to
>> > > > > hear more ideas on what to focus on.
>> > > > >
>> > > > > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com>
>> > wrote:
>> > > > >
>> > > > >> The upshot is that there has to be a way to do this with some
>> custom
>> > > > code
>> > > > >> to at least have the ability to 'fast path' the code without
>> > > reflection.
>> > > > >> Using lambdas should make this fairly syntactically unobtrusive.
>> > > > >>
>> > > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <
>> > strider90arun@gmail.com>
>> > > > >> wrote:
>> > > > >>
>> > > > >> > Yes, reflection is not very performant but I don't think I have
>> > any
>> > > > other
>> > > > >> > choice since the library has to inspect the object supplied by
>> the
>> > > > client
>> > > > >> > at runtime to pick out the methods to be invoked using
>> > > > CompletableFuture.
>> > > > >> > But the performance penalty paid for using reflection will be
>> more
>> > > > than
>> > > > >> > offset by the savings of parallel method execution, more so as
>> the
>> > > no
>> > > > of
>> > > > >> > methods executed in parallel increases.
>> > > > >> >
>> > > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
>> > > garydgregory@gmail.com
>> > > > >
>> > > > >> > wrote:
>> > > > >> >
>> > > > >> > > On a lower-level, if you want to use this for lower-level
>> > services
>> > > > >> (where
>> > > > >> > > there is no network latency for example), you will need to
>> avoid
>> > > > using
>> > > > >> > > reflection to get the best performance.
>> > > > >> > >
>> > > > >> > > Gary
>> > > > >> > >
>> > > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
>> > > > strider90arun@gmail.com>
>> > > > >> > > wrote:
>> > > > >> > >
>> > > > >> > > > Hi Gary,
>> > > > >> > > >
>> > > > >> > > > Thanks for your response. You have some valid and
>> interesting
>> > > > points
>> > > > >> > :-)
>> > > > >> > > > Of course you are right that Spark is much more mature.
>> Thanks
>> > > for
>> > > > >> your
>> > > > >> > > > insight.
>> > > > >> > > > It will be interesting indeed to find out if the core
>> > > > parallelization
>> > > > >> > > > engine of Spark can be isolated like you suggest.
>> > > > >> > > >
>> > > > >> > > > I started working on this project because I felt that there
>> > was
>> > > no
>> > > > >> good
>> > > > >> > > > library for parallelizing method calls which can be
>> plugged in
>> > > > easily
>> > > > >> > > into
>> > > > >> > > > an existing java project. Ultimately, if such a solution
>> can
>> > be
>> > > > >> > > > incorporated in the Apache Commons, it would be a useful
>> > > addition
>> > > > to
>> > > > >> > the
>> > > > >> > > > Commons repository.
>> > > > >> > > >
>> > > > >> > > > Thanks,
>> > > > >> > > > Arun
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
>> > > > >> garydgregory@gmail.com>
>> > > > >> > > > wrote:
>> > > > >> > > >
>> > > > >> > > > > Hi Arun,
>> > > > >> > > > >
>> > > > >> > > > > Sure, and that is to be expected, Spark is more mature
>> than
>> > a
>> > > > four
>> > > > >> > > class
>> > > > >> > > > > prototype. What I am trying to get to is that in order
>> for
>> > the
>> > > > >> > library
>> > > > >> > > to
>> > > > >> > > > > be useful, you will end up with more in a first release,
>> and
>> > > > after
>> > > > >> a
>> > > > >> > > > couple
>> > > > >> > > > > more releases, there will be more and more. Would Spark
>> not
>> > > > have in
>> > > > >> > its
>> > > > >> > > > > guts the same kind of code your are proposing here? By
>> > > > extension,
>> > > > >> > will
>> > > > >> > > > you
>> > > > >> > > > > not end up with more framework-like (Spark-like) code and
>> > > > solutions
>> > > > >> > as
>> > > > >> > > > > found in Spark? I am just playing devil's advocate here
>> ;-)
>> > > > >> > > > >
>> > > > >> > > > >
>> > > > >> > > > > What would be interesting would be to find out if there
>> is a
>> > > > core
>> > > > >> > part
>> > > > >> > > of
>> > > > >> > > > > Spark that is separable and ex tractable into a Commons
>> > > > component.
>> > > > >> > > Since
>> > > > >> > > > > Spark has a proven track record, it is more likely, that
>> > such
>> > > a
>> > > > >> > library
>> > > > >> > > > > would be generally useful than one created from scratch
>> that
>> > > > does
>> > > > >> not
>> > > > >> > > > > integrate with anything else. Again, please do not take
>> any
>> > of
>> > > > this
>> > > > >> > > > > personally, I am just playing here :-)
>> > > > >> > > > >
>> > > > >> > > > > Gary
>> > > > >> > > > >
>> > > > >> > > > >
>> > > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
>> > > boards@gmail.com>
>> > > > >> > wrote:
>> > > > >> > > > >
>> > > > >> > > > > > I already see a huge difference here: Spark requires a
>> > bunch
>> > > > of
>> > > > >> > > > > > infrastructure to be set up, while this library is
>> just a
>> > > > >> library.
>> > > > >> > > > > Similar
>> > > > >> > > > > > to Kafka Streams versus Spark Streaming or Flink or
>> Storm
>> > or
>> > > > >> Samza
>> > > > >> > or
>> > > > >> > > > the
>> > > > >> > > > > > others.
>> > > > >> > > > > >
>> > > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
>> > > > garydgregory@gmail.com>
>> > > > >> > > wrote:
>> > > > >> > > > > >
>> > > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
>> > > > >> > > strider90arun@gmail.com
>> > > > >> > > > >
>> > > > >> > > > > > > wrote:
>> > > > >> > > > > > >
>> > > > >> > > > > > > > Hi All,
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Good afternoon.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > I have been working on a java generic parallel
>> > execution
>> > > > >> > library
>> > > > >> > > > > which
>> > > > >> > > > > > > will
>> > > > >> > > > > > > > allow clients to execute methods in parallel
>> > > irrespective
>> > > > of
>> > > > >> > the
>> > > > >> > > > > number
>> > > > >> > > > > > > of
>> > > > >> > > > > > > > method arguments, type of method arguments, return
>> > type
>> > > of
>> > > > >> the
>> > > > >> > > > method
>> > > > >> > > > > > > etc.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Here is the link to the source code:
>> > > > >> > > > > > > > https://github.com/striderarun/parallel-
>> > > execution-engine
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > The project is in a nascent state and I am the only
>> > > > >> contributor
>> > > > >> > > so
>> > > > >> > > > > > far. I
>> > > > >> > > > > > > > am new to the Apache community and I would like to
>> > bring
>> > > > this
>> > > > >> > > > project
>> > > > >> > > > > > > into
>> > > > >> > > > > > > > Apache and improve, expand and build a developer
>> > > community
>> > > > >> > around
>> > > > >> > > > it.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > I think this project can be a sub project of Apache
>> > > > Commons
>> > > > >> > since
>> > > > >> > > > it
>> > > > >> > > > > > > > provides generic components for parallelizing any
>> kind
>> > > of
>> > > > >> > > methods.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Can somebody please guide me or suggest what other
>> > > > options I
>> > > > >> > can
>> > > > >> > > > > > explore
>> > > > >> > > > > > > ?
>> > > > >> > > > > > > >
>> > > > >> > > > > > >
>> > > > >> > > > > > > Hi Arun,
>> > > > >> > > > > > >
>> > > > >> > > > > > > Thank you for your proposal.
>> > > > >> > > > > > >
>> > > > >> > > > > > > How would this be different from Apache Spark?
>> > > > >> > > > > > >
>> > > > >> > > > > > > Thank you,
>> > > > >> > > > > > > Gary
>> > > > >> > > > > > >
>> > > > >> > > > > > >
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Thanks,
>> > > > >> > > > > > > > Arun
>> > > > >> > > > > > > >
>> > > > >> > > > > > >
>> > > > >> > > > > >
>> > > > >> > > > > >
>> > > > >> > > > > >
>> > > > >> > > > > > --
>> > > > >> > > > > > Matt Sicker <bo...@gmail.com>
>> > > > >> > > > > >
>> > > > >> > > > >
>> > > > >> > > >
>> > > > >> > >
>> > > > >> >
>> > > > >>
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Matt Sicker <bo...@gmail.com>
>> > > >
>> > > > ------------------------------------------------------------
>> ---------
>> > > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> > > > For additional commands, e-mail: dev-help@commons.apache.org
>> > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Matt Sicker <bo...@gmail.com>
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> > For additional commands, e-mail: dev-help@commons.apache.org
>> >
>> >
>>
>>
>> --
>> Matt Sicker <bo...@gmail.com>
>>
>
>

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
I was checking out how the library would plug into Spring and other
frameworks. I created a sample Spring project with a couple of autowired
service classes. To fetch and combine data from multiple service classes in
parallel, the Spring-injected service dependencies are passed to the
library.

Since the library is framework-agnostic, it treats a Spring-injected
dependency as a normal object.

You can see it here:
https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java

I think the idea here is that clients can parallelize method calls
regardless of whether the methods belong to Spring beans or are implemented
as part of any other framework. Clients don't have to modify their methods
or wrap them in an ExecutorService, Runnable or any other low-level API to
do so. Methods can be submitted as-is to the library.

The library can serve as a higher-level abstraction that completely hides
concurrency APIs from the client.
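
For comparison, here is a minimal sketch of what combining data from two
services in parallel looks like with plain JDK 8 classes. It does not use
the library's API. ProfileService, OrderService and loadDashboard are
hypothetical names made up for illustration, standing in for
Spring-injected beans.

```java
import java.util.concurrent.CompletableFuture;

public class ParallelCallsSketch {

    // Hypothetical stand-ins for Spring-injected service beans.
    static class ProfileService {
        String findName(long id) {
            return "user-" + id;
        }
    }

    static class OrderService {
        int countOrders(long id) {
            return 3;
        }
    }

    // Runs both service calls on the common pool and combines the results.
    static String loadDashboard(ProfileService profiles, OrderService orders, long id) {
        CompletableFuture<String> name =
                CompletableFuture.supplyAsync(() -> profiles.findName(id));
        CompletableFuture<Integer> count =
                CompletableFuture.supplyAsync(() -> orders.countOrders(id));
        // thenCombine waits for both futures; join blocks until the result is ready.
        return name.thenCombine(count, (n, c) -> n + " has " + c + " orders").join();
    }

    public static void main(String[] args) {
        System.out.println(loadDashboard(new ProfileService(), new OrderService(), 42L));
    }
}
```

The services themselves are untouched, but the caller still has to deal
with CompletableFuture directly. That is the part a higher-level
abstraction would aim to hide.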



Re: Commons sub project for parallel method execution

Posted by Matt Sicker <bo...@gmail.com>.
There are also some interesting execution APIs available in the Scala
standard library. Those are built on top of ForkJoinPool and such nowadays,
but the idea is there for a nicer API on top of ExecutorService and other
low-level details.

In the interests of concurrency, there are other thread-like models that
can be explored. For example: http://docs.paralleluniverse.co/quasar/
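
As a rough illustration of the "nicer API on top of ExecutorService" idea,
here is a minimal Java sketch of a helper that hides Future handling from
the caller. It is not taken from Scala, Quasar or the proposed library.
ParallelRunnerSketch, runAll and demo are hypothetical names, and using the
common ForkJoinPool is just one assumption among several possible choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

public class ParallelRunnerSketch {

    // Submits all tasks to the common ForkJoinPool and returns the results
    // in the same order as the tasks. Checked exceptions are rethrown
    // unchecked so that callers never touch Future or ExecutionException.
    static <T> List<T> runAll(List<Callable<T>> tasks) {
        List<T> results = new ArrayList<>();
        try {
            for (Future<T> future : ForkJoinPool.commonPool().invokeAll(tasks)) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        }
        return results;
    }

    // Small usage example with two independent computations.
    static List<Integer> demo() {
        List<Callable<Integer>> tasks = new ArrayList<>();
        tasks.add(() -> 1 + 1);
        tasks.add(() -> 6 * 7);
        return runAll(tasks);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Lambdas keep the call sites short and avoid reflection, at the cost of the
caller having to list each call explicitly.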

> > > after
> > > >> a
> > > >> > > > couple
> > > >> > > > > more releases, there will be more and more. Would Spark not
> > > have in
> > > >> > its
> > > >> > > > > guts the same kind of code your are proposing here? By
> > > extension,
> > > >> > will
> > > >> > > > you
> > > >> > > > > not end up with more framework-like (Spark-like) code and
> > > solutions
> > > >> > as
> > > >> > > > > found in Spark? I am just playing devil's advocate here ;-)
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > What would be interesting would be to find out if there is a
> > > core
> > > >> > part
> > > >> > > of
> > > >> > > > > Spark that is separable and ex tractable into a Commons
> > > component.
> > > >> > > Since
> > > >> > > > > Spark has a proven track record, it is more likely, that
> such
> > a
> > > >> > library
> > > >> > > > > would be generally useful than one created from scratch that
> > > does
> > > >> not
> > > >> > > > > integrate with anything else. Again, please do not take any
> of
> > > this
> > > >> > > > > personally, I am just playing here :-)
> > > >> > > > >
> > > >> > > > > Gary
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
> > boards@gmail.com>
> > > >> > wrote:
> > > >> > > > >
> > > >> > > > > > I already see a huge difference here: Spark requires a
> bunch
> > > of
> > > >> > > > > > infrastructure to be set up, while this library is just a
> > > >> library.
> > > >> > > > > Similar
> > > >> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm
> or
> > > >> Samza
> > > >> > or
> > > >> > > > the
> > > >> > > > > > others.
> > > >> > > > > >
> > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
> > > garydgregory@gmail.com>
> > > >> > > wrote:
> > > >> > > > > >
> > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > > >> > > strider90arun@gmail.com
> > > >> > > > >
> > > >> > > > > > > wrote:
> > > >> > > > > > >
> > > >> > > > > > > > Hi All,
> > > >> > > > > > > >
> > > >> > > > > > > > Good afternoon.
> > > >> > > > > > > >
> > > >> > > > > > > > I have been working on a java generic parallel
> execution
> > > >> > library
> > > >> > > > > which
> > > >> > > > > > > will
> > > >> > > > > > > > allow clients to execute methods in parallel
> > irrespective
> > > of
> > > >> > the
> > > >> > > > > number
> > > >> > > > > > > of
> > > >> > > > > > > > method arguments, type of method arguments, return
> type
> > of
> > > >> the
> > > >> > > > method
> > > >> > > > > > > etc.
> > > >> > > > > > > >
> > > >> > > > > > > > Here is the link to the source code:
> > > >> > > > > > > > https://github.com/striderarun/parallel-
> > execution-engine
> > > >> > > > > > > >
> > > >> > > > > > > > The project is in a nascent state and I am the only
> > > >> contributor
> > > >> > > so
> > > >> > > > > > far. I
> > > >> > > > > > > > am new to the Apache community and I would like to
> bring
> > > this
> > > >> > > > project
> > > >> > > > > > > into
> > > >> > > > > > > > Apache and improve, expand and build a developer
> > community
> > > >> > around
> > > >> > > > it.
> > > >> > > > > > > >
> > > >> > > > > > > > I think this project can be a sub project of Apache
> > > Commons
> > > >> > since
> > > >> > > > it
> > > >> > > > > > > > provides generic components for parallelizing any kind
> > of
> > > >> > > methods.
> > > >> > > > > > > >
> > > >> > > > > > > > Can somebody please guide me or suggest what other
> > > options I
> > > >> > can
> > > >> > > > > > explore
> > > >> > > > > > > ?
> > > >> > > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > Hi Arun,
> > > >> > > > > > >
> > > >> > > > > > > Thank you for your proposal.
> > > >> > > > > > >
> > > >> > > > > > > How would this be different from Apache Spark?
> > > >> > > > > > >
> > > >> > > > > > > Thank you,
> > > >> > > > > > > Gary
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > >
> > > >> > > > > > > > Thanks,
> > > >> > > > > > > > Arun
> > > >> > > > > > > >
> > > >> > > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > > --
> > > >> > > > > > Matt Sicker <bo...@gmail.com>
> > > >> > > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Matt Sicker <bo...@gmail.com>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > For additional commands, e-mail: dev-help@commons.apache.org
> > >
> > >
> >
> >
> > --
> > Matt Sicker <bo...@gmail.com>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


-- 
Matt Sicker <bo...@gmail.com>

Re: Commons sub project for parallel method execution

Posted by "Bruno P. Kinoshita" <br...@yahoo.com.br.INVALID>.
Interesting idea. And great discussion. Can't really say I'd have a use case for that right now, so abstaining from the discussion around the implementation.

I believe if we decide to explore this idea in Commons, we will probably move it to the sandbox? Even if we do not move it to Commons or to the sandbox, I intend to find some time in the next few days to try Apache Commons Javaflow with this library.

Jenkins implemented pipelines + continuations with code that, when it started, looked a lot like Javaflow. Parallel execution is taken care of in some internal modules in Jenkins, but I would like to see how a simpler implementation like this one would work.

Ideally, this utility would execute in parallel, say, 20 tasks each taking 5 minutes (I haven't checked whether it supports fork/join). Then I would be able to have checkpoints during the execution, and if the whole workflow fails, I would be able to restart it from the last checkpoint.


I use the Java 7+ concurrent classes when I need to execute tasks in parallel (though I'm flagging Paul King's message in this thread to give GPars a try too!), but I am unaware of any way to have persistable (?) continuation workflows like those in Jenkins using simple Java code.
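
For readers who have not used the concurrent classes mentioned above, here is a minimal sketch of running a batch of tasks in parallel with a plain ExecutorService. The class name ParallelTasks and the method runAll are made up for this illustration, and the lambdas assume Java 8 (on Java 7 they would be anonymous Callable classes). It does not attempt checkpoints or restarts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, for illustration only.
public class ParallelTasks {

    /** Runs all tasks on a fixed-size pool and returns results in submission order. */
    public static <T> List<T> runAll(List<Callable<T>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has finished (normally or with an exception).
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "first");
        tasks.add(() -> "second");
        tasks.add(() -> "third");
        System.out.println(runAll(tasks, 3));
    }
}
```

Results come back in the order the tasks were submitted, regardless of which one finishes first.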

Cheers
Bruno

________________________________
From: Gary Gregory <ga...@gmail.com>
To: Commons Developers List <de...@commons.apache.org> 
Sent: Tuesday, 13 June 2017 2:08 PM
Subject: Re: Commons sub project for parallel method execution



On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <bo...@gmail.com> wrote:

> So wouldn't something like ASM or Javassist or one of the zillion other
> bytecode libraries be a better alternative to using reflection for
> performance? Also, using the Java 7 reflections API improvements helps
> speed things up quite a bit.
>

IMO, unless you are doing scripting, reflection should be used as a
workaround, but that's just me. For example, like we do in Commons IO's
Java7Support class.

But I digress ;-)

This is clearly an interesting topic. My concern is that there is a LOT of
code out there that does stuff like this at the low and high level from the
JRE's fork/join to Apache Spark and so on as I've stated.

IMO something new would have to be both unique and since this is Commons,
potentially pluggable into other frameworks.

Gary



> On 12 June 2017 at 20:37, Paul King <pa...@gmail.com> wrote:
>
> > My goto library for such tasks would be GPars. It has both Java and
> > Groovy support for most things (actors/dataflow) but less so for
> > asynchronous task execution. It's one of the things that would be good
> > to explore in light of Java 8. Groovy is now Apache, GPars not at this
> > stage.
> >
> > So with adding two jars (GPars + Groovy), you can use Groovy like this:
> >
> > @Grab('org.codehaus.gpars:gpars:1.2.1')
> > import com.arun.student.StudentService
> > import groovyx.gpars.GParsExecutorsPool
> >
> > long startTime = System.nanoTime()
> > def service = new StudentService()
> > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
> > "Harry Potter": 7]
> >
> > def tasks = [
> >         { println service.findStudent("john@gmail.com", 11, false) },
> >         { println service.getStudentMarks(1L) },
> >         { println service.getStudentsByFirstNames(["John","Alice"]) },
> >         { println service.getRandomLastName() },
> >         { println service.findStudentIdByName("Kate", "Williams") },
> >         { service.printMapValues(bookSeries) }
> > ]
> >
> > GParsExecutorsPool.withPool {
> >     tasks.collect{ it.callAsync() }.collect{ it.get() }
> > //    tasks.eachParallel{ it() } // one of numerous alternatives
> > }
> >
> > long executionTime = (System.nanoTime() - startTime) / 1000000
> > println "\nTotal elapsed time is $executionTime\n\n"
> >
> >
> > Cheers, Paul.
> >
> >
> > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <bo...@gmail.com> wrote:
> > > I'd be interested to see where this leads to. It could end up as a sort
> > of
> > > Commons Parallel library. Besides providing an execution API, there
> could
> > > be plenty of support utilities that tend to be found in all the
> > > *Util(s)/*Helper classes in projects like all the ones I mentioned
> > earlier
> > > (basically all sorts of Hadoop-related projects and other distributed
> > > systems here).
> > >
> > > Really, there's so many ways that such a project could head, I'd like
> to
> > > hear more ideas on what to focus on.
> > >
> > > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
> > >
> > >> The upshot is that there has to be a way to do this with some custom
> > code
> > >> to at least have the ability to 'fast path' the code without
> reflection.
> > >> Using lambdas should make this fairly syntactically unobtrusive.
> > >>
> > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com>
> > >> wrote:
> > >>
> > >> > Yes, reflection is not very performant but I don't think I have any
> > other
> > >> > choice since the library has to inspect the object supplied by the
> > client
> > >> > at runtime to pick out the methods to be invoked using
> > CompletableFuture.
> > >> > But the performance penalty paid for using reflection will be more
> > than
> > >> > offset by the savings of parallel method execution, more so as the
> no
> > of
> > >> > methods executed in parallel increases.
> > >> >
> > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
> garydgregory@gmail.com
> > >
> > >> > wrote:
> > >> >
> > >> > > On a lower-level, if you want to use this for lower-level services
> > >> (where
> > >> > > there is no network latency for example), you will need to avoid
> > using
> > >> > > reflection to get the best performance.
> > >> > >
> > >> > > Gary
> > >> > >
> > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
> > strider90arun@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Gary,
> > >> > > >
> > >> > > > Thanks for your response. You have some valid and interesting
> > points
> > >> > :-)
> > >> > > > Of course you are right that Spark is much more mature. Thanks
> for
> > >> your
> > >> > > > insight.
> > >> > > > It will be interesting indeed to find out if the core
> > parallelization
> > >> > > > engine of Spark can be isolated like you suggest.
> > >> > > >
> > >> > > > I started working on this project because I felt that there was
> no
> > >> good
> > >> > > > library for parallelizing method calls which can be plugged in
> > easily
> > >> > > into
> > >> > > > an existing java project. Ultimately, if such a solution can be
> > >> > > > incorporated in the Apache Commons, it would be a useful
> addition
> > to
> > >> > the
> > >> > > > Commons repository.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Arun
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> > >> garydgregory@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hi Arun,
> > >> > > > >
> > >> > > > > Sure, and that is to be expected, Spark is more mature than a
> > four
> > >> > > class
> > >> > > > > prototype. What I am trying to get to is that in order for the
> > >> > library
> > >> > > to
> > >> > > > > be useful, you will end up with more in a first release, and
> > after
> > >> a
> > >> > > > couple
> > >> > > > > more releases, there will be more and more. Would Spark not
> > have in
> > >> > its
> > >> > > > > guts the same kind of code your are proposing here? By
> > extension,
> > >> > will
> > >> > > > you
> > >> > > > > not end up with more framework-like (Spark-like) code and
> > solutions
> > >> > as
> > >> > > > > found in Spark? I am just playing devil's advocate here ;-)
> > >> > > > >
> > >> > > > >
> > >> > > > > What would be interesting would be to find out if there is a
> > core
> > >> > part
> > >> > > of
> > >> > > > > Spark that is separable and ex tractable into a Commons
> > component.
> > >> > > Since
> > >> > > > > Spark has a proven track record, it is more likely, that such
> a
> > >> > library
> > >> > > > > would be generally useful than one created from scratch that
> > does
> > >> not
> > >> > > > > integrate with anything else. Again, please do not take any of
> > this
> > >> > > > > personally, I am just playing here :-)
> > >> > > > >
> > >> > > > > Gary
> > >> > > > >
> > >> > > > >
> > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
> boards@gmail.com>
> > >> > wrote:
> > >> > > > >
> > >> > > > > > I already see a huge difference here: Spark requires a bunch
> > of
> > >> > > > > > infrastructure to be set up, while this library is just a
> > >> library.
> > >> > > > > Similar
> > >> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or
> > >> Samza
> > >> > or
> > >> > > > the
> > >> > > > > > others.
> > >> > > > > >
> > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
> > garydgregory@gmail.com>
> > >> > > wrote:
> > >> > > > > >
> > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > >> > > strider90arun@gmail.com
> > >> > > > >
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi All,
> > >> > > > > > > >
> > >> > > > > > > > Good afternoon.
> > >> > > > > > > >
> > >> > > > > > > > I have been working on a java generic parallel execution
> > >> > library
> > >> > > > > which
> > >> > > > > > > will
> > >> > > > > > > > allow clients to execute methods in parallel
> irrespective
> > of
> > >> > the
> > >> > > > > number
> > >> > > > > > > of
> > >> > > > > > > > method arguments, type of method arguments, return type
> of
> > >> the
> > >> > > > method
> > >> > > > > > > etc.
> > >> > > > > > > >
> > >> > > > > > > > Here is the link to the source code:
> > >> > > > > > > > https://github.com/striderarun/parallel-
> execution-engine
> > >> > > > > > > >
> > >> > > > > > > > The project is in a nascent state and I am the only
> > >> contributor
> > >> > > so
> > >> > > > > > far. I
> > >> > > > > > > > am new to the Apache community and I would like to bring
> > this
> > >> > > > project
> > >> > > > > > > into
> > >> > > > > > > > Apache and improve, expand and build a developer
> community
> > >> > around
> > >> > > > it.
> > >> > > > > > > >
> > >> > > > > > > > I think this project can be a sub project of Apache
> > Commons
> > >> > since
> > >> > > > it
> > >> > > > > > > > provides generic components for parallelizing any kind
> of
> > >> > > methods.
> > >> > > > > > > >
> > >> > > > > > > > Can somebody please guide me or suggest what other
> > options I
> > >> > can
> > >> > > > > > explore
> > >> > > > > > > ?
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > > > Hi Arun,
> > >> > > > > > >
> > >> > > > > > > Thank you for your proposal.
> > >> > > > > > >
> > >> > > > > > > How would this be different from Apache Spark?
> > >> > > > > > >
> > >> > > > > > > Thank you,
> > >> > > > > > > Gary
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > Thanks,
> > >> > > > > > > > Arun
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > > Matt Sicker <bo...@gmail.com>
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Matt Sicker <bo...@gmail.com>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >
>
>
> --
> Matt Sicker <bo...@gmail.com>
>


Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
A lot of useful and interesting suggestions here.
Using annotations instead of hardcoding method names is definitely a
very good idea. Decorating the domain classes with annotations will be much
cleaner and more maintainable.
And I think we can look into MethodHandle's interoperability with the
Reflection API in Java 8 for better performance.
And being pluggable into other frameworks is definitely something to be
kept in mind.

A lot of things to think about.
I will check out the GPars library, haven't used it before :-)
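
To make the annotation and MethodHandle ideas above concrete, here is a rough sketch, not the library's actual code. The annotation @RunInParallel, the class AnnotatedParallelRunner and the ReportService example are all hypothetical names invented for this illustration. Methods are still discovered through reflection, but each one is converted once to a MethodHandle with MethodHandles.Lookup.unreflect and then invoked through CompletableFuture. Only no-argument methods are handled, to keep it short.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;

public class AnnotatedParallelRunner {

    /** Hypothetical marker annotation; not part of the library under discussion. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface RunInParallel { }

    /** Hypothetical domain class used only for this illustration. */
    public static class ReportService {
        @RunInParallel public String title() { return "Report"; }
        @RunInParallel public Integer pageCount() { return 42; }
        public String notParallel() { return "skipped"; }
    }

    /** Runs every annotated no-arg public method of target in parallel, keyed by method name. */
    public static Map<String, Object> runAnnotated(Object target) {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        Map<String, CompletableFuture<Object>> futures = new TreeMap<>();
        for (Method method : target.getClass().getMethods()) {
            if (!method.isAnnotationPresent(RunInParallel.class)
                    || method.getParameterCount() != 0) {
                continue;
            }
            final MethodHandle handle;
            try {
                // Convert the reflected Method to a MethodHandle bound to the target object.
                handle = lookup.unreflect(method).bindTo(target);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot access " + method, e);
            }
            futures.put(method.getName(), CompletableFuture.supplyAsync(() -> {
                try {
                    return handle.invoke();
                } catch (Throwable t) {
                    throw new IllegalStateException("Invocation failed", t);
                }
            }));
        }
        // Wait for every call to finish and collect the results.
        Map<String, Object> results = new TreeMap<>();
        futures.forEach((name, future) -> results.put(name, future.join()));
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runAnnotated(new ReportService()));
    }
}
```

Whether this is measurably faster than Method.invoke would need benchmarking; the sketch only shows that the two APIs interoperate.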


On Mon, Jun 12, 2017 at 7:08 PM, Gary Gregory <ga...@gmail.com>
wrote:

> On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <bo...@gmail.com> wrote:
>
> > So wouldn't something like ASM or Javassist or one of the zillion other
> > bytecode libraries be a better alternative to using reflection for
> > performance? Also, using the Java 7 reflections API improvements helps
> > speed things up quite a bit.
> >
>
> IMO, unless you are doing scripting, reflection should be a used as a
> workaround, but that's just me. For example, like we do in Commons IO's
> Java7Support class.
>
> But I digress ;-)
>
> This is clearly an interesting topic. My concern is that there is a LOT of
> code out there that does stuff like this at the low and high level from the
> JRE's fork/join to Apache Spark and so on as I've stated.
>
> IMO something new would have to be both unique and since this is Commons,
> potentially pluggable into other frameworks.
>
> Gary
>
>
> > On 12 June 2017 at 20:37, Paul King <pa...@gmail.com> wrote:
> >
> > > My goto library for such tasks would be GPars. It has both Java and
> > > Groovy support for most things (actors/dataflow) but less so for
> > > asynchronous task execution. It's one of the things that would be good
> > > to explore in light of Java 8. Groovy is now Apache, GPars not at this
> > > stage.
> > >
> > > So with adding two jars (GPars + Groovy), you can use Groovy like this:
> > >
> > > @Grab('org.codehaus.gpars:gpars:1.2.1')
> > > import com.arun.student.StudentService
> > > import groovyx.gpars.GParsExecutorsPool
> > >
> > > long startTime = System.nanoTime()
> > > def service = new StudentService()
> > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
> > > "Harry Potter": 7]
> > >
> > > def tasks = [
> > >         { println service.findStudent("john@gmail.com", 11, false) },
> > >         { println service.getStudentMarks(1L) },
> > >         { println service.getStudentsByFirstNames(["John","Alice"]) },
> > >         { println service.getRandomLastName() },
> > >         { println service.findStudentIdByName("Kate", "Williams") },
> > >         { service.printMapValues(bookSeries) }
> > > ]
> > >
> > > GParsExecutorsPool.withPool {
> > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
> > > //    tasks.eachParallel{ it() } // one of numerous alternatives
> > > }
> > >
> > > long executionTime = (System.nanoTime() - startTime) / 1000000
> > > println "\nTotal elapsed time is $executionTime\n\n"
> > >
> > >
> > > Cheers, Paul.
> > >
> > >
> > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <bo...@gmail.com> wrote:
> > > > I'd be interested to see where this leads to. It could end up as a
> sort
> > > of
> > > > Commons Parallel library. Besides providing an execution API, there
> > could
> > > > be plenty of support utilities that tend to be found in all the
> > > > *Util(s)/*Helper classes in projects like all the ones I mentioned
> > > earlier
> > > > (basically all sorts of Hadoop-related projects and other distributed
> > > > systems here).
> > > >
> > > > Really, there's so many ways that such a project could head, I'd like
> > to
> > > > hear more ideas on what to focus on.
> > > >
> > > > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com>
> wrote:
> > > >
> > > >> The upshot is that there has to be a way to do this with some custom
> > > code
> > > >> to at least have the ability to 'fast path' the code without
> > reflection.
> > > >> Using lambdas should make this fairly syntactically unobtrusive.
> > > >>
> > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <
> strider90arun@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > Yes, reflection is not very performant but I don't think I have
> any
> > > other
> > > >> > choice since the library has to inspect the object supplied by the
> > > client
> > > >> > at runtime to pick out the methods to be invoked using
> > > CompletableFuture.
> > > >> > But the performance penalty paid for using reflection will be more
> > > than
> > > >> > offset by the savings of parallel method execution, more so as the
> > no
> > > of
> > > >> > methods executed in parallel increases.
> > > >> >
> > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
> > garydgregory@gmail.com
> > > >
> > > >> > wrote:
> > > >> >
> > > >> > > On a lower-level, if you want to use this for lower-level
> services
> > > >> (where
> > > >> > > there is no network latency for example), you will need to avoid
> > > using
> > > >> > > reflection to get the best performance.
> > > >> > >
> > > >> > > Gary
> > > >> > >
> > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
> > > strider90arun@gmail.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hi Gary,
> > > >> > > >
> > > >> > > > Thanks for your response. You have some valid and interesting
> > > points
> > > >> > :-)
> > > >> > > > Of course you are right that Spark is much more mature. Thanks
> > for
> > > >> your
> > > >> > > > insight.
> > > >> > > > It will be interesting indeed to find out if the core
> > > parallelization
> > > >> > > > engine of Spark can be isolated like you suggest.
> > > >> > > >
> > > >> > > > I started working on this project because I felt that there
> was
> > no
> > > >> good
> > > >> > > > library for parallelizing method calls which can be plugged in
> > > easily
> > > >> > > into
> > > >> > > > an existing java project. Ultimately, if such a solution can
> be
> > > >> > > > incorporated in the Apache Commons, it would be a useful
> > addition
> > > to
> > > >> > the
> > > >> > > > Commons repository.
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Arun
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> > > >> garydgregory@gmail.com>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > > > Hi Arun,
> > > >> > > > >
> > > >> > > > > Sure, and that is to be expected, Spark is more mature than
> a
> > > four
> > > >> > > class
> > > >> > > > > prototype. What I am trying to get to is that in order for
> the
> > > >> > library
> > > >> > > to
> > > >> > > > > be useful, you will end up with more in a first release, and
> > > after
> > > >> a
> > > >> > > > couple
> > > >> > > > > more releases, there will be more and more. Would Spark not
> > > have in
> > > >> > its
> > > >> > > > > guts the same kind of code your are proposing here? By
> > > extension,
> > > >> > will
> > > >> > > > you
> > > >> > > > > not end up with more framework-like (Spark-like) code and
> > > solutions
> > > >> > as
> > > >> > > > > found in Spark? I am just playing devil's advocate here ;-)
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > What would be interesting would be to find out if there is a
> > > core
> > > >> > part
> > > >> > > of
> > > >> > > > > Spark that is separable and ex tractable into a Commons
> > > component.
> > > >> > > Since
> > > >> > > > > Spark has a proven track record, it is more likely, that
> such
> > a
> > > >> > library
> > > >> > > > > would be generally useful than one created from scratch that
> > > does
> > > >> not
> > > >> > > > > integrate with anything else. Again, please do not take any
> of
> > > this
> > > >> > > > > personally, I am just playing here :-)
> > > >> > > > >
> > > >> > > > > Gary
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
> > boards@gmail.com>
> > > >> > wrote:
> > > >> > > > >
> > > >> > > > > > I already see a huge difference here: Spark requires a
> bunch
> > > of
> > > >> > > > > > infrastructure to be set up, while this library is just a
> > > >> library.
> > > >> > > > > Similar
> > > >> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm
> or
> > > >> Samza
> > > >> > or
> > > >> > > > the
> > > >> > > > > > others.
> > > >> > > > > >
> > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
> > > garydgregory@gmail.com>
> > > >> > > wrote:
> > > >> > > > > >
> > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > > >> > > strider90arun@gmail.com
> > > >> > > > >
> > > >> > > > > > > wrote:
> > > >> > > > > > >
> > > >> > > > > > > > Hi All,
> > > >> > > > > > > >
> > > >> > > > > > > > Good afternoon.
> > > >> > > > > > > >
> > > >> > > > > > > > I have been working on a java generic parallel
> execution
> > > >> > library
> > > >> > > > > which
> > > >> > > > > > > will
> > > >> > > > > > > > allow clients to execute methods in parallel
> > irrespective
> > > of
> > > >> > the
> > > >> > > > > number
> > > >> > > > > > > of
> > > >> > > > > > > > method arguments, type of method arguments, return
> type
> > of
> > > >> the
> > > >> > > > method
> > > >> > > > > > > etc.
> > > >> > > > > > > >
> > > >> > > > > > > > Here is the link to the source code:
> > > >> > > > > > > > https://github.com/striderarun/parallel-
> > execution-engine
> > > >> > > > > > > >
> > > >> > > > > > > > The project is in a nascent state and I am the only
> > > >> contributor
> > > >> > > so
> > > >> > > > > > far. I
> > > >> > > > > > > > am new to the Apache community and I would like to
> bring
> > > this
> > > >> > > > project
> > > >> > > > > > > into
> > > >> > > > > > > > Apache and improve, expand and build a developer
> > community
> > > >> > around
> > > >> > > > it.
> > > >> > > > > > > >
> > > >> > > > > > > > I think this project can be a sub project of Apache
> > > Commons
> > > >> > since
> > > >> > > > it
> > > >> > > > > > > > provides generic components for parallelizing any kind
> > of
> > > >> > > methods.
> > > >> > > > > > > >
> > > >> > > > > > > > Can somebody please guide me or suggest what other
> > > options I
> > > >> > can
> > > >> > > > > > explore
> > > >> > > > > > > ?
> > > >> > > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > Hi Arun,
> > > >> > > > > > >
> > > >> > > > > > > Thank you for your proposal.
> > > >> > > > > > >
> > > >> > > > > > > How would this be different from Apache Spark?
> > > >> > > > > > >
> > > >> > > > > > > Thank you,
> > > >> > > > > > > Gary
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > >
> > > >> > > > > > > > Thanks,
> > > >> > > > > > > > Arun
> > > >> > > > > > > >
> > > >> > > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > > --
> > > >> > > > > > Matt Sicker <bo...@gmail.com>
> > > >> > > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Matt Sicker <bo...@gmail.com>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > For additional commands, e-mail: dev-help@commons.apache.org
> > >
> > >
> >
> >
> > --
> > Matt Sicker <bo...@gmail.com>
> >
>

Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <bo...@gmail.com> wrote:

> So wouldn't something like ASM or Javassist or one of the zillion other
> bytecode libraries be a better alternative to using reflection for
> performance? Also, using the Java 7 reflections API improvements helps
> speed things up quite a bit.
>

IMO, unless you are doing scripting, reflection should be used as a
workaround, but that's just me. For example, like we do in Commons IO's
Java7Support class.

But I digress ;-)

This is clearly an interesting topic. My concern is that there is a LOT of
code out there that does stuff like this at the low and high level from the
JRE's fork/join to Apache Spark and so on as I've stated.

IMO something new would have to be both unique and since this is Commons,
potentially pluggable into other frameworks.
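
As a rough illustration of what "pluggable" could mean here, the sketch below takes the calls as lambdas (so no reflection is involved) and lets the caller supply the java.util.concurrent.Executor, so a host framework could pass in its own thread pool. PluggableParallel and callAll are hypothetical names, not an existing Commons API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Hypothetical helper, for illustration only.
public class PluggableParallel {

    /** Starts every call on the supplied executor, then waits and returns results in order. */
    public static <T> List<T> callAll(Executor executor, List<Supplier<T>> calls) {
        List<CompletableFuture<T>> futures = new ArrayList<>();
        for (Supplier<T> call : calls) {
            // The executor is the plug-in point: any framework-owned pool can be used.
            futures.add(CompletableFuture.supplyAsync(call, executor));
        }
        List<T> results = new ArrayList<>();
        for (CompletableFuture<T> future : futures) {
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        List<Supplier<Integer>> calls = new ArrayList<>();
        calls.add(() -> 1 + 1);
        calls.add(() -> "parallel".length());
        // Here the JRE's common fork/join pool is used; a framework could pass its own.
        System.out.println(callAll(ForkJoinPool.commonPool(), calls));
    }
}
```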

Gary


> On 12 June 2017 at 20:37, Paul King <pa...@gmail.com> wrote:
>
> > My goto library for such tasks would be GPars. It has both Java and
> > Groovy support for most things (actors/dataflow) but less so for
> > asynchronous task execution. It's one of the things that would be good
> > to explore in light of Java 8. Groovy is now Apache, GPars not at this
> > stage.
> >
> > So with adding two jars (GPars + Groovy), you can use Groovy like this:
> >
> > @Grab('org.codehaus.gpars:gpars:1.2.1')
> > import com.arun.student.StudentService
> > import groovyx.gpars.GParsExecutorsPool
> >
> > long startTime = System.nanoTime()
> > def service = new StudentService()
> > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
> > "Harry Potter": 7]
> >
> > def tasks = [
> >         { println service.findStudent("john@gmail.com", 11, false) },
> >         { println service.getStudentMarks(1L) },
> >         { println service.getStudentsByFirstNames(["John","Alice"]) },
> >         { println service.getRandomLastName() },
> >         { println service.findStudentIdByName("Kate", "Williams") },
> >         { service.printMapValues(bookSeries) }
> > ]
> >
> > GParsExecutorsPool.withPool {
> >     tasks.collect{ it.callAsync() }.collect{ it.get() }
> > //    tasks.eachParallel{ it() } // one of numerous alternatives
> > }
> >
> > long executionTime = (System.nanoTime() - startTime) / 1000000
> > println "\nTotal elapsed time is $executionTime\n\n"
> >
> >
> > Cheers, Paul.
> >
> >
> > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <bo...@gmail.com> wrote:
> > > I'd be interested to see where this leads to. It could end up as a sort
> > of
> > > Commons Parallel library. Besides providing an execution API, there
> could
> > > be plenty of support utilities that tend to be found in all the
> > > *Util(s)/*Helper classes in projects like all the ones I mentioned
> > earlier
> > > (basically all sorts of Hadoop-related projects and other distributed
> > > systems here).
> > >
> > > Really, there's so many ways that such a project could head, I'd like
> to
> > > hear more ideas on what to focus on.
> > >
> > > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
> > >
> > >> The upshot is that there has to be a way to do this with some custom
> > code
> > >> to at least have the ability to 'fast path' the code without
> reflection.
> > >> Using lambdas should make this fairly syntactically unobtrusive.
> > >>
> > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com>
> > >> wrote:
> > >>
> > >> > Yes, reflection is not very performant but I don't think I have any
> > other
> > >> > choice since the library has to inspect the object supplied by the
> > client
> > >> > at runtime to pick out the methods to be invoked using
> > CompletableFuture.
> > >> > But the performance penalty paid for using reflection will be more
> > than
> > >> > offset by the savings of parallel method execution, more so as the
> no
> > of
> > >> > methods executed in parallel increases.
> > >> >
> > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
> garydgregory@gmail.com
> > >
> > >> > wrote:
> > >> >
> > >> > > On a lower-level, if you want to use this for lower-level services
> > >> (where
> > >> > > there is no network latency for example), you will need to avoid
> > using
> > >> > > reflection to get the best performance.
> > >> > >
> > >> > > Gary
> > >> > >
> > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
> > strider90arun@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Gary,
> > >> > > >
> > >> > > > Thanks for your response. You have some valid and interesting
> > points
> > >> > :-)
> > >> > > > Of course you are right that Spark is much more mature. Thanks
> for
> > >> your
> > >> > > > insight.
> > >> > > > It will be interesting indeed to find out if the core
> > parallelization
> > >> > > > engine of Spark can be isolated like you suggest.
> > >> > > >
> > >> > > > I started working on this project because I felt that there was
> no
> > >> good
> > >> > > > library for parallelizing method calls which can be plugged in
> > easily
> > >> > > into
> > >> > > > an existing java project. Ultimately, if such a solution can be
> > >> > > > incorporated in the Apache Commons, it would be a useful
> addition
> > to
> > >> > the
> > >> > > > Commons repository.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Arun
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> > >> garydgregory@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hi Arun,
> > >> > > > >
> > >> > > > > Sure, and that is to be expected, Spark is more mature than a
> > four
> > >> > > class
> > >> > > > > prototype. What I am trying to get to is that in order for the
> > >> > library
> > >> > > to
> > >> > > > > be useful, you will end up with more in a first release, and
> > after
> > >> a
> > >> > > > couple
> > >> > > > > more releases, there will be more and more. Would Spark not
> > have in
> > >> > its
> > >> > > > > guts the same kind of code you are proposing here? By
> > extension,
> > >> > will
> > >> > > > you
> > >> > > > > not end up with more framework-like (Spark-like) code and
> > solutions
> > >> > as
> > >> > > > > found in Spark? I am just playing devil's advocate here ;-)
> > >> > > > >
> > >> > > > >
> > >> > > > > What would be interesting would be to find out if there is a
> > core
> > >> > part
> > >> > > of
> > >> > > > > Spark that is separable and extractable into a Commons
> > component.
> > >> > > Since
> > >> > > > > Spark has a proven track record, it is more likely, that such
> a
> > >> > library
> > >> > > > > would be generally useful than one created from scratch that
> > does
> > >> not
> > >> > > > > integrate with anything else. Again, please do not take any of
> > this
> > >> > > > > personally, I am just playing here :-)
> > >> > > > >
> > >> > > > > Gary
> > >> > > > >
> > >> > > > >
> > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
> boards@gmail.com>
> > >> > wrote:
> > >> > > > >
> > >> > > > > > I already see a huge difference here: Spark requires a bunch
> > of
> > >> > > > > > infrastructure to be set up, while this library is just a
> > >> library.
> > >> > > > > Similar
> > >> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or
> > >> Samza
> > >> > or
> > >> > > > the
> > >> > > > > > others.
> > >> > > > > >
> > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
> > garydgregory@gmail.com>
> > >> > > wrote:
> > >> > > > > >
> > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > >> > > strider90arun@gmail.com
> > >> > > > >
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi All,
> > >> > > > > > > >
> > >> > > > > > > > Good afternoon.
> > >> > > > > > > >
> > >> > > > > > > > I have been working on a java generic parallel execution
> > >> > library
> > >> > > > > which
> > >> > > > > > > will
> > >> > > > > > > > allow clients to execute methods in parallel
> irrespective
> > of
> > >> > the
> > >> > > > > number
> > >> > > > > > > of
> > >> > > > > > > > method arguments, type of method arguments, return type
> of
> > >> the
> > >> > > > method
> > >> > > > > > > etc.
> > >> > > > > > > >
> > >> > > > > > > > Here is the link to the source code:
> > >> > > > > > > > https://github.com/striderarun/parallel-execution-engine
> > >> > > > > > > >
> > >> > > > > > > > The project is in a nascent state and I am the only
> > >> contributor
> > >> > > so
> > >> > > > > > far. I
> > >> > > > > > > > am new to the Apache community and I would like to bring
> > this
> > >> > > > project
> > >> > > > > > > into
> > >> > > > > > > > Apache and improve, expand and build a developer
> community
> > >> > around
> > >> > > > it.
> > >> > > > > > > >
> > >> > > > > > > > I think this project can be a sub project of Apache
> > Commons
> > >> > since
> > >> > > > it
> > >> > > > > > > > provides generic components for parallelizing any kind
> of
> > >> > > methods.
> > >> > > > > > > >
> > >> > > > > > > > Can somebody please guide me or suggest what other
> > options I
> > >> > can
> > >> > > > > > explore
> > >> > > > > > > ?
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > > > Hi Arun,
> > >> > > > > > >
> > >> > > > > > > Thank you for your proposal.
> > >> > > > > > >
> > >> > > > > > > How would this be different from Apache Spark?
> > >> > > > > > >
> > >> > > > > > > Thank you,
> > >> > > > > > > Gary
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > Thanks,
> > >> > > > > > > > Arun
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > > Matt Sicker <bo...@gmail.com>
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Matt Sicker <bo...@gmail.com>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >
>
>
> --
> Matt Sicker <bo...@gmail.com>
>

Re: Commons sub project for parallel method execution

Posted by Matt Sicker <bo...@gmail.com>.
So wouldn't something like ASM or Javassist or one of the zillion other
bytecode libraries be a better alternative to using reflection for
performance? Also, using the Java 7 reflection API improvements helps
speed things up quite a bit.
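
As a rough illustration, and assuming the Java 7 improvements meant here are
the method handles in java.lang.invoke, this sketch calls the same method once
through classic reflection and once through a cached MethodHandle. The class
HandleVsReflection and its greet method are made up for the example. No
timings are claimed; any speedup should be measured on the target JVM.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical example class: one target method, two ways of invoking it.
final class HandleVsReflection {
    // Resolved once and cached; a static final handle is the usual way to
    // give the JIT the best chance of optimizing the call.
    private static final MethodHandle GREET;

    static {
        try {
            GREET = MethodHandles.lookup().findStatic(
                    HandleVsReflection.class,
                    "greet",
                    MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String greet(String name) {
        return "Hello, " + name;
    }

    /** Classic reflection: looks the method up by name on every call. */
    static String viaReflection(String name) {
        try {
            Method method = HandleVsReflection.class.getDeclaredMethod("greet", String.class);
            return (String) method.invoke(null, name);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Method handle: the signature is checked exactly, with no boxing or varargs array. */
    static String viaHandle(String name) {
        try {
            return (String) GREET.invokeExact(name);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaReflection("reflection"));
        System.out.println(viaHandle("method handle"));
    }
}
```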

On 12 June 2017 at 20:37, Paul King <pa...@gmail.com> wrote:

> My go-to library for such tasks would be GPars. It has both Java and
> Groovy support for most things (actors/dataflow) but less so for
> asynchronous task execution. It's one of the things that would be good
> to explore in light of Java 8. Groovy is now Apache, GPars not at this
> stage.
>
> So with adding two jars (GPars + Groovy), you can use Groovy like this:
>
> @Grab('org.codehaus.gpars:gpars:1.2.1')
> import com.arun.student.StudentService
> import groovyx.gpars.GParsExecutorsPool
>
> long startTime = System.nanoTime()
> def service = new StudentService()
> def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
> "Harry Potter": 7]
>
> def tasks = [
>         { println service.findStudent("john@gmail.com", 11, false) },
>         { println service.getStudentMarks(1L) },
>         { println service.getStudentsByFirstNames(["John","Alice"]) },
>         { println service.getRandomLastName() },
>         { println service.findStudentIdByName("Kate", "Williams") },
>         { service.printMapValues(bookSeries) }
> ]
>
> GParsExecutorsPool.withPool {
>     tasks.collect{ it.callAsync() }.collect{ it.get() }
> //    tasks.eachParallel{ it() } // one of numerous alternatives
> }
>
> long executionTime = (System.nanoTime() - startTime) / 1000000
> println "\nTotal elapsed time is $executionTime\n\n"
>
>
> Cheers, Paul.
>
>
> On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <bo...@gmail.com> wrote:
> > I'd be interested to see where this leads to. It could end up as a sort
> of
> > Commons Parallel library. Besides providing an execution API, there could
> > be plenty of support utilities that tend to be found in all the
> > *Util(s)/*Helper classes in projects like all the ones I mentioned
> earlier
> > (basically all sorts of Hadoop-related projects and other distributed
> > systems here).
> >
> > Really, there's so many ways that such a project could head, I'd like to
> > hear more ideas on what to focus on.
> >
> > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
> >
> >> The upshot is that there has to be a way to do this with some custom
> code
> >> to at least have the ability to 'fast path' the code without reflection.
> >> Using lambdas should make this fairly syntactically unobtrusive.
> >>
> >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com>
> >> wrote:
> >>
> >> > Yes, reflection is not very performant but I don't think I have any
> other
> >> > choice since the library has to inspect the object supplied by the
> client
> >> > at runtime to pick out the methods to be invoked using
> CompletableFuture.
> >> > But the performance penalty paid for using reflection will be more
> than
> >> > offset by the savings of parallel method execution, more so as the no
> of
> >> > methods executed in parallel increases.
> >> >
> >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <garydgregory@gmail.com
> >
> >> > wrote:
> >> >
> >> > > On a lower-level, if you want to use this for lower-level services
> >> (where
> >> > > there is no network latency for example), you will need to avoid
> using
> >> > > reflection to get the best performance.
> >> > >
> >> > > Gary
> >> > >
> >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
> strider90arun@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hi Gary,
> >> > > >
> >> > > > Thanks for your response. You have some valid and interesting
> points
> >> > :-)
> >> > > > Of course you are right that Spark is much more mature. Thanks for
> >> your
> >> > > > insight.
> >> > > > It will be interesting indeed to find out if the core
> parallelization
> >> > > > engine of Spark can be isolated like you suggest.
> >> > > >
> >> > > > I started working on this project because I felt that there was no
> >> good
> >> > > > library for parallelizing method calls which can be plugged in
> easily
> >> > > into
> >> > > > an existing java project. Ultimately, if such a solution can be
> >> > > > incorporated in the Apache Commons, it would be a useful addition
> to
> >> > the
> >> > > > Commons repository.
> >> > > >
> >> > > > Thanks,
> >> > > > Arun
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> >> garydgregory@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Hi Arun,
> >> > > > >
> >> > > > > Sure, and that is to be expected, Spark is more mature than a
> four
> >> > > class
> >> > > > > prototype. What I am trying to get to is that in order for the
> >> > library
> >> > > to
> >> > > > > be useful, you will end up with more in a first release, and
> after
> >> a
> >> > > > couple
> >> > > > > more releases, there will be more and more. Would Spark not
> have in
> >> > its
> >> > > > > guts the same kind of code you are proposing here? By
> extension,
> >> > will
> >> > > > you
> >> > > > > not end up with more framework-like (Spark-like) code and
> solutions
> >> > as
> >> > > > > found in Spark? I am just playing devil's advocate here ;-)
> >> > > > >
> >> > > > >
> >> > > > > What would be interesting would be to find out if there is a
> core
> >> > part
> >> > > of
> >> > > > > Spark that is separable and extractable into a Commons
> component.
> >> > > Since
> >> > > > > Spark has a proven track record, it is more likely, that such a
> >> > library
> >> > > > > would be generally useful than one created from scratch that
> does
> >> not
> >> > > > > integrate with anything else. Again, please do not take any of
> this
> >> > > > > personally, I am just playing here :-)
> >> > > > >
> >> > > > > Gary
> >> > > > >
> >> > > > >
> >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <bo...@gmail.com>
> >> > wrote:
> >> > > > >
> >> > > > > > I already see a huge difference here: Spark requires a bunch
> of
> >> > > > > > infrastructure to be set up, while this library is just a
> >> library.
> >> > > > > Similar
> >> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or
> >> Samza
> >> > or
> >> > > > the
> >> > > > > > others.
> >> > > > > >
> >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <
> garydgregory@gmail.com>
> >> > > wrote:
> >> > > > > >
> >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> >> > > strider90arun@gmail.com
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi All,
> >> > > > > > > >
> >> > > > > > > > Good afternoon.
> >> > > > > > > >
> >> > > > > > > > I have been working on a java generic parallel execution
> >> > library
> >> > > > > which
> >> > > > > > > will
> >> > > > > > > > allow clients to execute methods in parallel irrespective
> of
> >> > the
> >> > > > > number
> >> > > > > > > of
> >> > > > > > > > method arguments, type of method arguments, return type of
> >> the
> >> > > > method
> >> > > > > > > etc.
> >> > > > > > > >
> >> > > > > > > > Here is the link to the source code:
> >> > > > > > > > https://github.com/striderarun/parallel-execution-engine
> >> > > > > > > >
> >> > > > > > > > The project is in a nascent state and I am the only
> >> contributor
> >> > > so
> >> > > > > > far. I
> >> > > > > > > > am new to the Apache community and I would like to bring
> this
> >> > > > project
> >> > > > > > > into
> >> > > > > > > > Apache and improve, expand and build a developer community
> >> > around
> >> > > > it.
> >> > > > > > > >
> >> > > > > > > > I think this project can be a sub project of Apache
> Commons
> >> > since
> >> > > > it
> >> > > > > > > > provides generic components for parallelizing any kind of
> >> > > methods.
> >> > > > > > > >
> >> > > > > > > > Can somebody please guide me or suggest what other
> options I
> >> > can
> >> > > > > > explore
> >> > > > > > > ?
> >> > > > > > > >
> >> > > > > > >
> >> > > > > > > Hi Arun,
> >> > > > > > >
> >> > > > > > > Thank you for your proposal.
> >> > > > > > >
> >> > > > > > > How would this be different from Apache Spark?
> >> > > > > > >
> >> > > > > > > Thank you,
> >> > > > > > > Gary
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > >
> >> > > > > > > > Thanks,
> >> > > > > > > > Arun
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Matt Sicker <bo...@gmail.com>
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
> >
> > --
> > Matt Sicker <bo...@gmail.com>
>
>
>


-- 
Matt Sicker <bo...@gmail.com>

Re: Commons sub project for parallel method execution

Posted by Paul King <pa...@gmail.com>.
My go-to library for such tasks would be GPars. It has both Java and
Groovy support for most things (actors/dataflow) but less so for
asynchronous task execution. It's one of the things that would be good
to explore in light of Java 8. Groovy is now Apache, GPars not at this
stage.

So with adding two jars (GPars + Groovy), you can use Groovy like this:

@Grab('org.codehaus.gpars:gpars:1.2.1')
import com.arun.student.StudentService
import groovyx.gpars.GParsExecutorsPool

long startTime = System.nanoTime()
def service = new StudentService()
def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
"Harry Potter": 7]

def tasks = [
        { println service.findStudent("john@gmail.com", 11, false) },
        { println service.getStudentMarks(1L) },
        { println service.getStudentsByFirstNames(["John","Alice"]) },
        { println service.getRandomLastName() },
        { println service.findStudentIdByName("Kate", "Williams") },
        { service.printMapValues(bookSeries) }
]

GParsExecutorsPool.withPool {
    tasks.collect{ it.callAsync() }.collect{ it.get() }
//    tasks.eachParallel{ it() } // one of numerous alternatives
}

long executionTime = (System.nanoTime() - startTime) / 1000000
println "\nTotal elapsed time is $executionTime\n\n"
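
For comparison, a plain Java 8 version of the same pattern might look like the
sketch below, using CompletableFuture and lambdas with no reflection. It is
only a sketch: ParallelTasks and runAll are made-up names, and the lambdas in
main are stand-ins for the StudentService calls, which are not reproduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical helper: submits every task to the pool, then waits for all of them.
final class ParallelTasks {

    /** Runs the tasks concurrently and returns their results in submission order. */
    static <T> List<T> runAll(List<Supplier<T>> tasks, ExecutorService pool) {
        // First pass: start everything, so the tasks overlap.
        List<CompletableFuture<T>> futures = tasks.stream()
                .map(task -> CompletableFuture.supplyAsync(task, pool))
                .collect(Collectors.toList());
        // Second pass: block on each result; join() rethrows failures unchecked.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Stand-ins for service calls with different argument and return types.
            List<Supplier<Object>> tasks = Arrays.asList(
                    () -> "student-" + 11,
                    () -> Arrays.asList(90, 85, 77),
                    () -> 42L);
            System.out.println(runAll(tasks, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures before joining matters: joining inside the first
stream would run the tasks one after another.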


Cheers, Paul.


On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <bo...@gmail.com> wrote:
> I'd be interested to see where this leads to. It could end up as a sort of
> Commons Parallel library. Besides providing an execution API, there could
> be plenty of support utilities that tend to be found in all the
> *Util(s)/*Helper classes in projects like all the ones I mentioned earlier
> (basically all sorts of Hadoop-related projects and other distributed
> systems here).
>
> Really, there's so many ways that such a project could head, I'd like to
> hear more ideas on what to focus on.
>
> On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
>
>> The upshot is that there has to be a way to do this with some custom code
>> to at least have the ability to 'fast path' the code without reflection.
>> Using lambdas should make this fairly syntactically unobtrusive.
>>
>> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com>
>> wrote:
>>
>> > Yes, reflection is not very performant but I don't think I have any other
>> > choice since the library has to inspect the object supplied by the client
>> > at runtime to pick out the methods to be invoked using CompletableFuture.
>> > But the performance penalty paid for using reflection will be more than
>> > offset by the savings of parallel method execution, more so as the no of
>> > methods executed in parallel increases.
>> >
>> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <ga...@gmail.com>
>> > wrote:
>> >
>> > > On a lower-level, if you want to use this for lower-level services
>> (where
>> > > there is no network latency for example), you will need to avoid using
>> > > reflection to get the best performance.
>> > >
>> > > Gary
>> > >
>> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <st...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi Gary,
>> > > >
>> > > > Thanks for your response. You have some valid and interesting points
>> > :-)
>> > > > Of course you are right that Spark is much more mature. Thanks for
>> your
>> > > > insight.
>> > > > It will be interesting indeed to find out if the core parallelization
>> > > > engine of Spark can be isolated like you suggest.
>> > > >
>> > > > I started working on this project because I felt that there was no
>> good
>> > > > library for parallelizing method calls which can be plugged in easily
>> > > into
>> > > > an existing java project. Ultimately, if such a solution can be
>> > > > incorporated in the Apache Commons, it would be a useful addition to
>> > the
>> > > > Commons repository.
>> > > >
>> > > > Thanks,
>> > > > Arun
>> > > >
>> > > >
>> > > >
>> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
>> garydgregory@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Hi Arun,
>> > > > >
>> > > > > Sure, and that is to be expected, Spark is more mature than a four
>> > > class
>> > > > > prototype. What I am trying to get to is that in order for the
>> > library
>> > > to
>> > > > > be useful, you will end up with more in a first release, and after
>> a
>> > > > couple
>> > > > > more releases, there will be more and more. Would Spark not have in
>> > its
>> > > > > guts the same kind of code you are proposing here? By extension,
>> > will
>> > > > you
>> > > > > not end up with more framework-like (Spark-like) code and solutions
>> > as
>> > > > > found in Spark? I am just playing devil's advocate here ;-)
>> > > > >
>> > > > >
>> > > > > What would be interesting would be to find out if there is a core
>> > part
>> > > of
>> > > > > Spark that is separable and extractable into a Commons component.
>> > > Since
>> > > > > Spark has a proven track record, it is more likely, that such a
>> > library
>> > > > > would be generally useful than one created from scratch that does
>> not
>> > > > > integrate with anything else. Again, please do not take any of this
>> > > > > personally, I am just playing here :-)
>> > > > >
>> > > > > Gary
>> > > > >
>> > > > >
>> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <bo...@gmail.com>
>> > wrote:
>> > > > >
>> > > > > > I already see a huge difference here: Spark requires a bunch of
>> > > > > > infrastructure to be set up, while this library is just a
>> library.
>> > > > > Similar
>> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or
>> Samza
>> > or
>> > > > the
>> > > > > > others.
>> > > > > >
>> > > > > > On 12 June 2017 at 16:28, Gary Gregory <ga...@gmail.com>
>> > > wrote:
>> > > > > >
>> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
>> > > strider90arun@gmail.com
>> > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi All,
>> > > > > > > >
>> > > > > > > > Good afternoon.
>> > > > > > > >
>> > > > > > > > I have been working on a java generic parallel execution
>> > library
>> > > > > which
>> > > > > > > will
>> > > > > > > > allow clients to execute methods in parallel irrespective of
>> > the
>> > > > > number
>> > > > > > > of
>> > > > > > > > method arguments, type of method arguments, return type of
>> the
>> > > > method
>> > > > > > > etc.
>> > > > > > > >
>> > > > > > > > Here is the link to the source code:
>> > > > > > > > https://github.com/striderarun/parallel-execution-engine
>> > > > > > > >
>> > > > > > > > The project is in a nascent state and I am the only
>> contributor
>> > > so
>> > > > > > far. I
>> > > > > > > > am new to the Apache community and I would like to bring this
>> > > > project
>> > > > > > > into
>> > > > > > > > Apache and improve, expand and build a developer community
>> > around
>> > > > it.
>> > > > > > > >
>> > > > > > > > I think this project can be a sub project of Apache Commons
>> > since
>> > > > it
>> > > > > > > > provides generic components for parallelizing any kind of
>> > > methods.
>> > > > > > > >
>> > > > > > > > Can somebody please guide me or suggest what other options I
>> > can
>> > > > > > explore
>> > > > > > > ?
>> > > > > > > >
>> > > > > > >
>> > > > > > > Hi Arun,
>> > > > > > >
>> > > > > > > Thank you for your proposal.
>> > > > > > >
>> > > > > > > How would this be different from Apache Spark?
>> > > > > > >
>> > > > > > > Thank you,
>> > > > > > > Gary
>> > > > > > >
>> > > > > > >
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > > Arun
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Matt Sicker <bo...@gmail.com>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
>
> --
> Matt Sicker <bo...@gmail.com>



Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
On Mon, Jun 12, 2017 at 6:21 PM, Matt Sicker <bo...@gmail.com> wrote:

> Last time I used Spring, they had an @Async annotation you could use which
> would automatically execute in an executor service (all handled via bean
> proxies as usual).
>

Yeah, for higher-level stuff like Spring Batch, it's a little more awkward
but, hey, it works (split and flow elements).

Gary


> On 12 June 2017 at 19:22, Gary Gregory <ga...@gmail.com> wrote:
>
> > Hi All,
> >
> > I think it would be most helpful to note the distinction between the
> > parallelism aspect and the bridge to domain classes aspect (currently
> done
> > with reflection in the proposed github repo.)
> >
> > It seems (to me) that in between the ForkJoin framework already in Java
> (a
> > low-level library) and up to Apache Spark (a low-level set of classes
> > and high-level application-server-like code base), there are a ton of
> > options already out there. I am not sure what yet another framework would
> > do that is not already there.
> >
> > Maybe the distinguishing factor here is the use of reflection? What about
> > annotations? That seems to be a more modern approach (Java 5! :-) than the
> > typing of method names in code (as currently done in the repo) which is a
> > nightmare to maintain especially when you are in an evolving code base
> and
> > refactoring all the time.
> >
> > Maybe an interesting angle would be decorating domain classes with
> > annotations and submitting those to fork/join. Just thinkin' aloud...
> >
> > Gary
> >
> > On Mon, Jun 12, 2017 at 4:29 PM, Matt Sicker <bo...@gmail.com> wrote:
> >
> > > I'd be interested to see where this leads to. It could end up as a sort
> > of
> > > Commons Parallel library. Besides providing an execution API, there
> could
> > > be plenty of support utilities that tend to be found in all the
> > > *Util(s)/*Helper classes in projects like all the ones I mentioned
> > earlier
> > > (basically all sorts of Hadoop-related projects and other distributed
> > > systems here).
> > >
> > > Really, there's so many ways that such a project could head, I'd like
> to
> > > hear more ideas on what to focus on.
> > >
> > > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
> > >
> > > > The upshot is that there has to be a way to do this with some custom
> > code
> > > > to at least have the ability to 'fast path' the code without
> > reflection.
> > > > Using lambdas should make this fairly syntactically unobtrusive.
> > > >
> > > > On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <strider90arun@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Yes, reflection is not very performant but I don't think I have any
> > > other
> > > > > choice since the library has to inspect the object supplied by the
> > > client
> > > > > at runtime to pick out the methods to be invoked using
> > > CompletableFuture.
> > > > > But the performance penalty paid for using reflection will be more
> > than
> > > > > offset by the savings of parallel method execution, more so as the
> no
> > > of
> > > > > methods executed in parallel increases.
> > > > >
> > > > > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
> > garydgregory@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > On a lower-level, if you want to use this for lower-level
> services
> > > > (where
> > > > > > there is no network latency for example), you will need to avoid
> > > using
> > > > > > reflection to get the best performance.
> > > > > >
> > > > > > Gary
> > > > > >
> > > > > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
> > strider90arun@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Gary,
> > > > > > >
> > > > > > > Thanks for your response. You have some valid and interesting
> > > points
> > > > > :-)
> > > > > > > Of course you are right that Spark is much more mature. Thanks
> > for
> > > > your
> > > > > > > insight.
> > > > > > > It will be interesting indeed to find out if the core
> > > parallelization
> > > > > > > engine of Spark can be isolated like you suggest.
> > > > > > >
> > > > > > > I started working on this project because I felt that there was
> > no
> > > > good
> > > > > > > library for parallelizing method calls which can be plugged in
> > > easily
> > > > > > into
> > > > > > > an existing java project. Ultimately, if such a solution can be
> > > > > > > incorporated in the Apache Commons, it would be a useful
> addition
> > > to
> > > > > the
> > > > > > > Commons repository.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Arun
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> > > > garydgregory@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Arun,
> > > > > > > >
> > > > > > > > Sure, and that is to be expected, Spark is more mature than a
> > > four
> > > > > > class
> > > > > > > > prototype. What I am trying to get to is that in order for
> the
> > > > > library
> > > > > > to
> > > > > > > > be useful, you will end up with more in a first release, and
> > > after
> > > > a
> > > > > > > couple
> > > > > > > > more releases, there will be more and more. Would Spark not
> > have
> > > in
> > > > > its
> > > > > > > > > guts the same kind of code you are proposing here? By
> > extension,
> > > > > will
> > > > > > > you
> > > > > > > > not end up with more framework-like (Spark-like) code and
> > > solutions
> > > > > as
> > > > > > > > found in Spark? I am just playing devil's advocate here ;-)
> > > > > > > >
> > > > > > > >
> > > > > > > > What would be interesting would be to find out if there is a
> > core
> > > > > part
> > > > > > of
> > > > > > > > > Spark that is separable and extractable into a Commons
> > > component.
> > > > > > Since
> > > > > > > > Spark has a proven track record, it is more likely, that
> such a
> > > > > library
> > > > > > > > would be generally useful than one created from scratch that
> > does
> > > > not
> > > > > > > > integrate with anything else. Again, please do not take any
> of
> > > this
> > > > > > > > personally, I am just playing here :-)
> > > > > > > >
> > > > > > > > Gary
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <
> boards@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > I already see a huge difference here: Spark requires a
> bunch
> > of
> > > > > > > > > infrastructure to be set up, while this library is just a
> > > > library.
> > > > > > > > Similar
> > > > > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm
> or
> > > > Samza
> > > > > or
> > > > > > > the
> > > > > > > > > others.
> > > > > > > > >
> > > > > > > > > On 12 June 2017 at 16:28, Gary Gregory <
> > garydgregory@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > > > > > strider90arun@gmail.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi All,
> > > > > > > > > > >
> > > > > > > > > > > Good afternoon.
> > > > > > > > > > >
> > > > > > > > > > > I have been working on a java generic parallel
> execution
> > > > > library
> > > > > > > > which
> > > > > > > > > > will
> > > > > > > > > > > allow clients to execute methods in parallel
> irrespective
> > > of
> > > > > the
> > > > > > > > number
> > > > > > > > > > of
> > > > > > > > > > > method arguments, type of method arguments, return type
> > of
> > > > the
> > > > > > > method
> > > > > > > > > > etc.
> > > > > > > > > > >
> > > > > > > > > > > Here is the link to the source code:
> > > > > > > > > > > > https://github.com/striderarun/parallel-execution-engine
> > > > > > > > > > >
> > > > > > > > > > > The project is in a nascent state and I am the only
> > > > contributor
> > > > > > so
> > > > > > > > > far. I
> > > > > > > > > > > am new to the Apache community and I would like to
> bring
> > > this
> > > > > > > project
> > > > > > > > > > into
> > > > > > > > > > > Apache and improve, expand and build a developer
> > community
> > > > > around
> > > > > > > it.
> > > > > > > > > > >
> > > > > > > > > > > I think this project can be a sub project of Apache
> > Commons
> > > > > since
> > > > > > > it
> > > > > > > > > > > provides generic components for parallelizing any kind
> of
> > > > > > methods.
> > > > > > > > > > >
> > > > > > > > > > > Can somebody please guide me or suggest what other
> > options
> > > I
> > > > > can
> > > > > > > > > explore
> > > > > > > > > > ?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Hi Arun,
> > > > > > > > > >
> > > > > > > > > > Thank you for your proposal.
> > > > > > > > > >
> > > > > > > > > > How would this be different from Apache Spark?
> > > > > > > > > >
> > > > > > > > > > Thank you,
> > > > > > > > > > Gary
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Arun
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Matt Sicker <bo...@gmail.com>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Matt Sicker <bo...@gmail.com>
> > >
> >
>
>
>
> --
> Matt Sicker <bo...@gmail.com>
>

Re: Commons sub project for parallel method execution

Posted by Matt Sicker <bo...@gmail.com>.
Last time I used Spring, it had an @Async annotation you could put on a
method so that calls to it would automatically run in an executor service
(all handled via bean proxies as usual).
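
For readers who have not seen the proxy approach, here is a rough sketch of
the general idea using only the JDK. It is not Spring's implementation: the
ReportService interface and the asyncProxy helper are hypothetical names made
up for this illustration, and the sketch assumes every interface method
returns a CompletableFuture.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;

// Hypothetical service interface, used only for this illustration.
interface ReportService {
    CompletableFuture<String> build(String name);
}

final class AsyncProxySketch {

    // Wraps 'target' so that each interface call runs on 'executor'.
    // Assumes every method of 'iface' returns a CompletableFuture.
    @SuppressWarnings("unchecked")
    static <T> T asyncProxy(Class<T> iface, T target, Executor executor) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> CompletableFuture.<Object>supplyAsync(() -> {
                    try {
                        // Invoke the real method on the executor thread and
                        // wait for the future it returns.
                        return ((CompletableFuture<?>) method.invoke(target, args)).join();
                    } catch (ReflectiveOperationException e) {
                        throw new CompletionException(e);
                    }
                }, executor));
    }
}
```

The caller keeps programming against the interface; only the wiring decides
whether a call runs inline or on a pool.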

On 12 June 2017 at 19:22, Gary Gregory <ga...@gmail.com> wrote:

> Hi All,
>
> I think it would be most helpful to note the distinction between the
> parallelism aspect and the bridge to domain classes aspect (currently done
> with reflection in the proposed github repo.)
>
> It seems (to me) that in between the ForkJoin framework already in Java (a
> low-level library) and up to Apache Spark (an lowel-level set of classes
> and high-level application-server-like code base), there are a ton of
> options already out there. I am not sure what yet another framework would
> do that is not already there.
>
> Maybe the distinguishing factor here is the use of reflection? What about
> annotations? That seems to be more modern approach (Java 5! :-) than the
> typing of method names in code (as currently done in the repo) which is a
> nightmare to maintain especially when you are in an evolving code base and
> refactoring all the time.
>
> Maybe an interesting angle would be decorating domain classes with
> annotations and submitting those to fork/join. Just thinkin' aloud...
>
> Gary
>
> On Mon, Jun 12, 2017 at 4:29 PM, Matt Sicker <bo...@gmail.com> wrote:
>
> > I'd be interested to see where this leads to. It could end up as a sort
> of
> > Commons Parallel library. Besides providing an execution API, there could
> > be plenty of support utilities that tend to be found in all the
> > *Util(s)/*Helper classes in projects like all the ones I mentioned
> earlier
> > (basically all sorts of Hadoop-related projects and other distributed
> > systems here).
> >
> > Really, there's so many ways that such a project could head, I'd like to
> > hear more ideas on what to focus on.
> >
> > On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
> >
> > > The upshot is that there has to be a way to do this with some custom
> code
> > > to at least have the ability to 'fast path' the code without
> reflection.
> > > Using lambdas should make this fairly syntactically unobtrusive.
> > >
> > > On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com>
> > > wrote:
> > >
> > > > Yes, reflection is not very performant but I don't think I have any
> > other
> > > > choice since the library has to inspect the object supplied by the
> > client
> > > > at runtime to pick out the methods to be invoked using
> > CompletableFuture.
> > > > But the performance penalty paid for using reflection will be more
> than
> > > > offset by the savings of parallel method execution, more so as the no
> > of
> > > > methods executed in parallel increases.
> > > >
> > > > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <
> garydgregory@gmail.com>
> > > > wrote:
> > > >
> > > > > On a lower-level, if you want to use this for lower-level services
> > > (where
> > > > > there is no network latency for example), you will need to avoid
> > using
> > > > > reflection to get the best performance.
> > > > >
> > > > > Gary
> > > > >
> > > > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <
> strider90arun@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Gary,
> > > > > >
> > > > > > Thanks for your response. You have some valid and interesting
> > points
> > > > :-)
> > > > > > Of course you are right that Spark is much more mature. Thanks
> for
> > > your
> > > > > > insight.
> > > > > > It will be interesting indeed to find out if the core
> > parallelization
> > > > > > engine of Spark can be isolated like you suggest.
> > > > > >
> > > > > > I started working on this project because I felt that there was
> no
> > > good
> > > > > > library for parallelizing method calls which can be plugged in
> > easily
> > > > > into
> > > > > > an existing java project. Ultimately, if such a solution can be
> > > > > > incorporated in the Apache Commons, it would be a useful addition
> > to
> > > > the
> > > > > > Commons repository.
> > > > > >
> > > > > > Thanks,
> > > > > > Arun
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> > > garydgregory@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Arun,
> > > > > > >
> > > > > > > Sure, and that is to be expected, Spark is more mature than a
> > four
> > > > > class
> > > > > > > prototype. What I am trying to get to is that in order for the
> > > > library
> > > > > to
> > > > > > > be useful, you will end up with more in a first release, and
> > after
> > > a
> > > > > > couple
> > > > > > > more releases, there will be more and more. Would Spark not
> have
> > in
> > > > its
> > > > > > > guts the same kind of code your are proposing here? By
> extension,
> > > > will
> > > > > > you
> > > > > > > not end up with more framework-like (Spark-like) code and
> > solutions
> > > > as
> > > > > > > found in Spark? I am just playing devil's advocate here ;-)
> > > > > > >
> > > > > > >
> > > > > > > What would be interesting would be to find out if there is a
> core
> > > > part
> > > > > of
> > > > > > > Spark that is separable and ex tractable into a Commons
> > component.
> > > > > Since
> > > > > > > Spark has a proven track record, it is more likely, that such a
> > > > library
> > > > > > > would be generally useful than one created from scratch that
> does
> > > not
> > > > > > > integrate with anything else. Again, please do not take any of
> > this
> > > > > > > personally, I am just playing here :-)
> > > > > > >
> > > > > > > Gary
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <boards@gmail.com
> >
> > > > wrote:
> > > > > > >
> > > > > > > > I already see a huge difference here: Spark requires a bunch
> of
> > > > > > > > infrastructure to be set up, while this library is just a
> > > library.
> > > > > > > Similar
> > > > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or
> > > Samza
> > > > or
> > > > > > the
> > > > > > > > others.
> > > > > > > >
> > > > > > > > On 12 June 2017 at 16:28, Gary Gregory <
> garydgregory@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > > > > strider90arun@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > > Good afternoon.
> > > > > > > > > >
> > > > > > > > > > I have been working on a java generic parallel execution
> > > > library
> > > > > > > which
> > > > > > > > > will
> > > > > > > > > > allow clients to execute methods in parallel irrespective
> > of
> > > > the
> > > > > > > number
> > > > > > > > > of
> > > > > > > > > > method arguments, type of method arguments, return type
> of
> > > the
> > > > > > method
> > > > > > > > > etc.
> > > > > > > > > >
> > > > > > > > > > Here is the link to the source code:
> > > > > > > > > > https://github.com/striderarun/parallel-execution-engine
> > > > > > > > > >
> > > > > > > > > > The project is in a nascent state and I am the only
> > > contributor
> > > > > so
> > > > > > > > far. I
> > > > > > > > > > am new to the Apache community and I would like to bring
> > this
> > > > > > project
> > > > > > > > > into
> > > > > > > > > > Apache and improve, expand and build a developer
> community
> > > > around
> > > > > > it.
> > > > > > > > > >
> > > > > > > > > > I think this project can be a sub project of Apache
> Commons
> > > > since
> > > > > > it
> > > > > > > > > > provides generic components for parallelizing any kind of
> > > > > methods.
> > > > > > > > > >
> > > > > > > > > > Can somebody please guide me or suggest what other
> options
> > I
> > > > can
> > > > > > > > explore
> > > > > > > > > ?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Hi Arun,
> > > > > > > > >
> > > > > > > > > Thank you for your proposal.
> > > > > > > > >
> > > > > > > > > How would this be different from Apache Spark?
> > > > > > > > >
> > > > > > > > > Thank you,
> > > > > > > > > Gary
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Arun
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Matt Sicker <bo...@gmail.com>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Matt Sicker <bo...@gmail.com>
> >
>



-- 
Matt Sicker <bo...@gmail.com>

Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
Hi All,

I think it would be most helpful to note the distinction between the
parallelism aspect and the bridge-to-domain-classes aspect (currently done
with reflection in the proposed GitHub repo).

It seems (to me) that between the fork/join framework already in Java (a
low-level library) and Apache Spark (a low-level set of classes plus a
high-level, application-server-like code base), there are a ton of
options already out there. I am not sure what yet another framework would
do that is not already there.

Maybe the distinguishing factor here is the use of reflection? What about
annotations? That seems to be a more modern approach (Java 5! :-) than
typing method names in code (as currently done in the repo), which is a
nightmare to maintain, especially when you are in an evolving code base and
refactoring all the time.

Maybe an interesting angle would be decorating domain classes with
annotations and submitting those to fork/join. Just thinkin' aloud...
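
To make the annotation idea concrete, a minimal sketch might look like the
following. Everything here is an assumption for illustration: the
@RunInParallel annotation, the AnnotatedRunner helper and the PriceService
class are hypothetical and do not come from the proposed repo. The sketch
only handles no-argument methods and still uses reflection to make the call.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

// Hypothetical marker annotation; must be retained at runtime to be scanned.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface RunInParallel {}

final class AnnotatedRunner {

    // Submits every annotated no-arg method of 'target' to the common
    // fork/join pool and returns the results keyed by method name.
    static Map<String, Object> runAnnotated(Object target) {
        Map<String, ForkJoinTask<Object>> tasks = new TreeMap<>();
        for (Method m : target.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(RunInParallel.class) && m.getParameterCount() == 0) {
                m.setAccessible(true);
                Callable<Object> call = () -> m.invoke(target);
                tasks.put(m.getName(), ForkJoinPool.commonPool().submit(call));
            }
        }
        Map<String, Object> results = new TreeMap<>();
        for (Map.Entry<String, ForkJoinTask<Object>> e : tasks.entrySet()) {
            results.put(e.getKey(), e.getValue().join()); // wait for each task
        }
        return results;
    }
}

// Hypothetical domain class decorated with the annotation.
final class PriceService {
    @RunInParallel int basePrice() { return 100; }
    @RunInParallel String currency() { return "USD"; }
    int notParallel() { return -1; }
}
```

Because the methods are found by annotation rather than by name strings, a
rename refactoring does not silently break the call site.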

Gary

On Mon, Jun 12, 2017 at 4:29 PM, Matt Sicker <bo...@gmail.com> wrote:

> I'd be interested to see where this leads to. It could end up as a sort of
> Commons Parallel library. Besides providing an execution API, there could
> be plenty of support utilities that tend to be found in all the
> *Util(s)/*Helper classes in projects like all the ones I mentioned earlier
> (basically all sorts of Hadoop-related projects and other distributed
> systems here).
>
> Really, there's so many ways that such a project could head, I'd like to
> hear more ideas on what to focus on.
>
> On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:
>
> > The upshot is that there has to be a way to do this with some custom code
> > to at least have the ability to 'fast path' the code without reflection.
> > Using lambdas should make this fairly syntactically unobtrusive.
> >
> > On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com>
> > wrote:
> >
> > > Yes, reflection is not very performant but I don't think I have any
> other
> > > choice since the library has to inspect the object supplied by the
> client
> > > at runtime to pick out the methods to be invoked using
> CompletableFuture.
> > > But the performance penalty paid for using reflection will be more than
> > > offset by the savings of parallel method execution, more so as the no
> of
> > > methods executed in parallel increases.
> > >
> > > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <ga...@gmail.com>
> > > wrote:
> > >
> > > > On a lower-level, if you want to use this for lower-level services
> > (where
> > > > there is no network latency for example), you will need to avoid
> using
> > > > reflection to get the best performance.
> > > >
> > > > Gary
> > > >
> > > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <strider90arun@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi Gary,
> > > > >
> > > > > Thanks for your response. You have some valid and interesting
> points
> > > :-)
> > > > > Of course you are right that Spark is much more mature. Thanks for
> > your
> > > > > insight.
> > > > > It will be interesting indeed to find out if the core
> parallelization
> > > > > engine of Spark can be isolated like you suggest.
> > > > >
> > > > > I started working on this project because I felt that there was no
> > good
> > > > > library for parallelizing method calls which can be plugged in
> easily
> > > > into
> > > > > an existing java project. Ultimately, if such a solution can be
> > > > > incorporated in the Apache Commons, it would be a useful addition
> to
> > > the
> > > > > Commons repository.
> > > > >
> > > > > Thanks,
> > > > > Arun
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> > garydgregory@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Arun,
> > > > > >
> > > > > > Sure, and that is to be expected, Spark is more mature than a
> four
> > > > class
> > > > > > prototype. What I am trying to get to is that in order for the
> > > library
> > > > to
> > > > > > be useful, you will end up with more in a first release, and
> after
> > a
> > > > > couple
> > > > > > more releases, there will be more and more. Would Spark not have
> in
> > > its
> > > > > > guts the same kind of code your are proposing here? By extension,
> > > will
> > > > > you
> > > > > > not end up with more framework-like (Spark-like) code and
> solutions
> > > as
> > > > > > found in Spark? I am just playing devil's advocate here ;-)
> > > > > >
> > > > > >
> > > > > > What would be interesting would be to find out if there is a core
> > > part
> > > > of
> > > > > > Spark that is separable and ex tractable into a Commons
> component.
> > > > Since
> > > > > > Spark has a proven track record, it is more likely, that such a
> > > library
> > > > > > would be generally useful than one created from scratch that does
> > not
> > > > > > integrate with anything else. Again, please do not take any of
> this
> > > > > > personally, I am just playing here :-)
> > > > > >
> > > > > > Gary
> > > > > >
> > > > > >
> > > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <bo...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > I already see a huge difference here: Spark requires a bunch of
> > > > > > > infrastructure to be set up, while this library is just a
> > library.
> > > > > > Similar
> > > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or
> > Samza
> > > or
> > > > > the
> > > > > > > others.
> > > > > > >
> > > > > > > On 12 June 2017 at 16:28, Gary Gregory <garydgregory@gmail.com
> >
> > > > wrote:
> > > > > > >
> > > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > > > strider90arun@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > Good afternoon.
> > > > > > > > >
> > > > > > > > > I have been working on a java generic parallel execution
> > > library
> > > > > > which
> > > > > > > > will
> > > > > > > > > allow clients to execute methods in parallel irrespective
> of
> > > the
> > > > > > number
> > > > > > > > of
> > > > > > > > > method arguments, type of method arguments, return type of
> > the
> > > > > method
> > > > > > > > etc.
> > > > > > > > >
> > > > > > > > > Here is the link to the source code:
> > > > > > > > > https://github.com/striderarun/parallel-execution-engine
> > > > > > > > >
> > > > > > > > > The project is in a nascent state and I am the only
> > contributor
> > > > so
> > > > > > > far. I
> > > > > > > > > am new to the Apache community and I would like to bring
> this
> > > > > project
> > > > > > > > into
> > > > > > > > > Apache and improve, expand and build a developer community
> > > around
> > > > > it.
> > > > > > > > >
> > > > > > > > > I think this project can be a sub project of Apache Commons
> > > since
> > > > > it
> > > > > > > > > provides generic components for parallelizing any kind of
> > > > methods.
> > > > > > > > >
> > > > > > > > > Can somebody please guide me or suggest what other options
> I
> > > can
> > > > > > > explore
> > > > > > > > ?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Hi Arun,
> > > > > > > >
> > > > > > > > Thank you for your proposal.
> > > > > > > >
> > > > > > > > How would this be different from Apache Spark?
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Gary
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Arun
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Matt Sicker <bo...@gmail.com>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Matt Sicker <bo...@gmail.com>
>

Re: Commons sub project for parallel method execution

Posted by Matt Sicker <bo...@gmail.com>.
I'd be interested to see where this leads. It could end up as a sort of
Commons Parallel library. Besides providing an execution API, there could
be plenty of support utilities that tend to be found in all the
*Util(s)/*Helper classes in projects like all the ones I mentioned earlier
(basically all sorts of Hadoop-related projects and other distributed
systems here).

Really, there are so many directions such a project could take; I'd like
to hear more ideas on what to focus on.

On 12 June 2017 at 18:19, Gary Gregory <ga...@gmail.com> wrote:

> The upshot is that there has to be a way to do this with some custom code
> to at least have the ability to 'fast path' the code without reflection.
> Using lambdas should make this fairly syntactically unobtrusive.
>
> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com>
> wrote:
>
> > Yes, reflection is not very performant but I don't think I have any other
> > choice since the library has to inspect the object supplied by the client
> > at runtime to pick out the methods to be invoked using CompletableFuture.
> > But the performance penalty paid for using reflection will be more than
> > offset by the savings of parallel method execution, more so as the no of
> > methods executed in parallel increases.
> >
> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <ga...@gmail.com>
> > wrote:
> >
> > > On a lower-level, if you want to use this for lower-level services
> (where
> > > there is no network latency for example), you will need to avoid using
> > > reflection to get the best performance.
> > >
> > > Gary
> > >
> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <st...@gmail.com>
> > > wrote:
> > >
> > > > Hi Gary,
> > > >
> > > > Thanks for your response. You have some valid and interesting points
> > :-)
> > > > Of course you are right that Spark is much more mature. Thanks for
> your
> > > > insight.
> > > > It will be interesting indeed to find out if the core parallelization
> > > > engine of Spark can be isolated like you suggest.
> > > >
> > > > I started working on this project because I felt that there was no
> good
> > > > library for parallelizing method calls which can be plugged in easily
> > > into
> > > > an existing java project. Ultimately, if such a solution can be
> > > > incorporated in the Apache Commons, it would be a useful addition to
> > the
> > > > Commons repository.
> > > >
> > > > Thanks,
> > > > Arun
> > > >
> > > >
> > > >
> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <
> garydgregory@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Arun,
> > > > >
> > > > > Sure, and that is to be expected, Spark is more mature than a four
> > > class
> > > > > prototype. What I am trying to get to is that in order for the
> > library
> > > to
> > > > > be useful, you will end up with more in a first release, and after
> a
> > > > couple
> > > > > more releases, there will be more and more. Would Spark not have in
> > its
> > > > > guts the same kind of code your are proposing here? By extension,
> > will
> > > > you
> > > > > not end up with more framework-like (Spark-like) code and solutions
> > as
> > > > > found in Spark? I am just playing devil's advocate here ;-)
> > > > >
> > > > >
> > > > > What would be interesting would be to find out if there is a core
> > part
> > > of
> > > > > Spark that is separable and ex tractable into a Commons component.
> > > Since
> > > > > Spark has a proven track record, it is more likely, that such a
> > library
> > > > > would be generally useful than one created from scratch that does
> not
> > > > > integrate with anything else. Again, please do not take any of this
> > > > > personally, I am just playing here :-)
> > > > >
> > > > > Gary
> > > > >
> > > > >
> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <bo...@gmail.com>
> > wrote:
> > > > >
> > > > > > I already see a huge difference here: Spark requires a bunch of
> > > > > > infrastructure to be set up, while this library is just a
> library.
> > > > > Similar
> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or
> Samza
> > or
> > > > the
> > > > > > others.
> > > > > >
> > > > > > On 12 June 2017 at 16:28, Gary Gregory <ga...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > > strider90arun@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > Good afternoon.
> > > > > > > >
> > > > > > > > I have been working on a java generic parallel execution
> > library
> > > > > which
> > > > > > > will
> > > > > > > > allow clients to execute methods in parallel irrespective of
> > the
> > > > > number
> > > > > > > of
> > > > > > > > method arguments, type of method arguments, return type of
> the
> > > > method
> > > > > > > etc.
> > > > > > > >
> > > > > > > > Here is the link to the source code:
> > > > > > > > https://github.com/striderarun/parallel-execution-engine
> > > > > > > >
> > > > > > > > The project is in a nascent state and I am the only
> contributor
> > > so
> > > > > > far. I
> > > > > > > > am new to the Apache community and I would like to bring this
> > > > project
> > > > > > > into
> > > > > > > > Apache and improve, expand and build a developer community
> > around
> > > > it.
> > > > > > > >
> > > > > > > > I think this project can be a sub project of Apache Commons
> > since
> > > > it
> > > > > > > > provides generic components for parallelizing any kind of
> > > methods.
> > > > > > > >
> > > > > > > > Can somebody please guide me or suggest what other options I
> > can
> > > > > > explore
> > > > > > > ?
> > > > > > > >
> > > > > > >
> > > > > > > Hi Arun,
> > > > > > >
> > > > > > > Thank you for your proposal.
> > > > > > >
> > > > > > > How would this be different from Apache Spark?
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Gary
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Arun
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Matt Sicker <bo...@gmail.com>
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
Matt Sicker <bo...@gmail.com>

Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
The upshot is that there has to be a way to do this with some custom code
to at least have the ability to 'fast path' the code without reflection.
Using lambdas should make this fairly syntactically unobtrusive.
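
As a rough illustration of what such a fast path could look like, here is a
minimal sketch. The LambdaFastPath class and its runAll method are
hypothetical names, not part of the proposed library. The caller wraps each
method call in a lambda, so the compiler checks the call and no reflective
lookup happens at runtime.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class LambdaFastPath {

    // Starts every task on the common pool, then collects the results
    // in the same order the tasks were supplied.
    static <T> List<T> runAll(List<Supplier<T>> tasks) {
        List<CompletableFuture<T>> futures = new ArrayList<>();
        for (Supplier<T> task : tasks) {
            futures.add(CompletableFuture.supplyAsync(task));
        }
        List<T> results = new ArrayList<>();
        for (CompletableFuture<T> future : futures) {
            results.add(future.join()); // blocks until that task is done
        }
        return results;
    }
}
```

A call site would pass something like () -> service.lookup(id) for each
method it wants run in parallel (service and lookup being the caller's own
code), which also survives renames during refactoring.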

On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <st...@gmail.com> wrote:

> Yes, reflection is not very performant but I don't think I have any other
> choice since the library has to inspect the object supplied by the client
> at runtime to pick out the methods to be invoked using CompletableFuture.
> But the performance penalty paid for using reflection will be more than
> offset by the savings of parallel method execution, more so as the no of
> methods executed in parallel increases.
>
> On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <ga...@gmail.com>
> wrote:
>
> > On a lower-level, if you want to use this for lower-level services (where
> > there is no network latency for example), you will need to avoid using
> > reflection to get the best performance.
> >
> > Gary
> >
> > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <st...@gmail.com>
> > wrote:
> >
> > > Hi Gary,
> > >
> > > Thanks for your response. You have some valid and interesting points
> :-)
> > > Of course you are right that Spark is much more mature. Thanks for your
> > > insight.
> > > It will be interesting indeed to find out if the core parallelization
> > > engine of Spark can be isolated like you suggest.
> > >
> > > I started working on this project because I felt that there was no good
> > > library for parallelizing method calls which can be plugged in easily
> > into
> > > an existing java project. Ultimately, if such a solution can be
> > > incorporated in the Apache Commons, it would be a useful addition to
> the
> > > Commons repository.
> > >
> > > Thanks,
> > > Arun
> > >
> > >
> > >
> > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <ga...@gmail.com>
> > > wrote:
> > >
> > > > Hi Arun,
> > > >
> > > > Sure, and that is to be expected, Spark is more mature than a four
> > class
> > > > prototype. What I am trying to get to is that in order for the
> library
> > to
> > > > be useful, you will end up with more in a first release, and after a
> > > couple
> > > > more releases, there will be more and more. Would Spark not have in
> its
> > > > guts the same kind of code your are proposing here? By extension,
> will
> > > you
> > > > not end up with more framework-like (Spark-like) code and solutions
> as
> > > > found in Spark? I am just playing devil's advocate here ;-)
> > > >
> > > >
> > > > What would be interesting would be to find out if there is a core
> part
> > of
> > > > Spark that is separable and ex tractable into a Commons component.
> > Since
> > > > Spark has a proven track record, it is more likely, that such a
> library
> > > > would be generally useful than one created from scratch that does not
> > > > integrate with anything else. Again, please do not take any of this
> > > > personally, I am just playing here :-)
> > > >
> > > > Gary
> > > >
> > > >
> > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <bo...@gmail.com>
> wrote:
> > > >
> > > > > I already see a huge difference here: Spark requires a bunch of
> > > > > infrastructure to be set up, while this library is just a library.
> > > > Similar
> > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or Samza
> or
> > > the
> > > > > others.
> > > > >
> > > > > On 12 June 2017 at 16:28, Gary Gregory <ga...@gmail.com>
> > wrote:
> > > > >
> > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <
> > strider90arun@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > Good afternoon.
> > > > > > >
> > > > > > > I have been working on a java generic parallel execution
> library
> > > > which
> > > > > > will
> > > > > > > allow clients to execute methods in parallel irrespective of
> the
> > > > number
> > > > > > of
> > > > > > > method arguments, type of method arguments, return type of the
> > > method
> > > > > > etc.
> > > > > > >
> > > > > > > Here is the link to the source code:
> > > > > > > https://github.com/striderarun/parallel-execution-engine
> > > > > > >
> > > > > > > The project is in a nascent state and I am the only contributor
> > so
> > > > > far. I
> > > > > > > am new to the Apache community and I would like to bring this
> > > project
> > > > > > into
> > > > > > > Apache and improve, expand and build a developer community
> around
> > > it.
> > > > > > >
> > > > > > > I think this project can be a sub project of Apache Commons
> since
> > > it
> > > > > > > provides generic components for parallelizing any kind of
> > methods.
> > > > > > >
> > > > > > > Can somebody please guide me or suggest what other options I
> can
> > > > > explore
> > > > > > ?
> > > > > > >
> > > > > >
> > > > > > Hi Arun,
> > > > > >
> > > > > > Thank you for your proposal.
> > > > > >
> > > > > > How would this be different from Apache Spark?
> > > > > >
> > > > > > Thank you,
> > > > > > Gary
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Arun
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Matt Sicker <bo...@gmail.com>
> > > > >
> > > >
> > >
> >
>

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
Yes, reflection is not very performant, but I don't think I have any other
choice, since the library has to inspect the object supplied by the client
at runtime to pick out the methods to be invoked using CompletableFuture.
But the performance penalty paid for using reflection will be more than
offset by the savings from parallel method execution, more so as the number
of methods executed in parallel increases.
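
For anyone who has not looked at the repo, the general shape of the approach
is roughly as follows. This is a simplified sketch written for this thread,
not the library's actual code: the ReflectiveParallelCall class and callAsync
method are hypothetical, and matching on method name and argument count alone
is a deliberate simplification that ignores overloads with the same arity.

```java
import java.lang.reflect.Method;
import java.util.concurrent.CompletableFuture;

final class ReflectiveParallelCall {

    // Looks up a public method by name and argument count at runtime and
    // invokes it asynchronously on the common pool.
    static CompletableFuture<Object> callAsync(Object target, String methodName, Object... args) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                for (Method m : target.getClass().getMethods()) {
                    if (m.getName().equals(methodName) && m.getParameterCount() == args.length) {
                        return m.invoke(target, args);
                    }
                }
                throw new NoSuchMethodException(methodName);
            } catch (ReflectiveOperationException e) {
                // Surfaces to the caller when the future is joined.
                throw new IllegalStateException(e);
            }
        });
    }
}
```

The lookup and invoke add a per-call cost, which matters little when each
method does I/O and more when the methods themselves are cheap.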

On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <ga...@gmail.com>
wrote:

> On a lower-level, if you want to use this for lower-level services (where
> there is no network latency for example), you will need to avoid using
> reflection to get the best performance.
>
> Gary

Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
At a lower level, if you want to use this for lower-level services (where
there is no network latency, for example), you will need to avoid using
reflection to get the best performance.
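
As a rough illustration of what avoiding reflection could look like, the
sketch below has the caller pass lambdas or method references, so the JVM
dispatches the calls directly. The names DirectParallelSketch, runAll and
square are hypothetical and chosen only for this example; this is one
possible shape, not a proposal for the library's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch; no reflection is involved anywhere in this class.
class DirectParallelSketch {

    // Runs each supplier on the common pool and collects results in input order.
    static <T> List<T> runAll(List<Supplier<T>> tasks) {
        List<CompletableFuture<T>> futures = new ArrayList<>();
        for (Supplier<T> task : tasks) {
            futures.add(CompletableFuture.supplyAsync(task));
        }
        List<T> results = new ArrayList<>();
        for (CompletableFuture<T> future : futures) {
            results.add(future.join());
        }
        return results;
    }

    // Stand-in for a real service method.
    static int square(int x) {
        return x * x;
    }

    public static void main(String[] args) {
        // Arguments are captured by the lambdas, so arity and types do not matter.
        List<Supplier<Integer>> tasks = Arrays.asList(
                () -> square(3),
                () -> "commons".length(),
                () -> Math.max(4, 9));
        System.out.println(runAll(tasks)); // prints [9, 7, 9]
    }
}
```

The trade-off is that the caller must know the methods at compile time,
which is exactly what a reflection-based library avoids.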

Gary

On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <st...@gmail.com> wrote:

> Hi Gary,
>
> Thanks for your response. You have some valid and interesting points :-)
> Of course you are right that Spark is much more mature. Thanks for your
> insight.
> It will be interesting indeed to find out if the core parallelization
> engine of Spark can be isolated like you suggest.
>
> I started working on this project because I felt that there was no good
> library for parallelizing method calls which can be plugged in easily into
> an existing java project. Ultimately, if such a solution can be
> incorporated in the Apache Commons, it would be a useful addition to the
> Commons repository.
>
> Thanks,
> Arun

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
Hi Gary,

Thanks for your response. You have some valid and interesting points :-)
Of course you are right that Spark is much more mature. Thanks for your
insight.
It will be interesting indeed to find out if the core parallelization
engine of Spark can be isolated like you suggest.

I started working on this project because I felt that there was no good
library for parallelizing method calls which can be plugged in easily into
an existing Java project. Ultimately, if such a solution can be
incorporated in the Apache Commons, it would be a useful addition to the
Commons repository.

Thanks,
Arun



On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <ga...@gmail.com>
wrote:

> Hi Arun,
>
> Sure, and that is to be expected, Spark is more mature than a four class
> prototype. What I am trying to get to is that in order for the library to
> be useful, you will end up with more in a first release, and after a couple
> more releases, there will be more and more. Would Spark not have in its
> guts the same kind of code you are proposing here? By extension, will you
> not end up with more framework-like (Spark-like) code and solutions as
> found in Spark? I am just playing devil's advocate here ;-)
>
>
> What would be interesting would be to find out if there is a core part of
> Spark that is separable and extractable into a Commons component. Since
> Spark has a proven track record, it is more likely that such a library
> would be generally useful than one created from scratch that does not
> integrate with anything else. Again, please do not take any of this
> personally, I am just playing here :-)
>
> Gary

Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
Hi Arun,

Sure, and that is to be expected: Spark is more mature than a four-class
prototype. What I am trying to get to is that in order for the library to
be useful, you will end up with more in a first release, and after a couple
more releases, there will be more and more. Would Spark not have in its
guts the same kind of code you are proposing here? By extension, will you
not end up with more framework-like (Spark-like) code and solutions as
found in Spark? I am just playing devil's advocate here ;-)


What would be interesting would be to find out if there is a core part of
Spark that is separable and extractable into a Commons component. Since
Spark has a proven track record, it is more likely that such a library
would be generally useful than one created from scratch that does not
integrate with anything else. Again, please do not take any of this
personally, I am just playing here :-)

Gary


On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <bo...@gmail.com> wrote:

> I already see a huge difference here: Spark requires a bunch of
> infrastructure to be set up, while this library is just a library. Similar
> to Kafka Streams versus Spark Streaming or Flink or Storm or Samza or the
> others.

Re: Commons sub project for parallel method execution

Posted by Arun Mohan <st...@gmail.com>.
Hi,

This project is a pure Java project with no dependencies on any library
outside the JDK. It can be used with any Java-based project or Java
framework like Spring as just a jar file.
Given some objects and a couple of methods on these objects, the library
can execute these methods in parallel.
The client just needs to specify the method signatures that they want to be
executed in parallel.
As Matt said, it doesn't require any other setup and can be used in almost
any existing Java 8 codebase.

Thanks,
Arun


On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <bo...@gmail.com> wrote:

> I already see a huge difference here: Spark requires a bunch of
> infrastructure to be set up, while this library is just a library. Similar
> to Kafka Streams versus Spark Streaming or Flink or Storm or Samza or the
> others.

Re: Commons sub project for parallel method execution

Posted by Matt Sicker <bo...@gmail.com>.
I already see a huge difference here: Spark requires a bunch of
infrastructure to be set up, while this library is just a library. Similar
to Kafka Streams versus Spark Streaming or Flink or Storm or Samza or the
others.

On 12 June 2017 at 16:28, Gary Gregory <ga...@gmail.com> wrote:

> On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <st...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > Good afternoon.
> >
> > I have been working on a java generic parallel execution library which
> will
> > allow clients to execute methods in parallel irrespective of the number
> of
> > method arguments, type of method arguments, return type of the method
> etc.
> >
> > Here is the link to the source code:
> > https://github.com/striderarun/parallel-execution-engine
> >
> > The project is in a nascent state and I am the only contributor so far. I
> > am new to the Apache community and I would like to bring this project
> into
> > Apache and improve, expand and build a developer community around it.
> >
> > I think this project can be a sub project of Apache Commons since it
> > provides generic components for parallelizing any kind of methods.
> >
> > Can somebody please guide me or suggest what other options I can explore
> ?
> >
>
> Hi Arun,
>
> Thank you for your proposal.
>
> How would this be different from Apache Spark?
>
> Thank you,
> Gary
>
>
> >
> > Thanks,
> > Arun
> >
>



-- 
Matt Sicker <bo...@gmail.com>

Re: Commons sub project for parallel method execution

Posted by Gary Gregory <ga...@gmail.com>.
On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <st...@gmail.com> wrote:

> Hi All,
>
> Good afternoon.
>
> I have been working on a java generic parallel execution library which will
> allow clients to execute methods in parallel irrespective of the number of
> method arguments, type of method arguments, return type of the method etc.
>
> Here is the link to the source code:
> https://github.com/striderarun/parallel-execution-engine
>
> The project is in a nascent state and I am the only contributor so far. I
> am new to the Apache community and I would like to bring this project into
> Apache and improve, expand and build a developer community around it.
>
> I think this project can be a sub project of Apache Commons since it
> provides generic components for parallelizing any kind of methods.
>
> Can somebody please guide me or suggest what other options I can explore ?
>

Hi Arun,

Thank you for your proposal.

How would this be different from Apache Spark?

Thank you,
Gary


>
> Thanks,
> Arun
>