Posted to dev@beam.apache.org by Davor Bonaci <da...@apache.org> on 2018/02/19 04:40:28 UTC
Re: Euphoria Java 8 DSL - proposal
I may have missed things, but any update on the progress of this donation?
On Tue, Jan 2, 2018 at 10:52 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:
> Great !
>
> Thanks !
> Regards
> JB
>
> On 01/03/2018 07:29 AM, David Morávek wrote:
>
>> Hello JB,
>>
>> Perfect! I'm already on the Beam Slack workspace, I'll contact you once I
>> get to the office.
>>
>> Thanks!
>> D.
>>
>> On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré <jb@nanthrax.net> wrote:
>>
>> Hi David,
>>
>> absolutely !! Let's move forward on the preparation steps.
>>
>> Are you on Slack and/or hangout to plan this ?
>>
>> Thanks,
>> Regards
>> JB
>>
>> On 01/02/2018 05:35 PM, David Morávek wrote:
>>
>> Hello JB,
>>
>> can we help in any way to move things forward?
>>
>> Thanks,
>> D.
>>
>> On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré <jb@nanthrax.net> wrote:
>>
>> Thanks Jan,
>>
>> It makes sense.
>>
>> Let me take a look at the code to understand the "interaction".
>>
>> Regards
>> JB
>>
>>
>> On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>>
>> Hi JB,
>>
>> basically you are not wrong. The project started about three or four
>> years ago with a goal to unify batch and streaming processing into a
>> single portable, executor-independent API. Because of that, it is
>> currently "close" to Beam in this sense. But we don't see much added
>> value in keeping this as a separate project, with one of the key
>> differences being the API (not the model itself), so we would like to
>> focus on translation from the Euphoria API to Beam's SDK. That's why we
>> would like to see it as a DSL, so that it would be possible to use the
>> Euphoria API with Beam's runners as natively as possible.
>>
>> I hope I didn't make the subject even more unclear; if so, I'll be happy
>> to explain anything in more detail. :-)
>>
>> Jan
>>
>>
>> On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>>
>> Hi Jan,
>>
>> Thanks for your answers.
>>
>> However, they confused me ;)
>>
>> Regarding what you replied, Euphoria seems like a programming
>> model/SDK "close" to Beam more than a DSL on top of an existing Beam
>> SDK.
>>
>> Am I wrong ?
>>
>> Regards
>> JB
>>
>> On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>>
>> Hi Ismael,
>>
>> basically we adopted Beam's design regarding partitioning
>> (https://github.com/seznam/euphoria/issues/160) and implemented
>> the sorting manually (https://github.com/seznam/euphoria/issues/158).
>> I'm not aware of the time model differences (Euphoria supports
>> ingestion and event time, we don't support processing time by
>> decision). Regarding other differences (looking into the Beam
>> capability matrix), I'd say that:
>>
>> - we don't support stateful FlatMap (i.e. ParDo) for now
>> (https://github.com/seznam/euphoria/issues/192)
>>
>> - we don't support side inputs (by decision now, but might be
>> reconsidered) and outputs
>> (https://github.com/seznam/euphoria/issues/124)
>>
>> - we support complete event-time windows (non-merging, merging,
>> aligned, unaligned) and time control
>>
>> - we don't support processing time by decision (might be
>> reconsidered if a valid use-case is found)
>>
>> - we support window triggering based on both time and data,
>> including discarding and accumulating (without accumulating &
>> retracting)
>>
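The difference between the discarding and accumulating modes mentioned in the list above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Euphoria or Beam code: the class `TriggerModesSketch`, the method `firedPanes`, and the "fire after every N elements" trigger are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of trigger accumulation modes; not Euphoria or Beam API. */
final class TriggerModesSketch {

    enum Mode { DISCARDING, ACCUMULATING }

    /**
     * Sums the elements of a single window and fires a pane after every
     * {@code everyN} elements (a data-driven trigger).
     */
    public static List<Integer> firedPanes(List<Integer> windowElements, int everyN, Mode mode) {
        List<Integer> panes = new ArrayList<>();
        int sum = 0;
        int sinceLastFiring = 0;
        for (int value : windowElements) {
            sum += value;
            sinceLastFiring++;
            if (sinceLastFiring == everyN) {
                panes.add(sum);
                sinceLastFiring = 0;
                if (mode == Mode.DISCARDING) {
                    // Discarding: window state is dropped after each firing,
                    // so the next pane only covers elements seen since then.
                    sum = 0;
                }
                // Accumulating: state is kept, so each pane refines the previous one.
            }
        }
        return panes;
    }

    public static void main(String[] args) {
        List<Integer> window = Arrays.asList(1, 2, 3, 4);
        System.out.println(firedPanes(window, 2, Mode.DISCARDING));   // [3, 7]
        System.out.println(firedPanes(window, 2, Mode.ACCUMULATING)); // [3, 10]
    }
}
```

In discarding mode a downstream consumer has to add the panes up itself; in accumulating mode the latest pane replaces the earlier ones.
>>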
>> All our executors (runners) - Flink, Spark and Local - implement
>> the complete model, which we enforce using an "operator test kit"
>> that all executors must pass. The Spark executor supports bounded
>> sources only (for now). As David said, we currently don't have a
>> serialization abstraction, so there is some work to be done in
>> that regard.
>>
>> Our intention is to completely supersede Euphoria. We would like
>> to consider the possibility of using executors that would not rely
>> on Beam, but that is optional now and should be straightforward.
>>
>> We'd be happy to answer any more questions you might have and
>> thanks a lot!
>>
>> Best,
>>
>> Jan
>>
>>
>> On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>>
>> Hi,
>>
>> It is great to see that you guys have reached a maturity point to
>> propose this. Congratulations on your work and the idea to contribute
>> it to Beam.
>>
>> I remember from a previous discussion with Jan about the model
>> mismatch between Euphoria and Beam, because of some design decisions
>> of both projects. I remember you guys had some issues with the way
>> Beam's sources do partitioning, as well as Beam's lack of sorted data
>> (on shuffle a la hadoop). Also, if I remember well, the 'time' model of
>> Euphoria was simpler than Beam's. I talk about all of this because I
>> am curious about what parts of the Euphoria model you guys had to
>> sacrifice to support Beam, and what parts of Beam's model should still
>> be integrated into Euphoria (and if there is a straightforward path to
>> do it).
>>
>> If I understand well, if this gets merged into Apache this means that
>> Euphoria's current implementation would be superseded by this DSL? I
>> am curious because I would like to understand your level of investment
>> in supporting the future of this DSL.
>>
>> Thanks and congrats again !
>> Ismaël
>>
>> On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré
>> <jb@nanthrax.net> wrote:
>>
>> Depending on the donation, you would need an ICLA for each
>> contributor, and a CCLA in addition to the SGA.
>>
>> We can sync with Davor and me on the legal stuff.
>> However, I would wait a little bit just to have feedback from the
>> whole team and start a formal vote.
>>
>> I would be happy to start the formal vote.
>>
>> Regards
>> JB
>>
>> On 12/18/2017 10:03 AM, David Morávek wrote:
>>
>> Hello,
>>
>> Thanks for the awesome feedback!
>>
>> Romain:
>>
>> We already use the Java Stream API in all operators where it makes
>> sense (e.g. ReduceByKey). Still not sure if it was a good choice, but
>> it can be easily converted to an iterator anyway.
>>
>> Side outputs support is coming soon; we have already done some initial
>> work on this.
>>
>> Side inputs are not supported in the way you are used to from Beam,
>> because they can be replaced by a Join operator on the same key (if
>> annotated with broadcastHashJoin, it will be turned into a map-side
>> join).
>>
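The map-side join idea above can be sketched with plain Java collections: the small side is loaded into a hash map (the part that would be broadcast to every worker), and the large side is then processed element by element with a lookup, so no shuffle of the large side is needed. This is a minimal sketch under stated assumptions; `BroadcastHashJoinSketch` and `join` are hypothetical names and do not reflect Euphoria's actual Join operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical illustration of a broadcast (map-side) inner hash join. */
final class BroadcastHashJoinSketch {

    public static <L, R, K, O> List<O> join(
            List<L> large,
            List<R> small,
            Function<L, K> leftKey,
            Function<R, K> rightKey,
            BiFunction<L, R, O> joiner) {
        // Build phase: index the small side by key. In a distributed run,
        // this map is what every worker would receive a copy of.
        Map<K, List<R>> index = new HashMap<>();
        for (R right : small) {
            index.computeIfAbsent(rightKey.apply(right), k -> new ArrayList<>()).add(right);
        }
        // Probe phase: a pure per-element map over the large side.
        List<O> output = new ArrayList<>();
        for (L left : large) {
            List<R> matches = index.getOrDefault(leftKey.apply(left), Collections.emptyList());
            for (R right : matches) {
                output.add(joiner.apply(left, right));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> visits = Arrays.asList("cz", "de", "cz", "fr");
        List<String> countries = Arrays.asList("cz=Czechia", "de=Germany");
        List<String> joined = join(
                visits,
                countries,
                code -> code,
                entry -> entry.substring(0, 2),
                (code, entry) -> code + " -> " + entry.substring(3));
        // "fr" has no match and is dropped, as in an inner join.
        System.out.println(joined);
    }
}
```

The approach only works when the small side fits in memory on each worker, which is the same constraint a side input has.
>>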
>> The only significant difference from Beam is that we decided not to
>> abstract serialization, so we need to add support for Type Hints,
>> because of type erasure.
>>
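Why type erasure makes type hints necessary can be shown in a few lines of Java. The sketch below assumes a hypothetical `TypeHint` class built on the well-known "super type token" idiom; it is not Euphoria's API. At runtime a `List<String>` and a `List<Long>` share one class, so a framework cannot discover the element type of an operator's output by inspecting the object; an anonymous subclass, however, keeps its generic superclass in class metadata.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical type hint using the "super type token" idiom. */
abstract class TypeHint<T> {
    private final Type type;

    protected TypeHint() {
        // Must be instantiated as an anonymous subclass, e.g. new TypeHint<List<String>>() {}.
        // The subclass records TypeHint<List<String>> as its generic superclass,
        // and that information survives erasure.
        ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superType.getActualTypeArguments()[0];
    }

    public Type getType() {
        return type;
    }
}

final class TypeErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Long> longs = new ArrayList<>();
        // Erasure: both lists have exactly the same runtime class.
        System.out.println(strings.getClass() == longs.getClass()); // true

        // The hint carries the full generic type explicitly.
        TypeHint<List<String>> hint = new TypeHint<List<String>>() {};
        System.out.println(hint.getType().getTypeName()); // java.util.List<java.lang.String>
    }
}
```

A serialization layer could use such a hint to pick a serializer for an operator's output where reflection alone would only see the raw type.
>>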
>> Fluent API:
>>
>> The API is fluent within one operator. It is designed to "lead the
>> programmer", which means that they will only be offered methods that
>> make sense after the last method they used (e.g. in ReduceByKey, we
>> know that after keyBy either reduceBy method should come). It is
>> implemented as a series of builders.
>>
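The "series of builders" approach can be sketched as a staged builder, where each step returns a type that exposes only the methods valid at that point, so calling them out of order does not compile. This is a minimal sketch under stated assumptions: the class `ReduceByKeySketch`, its stage classes, and the exact method set `of`, `keyBy`, `reduceBy`, `output` are hypothetical and only loosely modeled on the description above, not taken from Euphoria's source.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Hypothetical staged builder; names are illustrative, not Euphoria's real API. */
final class ReduceByKeySketch {

    private ReduceByKeySketch() {}

    /** Entry point: after naming the input, only keyBy is offered. */
    public static <T> KeyByStage<T> of(List<T> input) {
        return new KeyByStage<>(input);
    }

    public static final class KeyByStage<T> {
        private final List<T> input;

        private KeyByStage(List<T> input) {
            this.input = input;
        }

        /** Choosing the key unlocks reduceBy and nothing else. */
        public <K> ReduceByStage<T, K> keyBy(Function<T, K> keyFn) {
            return new ReduceByStage<>(input, keyFn);
        }
    }

    public static final class ReduceByStage<T, K> {
        private final List<T> input;
        private final Function<T, K> keyFn;

        private ReduceByStage(List<T> input, Function<T, K> keyFn) {
            this.input = input;
            this.keyFn = keyFn;
        }

        /** Supplying the reducer unlocks the terminal output step. */
        public OutputStage<T, K> reduceBy(BinaryOperator<T> reducer) {
            return new OutputStage<>(input, keyFn, reducer);
        }
    }

    public static final class OutputStage<T, K> {
        private final List<T> input;
        private final Function<T, K> keyFn;
        private final BinaryOperator<T> reducer;

        private OutputStage(List<T> input, Function<T, K> keyFn, BinaryOperator<T> reducer) {
            this.input = input;
            this.keyFn = keyFn;
            this.reducer = reducer;
        }

        /** Evaluates eagerly here; a real DSL would add a node to a dataflow graph instead. */
        public Map<K, T> output() {
            Map<K, T> result = new LinkedHashMap<>();
            for (T element : input) {
                result.merge(keyFn.apply(element), element, reducer);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> sums = ReduceByKeySketch
                .of(Arrays.asList(1, 2, 3, 4))
                .keyBy(n -> n % 2 == 0 ? "even" : "odd")
                .reduceBy(Integer::sum)
                .output();
        System.out.println(sums); // {odd=4, even=6}
    }
}
```

With this shape, an IDE's completion list after `keyBy(...)` contains only `reduceBy`, which is what "leading the programmer" amounts to in practice.
>>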
>> Davor:
>>
>> Thanks, I'll contact you and start the process of having all the
>> necessary paperwork signed on our side, so we can get things moving.
>>
>> On Mon, Dec 18, 2017 at 7:46 AM, Romain Manni-Bucau
>> <rmannibucau@gmail.com> wrote:
>>
>> Hi guys
>>
>> A DSL would be very welcome, in particular if fluent.
>>
>> Open question: did you consider implementing the Stream API (surely
>> extending it to have a BeamStream and a few more features like sides
>> etc)? It would be very natural, easy to integrate anywhere, and would
>> avoid having to discover a new API.
>>
>> Hazelcast Jet did it so I don't see why Beam couldn't.
>>
>> On 18 Dec 2017 at 07:26, "Davor Bonaci" <davor@apache.org> wrote:
>>
>> Hi David,
>> As JB noted, merging of these two projects is a great idea. In fact,
>> some of us have had those discussions in the past.
>>
>> Legally, nothing particular is strictly necessary as the code seems to
>> already be Apache 2.0 licensed. We don't, however, want to be perceived
>> as making hostile forks, so it would be great to file a Software Grant
>> Agreement with the ASF Secretary. I can help with the process, as
>> necessary.
>>
>> Project alignment-wise, there aren't any particular blockers that I am
>> aware of. We welcome DSLs.
>>
>> Technically, the code would start in a feature branch. During this
>> stage, we'd need to validate a few things, including confirmation that
>> the code and dependencies match the ASF policy, automate testing in
>> Beam's tooling, etc. At that point, we'd take a community vote to
>> accept the component into master, and consider author(s) for
>> committership in the overall project.
>>
>> Welcome to the ASF and Beam -- we are thrilled to have you! Hope this
>> helps, and please reach out if anybody on our end can help, including
>> JB or myself.
>>
>> Davor
>>
>>
>> On Sun, Dec 17, 2017 at 10:13 AM, Jean-Baptiste Onofré
>> <jb@nanthrax.net> wrote:
>>
>> Hi David,
>>
>> Generally speaking, having different fluent DSLs on top of the Beam
>> SDK is great.
>>
>> I would like to take a look at your wordcount examples to give you
>> complete feedback. I like the idea and a fluent Java DSL is
>> valuable.
>>
>> Let's wait for feedback from others. If we have a consensus, then I
>> would be more than happy to help you with the donation (I worked on
>> the Camel Java DSL a while ago, so I have some experience here).
>>
>> Thanks !
>> Regards
>> JB
>>
>> On 12/17/2017 07:00 PM, David Morávek wrote:
>>
>> Hello,
>>
>>
>> First of all, thanks for the amazing work the Apache Beam
>> community is doing!
>>
>> In 2014, we started development of a runtime-independent
>> Java 8 API that helps us create unified big-data processing
>> flows. It has been used as a core building block of the Seznam.cz
>> web crawler data infrastructure ever since. Its design
>> principles and execution model are very similar to Apache
>> Beam.
>>
>> This API was open sourced in 2016, under the name Euphoria
>> API:
>>
>> https://github.com/seznam/euphoria
>>
>>
>> As it is very similar to Apache Beam, we feel that it is not worth
>> duplicating effort in terms of development of new runtimes and
>> fine-tuning of current ones.
>>
>> The main blocker for us to switch to Apache Beam is the lack of a
>> Java 8 API. We propose the integration of the Euphoria API into
>> Apache Beam as a Java 8 DSL, in order to share our effort with the
>> community.
>>
>> A simple example of Euphoria API usage can be found here:
>>
>>
>> https://github.com/seznam/euphoria/tree/master/euphoria-examples/src/main/java/cz/seznam/euphoria/examples/wordcount
>>
>> If you feel that the Beam community could benefit from our work,
>> we would love to start working on Euphoria integration into
>> Apache Beam (we already have a working POC, with a few basic
>> operators implemented).
>>
>> I look forward to hearing from you,
>>
>> David
>>
>>
>> --
>> Jean-Baptiste Onofré
>> jbonofre@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>> --
>> Best regards,
>>
>> David Morávek
>>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
Re: Euphoria Java 8 DSL - proposal
Posted by Davor Bonaci <da...@apache.org>.
(Sounds good, thanks! We'll follow-up there.)
On Tue, Feb 27, 2018 at 10:49 AM, David Morávek <da...@gmail.com>
wrote:
> Hi Davor,
>
> sorry for the delay, we were blocked by our legal department. I've sent
> both SGA and CCLA to private@apache.beam.org, please let me know if you
> need anything else.
>
> Regards,
> David
>
> On Mon, Feb 19, 2018 at 6:13 AM, Jean-Baptiste Onofré <jb...@nanthrax.net>
> wrote:
>
>> Hi Davor,
>>
>> We still have some discussion/paperwork on Euphoria side (SGA, ...).
>>
>> So, it's on track but it takes a little more time than expected.
>>
>> Regards
>> JB
>>
>> On 02/19/2018 05:40 AM, Davor Bonaci wrote:
>> > I may have missed things, but any update on the progress of this
>> donation?
>> >
>> > On Tue, Jan 2, 2018 at 10:52 PM, Jean-Baptiste Onofré <jb@nanthrax.net
>> > <ma...@nanthrax.net>> wrote:
>> >
>> > Great !
>> >
>> > Thanks !
>> > Regards
>> > JB
>> >
>> > On 01/03/2018 07:29 AM, David Morávek wrote:
>> >
>> > Hello JB,
>> >
>> > Perfect! I'm already on the Beam Slack workspace, I'll contact
>> you once
>> > I get to the office.
>> >
>> > Thanks!
>> > D.
>> >
>> > On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré <
>> jb@nanthrax.net
>> > <ma...@nanthrax.net> <mailto:jb@nanthrax.net
>> > <ma...@nanthrax.net>>> wrote:
>> >
>> > Hi David,
>> >
>> > absolutely !! Let's move forward on the preparation steps.
>> >
>> > Are you on Slack and/or hangout to plan this ?
>> >
>> > Thanks,
>> > Regards
>> > JB
>> >
>> > On 01/02/2018 05:35 PM, David Morávek wrote:
>> >
>> > Hello JB,
>> >
>> > can we help in any way to move things forward?
>> >
>> > Thanks,
>> > D.
>> >
>> > On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré
>> > <jb@nanthrax.net <ma...@nanthrax.net>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>>>>
>> wrote:
>> >
>> > Thanks Jan,
>> >
>> > It makes sense.
>> >
>> > Let me take a look on the code to understand the
>> "interaction".
>> >
>> > Regards
>> > JB
>> >
>> >
>> > On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>> >
>> > Hi JB,
>> >
>> > basically you are not wrong. The project
>> started about
>> > three or
>> > four
>> > years ago with a goal to unify batch and
>> streaming
>> > processing into
>> > single portable, executor independent API.
>> Because of
>> > that, it is
>> > currently "close" to Beam in this sense. But
>> we don't
>> > see much
>> > added
>> > value keeping this as a separate project, with
>> one of
>> > the key
>> > differences to be the API (not the model
>> itself), so we
>> > would
>> > like to
>> > focus on translation from Euphoria API to
>> Beam's SDK.
>> > That's why we
>> > would like to see it as a DSL, so that it
>> would be
>> > possible to use
>> > Euphoria API with Beam's runners as much
>> natively as
>> > possible.
>> >
>> > I hope I didn't make the subject even more
>> unclear, if
>> > so, I'll
>> > be happy
>> > to explain anything in more detail. :-)
>> >
>> > Jan
>> >
>> >
>> > On 12/18/2017 04:08 PM, Jean-Baptiste Onofré
>> wrote:
>> >
>> > Hi Jan,
>> >
>> > Thanks for your answers.
>> >
>> > However, they confused me ;)
>> >
>> > Regarding what you replied, Euphoria seems
>> like a
>> > programming
>> > model/SDK "close" to Beam more than a DSL
>> on top of an
>> > existing Beam
>> > SDK.
>> >
>> > Am I wrong ?
>> >
>> > Regards
>> > JB
>> >
>> > On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>> >
>> > Hi Ismael,
>> >
>> > basically we adopted the Beam's design
>> regarding
>> > partitioning
>> > (https://github.com/seznam/eup
>> horia/issues/160
>> > <https://github.com/seznam/euphoria/issues/160>
>> > <https://github.com/seznam/euphoria/issues/160
>> > <https://github.com/seznam/euphoria/issues/160>>
>> > <https://github.com/seznam/eup
>> horia/issues/160
>> > <https://github.com/seznam/euphoria/issues/160>
>> > <https://github.com/seznam/euphoria/issues/160
>> > <https://github.com/seznam/euphoria/issues/160>>>) and
>> implemented
>> > the sorting manually
>> > (https://github.com/seznam/eup
>> horia/issues/158
>> > <https://github.com/seznam/euphoria/issues/158>
>> > <https://github.com/seznam/euphoria/issues/158
>> > <https://github.com/seznam/euphoria/issues/158>>
>> > <https://github.com/seznam/eup
>> horia/issues/158
>> > <https://github.com/seznam/euphoria/issues/158>
>> > <https://github.com/seznam/euphoria/issues/158
>> > <https://github.com/seznam/euphoria/issues/158>>>). I'm not
>> aware
>> > of the time model differences
>> (Euphoria supports
>> > ingestion and
>> > event time, we don't support
>> processing time by
>> > decision).
>> > Regarding other differences (looking
>> into Beam
>> > capability
>> > matrix, I'd say that):
>> >
>> > - we don't support stateful FlatMap
>> (i.e.
>> > ParDo) for now
>> > (https://github.com/seznam/eup
>> horia/issues/192
>> > <https://github.com/seznam/euphoria/issues/192>
>> > <https://github.com/seznam/euphoria/issues/192
>> > <https://github.com/seznam/euphoria/issues/192>>
>> > <https://github.com/seznam/eup
>> horia/issues/192
>> > <https://github.com/seznam/euphoria/issues/192>
>> > <https://github.com/seznam/euphoria/issues/192
>> > <https://github.com/seznam/euphoria/issues/192>>>)
>> >
>> > - we don't support side inputs (by
>> decision
>> > now, but
>> > might be
>> > reconsidered) and outputs
>> > (https://github.com/seznam/eup
>> horia/issues/124
>> > <https://github.com/seznam/euphoria/issues/124>
>> > <https://github.com/seznam/euphoria/issues/124
>> > <https://github.com/seznam/euphoria/issues/124>>
>> > <https://github.com/seznam/eup
>> horia/issues/124
>> > <https://github.com/seznam/euphoria/issues/124>
>> > <https://github.com/seznam/euphoria/issues/124
>> > <https://github.com/seznam/euphoria/issues/124>>>)
>> >
>> >
>> > - we support complete event-time
>> windows
>> > (non-merging,
>> > merging, aligned, unaligned) and time
>> control
>> >
>> > - we don't support processing time
>> by
>> > decision (might be
>> > reconsidered if a valid use-case is
>> found)
>> >
>> > - we support window triggering
>> based on both
>> > time
>> > and data,
>> > including discarding and accumulating
>> (without
>> > accumulating &
>> > retracting)
>> >
>> > All our executors (runners) - Flink,
>> Spark and
>> > Local -
>> > implement
>> > the complete model, which we enforce
>> using
>> > "operator
>> > test kit"
>> > that all executors must pass. Spark
>> executor
>> > supports
>> > bounded
>> > sources only (for now). As David said,
>> we currently
>> > don't have
>> > serialization abstraction, so there is
>> some
>> > work to be
>> > done in
>> > that regard.
>> >
>> > Our intention is to completely
>> supersede
>> > Euphoria, we
>> > would like
>> > to consider possibility to use
>> executors that
>> > would not
>> > rely on
>> > Beam, but that is optional now and
>> should be
>> > straightforward.
>> >
>> > We'd be happy to answer any more
>> questions you
>> > might
>> > have and
>> > thanks a lot!
>> >
>> > Best,
>> >
>> > Jan
>> >
>> >
>> > On 12/18/2017 03:19 PM, Ismaël Mejía
>> wrote:
>> >
>> > Hi,
>> >
>> > It is great to see that you guys
>> have
>> > achieved a
>> > maturity
>> > point to
>> > propose this. Congratulations for
>> your work
>> > and the
>> > idea to
>> > contribute
>> > it into Beam.
>> >
>> > I remember from a previous
>> discussion with Jan
>> > about the model
>> > mismatch between Euphoria and
>> Beam, because
>> > of some
>> > design
>> > decisions
>> > of both projects. I remember you
>> guys had some
>> > issues with
>> > the way
>> > Beam's sources do partitioning, as
>> well as
>> > Beam's
>> > lack of
>> > sorted data
>> > (on shuffle a la hadoop). Also if I
>> > remember well
>> > the 'time'
>> > model of
>> > Euphoria was simpler than Beam's.
>> I talk
>> > about all
>> > of this
>> > because I
>> > am curious about what parts of the
>> Euphoria
>> > model
>> > you guys
>> > had to
>> > sacrifice to support Beam, and
>> what parts
>> > of Beam's
>> > model
>> > should still
>> > be integrated into Euphoria (and
>> if there is a
>> > straightforward path to
>> > do it).
>> >
>> > If I understand well if this gets
>> merged into
>> > Apache this
>> > means that
>> > Euphoria's current implementation
>> would be
>> > superseded by
>> > this DSL? I
>> > am curious because I would like to
>> > understand your
>> > level of
>> > investment
>> > on supporting the future of this
>> DSL.
>> >
>> > Thanks and congrats again !
>> > Ismaël
>> >
>> > On Mon, Dec 18, 2017 at 10:12 AM,
>> > Jean-Baptiste Onofré
>> > <jb@nanthrax.net <mailto:
>> jb@nanthrax.net>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>>>> wrote:
>> >
>> > Depending of the donation, you
>> would
>> > need ICLA
>> > for each
>> > contributor, and
>> > CCLA in addition of SGA.
>> >
>> > We can sync with Davor and I
>> for the
>> > legal stuff.
>> > However, I would wait a little
>> bit just
>> > to have
>> > feedback
>> > from the whole team
>> > and start a formal vote.
>> >
>> > I would be happy to start the
>> formal vote.
>> >
>> > Regards
>> > JB
>> >
>> > On 12/18/2017 10:03 AM, David
>> Morávek
>> > wrote:
>> >
>> > Hello,
>> >
>> > Thanks for the awesome
>> feedback!
>> >
>> > Romain:
>> >
>> > We already use Java Stream
>> API in
>> > all operators
>> > where it makes sense (eg.:
>> > ReduceByKey). Still not
>> sure if it
>> > was a good
>> > choice, but i can be easily
>> > converted to iterator
>> anyway.
>> >
>> > Side outputs support is
>> coming soon, we
>> > already made
>> > an initial work on
>> > this.
>> >
>> > Side inputs are not
>> supported in a
>> > way you
>> > are used
>> > to from beam, because
>> > it can be replaced by Join
>> operator
>> > on the
>> > same key
>> > (if annotated with
>> > broadcastHashJoin, it will
>> be
>> > turned into
>> > map side
>> > join).
>> >
>> > Only significant
>> difference from
>> > Beam is,
>> > that we
>> > decided not to abstract
>> > serialization, so we need
>> to add
>> > support
>> > for Type
>> > Hints, because of type
>> > erasure.
>> >
>> > Fluent API:
>> >
>> > API is fluent within one
>> operator.
>> > It is
>> > designed to
>> > "lead the
>> > programmer", which means,
>> that he
>> > we'll be only
>> > offered methods that makes
>> > sense after the last
>> method he used
>> > (eg.: in
>> > ReduceByKey, we know that
>> after
>> > keyBy either reduceBy
>> method should
>> > come).
>> > It is
>> > implemented as a series of
>> > builders.
>> >
>> > Davor:
>> >
>> > Thanks, I'll contact you,
>> and will
>> > start
>> > the process
>> > of having all the
>> > necessary paperwork signed
>> on our
>> > side, so
>> > we can
>> > get things moving.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Mon, Dec 18, 2017 at
>> 7:46 AM, Romain
>> > Manni-Bucau
>> > <rmannibucau@gmail.com
>> > <ma...@gmail.com>
>> > <mailto:rmannibucau@gmail.com <mailto:
>> rmannibucau@gmail.com>>
>> > <mailto:rmannibucau@gmail.com <ma...@gmail.com>
>> > <mailto:rmannibucau@gmail.com <mailto:
>> rmannibucau@gmail.com>>>
>> > <mailto:
>> rmannibucau@gmail.com
>> > <ma...@gmail.com>
>> > <mailto:rmannibucau@gmail.com <mailto:
>> rmannibucau@gmail.com>>
>> > <mailto:
>> rmannibucau@gmail.com
>> > <ma...@gmail.com>
>> > <mailto:rmannibucau@gmail.com <mailto:
>> rmannibucau@gmail.com>>>>>
>> > wrote:
>> >
>> > Hi guys
>> >
>> > A DSL would be very
>> welcomed, in
>> > particular if
>> > fluent.
>> >
>> > Open question: did
>> you study
>> > to implement
>> > Stream API (surely
>> extending
>> > it to
>> > have a BeamStream
>> and a few more
>> > features like
>> > sides etc)? Would be
>> > very
>> > natural and
>> integrable easily
>> > anywhere and
>> > avoid a new API discovery.
>> >
>> > Hazelcast jet did it
>> so I
>> > dont see
>> > why Beam
>> > couldnt.
>> >
>> > Le 18 déc. 2017
>> 07:26, "Davor
>> > Bonaci"
>> > <davor@apache.org
>> > <ma...@apache.org> <mailto:davor@apache.org
>> > <ma...@apache.org>>
>> > <mailto:davor@apache.org <ma...@apache.org>
>> > <mailto:davor@apache.org <ma...@apache.org>>>
>> > <mailto:
>> davor@apache.org
>> > <ma...@apache.org>
>> > <mailto:davor@apache.org <ma...@apache.org>>
>> >
>> > <mailto:davor@apache.org
>> > <ma...@apache.org>
>> > <mailto:davor@apache.org <ma...@apache.org>>>>>
>> a écrit :
>> >
>> > Hi David,
>> > As JB noted,
>> merging of
>> > these two
>> > projects
>> > is a great idea. If
>> > fact,
>> > some of us have
>> had those
>> > discussions in
>> > the past.
>> >
>> > Legally, nothing
>> > particular is
>> > strictly
>> > necessary as the code seem
>> > to
>> > already be
>> Apache 2.0
>> > licensed.
>> > We don't,
>> > however, want to be
>> > perceived
>> > as making
>> hostile forks,
>> > so it
>> > would be
>> > great to file a Software
>> > Grant
>> > Agreement with
>> the ASF
>> > Secretary.
>> > I can
>> > help with the process, as
>> > necessary.
>> >
>> > Project
>> alignment-wise, there
>> > aren't any
>> > particular blockers that
>> > I am
>> > aware of. We
>> welcome DSLs.
>> >
>> > Technically, the
>> code
>> > would start
>> > in a
>> > feature branch. During this
>> > stage, we'd need
>> to
>> > validate a
>> > few things,
>> > including confirmation
>> > the
>> > code and
>> dependencies
>> > match the ASF
>> > policy, automate testing in
>> > Beam's
>> > tooling, etc. At
>> that
>> > point, we'd
>> > take a
>> > community vote to accept
>> > the
>> > component into
>> master,
>> > and consider
>> > author(s) for
>> committership in
>> > the
>> > overall project.
>> >
>> > Welcome to the
>> ASF and
>> > Beam -- we are
>> > thrilled to have you! Hope
>> > this
>> > helps, and
>> please reach
>> > out if
>> > anybody on
>> > our end can help,
>> > including JB
>> > or myself.
>> >
>> > Davor
>> >
>> >
>> > On Sun, Dec 17,
>> 2017 at
>> > 10:13 AM,
>> > Jean-Baptiste Onofré
>> > <jb@nanthrax.net
>> > <ma...@nanthrax.net> <mailto:jb@nanthrax.net <mailto:
>> jb@nanthrax.net>>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>>>
>> > <mailto:
>> jb@nanthrax.net
>> > <ma...@nanthrax.net>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>>
>> >
>> > <mailto:jb@nanthrax.net
>> > <ma...@nanthrax.net>
>> > <mailto:jb@nanthrax.net <ma...@nanthrax.net>>>>>
>> wrote:
>> >
>> > Hi David,
>> >
>> > Generally
>> speaking,
>> > having
>> > different
>> > fluent DSL on top of the
>> > Beam
>> > SDK is great.
>> >
>> > I would like
>> to take
>> > a look
>> > on your
>> > wordcount examples to give
>> > you a
>> > complete
>> feedback. I
>> > like the
>> > idea and
>> > a fluent Java DSL is
>> > valuable.
>> >
>> > Let's wait
>> feedback from
>> > others. If we
>> > have a consensus, then
>> > I
>> > would be
>> more than
>> > happy to
>> > help you
>> > for the donation (I
>> > worked on
>> > the Camel
>> Java DSL
>> > while ago,
>> > so I
>> > have some experience here).
>> >
>> > Thanks !
>> > Regards
>> > JB
>> >
>> > On
>> 12/17/2017 07:00
>> > PM, David
>> > Morávek
>> > wrote:
>> >
>> > Hello,
>> >
>> >
>> > First of
>> all,
>> > thanks for the
>> > amazing work the Apache
>> Beam
>> >
>> community is doing!
>> >
>> >
>> > In 2014,
>> we've
>> > started
>> > development
>> > of the runtime
>> > independent
>> > Java 8
>> API, that
>> > helps us to
>> > create unified big-data
>> > processing
>> > flows.
>> It has
>> > been used
>> > as a core
>> > building block of
>> > Seznam.cz
>> > web
>> crawler data
>> > infrastructure
>> > every since. Its design
>> >
>> principles and
>> > execution
>> > model are
>> > very similar to Apache
>> > Beam.
>> >
>> >
>> > This API
>> was open
>> > sourced
>> > in 2016,
>> > under the name Euphoria
>> > API:
>> >
>> > https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria> <
>> https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria>>
>> > <
>> https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria>
>> > <https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria>>>
>> > <
>> https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria>
>> > <https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria>>
>> > <
>> https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria>
>> > <https://github.com/seznam/euphoria
>> > <https://github.com/seznam/euphoria>>>>
>> >
>> >
>> > As it is
>> very
>> > similar to
>> > Apache
>> > Beam, we feel, that it is
>> > not
>> > worth of
>> duplicating
>> > effort in
>> > terms of development of new
>> > runtimes
>> and
>> > fine-tuning of
>> > current ones.
>> >
>> >
>> > The main blocker for us to switch to Apache Beam is the lack of a Java 8
>> > API. We propose the integration of Euphoria API into Apache Beam as a
>> > Java 8 DSL, in order to share our effort with the community.
>> >
>> >
>> > A simple example of Euphoria API usage can be found here:
>> >
>> > https://github.com/seznam/euphoria/tree/master/euphoria-examples/src/main/java/cz/seznam/euphoria/examples/wordcount
>> >
>> >
>> >
>> > If you feel that the Beam community could benefit from our work, we would
>> > love to start working on Euphoria integration into Apache Beam (we already
>> > have a working POC, with a few basic operators implemented).
>> >
>> >
>> > I look forward to hearing from you,
>> >
>> > David
>> >
>> >
>> >
>> > --
>> > Jean-Baptiste Onofré
>> > jbonofre@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Regards,
>> >
>> > David Morávek
>> >
>> >
>> > --
>> > Jean-Baptiste Onofré
>> > jbonofre@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>> >
>> >
>> > -- Jean-Baptiste Onofré
>> > jbonofre@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>> >
>> >
>> >
>> >
>> > -- Regards,
>> >
>> > David Morávek
>> >
>> >
>> > -- Jean-Baptiste Onofré
>> > jbonofre@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>> >
>> >
>> > --
>> > Jean-Baptiste Onofré
>> > jbonofre@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>> >
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbonofre@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>>
>
Re: Euphoria Java 8 DSL - proposal
Posted by David Morávek <da...@gmail.com>.
Hi Davor,
sorry for the delay, we were blocked by our legal department. I've sent
both SGA and CCLA to private@apache.beam.org, please let me know if you
need anything else.
Regards,
David
On Mon, Feb 19, 2018 at 6:13 AM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:
> Hi Davor,
>
> We still have some discussion/paperwork on Euphoria side (SGA, ...).
>
> So, it's on track but it takes a little more time than expected.
>
> Regards
> JB
>
> On 02/19/2018 05:40 AM, Davor Bonaci wrote:
> > I may have missed things, but any update on the progress of this donation?
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
Re: Euphoria Java 8 DSL - proposal
Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Davor,
We still have some discussion/paperwork on Euphoria side (SGA, ...).
So, it's on track but it takes a little more time than expected.
Regards
JB
On 02/19/2018 05:40 AM, Davor Bonaci wrote:
> I may have missed things, but any update on the progress of this donation?
>
> On Tue, Jan 2, 2018 at 10:52 PM, Jean-Baptiste Onofré <jb@nanthrax.net> wrote:
>
> Great !
>
> Thanks !
> Regards
> JB
>
> On 01/03/2018 07:29 AM, David Morávek wrote:
>
> Hello JB,
>
> Perfect! I'm already on the Beam Slack workspace, I'll contact you once
> I get to the office.
>
> Thanks!
> D.
>
> On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré <jb@nanthrax.net> wrote:
>
> Hi David,
>
> absolutely !! Let's move forward on the preparation steps.
>
> Are you on Slack and/or hangout to plan this ?
>
> Thanks,
> Regards
> JB
>
> On 01/02/2018 05:35 PM, David Morávek wrote:
>
> Hello JB,
>
> can we help in any way to move things forward?
>
> Thanks,
> D.
>
> On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré <jb@nanthrax.net> wrote:
>
> Thanks Jan,
>
> It makes sense.
>
> Let me take a look at the code to understand the "interaction".
>
> Regards
> JB
>
>
> On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>
> Hi JB,
>
> basically you are not wrong. The project started about three or four years
> ago with the goal of unifying batch and streaming processing into a single
> portable, executor-independent API. Because of that, it is currently "close"
> to Beam in this sense. But we don't see much added value in keeping this as
> a separate project when one of the key differences is the API (not the model
> itself), so we would like to focus on translation from the Euphoria API to
> Beam's SDK. That's why we would like to see it as a DSL, so that it would be
> possible to use the Euphoria API with Beam's runners as natively as possible.
>
> I hope I didn't make the subject even more unclear; if so, I'll be happy to
> explain anything in more detail. :-)
>
> Jan
>
>
> On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>
> Hi Jan,
>
> Thanks for your answers.
>
> However, they confused me ;)
>
> Regarding what you replied, Euphoria seems more like a programming model/SDK
> "close" to Beam than a DSL on top of an existing Beam SDK.
>
> Am I wrong ?
>
> Regards
> JB
>
> On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>
> Hi Ismael,
>
> basically we adopted Beam's design regarding partitioning
> (https://github.com/seznam/euphoria/issues/160) and implemented the sorting
> manually (https://github.com/seznam/euphoria/issues/158). I'm not aware of
> the time model differences (Euphoria supports ingestion and event time; we
> don't support processing time, by decision). Regarding other differences
> (looking at the Beam capability matrix), I'd say that:
>
> - we don't support stateful FlatMap (i.e. ParDo) for now
>   (https://github.com/seznam/euphoria/issues/192)
>
> - we don't support side inputs (by decision now, but might be reconsidered)
>   and outputs (https://github.com/seznam/euphoria/issues/124)
>
> - we support complete event-time windows (non-merging, merging, aligned,
>   unaligned) and time control
>
> - we don't support processing time, by decision (might be reconsidered if a
>   valid use-case is found)
>
> - we support window triggering based on both time and data, including
>   discarding and accumulating (without accumulating & retracting)
>
> All our executors (runners) - Flink, Spark and Local - implement the
> complete model, which we enforce using an "operator test kit" that all
> executors must pass. The Spark executor supports bounded sources only (for
> now). As David said, we currently don't have a serialization abstraction, so
> there is some work to be done in that regard.
>
> Our intention is to completely supersede Euphoria. We would like to consider
> the possibility of using executors that would not rely on Beam, but that is
> optional for now and should be straightforward.
>
> We'd be happy to answer any more questions you might have, and thanks a lot!
>
> Best,
>
> Jan
>
>
> On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>
> Hi,
>
> It is great to see that you guys have reached a maturity point to propose
> this. Congratulations on your work and on the idea to contribute it to Beam.
>
> I remember a previous discussion with Jan about the model mismatch between
> Euphoria and Beam, caused by some design decisions of both projects. I
> remember you guys had some issues with the way Beam's sources do
> partitioning, as well as Beam's lack of sorted data (on shuffle, à la
> Hadoop). Also, if I remember well, the 'time' model of Euphoria was simpler
> than Beam's. I mention all of this because I am curious about which parts of
> the Euphoria model you guys had to sacrifice to support Beam, and which
> parts of Beam's model should still be integrated into Euphoria (and whether
> there is a straightforward path to do it).
>
> If I understand well, if this gets merged into Apache, this means that
> Euphoria's current implementation would be superseded by this DSL? I am
> curious because I would like to understand your level of investment in
> supporting the future of this DSL.
>
> Thanks and congrats again !
> Ismaël
>
> On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré
> <jb@nanthrax.net> wrote:
>
> Depending on the donation, you would need an ICLA for each
> contributor, and a CCLA in addition to the SGA.
>
> We can sync with Davor and me for the legal stuff.
> However, I would wait a little bit just to have feedback from the
> whole team and start a formal vote.
>
> I would be happy to start the formal vote.
>
> Regards
> JB
>
> On 12/18/2017 10:03 AM, David Morávek wrote:
>
> Hello,
>
> Thanks for the awesome feedback!
>
> Romain:
>
> We already use the Java Stream API in all operators where it makes
> sense (e.g. ReduceByKey). Still not sure if it was a good choice,
> but it can be easily converted to an iterator anyway.
>
> Side output support is coming soon; we have already done some
> initial work on this.
>
> Side inputs are not supported in the way you are used to from Beam,
> because they can be replaced by a Join operator on the same key (if
> annotated with broadcastHashJoin, it will be turned into a map-side
> join).
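
As a rough illustration of why a broadcast hash join can stand in for a side input, here is a minimal map-side join sketch in plain Java collections. It is not the Euphoria Join operator; the class `MapSideJoinSketch`, the method `broadcastHashJoin` and the `key:left:right` output format are assumptions for this example only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapSideJoinSketch {

  /** Inner-joins every element of the large side with the small side on the key. */
  static List<String> broadcastHashJoin(
      List<Map.Entry<String, Integer>> largeSide, Map<String, String> smallSide) {
    // The small side is assumed to fit in memory on every worker ("broadcast"),
    // so no shuffle is needed: each large-side element is a local hash lookup.
    Map<String, String> lookup = new HashMap<>(smallSide);
    List<String> joined = new ArrayList<>();
    for (Map.Entry<String, Integer> element : largeSide) {
      String right = lookup.get(element.getKey());
      if (right != null) { // inner join: elements without a match are dropped
        joined.add(element.getKey() + ":" + element.getValue() + ":" + right);
      }
    }
    return joined;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Integer>> clicks =
        List.of(Map.entry("cz", 1), Map.entry("de", 2), Map.entry("cz", 3));
    Map<String, String> countryNames = Map.of("cz", "Czechia");
    System.out.println(broadcastHashJoin(clicks, countryNames)); // [cz:1:Czechia, cz:3:Czechia]
  }
}
```

The small data set plays the role a side input would play in Beam: it is available in full wherever the large data set is processed.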
>
> The only significant difference from Beam is that we decided not to
> abstract serialization, so we need to add support for type hints
> because of type erasure.
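
For readers unfamiliar with the problem, the sketch below shows in plain Java why type erasure calls for explicit type hints. The `TypeHint` class here is hypothetical (a "super type token" in the general Java sense); it is not the Euphoria or Beam implementation.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

class TypeHintSketch {

  /** Hypothetical type hint: an anonymous subclass keeps T in its class metadata. */
  abstract static class TypeHint<T> {
    final Type type;

    protected TypeHint() {
      // Generic supertypes are recorded in the class file, so an anonymous
      // subclass of TypeHint<List<String>> can still report List<String>.
      ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
      this.type = superType.getActualTypeArguments()[0];
    }
  }

  static String describe(TypeHint<?> hint) {
    return hint.type.getTypeName();
  }

  public static void main(String[] args) {
    // Erasure: both lists share one runtime class, the element type is gone.
    System.out.println(
        new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass()); // true
    // The hint carries the full generic type explicitly.
    System.out.println(describe(new TypeHint<List<String>>() {})); // java.util.List<java.lang.String>
  }
}
```

A framework that must pick a serializer for an operator's output cannot recover `List<String>` from a lambda or a raw collection, which is why the user has to supply the type in some form like this.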
>
> Fluent API:
>
> The API is fluent within one operator. It is designed to "lead the
> programmer", which means that he will only be offered methods that
> make sense after the last method he used (e.g. in ReduceByKey, we
> know that after keyBy one of the reduceBy methods should come). It
> is implemented as a series of builders.
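
The "series of builders" idea can be sketched as follows. This is an in-memory toy and not the actual Euphoria ReduceByKey operator; the names `StepBuilderSketch`, `OfBuilder` and `KeyByBuilder` are made up for illustration, and only the `keyBy` then `reduceBy` ordering is taken from the description above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

class StepBuilderSketch {

  /** Step 1: with only an input given, the sole method on offer is keyBy. */
  static final class OfBuilder<T> {
    private final List<T> input;

    OfBuilder(List<T> input) {
      this.input = input;
    }

    <K> KeyByBuilder<T, K> keyBy(Function<T, K> keyFn) {
      return new KeyByBuilder<>(input, keyFn);
    }
  }

  /** Step 2: once the key is chosen, the sole method on offer is reduceBy. */
  static final class KeyByBuilder<T, K> {
    private final List<T> input;
    private final Function<T, K> keyFn;

    KeyByBuilder(List<T> input, Function<T, K> keyFn) {
      this.input = input;
      this.keyFn = keyFn;
    }

    Map<K, T> reduceBy(BinaryOperator<T> reducer) {
      Map<K, T> result = new LinkedHashMap<>();
      for (T element : input) {
        // First element for a key is stored; later ones are folded in.
        result.merge(keyFn.apply(element), element, reducer);
      }
      return result;
    }
  }

  static <T> OfBuilder<T> of(List<T> input) {
    return new OfBuilder<>(input);
  }

  public static void main(String[] args) {
    Map<Character, String> grouped =
        of(List.of("apple", "avocado", "banana"))
            .keyBy(word -> word.charAt(0))
            .reduceBy((left, right) -> left + "+" + right);
    System.out.println(grouped); // {a=apple+avocado, b=banana}
  }
}
```

Because each step returns a different builder type, calling `reduceBy` before `keyBy` does not compile, and IDE completion only shows the methods that are valid next.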
>
> Davor:
>
> Thanks, I'll contact you, and will start the process of having all
> the necessary paperwork signed on our side, so we can get things
> moving.
>
>
> On Mon, Dec 18, 2017 at 7:46 AM, Romain Manni-Bucau
> <rmannibucau@gmail.com> wrote:
>
> Hi guys
>
> A DSL would be very welcome, in particular if fluent.
>
> Open question: did you consider implementing the Stream API (surely
> extending it to have a BeamStream and a few more features like sides
> etc.)? It would be very natural, easy to integrate anywhere, and
> would avoid having to discover a new API.
>
> Hazelcast Jet did it, so I don't see why Beam couldn't.
>
> On Dec 18, 2017 at 07:26, "Davor Bonaci" <davor@apache.org> wrote:
>
> Hi David,
> As JB noted, merging these two projects is a great idea. In fact,
> some of us have had those discussions in the past.
>
> Legally, nothing in particular is strictly necessary, as the code
> seems to already be Apache 2.0 licensed. We don't, however, want to
> be perceived as making hostile forks, so it would be great to file a
> Software Grant Agreement with the ASF Secretary. I can help with the
> process, as necessary.
>
> Project alignment-wise, there aren't any particular blockers that I
> am aware of. We welcome DSLs.
>
> Technically, the code would start in a feature branch. During this
> stage, we'd need to validate a few things, including confirming that
> the code and dependencies match ASF policy, automating testing in
> Beam's tooling, etc. At that point, we'd take a community vote to
> accept the component into master, and consider the author(s) for
> committership in the overall project.
>
> Welcome to the ASF and Beam -- we are thrilled to have you! Hope
> this helps, and please reach out if anybody on our end can help,
> including JB or myself.
>
> Davor
>
>
> On Sun, Dec 17, 2017 at 10:13 AM, Jean-Baptiste Onofré
> <jb@nanthrax.net> wrote:
>
> Hi David,
>
> Generally speaking, having different fluent DSLs on top of the Beam
> SDK is great.
>
> I would like to take a look at your wordcount examples to give you
> complete feedback. I like the idea, and a fluent Java DSL is
> valuable.
>
> Let's wait for feedback from others. If we have a consensus, then I
> would be more than happy to help you with the donation (I worked on
> the Camel Java DSL a while ago, so I have some experience here).
>
> Thanks !
> Regards
> JB
>
> On 12/17/2017 07:00 PM, David Morávek wrote:
>
> Hello,
>
>
> First of all, thanks for the amazing work the Apache Beam community
> is doing!
>
>
> In 2014, we started development of a runtime-independent Java 8 API
> that helps us create unified big-data processing flows. It has been
> used as a core building block of the Seznam.cz web crawler data
> infrastructure ever since. Its design principles and execution model
> are very similar to Apache Beam.
>
>
> This API was open sourced in 2016, under the name Euphoria API:
>
> https://github.com/seznam/euphoria
>
>
> As it is very similar to Apache Beam, we feel that it is not worth
> duplicating effort in terms of development of new runtimes and
> fine-tuning of current ones.
>
>
> The main blocker for us to switch to Apache Beam is the lack of a
> Java 8 API. We propose the integration of the Euphoria API into
> Apache Beam as a Java 8 DSL, in order to share our effort with the
> community.
>
>
> A simple example of Euphoria API usage can be found here:
>
> https://github.com/seznam/euphoria/tree/master/euphoria-examples/src/main/java/cz/seznam/euphoria/examples/wordcount
>
>
>
> If you feel that the Beam community could benefit from our work, we
> would love to start working on the integration of Euphoria into
> Apache Beam (we already have a working POC, with a few basic
> operators implemented).
>
>
> I look forward to hearing from you,
>
> David
>
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
>
>
>
> --
> Regards,
>
> David Morávek
>
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
>
>
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
>
>
> --
> Regards,
>
> David Morávek
>
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
--
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com