
Posted to dev@heron.apache.org by Josh Fischer <jo...@joshfischer.io> on 2018/02/26 05:05:46 UTC

Proposing Changes To Heron

Please see this Google Drive link to add comments. I will also copy and
paste the doc below.

https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing

Proposal Below

I am writing this document to propose changes and to start conversations
on adding functionality similar to Storm SQL to Heron. We would call it
Heron SQL. After reviewing how the code is structured in Storm, I have some
suggestions and questions relating to the implementation in the Heron
code base.

- High Level Overview Of Code Workflow (Keeping Similar to Storm)
  - We would parse the SQL with Calcite to create the logical and
    physical plans
  - We would then convert the logical and physical plans to a Heron
    Topology
  - We would then submit the Heron Topology into the Heron System

- Some thoughts on code structure and overall functionality
  - I think we should place the Heron SQL code base as a top level
    directory in the repo.
  - I will have to add the command “sql” to the Heron command line code
    in Python.
  - As a first pass implementation, users can interact with Heron SQL via
    the following command
    - heron sql <sql-file> <topology-name>
  - We will also support the explain command for displaying the query
    plan; this will not deploy the topology
    - heron sql <sql-file> --explain
  - After the first pass implementation is working smoothly, we can then
    add an interactive command line interface to accept SQL on the fly by
    omitting the sql file argument
    - heron sql <topology-name>
  - We would support all of the existing functionality in Storm SQL today
    with the exception of being dependent on Trident. We would use Storm
    SQL as a way to deploy topologies into Heron, similar to how you
    deploy topologies with the Streamlet, Topology, and ECO APIs

- Questions
  - Do we see any issue with this plan to implement?
  - I believe we would have to supply an external jar at times to connect
    to external data sources, such as reuse of Kafka libraries or
    database drivers. I see that Storm has a few external connectors for
    Mongo, Kafka, Redis and HDFS. Do we want to limit users to what we
    decide to build as connectors, or do we want to give them the ability
    to load external jars at submit time? I don’t think Heron offers the
    ability to pass extra jars via the “--jars” or “--artifacts” flags
    like Storm does today. Would this be the correct way to pull in
    external jars? Does anyone have a different idea? I’m thinking that
    this might be a v2 feature after we get Heron SQL working well.
    Ideas, thoughts or concerns?
  - Is there anything I missed?
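
To make the proposed command line a bit more concrete, here is a minimal
Python sketch of how the "sql" subcommand could parse its arguments. It uses
only the standard argparse module and is not the actual Heron CLI code. The
names build_parser and plan_action, and the tuples they return, are
hypothetical and only for illustration. The interactive form (omitting the
sql file) is left out, since it belongs to a later pass.

```python
import argparse


def build_parser():
    """Build a parser for the proposed `heron sql` subcommand (hypothetical)."""
    parser = argparse.ArgumentParser(prog="heron")
    subparsers = parser.add_subparsers(dest="command")

    sql = subparsers.add_parser("sql", help="run a SQL file as a Heron topology")
    sql.add_argument("sql_file", metavar="sql-file")
    # The topology name is optional here only because --explain does not deploy.
    sql.add_argument("topology_name", metavar="topology-name", nargs="?")
    sql.add_argument("--explain", action="store_true",
                     help="print the query plan without deploying")
    return parser


def plan_action(argv):
    """Return (action, sql_file, topology_name) for the given arguments.

    This only decides what would happen; parsing the SQL and submitting
    the topology are outside the scope of this sketch.
    """
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command != "sql":
        parser.error("only the 'sql' subcommand is sketched here")
    if args.explain:
        return ("explain", args.sql_file, None)
    if args.topology_name is None:
        parser.error("topology-name is required unless --explain is given")
    return ("submit", args.sql_file, args.topology_name)
```

For example, plan_action(["sql", "query.sql", "my-topology"]) selects the
submit path, while plan_action(["sql", "query.sql", "--explain"]) selects the
explain path and never needs a topology name.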

Re: Proposing Changes To Heron

Posted by Fu Maosong <ma...@gmail.com>.

+1 to SQL on Heron!




-- 
With my best Regards
------------------
Fu Maosong
Twitter Inc.
Mobile: +001-415-244-7520

Re: Proposing Changes To Heron

Posted by Ning Wang <wa...@gmail.com>.

+1 for Heron SQL

On Sun, Feb 25, 2018 at 9:28 PM, Jerry Peng <je...@gmail.com>
wrote:

> Thanks Josh for taking the initiative to get this started!  SQL on Heron
> will be a great feature! The plan sounds great to me.  Let's first get
> an initial version of Heron SQL out and then we can worry about
> custom / user-defined sources and sinks.  We can even start talking
> about UDFs (user-defined functions) at that point!
>
> Best,
>
> Jerry

Re: Proposing Changes To Heron

Posted by Karthik Ramasamy <ka...@streaml.io>.

Good for a first pass. Otherwise, if you have comments, please address them
in the document.

On Thu, Mar 1, 2018 at 7:53 AM, Josh Fischer <jo...@joshfischer.io> wrote:

> There are still a few comments and thoughts outstanding.  Is this proposal
> good to go for a first pass to implement?
>
> Anyone should be able to comment with this link.
> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing

Re: Proposing Changes To Heron

Posted by Josh Fischer <jo...@joshfischer.io>.

There are still a few comments and thoughts outstanding.  Is this proposal
good to go for a first pass to implement?

Anyone should be able to comment with this link.
https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing


Re: Proposing Changes To Heron

Posted by Yaliang Wang <ya...@twitter.com.INVALID>.

Josh,

Totally agree with your concern. I was bringing that idea into the conversation and thought of it as a backup solution. Since Heron is getting more and more popular, it would be really nice to have SQL support. I think having a built-in Heron SQL can shorten the development iteration, since we will have fewer concerns about abstraction and generalization in the implementation.

Best,
Yaliang


Re: Proposing Changes To Heron

Posted by Josh Fischer <jo...@joshfischer.io>.

Yaliang,

I think this is a fantastic idea, and I agree that code maintenance is a
cost.  My concern is that a smaller, separate project may get abandoned,
especially if it has a smaller following.  One of the nice things about
Heron is the large community and list of core contributors behind it.
But I don't want to abandon this idea.  I think, for me at least, it would
make sense to get Storm SQL running in Heron, then take what we learn
from that experience and apply it to a third-party project if there is a
need/demand for it.  What do you think?

-Josh

On Mon, Feb 26, 2018 at 6:51 PM, Yaliang Wang <ya...@twitter.com.invalid>
wrote:

> Sounds like a great feature to have. A question I have: would it be
> feasible to start a separate project to support SQL on Heron-like
> streaming systems?
>
> - I’m imagining that there will be a lot of code similar or identical
>   to Storm SQL.
> - Only the last of the three steps (parse SQL -> logical/physical plan
>   -> Heron topology) you mentioned is specific to Heron. The first two
>   steps can be shared with other Heron-like streaming vendors.
> - Native support for SQL inside the Heron project will give an extra
>   advertising/marketing bonus, but at the cost of more code
>   maintenance, especially if it requires APIs that are not very
>   popular and may change over time. A separate project, however, can
>   target a specific version of Heron.
>
> Best,
> Yaliang
>
> > On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari <erenavsarogullari@gmail.com> wrote:
> >
> > +1 for Heron SQL Support. Thanks Josh.
> >
> > On 26 February 2018 at 18:42, Karthik Ramasamy <kr...@gmail.com> wrote:
> >
> >> Thanks Josh for initiating this. It will be a great feature to add for
> >> Heron.
> >>
> >> cheers
> >> /karthik
> >>
> >>> On Feb 26, 2018, at 11:11 AM, Josh Fischer <jo...@joshfischer.io> wrote:
> >>>
> >>> Jerry,
> >>>
> >>> Great point.  Let's keep things simple for the migration to make sure the
> >>> implementation is correct.  Then we can modify from there.
> >>>
> >>> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <jerry.boyang.peng@gmail.com> wrote:
> >>>
> >>>> Thanks Josh for taking the initiative to get this started!  SQL on Heron
> >>>> will be a great feature! The plan sounds great to me.  Let's first get
> >>>> an initial version of Heron SQL out and then we can worry about
> >>>> custom / user-defined sources and sinks.  We can even start talking
> >>>> about UDFs (user-defined functions) at that point!
> >>>>
> >>>> Best,
> >>>>
> >>>> Jerry
> >>>>
> >>>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <jo...@joshfischer.io>
> >> wrote:
> >>>>> Please see this google drive link for adding comments.  I will copy
> and
> >>>>> paste the drive doc below as well.
> >>>>>
> >>>>> https://docs.google.com/document/d/1PxLCyR_H-
> >>>> mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
> >>>>>
> >>>>>
> >>>>> Proposal Below
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> *I am writing this document to propose changes and to start
> >> conversations
> >>>>> on adding functionality similar to Storm SQL to Heron.  We would call
> >> it
> >>>>> Heron SQL.  After reviewing how the code is structured in Storm I
> have
> >>>> some
> >>>>> suggestions and questions relating to the implementation into the
> Heron
> >>>>> code base. - High Level Overview Of Code Workflow (Keeping Similar to
> >>>>> Storm)- We would parse the sql with calcite to create the logical and
> >>>>> physical plans- We would then convert the logical and physical plans
> >> to a
> >>>>> Heron Topology- We would then submit the Heron Topology into the
> Heron
> >>>>> System - Some thoughts on code structure and overall functionality- I
> >>>> think
> >>>>> we should place the Heron SQL code base as a top level directory in
> the
> >>>>> repo. - I will have to add the command “sql” to the Heron command
> line
> >>>> code
> >>>>> in python.- As a first pass implementation users  can interact with
> >> Heron
> >>>>> SQL via the following command - heron sql <sql-file> <topology-name>-
> >> We
> >>>>> will also support the explain command for displaying the query plan,
> >> this
> >>>>> will not deploy the topology- heron sql <sql-file> --explain- After
> the
> >>>>> first pass implementation is working smoothly, we can then add an
> >>>>> interactive command line interface to accept sql on the fly by
> omitting
> >>>> the
> >>>>> sql file argument- Heron sql <topology-name>- We would support all of
> >> the
> >>>>> existing functionality in Storm SQL today with the exception of being
> >>>>> dependent on trident.  We would use Storm SQL as a way to deploy
> >>>> topologies
> >>>>> into Heron.  Similar to how you deploy topologies with the Streamlet,
> >>>>> Topology, and ECO APIs- Questions- Do we see any issue with this plan
> >> to
> >>>>> implement?- I believe we would have to supply an external jar at
> times
> >> to
> >>>>> connect to external data sources, such as reuse of kafka libraries or
> >>>>> database drivers.  I see that Storm has few external connectors for
> >>>> mongo,
> >>>>> kafka, redis and hdfs.  Do we want to limit users to what we decide
> to
> >>>>> build as connectors or do we want to give them the ability to load
> >>>> external
> >>>>> jars at submit time? I don’t think heron offers the ability to pass
> >> extra
> >>>>> jars to via the “--jars” or “--artifacts” flags like Storm does
> today.
> >>>>> Would this be the correct way to pull in external jars?  Does anyone
> >>>> have a
> >>>>> different idea?  I’m thinking that this might be a v2 feature after
> we
> >>>> get
> >>>>> Heron sql working well.  Ideas, thoughts or concerns?- Is there
> >> anything
> >>>> I
> >>>>> missed?*
> >>>>
> >>
> >>
>
>

Re: Proposing Changes To Heron

Posted by Yaliang Wang <ya...@twitter.com.INVALID>.

Sounds like a great feature to have. A question I have: would it be feasible to start a separate project to support SQL on Heron-like streaming?

- I’m imagining that there will be a lot of code similar or identical to Storm SQL.
- Only the last of the three steps you mentioned (parse SQL -> logical/physical plan -> Heron topology) is specific to Heron. The first two steps can be shared with other Heron-like streaming vendors.
- Native support for SQL inside the Heron project would give an extra advertising/marketing bonus, but at an increased code maintenance cost, especially if it requires APIs that are not very popular and may change over time. A separate project, however, can target a specific version of Heron.
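
To illustrate the split described in the second point, here is a minimal Python sketch. It is not code from Storm SQL or Heron: QueryPlan, plan_query, TopologyBackend, DescribeOnlyBackend and compile_sql are all hypothetical names, and the "planner" is a toy placeholder for the Calcite parsing and planning steps. It only shows that the first two steps can stay engine-neutral while the last step sits behind a per-engine interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class QueryPlan:
    """Hypothetical engine-neutral result of steps 1 and 2 (parse + plan)."""
    sql: str
    operators: list = field(default_factory=list)


def plan_query(sql):
    """Placeholder for the shared steps; a real version would call Calcite."""
    # Toy planner: record which of a few SQL keywords appear, in order.
    keywords = {"SELECT", "FROM", "WHERE"}
    operators = [tok.upper() for tok in sql.split() if tok.upper() in keywords]
    return QueryPlan(sql=sql, operators=operators)


class TopologyBackend(ABC):
    """Step 3: the only piece that has to know about a specific engine."""

    @abstractmethod
    def build_topology(self, name, plan):
        """Turn an engine-neutral plan into an engine-specific topology."""


class DescribeOnlyBackend(TopologyBackend):
    """Hypothetical backend that describes a topology instead of building one."""

    def build_topology(self, name, plan):
        return {"name": name, "stages": list(plan.operators)}


def compile_sql(sql, name, backend):
    """Run the shared steps, then hand the plan to the chosen backend."""
    return backend.build_topology(name, plan_query(sql))
```

A Heron-specific backend would implement build_topology against the Heron topology API; whether that lives inside the Heron repo or in a separate project is the open question above.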

Best,
Yaliang

> On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari <er...@gmail.com> wrote:
> 
> +1 for Heron SQL Support. Thanks Josh.
> 

Re: Proposing Changes To Heron

Posted by Eren Avsarogullari <er...@gmail.com>.

+1 for Heron SQL Support. Thanks Josh.

On 26 February 2018 at 18:42, Karthik Ramasamy <kr...@gmail.com> wrote:

> Thanks Josh for initiating this. It will be a great feature to add for
> Heron.
>
> cheers
> /karthik
>

Re: Proposing Changes To Heron

Posted by Karthik Ramasamy <kr...@gmail.com>.

Thanks Josh for initiating this. It will be a great feature to add for Heron.

cheers
/karthik

> On Feb 26, 2018, at 11:11 AM, Josh Fischer <jo...@joshfischer.io> wrote:
> 
> Jerry,
> 
> Great point. Let's keep things simple for the migration to make sure the
> implementation is correct. Then we can modify from there.
> 

Re: Proposing Changes To Heron

Posted by Josh Fischer <jo...@joshfischer.io>.

Jerry,

Great point. Let's keep things simple for the migration to make sure the
implementation is correct. Then we can modify from there.

On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <je...@gmail.com>
wrote:

> Thanks Josh for taking the initiative to get this started! SQL on Heron
> will be a great feature! The plan sounds great to me. Let's first get
> an initial version of Heron SQL out, and then we can worry about
> custom / user-defined sources and sinks. We can even start talking
> about UDFs (user-defined functions) at that point!
>
> Best,
>
> Jerry
>

Re: Proposing Changes To Heron

Posted by Jerry Peng <je...@gmail.com>.

Thanks Josh for taking the initiative to get this started! SQL on Heron
will be a great feature! The plan sounds great to me. Let's first get
an initial version of Heron SQL out, and then we can worry about
custom / user-defined sources and sinks. We can even start talking
about UDFs (user-defined functions) at that point!

Best,

Jerry

On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <jo...@joshfischer.io> wrote:
> Please see this google drive link for adding comments.  I will copy and
> paste the drive doc below as well.
>
> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
>
>
> Proposal Below
>
>
>
>
>
>
>
> I am writing this document to propose changes and to start conversations
> on adding functionality similar to Storm SQL to Heron. We would call it
> Heron SQL. After reviewing how the code is structured in Storm, I have some
> suggestions and questions relating to the implementation in the Heron
> code base.
>
> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
>   - We would parse the SQL with Calcite to create the logical and
>     physical plans
>   - We would then convert the logical and physical plans to a Heron
>     Topology
>   - We would then submit the Heron Topology into the Heron System
> - Some thoughts on code structure and overall functionality
>   - I think we should place the Heron SQL code base as a top level
>     directory in the repo.
>   - I will have to add the command “sql” to the Heron command line code
>     in Python.
>   - As a first pass implementation, users can interact with Heron SQL via
>     the following command
>     - heron sql <sql-file> <topology-name>
>   - We will also support the explain command for displaying the query
>     plan; this will not deploy the topology
>     - heron sql <sql-file> --explain
>   - After the first pass implementation is working smoothly, we can then
>     add an interactive command line interface to accept SQL on the fly by
>     omitting the sql file argument
>     - heron sql <topology-name>
>   - We would support all of the existing functionality in Storm SQL today
>     with the exception of being dependent on Trident. We would use Storm
>     SQL as a way to deploy topologies into Heron, similar to how you
>     deploy topologies with the Streamlet, Topology, and ECO APIs
> - Questions
>   - Do we see any issue with this plan to implement?
>   - I believe we would have to supply an external jar at times to connect
>     to external data sources, such as reuse of Kafka libraries or
>     database drivers. I see that Storm has a few external connectors for
>     Mongo, Kafka, Redis and HDFS. Do we want to limit users to what we
>     decide to build as connectors, or do we want to give them the ability
>     to load external jars at submit time? I don’t think Heron offers the
>     ability to pass extra jars via the “--jars” or “--artifacts” flags
>     like Storm does today. Would this be the correct way to pull in
>     external jars? Does anyone have a different idea? I’m thinking that
>     this might be a v2 feature after we get Heron SQL working well.
>     Ideas, thoughts or concerns?
>   - Is there anything I missed?
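
For reference, the two first-pass command forms in the proposal could be parsed roughly as follows. This is only a sketch using Python's standard argparse module; it is not taken from the existing Heron CLI code, and the function names build_sql_parser and parse_sql_args are made up for illustration. The proposal does not say how a missing topology name should be handled, so treating it as optional only when --explain is given is an assumption.

```python
import argparse


def build_sql_parser():
    """Hypothetical parser for the proposed `heron sql` subcommand."""
    parser = argparse.ArgumentParser(prog="heron sql")
    parser.add_argument("sql_file", metavar="sql-file",
                        help="path to a file containing the SQL statements")
    # Assumption: the name may be omitted only because --explain never deploys.
    parser.add_argument("topology_name", metavar="topology-name", nargs="?",
                        help="name for the submitted topology")
    parser.add_argument("--explain", action="store_true",
                        help="print the query plan without deploying a topology")
    return parser


def parse_sql_args(argv):
    """Parse arguments and enforce that a deploy always has a topology name."""
    parser = build_sql_parser()
    args = parser.parse_args(argv)
    if not args.explain and args.topology_name is None:
        # parser.error prints a usage message and exits with status 2.
        parser.error("topology-name is required unless --explain is given")
    return args
```

With this sketch, parse_sql_args(["query.sql", "my-topology"]) selects the deploy path and parse_sql_args(["query.sql", "--explain"]) selects the plan-only path. The later interactive form (no sql file) would need a different argument layout and is not covered here.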