Posted to dev@streams.apache.org by sblackmon <sb...@apache.org> on 2016/10/05 01:58:18 UTC

provider refactoring/improvement exemplar

 
All,

I’ve pushed a branch that depicts the sort of provider upgrades that I envision making across all providers this quarter.  

https://github.com/apache/incubator-streams/compare/master...steveblackmon:STREAMS-415?expand=1  

It solves for all of the following clean-up and improvement issues, specifically for the TwitterTimelineProvider  

STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)  
Ensure all providers function stand-alone

STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
SNS providers should emit appropriately typed documents by default

STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)  
ability (and instructions on how) to run providers directly from console

STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
Proof of concept integration test that pulls actual data from generator

Rather than tackle these stories one at a time laterally across all providers, we could consider divide-and-conquer by module or by class to reduce the number of commits and committers hitting each provider.  

Curious what others want to try.  Would you rather take on one provider at a time and improve its quality across several dimensions, or improve many at once in a single dimension?

Steve

On October 4, 2016 at 11:56:18 AM, sblackmon (sblackmon@apache.org(mailto:sblackmon@apache.org)) wrote:

> With 0.3-incubating in the books, it’s time to take a look at what is most important for the next release.
>  
> Here are a few concrete things I propose we tackle in October. They are tagged in JIRA with fixVersion=0.4  
>  
> https://issues.apache.org/jira/browse/STREAMS-418?filter=12338530  
>  
> I saw some new issues being created the last few days. Please keep it up, no matter how small!  
>  
> Major themes:  
> * Reboot and Clean-up
> * Provider and Configuration consistency
> * Prep for the AS 2.0 switch
> Also:
> * POC persist integration tests with docker  
> * POC provider integration tests that perform collection
> * Fix lingering and simple bugs / documentation
> * Flink example code
>  
> If we target early in the week of October 24 for code freeze and cut rc1, that should give us time to get the release passed (hopefully with unanimous support from three active IPMC members on PPMC) by our November report.  
>  
> Feedback welcome! And feel free to assign yourself or let someone with permissions know if you want to have something assigned to you.  
>  
> Steve  
>  
> On September 28, 2016 at 3:00:01 PM, sblackmon (sblackmon@apache.org(mailto:sblackmon@apache.org)) wrote:
>  
> > All,
> >  
> > Joey brought this up over the weekend and I think a discussion is overdue on the topic.  
> >  
> > Encouraging community growth and performing regular releases are on our list of graduation criteria.  
> >  
> > A few easy behaviors we can adopt to make progress on these goals:  
> > - planning release versions around one or two significant improvements
> > - setting target dates to kick off upcoming releases
> > - prioritizing our backlog after each release
> > - discussing project and community milestones openly on the list  
> > - organizing JIRA so that all contributors (especially new) can decide where it’s most important to focus their efforts
> >  
> > I think to get things moving again and demonstrate we are capable of consistent progress, we should aim to perform a release once per month around the end of the month.  
> >  
> > As for what to focus on, I think it’s time to discuss adopting Activity Streams 2.0, figure out what form that transition would take, and get started down that path. Working implementations demonstrate the suitability of the standard and drive its adoption, and the prospects of this project are closely tied to those of the standard. Separate DISCUSS coming on this topic.  
> >  
> > Also important for the ‘reboot’ theme, we should delete any modules we aren’t going to maintain, and bring all modules we are going to maintain up to acceptable standards - exactly what that means is an open question but broadly they should have documentation, code comments, and tests at the level of a typical module in a typical TLP.  
> >  
> > Expanding the examples to demonstrate how to use streams providers and processors within various execution engines and fixing any bugs that have been reported is desirable as well. Adding at least one new example per release is a good target for now.  
> >  
> > I have created some future versions with target release dates in JIRA and invite all committers to associate existing or new issues with those releases, or anyone who can’t modify JIRA to summarize their thoughts and share with the list and I will incorporate those ideas into JIRA. This should be the default reference for anyone looking for a way to help - look at issues associated with the next few releases and the top of the backlog and pick something that appeals and is in line with your experience.  
> >  
> > Anything else that should be a top priority for the rest of the year? Or other ideas on improving planning and coordination?  
> >  
> > Steve  
> >  
> > On September 24, 2016 at 1:01:02 PM, apache (sblackmon@apache.org(mailto:sblackmon@apache.org)) wrote:  
> > > - This has already come up, but maybe ActivityStreams 2.0 support would broaden the community and motivate more work. It's also a concrete goal to work toward so people would know where they can start.
> > >  
> > > - Steve and I did a little work here a few months ago, but the JIRA could reflect the priorities better and I think keep the community working in a common direction.





Re: provider refactoring/improvement exemplar

Posted by Steve Blackmon <st...@blackmon.org>.
Update on provider refactoring for this release:
Twitter improvements for the issues mentioned in this thread have merged.
PRs for Facebook, Instagram, Google+, and YouTube are open.
The commit logs and JIRA comments cover which issues have been tackled for
which providers.
These PRs don't cover every provider class or complete the refactoring we
need before adding AS 2, but they do at least add real integration tests anyone
can run to get current profiles and posts from upstream.
RSS provider already has real testing.
Most remaining provider modules are commercial services and can't be used
or tested this easily.
Feedback on these changes welcome - sometime next week I will finish merging
these and whatever other fixes are ready and start the release process.

On Oct 11, 2016 1:57 PM, "Steve Blackmon" <st...@blackmon.org> wrote:

> STREAMS-403 : run embedded, outside of a streams runtime
> STREAMS-411 : run from console via main method
>
>
> On Tue, Oct 11, 2016 at 1:43 PM, Matt Franklin <m....@gmail.com> wrote:
> > On Thu, Oct 6, 2016 at 10:32 AM Steve Blackmon <st...@blackmon.org> wrote:
> >
> >> This issue order makes sense to me, based on simplicity and complexity
> >> of improvement:
> >>
> >> > > STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
> >> > > Ensure all providers function stand-alone
> >>
> >
> > Just so I am following, what is the purpose of this?  Run standalone from a
> > console or run standalone outside of a streams runtime?
> >
> >>
> >> > > STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)
> >> > > ability (and instructions on how) to run providers directly from console
> >>
> >> > > STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
> >> > > SNS providers should emit appropriately typed documents by default
> >>
> >> > > STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
> >> > > Proof of concept integration test that pulls actual data from generator
> >>
> >> In terms of which providers to target in what order, I’d suggest:
> >>
> >> twitter
> >> instagram
> >> rss
> >> youtube
> >> facebook
> >> gplus
> >> sysomos
> >> moreover
> >>
> >> Further down the list: more uncertainty about what exactly to do, how
> >> to do it, how to live test it, higher likelihood of encountering
> >> technical debt.
> >>
> >> If anyone wants to take a shot at any piece of this, go for it.  No
> >> need for omnibus PRs covering a full story or a full provider; every
> >> incremental improvement helps. I’ll happily code review any PR that’s
> >> opened and help get it finished.
> >>
> >> Steve
> >>
> >> On October 6, 2016 at 7:52:42 AM, Suneel Marthi
> >> (smarthi@apache.org(mailto:smarthi@apache.org)) wrote:
> >>
> >> > On Tue, Oct 4, 2016 at 9:58 PM, sblackmon wrote:
> >> >
> >> > > All,
> >> > >
> >> > > I’ve pushed a branch that depicts the sort of provider upgrades that I
> >> > > envision making across all providers this quarter.
> >> > >
> >> > > https://github.com/apache/incubator-streams/compare/master...steveblackmon:STREAMS-415?expand=1
> >> > >
> >> > > It solves for all of the following clean-up and improvement issues,
> >> > > specifically for the TwitterTimelineProvider
> >> > >
> >> > > STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
> >> > > Ensure all providers function stand-alone
> >> > >
> >> > > STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
> >> > > SNS providers should emit appropriately typed documents by default
> >> > >
> >> > > STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)
> >> > > ability (and instructions on how) to run providers directly from console
> >> > >
> >> > > STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
> >> > > Proof of concept integration test that pulls actual data from generator
> >> > >
> >> > > Rather than tackle these stories one at a time laterally across all
> >> > > providers, we could consider divide-and-conquer by module or by class to
> >> > > reduce the number of commits and committers hitting each provider.
> >> > >
> >> > > Curious what others want to try. Would you rather take on one provider at
> >> > > a time and improve its quality across several dimensions, or improve many
> >> > > at once in a single dimension?
> >> > >
> >> >
> >> > I think we should do one at a time given the available and committed
> >> > resources. Which one would you like to tackle first from the list above?

Re: provider refactoring/improvement exemplar

Posted by Steve Blackmon <st...@blackmon.org>.
STREAMS-403 : run embedded, outside of a streams runtime
STREAMS-411 : run from console via main method


On Tue, Oct 11, 2016 at 1:43 PM, Matt Franklin <m....@gmail.com> wrote:
> On Thu, Oct 6, 2016 at 10:32 AM Steve Blackmon <st...@blackmon.org> wrote:
>
>> This issue order makes sense to me, based on simplicity and complexity
>>
>> of improvement:
>>
>>
>>
>> > > STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
>>
>> > > Ensure all providers function stand-alone
>>
>
> Just so I am following, what is the purpose of this?  Run standalone from a
> console or run standalone outside of a streams runtime?

Re: provider refactoring/improvement exemplar

Posted by Matt Franklin <m....@gmail.com>.
On Thu, Oct 6, 2016 at 10:32 AM Steve Blackmon <st...@blackmon.org> wrote:

> This issue order makes sense to me, based on simplicity and complexity
>
> of improvement:
>
>
>
> > > STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
>
> > > Ensure all providers function stand-alone
>

Just so I am following, what is the purpose of this?  Run standalone from a
console or run standalone outside of a streams runtime?


>
> > > STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)
>
> > > ability (and instructions on how) to run providers directly from
> console
>
> > > STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
>
> > > SNS providers should emit appropriately typed documents by default
>
> > > STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
>
> > > Proof of concept integration test that pulls actual data from generator
>
>
>
> In terms of which providers to target in what order, I’d suggest:
>
>
>
> twitter
>
> instagram
>
> rss
>
> youtube
>
> facebook
>
> gplus
>
> sysomos
>
> moreover
>
>
>
> Further down the list: more uncertainly about what exactly to do, how
>
> to do it, how to live test it, higher likelihood of encountering
>
> technical debt.
>
>
>
> If anyone wants to take a shot at any piece of this, go for it.  No
>
> need for omni-bus PRs covering a full story or a full provider, every
>
> incremental improvement helps. I’ll happily code review any PR that’s
>
> opened and help get it finished.
>
>
>
> Steve
>
>
>
>
>
> On October 6, 2016 at 7:52:42 AM, Suneel Marthi
>
> (smarthi@apache.org(mailto:smarthi@apache.org)) wrote:
>
>
>
> > On Tue, Oct 4, 2016 at 9:58 PM, sblackmon wrote:
>
> >
>
> > >
>
> > > All,
>
> > >
>
> > > I’ve pushed a branch that depicts the sort of provider upgrades that I
>
> > > envision making across all providers this quarter.
>
> > >
>
> > > https://github.com/apache/incubator-streams/compare/
>
> > > master...steveblackmon:STREAMS-415?expand=1
>
> > >
>
> > > It solves for all of the following clean-up and improvement issues,
>
> > > specifically for the TwitterTimelineProvider
>
> > >
>
> > > STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
>
> > > Ensure all providers function stand-alone
>
> > >
>
> > > STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
>
> > > SNS providers should emit appropriately typed documents by default
>
> > >
>
> > > STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)
>
> > > ability (and instructions on how) to run providers directly from
> console
>
> > >
>
> > > STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
>
> > > Proof of concept integration test that pulls actual data from generator
>
> > >
>
> > > Rather than tackle these stories one at a time laterally across all
>
> > > providers, we could consider divide-and-conquer by module or by class
> to
>
> > > reduce the number of commits and committers hitting each provider.
>
> > >
>
> > > Curious what others want to try. Would you rather take on one provider
> at
>
> > > a time and improve it’s quality across several dimensions, or improve
> many
>
> > > at once in a single dimension?
>
> > >
>
> >
>
> > I think we should do one at a time given the available and committed
>
> > resources. Which one would u like to tackle fist from the list above?
>
> >
>
> > >
>
> > > Steve
>
> > >
>
> > > On October 4, 2016 at 11:56:18 AM, sblackmon (sblackmon@apache.org
> (mailto:
>
> > > sblackmon@apache.org)) wrote:
> > >
> > > > With 0.3-incubating in the books, it’s time to take a look at what is
> > > > most important for the next release.
> > > >
> > > > Here are a few concrete things I propose we tackle in October. They are
> > > > tagged in JIRA with fixVersion=0.4
> > > >
> > > > https://issues.apache.org/jira/browse/STREAMS-418?filter=12338530
> > > >
> > > > I saw some new issues being created the last few days. Please keep it
> > > > up, no matter how small!
> > > >
> > > > Major themes:
> > > > * Reboot and Clean-up
> > > > * Provider and Configuration consistency
> > > > * Prep for the AS 2.0 switch
> > > > Also:
> > > > * POC persist integration tests with docker
> > > > * POC provider integration tests that perform collection
> > > > * Fix lingering and simple bugs / documentation
> > > > * Flink example code
> > > >
> > > > If we target early week of October 24 for code freeze and cut rc1 that
> > > > should give us time to get the release passed (hopefully with unanimous
> > > > support from three active IPMC members on PPMC) by our November report.
> > > >
> > > > Feedback welcome! And feel free to assign yourself or let us someone
> > > > with permissions know if you want to have something assigned to you.
> > > >
> > > > Steve
> > > >
> > > > On September 28, 2016 at 3:00:01 PM, sblackmon (sblackmon@apache.org
> > > > (mailto:sblackmon@apache.org)) wrote:
> > > >
> > > > > All,
> > > > >
> > > > > Joey brought this up over the weekend and I think a discussion is
> > > > > overdue on the topic.
> > > > >
> > > > > Encouraging community growth and performing regular releases are on
> > > > > our list of graduation criteria.
> > > > >
> > > > > A few easy behaviors we can adopt to take to make progress on these
> > > > > goals:
> > > > > - planning release versions around one or two significant improvements
> > > > > - setting target dates to kick off upcoming releases
> > > > > - prioritizing our backlog after each release
> > > > > - discussing project and community milestones openly on the list
> > > > > - organizing JIRA so that all contributors (especially new) can decide
> > > > > where it’s most important to focus their efforts
> > > > >
> > > > > I think to get things moving again and demonstrate we are capable of
> > > > > consistent progress, we should aim to perform a release once per month
> > > > > around the end of the month.
> > > > >
> > > > > As for what to focus on, I think it’s time to discuss adopting
> > > > > Activity Streams 2.0, figure out what form that transition would take,
> > > > > and get started down that path. Working implementations demonstrate the
> > > > > suitability of the standard and drive it’s adoption, and the prospects
> > > > > of this project are closely tied to those of the standard. Separate
> > > > > DISCUSS coming on this topic.
> > > > >
> > > > > Also important for the ‘reboot’ theme, we should delete any modules we
> > > > > aren’t going to maintain, and bring all modules we are going to
> > > > > maintain up to acceptable standards - exactly what that means is an
> > > > > open question but broadly they should have documentation, code
> > > > > comments, and tests at the level of a typical module in a typical TLP.
> > > > >
> > > > > Expanding the examples to demonstrate how to use streams providers and
> > > > > processors within various execution engines and fixing any bugs that
> > > > > have been reported is desirable as well. Adding at least one new
> > > > > example per release is a good target for now.
> > > > >
> > > > > I have created some future versions with target release dates in JIRA
> > > > > and invite all committers to associate existing or new issues with
> > > > > those releases, or anyone who can’t modify JIRA to summarize their
> > > > > thoughts and share with the list and I will incorporate those ideas
> > > > > into JIRA. This should be the default reference for anyone looking for
> > > > > a way to help - look at issues associated with the next few releases
> > > > > and the top of the backlog and pick something that appeals and is in
> > > > > line with your experience.
> > > > >
> > > > > Anything else that should be a top priority for the rest of the year?
> > > > > Or other ideas on improving planning and coordination?
> > > > >
> > > > > Steve
> > > > >
> > > > > On September 24, 2016 at 1:01:02 PM, apache (sblackmon@apache.org
> > > > > (mailto:sblackmon@apache.org)) wrote:
> > > > > > - This has already come up, but maybe ActivityStreams 2.0 support
> > > > > > would broaden the community and motivate more work. It's also a
> > > > > > concrete goal to work toward so people would know where they can
> > > > > > start.
> > > > > >
> > > > > > - Steve and I did a little work here a few months ago, but the JIRA
> > > > > > could reflect the priorities better and I think keep the community
> > > > > > working in a common direction.

Re: provider refactoring/improvement exemplar

Posted by Steve Blackmon <st...@blackmon.org>.
This issue order makes sense to me, going from the simplest improvement
to the most complex:

> > STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
> > Ensure all providers function stand-alone
> > STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)
> > ability (and instructions on how) to run providers directly from console
> > STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
> > SNS providers should emit appropriately typed documents by default
> > STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
> > Proof of concept integration test that pulls actual data from generator

In terms of which providers to target in what order, I’d suggest:

twitter
instagram
rss
youtube
facebook
gplus
sysomos
moreover

The further down the list, the more uncertainty about what exactly to
do, how to do it, and how to live-test it, and the higher the likelihood
of encountering technical debt.

If anyone wants to take a shot at any piece of this, go for it.  No
need for omnibus PRs covering a full story or a full provider; every
incremental improvement helps. I’ll happily code review any PR that’s
opened and help get it finished.

Steve


On October 6, 2016 at 7:52:42 AM, Suneel Marthi
(smarthi@apache.org(mailto:smarthi@apache.org)) wrote:

> On Tue, Oct 4, 2016 at 9:58 PM, sblackmon wrote:
>
> >
> > All,
> >
> > I’ve pushed a branch that depicts the sort of provider upgrades that I
> > envision making across all providers this quarter.
> >
> > https://github.com/apache/incubator-streams/compare/master...steveblackmon:STREAMS-415?expand=1
> >
> > It solves for all of the following clean-up and improvement issues,
> > specifically for the TwitterTimelineProvider
> >
> > STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
> > Ensure all providers function stand-alone
> >
> > STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
> > SNS providers should emit appropriately typed documents by default
> >
> > STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)
> > ability (and instructions on how) to run providers directly from console
> >
> > STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
> > Proof of concept integration test that pulls actual data from generator
> >
> > Rather than tackle these stories one at a time laterally across all
> > providers, we could consider divide-and-conquer by module or by class to
> > reduce the number of commits and committers hitting each provider.
> >
> > Curious what others want to try. Would you rather take on one provider at
> > a time and improve it’s quality across several dimensions, or improve many
> > at once in a single dimension?
> >
>
> I think we should do one at a time given the available and committed
> resources. Which one would you like to tackle first from the list above?
>
> >
> > Steve
> >
> > On October 4, 2016 at 11:56:18 AM, sblackmon (sblackmon@apache.org(mailto:
> > sblackmon@apache.org)) wrote:
> >
> > > With 0.3-incubating in the books, it’s time to take a look at what is
> > most important for the next release.
> > >
> > > Here are a few concrete things I propose we tackle in October. They are
> > tagged in JIRA with fixVersion=0.4
> > >
> > > https://issues.apache.org/jira/browse/STREAMS-418?filter=12338530
> > >
> > > I saw some new issues being created the last few days. Please keep it
> > up, no matter how small!
> > >
> > > Major themes:
> > > * Reboot and Clean-up
> > > * Provider and Configuration consistency
> > > * Prep for the AS 2.0 switch
> > > Also:
> > > * POC persist integration tests with docker
> > > * POC provider integration tests that perform collection
> > > * Fix lingering and simple bugs / documentation
> > > * Flink example code
> > >
> > > If we target early week of October 24 for code freeze and cut rc1 that
> > should give us time to get the release passed (hopefully with unanimous
> > support from three active IPMC members on PPMC) by our November report.
> > >
> > > Feedback welcome! And feel free to assign yourself or let us someone
> > with permissions know if you want to have something assigned to you.
> > >
> > > Steve
> > >
> > > On September 28, 2016 at 3:00:01 PM, sblackmon (sblackmon@apache.org
> > (mailto:sblackmon@apache.org)) wrote:
> > >
> > > > All,
> > > >
> > > > Joey brought this up over the weekend and I think a discussion is
> > overdue on the topic.
> > > >
> > > > Encouraging community growth and performing regular releases are on
> > our list of graduation criteria.
> > > >
> > > > A few easy behaviors we can adopt to take to make progress on these
> > goals:
> > > > - planning release versions around one or two significant improvements
> > > > - setting target dates to kick off upcoming releases
> > > > - prioritizing our backlog after each release
> > > > - discussing project and community milestones openly on the list
> > > > - organizing JIRA so that all contributors (especially new) can decide
> > where it’s most important to focus their efforts
> > > >
> > > > I think to get things moving again and demonstrate we are capable of
> > consistent progress, we should aim to perform a release once per month
> > around the end of the month.
> > > >
> > > > As for what to focus on, I think it’s time to discuss adopting
> > Activity Streams 2.0, figure out what form that transition would take, and
> > get started down that path. Working implementations demonstrate the
> > suitability of the standard and drive it’s adoption, and the prospects of
> > this project are closely tied to those of the standard. Separate DISCUSS
> > coming on this topic.
> > > >
> > > > Also important for the ‘reboot’ theme, we should delete any modules we
> > aren’t going to maintain, and bring all modules we are going to maintain up
> > to acceptable standards - exactly what that means is an open question but
> > broadly they should have documentation, code comments, and tests at the
> > level of a typical module in a typical TLP.
> > > >
> > > > Expanding the examples to demonstrate how to use streams providers and
> > processors within various execution engines and fixing any bugs that have
> > been reported is desirable as well. Adding at least one new example per
> > release is a good target for now.
> > > >
> > > > I have created some future versions with target release dates in JIRA
> > and invite all committers to associate existing or new issues with those
> > releases, or anyone who can’t modify JIRA to summarize their thoughts and
> > share with the list and I will incorporate those ideas into JIRA. This
> > should be the default reference for anyone looking for a way to help - look
> > at issues associated with the next few releases and the top of the backlog
> > and pick something that appeals and is in line with your experience.
> > > >
> > > > Anything else that should be a top priority for the rest of the year?
> > Or other ideas on improving planning and coordination?
> > > >
> > > > Steve
> > > >
> > > > On September 24, 2016 at 1:01:02 PM, apache (sblackmon@apache.org
> > (mailto:sblackmon@apache.org)) wrote:
> > > > > - This has already come up, but maybe ActivityStreams 2.0 support
> > > > > would broaden the community and motivate more work. It's also a
> > > > > concrete goal to work toward so people would know where they can
> > > > > start.
> > > > >
> > > > > - Steve and I did a little work here a few months ago, but the JIRA
> > > > > could reflect the priorities better and I think keep the community
> > > > > working in a common direction.

Re: provider refactoring/improvement exemplar

Posted by Suneel Marthi <sm...@apache.org>.
On Tue, Oct 4, 2016 at 9:58 PM, sblackmon <sb...@apache.org> wrote:

>
> All,
>
> I’ve pushed a branch that depicts the sort of provider upgrades that I
> envision making across all providers this quarter.
>
> https://github.com/apache/incubator-streams/compare/master...steveblackmon:STREAMS-415?expand=1
>
> It solves for all of the following clean-up and improvement issues,
> specifically for the TwitterTimelineProvider
>
> STREAMS-403(https://issues.apache.org/jira/browse/STREAMS-403)
> Ensure all providers function stand-alone
>
> STREAMS-422(https://issues.apache.org/jira/browse/STREAMS-422)
> SNS providers should emit appropriately typed documents by default
>
> STREAMS-411(https://issues.apache.org/jira/browse/STREAMS-411)
> ability (and instructions on how) to run providers directly from console
>
> STREAMS-415(https://issues.apache.org/jira/browse/STREAMS-415)
> Proof of concept integration test that pulls actual data from generator
>
> Rather than tackle these stories one at a time laterally across all
> providers, we could consider divide-and-conquer by module or by class to
> reduce the number of commits and committers hitting each provider.
>
> Curious what others want to try.  Would you rather take on one provider at
> a time and improve it’s quality across several dimensions, or improve many
> at once in a single dimension?
>

I think we should do one at a time given the available and committed
resources. Which one would you like to tackle first from the list above?

>
> Steve
>
> On October 4, 2016 at 11:56:18 AM, sblackmon (sblackmon@apache.org(mailto:
> sblackmon@apache.org)) wrote:
>
> > With 0.3-incubating in the books, it’s time to take a look at what is
> most important for the next release.
> >
> > Here are a few concrete things I propose we tackle in October. They are
> tagged in JIRA with fixVersion=0.4
> >
> > https://issues.apache.org/jira/browse/STREAMS-418?filter=12338530
> >
> > I saw some new issues being created the last few days. Please keep it
> up, no matter how small!
> >
> > Major themes:
> > * Reboot and Clean-up
> > * Provider and Configuration consistency
> > * Prep for the AS 2.0 switch
> > Also:
> > * POC persist integration tests with docker
> > * POC provider integration tests that perform collection
> > * Fix lingering and simple bugs / documentation
> > * Flink example code
> >
> > If we target early week of October 24 for code freeze and cut rc1 that
> should give us time to get the release passed (hopefully with unanimous
> support from three active IPMC members on PPMC) by our November report.
> >
> > Feedback welcome! And feel free to assign yourself or let us someone
> with permissions know if you want to have something assigned to you.
> >
> > Steve
> >
> > On September 28, 2016 at 3:00:01 PM, sblackmon (sblackmon@apache.org
> (mailto:sblackmon@apache.org)) wrote:
> >
> > > All,
> > >
> > > Joey brought this up over the weekend and I think a discussion is
> overdue on the topic.
> > >
> > > Encouraging community growth and performing regular releases are on
> our list of graduation criteria.
> > >
> > > A few easy behaviors we can adopt to take to make progress on these
> goals:
> > > - planning release versions around one or two significant improvements
> > > - setting target dates to kick off upcoming releases
> > > - prioritizing our backlog after each release
> > > - discussing project and community milestones openly on the list
> > > - organizing JIRA so that all contributors (especially new) can decide
> where it’s most important to focus their efforts
> > >
> > > I think to get things moving again and demonstrate we are capable of
> consistent progress, we should aim to perform a release once per month
> around the end of the month.
> > >
> > > As for what to focus on, I think it’s time to discuss adopting
> Activity Streams 2.0, figure out what form that transition would take, and
> get started down that path. Working implementations demonstrate the
> suitability of the standard and drive it’s adoption, and the prospects of
> this project are closely tied to those of the standard. Separate DISCUSS
> coming on this topic.
> > >
> > > Also important for the ‘reboot’ theme, we should delete any modules we
> aren’t going to maintain, and bring all modules we are going to maintain up
> to acceptable standards - exactly what that means is an open question but
> broadly they should have documentation, code comments, and tests at the
> level of a typical module in a typical TLP.
> > >
> > > Expanding the examples to demonstrate how to use streams providers and
> processors within various execution engines and fixing any bugs that have
> been reported is desirable as well. Adding at least one new example per
> release is a good target for now.
> > >
> > > I have created some future versions with target release dates in JIRA
> and invite all committers to associate existing or new issues with those
> releases, or anyone who can’t modify JIRA to summarize their thoughts and
> share with the list and I will incorporate those ideas into JIRA. This
> should be the default reference for anyone looking for a way to help - look
> at issues associated with the next few releases and the top of the backlog
> and pick something that appeals and is in line with your experience.
> > >
> > > Anything else that should be a top priority for the rest of the year?
> Or other ideas on improving planning and coordination?
> > >
> > > Steve
> > >
> > > On September 24, 2016 at 1:01:02 PM, apache (sblackmon@apache.org
> (mailto:sblackmon@apache.org)) wrote:
> > > > - This has already come up, but maybe ActivityStreams 2.0 support
> > > > would broaden the community and motivate more work. It's also a
> > > > concrete goal to work toward so people would know where they can
> > > > start.
> > > >
> > > > - Steve and I did a little work here a few months ago, but the JIRA
> > > > could reflect the priorities better and I think keep the community
> > > > working in a common direction.