Posted to dev@streams.apache.org by Benjamin Young <by...@bigbluehat.com> on 2016/10/18 14:19:33 UTC

Envisioning

Hi all,

Sorry I’ve not written here sooner. I’d reached out to the Incubator list while at the W3C’s TPAC event about keeping Apache Streams in the incubator in hopes of also seeing it support the nearly finalized ActivityStreams 2.0 specification:
https://www.w3.org/TR/activitystreams-core/

Since then, I’ve noticed Steve’s efforts to make Streams much simpler for new users, which is fabulous! I (sadly) don’t code in Java…since college, but I do have a desire to run code that aggregates my social streams into a standard format, store it in a database I prefer (in my case Apache CouchDB), and do cool stuff with it for my own reasons. ;) That desire is what drew me into the Streams talk at ApacheCon.

While digging around the project documents, I’ve found two overview descriptions of the project.

This one’s from the web site:
http://streams.incubator.apache.org/site/0.4-incubating-SNAPSHOT/streams-master/
“Apache Streams (incubating) unifies a diverse world of digital profiles and online activities into common formats and vocabularies, and makes these datasets accessible across a variety of databases, devices, and platforms for streaming, browsing, search, sharing, and analytics use-cases.”

And this one from the repo’s readme file:
https://svn.apache.org/repos/asf/incubator/streams/trunk/README.txt
“Apache Streams is a lightweight (yet scalable) server for ActivityStreams. The role of Apache Streams is to provide a central point of aggregation, filtering and querying for Activities that have been submitted by disparate systems. Apache Streams also intends to include a mechanism for intelligent filtering and recommendation to reduce the noise to end users.”

In either case, the story that I get—and the thing I want—is minimal setup to get my Twitter, etc, piped into a database +/- an API +/- a UI.

Am I on the right track here? Or is Streams really meant for Java-developers to mix into their projects?

Once I know that, I’ll know best how to help. :)

Cheers!
Benjamin
--
http://bigbluehat.com/
http://linkedin.com/in/benjaminyoung


Re: Envisioning

Posted by sblackmon <sb...@apache.org>.
Responses inline.
On October 18, 2016 at 9:19:50 AM, Benjamin Young (byoung@bigbluehat.com) wrote:

Hi all,  

Sorry I’ve not written here sooner. I’d reached out to the Incubator list while at the W3C’s TPAC event about keeping Apache Streams in the incubator in hopes of also seeing it support the nearly finalized ActivityStreams 2.0 specification:  
https://www.w3.org/TR/activitystreams-core/  

Thanks!

Since then, I’ve noticed Steve’s efforts to make Streams much simpler for new users, which is fabulous! I (sadly) don’t code in Java…since college, but I do have a desire to run code that aggregates my social streams into a standard format, store it in a database I prefer (in my case Apache CouchDB), and do cool stuff with it for my own reasons. ;) That desire is what drew me into the Streams talk at ApacheCon.  

A lot of businesses, techies, and non-techies are interested in producing and consuming content outside of standard single-channel generic web and mobile apps - but there seems to be a dearth of quality low-cost commercial offerings to do so. 

While digging around the project documents, I’ve found two overview descriptions of the project.  

This one’s from the web site:  
http://streams.incubator.apache.org/site/0.4-incubating-SNAPSHOT/streams-master/  
“Apache Streams (incubating) unifies a diverse world of digital profiles and online activities into common formats and vocabularies, and makes these datasets accessible across a variety of databases, devices, and platforms for streaming, browsing, search, sharing, and analytics use-cases.”  

This is our primary focus right now - expanding interoperability to more sources, and enabling interesting use cases that grow the community.

And this one from the repo’s readme file:  
https://svn.apache.org/repos/asf/incubator/streams/trunk/README.txt  
“Apache Streams is a lightweight (yet scalable) server for ActivityStreams. The role of Apache Streams is to provide a central point of aggregation, filtering and querying for Activities that have been submitted by disparate systems. Apache Streams also intends to include a mechanism for intelligent filtering and recommendation to reduce the noise to end users.”  

This copy is older (the project moved from SVN to Git in 2013). It’s still an interesting goal, but data interoperability is a more pressing problem in need of a robust open-source solution, IMO. There are plenty of mature databases, data science tools, and data vis libraries around - I think if it were dead simple for anyone to collect and normalize social streams we’d see experimentation and adjacent tooling flourish.

In either case, the story that I get—and the thing I want—is minimal setup to get my Twitter, etc, piped into a database +/- an API +/- a UI.  

I think we are closing in on this, minus official API and UI.  The group of active contributors will need to grow and diversify to tackle those but there’s nothing impeding their development (integration and deployment will require making some choices). 

Am I on the right track here? Or is Streams really meant for Java-developers to mix into their projects?  

We’re looking into distribution with Docker, which will be a good way for power-users with zero interest in Java or Apache technologies to run Streams. The core project libraries, connectors, and converters may be Java, but there’s plenty of room to innovate and improve the project outside that world. We have a ton of work ahead answering questions about what normalized data types to support, which systems to prioritize, how we want the normalized data to look, and how to map in data from upstream systems. Design and product work, not code work.

Once I know that, I’ll know best how to help. :) 

If I can make a suggestion for how to get started, try to run any/all of our providers and examples while refusing to look at any source code. Let us know if that’s not working out so we can change things up until it does. Also let us know how well the existing providers and examples meet your needs as a social data power-user and what opportunities for improvement you see, to help us build out the JIRA backlog.  

Cheers! 
Benjamin 
-- 
http://bigbluehat.com/ 
http://linkedin.com/in/benjaminyoung