Posted to general@incubator.apache.org by Jeff Feng <je...@gmail.com> on 2017/05/02 16:54:51 UTC

[RESULT] [VOTE] Superset Proposal for Apache Incubator

The vote for Superset passes with 11 +1 binding votes, 3 +1 non-binding
votes and no -1 votes.  Below are the overall results:

*Binding:*
Ashutosh Chauhan +1
Luke Han +1
Julian Hyde +1
Jitendra Pandey +1
Joe Witt +1
Ted Dunning +1
P. Taylor Goetz +1
Edward Yoon +1
Jacques Nadeau +1
Julien Le Dem +1
Jim Jagielski +1

*Non-Binding:*
Moon Soo Lee +1
Naresh Agarwal +1
Felix Cheng +1

Thank you to everyone who participated in the vote.

Please welcome Superset to the Apache Incubator!

Jeff



On Sun, Apr 23, 2017 at 7:53 AM, Jeff Feng <je...@gmail.com> wrote:

> Dear Apache Incubator Community,
>
> We have updated the Superset proposal
> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
> Apache Incubation with an additional mentor (Luke Han -
> luke.han@apache.org), and would like to start a vote thread for
> acceptance into the incubator.
>
> Our team is excited to share Superset with the Apache community and we
> hope for your continued support!
>
> Cheers,
> Jeff & the Superset Team
>
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users to explore data from a variety of databases, assemble beautiful
> dashboards and share their findings.  Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user.  This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving.  This means that business
> intelligence software must also be easy and delightful to use.
> Self-service analytics doesn’t mean that admin and governance features are
> not needed.  Modern BI tools provide fine-grained access controls and
> auditing capabilities to understand how data is being used.  Superset is a
> solution that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data.  Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and iPython’s Jupyter for instance.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history.  Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready.  Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong complement to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions.  Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
>  * Easy, low friction access to data through a simple, web-based data
> exploration interface.  Composing charts and dashboards is intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
>  * Access to a wide array of rich, interactive data visualization types.
>  * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
>  * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
>  * Broad data access: Consume data out of any SQL-speaking relational
> database.
>  * Extensible: Can be extended to talk to many NoSQL databases like Apache
> Drill, Elasticsearch, and other popular database engines.
>  * Fast loading dashboards with configurable web-scale caching.
>  * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
>  * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility.  SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
>  * Move the existing codebase to Apache and integrate with the Apache
> development process.
>  * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources
>  * Build robust support for security and governance of the tool including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system
>  * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX
>
> == Current Status ==
> By many standards, Superset is already a successful open source project.
> As of March 2017, Superset is officially used in production at about a
> dozen companies, has received contributions from over one hundred
> contributors on Github, and has 1500+ forks and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> significant contributions, and expressed their commitment to the project.
> The product is feature complete and has been viable for months. It already
> serves as the main interface for consuming data at many companies of
> different sizes.
>
> While the product is usable, there’s room for improvement across the
> board, starting with providing a smoother user experience around content
> creation, making sure all features work out-of-the-box on more platforms
> and databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those that contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous.  While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active in seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full stack,
> front-end and data engineers:
>  * Maxime Beauchemin (Airbnb)
>  * Alanna Scott (Airbnb)
>  * Bogdan Kyryliuk (Airbnb)
>  * Vera Liu  (Airbnb)
>  * Jeff Feng (Airbnb)
>  * Ashutosh Chauhan (Hortonworks)
>  * Nishant Bangarwa (Hortonworks)
>  * Slim Bouguerra (Hortonworks)
>  * Priyank Shah (Hortonworks)
>  * Sriharsha Chintalapani (Hortonworks)
>  * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors,
> visualizations and improved integration with all potential data sources.
> Superset already integrates closely with Apache Hive, the Hive metastore,
> as well as most SQL-speaking databases found in modern data ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for visualizing, accessing and
> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
> component of the DataFlow product offering.  Thus, the risk of the project
> being orphaned is relatively low.  The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market.  In such an event, the committers plan to
> continue working on the project on their own time, though progress
> will likely be slower.  We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct
> competitors to Superset within the Apache Software Foundation.  That said,
> Apache Zeppelin is an indirect competitor, but it solves a different use
> case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more.  Although a user can
> create data visualizations using this project, it leverages a notebook-style
> user interface and is geared towards the Spark community, where
> Scala and SQL co-exist.
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
>  * The challenge of enabling every knowledge worker to make data-informed
> decisions, particularly those who are not deeply skilled at writing SQL.
>  * The challenge of visualizing huge amounts of data interactively and in
> real-time.
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data.  Since then, usage of
> Superset has expanded to address data visualization use cases across
> SQL-speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale Section.  We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications.  Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
>  * [[http://airbnb.io/superset/|Superset Documentation]]
>  * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>  * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset.  The code base is primarily in
> Python.
>
> == Source and Intellectual Property Submission Plan ==
> We do not expect any complications for the submission of the Superset code
> base.  Our code is already in Github and there is only a single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (PyPI):
>
>  * boto3
>  * celery
>  * cryptography
>  * flask-appbuilder
>  * flask-cache
>  * flask-migrate
>  * flask-script
>  * flask-sqlalchemy
>  * flask-testing
>  * humanize
>  * gunicorn
>  * markdown
>  * pandas
>  * parsedatetime
>  * pydruid
>  * PyHive
>  * python-dateutil
>  * requests
>  * simplejson
>  * six
>  * sqlalchemy
>  * sqlalchemy-utils
>  * sqlparse
>  * thrift
>  * thrift-sasl
>  * werkzeug
>
> List of JavaScript packages, from npm:
>  * autobind-decorator
>  * bootstrap
>  * bootstrap-datepicker
>  * brace
>  * brfs
>  * cal-heatmap
>  * classnames
>  * d3
>  * d3-cloud
>  * d3-sankey
>  * d3-scale
>  * d3-tip
>  * datamaps
>  * datatables-bootstrap3-plugin
>  * datatables.net-bs
>  * font-awesome
>  * gridster
>  * immutability-helper
>  * immutable
>  * jquery
>  * lodash.throttle
>  * mapbox-gl
>  * moment
>  * moments
>  * mustache
>  * nvd3
>  * react
>  * react-ace
>  * react-bootstrap
>  * react-bootstrap-table
>  * react-dom
>  * react-draggable
>  * react-gravatar
>  * react-grid-layout
>  * react-map-gl
>  * react-redux
>  * react-resizable
>  * react-select
>  * react-syntax-highlighter
>  * reactable
>  * redux
>  * redux-localstorage
>  * redux-thunk
>  * shortid
>  * style-loader
>  * supercluster
>  * topojson
>  * victory
>  * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is a current mailing list as a Google Group “airbnb_superset” that
> we are planning on deprecating as the Apache.org lists become ready to serve
> our community.
>
>  * superset-private
>  * superset-dev
>  * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system. We’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme.
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to manage
> our project as much as possible. It’s been said that there are ways to
> keep Github’s issues in sync with Jira, allowing us to get the best of
> both worlds. If that is not possible, we will use Jira.
>
> == Other Resources ==
> We currently use a set of Github integrated services that are free to the
> open source community, like Travis-ci, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> Github repository in order to set up or tune and we would like for the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
>  * Maxime Beauchemin <ma...@airbnb.com> - PPMC & Committer
>  * Alanna Scott <al...@airbnb.com> - PPMC & Committer
>  * Bogdan Kyryliuk <b....@gmail.com> - PPMC & Committer
>  * Vera Liu <ve...@airbnb.com> - Committer
>  * Jeff Feng <je...@airbnb.com> - PPMC & Committer
>  * Ashutosh Chauhan <ha...@apache.org> - Mentor & Committer
>  * Nishant Bangarwa <nb...@hortonworks.com> - PPMC & Committer
>  * Slim Bouguerra <sb...@hortonworks.com> - Committer
>  * Priyank Shah <ps...@hortonworks.com> - Committer
>  * Harsha Chintalapani <sc...@hortonworks.com> - Committer
>  * Daniel Dai <da...@apache.org> - Champion & Committer
>  * Luke Han <lu...@apache.org> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <da...@apache.org>
>
> === Nominated Mentors ===
>  * Ashutosh Chauhan <ha...@apache.org>
>  * Luke Han <lu...@apache.org>
>
> === Sponsoring Entity ===
> Incubator PMC
>
>

Re: [RESULT] [VOTE] Superset Proposal for Apache Incubator

Posted by Jim Jagielski <ji...@jaguNET.com>.
I still offer to help mentor if desired.

> On May 2, 2017, at 12:54 PM, Jeff Feng <je...@gmail.com> wrote:
> 
> The vote for Superset passes with 11 +1 binding votes, 3 +1 non-binding
> votes and no -1 votes.  Below are the overall results:
> 
> *Binding:*
> Ashutosh Chauhan +1
> Luke Han + 1
> Julian Hyde +1
> Jitendra Pandey +1
> Joe Witt +1
> Ted Dunning +1
> P. Taylor Goetz +1
> Edward Yoon +1
> Jacques Nadeau +1
> Julian Le Dem +1
> Jim Jagielski +1
> 
> *Non-Binding:*
> Moon Soo Lee +1
> Naresh Agarwal +1
> Felix Cheng +1
> 
> Thank you to everyone who participated in the vote.
> 
> Please welcome Superset to the Apache Incubator!
> 
> Jeff
> 
> 
> 
> On Sun, Apr 23, 2017 at 7:53 AM, Jeff Feng <je...@gmail.com> wrote:
> 
>> Dear Apache Incubator Community,
>> 
>> We have updated the Superset proposal
>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
>> Apache Incubation with an additional mentor (Luke Han -
>> luke.han@apache.org), and would like to start a vote thread for
>> acceptance into the incubator.
>> 
>> Our team is excited to share Superset with the Apache community and we
>> hope for the your continued support!
>> 
>> Cheers,
>> Jeff & the Superset Team
>> 
>> 
>> 
>> 
>> = Superset =
>> 
>> == Abstract ==
>> Superset is an enterprise-ready web application for data exploration, data
>> visualization and dashboarding.
>> 
>> == Proposal ==
>> Superset is business intelligence (BI) software that helps modern
>> organizations visualize and interact with their data. Superset enables
>> users explore data from a variety of databases, assemble beautiful
>> dashboards and share their findings.  Superset works neatly with all modern
>> SQL-speaking databases, and integrates with Druid.io to provide real-time,
>> interactive, blazing fast data access to large datasets.
>> 
>> == Background ==
>> Data is mission critical. To succeed in this era, organizations need to
>> provide low-friction, intuitive and interactive access to data. It is
>> paramount for knowledge workers to be capable of answering their own
>> questions by querying, exploring and visualizing data.
>> 
>> The entire business intelligence industry has pivoted from a model of
>> centralized top-down platforms driven by IT organizations to self-service
>> analytics and agile workflows by any user.  This shift unblocks centralized
>> service bottlenecks for creating data visualizations while also creating an
>> environment that is iterative and fast-moving.  This means that business
>> intelligence software must also be easy and delightful to use.
>> Self-service analytics doesn’t mean that admin and governance features are
>> not needed.
>> Modern BI tools provide fine-grain access controls and auditing
>> capabilities to understand how data is being used.  Superset is a solution
>> that delivers on all of these vectors.
>> 
>> The technology stack is also constantly morphing - vendors are struggling
>> to provide cheap, quick and easy solutions to access data.  Business
>> intelligence users are finding existing solutions lacking as these software
>> products either disregard or react slowly to recent game-changing
>> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
>> React.js and iPython’s Jupyter for instance.
>> 
>> == Rationale ==
>> Business intelligence is more relevant today than at any other point in
>> history.  Organizations are currently very limited in options for open
>> source data visualization solutions, especially solutions that are both
>> self-service and enterprise-ready.  Every company informing their decisions
>> with data needs a BI tool.
>> 
>> We believe that Superset will be a strong compliment to existing Apache
>> Software Foundation technologies by offering scalable user interactions to
>> distributed storage and computation solutions.  Users will often find that
>> Superset can act as a catalyst for tooling that can visualize the byproduct
>> of data and computation infrastructure.
>> 
>> Superset has many key design elements that help fill a gap in current
>> solutions for organizations:
>> * Easy, low friction access to data through a simple, web-based data
>> exploration interface.  Composing charts and dashboards are intuitive.
>> Eliminating the need to write code or SQL empowers anyone to use it.
>> * Access to a wide array of rich, interactive data visualization types.
>> * Enterprise-ready: Integration with different authentication mechanisms
>> and granular permissions centered around actions and data access.
>> * Realtime & fast: Superset provides realtime analytics at the speed of
>> thought on very large datasets when integrated with Druid.io.
>> * Broad data access: Consume data out of any SQL-speaking relational
>> database.
>> * Extensible: Can be extended to talk to many noSQL databases like Apache
>> Drill, Elastic Search, and other popular database engines.
>> * Fast loading dashboards with configurable web-scale caching.
>> * Plug-in framework that enables organizations to build custom analytical
>> applications with new UI/UX interfaces.
>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
>> with more flexibility.  SQL Lab integrates with the visualization engine
>> seamlessly.
>> 
>> == Initial Goals ==
>> The initial goals of the Superset project are several-fold:
>> * Move the existing codebase to Apache and integrate with the Apache
>> development process.
>> * Redesign the user interface and interaction model for creating
>> visualizations/dashboards and connecting to data sources
>> * Build robust support for security and governance of the tool including
>> popular authorization modules (including Apache Ranger and Apache Sentry)
>> and a more sophisticated permissions system
>> * Grow the extensibility of the project both in terms of enhanced
>> connectivity to NoSQL-based data sources and creating a plug-in framework
>> that enables organizations to build custom analytical applications which
>> require a new UI/UX
>> 
>> == Current Status ==
>> By many standards, Superset is already a successful open source project.
>> As of March 2017, Superset is officially used in production at about a
>> dozen companies, has received contributions from over one hundred
>> contributors on Github, 1500+ forks, and 12k+ stars.
>> 
>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
>> significant contributions, and expressed their commitment to the project.
>> The product is feature complete and has been viable for months. It already
>> serves as the main interface for consuming data at many companies of
>> different sizes.
>> 
>> While the product is usable, there’s room for improvement across the
>> board, starting with providing a smoother user experience around content
>> creation, making sure all features work out-of-the-box on more platforms
>> and databases, providing better user training guides and videos, having a
>> predictable release process, and increasing the overall quality of the
>> Superset releases.
>> 
>> === Meritocracy ===
>> We plan to invest in supporting a meritocracy. We will discuss the
>> requirements in an open forum. Several companies have expressed interest in
>> this project, and we intend to invite additional developers to participate.
>> We will encourage and monitor community participation so that privileges
>> can be extended to those that contribute.
>> 
>> === Community ===
>> The need for an enterprise-ready data visualization and exploration
>> platform in the open source community is tremendous.  While Superset is
>> fairly well known, recognized and used within the Druid.io community,
>> adoption is currently limited outside of that niche. There is a huge
>> opportunity to grow the community to hundreds if not thousands of
>> organizations, and we are hoping that embracing “the Apache way” will
>> accelerate the growth of our community.
>> 
>> We have already been active at seeking and inviting contributions, and are
>> planning to scale the project by investing time and growing the support
>> structure to grow the community.
>> 
>> === Core Developers ===
>> The initial committers for Superset include experienced full stack,
>> front-end and data engineers:
>> * Maxime Beauchemin (Airbnb)
>> * Alanna Scott (Airbnb)
>> * Bogdan Kyryliuk (Airbnb)
>> * Vera Liu  (Airbnb)
>> * Jeff Feng (Airbnb)
>> * Ashutosh Chauhan (Hortonworks)
>> * Nishant Bangarwa (Hortonworks)
>> * Slim Bouguerra (Hortonworks)
>> * Priyank Shah (Hortonworks)
>> * Sriharsha Chintalapani (Hortonworks)
>> * Daniel Dai (Hortonworks)
>> 
>> We realize that additional employer diversity is needed, and we will work
>> aggressively to recruit developers from additional companies.
>> 
>> === Alignment ===
>> The initial committers strongly believe that a system for interactive
>> visualization of data will gain broader adoption as an open source,
>> community driven project, where the community can contribute not only to
>> the core components, but also to a growing collection of connectors,
>> visualizations and improving integration a all potential data sources.
>> Superset already integrates closely with Apache Hive, the Hive metastore,
>> as well as most SQL-speaking databases found in modern data ecosystems.
>> 
>> == Known Risks ==
>> 
>> === Orphaned Products ===
>> Superset is a vital component for both visualizing, accessing and
>> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
>> component of the DataFlow product offering.  Thus, the risk of the project
>> being orphaned is relatively low.  The project could be at risk if Airbnb
>> changes their approach for democratizing data or if Hortonworks changes
>> their strategy in the market.  In such an event, the committers plan to
>> continue working on the project on their own time, thought the progress
>> will likely be slower.  We plan to mitigate this risk by recruiting
>> additional committers.
>> 
>> === Inexperience with Open Source ===
>> The initial committers include veteran Apache members (committers and PPMC
>> members) and other developers who have varying degrees of experience with
>> open source projects. All have been involved with source code that has been
>> released under an open source license, and several also have experience
>> developing code with an open source development process.
>> 
>> === Homogenous Developers ===
>> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
>> committed to recruiting additional committers from other companies.
>> 
>> === Reliance on Salaried Developers ===
>> It is expected that Superset development will occur on both salaried time
>> and on volunteer time, after hours. The majority of initial committers are
>> paid by their employer to contribute to this project. However, they are all
>> passionate about the project, and we are confident that the project will
>> continue even if no salaried developers contribute to the project. We are
>> committed to recruiting additional committers including non-salaried
>> developers.
>> 
>> === Relationships with Other Apache Products ===
>> To the knowledge of the Initial Committers, there are no direct
>> competitors to Superset within the Apache Software Foundation.  That said,
>> Apache Zeppelin is an indirect competitor, but it solves a different use
>> case.
>> 
>> Apache Zeppelin is a web-based notebook that enables interactive data
>> analytics. It enables the creation of beautiful data-driven, interactive
>> and collaborative documents with SQL, Scala and more.  Although a user can
>> create data visualizations using this project, it leverages a notebook
>> style user interfaces and it is geared towards the Spark community where
>> Scala and SQL co-exist
>> 
>> We look forward to collaborating with those communities, as well as other
>> Apache communities.
>> 
>> === An Excessive Fascination with the Apache Brand ===
>> Superset is solving two huge challenges:
>> The challenge of enabling every knowledge worker to make data informed
>> decisions, particularly those who are not deeply skilled at writing SQL.
>> The challenge of visualizing huge amounts of data interactively and in
>> real-time
>> 
>> Superset was first developed as a data visualization solution for Druid.io
>> as a way to visualize billions of rows of data.  Since then, usage of
>> Superset has expanded to address data visualization use cases across SQL
>> speaking data sources as well.
>> 
>> Our rationale for developing Superset as an Apache project is detailed in
>> the Rationale Section.  We believe that the Apache brand and community
>> process will help us attract more contributors to this project, and help
>> grow the footprint of the project through usage at other organizations and
>> within other applications.  Establishing consensus among users and
>> developers will result in a more valuable tool for everyone.
>> 
>> == Documentation ==
>> References to further reading material:
>> * [[http://airbnb.io/superset/|Superset Documentation]]
>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-dat
>> a-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset:
>> Airbnb’s Data Exploration Platform]]
>> * [[https://medium.com/airbnb-engineering/superset-scaling-dat
>> a-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:
>> Superset: Scaling Data Access & Visual Insights at Airbnb]]
>> 
>> == Initial Source ==
>> The origin of the proposed code base can be found at
>> https://github.com/airbnb/superset.  The code base is primarily in
>> Python.
>> 
>> == Source and Intellectual Property Submission Plan ==
>> We do not expect any complications for the submission of the Superset code
>> base.  Our code is already in Github and there is only a single code base.
>> 
>> == External Dependencies ==
>> List of Python packages, from the Python Package Index (Pypi):
>> 
>> * boto3
>> * celery
>> * cryptography
>> * flask-appbuilder
>> * flask-cache
>> * flask-migrate
>> * flask-script
>> * flask-sqlalchemy
>> * flask-testing
>> * humanize
>> * gunicorn
>> * markdown
>> * pandas
>> * parsedatetime
>> * pydruid
>> * PyHive
>> * python-dateutil
>> * requests
>> * simplejson
>> * six
>> * sqlalchemy
>> * sqlalchemy-utils
>> * sqlparse
>> * thrift
>> * thrift-sasl
>> * werkzeug
>> 
>> List of JavaScript packages, from npm:
>> * autobind-decorator
>> * bootstrap
>> * bootstrap-datepicker
>> * brace
>> * brfs
>> * cal-heatmap
>> * classnames
>> * d3
>> * d3-cloud
>> * d3-sankey
>> * d3-scale
>> * d3-tip
>> * datamaps
>> * datatables-bootstrap3-plugin
>> * datatables.net-bs
>> * font-awesome
>> * gridster
>> * immutability-helper
>> * immutable
>> * jquery
>> * lodash.throttle
>> * mapbox-gl
>> * moment
>> * moments
>> * mustache
>> * nvd3
>> * react
>> * react-ace
>> * react-bootstrap
>> * react-bootstrap-table
>> * react-dom
>> * react-draggable
>> * react-gravatar
>> * react-grid-layout
>> * react-map-gl
>> * react-redux
>> * react-resizable
>> * react-select
>> * react-syntax-highlighter
>> * reactable
>> * redux
>> * redux-localstorage
>> * redux-thunk
>> * shortid
>> * style-loader
>> * supercluster
>> * topojson
>> * victory
>> * viewport-mercator-project
>> 
>> == Cryptography ==
>> The proposal does not include cryptographic code.
>> 
>> == Required Resources ==
>> 
>> === Mailing List ===
>> There is a current mailing list, the Google Group “airbnb_superset”, that
>> we are planning on deprecating once the Apache.org lists are ready to serve
>> our community.
>> 
>> * superset-private
>> * superset-dev
>> * superset-user
>> 
>> === Subversion Directory ===
>> Git is the preferred source control system.
>> http://svn.apache.org/repos/asf/incubator/superset
>> 
>> == Git Repository ==
>> Git is the preferred source control system; we’re assuming
>> https://github.com/apache/incubator-superset based on the naming scheme.
>> 
>> == Issue Tracking ==
>> JIRA Superset (SUPERSET). We’d like to use GitHub issues & PRs to manage
>> our project as much as possible. It’s been said that there are ways to keep
>> GitHub’s issues in sync with Jira, allowing us to get the best of both
>> worlds. If that is not possible, we will comply by using Jira.
>> 
>> == Other Resources ==
>> We currently use a set of GitHub-integrated services that are free to the
>> open source community, like Travis-ci, Code Climate, Coveralls,
>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
>> these services as they allow us to scale contributions and optimize our
>> development flows. These services require some elevated rights on the
>> GitHub repository in order to set up or tune, and we would like for the
>> committers to have the required rights.
>> 
>> 
>> == Initial Committers ==
>> 
>> * Maxime Beauchemin <ma...@airbnb.com> - PPMC & Committer
>> * Alanna Scott <al...@airbnb.com> - PPMC & Committer
>> * Bogdan Kyryliuk <b....@gmail.com> - PPMC & Committer
>> * Vera Liu <ve...@airbnb.com> - Committer
>> * Jeff Feng <je...@airbnb.com> - PPMC & Committer
>> * Ashutosh Chauhan <ha...@apache.org> - Mentor & Committer
>> * Nishant Bangarwa <nb...@hortonworks.com> - PPMC & Committer
>> * Slim Bouguerra <sb...@hortonworks.com> - Committer
>> * Priyank Shah <ps...@hortonworks.com> - Committer
>> * Harsha Chintalapani <sc...@hortonworks.com> - Committer
>> * Daniel Dai <da...@apache.org> - Champion & Committer
>> * Luke Han <lu...@apache.org> - Mentor
>> 
>> == Affiliations ==
>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>> 
>> == Sponsors ==
>> 
>> === Champion ===
>> Daniel Dai <da...@apache.org>
>> 
>> === Nominated Mentors ===
>> * Ashutosh Chauhan <ha...@apache.org>
>> * Luke Han <lu...@apache.org>
>> 
>> === Sponsoring Entity ===
>> Incubator PMC
>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [RESULT] [VOTE] Superset Proposal for Apache Incubator

Posted by Pierre Smits <pi...@gmail.com>.
Congratulations and good luck to the team incubating.

Best regards,

Pierre Smits

ORRTIZ.COM <http://www.orrtiz.com>
OFBiz based solutions & services

OFBiz Extensions Marketplace
http://oem.ofbizci.net/oci-2/


Re: [RESULT] [VOTE] Superset Proposal for Apache Incubator

Posted by James Mayfield <ja...@airbnb.com.INVALID>.
Congrats team!

On Tue, May 2, 2017 at 9:54 AM, Jeff Feng <je...@gmail.com> wrote:

> The vote for Superset passes with 11 +1 binding votes, 3 +1 non-binding
> votes and no -1 votes.  Below are the overall results:
>
> *Binding:*
> Ashutosh Chauhan +1
> Luke Han +1
> Julian Hyde +1
> Jitendra Pandey +1
> Joe Witt +1
> Ted Dunning +1
> P. Taylor Goetz +1
> Edward Yoon +1
> Jacques Nadeau +1
> Julian Le Dem +1
> Jim Jagielski +1
>
> *Non-Binding:*
> Moon Soo Lee +1
> Naresh Agarwal +1
> Felix Cheng +1
>
> Thank you to everyone who participated in the vote.
>
> Please welcome Superset to the Apache Incubator!
>
> Jeff
>
>
>
> On Sun, Apr 23, 2017 at 7:53 AM, Jeff Feng <je...@gmail.com> wrote:
>
>> Dear Apache Incubator Community,
>>
>> We have updated the Superset proposal
>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
>> Apache Incubation with an additional mentor (Luke Han -
>> luke.han@apache.org), and would like to start a vote thread for
>> acceptance into the incubator.
>>
>> Our team is excited to share Superset with the Apache community and we
>> hope for your continued support!
>>
>> Cheers,
>> Jeff & the Superset Team
>>
>>
>>
>>
>> = Superset =
>>
>> == Abstract ==
>> Superset is an enterprise-ready web application for data exploration,
>> data visualization and dashboarding.
>>
>> == Proposal ==
>> Superset is business intelligence (BI) software that helps modern
>> organizations visualize and interact with their data. Superset enables
>> users to explore data from a variety of databases, assemble beautiful
>> dashboards and share their findings.  Superset works neatly with all modern
>> SQL-speaking databases, and integrates with Druid.io to provide real-time,
>> interactive, blazing fast data access to large datasets.
>>
>> == Background ==
>> Data is mission critical. To succeed in this era, organizations need to
>> provide low-friction, intuitive and interactive access to data. It is
>> paramount for knowledge workers to be capable of answering their own
>> questions by querying, exploring and visualizing data.
>>
>> The entire business intelligence industry has pivoted from a model of
>> centralized top-down platforms driven by IT organizations to self-service
>> analytics and agile workflows by any user.  This shift unblocks centralized
>> service bottlenecks for creating data visualizations while also creating an
>> environment that is iterative and fast-moving.  This means that business
>> intelligence software must also be easy and delightful to use.
>> Self-service analytics doesn’t mean that admin and governance features are
>> not needed.
>> Modern BI tools provide fine-grained access controls and auditing
>> capabilities to understand how data is being used.  Superset is a solution
>> that delivers on all of these vectors.
>>
>> The technology stack is also constantly morphing - vendors are struggling
>> to provide cheap, quick and easy solutions to access data.  Business
>> intelligence users are finding existing solutions lacking as these software
>> products either disregard or react slowly to recent game-changing
>> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
>> React.js and iPython’s Jupyter for instance.
>>
>> == Rationale ==
>> Business intelligence is more relevant today than at any other point in
>> history.  Organizations are currently very limited in options for open
>> source data visualization solutions, especially solutions that are both
>> self-service and enterprise-ready.  Every company informing their decisions
>> with data needs a BI tool.
>>
>> We believe that Superset will be a strong complement to existing Apache
>> Software Foundation technologies by offering scalable user interactions to
>> distributed storage and computation solutions.  Users will often find that
>> Superset can act as a catalyst for tooling that can visualize the byproduct
>> of data and computation infrastructure.
>>
>> Superset has many key design elements that help fill a gap in current
>> solutions for organizations:
>>  * Easy, low friction access to data through a simple, web-based data
>> exploration interface.  Composing charts and dashboards is intuitive.
>> Eliminating the need to write code or SQL empowers anyone to use it.
>>  * Access to a wide array of rich, interactive data visualization types.
>>  * Enterprise-ready: Integration with different authentication mechanisms
>> and granular permissions centered around actions and data access.
>>  * Realtime & fast: Superset provides realtime analytics at the speed of
>> thought on very large datasets when integrated with Druid.io.
>>  * Broad data access: Consume data out of any SQL-speaking relational
>> database.
>>  * Extensible: Can be extended to talk to many NoSQL databases like
>> Apache Drill, Elasticsearch, and other popular database engines.
>>  * Fast loading dashboards with configurable web-scale caching.
>>  * Plug-in framework that enables organizations to build custom
>> analytical applications with new UI/UX interfaces.
>>  * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
>> with more flexibility.  SQL Lab integrates with the visualization engine
>> seamlessly.
>>
>> == Initial Goals ==
>> The initial goals of the Superset project are several-fold:
>>  * Move the existing codebase to Apache and integrate with the Apache
>> development process.
>>  * Redesign the user interface and interaction model for creating
>> visualizations/dashboards and connecting to data sources
>>  * Build robust support for security and governance of the tool including
>> popular authorization modules (including Apache Ranger and Apache Sentry)
>> and a more sophisticated permissions system
>>  * Grow the extensibility of the project both in terms of enhanced
>> connectivity to NoSQL-based data sources and creating a plug-in framework
>> that enables organizations to build custom analytical applications which
>> require a new UI/UX
>>
>> == Current Status ==
>> By many standards, Superset is already a successful open source project.
>> As of March 2017, Superset is officially used in production at about a
>> dozen companies, has received contributions from over one hundred
>> contributors on GitHub, and has 1500+ forks and 12k+ stars.
>>
>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
>> significant contributions, and expressed their commitment to the project.
>> The product is feature complete and has been viable for months. It already
>> serves as the main interface for consuming data at many companies of
>> different sizes.
>>
>> While the product is usable, there’s room for improvement across the
>> board, starting with providing a smoother user experience around content
>> creation, making sure all features work out-of-the-box on more platforms
>> and databases, providing better user training guides and videos, having a
>> predictable release process, and increasing the overall quality of the
>> Superset releases.
>>
>> === Meritocracy ===
>> We plan to invest in supporting a meritocracy. We will discuss the
>> requirements in an open forum. Several companies have expressed interest in
>> this project, and we intend to invite additional developers to participate.
>> We will encourage and monitor community participation so that privileges
>> can be extended to those that contribute.
>>
>> === Community ===
>> The need for an enterprise-ready data visualization and exploration
>> platform in the open source community is tremendous.  While Superset is
>> fairly well known, recognized and used within the Druid.io community,
>> adoption is currently limited outside of that niche. There is a huge
>> opportunity to grow the community to hundreds if not thousands of
>> organizations, and we are hoping that embracing “the Apache way” will
>> accelerate the growth of our community.
>>
>> We have already been active in seeking and inviting contributions, and
>> are planning to scale the project by investing time and growing the support
>> structure to grow the community.
>>
>> === Core Developers ===
>> The initial committers for Superset include experienced full stack,
>> front-end and data engineers:
>>  * Maxime Beauchemin (Airbnb)
>>  * Alanna Scott (Airbnb)
>>  * Bogdan Kyryliuk (Airbnb)
>>  * Vera Liu  (Airbnb)
>>  * Jeff Feng (Airbnb)
>>  * Ashutosh Chauhan (Hortonworks)
>>  * Nishant Bangarwa (Hortonworks)
>>  * Slim Bouguerra (Hortonworks)
>>  * Priyank Shah (Hortonworks)
>>  * Sriharsha Chintalapani (Hortonworks)
>>  * Daniel Dai (Hortonworks)
>>
>> We realize that additional employer diversity is needed, and we will work
>> aggressively to recruit developers from additional companies.
>>
>> === Alignment ===
>> The initial committers strongly believe that a system for interactive
>> visualization of data will gain broader adoption as an open source,
>> community driven project, where the community can contribute not only to
>> the core components, but also to a growing collection of connectors,
>> visualizations and improving integration with all potential data sources.
>> Superset already integrates closely with Apache Hive, the Hive metastore,
>> as well as most SQL-speaking databases found in modern data ecosystems.
>>
>> == Known Risks ==
>>
>> === Orphaned Products ===
>> Superset is a vital component for visualizing, accessing and
>> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
>> component of the DataFlow product offering.  Thus, the risk of the project
>> being orphaned is relatively low.  The project could be at risk if Airbnb
>> changes their approach for democratizing data or if Hortonworks changes
>> their strategy in the market.  In such an event, the committers plan to
>> continue working on the project on their own time, though progress
>> will likely be slower.  We plan to mitigate this risk by recruiting
>> additional committers.
>>
>> === Inexperience with Open Source ===
>> The initial committers include veteran Apache members (committers and
>> PPMC members) and other developers who have varying degrees of experience
>> with open source projects. All have been involved with source code that has
>> been released under an open source license, and several also have
>> experience developing code with an open source development process.
>>
>> === Homogenous Developers ===
>> The initial committers are employed by Airbnb Inc. and Hortonworks. We
>> are committed to recruiting additional committers from other companies.
>>
>> === Reliance on Salaried Developers ===
>> It is expected that Superset development will occur on both salaried time
>> and on volunteer time, after hours. The majority of initial committers are
>> paid by their employer to contribute to this project. However, they are all
>> passionate about the project, and we are confident that the project will
>> continue even if no salaried developers contribute to the project. We are
>> committed to recruiting additional committers including non-salaried
>> developers.
>>
>> === Relationships with Other Apache Products ===
>> To the knowledge of the Initial Committers, there are no direct
>> competitors to Superset within the Apache Software Foundation.  That said,
>> Apache Zeppelin is an indirect competitor, but it solves a different use
>> case.
>>
>> Apache Zeppelin is a web-based notebook that enables interactive data
>> analytics. It enables the creation of beautiful data-driven, interactive
>> and collaborative documents with SQL, Scala and more.  Although a user can
>> create data visualizations using this project, it leverages a
>> notebook-style user interface and is geared towards the Spark community
>> where Scala and SQL co-exist.
>>
>> We look forward to collaborating with those communities, as well as other
>> Apache communities.
>>
>> === An Excessive Fascination with the Apache Brand ===
>> Superset is solving two huge challenges:
>>  * The challenge of enabling every knowledge worker to make data-informed
>>    decisions, particularly those who are not deeply skilled at writing SQL.
>>  * The challenge of visualizing huge amounts of data interactively and in
>>    real time.
>>
>> Superset was first developed as a data visualization solution for
>> Druid.io as a way to visualize billions of rows of data.  Since then, usage
>> of Superset has expanded to address data visualization use cases across
>> SQL-speaking data sources as well.
>>
>> Our rationale for developing Superset as an Apache project is detailed in
>> the Rationale Section.  We believe that the Apache brand and community
>> process will help us attract more contributors to this project, and help
>> grow the footprint of the project through usage at other organizations and
>> within other applications.  Establishing consensus among users and
>> developers will result in a more valuable tool for everyone.
>>
>> == Documentation ==
>> References to further reading material:
>>  * [[http://airbnb.io/superset/|Superset Documentation]]
>>  * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>>  * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>>
>> == Initial Source ==
>> The origin of the proposed code base can be found at
>> https://github.com/airbnb/superset.  The code base is primarily in
>> Python.
>>
>> == Source and Intellectual Property Submission Plan ==
>> We do not expect any complications for the submission of the Superset
>> code base.  Our code is already on GitHub and there is only a single code
>> base.
>>
>> == External Dependencies ==
>> List of Python packages, from the Python Package Index (PyPI):
>>
>>  * boto3
>>  * celery
>>  * cryptography
>>  * flask-appbuilder
>>  * flask-cache
>>  * flask-migrate
>>  * flask-script
>>  * flask-sqlalchemy
>>  * flask-testing
>>  * humanize
>>  * gunicorn
>>  * markdown
>>  * pandas
>>  * parsedatetime
>>  * pydruid
>>  * PyHive
>>  * python-dateutil
>>  * requests
>>  * simplejson
>>  * six
>>  * sqlalchemy
>>  * sqlalchemy-utils
>>  * sqlparse
>>  * thrift
>>  * thrift-sasl
>>  * werkzeug
>>
>> List of JavaScript packages, from NPM:
>>  * autobind-decorator
>>  * bootstrap
>>  * bootstrap-datepicker
>>  * brace
>>  * brfs
>>  * cal-heatmap
>>  * classnames
>>  * d3
>>  * d3-cloud
>>  * d3-sankey
>>  * d3-scale
>>  * d3-tip
>>  * datamaps
>>  * datatables-bootstrap3-plugin
>>  * datatables.net-bs
>>  * font-awesome
>>  * gridster
>>  * immutability-helper
>>  * immutable
>>  * jquery
>>  * lodash.throttle
>>  * mapbox-gl
>>  * moment
>>  * moments
>>  * mustache
>>  * nvd3
>>  * react
>>  * react-ace
>>  * react-bootstrap
>>  * react-bootstrap-table
>>  * react-dom
>>  * react-draggable
>>  * react-gravatar
>>  * react-grid-layout
>>  * react-map-gl
>>  * react-redux
>>  * react-resizable
>>  * react-select
>>  * react-syntax-highlighter
>>  * reactable
>>  * redux
>>  * redux-localstorage
>>  * redux-thunk
>>  * shortid
>>  * style-loader
>>  * supercluster
>>  * topojson
>>  * victory
>>  * viewport-mercator-project
>>
>> == Cryptography ==
>> The proposal does not include cryptographic code.
>>
>> == Required Resources ==
>>
>> === Mailing List ===
>> There is a current mailing list, the Google Group “airbnb_superset”, that
>> we plan to deprecate once the Apache.org lists are ready to serve our
>> community.
>>
>>  * superset-private
>>  * superset-dev
>>  * superset-user
>>
>> === Subversion Directory ===
>> Git is the preferred source control system.
>> http://svn.apache.org/repos/asf/incubator/superset
>>
>> == Git Repository ==
>> Git is the preferred source control system. We’re assuming
>> https://github.com/apache/incubator-superset based on the naming scheme.
>>
>> == Issue Tracking ==
>> JIRA Superset (SUPERSET). We’d like to use GitHub issues & PRs to manage
>> our project as much as possible. It’s been said that there are ways to
>> keep GitHub’s issues in sync with JIRA, allowing us to get the best of
>> both worlds. If that is not possible, we will comply and use JIRA.
>>
>> == Other Resources ==
>> We currently use a set of GitHub-integrated services that are free to the
>> open source community, like Travis-ci, Code Climate, Coveralls,
>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
>> these services as they allow us to scale contributions and optimize our
>> development flows. These services require some elevated rights on the
>> GitHub repository in order to set up or tune, and we would like the
>> committers to have the required rights.
>>
>>
>> == Initial Committers ==
>>
>>  * Maxime Beauchemin <ma...@airbnb.com> - PPMC & Committer
>>  * Alanna Scott <al...@airbnb.com> - PPMC & Committer
>>  * Bogdan Kyryliuk <b....@gmail.com> - PPMC & Committer
>>  * Vera Liu <ve...@airbnb.com> - Committer
>>  * Jeff Feng <je...@airbnb.com> - PPMC & Committer
>>  * Ashutosh Chauhan <ha...@apache.org> - Mentor & Committer
>>  * Nishant Bangarwa <nb...@hortonworks.com> - PPMC & Committer
>>  * Slim Bouguerra <sb...@hortonworks.com> - Committer
>>  * Priyank Shah <ps...@hortonworks.com> - Committer
>>  * Harsha Chintalapani <sc...@hortonworks.com> - Committer
>>  * Daniel Dai <da...@apache.org> - Champion & Committer
>>  * Luke Han <lu...@apache.org> - Mentor
>>
>> == Affiliations ==
>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>>
>> == Sponsors ==
>>
>> === Champion ===
>> Daniel Dai <da...@apache.org>
>>
>> === Nominated Mentors ===
>>  * Ashutosh Chauhan <ha...@apache.org>
>>  * Luke Han <lu...@apache.org>
>>
>> === Sponsoring Entity ===
>> Incubator PMC
>>
>>
>