Posted to dev@spark.apache.org by Punyashloka Biswal <pu...@gmail.com> on 2015/04/24 14:12:41 UTC

Design docs: consolidation and discoverability

Dear Spark devs,

Right now, design docs are stored on Google Docs and linked from tickets.
For someone new to the project, it's hard to figure out what subjects are
being discussed, what organization to follow for new feature proposals, etc.

Would it make sense to consolidate future design docs in either a
designated area on the Apache Confluence Wiki, or on GitHub's Wiki pages?
If people have a strong preference to keep the design docs on Google Docs,
then could we have a top-level page on the Confluence wiki that lists all
active and archived design docs?

Punya

Re: Design docs: consolidation and discoverability

Posted by Punyashloka Biswal <pu...@gmail.com>.
GitHub's wiki is just another Git repo. If we use a separate repo, it's
probably easiest to use the wiki Git repo rather than the "primary" Git
repo.

Punya
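
As a rough sketch of the point above: a GitHub wiki is served as its own
Git repository, conventionally at the main repository's URL with
".wiki.git" appended. The helper name below is hypothetical, and
"apache/spark" is only an illustrative owner/repo pair; whether a wiki is
actually enabled for a given repository is a separate question.

```python
def wiki_clone_url(owner: str, repo: str) -> str:
    """Return the HTTPS clone URL of a GitHub repository's wiki.

    Hypothetical helper: it only formats the conventional URL and does
    not check that the wiki exists or is enabled.
    """
    if not owner or not repo:
        raise ValueError("owner and repo must be non-empty")
    return f"https://github.com/{owner}/{repo}.wiki.git"


if __name__ == "__main__":
    # The printed URL can be handed to `git clone` like any other remote.
    print(wiki_clone_url("apache", "spark"))
```

Because the wiki is an ordinary Git remote, its pages can be cloned,
edited locally, and pushed like any other tracked file.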

On Mon, Apr 27, 2015 at 1:50 PM Nicholas Chammas <ni...@gmail.com>
wrote:

> Oh, a GitHub wiki (which is separate from having docs in a repo) is yet
> another approach we could take, though if we want to do that on the main
> Spark repo we'd need permission from Apache, which may be tough to get...
>
> On Mon, Apr 27, 2015 at 1:47 PM Punyashloka Biswal <pu...@gmail.com>
> wrote:
>
>> Nick, I like your idea of keeping it in a separate git repository. It
>> seems to combine the advantages of the present Google Docs approach with
>> the crisper history, discoverability, and text format simplicity of GitHub
>> wikis.
>>
>> Punya
>> On Mon, Apr 27, 2015 at 1:30 PM Nicholas Chammas <
>> nicholas.chammas@gmail.com> wrote:
>>
>>> I like the idea of having design docs be kept up to date and tracked in
>>> git.
>>>
>>> If the Apache repo isn't a good fit, perhaps we can have a separate repo
>>> just for design docs? Maybe something like
>>> github.com/spark-docs/spark-docs/
>>> ?
>>>
>>> If there's other stuff we want to track but haven't, perhaps we can
>>> generalize the purpose of the repo a bit and rename it accordingly (e.g.
>>> spark-misc/spark-misc).
>>>
>>> Nick
>>>
>>> On Mon, Apr 27, 2015 at 1:21 PM Sandy Ryza <sa...@cloudera.com>
>>> wrote:
>>>
>>> > My only issue with Google Docs is that they're mutable, so it's
>>> > difficult to follow a design's history through its revisions and link
>>> > up JIRA comments with the relevant version.
>>> >
>>> > -Sandy
>>> >
>>> > On Mon, Apr 27, 2015 at 7:54 AM, Steve Loughran <
>>> stevel@hortonworks.com>
>>> > wrote:
>>> >
>>> > >
>>> > > One thing to consider is that while docs as PDFs in JIRAs do
>>> > > document the original proposal, that's not the place to keep living
>>> > > specifications. That stuff needs to live in SCM, in a format which
>>> > > can be easily maintained, can generate readable documents, and, in
>>> > > an unrealistically ideal world, even be used by machines to validate
>>> > > compliance with the design. Test suites tend to be the implicit
>>> > > machine-readable part of the specification, though they aren't
>>> > > usually viewed as such.
>>> > >
>>> > > PDFs of word docs in JIRAs are not the place for ongoing work, even
>>> > > if the early drafts can contain them. Given it's just as easy to
>>> > > point to markdown docs in github by commit ID, that could be an
>>> > > alternative way to publish docs, with the document itself being
>>> > > viewed as one of the deliverables. When the time comes to update a
>>> > > document, then it's there in the source tree to edit.
>>> > >
>>> > > If there's a flaw here, it's that design docs are just that: the
>>> > > design. The implementation may not match, and ongoing work will
>>> > > certainly diverge. If the design docs aren't kept in sync, then they
>>> > > can mislead people. Accordingly, once the design docs are
>>> > > incorporated into the source tree, keeping them in sync with changes
>>> > > has to be viewed as being as essential as keeping tests up to date.
>>> > >
>>> > > > On 26 Apr 2015, at 22:34, Patrick Wendell <pw...@gmail.com>
>>> wrote:
>>> > > >
>>> > > > I actually don't totally see why we can't use Google Docs provided
>>> it
>>> > > > is clearly discoverable from the JIRA. It was my understanding that
>>> > > > many projects do this. Maybe not (?).
>>> > > >
>>> > > > If it's a matter of maintaining public record on ASF
>>> > > > infrastructure, perhaps we can just automate it so that when an
>>> > > > issue is closed, we capture the doc content and attach it to the
>>> > > > JIRA as a PDF.
>>> > > >
>>> > > > My sense is that in general the ASF infrastructure policy is
>>> > > > becoming more and more lenient with regard to using third-party
>>> > > > services, provided they are broadly accessible (such as a public
>>> > > > Google Doc) and can be definitively archived on ASF-controlled
>>> > > > storage.
>>> > > >
>>> > > > - Patrick
>>> > > >
>>> > > > On Fri, Apr 24, 2015 at 4:57 PM, Sean Owen <so...@cloudera.com>
>>> wrote:
>>> > > >> I know I recently used Google Docs from a JIRA, so am guilty as
>>> > > >> charged. I don't think there are a lot of design docs in general,
>>> but
>>> > > >> the ones I've seen have simply pushed docs to a JIRA. (I did the
>>> same,
>>> > > >> mirroring PDFs of the Google Doc.) I don't think this is hard to
>>> > > >> follow.
>>> > > >>
>>> > > >> I think you can do what you like: make a JIRA and attach files.
>>> Make a
>>> > > >> WIP PR and attach your notes. Make a Google Doc if you're feeling
>>> > > >> transgressive.
>>> > > >>
>>> > > >> I don't see much of a problem to solve here. In practice there are
>>> > > >> plenty of workable options, all of which are mainstream, and so I
>>> do
>>> > > >> not see an argument that somehow this is solved by letting people
>>> make
>>> > > >> wikis.
>>> > > >>
>>> > > >> On Fri, Apr 24, 2015 at 7:42 PM, Punyashloka Biswal
>>> > > >> <pu...@gmail.com> wrote:
>>> > > >>> Okay, I can understand wanting to keep Git history clean, and
>>> avoid
>>> > > >>> bottlenecking on committers. Is it reasonable to establish a
>>> > > convention of
>>> > > >>> having a label, component or (best of all) an issue type for
>>> issues
>>> > > that are
>>> > > >>> associated with design docs? For example, if we used the existing
>>> > > >>> "Brainstorming" issue type, and people put their design doc in
>>> the
>>> > > >>> description of the ticket, it would be relatively easy to figure
>>> out
>>> > > what
>>> > > >>> designs are in progress.
>>> > > >>>
>>> > > >>> Given the push-back against design docs in Git or on the wiki
>>> and the
>>> > > strong
>>> > > >>> preference for keeping docs on ASF property, I'm a bit surprised
>>> that
>>> > > all
>>> > > >>> the existing design docs are on Google Docs. Perhaps Apache
>>> should
>>> > > consider
>>> > > >>> opening up parts of the wiki to a larger group, to better serve
>>> this
>>> > > use
>>> > > >>> case.
>>> > > >>>
>>> > > >>> Punya
>>> > > >>>
>>> > > >>> On Fri, Apr 24, 2015 at 5:01 PM Patrick Wendell <
>>> pwendell@gmail.com>
>>> > > wrote:
>>> > > >>>>
>>> > > >>>> Using our ASF git repository as a working area for design docs
>>> > > >>>> seems potentially concerning to me. It's difficult process-wise
>>> > > >>>> because all commits need to go through committers, and we'd
>>> > > >>>> also pollute our git history a lot with random incremental
>>> > > >>>> design updates.
>>> > > >>>>
>>> > > >>>> The git history is used a lot by downstream packagers, by us
>>> > > >>>> during our QA process, etc. We really try to keep it oriented
>>> > > >>>> around code patches:
>>> > > >>>>
>>> > > >>>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog
>>> > > >>>>
>>> > > >>>> Committing a polished design doc along with a feature is maybe
>>> > > >>>> something we could consider. But I still think JIRA is the best
>>> > > >>>> location for these docs, consistent with what most other ASF
>>> > > >>>> projects that I know of do.
>>> > > >>>>
>>> > > >>>> On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger <
>>> cody@koeninger.org
>>> > >
>>> > > >>>> wrote:
>>> > > >>>>> Why can't pull requests be used for design docs in Git if
>>> people
>>> > who
>>> > > >>>>> aren't
>>> > > >>>>> committers want to contribute changes (as opposed to just
>>> > comments)?
>>> > > >>>>>
>>> > > >>>>> On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen <sowen@cloudera.com
>>> >
>>> > > wrote:
>>> > > >>>>>
>>> > > >>>>>> Only catch there is it requires commit access to the repo. We
>>> > need a
>>> > > >>>>>> way for people who aren't committers to write and collaborate
>>> (for
>>> > > >>>>>> point #1)
>>> > > >>>>>>
>>> > > >>>>>> On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
>>> > > >>>>>> <pu...@gmail.com> wrote:
>>> > > >>>>>>> Sandy, doesn't keeping (in-progress) design docs in Git
>>> > > >>>>>>> satisfy the history requirement? Referring back to my Gradle
>>> > > >>>>>>> example, it seems that
>>> > > >>>>>>> https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
>>> > > >>>>>>> is a really good way to see why the design doc evolved the
>>> > > >>>>>>> way it did. When keeping the doc in Jira (presumably as an
>>> > > >>>>>>> attachment) it's not easy to see what changed between
>>> > > >>>>>>> successive versions of the doc.
>>> > > >>>>>>>
>>> > > >>>>>>> Punya
>>> > > >>>>>>>
>>> > > >>>>>>> On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <
>>> > > sandy.ryza@cloudera.com>
>>> > > >>>>>> wrote:
>>> > > >>>>>>>>
>>> > > >>>>>>>> I think there are maybe two separate things we're talking
>>> > > >>>>>>>> about?
>>> > > >>>>>>>>
>>> > > >>>>>>>> 1. Design discussions and in-progress design docs.
>>> > > >>>>>>>>
>>> > > >>>>>>>> My two cents are that JIRA is the best place for this.  It
>>> > > >>>>>>>> allows tracking the progression of a design across multiple
>>> > > >>>>>>>> PRs and contributors.  A piece of useful feedback that I've
>>> > > >>>>>>>> gotten in the past is to make design docs immutable.  When
>>> > > >>>>>>>> updating them in response to feedback, post a new version
>>> > > >>>>>>>> rather than editing the existing one.  This enables tracking
>>> > > >>>>>>>> the history of a design and makes it possible to read
>>> > > >>>>>>>> comments about previous designs in context.  Otherwise it's
>>> > > >>>>>>>> really difficult to understand why particular approaches
>>> > > >>>>>>>> were chosen or abandoned.
>>> > > >>>>>>>>
>>> > > >>>>>>>> 2. Completed design docs for features that we've
>>> implemented.
>>> > > >>>>>>>>
>>> > > >>>>>>>> Perhaps less essential to project progress, but it would be
>>> > > >>>>>>>> really lovely to have a central repository for all the
>>> > > >>>>>>>> project's design docs.  If anyone wants to step up to
>>> > > >>>>>>>> maintain it, it would be cool to have a wiki page with links
>>> > > >>>>>>>> to all the final design docs posted on JIRA.
>>> > > >>>>>>>>
>>> > > >>>>>>
>>> > > >
>>> > > >
>>> ---------------------------------------------------------------------
>>> > > > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> > > > For additional commands, e-mail: dev-help@spark.apache.org

Re: Design docs: consolidation and discoverability

Posted by Nicholas Chammas <ni...@gmail.com>.
Oh, a GitHub wiki (which is separate from having docs in a repo) is yet
another approach we could take, though if we want to do that on the main
Spark repo we'd need permission from Apache, which may be tough to get...

Re: Design docs: consolidation and discoverability

Posted by Punyashloka Biswal <pu...@gmail.com>.
Nick, I like your idea of keeping it in a separate git repository. It seems
to combine the advantages of the present Google Docs approach with the
crisper history, discoverability, and text format simplicity of GitHub
wikis.

Punya
On Mon, Apr 27, 2015 at 1:30 PM Nicholas Chammas <ni...@gmail.com>
wrote:

> I like the idea of having design docs be kept up to date and tracked in
> git.
>
> If the Apache repo isn't a good fit, perhaps we can have a separate repo
> just for design docs? Maybe something like
> github.com/spark-docs/spark-docs/
> ?
>
> If there's other stuff we want to track but haven't, perhaps we can
> generalize the purpose of the repo a bit and rename it accordingly (e.g.
> spark-misc/spark-misc).
>
> Nick
>
> On Mon, Apr 27, 2015 at 1:21 PM Sandy Ryza <sa...@cloudera.com>
> wrote:
>
> > My only issue with Google Docs is that they're mutable, so it's difficult
> > to follow a design's history through its revisions and link up JIRA
> > comments with the relevant version.
> >
> > -Sandy
> >
> > On Mon, Apr 27, 2015 at 7:54 AM, Steve Loughran <st...@hortonworks.com>
> > wrote:
> >
> > >
> > > One thing to consider is that while docs as PDFs in JIRAs do document
> the
> > > original proposal, that's not the place to keep living specifications.
> > That
> > > stuff needs to live in SCM, in a format which can be easily maintained,
> > can
> > > generate readable documents, and, in an unrealistically ideal world,
> even
> > > be used by machines to validate compliance with the design. Test suites
> > > tend to be the implicit machine-readable part of the specification,
> > though
> > > they aren't usually viewed as such.
> > >
> > > PDFs of word docs in JIRAs are not the place for ongoing work, even if
> > the
> > > early drafts can contain them. Given it's just as easy to point to
> > markdown
> > > docs in github by commit ID, that could be an alternative way to
> publish
> > > docs, with the document itself being viewed as one of the deliverables.
> > > When the time comes to update a document, then its there in the source
> > tree
> > > to edit.
> > >
> > > If there's a flaw here, its that design docs are that: the design. The
> > > implementation may not match, ongoing work will certainly diverge. If
> the
> > > design docs aren't kept in sync, then they can mislead people.
> > Accordingly,
> > > once the design docs are incorporated into the source tree, keeping
> them
> > in
> > > sync with changes has be viewed as essential as keeping tests up to
> date
> > >
> > > > On 26 Apr 2015, at 22:34, Patrick Wendell <pw...@gmail.com>
> wrote:
> > > >
> > > > I actually don't totally see why we can't use Google Docs provided it
> > > > is clearly discoverable from the JIRA. It was my understanding that
> > > > many projects do this. Maybe not (?).
> > > >
> > > > If it's a matter of maintaining public record on ASF infrastructure,
> > > > perhaps we can just automate that if an issue is closed we capture
> the
> > > > doc content and attach it to the JIRA as a PDF.
> > > >
> > > > My sense is that in general the ASF infrastructure policy is becoming
> > > > more and more lenient with regards to using third party services,
> > > > provided the are broadly accessible (such as a public google doc) and
> > > > can be definitively archived on ASF controlled storage.
> > > >
> > > > - Patrick
> > > >
> > > > On Fri, Apr 24, 2015 at 4:57 PM, Sean Owen <so...@cloudera.com>
> wrote:
> > > >> I know I recently used Google Docs from a JIRA, so am guilty as
> > > >> charged. I don't think there are a lot of design docs in general,
> but
> > > >> the ones I've seen have simply pushed docs to a JIRA. (I did the
> same,
> > > >> mirroring PDFs of the Google Doc.) I don't think this is hard to
> > > >> follow.
> > > >>
> > > >> I think you can do what you like: make a JIRA and attach files.
> Make a
> > > >> WIP PR and attach your notes. Make a Google Doc if you're feeling
> > > >> transgressive.
> > > >>
> > > >> I don't see much of a problem to solve here. In practice there are
> > > >> plenty of workable options, all of which are mainstream, and so I do
> > > >> not see an argument that somehow this is solved by letting people
> make
> > > >> wikis.
> > > >>
> > > >> On Fri, Apr 24, 2015 at 7:42 PM, Punyashloka Biswal
> > > >> <pu...@gmail.com> wrote:
> > > >>> Okay, I can understand wanting to keep Git history clean, and avoid
> > > >>> bottlenecking on committers. Is it reasonable to establish a
> > > convention of
> > > >>> having a label, component or (best of all) an issue type for issues
> > > that are
> > > >>> associated with design docs? For example, if we used the existing
> > > >>> "Brainstorming" issue type, and people put their design doc in the
> > > >>> description of the ticket, it would be relatively easy to figure
> out
> > > what
> > > >>> designs are in progress.
> > > >>>
> > > >>> Given the push-back against design docs in Git or on the wiki and
> the
> > > strong
> > > >>> preference for keeping docs on ASF property, I'm a bit surprised
> that
> > > all
> > > >>> the existing design docs are on Google Docs. Perhaps Apache should
> > > consider
> > > >>> opening up parts of the wiki to a larger group, to better serve
> this
> > > use
> > > >>> case.
> > > >>>
> > > >>> Punya
> > > >>>
> > > >>> On Fri, Apr 24, 2015 at 5:01 PM Patrick Wendell <
> pwendell@gmail.com>
> > > wrote:
> > > >>>>
> > > >>>> Using our ASF git repository as a working area for design docs, it
> > > >>>> seems potentially concerning to me. It's difficult process wise
> > > >>>> because all commits need to go through committers and also, we'd
> > > >>>> pollute our git history a lot with random incremental design
> > updates.
> > > >>>>
> > > >>>> The git history is used a lot by downstream packagers, us during
> our
> > > >>>> QA process, etc... we really try to keep it oriented around code
> > > >>>> patches:
> > > >>>>
> > > >>>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog
> > > >>>>
> > > >>>> Committing a polished design doc along with a feature, maybe
> that's
> > > >>>> something we could consider. But I still think JIRA is the best
> > > >>>> location for these docs, consistent with what most other ASF
> > projects
> > > >>>> do that I know.
> > > >>>>
> > > >>>> On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger <
> cody@koeninger.org
> > >
> > > >>>> wrote:
> > > >>>>> Why can't pull requests be used for design docs in Git if people
> > who
> > > >>>>> aren't
> > > >>>>> committers want to contribute changes (as opposed to just
> > comments)?
> > > >>>>>
> > > >>>>> On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen <so...@cloudera.com>
> > > wrote:
> > > >>>>>
> > > >>>>>> Only catch there is it requires commit access to the repo. We
> > need a
> > > >>>>>> way for people who aren't committers to write and collaborate
> (for
> > > >>>>>> point #1)
> > > >>>>>>
> > > >>>>>> On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
> > > >>>>>> <pu...@gmail.com> wrote:
> > > >>>>>>> Sandy, doesn't keeping (in-progress) design docs in Git satisfy
> > the
> > > >>>>>> history
> > > >>>>>>> requirement? Referring back to my Gradle example, it seems that
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> > >
> >
> https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
> > > >>>>>>> is a really good way to see why the design doc evolved the way
> it
> > > >>>>>>> did.
> > > >>>>>> When
> > > >>>>>>> keeping the doc in Jira (presumably as an attachment) it's not
> > easy
> > > >>>>>>> to
> > > >>>>>> see
> > > >>>>>>> what changed between successive versions of the doc.
> > > >>>>>>>
> > > >>>>>>> Punya
> > > >>>>>>>
> > > >>>>>>> On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <
> > > sandy.ryza@cloudera.com>
> > > >>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>> I think there are maybe two separate things we're talking
> about?
> > > >>>>>>>>
> > > >>>>>>>> 1. Design discussions and in-progress design docs.
> > > >>>>>>>>
> > > >>>>>>>> My two cents are that JIRA is the best place for this.  It
> > allows
> > > >>>>>> tracking
> > > >>>>>>>> the progression of a design across multiple PRs and
> > > contributors.  A
> > > >>>>>> piece
> > > >>>>>>>> of useful feedback that I've gotten in the past is to make
> > design
> > > >>>>>>>> docs
> > > >>>>>>>> immutable.  When updating them in response to feedback, post a
> > new
> > > >>>>>> version
> > > >>>>>>>> rather than editing the existing one.  This enables tracking
> the
> > > >>>>>> history of
> > > >>>>>>>> a design and makes it possible to read comments about previous
> > > >>>>>>>> designs
> > > >>>>>> in
> > > >>>>>>>> context.  Otherwise it's really difficult to understand why
> > > >>>>>>>> particular
> > > >>>>>>>> approaches were chosen or abandoned.
> > > >>>>>>>>
> > > >>>>>>>> 2. Completed design docs for features that we've implemented.
> > > >>>>>>>>
> > > >>>>>>>> Perhaps less essential to project progress, but it would be
> > really
> > > >>>>>> lovely
> > > >>>>>>>> to have a central repository to all the projects design doc.
> If
> > > >>>>>>>> anyone
> > > >>>>>>>> wants to step up to maintain it, it would be cool to have a
> wiki
> > > >>>>>>>> page
> > > >>>>>> with
> > > >>>>>>>> links to all the final design docs posted on JIRA.
> > > >>>>>>>>
> > > >>>>>>
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> > > > For additional commands, e-mail: dev-help@spark.apache.org
> > > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> > > For additional commands, e-mail: dev-help@spark.apache.org
> > >
> > >
> >
>

Re: Design docs: consolidation and discoverability

Posted by Nicholas Chammas <ni...@gmail.com>.
I like the idea of having design docs be kept up to date and tracked in
git.

If the Apache repo isn't a good fit, perhaps we can have a separate repo
just for design docs? Maybe something like
github.com/spark-docs/spark-docs/?

If there's other stuff we want to track but haven't, perhaps we can
generalize the purpose of the repo a bit and rename it accordingly (e.g.
spark-misc/spark-misc).

Nick

On Mon, Apr 27, 2015 at 1:21 PM Sandy Ryza <sa...@cloudera.com> wrote:

> My only issue with Google Docs is that they're mutable, so it's difficult
> to follow a design's history through its revisions and link up JIRA
> comments with the relevant version.
>
> -Sandy

Re: Design docs: consolidation and discoverability

Posted by Sandy Ryza <sa...@cloudera.com>.
My only issue with Google Docs is that they're mutable, so it's difficult
to follow a design's history through its revisions and link up JIRA
comments with the relevant version.

-Sandy

On Mon, Apr 27, 2015 at 7:54 AM, Steve Loughran <st...@hortonworks.com>
wrote:

>
> One thing to consider is that while docs as PDFs in JIRAs do document the
> original proposal, that's not the place to keep living specifications. That
> stuff needs to live in SCM, in a format which can be easily maintained, can
> generate readable documents, and, in an unrealistically ideal world, even
> be used by machines to validate compliance with the design. Test suites
> tend to be the implicit machine-readable part of the specification, though
> they aren't usually viewed as such.
>
> PDFs of Word docs in JIRAs are not the place for ongoing work, even if the
> early drafts can contain them. Given it's just as easy to point to Markdown
> docs in GitHub by commit ID, that could be an alternative way to publish
> docs, with the document itself being viewed as one of the deliverables.
> When the time comes to update a document, then it's there in the source tree
> to edit.
>
> If there's a flaw here, it's that design docs are just that: the design. The
> implementation may not match, and ongoing work will certainly diverge. If the
> design docs aren't kept in sync, then they can mislead people. Accordingly,
> once the design docs are incorporated into the source tree, keeping them in
> sync with changes has to be viewed as being as essential as keeping tests up
> to date.

Re: Design docs: consolidation and discoverability

Posted by Steve Loughran <st...@hortonworks.com>.
One thing to consider is that while docs as PDFs in JIRAs do document the original proposal, that's not the place to keep living specifications. That stuff needs to live in SCM, in a format which can be easily maintained, can generate readable documents, and, in an unrealistically ideal world, even be used by machines to validate compliance with the design. Test suites tend to be the implicit machine-readable part of the specification, though they aren't usually viewed as such.

PDFs of Word docs in JIRAs are not the place for ongoing work, even if the early drafts can contain them. Given it's just as easy to point to Markdown docs in GitHub by commit ID, that could be an alternative way to publish docs, with the document itself being viewed as one of the deliverables. When the time comes to update a document, then it's there in the source tree to edit.

If there's a flaw here, it's that design docs are just that: the design. The implementation may not match, and ongoing work will certainly diverge. If the design docs aren't kept in sync, then they can mislead people. Accordingly, once the design docs are incorporated into the source tree, keeping them in sync with changes has to be viewed as being as essential as keeping tests up to date.

> On 26 Apr 2015, at 22:34, Patrick Wendell <pw...@gmail.com> wrote:
> 
> I actually don't totally see why we can't use Google Docs provided it
> is clearly discoverable from the JIRA. It was my understanding that
> many projects do this. Maybe not (?).
> 
> If it's a matter of maintaining public record on ASF infrastructure,
> perhaps we can just automate that if an issue is closed we capture the
> doc content and attach it to the JIRA as a PDF.
> 
> My sense is that in general the ASF infrastructure policy is becoming
> more and more lenient with regard to using third-party services,
> provided they are broadly accessible (such as a public Google Doc) and
> can be definitively archived on ASF-controlled storage.
> 
> - Patrick



Re: Design docs: consolidation and discoverability

Posted by Patrick Wendell <pw...@gmail.com>.
I actually don't totally see why we can't use Google Docs provided it
is clearly discoverable from the JIRA. It was my understanding that
many projects do this. Maybe not (?).

If it's a matter of maintaining public record on ASF infrastructure,
perhaps we can just automate that if an issue is closed we capture the
doc content and attach it to the JIRA as a PDF.

My sense is that in general the ASF infrastructure policy is becoming
more and more lenient with regard to using third-party services,
provided they are broadly accessible (such as a public Google Doc) and
can be definitively archived on ASF-controlled storage.

- Patrick

On Fri, Apr 24, 2015 at 4:57 PM, Sean Owen <so...@cloudera.com> wrote:
> I know I recently used Google Docs from a JIRA, so am guilty as
> charged. I don't think there are a lot of design docs in general, but
> the ones I've seen have simply pushed docs to a JIRA. (I did the same,
> mirroring PDFs of the Google Doc.) I don't think this is hard to
> follow.
>
> I think you can do what you like: make a JIRA and attach files. Make a
> WIP PR and attach your notes. Make a Google Doc if you're feeling
> transgressive.
>
> I don't see much of a problem to solve here. In practice there are
> plenty of workable options, all of which are mainstream, and so I do
> not see an argument that somehow this is solved by letting people make
> wikis.


Re: Design docs: consolidation and discoverability

Posted by Sean Owen <so...@cloudera.com>.
I know I recently used Google Docs from a JIRA, so am guilty as
charged. I don't think there are a lot of design docs in general, but
the ones I've seen have simply pushed docs to a JIRA. (I did the same,
mirroring PDFs of the Google Doc.) I don't think this is hard to
follow.

I think you can do what you like: make a JIRA and attach files. Make a
WIP PR and attach your notes. Make a Google Doc if you're feeling
transgressive.

I don't see much of a problem to solve here. In practice there are
plenty of workable options, all of which are mainstream, and so I do
not see an argument that somehow this is solved by letting people make
wikis.

On Fri, Apr 24, 2015 at 7:42 PM, Punyashloka Biswal
<pu...@gmail.com> wrote:
> Okay, I can understand wanting to keep Git history clean, and avoid
> bottlenecking on committers. Is it reasonable to establish a convention of
> having a label, component or (best of all) an issue type for issues that are
> associated with design docs? For example, if we used the existing
> "Brainstorming" issue type, and people put their design doc in the
> description of the ticket, it would be relatively easy to figure out what
> designs are in progress.
>
> Given the push-back against design docs in Git or on the wiki and the strong
> preference for keeping docs on ASF property, I'm a bit surprised that all
> the existing design docs are on Google Docs. Perhaps Apache should consider
> opening up parts of the wiki to a larger group, to better serve this use
> case.
>
> Punya
>
> On Fri, Apr 24, 2015 at 5:01 PM Patrick Wendell <pw...@gmail.com> wrote:
>>
>> Using our ASF git repository as a working area for design docs, it
>> seems potentially concerning to me. It's difficult process wise
>> because all commits need to go through committers and also, we'd
>> pollute our git history a lot with random incremental design updates.
>>
>> The git history is used a lot by downstream packagers, us during our
>> QA process, etc... we really try to keep it oriented around code
>> patches:
>>
>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog
>>
>> Committing a polished design doc along with a feature, maybe that's
>> something we could consider. But I still think JIRA is the best
>> location for these docs, consistent with what most other ASF projects
>> do that I know.
>>
>> On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger <co...@koeninger.org>
>> wrote:
>> > Why can't pull requests be used for design docs in Git if people who
>> > aren't
>> > committers want to contribute changes (as opposed to just comments)?
>> >
>> > On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen <so...@cloudera.com> wrote:
>> >
>> >> Only catch there is it requires commit access to the repo. We need a
>> >> way for people who aren't committers to write and collaborate (for
>> >> point #1)
>> >>
>> >> On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
>> >> <pu...@gmail.com> wrote:
>> >> > Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
>> >> > history requirement? Referring back to my Gradle example, it seems that
>> >> > https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
>> >> > is a really good way to see why the design doc evolved the way it
>> >> > did. When keeping the doc in Jira (presumably as an attachment) it's
>> >> > not easy to see what changed between successive versions of the doc.
>> >> >
>> >> > Punya
>> >> >
>> >> > On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <sa...@cloudera.com>
>> >> wrote:
>> >> >>
>> >> >> I think there are maybe two separate things we're talking about?
>> >> >>
>> >> >> 1. Design discussions and in-progress design docs.
>> >> >>
>> >> >> My two cents are that JIRA is the best place for this.  It allows
>> >> >> tracking the progression of a design across multiple PRs and
>> >> >> contributors.  A piece of useful feedback that I've gotten in the
>> >> >> past is to make design docs immutable.  When updating them in
>> >> >> response to feedback, post a new version rather than editing the
>> >> >> existing one.  This enables tracking the history of a design and
>> >> >> makes it possible to read comments about previous designs in
>> >> >> context.  Otherwise it's really difficult to understand why
>> >> >> particular approaches were chosen or abandoned.
>> >> >>
>> >> >> 2. Completed design docs for features that we've implemented.
>> >> >>
>> >> >> Perhaps less essential to project progress, but it would be really
>> >> >> lovely to have a central repository to all the projects design doc.
>> >> >> If anyone wants to step up to maintain it, it would be cool to have a
>> >> >> wiki page with links to all the final design docs posted on JIRA.
>> >> >>
>> >>



Re: Design docs: consolidation and discoverability

Posted by Punyashloka Biswal <pu...@gmail.com>.
Okay, I can understand wanting to keep Git history clean, and avoid
bottlenecking on committers. Is it reasonable to establish a convention of
having a label, component or (best of all) an issue type for issues that
are associated with design docs? For example, if we used the existing
"Brainstorming" issue type, and people put their design doc in the
description of the ticket, it would be relatively easy to figure out what
designs are in progress.

Given the push-back against design docs in Git or on the wiki and the
strong preference for keeping docs on ASF property, I'm a bit surprised
that all the existing design docs are on Google Docs. Perhaps Apache should
consider opening up parts of the wiki to a larger group, to better serve
this use case.

Punya

On Fri, Apr 24, 2015 at 5:01 PM Patrick Wendell <pw...@gmail.com> wrote:

> Using our ASF git repository as a working area for design docs, it
> seems potentially concerning to me. It's difficult process wise
> because all commits need to go through committers and also, we'd
> pollute our git history a lot with random incremental design updates.
>
> The git history is used a lot by downstream packagers, us during our
> QA process, etc... we really try to keep it oriented around code
> patches:
>
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog
>
> Committing a polished design doc along with a feature, maybe that's
> something we could consider. But I still think JIRA is the best
> location for these docs, consistent with what most other ASF projects
> do that I know.
>
> On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger <co...@koeninger.org>
> wrote:
> > Why can't pull requests be used for design docs in Git if people who
> aren't
> > committers want to contribute changes (as opposed to just comments)?
> >
> > On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen <so...@cloudera.com> wrote:
> >
> >> Only catch there is it requires commit access to the repo. We need a
> >> way for people who aren't committers to write and collaborate (for
> >> point #1)
> >>
> >> On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
> >> <pu...@gmail.com> wrote:
> >> > Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
> >> history
> >> > requirement? Referring back to my Gradle example, it seems that
> >> >
> >>
> https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
> >> > is a really good way to see why the design doc evolved the way it did.
> >> When
> >> > keeping the doc in Jira (presumably as an attachment) it's not easy to
> >> see
> >> > what changed between successive versions of the doc.
> >> >
> >> > Punya
> >> >
> >> > On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <sa...@cloudera.com>
> >> wrote:
> >> >>
> >> >> I think there are maybe two separate things we're talking about?
> >> >>
> >> >> 1. Design discussions and in-progress design docs.
> >> >>
> >> >> My two cents are that JIRA is the best place for this.  It allows
> >> tracking
> >> >> the progression of a design across multiple PRs and contributors.  A
> >> piece
> >> >> of useful feedback that I've gotten in the past is to make design
> docs
> >> >> immutable.  When updating them in response to feedback, post a new
> >> version
> >> >> rather than editing the existing one.  This enables tracking the
> >> history of
> >> >> a design and makes it possible to read comments about previous
> designs
> >> in
> >> >> context.  Otherwise it's really difficult to understand why
> particular
> >> >> approaches were chosen or abandoned.
> >> >>
> >> >> 2. Completed design docs for features that we've implemented.
> >> >>
> >> >> Perhaps less essential to project progress, but it would be really
> >> lovely
> >> >> to have a central repository to all the projects design doc.  If
> anyone
> >> >> wants to step up to maintain it, it would be cool to have a wiki page
> >> with
> >> >> links to all the final design docs posted on JIRA.
> >> >>
> >>
>

Re: Design docs: consolidation and discoverability

Posted by Patrick Wendell <pw...@gmail.com>.
Using our ASF git repository as a working area for design docs seems
potentially concerning to me. It's difficult process-wise because all
commits need to go through committers, and we'd also pollute our git
history a lot with random incremental design updates.

The git history is used a lot by downstream packagers, by us during our
QA process, etc. We really try to keep it oriented around code
patches:

https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog

Committing a polished design doc along with a feature is maybe
something we could consider. But I still think JIRA is the best
location for these docs, consistent with what most other ASF projects
that I know of do.

On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger <co...@koeninger.org> wrote:
> Why can't pull requests be used for design docs in Git if people who aren't
> committers want to contribute changes (as opposed to just comments)?
>
> On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen <so...@cloudera.com> wrote:
>
>> Only catch there is it requires commit access to the repo. We need a
>> way for people who aren't committers to write and collaborate (for
>> point #1)
>>
>> On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
>> <pu...@gmail.com> wrote:
>> > Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
>> history
>> > requirement? Referring back to my Gradle example, it seems that
>> >
>> https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
>> > is a really good way to see why the design doc evolved the way it did.
>> When
>> > keeping the doc in Jira (presumably as an attachment) it's not easy to
>> see
>> > what changed between successive versions of the doc.
>> >
>> > Punya
>> >
>> > On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <sa...@cloudera.com>
>> wrote:
>> >>
>> >> I think there are maybe two separate things we're talking about?
>> >>
>> >> 1. Design discussions and in-progress design docs.
>> >>
>> >> My two cents are that JIRA is the best place for this.  It allows
>> tracking
>> >> the progression of a design across multiple PRs and contributors.  A
>> piece
>> >> of useful feedback that I've gotten in the past is to make design docs
>> >> immutable.  When updating them in response to feedback, post a new
>> version
>> >> rather than editing the existing one.  This enables tracking the
>> history of
>> >> a design and makes it possible to read comments about previous designs
>> in
>> >> context.  Otherwise it's really difficult to understand why particular
>> >> approaches were chosen or abandoned.
>> >>
>> >> 2. Completed design docs for features that we've implemented.
>> >>
>> >> Perhaps less essential to project progress, but it would be really
>> lovely
>> >> to have a central repository to all the projects design doc.  If anyone
>> >> wants to step up to maintain it, it would be cool to have a wiki page
>> with
>> >> links to all the final design docs posted on JIRA.
>> >>
>>



Re: Design docs: consolidation and discoverability

Posted by Cody Koeninger <co...@koeninger.org>.
Why can't pull requests be used for design docs in Git if people who aren't
committers want to contribute changes (as opposed to just comments)?

On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen <so...@cloudera.com> wrote:

> Only catch there is it requires commit access to the repo. We need a
> way for people who aren't committers to write and collaborate (for
> point #1)
>
> On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
> <pu...@gmail.com> wrote:
> > Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
> history
> > requirement? Referring back to my Gradle example, it seems that
> >
> https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
> > is a really good way to see why the design doc evolved the way it did.
> When
> > keeping the doc in Jira (presumably as an attachment) it's not easy to
> see
> > what changed between successive versions of the doc.
> >
> > Punya
> >
> > On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <sa...@cloudera.com>
> wrote:
> >>
> >> I think there are maybe two separate things we're talking about?
> >>
> >> 1. Design discussions and in-progress design docs.
> >>
> >> My two cents are that JIRA is the best place for this.  It allows
> tracking
> >> the progression of a design across multiple PRs and contributors.  A
> piece
> >> of useful feedback that I've gotten in the past is to make design docs
> >> immutable.  When updating them in response to feedback, post a new
> version
> >> rather than editing the existing one.  This enables tracking the
> history of
> >> a design and makes it possible to read comments about previous designs
> in
> >> context.  Otherwise it's really difficult to understand why particular
> >> approaches were chosen or abandoned.
> >>
> >> 2. Completed design docs for features that we've implemented.
> >>
> >> Perhaps less essential to project progress, but it would be really
> lovely
> >> to have a central repository to all the projects design doc.  If anyone
> >> wants to step up to maintain it, it would be cool to have a wiki page
> with
> >> links to all the final design docs posted on JIRA.
> >>
>

Re: Design docs: consolidation and discoverability

Posted by Sean Owen <so...@cloudera.com>.
Only catch there is it requires commit access to the repo. We need a
way for people who aren't committers to write and collaborate (for
point #1)

On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
<pu...@gmail.com> wrote:
> Sandy, doesn't keeping (in-progress) design docs in Git satisfy the history
> requirement? Referring back to my Gradle example, it seems that
> https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
> is a really good way to see why the design doc evolved the way it did. When
> keeping the doc in Jira (presumably as an attachment) it's not easy to see
> what changed between successive versions of the doc.
>
> Punya
>
> On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <sa...@cloudera.com> wrote:
>>
>> I think there are maybe two separate things we're talking about?
>>
>> 1. Design discussions and in-progress design docs.
>>
>> My two cents are that JIRA is the best place for this.  It allows tracking
>> the progression of a design across multiple PRs and contributors.  A piece
>> of useful feedback that I've gotten in the past is to make design docs
>> immutable.  When updating them in response to feedback, post a new version
>> rather than editing the existing one.  This enables tracking the history of
>> a design and makes it possible to read comments about previous designs in
>> context.  Otherwise it's really difficult to understand why particular
>> approaches were chosen or abandoned.
>>
>> 2. Completed design docs for features that we've implemented.
>>
>> Perhaps less essential to project progress, but it would be really lovely
>> to have a central repository to all the projects design doc.  If anyone
>> wants to step up to maintain it, it would be cool to have a wiki page with
>> links to all the final design docs posted on JIRA.
>>



Re: Design docs: consolidation and discoverability

Posted by Punyashloka Biswal <pu...@gmail.com>.
Sandy, doesn't keeping (in-progress) design docs in Git satisfy the history
requirement? Referring back to my Gradle example, it seems that
https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
is a really good way to see why the design doc evolved the way it did. When
keeping the doc in Jira (presumably as an attachment) it's not easy to see
what changed between successive versions of the doc.

Punya

On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza <sa...@cloudera.com> wrote:

> I think there are maybe two separate things we're talking about?
>
> 1. Design discussions and in-progress design docs.
>
> My two cents are that JIRA is the best place for this.  It allows tracking
> the progression of a design across multiple PRs and contributors.  A piece
> of useful feedback that I've gotten in the past is to make design docs
> immutable.  When updating them in response to feedback, post a new version
> rather than editing the existing one.  This enables tracking the history of
> a design and makes it possible to read comments about previous designs in
> context.  Otherwise it's really difficult to understand why particular
> approaches were chosen or abandoned.
>
> 2. Completed design docs for features that we've implemented.
>
> Perhaps less essential to project progress, but it would be really lovely
> to have a central repository to all the projects design doc.  If anyone
> wants to step up to maintain it, it would be cool to have a wiki page with
> links to all the final design docs posted on JIRA.
>
> -Sandy
>
> On Fri, Apr 24, 2015 at 12:01 PM, Punyashloka Biswal <
> punya.biswal@gmail.com> wrote:
>
>> The Gradle dev team keep their design documents *checked into* their Git
>> repository -- see
>>
>> https://github.com/gradle/gradle/blob/master/design-docs/build-comparison.md
>> for example. The advantages I see to their approach are:
>>
>>    - design docs stay on ASF property (since Github is synced to the
>>    Apache-run Git repository)
>>    - design docs have a lifetime across PRs, but can still be modified and
>>    commented on through the mechanism of PRs
>>    - keeping a central location helps people to find good role models and
>>    converge on conventions
>>
>> Sean, I find it hard to use the central Jira as a jumping-off point for
>> understanding ongoing design work because a tiny fraction of the tickets
>> actually relate to design docs, and it's not easy from the outside to
>> figure out which ones are relevant.
>>
>> Punya
>>
>> On Fri, Apr 24, 2015 at 2:49 PM Sean Owen <so...@cloudera.com> wrote:
>>
>> > I think it's OK to have design discussions on github, as emails go to
>> > ASF lists. After all, loads of PR discussions happen there. It's easy
>> > for anyone to follow.
>> >
>> > I also would rather just discuss on Github, except for all that noise.
>> >
>> > It's not great to put discussions in something like Google Docs
>> > actually; the resulting doc needs to be pasted back to JIRA promptly
>> > if so. I suppose it's still better than a private conversation or not
>> > talking at all, but the principle is that one should be able to access
>> > any substantive decision or conversation by being tuned in to only the
>> > project systems of record -- mailing list, JIRA.
>> >
>> >
>> >
>> > On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin <rx...@databricks.com> wrote:
>> > > I'd love to see more design discussions consolidated in a single
>> > > place as well. That said, there are many practical challenges to
>> > > overcome. Some of them are out of our control:
>> > >
>> > > 1. For large features, it is fairly common to open a PR for
>> > > discussion, close the PR taking some feedback into account, and
>> > > reopen another one. You sort of lose the discussions that way.
>> > >
>> > > 2. With the way Jenkins is setup currently, Jenkins testing
>> > > introduces a lot of noise to GitHub pull requests, making it hard to
>> > > differentiate legitimate comments from noise. This is unfortunately
>> > > due to the fact that ASF won't allow our Jenkins bot to have API
>> > > privilege to post messages.
>> > >
>> > > 3. The Apache Way is that all development discussions need to happen
>> > > on ASF property, i.e. dev lists and JIRA. As a result, technically we
>> > > are not allowed to have development discussions on GitHub.
>> > >
>> > >
>> > > On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger <co...@koeninger.org>
>> > wrote:
>> > >>
>> > >> My 2 cents - I'd rather see design docs in github pull requests
>> (using
>> > >> plain text / markdown).  That doesn't require changing access or
>> adding
>> > >> people, and github PRs already allow for conversation / email
>> > >> notifications.
>> > >>
>> > >> Conversation is already split between jira and github PRs.  Having a
>> > third
>> > >> stream of conversation in Google Docs just leads to things being
>> > ignored.
>> > >>
>> > >> On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen <so...@cloudera.com>
>> wrote:
>> > >>
>> > >> > That would require giving wiki access to everyone or manually
>> adding
>> > >> > people
>> > >> > any time they make a doc.
>> > >> >
>> > >> > I don't see how this helps though. They're still docs on the
>> internet
>> > >> > and
>> > >> > they're still linked from the central project JIRA, which is what
>> you
>> > >> > should follow.
>> > >> >  On Apr 24, 2015 8:14 AM, "Punyashloka Biswal" <
>> > punya.biswal@gmail.com>
>> > >> > wrote:
>> > >> >
>> > >> > > Dear Spark devs,
>> > >> > >
>> > >> > > Right now, design docs are stored on Google docs and linked from
>> > >> > > tickets.
>> > >> > > For someone new to the project, it's hard to figure out what
>> > subjects
>> > >> > > are
>> > >> > > being discussed, what organization to follow for new feature
>> > >> > > proposals,
>> > >> > > etc.
>> > >> > >
>> > >> > > Would it make sense to consolidate future design docs in either a
>> > >> > > designated area on the Apache Confluence Wiki, or on GitHub's
>> Wiki
>> > >> > > pages?
>> > >> > > If people have a strong preference to keep the design docs on
>> Google
>> > >> > Docs,
>> > >> > > then could we have a top-level page on the confluence wiki that
>> > lists
>> > >> > > all
>> > >> > > active and archived design docs?
>> > >> > >
>> > >> > > Punya
>> > >> > >
>> > >> >
>> > >
>> > >
>> >
>>
>

Re: Design docs: consolidation and discoverability

Posted by Sandy Ryza <sa...@cloudera.com>.
I think there are maybe two separate things we're talking about?

1. Design discussions and in-progress design docs.

My two cents are that JIRA is the best place for this.  It allows tracking
the progression of a design across multiple PRs and contributors.  A piece
of useful feedback that I've gotten in the past is to make design docs
immutable.  When updating them in response to feedback, post a new version
rather than editing the existing one.  This enables tracking the history of
a design and makes it possible to read comments about previous designs in
context.  Otherwise it's really difficult to understand why particular
approaches were chosen or abandoned.

2. Completed design docs for features that we've implemented.

Perhaps less essential to project progress, but it would be really lovely
to have a central repository of all the project's design docs.  If anyone
wants to step up to maintain it, it would be cool to have a wiki page with
links to all the final design docs posted on JIRA.

-Sandy

On Fri, Apr 24, 2015 at 12:01 PM, Punyashloka Biswal <punya.biswal@gmail.com
> wrote:

> The Gradle dev team keep their design documents  *checked into* their Git
> repository -- see
>
> https://github.com/gradle/gradle/blob/master/design-docs/build-comparison.md
> for example. The advantages I see to their approach are:
>
>    - design docs stay on ASF property (since Github is synced to the
>    Apache-run Git repository)
>    - design docs have a lifetime across PRs, but can still be modified and
>    commented on through the mechanism of PRs
>    - keeping a central location helps people to find good role models and
>    converge on conventions
>
> Sean, I find it hard to use the central Jira as a jumping-off point for
> understanding ongoing design work because a tiny fraction of the tickets
> actually relate to design docs, and it's not easy from the outside to
> figure out which ones are relevant.
>
> Punya
>
> On Fri, Apr 24, 2015 at 2:49 PM Sean Owen <so...@cloudera.com> wrote:
>
> > I think it's OK to have design discussions on github, as emails go to
> > ASF lists. After all, loads of PR discussions happen there. It's easy
> > for anyone to follow.
> >
> > I also would rather just discuss on Github, except for all that noise.
> >
> > It's not great to put discussions in something like Google Docs
> > actually; the resulting doc needs to be pasted back to JIRA promptly
> > if so. I suppose it's still better than a private conversation or not
> > talking at all, but the principle is that one should be able to access
> > any substantive decision or conversation by being tuned in to only the
> > project systems of record -- mailing list, JIRA.
> >
> >
> >
> > On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin <rx...@databricks.com>
> wrote:
> > > I'd love to see more design discussions consolidated in a single place
> as
> > > well. That said, there are many practical challenges to overcome. Some
> of
> > > them are out of our control:
> > >
> > > 1. For large features, it is fairly common to open a PR for discussion,
> > > close the PR taking some feedback into account, and reopen another one.
> > You
> > > sort of lose the discussions that way.
> > >
> > > 2. With the way Jenkins is setup currently, Jenkins testing introduces
> a
> > lot
> > > of noise to GitHub pull requests, making it hard to differentiate
> > legitimate
> > > comments from noise. This is unfortunately due to the fact that ASF
> won't
> > > allow our Jenkins bot to have API privilege to post messages.
> > >
> > > 3. The Apache Way is that all development discussions need to happen on
> > ASF
> > > property, i.e. dev lists and JIRA. As a result, technically we are not
> > > allowed to have development discussions on GitHub.
> > >
> > >
> > > On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger <co...@koeninger.org>
> > wrote:
> > >>
> > >> My 2 cents - I'd rather see design docs in github pull requests (using
> > >> plain text / markdown).  That doesn't require changing access or
> adding
> > >> people, and github PRs already allow for conversation / email
> > >> notifications.
> > >>
> > >> Conversation is already split between jira and github PRs.  Having a
> > third
> > >> stream of conversation in Google Docs just leads to things being
> > ignored.
> > >>
> > >> On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen <so...@cloudera.com>
> wrote:
> > >>
> > >> > That would require giving wiki access to everyone or manually adding
> > >> > people
> > >> > any time they make a doc.
> > >> >
> > >> > I don't see how this helps though. They're still docs on the
> internet
> > >> > and
> > >> > they're still linked from the central project JIRA, which is what
> you
> > >> > should follow.
> > >> >  On Apr 24, 2015 8:14 AM, "Punyashloka Biswal" <
> > punya.biswal@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Dear Spark devs,
> > >> > >
> > >> > > Right now, design docs are stored on Google docs and linked from
> > >> > > tickets.
> > >> > > For someone new to the project, it's hard to figure out what
> > subjects
> > >> > > are
> > >> > > being discussed, what organization to follow for new feature
> > >> > > proposals,
> > >> > > etc.
> > >> > >
> > >> > > Would it make sense to consolidate future design docs in either a
> > >> > > designated area on the Apache Confluence Wiki, or on GitHub's Wiki
> > >> > > pages?
> > >> > > If people have a strong preference to keep the design docs on
> Google
> > >> > Docs,
> > >> > > then could we have a top-level page on the confluence wiki that
> > lists
> > >> > > all
> > >> > > active and archived design docs?
> > >> > >
> > >> > > Punya
> > >> > >
> > >> >
> > >
> > >
> >
>

Re: Design docs: consolidation and discoverability

Posted by Punyashloka Biswal <pu...@gmail.com>.
The Gradle dev team keep their design documents *checked into* their Git
repository -- see
https://github.com/gradle/gradle/blob/master/design-docs/build-comparison.md
for example. The advantages I see to their approach are:

   - design docs stay on ASF property (since GitHub is synced to the
   Apache-run Git repository)
   - design docs have a lifetime across PRs, but can still be modified and
   commented on through the mechanism of PRs
   - keeping a central location helps people to find good role models and
   converge on conventions

Sean, I find it hard to use the central Jira as a jumping-off point for
understanding ongoing design work because a tiny fraction of the tickets
actually relate to design docs, and it's not easy from the outside to
figure out which ones are relevant.

Punya

On Fri, Apr 24, 2015 at 2:49 PM Sean Owen <so...@cloudera.com> wrote:

> I think it's OK to have design discussions on github, as emails go to
> ASF lists. After all, loads of PR discussions happen there. It's easy
> for anyone to follow.
>
> I also would rather just discuss on Github, except for all that noise.
>
> It's not great to put discussions in something like Google Docs
> actually; the resulting doc needs to be pasted back to JIRA promptly
> if so. I suppose it's still better than a private conversation or not
> talking at all, but the principle is that one should be able to access
> any substantive decision or conversation by being tuned in to only the
> project systems of record -- mailing list, JIRA.
>
>
>
> On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin <rx...@databricks.com> wrote:
> > I'd love to see more design discussions consolidated in a single place as
> > well. That said, there are many practical challenges to overcome. Some of
> > them are out of our control:
> >
> > 1. For large features, it is fairly common to open a PR for discussion,
> > close the PR taking some feedback into account, and reopen another one.
> You
> > sort of lose the discussions that way.
> >
> > 2. With the way Jenkins is setup currently, Jenkins testing introduces a
> lot
> > of noise to GitHub pull requests, making it hard to differentiate
> legitimate
> > comments from noise. This is unfortunately due to the fact that ASF won't
> > allow our Jenkins bot to have API privilege to post messages.
> >
> > 3. The Apache Way is that all development discussions need to happen on
> ASF
> > property, i.e. dev lists and JIRA. As a result, technically we are not
> > allowed to have development discussions on GitHub.
> >
> >
> > On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger <co...@koeninger.org>
> wrote:
> >>
> >> My 2 cents - I'd rather see design docs in github pull requests (using
> >> plain text / markdown).  That doesn't require changing access or adding
> >> people, and github PRs already allow for conversation / email
> >> notifications.
> >>
> >> Conversation is already split between jira and github PRs.  Having a
> third
> >> stream of conversation in Google Docs just leads to things being
> ignored.
> >>
> >> On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> > That would require giving wiki access to everyone or manually adding
> >> > people
> >> > any time they make a doc.
> >> >
> >> > I don't see how this helps though. They're still docs on the internet
> >> > and
> >> > they're still linked from the central project JIRA, which is what you
> >> > should follow.
> >> >  On Apr 24, 2015 8:14 AM, "Punyashloka Biswal" <
> punya.biswal@gmail.com>
> >> > wrote:
> >> >
> >> > > Dear Spark devs,
> >> > >
> >> > > Right now, design docs are stored on Google docs and linked from
> >> > > tickets.
> >> > > For someone new to the project, it's hard to figure out what subjects
> >> > > are being discussed, what organization to follow for new feature
> >> > > proposals, etc.
> >> > >
> >> > > Would it make sense to consolidate future design docs in either a
> >> > > designated area on the Apache Confluence Wiki, or on GitHub's Wiki
> >> > > pages?
> >> > > If people have a strong preference to keep the design docs on Google
> >> > > Docs, then could we have a top-level page on the confluence wiki that
> >> > > lists all active and archived design docs?
> >> > >
> >> > > Punya
> >> > >
> >> >
> >
> >
>

Re: Design docs: consolidation and discoverability

Posted by Sean Owen <so...@cloudera.com>.
I think it's OK to have design discussions on github, as emails go to
ASF lists. After all, loads of PR discussions happen there. It's easy
for anyone to follow.

I also would rather just discuss on Github, except for all that noise.

It's not great to put discussions in something like Google Docs
actually; the resulting doc needs to be pasted back to JIRA promptly
if so. I suppose it's still better than a private conversation or not
talking at all, but the principle is that one should be able to access
any substantive decision or conversation by being tuned in to only the
project systems of record -- mailing list, JIRA.



On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin <rx...@databricks.com> wrote:
> I'd love to see more design discussions consolidated in a single place as
> well. That said, there are many practical challenges to overcome. Some of
> them are out of our control:
>
> 1. For large features, it is fairly common to open a PR for discussion,
> close the PR taking some feedback into account, and reopen another one. You
> sort of lose the discussions that way.
>
> 2. With the way Jenkins is set up currently, Jenkins testing introduces a lot
> of noise to GitHub pull requests, making it hard to differentiate legitimate
> comments from noise. This is unfortunately due to the fact that ASF won't
> allow our Jenkins bot to have API privilege to post messages.
>
> 3. The Apache Way is that all development discussions need to happen on ASF
> property, i.e. dev lists and JIRA. As a result, technically we are not
> allowed to have development discussions on GitHub.
>
>
> On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger <co...@koeninger.org> wrote:
>>
>> My 2 cents - I'd rather see design docs in github pull requests (using
>> plain text / markdown).  That doesn't require changing access or adding
>> people, and github PRs already allow for conversation / email
>> notifications.
>>
>> Conversation is already split between jira and github PRs.  Having a third
>> stream of conversation in Google Docs just leads to things being ignored.
>>
>> On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> > That would require giving wiki access to everyone or manually adding
>> > people
>> > any time they make a doc.
>> >
>> > I don't see how this helps though. They're still docs on the internet
>> > and
>> > they're still linked from the central project JIRA, which is what you
>> > should follow.
>> >  On Apr 24, 2015 8:14 AM, "Punyashloka Biswal" <pu...@gmail.com>
>> > wrote:
>> >
>> > > Dear Spark devs,
>> > >
>> > > Right now, design docs are stored on Google docs and linked from
>> > > tickets.
>> > > For someone new to the project, it's hard to figure out what subjects
>> > > are
>> > > being discussed, what organization to follow for new feature
>> > > proposals,
>> > > etc.
>> > >
>> > > Would it make sense to consolidate future design docs in either a
>> > > designated area on the Apache Confluence Wiki, or on GitHub's Wiki
>> > > pages?
>> > > If people have a strong preference to keep the design docs on Google
>> > > Docs, then could we have a top-level page on the confluence wiki that
>> > > lists all active and archived design docs?
>> > >
>> > > Punya
>> > >
>> >
>
>


Re: Design docs: consolidation and discoverability

Posted by Reynold Xin <rx...@databricks.com>.
I'd love to see more design discussions consolidated in a single place as
well. That said, there are many practical challenges to overcome. Some of
them are out of our control:

1. For large features, it is fairly common to open a PR for discussion,
close the PR taking some feedback into account, and reopen another one. You
sort of lose the discussions that way.

2. With the way Jenkins is set up currently, Jenkins testing introduces a
lot of noise to GitHub pull requests, making it hard to differentiate
legitimate comments from noise. This is unfortunately due to the fact that
ASF won't allow our Jenkins bot to have API privilege to post messages.

3. The Apache Way is that all development discussions need to happen on ASF
property, i.e. dev lists and JIRA. As a result, technically we are not
allowed to have development discussions on GitHub.


On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger <co...@koeninger.org> wrote:

> My 2 cents - I'd rather see design docs in github pull requests (using
> plain text / markdown).  That doesn't require changing access or adding
> people, and github PRs already allow for conversation / email
> notifications.
>
> Conversation is already split between jira and github PRs.  Having a third
> stream of conversation in Google Docs just leads to things being ignored.
>
> On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen <so...@cloudera.com> wrote:
>
> > That would require giving wiki access to everyone or manually adding people
> > any time they make a doc.
> >
> > I don't see how this helps though. They're still docs on the internet and
> > they're still linked from the central project JIRA, which is what you
> > should follow.
> >  On Apr 24, 2015 8:14 AM, "Punyashloka Biswal" <pu...@gmail.com>
> > wrote:
> >
> > > Dear Spark devs,
> > >
> > > Right now, design docs are stored on Google docs and linked from tickets.
> > > For someone new to the project, it's hard to figure out what subjects are
> > > being discussed, what organization to follow for new feature proposals,
> > > etc.
> > >
> > > Would it make sense to consolidate future design docs in either a
> > > designated area on the Apache Confluence Wiki, or on GitHub's Wiki pages?
> > > If people have a strong preference to keep the design docs on Google Docs,
> > > then could we have a top-level page on the confluence wiki that lists all
> > > active and archived design docs?
> > >
> > > Punya
> > >
> >
>

Re: Design docs: consolidation and discoverability

Posted by Cody Koeninger <co...@koeninger.org>.
My 2 cents - I'd rather see design docs in github pull requests (using
plain text / markdown).  That doesn't require changing access or adding
people, and github PRs already allow for conversation / email notifications.

Conversation is already split between jira and github PRs.  Having a third
stream of conversation in Google Docs just leads to things being ignored.

On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen <so...@cloudera.com> wrote:

> That would require giving wiki access to everyone or manually adding people
> any time they make a doc.
>
> I don't see how this helps though. They're still docs on the internet and
> they're still linked from the central project JIRA, which is what you
> should follow.
>  On Apr 24, 2015 8:14 AM, "Punyashloka Biswal" <pu...@gmail.com>
> wrote:
>
> > Dear Spark devs,
> >
> > Right now, design docs are stored on Google docs and linked from tickets.
> > For someone new to the project, it's hard to figure out what subjects are
> > being discussed, what organization to follow for new feature proposals,
> > etc.
> >
> > Would it make sense to consolidate future design docs in either a
> > designated area on the Apache Confluence Wiki, or on GitHub's Wiki pages?
> > If people have a strong preference to keep the design docs on Google Docs,
> > then could we have a top-level page on the confluence wiki that lists all
> > active and archived design docs?
> >
> > Punya
> >
>

Re: Design docs: consolidation and discoverability

Posted by Sean Owen <so...@cloudera.com>.
That would require giving wiki access to everyone or manually adding people
any time they make a doc.

I don't see how this helps though. They're still docs on the internet and
they're still linked from the central project JIRA, which is what you
should follow.
 On Apr 24, 2015 8:14 AM, "Punyashloka Biswal" <pu...@gmail.com>
wrote:

> Dear Spark devs,
>
> Right now, design docs are stored on Google docs and linked from tickets.
> For someone new to the project, it's hard to figure out what subjects are
> being discussed, what organization to follow for new feature proposals,
> etc.
>
> Would it make sense to consolidate future design docs in either a
> designated area on the Apache Confluence Wiki, or on GitHub's Wiki pages?
> If people have a strong preference to keep the design docs on Google Docs,
> then could we have a top-level page on the confluence wiki that lists all
> active and archived design docs?
>
> Punya
>