
Posted to dev@community.apache.org by jan i <ja...@apache.org> on 2015/04/18 11:44:54 UTC

Re: Re : Re: Project Visualization Tool...

On Saturday, April 18, 2015, <he...@free.fr> wrote:

> It was said that the new site would use native JSON instead of DOAP,
> but I'm not convinced at all, since DOAP is an invaluable source of info,
> documented, and so on

JSON is also a documented standard; it is generally better known and, I
believe, has more tools supporting it.


>
> then imho it would be better to generate json from doap
>
> I disabled the json edit feature recently since it will cause problems

which problems?

With a well-defined JSON format it is simple to generate the DOAP file.

I highly recommend staying with JSON and using it as the base for all our
central data.
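
As an illustration, here is a minimal Python sketch of generating a DOAP file
from JSON. The JSON field names (name, homepage, pmc) and the function name
json_to_doap are assumptions made for this example, not the actual
projects-new format; only the standard library is used.

```python
import json
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP_NS = "http://usefulinc.com/ns/doap#"
ASFEXT_NS = "http://projects.apache.org/ns/asfext#"


def json_to_doap(project_json: str) -> str:
    """Build a very small DOAP document from a JSON project description.

    The input keys "name", "homepage" and "pmc" are hypothetical.
    """
    data = json.loads(project_json)

    # Use the conventional prefixes instead of auto-generated ns0, ns1, ...
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("doap", DOAP_NS)
    ET.register_namespace("asfext", ASFEXT_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    project = ET.SubElement(
        root, f"{{{DOAP_NS}}}Project", {f"{{{RDF_NS}}}about": data["homepage"]}
    )
    ET.SubElement(project, f"{{{DOAP_NS}}}name").text = data["name"]
    ET.SubElement(
        project, f"{{{DOAP_NS}}}homepage", {f"{{{RDF_NS}}}resource": data["homepage"]}
    )
    # Link from the project to its PMC, as the DOAP files do with asfext:pmc.
    ET.SubElement(
        project, f"{{{ASFEXT_NS}}}pmc", {f"{{{RDF_NS}}}resource": data["pmc"]}
    )
    return ET.tostring(root, encoding="unicode")
```

A real generator would have to cover many more DOAP properties (releases,
repositories, categories), but the mapping stays mechanical once the JSON
structure is fixed.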

rgds
jan i



>
> regards
>
> Hervé
> ----- Original message -----
> From: Shane Curcuru <asf@shanecurcuru.org>
> To: dev@community.apache.org
> Sent: Sat, 18 Apr 2015 06:43:37 +0200 (CEST)
> Subject: Re: Project Visualization Tool...
>
> We had a great session, and a lot of energy, hopefully we can make some
> progress. One note: this needs to be a comdev PMC project, and we need
> to really plan the data part out if we want to be successful.
>
> Note that projects-new.a.o is the planned future replacement for
> projects.a.o - there are *significant* differences, so you need to look
> at the About page and the source repo. In particular, the new site uses
> its own new JSON generated sources which (I think) will no longer use
> the DOAPs.
>
> In particular, Infra currently does *not* consider either the data
> gathering (i.e. populating the JSON behind the projects-new site) nor
> the visualizations (current or ones we want to build) as core supported
> services. So whatever we build needs to be maintained by this PMC to
> start with.
>
> Also, Link dump of useful related bits: ----------------
>
> Old service, based on crappy cron jobs and DOAP files from projects:
> https://projects.apache.org/
>
> New service, soon to be infra supported, relying on JSON data generated
> by infra on a regular schedule:
> https://projects-new.apache.org/
>
> Useful PMC chair report helper, that surfaces a number of different
> statistics about your PMC(s), including mailing list stats,
> PMC/committer changes, some software releases, etc. etc. (Members have
> visibility to all PMCs):
> https://reporter.apache.org
>
> Rob Weir (AOO, Member) used to do some visualization stuff and might
> have code ideas:
> http://www.robweir.com/blog/2013/05/mapping-apache.html
>
> Ken Coar's old mailing list stats page:
>
> https://people.apache.org/~coar/mlists.html
>
> The AOO project wrote a mailing list visualizer for who talks to whom:
> https://blogs.apache.org/OOo/entry/visualizing_the_aoo_dev_list
>
> Some outside statistics FLOSSmole generated about Apache communities and
> lists:
> http://flossmole.org/category/tags/apache
>
> Random other interesting analytics:
> The Subversion project has the "contribulyzer"
>
>
>
> - Shane
>
>

-- 
Sent from My iPad, sorry for any misspellings.

Re: Project Visualization Tool...

Posted by sebb <se...@gmail.com>.

On 16 May 2015 at 08:31, Hervé BOUTEMY <he...@free.fr> wrote:
> On Saturday 16 May 2015 00:30:55 sebb wrote:
>> On 15 May 2015 at 23:28, Hervé BOUTEMY <he...@free.fr> wrote:
>> > On Friday 15 May 2015 15:34:47 sebb wrote:
>> >> > I think we really have some data model problem here regarding what is a
>> >> > "project's DOAP file": sometimes, a project is a PMC, sometimes a
>> >> > project
>> >> > is a deliverable, more like what is called in projectsnew.a.o a
>> >> > "sub-project"
>> >>
>> >> That is not how I understand DOAPs.
>> >>
>> >> DOAP == Description Of A Project
>> >>
>> >> i.e. some releaseable artifact.
>> >>
>> >> A single PMC may have multiple projects, each with its own releases
>> >> and repositories.
>> >> These are modelled quite well in the DOAPs that PMCs have created.
>> >
>> > +1
>> >
>> >> Information about the PMC which manages the projects is NOT stored in
>> >> a DOAP, it is stored in a PMC data file.
>> >> This is referenced from a DOAP using
>> >>
>> >> <asfext:pmc rdf:resource="URL"/>
>> >>
>> >> where URL is either an actual URL of a PMC data file or a dummy URL e.g.
>> >>
>> >> <asfext:pmc rdf:resource="http://<pmcname>.apache.org" />
>> >>
>> >> which leads to a file here:
>> >>
>> >> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/<pmcname>.rdf
>> >
>> > I'm not an RDF expert, but this Apache-specific algorithm to find the PMC rdf
>> > file seems strange: I understand it is coded/known from the projects.a.o xslt
>> > transformation
>> Yes.
>>
>> > But this should be usable from any RDF tooling, no?
>>
>> It's not currently usable except by using special processing.
>>
>> The problem is that the shorthand URL is used by all but about 4 of
>> the PMCs, so it would be a major challenge to get this fixed.
>>
>> Some PMCs are quick to fix such issues; some may take weeks or months
>> to fix even a simple error.
> I think that people don't understand this PMC information rdf file (I didn't
> until our current discussion).
> But with good explanations and the visualization help given by projects-new.a.o,
> we can go much faster: I'm ready to try once we're clear :)
>
>>
>> > Another problem I see with these PMC data rdf files is that they seem to
>> > not be really maintained: I doubt PMCs update PMC data rdf files on each
>> > PMC Chair change.
>>
>> Yes.
>>
>> > That's why I had the idea of generating/updating the chair when
>> > parsing committee-info.txt.
>>
>> Fair enough, but that does not mean the code needs to create yet
>> another RDF file.
> +1
> my intent was not to create a new one, but to replace it with generated info
>
>>
>> > But other information manually written in current PMC data rdf files can't
>> > be found anywhere else, AFAIK.
>>
>> Yes.
> that's where it hurts: we need to mix handwritten with generated content...
> we need to be clear on the process
>
>>
>> > Last problem: I personally really didn't understand this PMC data rdf
>> > file
>> > until now. I don't know who understands it :)
>> > IMHO, the magic algorithm to find the rdf file is a root cause.
>>
>> The PMC data file is documented here:
>>
>> http://projects.apache.org/docs/pmc.html
> yeah, I read it several times before; I knew I was not confident about what I
> read, and now I know I completely misread it until now.

Does it need clarifying? If so, what is not clear? How could it be improved?

>>
>> >> > if you look at https://projects-new.apache.org/projects.html?pmc,
>> >> > typical
>> >> > cases for that are:
>> >> > - Incubator: there is "the Incubator project", displayed without a DOAP
>> >> > file since the Incubator has special source info, and many sub-projects
>> >> > which provide DOAP files
>> >> > - Commons: there is no "Commons' DOAP file", so no TLP... one sub-project
>> >> > is quasi-randomly chosen... Commons' DOAP file, if it existed, would not
>> >> > release anything; it's a pure "organizational" project
>> >>
>> >> There is an ambiguity here: project can mean an organisational entity
>> >> and project can mean a releaseable artifact.
>> >>
>> >> There are different RDF files for the two meanings; only the artifact
>> >> has an associated DOAP.
>> >>
>> >> > - Ant: there is an Ant DOAP file that represent the TLP and the main
>> >> > released artifact
>> >>
>> >> No, it only links to the TLP = PMC data file, it does not represent the
>> >> TLP. The Ant DOAP file only represents the Ant product.
>> >
>> > ok, IIUC, I should rephrase
>> > https://projects-new.apache.org/project.html?ant : 1. "Top Level Project
>> > data:" to "Apache Committee data:"
>> > 2. "Project established:" to "Committee established:"
>>
>> That does not seem necessary.
>>
>> > 3. "Sub-projects (8):" to "Projects (8):", eventually boldening the TLP if
>> > one is the TLP
>>
>> No - none of the projects are the TLP.
> as said in the other thread, this assertion is confusing: "none of the
> projects are the Top Level Project"

On reflection, I think I was wrong about that.
The TLP is the original project which the PMC was created to manage.

>> The TLP / PMC is not the same as any of its projects.
>>
>> Most PMCs happen to have the same name as one of their projects, but
>> they are distinct entities.
>>
>> To take the Ant example, there needs to be an Ant PMC/TLP page and a
>> separate Ant project page.
>> These should be linked somehow.
>>
>> > and I should rename tlps.json to committees.json (and update code
>> > accordingly)
>> No need.
> given this problem with "a TLP is not a project", I think using committee or
> PMC would avoid confusion
>
>>
>> > then on https://projects-new.apache.org/ , do we really want to graph TLPs
>> > evolution or committees?
>>
>> No idea
> ok, for a later discussion :)
>
>>
>> > I suppose commons can be called a TLP, even if it does not have any "main"
>> > project that is the effective TLP
>>
>> Yes, Commons is a TLP/PMC.

Scratch that: Commons is a PMC; there is no real Commons TLP.

>>
>> I don't think it's helpful to think of PMCs having a "main" project.
>>
>> PMCs have one or more projects; each project has a single PMC.
>>
>> > comdev is not really a TLP: should probably not be listed in projects
>> > list,
>> > but as "special committee not producing projects"?
>>
>> Well, it is responsible for this mailing list and is probably
>> responsible for the projects.a.o website.
>>
>> > is Labs a TLP? or like comdev?
>>
>> What does committee-info.txt say?
> these are normal committees
> but from a software perspective, they're not expected to produce any project
> AFAIK; that's why I think they are special compared to the other 161 PMCs that
> are meant to produce projects
>
>>
>> > I suppose we can hard-code the list of committees that are not expected to
>> > have projects, the list should not change often: Labs and comdev seem to
>> > be
>> > the only 2 (that extend special committees from 5 to 7)
>> >
>> > and finally, in https://projects-new.apache.org/
>> > change "163 top level software projects
>> > 107 sub-projects" to "270 projects managed by 163 committees" (or 161 if
>> > labs and comdev are special committees)
>> >
>> >
>> > this seems to make sense
>> > if no objection, I'll code it
>> >
>> > Regards,
>> >
>> > Hervé
>> >
>> >> > I chose Commons, but it could have been HttpComponents or Logging
>> >> > Services, or Lucene (Lucene has been very clear that there is a
>> >> > "Lucene core" sub-project), Web Services, Axis, Xalan, Xerces, XML Graphics,
>> >> > Attic, Creadur, DB, jUDDI, Tcl
>> >> >
>> >> > I chose Ant, but it could have been Velocity, MINA, Directory, HTTP
>> >> > Server,
>> >> > MyFaces, Tomcat
>> >> >
>> >> >> - (future) UI additions for *other* places.  It would be awesome, for
>> >> >> example, to provide a tiny scriptlet that any project could inject in
>> >> >> their website that displays a "see also" menu.  That would link to a
>> >> >> specific URL on projects.a.o that would say "hey, you came from
>> >> >> Cassandra, here are: -other big data projects, -other projects in
>> >> >> Java,
>> >> >> -other projects with the same committers... etc." as a service.
>> >> >>
>> >> >> - Shane
>> >> >
>> >> > I'll continue tonight on this
>> >> > Any help appreciated
>> >> >
>> >> > Regards,
>> >> >
>> >> > Hervé
>

Re: Project Visualization Tool...

Posted by Hervé BOUTEMY <he...@free.fr>.

On Saturday 16 May 2015 00:30:55 sebb wrote:
> On 15 May 2015 at 23:28, Hervé BOUTEMY <he...@free.fr> wrote:
> > On Friday 15 May 2015 15:34:47 sebb wrote:
> >> > I think we really have some data model problem here regarding what is a
> >> > "project's DOAP file": sometimes, a project is a PMC, sometimes a
> >> > project
> >> > is a deliverable, more like what is called in projectsnew.a.o a
> >> > "sub-project"
> >> 
> >> That is not how I understand DOAPs.
> >> 
> >> DOAP == Description Of A Project
> >> 
> >> i.e. some releaseable artifact.
> >> 
> >> A single PMC may have multiple projects, each with its own releases
> >> and repositories.
> >> These are modelled quite well in the DOAPs that PMCs have created.
> > 
> > +1
> > 
> >> Information about the PMC which manages the projects is NOT stored in
> >> a DOAP, it is stored in a PMC data file.
> >> This is referenced from a DOAP using
> >> 
> >> <asfext:pmc rdf:resource="URL"/>
> >> 
> >> where URL is either an actual URL of a PMC data file or a dummy URL e.g.
> >> 
> >> <asfext:pmc rdf:resource="http://<pmcname>.apache.org" />
> >> 
> >> which leads to a file here:
> >> 
> >> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/<pmcname>.rdf
> >
> > I'm not an RDF expert, but this Apache-specific algorithm to find the PMC rdf
> > file seems strange: I understand it is coded/known from the projects.a.o xslt
> > transformation
> Yes.
> 
> > But this should be usable from any RDF tooling, no?
> 
> It's not currently usable except by using special processing.
> 
> The problem is that the shorthand URL is used by all but about 4 of
> the PMCs, so it would be a major challenge to get this fixed.
> 
> Some PMCs are quick to fix such issues; some may take weeks or months
> to fix even a simple error.
I think that people don't understand this PMC information rdf file (I didn't
until our current discussion).
But with good explanations and the visualization help given by projects-new.a.o,
we can go much faster: I'm ready to try once we're clear :)

> 
> > Another problem I see with these PMC data rdf files is that they seem to
> > not be really maintained: I doubt PMCs update PMC data rdf files on each
> > PMC Chair change.
> 
> Yes.
> 
> > That's why I had the idea of generating/updating the chair when
> > parsing committee-info.txt.
> 
> Fair enough, but that does not mean the code needs to create yet
> another RDF file.
+1
my intent was not to create a new one, but to replace it with generated info

> 
> > But other information manually written in current PMC data rdf files can't
> > be found anywhere else, AFAIK.
> 
> Yes.
that's where it hurts: we need to mix handwritten with generated content...
we need to be clear on the process
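
One possible process, as a minimal Python sketch: keep the handwritten PMC
data as the base and let the generated values win only for an explicit list
of fields. The field names ("chair", "charter") and the function
merge_pmc_data are hypothetical, chosen just to show the idea.

```python
# Fields whose value should always come from generated data
# (e.g. parsed from committee-info.txt); an assumption for this sketch.
GENERATED_FIELDS = {"chair"}


def merge_pmc_data(handwritten: dict, generated: dict) -> dict:
    """Return handwritten data with the generated fields overriding it."""
    merged = dict(handwritten)  # copy, so the handwritten input is untouched
    for key in GENERATED_FIELDS:
        if key in generated:
            merged[key] = generated[key]
    return merged
```

Anything not listed in GENERATED_FIELDS stays under manual control, which
keeps the rule for "who owns which field" in one place.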

> 
> > Last problem: I personally really didn't understand this PMC data rdf
> > file
> > until now. I don't know who understands it :)
> > IMHO, the magic algorithm to find the rdf file is a root cause.
> 
> The PMC data file is documented here:
> 
> http://projects.apache.org/docs/pmc.html
yeah, I read it several times before; I knew I was not confident about what I
read, and now I know I completely misread it until now.

> 
> >> > if you look at https://projects-new.apache.org/projects.html?pmc,
> >> > typical
> >> > cases for that are:
> >> > - Incubator: there is "the Incubator project", displayed without a DOAP
> >> > file since the Incubator has special source info, and many sub-projects
> >> > which provide DOAP files
> >> > - Commons: there is no "Commons' DOAP file", so no TLP... one sub-project
> >> > is quasi-randomly chosen... Commons' DOAP file, if it existed, would not
> >> > release anything; it's a pure "organizational" project
> >> 
> >> There is an ambiguity here: project can mean an organisational entity
> >> and project can mean a releaseable artifact.
> >> 
> >> There are different RDF files for the two meanings; only the artifact
> >> has an associated DOAP.
> >> 
> >> > - Ant: there is an Ant DOAP file that represent the TLP and the main
> >> > released artifact
> >> 
> >> No, it only links to the TLP = PMC data file, it does not represent the
> >> TLP. The Ant DOAP file only represents the Ant product.
> > 
> > ok, IIUC, I should rephrase
> > https://projects-new.apache.org/project.html?ant : 1. "Top Level Project
> > data:" to "Apache Committee data:"
> > 2. "Project established:" to "Committee established:"
> 
> That does not seem necessary.
> 
> > 3. "Sub-projects (8):" to "Projects (8):", eventually boldening the TLP if
> > one is the TLP
> 
> No - none of the projects are the TLP.
as said in the other thread, this assertion is confusing: "none of the 
projects are the Top Level Project"

> The TLP / PMC is not the same as any of its projects.
> 
> Most PMCs happen to have the same name as one of their projects, but
> they are distinct entities.
> 
> To take the Ant example, there needs to be an Ant PMC/TLP page and a
> separate Ant project page.
> These should be linked somehow.
> 
> > and I should rename tlps.json to committees.json (and update code
> > accordingly)
> No need.
given this problem with "a TLP is not a project", I think using committee or 
PMC would avoid confusion

> 
> > then on https://projects-new.apache.org/ , do we really want to graph TLPs
> > evolution or committees?
> 
> No idea
ok, for a later discussion :)

> 
> > I suppose commons can be called a TLP, even if it does not have any "main"
> > project that is the effective TLP
> 
> Yes, Commons is a TLP/PMC.
> 
> I don't think it's helpful to think of PMCs having a "main" project.
> 
> PMCs have one or more projects; each project has a single PMC.
> 
> > comdev is not really a TLP: should probably not be listed in projects
> > list,
> > but as "special committee not producing projects"?
> 
> Well, it is responsible for this mailing list and is probably
> responsible for the projects.a.o website.
> 
> > is Labs a TLP? or like comdev?
> 
> What does committee-info.txt say?
these are normal committees
but from a software perspective, they're not expected to produce any project
AFAIK; that's why I think they are special compared to the other 161 PMCs that
are meant to produce projects
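
A minimal Python sketch of that idea: a hard-coded set of committees that are
not expected to produce projects, excluded when counting. The identifiers
"labs" and "comdev" and the shape of the inputs are assumptions for this
example, not the actual projects-new data.

```python
# Committees not expected to release any project (hypothetical ids).
NO_PROJECT_COMMITTEES = {"labs", "comdev"}


def summarize(committee_ids, project_to_committee):
    """Build the summary line, ignoring committees without projects.

    committee_ids: iterable of committee ids
    project_to_committee: dict mapping project id -> committee id
    """
    producing = set(committee_ids) - NO_PROJECT_COMMITTEES
    return (
        f"{len(project_to_committee)} projects managed by "
        f"{len(producing)} committees"
    )
```

With the figures discussed here, 163 committees minus these 2 gives the 161
mentioned above.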

> 
> > I suppose we can hard-code the list of committees that are not expected to
> > have projects, the list should not change often: Labs and comdev seem to
> > be
> > the only 2 (that extend special committees from 5 to 7)
> > 
> > and finally, in https://projects-new.apache.org/
> > change "163 top level software projects
> > 107 sub-projects" to "270 projects managed by 163 committees" (or 161 if
> > labs and comdev are special committees)
> > 
> > 
> > this seems to make sense
> > if no objection, I'll code it
> > 
> > Regards,
> > 
> > Hervé
> > 
> >> > I chose Commons, but it could have been HttpComponents or Logging
> >> > Services, or Lucene (Lucene has been very clear that there is a
> >> > "Lucene core" sub-project), Web Services, Axis, Xalan, Xerces, XML Graphics,
> >> > Attic, Creadur, DB, jUDDI, Tcl
> >> > 
> >> > I chose Ant, but it could have been Velocity, MINA, Directory, HTTP
> >> > Server,
> >> > MyFaces, Tomcat
> >> > 
> >> >> - (future) UI additions for *other* places.  It would be awesome, for
> >> >> example, to provide a tiny scriptlet that any project could inject in
> >> >> their website that displays a "see also" menu.  That would link to a
> >> >> specific URL on projects.a.o that would say "hey, you came from
> >> >> Cassandra, here are: -other big data projects, -other projects in
> >> >> Java,
> >> >> -other projects with the same committers... etc." as a service.
> >> >> 
> >> >> - Shane
> >> > 
> >> > I'll continue tonight on this
> >> > Any help appreciated
> >> > 
> >> > Regards,
> >> > 
> >> > Hervé

Re: Project Visualization Tool...

Posted by sebb <se...@gmail.com>.

On 16 May 2015 at 00:30, sebb <se...@gmail.com> wrote:
> On 15 May 2015 at 23:28, Hervé BOUTEMY <he...@free.fr> wrote:
>> On Friday 15 May 2015 15:34:47 sebb wrote:
>>> > I think we really have some data model problem here regarding what is a
>>> > "project's DOAP file": sometimes, a project is a PMC, sometimes a project
>>> > is a deliverable, more like what is called in projectsnew.a.o a
>>> > "sub-project"
>>> That is not how I understand DOAPs.
>>>
>>> DOAP == Description Of A Project
>>>
>>> i.e. some releaseable artifact.
>>>
>>> A single PMC may have multiple projects, each with its own releases
>>> and repositories.
>>> These are modelled quite well in the DOAPs that PMCs have created.
>> +1
>>
>>> Information about the PMC which manages the projects is NOT stored in
>>> a DOAP, it is stored in a PMC data file.
>>> This is referenced from a DOAP using
>>>
>>> <asfext:pmc rdf:resource="URL"/>
>>>
>>> where URL is either an actual URL of a PMC data file or a dummy URL e.g.
>>>
>>> <asfext:pmc rdf:resource="http://<pmcname>.apache.org" />
>>>
>>> which leads to a file here:
>>>
>>> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/<pmcname>.rdf
>> I'm not an RDF expert, but this Apache-specific algorithm to find the PMC rdf file seems
>> strange: I understand it is coded/known from the projects.a.o xslt transformation
>
> Yes.
>
>> But this should be usable from any RDF tooling, no?
>
> It's not currently usable except by using special processing.
>
> The problem is that the shorthand URL is used by all but about 4 of
> the PMCs, so it would be a major challenge to get this fixed.
>
> Some PMCs are quick to fix such issues; some may take weeks or months
> to fix even a simple error.
>
>> Another problem I see with these PMC data rdf files is that they seem to not be
>> really maintained: I doubt PMCs update PMC data rdf files on each PMC Chair
>> change.
>
> Yes.
>
>> That's why I had the idea of generating/updating the chair when
>> parsing committee-info.txt.
>
> Fair enough, but that does not mean the code needs to create yet
> another RDF file.
>
>> But other information manually written in current PMC data rdf files can't be
>> found anywhere else, AFAIK.
>>
>
> Yes.
>
>> Last problem: I personally really didn't understand this PMC data rdf file
>> until now. I don't know who understands it :)
>> IMHO, the magic algorithm to find the rdf file is a root cause.
>
> The PMC data file is documented here:
>
> http://projects.apache.org/docs/pmc.html
>
>>> > if you look at https://projects-new.apache.org/projects.html?pmc, typical
>>> > cases for that are:
>>> > - Incubator: there is "the Incubator project", displayed without a DOAP
>>> > file since the Incubator has special source info, and many sub-projects
>>> > which provide DOAP files
>>> > - Commons: there is no "Commons' DOAP file", so no TLP... one sub-project
>>> > is quasi-randomly chosen... Commons' DOAP file, if it existed, would not
>>> > release anything; it's a pure "organizational" project
>>>
>>> There is an ambiguity here: project can mean an organisational entity
>>> and project can mean a releaseable artifact.
>>>
>>> There are different RDF files for the two meanings; only the artifact
>>> has an associated DOAP.
>>>
>>> > - Ant: there is an Ant DOAP file that represent the TLP and the main
>>> > released artifact
>>>
>>> No, it only links to the TLP = PMC data file, it does not represent the TLP.
>>> The Ant DOAP file only represents the Ant product.
>> ok, IIUC, I should rephrase https://projects-new.apache.org/project.html?ant :
>> 1. "Top Level Project data:" to "Apache Committee data:"
>> 2. "Project established:" to "Committee established:"
>
> That does not seem necessary.
>
>> 3. "Sub-projects (8):" to "Projects (8):", eventually boldening the TLP if one
>> is the TLP
>
> No - none of the projects are the TLP.
> The TLP / PMC is not the same as any of its projects.
>
> Most PMCs happen to have the same name as one of their projects, but
> they are distinct entities.

Note that the Creadur PMC does not have a Creadur project.

> To take the Ant example, there needs to be an Ant PMC/TLP page and a
> separate Ant project page.
> These should be linked somehow.
>
>> and I should rename tlps.json to committees.json (and update code accordingly)
>
> No need.
>
>> then on https://projects-new.apache.org/ , do we really want to graph TLPs
>> evolution or committees?
>
> No idea
>
>> I suppose commons can be called a TLP, even if it does not have any "main"
>> project that is the effective TLP
>
> Yes, Commons is a TLP/PMC.
>
> I don't think it's helpful to think of PMCs having a "main" project.
>
> PMCs have one or more projects; each project has a single PMC.
>
>> comdev is not really a TLP: should probably not be listed in projects list,
>> but as "special committee not producing projects"?
>
> Well, it is responsible for this mailing list and is probably
> responsible for the projects.a.o website.
>
>> is Labs a TLP? or like comdev?
>
> What does committee-info.txt say?
>
>> I suppose we can hard-code the list of committees that are not expected to
>> have projects, the list should not change often: Labs and comdev seem to be
>> the only 2 (that extend special committees from 5 to 7)
>>
>> and finally, in https://projects-new.apache.org/
>> change "163 top level software projects
>> 107 sub-projects" to "270 projects managed by 163 committees" (or 161 if labs
>> and comdev are special committees)
>>
>>
>> this seems to make sense
>> if no objection, I'll code it
>>
>> Regards,
>>
>> Hervé
>>
>>>
>>> > I chose Commons, but it could have been HttpComponents or Logging
>>> > Services, or Lucene (Lucene has been very clear that there is a "Lucene
>>> > core" sub-project), Web Services, Axis, Xalan, Xerces, XML Graphics,
>>> > Attic, Creadur, DB, jUDDI, Tcl
>>> >
>>> > I chose Ant, but it could have been Velocity, MINA, Directory, HTTP
>>> > Server,
>>> > MyFaces, Tomcat
>>> >
>>> >> - (future) UI additions for *other* places.  It would be awesome, for
>>> >> example, to provide a tiny scriptlet that any project could inject in
>>> >> their website that displays a "see also" menu.  That would link to a
>>> >> specific URL on projects.a.o that would say "hey, you came from
>>> >> Cassandra, here are: -other big data projects, -other projects in Java,
>>> >> -other projects with the same committers... etc." as a service.
>>> >>
>>> >> - Shane
>>> >
>>> > I'll continue tonight on this
>>> > Any help appreciated
>>> >
>>> > Regards,
>>> >
>>> > Hervé
>>

Re: Project Visualization Tool...

Posted by sebb <se...@gmail.com>.

On 15 May 2015 at 23:28, Hervé BOUTEMY <he...@free.fr> wrote:
> On Friday 15 May 2015 15:34:47 sebb wrote:
>> > I think we really have some data model problem here regarding what is a
>> > "project's DOAP file": sometimes, a project is a PMC, sometimes a project
>> > is a deliverable, more like what is called in projectsnew.a.o a
>> > "sub-project"
>> That is not how I understand DOAPs.
>>
>> DOAP == Description Of A Project
>>
>> i.e. some releaseable artifact.
>>
>> A single PMC may have multiple projects, each with its own releases
>> and repositories.
>> These are modelled quite well in the DOAPs that PMCs have created.
> +1
>
>> Information about the PMC which manages the projects is NOT stored in
>> a DOAP, it is stored in a PMC data file.
>> This is referenced from a DOAP using
>>
>> <asfext:pmc rdf:resource="URL"/>
>>
>> where URL is either an actual URL of a PMC data file or a dummy URL e.g.
>>
>> <asfext:pmc rdf:resource="http://<pmcname>.apache.org" />
>>
>> which leads to a file here:
>>
>> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/<pmcname>.rdf
> I'm not an RDF expert, but this Apache-specific algorithm to find the PMC rdf file seems
> strange: I understand it is coded/known from the projects.a.o xslt transformation

Yes.

> But this should be usable from any RDF tooling, no?

It's not currently usable except by using special processing.

The problem is that the shorthand URL is used by all but about 4 of
the PMCs, so it would be a major challenge to get this fixed.

Some PMCs are quick to fix such issues; some may take weeks or months
to fix even a simple error.
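
For reference, the special processing for the shorthand URL can be sketched
roughly as below. This is only an illustration in Python: the function name
and the exact matching rule are assumptions, and the real logic lives in the
projects.a.o XSLT, which may differ in detail.

```python
import re

# Location of the PMC data files, as given earlier in this thread.
DATA_FILES_BASE = (
    "https://svn.apache.org/repos/asf/infrastructure/"
    "site-tools/trunk/projects/data_files/"
)

# Shorthand form: http://<pmcname>.apache.org with no path (assumed rule).
_SHORTHAND = re.compile(r"^https?://([a-z0-9-]+)\.apache\.org/?$")


def resolve_pmc_data_file(resource: str) -> str:
    """Map an asfext:pmc rdf:resource value to the PMC data file URL."""
    match = _SHORTHAND.match(resource.strip())
    if match:
        # Dummy URL: look the PMC name up in the shared data_files directory.
        return f"{DATA_FILES_BASE}{match.group(1)}.rdf"
    # Anything else is treated as the actual URL of a PMC data file.
    return resource
```

Generic RDF tooling has no way to know about this mapping, which is why the
shorthand only works with Apache-specific code.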

> Another problem I see with these PMC data rdf files is that they seem to not be
> really maintained: I doubt PMCs update PMC data rdf files on each PMC Chair
> change.

Yes.

> That's why I had the idea of generating/updating the chair when
> parsing committee-info.txt.

Fair enough, but that does not mean the code needs to create yet
another RDF file.

> But other information manually written in current PMC data rdf files can't be
> found anywhere else, AFAIK.
>

Yes.

> Last problem: I personally really didn't understand this PMC data rdf file
> until now. I don't know who understands it :)
> IMHO, the magic algorithm to find the rdf file is a root cause.

The PMC data file is documented here:

http://projects.apache.org/docs/pmc.html

>> > if you look at https://projects-new.apache.org/projects.html?pmc, typical
>> > cases for that are:
>> > - Incubator: there is "the Incubator project", displayed without a DOAP
>> > file since the Incubator has special source info, and many sub-projects
>> > which provide DOAP files
>> > - Commons: there is no "Commons' DOAP file", so no TLP... one sub-project
>> > is quasi-randomly chosen... Commons' DOAP file, if it existed, would not
>> > release anything; it's a pure "organizational" project
>>
>> There is an ambiguity here: project can mean an organisational entity
>> and project can mean a releaseable artifact.
>>
>> There are different RDF files for the two meanings; only the artifact
>> has an associated DOAP.
>>
>> > - Ant: there is an Ant DOAP file that represent the TLP and the main
>> > released artifact
>>
>> No, it only links to the TLP = PMC data file, it does not represent the TLP.
>> The Ant DOAP file only represents the Ant product.
> ok, IIUC, I should rephrase https://projects-new.apache.org/project.html?ant :
> 1. "Top Level Project data:" to "Apache Committee data:"
> 2. "Project established:" to "Committee established:"

That does not seem necessary.

> 3. "Sub-projects (8):" to "Projects (8):", eventually boldening the TLP if one
> is the TLP

No - none of the projects are the TLP.
The TLP / PMC is not the same as any of its projects.

Most PMCs happen to have the same name as one of their projects, but
they are distinct entities.

To take the Ant example, there needs to be an Ant PMC/TLP page and a
separate Ant project page.
These should be linked somehow.

> and I should rename tlps.json to committees.json (and update code accordingly)

No need.

> then on https://projects-new.apache.org/ , do we really want to graph TLPs
> evolution or committees?

No idea

> I suppose commons can be called a TLP, even if it does not have any "main"
> project that is the effective TLP

Yes, Commons is a TLP/PMC.

I don't think it's helpful to think of PMCs having a "main" project.

PMCs have one or more projects; each project has a single PMC.
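
That relationship can be written down as a small data model. The Python
sketch below is only illustrative: the class and attribute names are
assumptions, not the structure used by projects-new.a.o.

```python
from dataclasses import dataclass, field


# eq=False keeps identity comparison, so comparing objects never recurses
# through the project <-> committee back-references.
@dataclass(eq=False)
class Committee:
    """A PMC: an organisational entity, not a releasable artifact."""
    name: str
    projects: list = field(default_factory=list)

    def add_project(self, name: str) -> "Project":
        project = Project(name=name, pmc=self)
        self.projects.append(project)
        return project


@dataclass(eq=False)
class Project:
    """A releasable artifact described by a DOAP; has exactly one PMC."""
    name: str
    pmc: Committee


# Example: the Ant PMC and the Ant project share a name but are distinct.
ant_pmc = Committee("Ant")
ant = ant_pmc.add_project("Ant")
ivy = ant_pmc.add_project("Ivy")
```

Modelled this way, no project needs to be marked as the "main" one, and a
PMC whose name matches none of its projects is not a special case.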

> comdev is not really a TLP: should probably not be listed in projects list,
> but as "special committee not producing projects"?

Well, it is responsible for this mailing list and is probably
responsible for the projects.a.o website.

> is Labs a TLP? or like comdev?

What does committee-info.txt say?

> I suppose we can hard-code the list of committees that are not expected to
> have projects, the list should not change often: Labs and comdev seem to be
> the only 2 (that extend special committees from 5 to 7)
>
> and finally, in https://projects-new.apache.org/
> change "163 top level software projects
> 107 sub-projects" to "270 projects managed by 163 committees" (or 161 if labs
> and comdev are special committees)
>
>
> this seems to make sense
> if no objection, I'll code it
>
> Regards,
>
> Hervé
>
>>
>> > I chose Commons, but it could have been HttpComponents or Logging
>> > Services, or Lucene (Lucene has been very clear that there is a "Lucene
>> > core" sub-project), Web Services, Axis, Xalan, Xerces, XML Graphics,
>> > Attic, Creadur, DB, jUDDI, Tcl
>> >
>> > I chose Ant, but it could have been Velocity, MINA, Directory, HTTP
>> > Server,
>> > MyFaces, Tomcat
>> >
>> >> - (future) UI additions for *other* places.  It would be awesome, for
>> >> example, to provide a tiny scriptlet that any project could inject in
>> >> their website that displays a "see also" menu.  That would link to a
>> >> specific URL on projects.a.o that would say "hey, you came from
>> >> Cassandra, here are: -other big data projects, -other projects in Java,
>> >> -other projects with the same committers... etc." as a service.
>> >>
>> >> - Shane
>> >
>> > I'll continue tonight on this
>> > Any help appreciated
>> >
>> > Regards,
>> >
>> > Hervé
>

Re: Project Visualization Tool...

Posted by Hervé BOUTEMY <he...@free.fr>.

On Friday, 15 May 2015 at 15:34:47, sebb wrote:
> > I think we really have some data model problem here regarding what is a
> > "project's DOAP file": sometimes, a project is a PMC, sometimes a project
> > is a deliverable, more like what is called in projectsnew.a.o a
> > "sub-project"
> That is not how I understand DOAPs.
> 
> DOAP == Description Of A Project
> 
> i.e. some releaseable artifact.
> 
> A single PMC may have multiple projects, each with its own releases
> and repositories.
> These are modelled quite well in the DOAPs that PMCs have created.
+1

> Information about the PMC which manages the projects is NOT stored in
> a DOAP, it is stored in a PMC data file.
> This is referenced from a DOAP using
> 
> <asfext:pmc rdf:resource="URL"/>
> 
> where URL is either an actual URL of a PMC data file or a dummy URL e.g.
> 
> <asfext:pmc rdf:resource="http://<pmcname>.apache.org" />
> 
> which leads to a file here:
> 
> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/<pmcname>.rdf
I'm no RDF expert, but this Apache-specific algorithm for finding the PMC rdf
file seems strange: I understand it is coded in, and known from, the
projects.a.o XSLT transformation.
But shouldn't this be usable from any RDF tooling?

Another problem I see with these PMC data rdf files is that they do not seem
to be really maintained: I doubt PMCs update them on each PMC Chair change.
That's why I had the idea of generating/updating the chair when parsing
committee-info.txt.
But other information manually written in the current PMC data rdf files can't
be found anywhere else, AFAIK.

Last problem: I personally didn't really understand this PMC data rdf file
until now. I don't know who understands it :)
IMHO, the magic algorithm to find the rdf file is a root cause.

> > if you look at https://projects-new.apache.org/projects.html?pmc, typical
> > cases for that are:
> > - Incubator: there is the "the Incubator project", displayed without DOAP
> > file since the incubator has special source info, and many sub-projects
> > which provide DOAP files
> > - Commons: there is no "Commons' DOAP file", then no TLP... one sub-project
> > is quasi randomly chosen... Commons' DOAP file, if it existed, would not
> > release anything, it's a pure "organizational" project
> 
> There is an ambiguity here: project can mean an organisational entity
> and project can mean a releaseable artifact.
> 
> There are different RDF files for the two meanings; only the artifact
> has an associated DOAP.
> 
> > - Ant: there is an Ant DOAP file that represent the TLP and the main
> > released artifact
> 
> No, it only links to the TLP = PMC data file, it does not represent the TLP.
> The Ant DOAP file only represents the Ant product.
OK, IIUC, I should rephrase https://projects-new.apache.org/project.html?ant :
1. "Top Level Project data:" to "Apache Committee data:"
2. "Project established:" to "Committee established:"
3. "Sub-projects (8):" to "Projects (8):", possibly putting one project in
bold if it is the TLP

and I should rename tlps.json to committees.json (and update the code
accordingly)

Then on https://projects-new.apache.org/ , do we really want to graph the
evolution of TLPs, or of committees?
I suppose Commons can be called a TLP, even if it does not have any "main"
project that is effectively the TLP.
comdev is not really a TLP: it should probably not be listed in the projects
list, but as a "special committee not producing projects"?
Is Labs a TLP, or is it like comdev?
I suppose we can hard-code the list of committees that are not expected to
have projects, since the list should not change often: Labs and comdev seem to
be the only 2 (which extends the special committees from 5 to 7)

and finally, on https://projects-new.apache.org/
change "163 top level software projects
107 sub-projects" to "270 projects managed by 163 committees" (or 161 if Labs
and comdev are special committees)


This seems to make sense.
If there is no objection, I'll code it.

Regards,

Hervé

> 
> > I chose Commons, but it could have been HttpComponents or Logging
> > Services, or Lucene (Lucene have been very clear that there is a "Lucene
> > core" sub- project), Web Services, Axis, Xalan, Xerces, XML Graphics,
> > Attic, Creadur, DB, jUDDI, Tcl
> > 
> > I chose Ant, but it could have been Velocity, MINA, Directory, HTTP
> > Server,
> > MyFaces, Tomcat
> > 
> >> - (future) UI additions for *other* places.  It would be awesome, for
> >> example, to provide a tiny scriptlet that any project could inject in
> >> their website that displays a "see also" menu.  That would link to a
> >> specific URL on projects.a.o that would say "hey, you came from
> >> Cassandra, here are: -other big data projects, -other projects in Java,
> >> -other projects with the same committers... etc." as a service.
> >> 
> >> - Shane
> > 
> > I'll continue tonight on this
> > Any help appreciated
> > 
> > Regards,
> > 
> > Hervé

Re: Project Visualization Tool...

Posted by sebb <se...@gmail.com>.

On 5 May 2015 at 07:38, Hervé BOUTEMY <he...@free.fr> wrote:
> On Saturday, 18 April 2015 at 10:55:00, Shane Curcuru wrote:
>> LOL, below.
>>
>> I highly recommend separating the model from the views, so that we can
>> efficiently enable our volunteer's energy here to actually accomplish
>> something valuable.
> +1
>
>>
>> So let's work on stuff to do that excites us, but remember to keep the
>> technical problems focused on what this PMC believes we can truly create
>> and maintain going forward.
>>
>> Don't worry about everything at once.  Just focus on separate bits:
>>
>> - Method to scrape source data from our various definitive or even not
>> completely definitive but very close places (txt files, websites, LDAP)
>>
>> - Model and data source that actually holds info about committer lists
>> and project metadata.  I'm betting Daniels' projects-new does this very
>> well already.
> +1 it's a perfect starting point: just need to document and continue to
> improve
> then I started by documenting what are the current information sources used
> for generating projects-new.a.o json files:
> see https://projects-new.apache.org/json/foundation/ and
> http://svn.apache.org/viewvc/comdev/projects.apache.org/scripts/README.txt?view=markup
>
>>
>> ----------
>> - Stable API to get at that model.  Would be really nice if we did this
>> just once, so that people working above here don't interfere with people
>> working below here.
>> ----------
> +1
>
> Since there are multiple information sources for TLPs/PMCs/committers, I think
> I will consolidate to avoid what's currently happening: the projects.js (ie
> one visualization) contains a lot of code to consolidate the multiple
> information sources
> If the consolidation is done server side, in the generation scripts, it will
> be easier to use for projects.js and any other tool wanting to do other future
> visualizations
>
>>
>> - Visualizations.  There's lots of different stuff to do here, and I
>> think it'd be super helpful if everyone just did something they want,
>> and then show us the code.
> +1
>
>>
>> Sure, there's lots of "what is important" to focus on, but I for one
>> would love to see real examples of all the cool visualization libraries
>> out there, and I know a couple folks already use some of them.
>>
>> - UI additions for the projects-new/projects websites, which are
>> featured at the top level of a.o.  I.e., this is our "projects
>> directory", how can we better lead people who arrive there at what they
>> want to know?
> at the moment, I'm not trying to add any new UI, but improve the consistency
> of displayed data, since current state is not really consistent: some PMCs are
> not displayed, probably because they have not provided any DOAP file. But even
> without DOAP file, we have a lot of data to display for a TLP, most of what we
> display for a TLP (ie a project that does not have any subproject)
>
> I think we really have some data model problem here regarding what is a
> "project's DOAP file": sometimes, a project is a PMC, sometimes a project is a
> deliverable, more like what is called in projectsnew.a.o a "sub-project"

That is not how I understand DOAPs.

DOAP == Description Of A Project

i.e. some releaseable artifact.

A single PMC may have multiple projects, each with its own releases
and repositories.
These are modelled quite well in the DOAPs that PMCs have created.

Information about the PMC which manages the projects is NOT stored in
a DOAP, it is stored in a PMC data file.

This is referenced from a DOAP using

<asfext:pmc rdf:resource="URL"/>

where URL is either an actual URL of a PMC data file or a dummy URL e.g.

<asfext:pmc rdf:resource="http://<pmcname>.apache.org" />

which leads to a file here:

https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/<pmcname>.rdf
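
The lookup described above could be sketched roughly as follows. This is only
my reading of the rule, not the actual projects.a.o transformation: the
function name is hypothetical, and it assumes that a "dummy URL" is a bare
http://<pmcname>.apache.org host with no path.

```python
from urllib.parse import urlparse

# Directory named above as the home of the PMC data files.
DATA_FILES = ("https://svn.apache.org/repos/asf/infrastructure/"
              "site-tools/trunk/projects/data_files/")


def pmc_data_file(resource: str) -> str:
    """Return the URL of the PMC data file for an asfext:pmc rdf:resource value."""
    parsed = urlparse(resource)
    host = parsed.hostname or ""
    path = parsed.path.strip("/")
    # Assumption: a bare <pmcname>.apache.org host with no path is a dummy URL.
    if host.endswith(".apache.org") and not path:
        pmcname = host[: -len(".apache.org")]
        return DATA_FILES + pmcname + ".rdf"
    # Otherwise the resource is taken to be the actual URL of a PMC data file.
    return resource


print(pmc_data_file("http://ant.apache.org"))
```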


> if you look at https://projects-new.apache.org/projects.html?pmc, typical
> cases for that are:
> - Incubator: there is the "the Incubator project", displayed without DOAP file
> since the incubator has special source info, and many sub-projects which
> provide DOAP files
> - Commons: there is no "Commons' DOAP file", then no TLP... one sub-project is
> quasi randomly chosen... Commons' DOAP file, if it existed, would not release
> anything, it's a pure "organizational" project

There is an ambiguity here: project can mean an organisational entity
and project can mean a releaseable artifact.

There are different RDF files for the two meanings; only the artifact
has an associated DOAP.

> - Ant: there is an Ant DOAP file that represent the TLP and the main released
> artifact

No, it only links to the TLP = PMC data file, it does not represent the TLP.
The Ant DOAP file only represents the Ant product.

> I chose Commons, but it could have been HttpComponents or Logging Services, or
> Lucene (Lucene have been very clear that there is a "Lucene core" sub-
> project), Web Services, Axis, Xalan, Xerces, XML Graphics, Attic, Creadur, DB,
> jUDDI, Tcl
>
> I chose Ant, but it could have been Velocity, MINA, Directory, HTTP Server,
> MyFaces, Tomcat
>
>
>>
>> - (future) UI additions for *other* places.  It would be awesome, for
>> example, to provide a tiny scriptlet that any project could inject in
>> their website that displays a "see also" menu.  That would link to a
>> specific URL on projects.a.o that would say "hey, you came from
>> Cassandra, here are: -other big data projects, -other projects in Java,
>> -other projects with the same committers... etc." as a service.
>>
>> - Shane
>
> I'll continue tonight on this
> Any help appreciated
>
> Regards,
>
> Hervé
>

Re: Project Visualization Tool...

Posted by Hervé BOUTEMY <he...@free.fr>.

On Saturday, 18 April 2015 at 10:55:00, Shane Curcuru wrote:
> LOL, below.
> 
> I highly recommend separating the model from the views, so that we can
> efficiently enable our volunteer's energy here to actually accomplish
> something valuable.
+1

> 
> So let's work on stuff to do that excites us, but remember to keep the
> technical problems focused on what this PMC believes we can truly create
> and maintain going forward.
> 
> Don't worry about everything at once.  Just focus on separate bits:
> 
> - Method to scrape source data from our various definitive or even not
> completely definitive but very close places (txt files, websites, LDAP)
> 
> - Model and data source that actually holds info about committer lists
> and project metadata.  I'm betting Daniels' projects-new does this very
> well already.
+1 it's a perfect starting point: we just need to document it and continue to
improve it.
So I started by documenting the current information sources used for
generating the projects-new.a.o json files:
see https://projects-new.apache.org/json/foundation/ and
http://svn.apache.org/viewvc/comdev/projects.apache.org/scripts/README.txt?view=markup

> 
> ----------
> - Stable API to get at that model.  Would be really nice if we did this
> just once, so that people working above here don't interfere with people
> working below here.
> ----------
+1

Since there are multiple information sources for TLPs/PMCs/committers, I think
I will consolidate them to avoid what's currently happening: projects.js (i.e.
one visualization) contains a lot of code to consolidate the multiple
information sources.
If the consolidation is done server side, in the generation scripts, the data
will be easier to use for projects.js and for any other tool wanting to do
future visualizations.
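
A rough sketch of what such a server-side consolidation step could look like.
All names and sample values are placeholders, not the real generation scripts,
and the precedence rule (data parsed from committee-info.txt overrides
hand-written PMC data) is an assumption made for the example.

```python
import json
from typing import Dict


def consolidate(committee_info: Dict[str, dict],
                pmc_rdf: Dict[str, dict]) -> Dict[str, dict]:
    """Merge per-committee records from two sources into a single mapping."""
    merged = {}
    for name in sorted(set(committee_info) | set(pmc_rdf)):
        entry = dict(pmc_rdf.get(name, {}))         # start from hand-written data
        entry.update(committee_info.get(name, {}))  # assumed: committee-info wins
        merged[name] = entry
    return merged


# Placeholder records standing in for the parsed sources.
committee_info = {"ant": {"chair": "Chair From Committee Info"},
                  "comdev": {"chair": "Another Chair"}}
pmc_rdf = {"ant": {"chair": "Stale Chair", "charter": "placeholder charter"}}

committees = consolidate(committee_info, pmc_rdf)
# One consolidated JSON document that any visualization can read directly.
print(json.dumps(committees, indent=2, sort_keys=True))
```

With the merge done once in the generation scripts, projects.js and any other
consumer would only have to read the resulting file.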

> 
> - Visualizations.  There's lots of different stuff to do here, and I
> think it'd be super helpful if everyone just did something they want,
> and then show us the code.
+1

> 
> Sure, there's lots of "what is important" to focus on, but I for one
> would love to see real examples of all the cool visualization libraries
> out there, and I know a couple folks already use some of them.
> 
> - UI additions for the projects-new/projects websites, which are
> featured at the top level of a.o.  I.e., this is our "projects
> directory", how can we better lead people who arrive there at what they
> want to know?
At the moment, I'm not trying to add any new UI, but to improve the
consistency of the displayed data, since the current state is not really
consistent: some PMCs are not displayed, probably because they have not
provided any DOAP file. But even without a DOAP file, we have a lot of data to
display for a TLP, namely most of what we display for a TLP (i.e. a project
that does not have any sub-project).

I think we really have a data model problem here regarding what a
"project's DOAP file" is: sometimes a project is a PMC, sometimes a project is
a deliverable, more like what projects-new.a.o calls a "sub-project".

If you look at https://projects-new.apache.org/projects.html?pmc, typical
cases for that are:
- Incubator: there is "the Incubator project", displayed without a DOAP file
since the Incubator has special source info, and many sub-projects which
provide DOAP files
- Commons: there is no "Commons' DOAP file", then no TLP... one sub-project is
quasi randomly chosen... Commons' DOAP file, if it existed, would not release
anything, it's a pure "organizational" project
- Ant: there is an Ant DOAP file that represents the TLP and the main released
artifact

I chose Commons, but it could have been HttpComponents or Logging Services, or
Lucene (Lucene have been very clear that there is a "Lucene core"
sub-project), Web Services, Axis, Xalan, Xerces, XML Graphics, Attic, Creadur,
DB, jUDDI, Tcl

I chose Ant, but it could have been Velocity, MINA, Directory, HTTP Server, 
MyFaces, Tomcat


> 
> - (future) UI additions for *other* places.  It would be awesome, for
> example, to provide a tiny scriptlet that any project could inject in
> their website that displays a "see also" menu.  That would link to a
> specific URL on projects.a.o that would say "hey, you came from
> Cassandra, here are: -other big data projects, -other projects in Java,
> -other projects with the same committers... etc." as a service.
> 
> - Shane

I'll continue tonight on this
Any help appreciated

Regards,

Hervé

Re: Re : Re: Project Visualization Tool...

Posted by Shane Curcuru <as...@shanecurcuru.org>.

LOL, below.

I highly recommend separating the model from the views, so that we can
efficiently enable our volunteers' energy here to actually accomplish
something valuable.

So let's work on stuff to do that excites us, but remember to keep the
technical problems focused on what this PMC believes we can truly create
and maintain going forward.

Don't worry about everything at once.  Just focus on separate bits:

- Method to scrape source data from our various definitive or even not
completely definitive but very close places (txt files, websites, LDAP)

- Model and data source that actually holds info about committer lists
and project metadata.  I'm betting Daniel's projects-new does this very
well already.

----------
- Stable API to get at that model.  Would be really nice if we did this
just once, so that people working above here don't interfere with people
working below here.
----------

- Visualizations.  There's lots of different stuff to do here, and I
think it'd be super helpful if everyone just did something they want,
and then show us the code.

Sure, there's lots of "what is important" to focus on, but I for one
would love to see real examples of all the cool visualization libraries
out there, and I know a couple folks already use some of them.

- UI additions for the projects-new/projects websites, which are
featured at the top level of a.o.  I.e., this is our "projects
directory", how can we better lead people who arrive there at what they
want to know?

- (future) UI additions for *other* places.  It would be awesome, for
example, to provide a tiny scriptlet that any project could inject in
their website that displays a "see also" menu.  That would link to a
specific URL on projects.a.o that would say "hey, you came from
Cassandra, here are: -other big data projects, -other projects in Java,
-other projects with the same committers... etc." as a service.

- Shane


On 4/18/15 5:44 AM, jan i wrote:
> On Saturday, April 18, 2015, <he...@free.fr> wrote:
> 
>> It was told the new site would use native json, instead of doap
>> But I'm not convinced at all, since Doap is an invaluable source of info,
>> documented, and so on
> 
> json is also a documented standard, that in general is more known, and I
> believe has more tools supporting it.
> 
> 
>>
>> then imho it would be better to generate json from doap
>>
>> I disabled the json edit feature recently since it will cause problems
> 
> which problems?
> 
> with a defined json it is simple to generate the doap file.
> 
> I highly recommend staying at json and using that as base for all our
> central data.
> 
> rgds
> jan i
> 
> 
> 
>>
>> regards
>>
>> Hervé
>> ----- Original message -----
>> From: Shane Curcuru <asf@shanecurcuru.org>
>> To: dev@community.apache.org
>> Sent: Sat, 18 Apr 2015 06:43:37 +0200 (CEST)
>> Subject: Re: Project Visualization Tool...
>>
>> We had a great session, and a lot of energy, hopefully we can make some
>> progress. One note: this needs to be a comdev PMC project, and we need
>> to really plan the data part out if we want to be successful.
>>
>> Note that projects-new.a.o is the planned future replacement for
>> projects.a.o - there are *significant* differences, so you need to look
>> at the About page and the source repo. In particular, the new site uses
>> its own new JSON generated sources which (I think) will no longer use
>> the DOAPs.
>>
>> In particular, Infra currently does *not* consider either the data
>> gathering (i.e. populating the JSON behind the projects-new site) nor
>> the visualizations (current or ones we want to build) as core supported
>> services. So whatever we build needs to be maintained by this PMC to
>> start with.
>>
>> Also, Link dump of useful related bits: ----------------
>>
>> Old service, based on crappy cron jobs and DOAP files from projects:
>> https://projects.apache.org/
>>
>> New service, soon to be infra supported, relying on JSON data generated
>> by infra on a regular schedule:
>> https://projects-new.apache.org/
>>
>> Useful PMC chair report helper, that surfaces a number of different
>> statistics about your PMC(s), including mailing list stats,
>> PMC/committer changes, some software releases, etc. etc. (Members have
>> visibility to all PMCs):
>> https://reporter.apache.org
>>
>> Rob Weir (AOO, Member) used to do some visualization stuff and might
>> have code ideas:
>> http://www.robweir.com/blog/2013/05/mapping-apache.html
>>
>> Ken Coar's old mailing list stats page:
>>
>> https://people.apache.org/~coar/mlists.html
>>
>> The AOO project wrote a mailing list visualizer for who talks to whom:
>> https://blogs.apache.org/OOo/entry/visualizing_the_aoo_dev_list
>>
>> Some outside statistics FLOSSmole generated about Apache communities and
>> lists:
>> http://flossmole.org/category/tags/apache
>>
>> Random other interesting analytics:
>> The Subversion project has the "contribulyzer"
>>
>>
>>
>> - Shane
>>
>>
>

Re : Re: Re : Re: Project Visualization Tool...

Posted by he...@free.fr.

Yes, I have no problem with json vs xml: the question is more about defining
the schema, as DOAP did, and writing documentation so that projects know where
to publish which information.

Editing the currently generated json just creates a new information source,
without any documentation.

My point is: AFAIK, the purpose of the site is to display info in newer ways,
so json generated from every existing piece of information is great, as would
be any other format that better suits some other visualization.

But if we're creating a new source of information that competes with an
existing one, this has to be done with great care about documentation,
explanation of how to migrate, and so on.

Of course the raw format is not an issue: no religion here on xml vs json vs
yaml vs ...

Regards

Hervé 


----- Original message -----
From: jan i <ja...@apache.org>
To: dev@community.apache.org
Sent: Sat, 18 Apr 2015 11:44:54 +0200 (CEST)
Subject: Re: Re : Re: Project Visualization Tool...

On Saturday, April 18, 2015, <he...@free.fr> wrote:

> It was told the new site would use native json, instead of doap
> But I'm not convinced at all, since Doap is an invaluable source of info,
> documented, and so on

json is also a documented standard, that in general is more known, and I
believe has more tools supporting it.


>
> then imho it would be better to generate json from doap
>
> I disabled the json edit feature recently since it will cause problems

which problems?

with a defined json it is simple to generate the doap file.
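
To illustrate that point, here is a minimal sketch of generating a DOAP
document from a JSON-style record. The keys "name", "homepage" and "pmc" are
hypothetical and do not reflect the actual projects-new schema; only a few
DOAP properties are emitted.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"
ASFEXT = "http://projects.apache.org/ns/asfext#"

# Use readable prefixes instead of the generated ns0/ns1 ones.
ET.register_namespace("rdf", RDF)
ET.register_namespace("doap", DOAP)
ET.register_namespace("asfext", ASFEXT)


def json_to_doap(project: dict) -> str:
    """Serialize one project record (hypothetical keys) as a small DOAP file."""
    root = ET.Element(f"{{{RDF}}}RDF")
    proj = ET.SubElement(root, f"{{{DOAP}}}Project",
                         {f"{{{RDF}}}about": project["homepage"]})
    ET.SubElement(proj, f"{{{DOAP}}}name").text = project["name"]
    ET.SubElement(proj, f"{{{DOAP}}}homepage",
                  {f"{{{RDF}}}resource": project["homepage"]})
    # Link to the managing PMC rather than describing it in the DOAP itself.
    ET.SubElement(proj, f"{{{ASFEXT}}}pmc",
                  {f"{{{RDF}}}resource": project["pmc"]})
    return ET.tostring(root, encoding="unicode")


record = {"name": "Apache Ant",
          "homepage": "http://ant.apache.org",
          "pmc": "http://ant.apache.org"}
print(json_to_doap(record))
```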

I highly recommend staying at json and using that as base for all our
central data.

rgds
jan i



>
> regards
>
> Hervé
> ----- Original message -----
> From: Shane Curcuru <asf@shanecurcuru.org>
> To: dev@community.apache.org
> Sent: Sat, 18 Apr 2015 06:43:37 +0200 (CEST)
> Subject: Re: Project Visualization Tool...
>
> We had a great session, and a lot of energy, hopefully we can make some
> progress. One note: this needs to be a comdev PMC project, and we need
> to really plan the data part out if we want to be successful.
>
> Note that projects-new.a.o is the planned future replacement for
> projects.a.o - there are *significant* differences, so you need to look
> at the About page and the source repo. In particular, the new site uses
> its own new JSON generated sources which (I think) will no longer use
> the DOAPs.
>
> In particular, Infra currently does *not* consider either the data
> gathering (i.e. populating the JSON behind the projects-new site) nor
> the visualizations (current or ones we want to build) as core supported
> services. So whatever we build needs to be maintained by this PMC to
> start with.
>
> Also, Link dump of useful related bits: ----------------
>
> Old service, based on crappy cron jobs and DOAP files from projects:
> https://projects.apache.org/
>
> New service, soon to be infra supported, relying on JSON data generated
> by infra on a regular schedule:
> https://projects-new.apache.org/
>
> Useful PMC chair report helper, that surfaces a number of different
> statistics about your PMC(s), including mailing list stats,
> PMC/committer changes, some software releases, etc. etc. (Members have
> visibility to all PMCs):
> https://reporter.apache.org
>
> Rob Weir (AOO, Member) used to do some visualization stuff and might
> have code ideas:
> http://www.robweir.com/blog/2013/05/mapping-apache.html
>
> Ken Coar's old mailing list stats page:
>
> https://people.apache.org/~coar/mlists.html
>
> The AOO project wrote a mailing list visualizer for who talks to whom:
> https://blogs.apache.org/OOo/entry/visualizing_the_aoo_dev_list
>
> Some outside statistics FLOSSmole generated about Apache communities and
> lists:
> http://flossmole.org/category/tags/apache
>
> Random other interesting analytics:
> The Subversion project has the "contribulyzer"
>
>
>
> - Shane
>
>
