Posted to dev@community.apache.org by sebb <se...@gmail.com> on 2015/07/19 10:48:17 UTC

Unnecessary SVN commits [was: svn commit: r1691273 - in /comdev/projects.apache.org/site: doap/cxf/cxf.rdf doap/httpd/httpd.rdf json/foundation/projects.json json/projects/cxf.json json/projects/httpd.json]

On 15 July 2015 at 22:11,  <hb...@apache.org> wrote:
> Author: hboutemy
> Date: Wed Jul 15 21:11:32 2015
> New Revision: 1691273
>
> URL: http://svn.apache.org/r1691273
> Log:
> import projects DOAP files updates
>
> Modified:
>     comdev/projects.apache.org/site/doap/cxf/cxf.rdf
>     comdev/projects.apache.org/site/doap/httpd/httpd.rdf
>     comdev/projects.apache.org/site/json/foundation/projects.json
>     comdev/projects.apache.org/site/json/projects/cxf.json
>     comdev/projects.apache.org/site/json/projects/httpd.json

Why are these copies being committed to SVN?

Projects-old makes do with a local copy of the files which it keeps in
sync with the ones listed in files.xml

It seems wasteful and unnecessary to create new backup copies in SVN.

AFAICT they are bound to be out of date as they are committed manually.

Furthermore there is also a  danger that the wrong copy may be updated
by someone.

Re: Unnecessary SVN commits [was: svn commit: r1691273 - in /comdev/projects.apache.org/site: doap/cxf/cxf.rdf doap/httpd/httpd.rdf json/foundation/projects.json json/projects/cxf.json json/projects/httpd.json]

Posted by Hervé BOUTEMY <he...@free.fr>.
On Wednesday 22 July 2015 00:16:40 sebb wrote:
> On 21 July 2015 at 06:45, Hervé BOUTEMY <he...@free.fr> wrote:
> > On Monday 20 July 2015 01:43:11 sebb wrote:
> >> On 19 July 2015 at 14:18, Hervé BOUTEMY <he...@free.fr> wrote:
> >> > time to explain what I have in mind, because I understand the reactions
> >> > about these svn content questions: but I need to explain why I think
> >> > that
> >> > it's not a bug, it's a feature :)
> >> > 
> >> > 
> >> > 1. generated json files in svn
> >> > 
> >> > even if they are generated, these ones are IMHO useful to ease people
> >> > just
> >> > wanting to work on information rendering, ie the site's html+javascript
> >> 
> >> The current files can still be accessed from the web server; they
> >> don't have to be in SVN to be useful.
> > 
> > seems I was not clear: the question is not the web server.
> > The question is the lambda ASF committer who does not have access to the
> > web server but would like to contribute to the web part, fix an issue he
> > sees on the live site: currently, one svn checkout, read STRUCTURE.txt
> > and start your local web server, and you can fix any html+css+javascript
> > issue
> > 
> >> > Experience with releases.json not being in svn in the first place told
> >> > me
> >> > that not having whole json content in svn was just increasing barrier
> >> > to
> >> > commits from whole ASF committers to projects directory visualization
> >> 
> >> Or maybe it was just that the file formats were not clearly documented.
> > 
> > no, the problem was not the format, it was the data (even if format
> > documentation is something we need also).
> 
> But how can one provide the data if the format is not clearly documented?
Currently, every JSON file is generated: nobody has to write such data by hand,
only run the scripts that generate the content.
This does not mean that documentation is not useful, just that it is a little
less critical.
But if we were to hand-write these JSON files, then yes, documentation would be
the first step.

Disclaimer: I do want to write documentation, I just don't know how to do it for
JSON files. I'm interested in a demo on one file from people with experience in
this topic.

> 
> >> > 2. doap files in svn (copies of parsed content or generated ones)
> >> > 
> >> > From the beginning of my work on projects-new, I had a question in
> >> > mind:
> >> > is
> >> > DOAP itself a problem (since not easy, not well understood), or are
> >> > there
> >> > just problems about the way DOAP is used and explained to ASF
> >> > committers
> >> > (= not DOAP experts, if DOAP experts exist)?
> >> > 
> >> > Any discussion on this list about that question led to some people
> >> > wanting to simply drop DOAP, because for them, implicitly, the format
> >> > itself/only was the problem, without answering previous question (and
> >> > without providing a better alternative = the show stopper for me: no,
> >> > simply telling "json" is not a sufficient answer, there has at least to
> >> > be a schema)
> >> 
> >> Indeed.
> >> Abandoning DOAP and using JSON will just lead to exactly the same
> >> problem down the line: *unless* the JSON schema is well designed and
> >> documented. Likewise for any other replacement.
> > 
> > +1
> > 
> >> It's usually obvious to the code/data developers who create the
> >> initial codebase how everything hangs together, but as the codebase
> >> matures the detailed knowledge will be lost unless it is documented.
> >> It's usually possible to tweak existing code to make small fixes
> >> without fully understanding the whole, but without a clear
> >> understanding of the way the parts are designed to work together the
> >> code (and data) tends to grow like spaghetti.
> >> 
> >> The way that the ASF used the DOAP files was not properly documented
> >> originally (it's a bit better now), but that tends to be the way with
> >> developers - documentation is done after the event, if at all. This is
> >> true of many of the new JSON files.
> > 
> > IMHO, here, the requirement on documentation is even higher since a lot of
> > people will need to write data, without being involved in the code using
> > the data.
> > 
> >> Note that when we refer to DOAP in this context we are referring to
> >> the XML representation.
> >> There might be a different representation that is easier to use.
> > 
> > I'm not a semantic web expert: could we try to write down (in the wiki for
> > example) one project RDF/XML DOAP and its equivalent in another notation?
> 
> Off-hand, I don't know what other representations exist, but I do know
> that there are some complaints about the suitability of XML to
> represent RDF.
+1
but, like me, you don't really know the alternatives.
After a quick search, I found Turtle, RDFa, JSON-LD and TriG:
http://www.w3.org/TR/rdf11-concepts/#rdf-documents

Does anyone here have experience with these?

> 
> >> After all, we are trying to describe projects, so Description Of A
> >> Project should be a good fit, even if using XML to define the DOAP is
> >> not so suitable.
> > 
> > +1 on the general logic
> > but I could not find a good documentation on DOAP apart from the DOAP
> > schema itself: did I miss something?
> 
> I don't know, but it seems logical otherwise DOAP would have never
> gained any followers.
same logic for me, but apparently nobody at the ASF has that knowledge (or is
interested in the ASF projects directory)

> 
> >> Do we really want to design a new DOAP schema using JSON?
> > 
> > I rephrase the same question: will we do better documentation if we
> > reinvent the wheel? (the answer may be "yes", but need real investment)
> 
> I'm not sure that is an equivalent question.
> The point is that it takes a lot of effort to design and document a
> suitable schema.
> We need to be very sure that DOAP is unsuitable before replacing it.
> Though it might be possible to replace DOAP/XML with DOAP/JSON or
> DOAP/xyz with a lot less effort.
+1
It seems we should try rewriting one existing DOAP/XML document in 4 equivalent
formats to compare them: Turtle, RDFa, JSON-LD and TriG.
That would be a perfect job for the wiki :)
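
As a possible starting point for that comparison, here is a minimal sketch of the RDF/XML to JSON-LD direction, using only the Python standard library. Everything in it is an assumption made for illustration: the sample DOAP document is made up (not a real project file), the asfext namespace URI is assumed, and doap_xml_to_jsonld is a hypothetical helper that only handles flat properties (plain text or rdf:resource). A real conversion should probably go through a proper RDF library rather than hand-written mapping code.

```python
import json
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Prefix -> namespace map; the asfext URI is an assumption for this sketch.
PREFIXES = {
    "doap": "http://usefulinc.com/ns/doap#",
    "asfext": "http://projects.apache.org/ns/asfext#",
}

# Made-up, trimmed DOAP document used only as sample input.
SAMPLE_DOAP = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://usefulinc.com/ns/doap#"
         xmlns:asfext="http://projects.apache.org/ns/asfext#">
  <Project rdf:about="http://example.org/project">
    <name>Example Project</name>
    <homepage rdf:resource="http://example.org/" />
    <asfext:pmc rdf:resource="http://example.org/pmc" />
  </Project>
</rdf:RDF>
"""


def compact(tag):
    """Turn ElementTree's '{namespace}local' into 'prefix:local'."""
    namespace, local = tag[1:].split("}")
    for prefix, uri in PREFIXES.items():
        if uri == namespace:
            return f"{prefix}:{local}"
    raise ValueError(f"unknown namespace: {namespace}")


def doap_xml_to_jsonld(xml_text):
    """Map the first doap:Project element to a JSON-LD style dict."""
    root = ET.fromstring(xml_text)
    project = root.find("{%s}Project" % PREFIXES["doap"])
    doc = {
        "@context": PREFIXES,
        "@id": project.get("{%s}about" % RDF),
        "@type": "doap:Project",
    }
    for child in project:
        resource = child.get("{%s}resource" % RDF)
        if resource:
            # rdf:resource attributes become node references
            doc[compact(child.tag)] = {"@id": resource}
        else:
            # element text becomes a plain literal
            doc[compact(child.tag)] = (child.text or "").strip()
    return doc


if __name__ == "__main__":
    print(json.dumps(doap_xml_to_jsonld(SAMPLE_DOAP), indent=2))
```

Putting the two notations side by side like this for one real project file would show whether the JSON-LD form is actually easier to read and edit than the XML one.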

> 
> >> > Then my first steps were:
> >> > - improve projects new site and switch from projects old, as each
> >> > project
> >> > page on projects-new more clearly shows information that comes from the
> >> > project's DOAP file (IMHO, projects old was failing at this, no pun
> >> > intended): we'll see if ASF committers can improve their DOAP files (as
> >> > some already did since the switch)
> >> 
> >> Yes, better presentation of the data should help to persuade PMCs to
> >> fix/improve their data.
> >> 
> >> > - the new DOAP listings location, that is like projects old, but
> >> > simplified
> >> > since only focused on DOAP listings and content (no code):
> >> > http://svn.apache.org/viewvc/comdev/projects.apache.org/data/
> >> > 
> >> > These are only the first steps IMHO before deciding if we should
> >> > continue
> >> > with DOAP or find a better alternative (yet to be found/proposed).
> >> > 
> >> > I see 2 other steps:
> >> > - clarify what committee DOAP files (also called "PMC descriptors") are
> >> > supposed to contain, and how projects (maintained by the committees)
> >> > are
> >> > supposed to link to the committee. As discussed previously, current
> >> > convention [1] is really strange.
> >> 
> >> I expect there was a good reason for this at the time, but the "magic"
> >> behaviour of the URL is a bad idea in retrospect. Less typing, but
> >> lots more special-case code.
> >> 
> >> > And PMC members list are easier updated automatically
> >> > from committee-info.txt than manually.
> >> 
> >> Yes; that is waiting on INFRA-9942 which seems to have been ignored.
> >> Perhaps you can prod infra as well.
> > 
> > to me, this is not priority 1: since json files are in svn ( :) ), I can
> > parse content regularly and update json files from my own computer
> > Having the parsing fully automated will be useful sometime, but at the
> > moment there is no strong pressure
> 
> But when you go on holiday, there is no-one to update the files.
> 
> I favour a script that runs whenever committee-info.txt is updated
> (can use svnpubsub for this).
> The script should convert committee-info.txt into one or two JSON
> files, but not do anything else.
> The current parsing script is really complicated and generates
> additional output.
> The generated files would then be available for other scripts to use.
we completely agree: it's just a question of priority

> 
> >> > - prefer https://projects.apache.org/doap/ to
> >> > https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/
> >> > IMHO, /doap/ in projects site, with every ASF committer commit access,
> >> > and
> >> > its per-committee directory containing both PMC descriptor and projects
> >> > DOAP descriptors would be easier to understand and maintain than an XML
> >> > listing in svn then descriptors in a lot of different places
> >> > And this would give a good canonical url for each DOAP file (easing
> >> > work
> >> > on
> >> > previous item)
> >> 
> >> Agreed it would be easier to have a central location to maintain the
> >> DOAPs.
> >> I tried a similar with Commons, however several people wanted to keep
> >> the DOAPs with the project code.
> >> 
> >> But perhaps if all the DOAPs are together then the objection will be
> >> overcome - at least there is a canonical location for them.
> >> 
> >> And it would be a lot easier to fix the typos and syntax errors if all
> >> the files were co-located.
> >> 
> >> Note that this will require a good naming convention to avoid clashes
> >> and keep track of everything.
> > 
> > that's the purpose of the https://projects.apache.org/doap/ demo:
> > https://projects.apache.org/doap/{committee id}/{project id}.rdf
> > ie one directory per Apache committee/TLP/PMC (choose your wording)
> > 
> > and pmc.rdf for the committee PMC data file
> > 
> > really nothing hard
> > 
> >> But in the meantime, I still think it would be better to take local
> >> copies and not commit them to SVN.
> > 
> > here, I disagree :)
> > 
> >> > I know this is a long post: sorry, could not make it shorter.
> >> > 
> >> > Switching from projects old to projects new without changing much
> >> > things
> >> > to
> >> > DOAP sources was only the beginning of a story: we need to define next
> >> > steps.
> >> 
> >> Yes.
> >> What data needs to be collected by PMCs?
> >> In what format is it stored?
> >> Where is it stored?
> >> 
> >> These would probably be better discussed on a Wiki.
> > 
> > does it mean you have a proposal?
> > 
> > Regards,
> > 
> > Hervé
> > 
> >> > Regards,
> >> > 
> >> > Hervé
> >> > 
> >> > 
> >> > [1] https://projects-old.apache.org/guidelines.html see 2 last bullets:
> >> > - PMCs can be referenced as an rdf:resource that points at
> >> > http://<pmc>.apache.org/. e.g.
> >> > <asfext:pmc rdf:resource="http://httpd.apache.org/" />.
> >> > In this case, the PMC descriptor file must be called <pmc>.rdf and must
> >> > be
> >> > stored in the directory:
> >> > http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/
> >> > - PMCs descriptors can also be stored anywhere else (e.g. on
> >> > the TLP website or in SVN), in which case they must be referenced using
> >> > the full URL, for example
> >> > <asfext:pmc rdf:resource="http://tlp.apache.org/pmc/tlp.rdf" />
> >> > 
> >> > On Sunday 19 July 2015 09:48:17 sebb wrote:
> >> >> On 15 July 2015 at 22:11,  <hb...@apache.org> wrote:
> >> >> > Author: hboutemy
> >> >> > Date: Wed Jul 15 21:11:32 2015
> >> >> > New Revision: 1691273
> >> >> > 
> >> >> > URL: http://svn.apache.org/r1691273
> >> >> > Log:
> >> >> > import projects DOAP files updates
> >> >> > 
> >> >> > Modified:
> >> >> >     comdev/projects.apache.org/site/doap/cxf/cxf.rdf
> >> >> >     comdev/projects.apache.org/site/doap/httpd/httpd.rdf
> >> >> >     comdev/projects.apache.org/site/json/foundation/projects.json
> >> >> >     comdev/projects.apache.org/site/json/projects/cxf.json
> >> >> >     comdev/projects.apache.org/site/json/projects/httpd.json
> >> >> 
> >> >> Why are these copies being committed to SVN?
> >> >> 
> >> >> Projects-old makes do with a local copy of the files which it keeps in
> >> >> sync with the ones listed in files.xml
> >> >> 
> >> >> It seems wasteful and unnecessary to create new backup copies in SVN.
> >> >> 
> >> >> AFAICT they are bound to be out of date as they are committed
> >> >> manually.
> >> >> 
> >> >> Furthermore there is also a  danger that the wrong copy may be updated
> >> >> by someone.


Re: Unnecessary SVN commits [was: svn commit: r1691273 - in /comdev/projects.apache.org/site: doap/cxf/cxf.rdf doap/httpd/httpd.rdf json/foundation/projects.json json/projects/cxf.json json/projects/httpd.json]

Posted by sebb <se...@gmail.com>.
On 21 July 2015 at 06:45, Hervé BOUTEMY <he...@free.fr> wrote:
> On Monday 20 July 2015 01:43:11 sebb wrote:
>> On 19 July 2015 at 14:18, Hervé BOUTEMY <he...@free.fr> wrote:
>> > time to explain what I have in mind, because I understand the reactions
>> > about these svn content questions: but I need to explain why I think that
>> > it's not a bug, it's a feature :)
>> >
>> >
>> > 1. generated json files in svn
>> >
>> > even if they are generated, these ones are IMHO useful to ease people just
>> > wanting to work on information rendering, ie the site's html+javascript
>>
>> The current files can still be accessed from the web server; they
>> don't have to be in SVN to be useful.
> seems I was not clear: the question is not the web server.
> The question is the lambda ASF committer who does not have access to the web
> server but would like to contribute to the web part, fix an issue he sees on
> the live site: currently, one svn checkout, read STRUCTURE.txt and start your
> local web server, and you can fix any html+css+javascript issue
>
>>
>> > Experience with releases.json not being in svn in the first place told me
>> > that not having whole json content in svn was just increasing barrier to
>> > commits from whole ASF committers to projects directory visualization
>>
>> Or maybe it was just that the file formats were not clearly documented.
> no, the problem was not the format, it was the data (even if format
> documentation is something we need also).

But how can one provide the data if the format is not clearly documented?

>
>>
>> > 2. doap files in svn (copies of parsed content or generated ones)
>> >
>> > From the beginning of my work on projects-new, I had a question in mind:
>> > is
>> > DOAP itself a problem (since not easy, not well understood), or are there
>> > just problems about the way DOAP is used and explained to ASF committers
>> > (= not DOAP experts, if DOAP experts exist)?
>> >
>> > Any discussion on this list about that question led to some people
>> > wanting to simply drop DOAP, because for them, implicitly, the format
>> > itself/only was the problem, without answering previous question (and
>> > without providing a better alternative = the show stopper for me: no,
>> > simply telling "json" is not a sufficient answer, there has at least to
>> > be a schema)
>>
>> Indeed.
>> Abandoning DOAP and using JSON will just lead to exactly the same
>> problem down the line: *unless* the JSON schema is well designed and
>> documented. Likewise for any other replacement.
> +1
>
>>
>> It's usually obvious to the code/data developers who create the
>> initial codebase how everything hangs together, but as the codebase
>> matures the detailed knowledge will be lost unless it is documented.
>> It's usually possible to tweak existing code to make small fixes
>> without fully understanding the whole, but without a clear
>> understanding of the way the parts are designed to work together the
>> code (and data) tends to grow like spaghetti.
>>
>> The way that the ASF used the DOAP files was not properly documented
>> originally (it's a bit better now), but that tends to be the way with
>> developers - documentation is done after the event, if at all. This is
>> true of many of the new JSON files.
> IMHO, here, the requirement on documentation is even higher since a lot of
> people will need to write data, without being involved in the code using the
> data.
>
>>
>> Note that when we refer to DOAP in this context we are referring to
>> the XML representation.
>> There might be a different representation that is easier to use.
> I'm not a semantic web expert: could we try to write down (in the wiki for
> example) one project RDF/XML DOAP and its equivalent in another notation?

Off-hand, I don't know what other representations exist, but I do know
that there are some complaints about the suitability of XML to
represent RDF.

>>
>> After all, we are trying to describe projects, so Description Of A
>> Project should be a good fit, even if using XML to define the DOAP is
>> not so suitable.
> +1 on the general logic
> but I could not find a good documentation on DOAP apart from the DOAP schema
> itself: did I miss something?

I don't know, but it seems logical otherwise DOAP would have never
gained any followers.

>>
>> Do we really want to design a new DOAP schema using JSON?
> I rephrase the same question: will we do better documentation if we reinvent
> the wheel? (the answer may be "yes", but need real investment)

I'm not sure that is an equivalent question.
The point is that it takes a lot of effort to design and document a
suitable schema.
We need to be very sure that DOAP is unsuitable before replacing it.
Though it might be possible to replace DOAP/XML with DOAP/JSON or
DOAP/xyz with a lot less effort.

>>
>> > Then my first steps were:
>> > - improve projects new site and switch from projects old, as each project
>> > page on projects-new more clearly shows information that comes from the
>> > project's DOAP file (IMHO, projects old was failing at this, no pun
>> > intended): we'll see if ASF committers can improve their DOAP files (as
>> > some already did since the switch)
>>
>> Yes, better presentation of the data should help to persuade PMCs to
>> fix/improve their data.
>>
>> > - the new DOAP listings location, that is like projects old, but
>> > simplified
>> > since only focused on DOAP listings and content (no code):
>> > http://svn.apache.org/viewvc/comdev/projects.apache.org/data/
>> >
>> > These are only the first steps IMHO before deciding if we should continue
>> > with DOAP or find a better alternative (yet to be found/proposed).
>> >
>> > I see 2 other steps:
>> > - clarify what committee DOAP files (also called "PMC descriptors") are
>> > supposed to contain, and how projects (maintained by the committees) are
>> > supposed to link to the committee. As discussed previously, current
>> > convention [1] is really strange.
>>
>> I expect there was a good reason for this at the time, but the "magic"
>> behaviour of the URL is a bad idea in retrospect. Less typing, but
>> lots more special-case code.
>>
>> > And PMC members list are easier updated automatically
>> > from committee-info.txt than manually.
>>
>> Yes; that is waiting on INFRA-9942 which seems to have been ignored.
>> Perhaps you can prod infra as well.
> to me, this is not priority 1: since json files are in svn ( :) ), I can parse
> content regularly and update json files from my own computer
> Having the parsing fully automated will be useful sometime, but at the moment
> there is no strong pressure

But when you go on holiday, there is no-one to update the files.

I favour a script that runs whenever committee-info.txt is updated
(can use svnpubsub for this).
The script should convert committee-info.txt into one or two JSON
files, but not do anything else.
The current parsing script is really complicated and generates
additional output.
The generated files would then be available for other scripts to use.
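
A minimal sketch of such a single-purpose converter, in Python. The input layout assumed here (a "* Committee name" header line followed by indented "Member Name <id@host>" lines) is a simplification invented for illustration and is not claimed to be the real committee-info.txt format; parse_committees and the sample names are hypothetical. The svnpubsub trigger is deliberately left out: the sketch only covers the "convert to JSON and do nothing else" part.

```python
import json
import re

# Hypothetical, simplified input; the real committee-info.txt layout may differ.
SAMPLE = """\
* Apache Example
    Alice Able <alice@example.org>
    Bob Baker <bob@example.org>

* Apache Sample
    Carol Clark <carol@example.org>
"""

# An indented member line: display name, then <id@host>.
MEMBER = re.compile(r"^\s+(?P<name>.+?)\s+<(?P<id>[^@>]+)@[^>]+>")


def parse_committees(text):
    """Return {committee name: [{"name": ..., "id": ...}, ...]}."""
    committees = {}
    current = None
    for line in text.splitlines():
        if line.startswith("* "):
            # A new committee section starts here
            current = committees.setdefault(line[2:].strip(), [])
        elif current is not None:
            match = MEMBER.match(line)
            if match:
                current.append(match.groupdict())
    return committees


if __name__ == "__main__":
    # Emit one JSON document and nothing else, so other scripts can reuse it.
    print(json.dumps(parse_committees(SAMPLE), indent=2, sort_keys=True))
```

Keeping the converter this small means the generated JSON has one obvious producer, and the more complicated processing can live in separate scripts that only read the JSON.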

>>
>> > - prefer https://projects.apache.org/doap/ to
>> > https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/
>> > IMHO, /doap/ in projects site, with every ASF committer commit access, and
>> > its per-committee directory containing both PMC descriptor and projects
>> > DOAP descriptors would be easier to understand and maintain than an XML
>> > listing in svn then descriptors in a lot of different places
>> > And this would give a good canonical url for each DOAP file (easing work
>> > on
>> > previous item)
>>
>> Agreed it would be easier to have a central location to maintain the DOAPs.
>> I tried a similar with Commons, however several people wanted to keep
>> the DOAPs with the project code.
>>
>> But perhaps if all the DOAPs are together then the objection will be
>> overcome - at least there is a canonical location for them.
>>
>> And it would be a lot easier to fix the typos and syntax errors if all
>> the files were co-located.
>>
>> Note that this will require a good naming convention to avoid clashes
>> and keep track of everything.
> that's the purpose of the https://projects.apache.org/doap/ demo:
> https://projects.apache.org/doap/{committee id}/{project id}.rdf
> ie one directory per Apache committee/TLP/PMC (choose your wording)
>
> and pmc.rdf for the committee PMC data file
>
> really nothing hard
>
>>
>> But in the meantime, I still think it would be better to take local
>> copies and not commit them to SVN.
> here, I disagree :)
>
>>
>> > I know this is a long post: sorry, could not make it shorter.
>> >
>> > Switching from projects old to projects new without changing much things
>> > to
>> > DOAP sources was only the beginning of a story: we need to define next
>> > steps.
>> Yes.
>> What data needs to be collected by PMCs?
>> In what format is it stored?
>> Where is it stored?
>>
>> These would probably be better discussed on a Wiki.
> does it mean you have a proposal?
>
> Regards,
>
> Hervé
>
>>
>> > Regards,
>> >
>> > Hervé
>> >
>> >
>> > [1] https://projects-old.apache.org/guidelines.html see 2 last bullets:
>> > - PMCs can be referenced as an rdf:resource that points at
>> > http://<pmc>.apache.org/. e.g.
>> > <asfext:pmc rdf:resource="http://httpd.apache.org/" />.
>> > In this case, the PMC descriptor file must be called <pmc>.rdf and must be
>> > stored in the directory:
>> > http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/
>> > - PMCs descriptors can also be stored anywhere else (e.g. on
>> > the TLP website or in SVN), in which case they must be referenced using
>> > the full URL, for example
>> > <asfext:pmc rdf:resource="http://tlp.apache.org/pmc/tlp.rdf" />
>> >
>> > On Sunday 19 July 2015 09:48:17 sebb wrote:
>> >> On 15 July 2015 at 22:11,  <hb...@apache.org> wrote:
>> >> > Author: hboutemy
>> >> > Date: Wed Jul 15 21:11:32 2015
>> >> > New Revision: 1691273
>> >> >
>> >> > URL: http://svn.apache.org/r1691273
>> >> > Log:
>> >> > import projects DOAP files updates
>> >> >
>> >> > Modified:
>> >> >     comdev/projects.apache.org/site/doap/cxf/cxf.rdf
>> >> >     comdev/projects.apache.org/site/doap/httpd/httpd.rdf
>> >> >     comdev/projects.apache.org/site/json/foundation/projects.json
>> >> >     comdev/projects.apache.org/site/json/projects/cxf.json
>> >> >     comdev/projects.apache.org/site/json/projects/httpd.json
>> >>
>> >> Why are these copies being committed to SVN?
>> >>
>> >> Projects-old makes do with a local copy of the files which it keeps in
>> >> sync with the ones listed in files.xml
>> >>
>> >> It seems wasteful and unnecessary to create new backup copies in SVN.
>> >>
>> >> AFAICT they are bound to be out of date as they are committed manually.
>> >>
>> >> Furthermore there is also a  danger that the wrong copy may be updated
>> >> by someone.
>

Re: Unnecessary SVN commits [was: svn commit: r1691273 - in /comdev/projects.apache.org/site: doap/cxf/cxf.rdf doap/httpd/httpd.rdf json/foundation/projects.json json/projects/cxf.json json/projects/httpd.json]

Posted by Hervé BOUTEMY <he...@free.fr>.
On Monday 20 July 2015 01:43:11 sebb wrote:
> On 19 July 2015 at 14:18, Hervé BOUTEMY <he...@free.fr> wrote:
> > time to explain what I have in mind, because I understand the reactions
> > about these svn content questions: but I need to explain why I think that
> > it's not a bug, it's a feature :)
> > 
> > 
> > 1. generated json files in svn
> > 
> > even if they are generated, these ones are IMHO useful to ease people just
> > wanting to work on information rendering, ie the site's html+javascript
> 
> The current files can still be accessed from the web server; they
> don't have to be in SVN to be useful.
It seems I was not clear: the question is not the web server.
The question is the average ASF committer who does not have access to the web
server but would like to contribute to the web part, for example to fix an issue
seen on the live site. Currently, it takes one svn checkout, reading STRUCTURE.txt
and starting a local web server, and you can fix any html+css+javascript issue.

> 
> > Experience with releases.json not being in svn in the first place told me
> > that not having whole json content in svn was just increasing barrier to
> > commits from whole ASF committers to projects directory visualization
> 
> Or maybe it was just that the file formats were not clearly documented.
no, the problem was not the format, it was the data (even if format 
documentation is something we need also).

> 
> > 2. doap files in svn (copies of parsed content or generated ones)
> > 
> > From the beginning of my work on projects-new, I had a question in mind:
> > is
> > DOAP itself a problem (since not easy, not well understood), or are there
> > just problems about the way DOAP is used and explained to ASF committers
> > (= not DOAP experts, if DOAP experts exist)?
> > 
> > Any discussion on this list about that question led to some people
> > wanting to simply drop DOAP, because for them, implicitly, the format
> > itself/only was the problem, without answering previous question (and
> > without providing a better alternative = the show stopper for me: no,
> > simply telling "json" is not a sufficient answer, there has at least to
> > be a schema)
> 
> Indeed.
> Abandoning DOAP and using JSON will just lead to exactly the same
> problem down the line: *unless* the JSON schema is well designed and
> documented. Likewise for any other replacement.
+1

> 
> It's usually obvious to the code/data developers who create the
> initial codebase how everything hangs together, but as the codebase
> matures the detailed knowledge will be lost unless it is documented.
> It's usually possible to tweak existing code to make small fixes
> without fully understanding the whole, but without a clear
> understanding of the way the parts are designed to work together the
> code (and data) tends to grow like spaghetti.
> 
> The way that the ASF used the DOAP files was not properly documented
> originally (it's a bit better now), but that tends to be the way with
> developers - documentation is done after the event, if at all. This is
> true of many of the new JSON files.
IMHO, here, the requirement on documentation is even higher since a lot of 
people will need to write data, without being involved in the code using the 
data.

> 
> Note that when we refer to DOAP in this context we are referring to
> the XML representation.
> There might be a different representation that is easier to use.
I'm not a semantic web expert: could we try to write down (in the wiki for 
example) one project RDF/XML DOAP and its equivalent in another notation?

> 
> After all, we are trying to describe projects, so Description Of A
> Project should be a good fit, even if using XML to define the DOAP is
> not so suitable.
+1 on the general logic
but I could not find any good documentation on DOAP apart from the DOAP schema
itself: did I miss something?

> 
> Do we really want to design a new DOAP schema using JSON?
To rephrase the question: will we write better documentation if we reinvent
the wheel? (The answer may be "yes", but it needs real investment.)

> 
> > Then my first steps were:
> > - improve projects new site and switch from projects old, as each project
> > page on projects-new more clearly shows information that comes from the
> > project's DOAP file (IMHO, projects old was failing at this, no pun
> > intended): we'll see if ASF committers can improve their DOAP files (as
> > some already did since the switch)
> 
> Yes, better presentation of the data should help to persuade PMCs to
> fix/improve their data.
> 
> > - the new DOAP listings location, that is like projects old, but
> > simplified
> > since only focused on DOAP listings and content (no code):
> > http://svn.apache.org/viewvc/comdev/projects.apache.org/data/
> > 
> > These are only the first steps IMHO before deciding if we should continue
> > with DOAP or find a better alternative (yet to be found/proposed).
> > 
> > I see 2 other steps:
> > - clarify what committee DOAP files (also called "PMC descriptors") are
> > supposed to contain, and how projects (maintained by the committees) are
> > supposed to link to the committee. As discussed previously, current
> > convention [1] is really strange.
> 
> I expect there was a good reason for this at the time, but the "magic"
> behaviour of the URL is a bad idea in retrospect. Less typing, but
> lots more special-case code.
> 
> > And PMC members list are easier updated automatically
> > from committee-info.txt than manually.
> 
> Yes; that is waiting on INFRA-9942 which seems to have been ignored.
> Perhaps you can prod infra as well.
To me, this is not priority 1: since the JSON files are in svn ( :) ), I can parse
the content regularly and update the JSON files from my own computer.
Having the parsing fully automated will be useful at some point, but at the
moment there is no strong pressure.

> 
> > - prefer https://projects.apache.org/doap/ to
> > https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/
> > IMHO, /doap/ in projects site, with every ASF committer commit access, and
> > its per-committee directory containing both PMC descriptor and projects
> > DOAP descriptors would be easier to understand and maintain than an XML
> > listing in svn then descriptors in a lot of different places
> > And this would give a good canonical url for each DOAP file (easing work
> > on
> > previous item)
> 
> Agreed it would be easier to have a central location to maintain the DOAPs.
> I tried a similar with Commons, however several people wanted to keep
> the DOAPs with the project code.
> 
> But perhaps if all the DOAPs are together then the objection will be
> overcome - at least there is a canonical location for them.
> 
> And it would be a lot easier to fix the typos and syntax errors if all
> the files were co-located.
> 
> Note that this will require a good naming convention to avoid clashes
> and keep track of everything.
that's the purpose of the https://projects.apache.org/doap/ demo:
https://projects.apache.org/doap/{committee id}/{project id}.rdf
ie one directory per Apache committee/TLP/PMC (choose your wording)

and pmc.rdf for the committee PMC data file

really nothing hard
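
As an illustration only, the layout above can be sketched in a few lines of Python. The function names are assumptions made for this sketch and are not part of the site's code; the httpd ids are taken from the commit quoted in this thread.

```python
BASE = "https://projects.apache.org/doap"

def project_doap_url(committee_id: str, project_id: str) -> str:
    """Canonical URL of a project DOAP file under the proposed layout."""
    if project_id == "pmc":
        # pmc.rdf is reserved for the committee data file in each directory.
        raise ValueError("project id 'pmc' would clash with pmc.rdf")
    return f"{BASE}/{committee_id}/{project_id}.rdf"

def pmc_doap_url(committee_id: str) -> str:
    """Canonical URL of the committee (PMC) data file."""
    return f"{BASE}/{committee_id}/pmc.rdf"

print(project_doap_url("httpd", "httpd"))
print(pmc_doap_url("httpd"))
```

The only clash this layout seems to allow is a project whose id is "pmc", which the sketch rejects explicitly.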

> 
> But in the meantime, I still think it would be better to take local
> copies and not commit them to SVN.
here, I disagree :)

> 
> > I know this is a long post: sorry, could not make it shorter.
> > 
> > > Switching from projects old to projects new without changing much in the
> > > DOAP sources was only the beginning of a story: we need to define next
> > > steps.
> Yes.
> What data needs to be collected by PMCs?
> In what format is it stored?
> Where is it stored?
> 
> These would probably be better discussed on a Wiki.
does it mean you have a proposal?

Regards,

Hervé

> 
> > Regards,
> > 
> > Hervé
> > 
> > 
> > [1] https://projects-old.apache.org/guidelines.html see 2 last bullets:
> > - PMCs can be referenced as an rdf:resource that points at
> > http://<pmc>.apache.org/. e.g.
> > <asfext:pmc rdf:resource="http://httpd.apache.org/" />.
> > > In this case, the PMC descriptor file must be called <pmc>.rdf and must be
> > > stored in the directory:
> > > http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/
> > > - PMCs descriptors can also be stored anywhere else (e.g. on
> > > the TLP website or in SVN), in which case they must be referenced using
> > > the full URL, for example
> > > <asfext:pmc rdf:resource="http://tlp.apache.org/pmc/tlp.rdf" />
> > 
> > > On Sunday 19 July 2015 09:48:17 sebb wrote:
> >> On 15 July 2015 at 22:11,  <hb...@apache.org> wrote:
> >> > Author: hboutemy
> >> > Date: Wed Jul 15 21:11:32 2015
> >> > New Revision: 1691273
> >> > 
> >> > URL: http://svn.apache.org/r1691273
> >> > Log:
> >> > import projects DOAP files updates
> >> > 
> >> > Modified:
> >> >     comdev/projects.apache.org/site/doap/cxf/cxf.rdf
> >> >     comdev/projects.apache.org/site/doap/httpd/httpd.rdf
> >> >     comdev/projects.apache.org/site/json/foundation/projects.json
> >> >     comdev/projects.apache.org/site/json/projects/cxf.json
> >> >     comdev/projects.apache.org/site/json/projects/httpd.json
> >> 
> >> Why are these copies being committed to SVN?
> >> 
> >> Projects-old makes do with a local copy of the files which it keeps in
> >> sync with the ones listed in files.xml
> >> 
> >> It seems wasteful and unnecessary to create new backup copies in SVN.
> >> 
> >> AFAICT they are bound to be out of date as they are committed manually.
> >> 
> >> Furthermore there is also a  danger that the wrong copy may be updated
> >> by someone.


Re: Unnecessary SVN commits [was: svn commit: r1691273 - in /comdev/projects.apache.org/site: doap/cxf/cxf.rdf doap/httpd/httpd.rdf json/foundation/projects.json json/projects/cxf.json json/projects/httpd.json]

Posted by sebb <se...@gmail.com>.
On 19 July 2015 at 14:18, Hervé BOUTEMY <he...@free.fr> wrote:
> time to explain what I have in mind, because I understand the reactions about
> these svn content questions: but I need to explain why I think that it's not a
> bug, it's a feature :)
>
>
> 1. generated json files in svn
>
> even if they are generated, these ones are IMHO useful to ease people just
> wanting to work on information rendering, ie the site's html+javascript

The current files can still be accessed from the web server; they
don't have to be in SVN to be useful.

> Experience with releases.json not being in svn in the first place told me that
> not having whole json content in svn was just increasing barrier to commits
> from whole ASF committers to projects directory visualization

Or maybe it was just that the file formats were not clearly documented.

>
> 2. doap files in svn (copies of parsed content or generated ones)
>
> From the beginning of my work on projects-new, I had a question in mind: is
> DOAP itself a problem (since not easy, not well understood), or are there just
> problems about the way DOAP is used and explained to ASF committers (= not
> DOAP experts, if DOAP experts exist)?
>
> Any discussion on this list about that question led to some people wanting to
> simply drop DOAP, because for them, implicitly, the format itself/only was
> the problem, without answering previous question (and without providing a
> better alternative = the show stopper for me: no, simply telling "json" is not
> a sufficient answer, there has at least to be a schema)

Indeed.
Abandoning DOAP and using JSON will just lead to exactly the same
problem down the line: *unless* the JSON schema is well designed and
documented. Likewise for any other replacement.
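
To make "well designed and documented" a little more concrete, here is a minimal sketch in Python of what a documented, checkable schema could look like. The field names and the schema itself are purely hypothetical, chosen for illustration; nothing like this has been agreed for the ASF JSON files.

```python
# Hypothetical minimal schema: field name -> (expected type, required?)
PROJECT_SCHEMA = {
    "name": (str, True),
    "homepage": (str, True),
    "pmc": (str, True),
    "description": (str, False),
}

def validate_project(entry: dict) -> list:
    """Return a list of human-readable problems; empty means the entry conforms."""
    problems = []
    for field, (expected_type, required) in PROJECT_SCHEMA.items():
        if field not in entry:
            if required:
                problems.append(f"missing required field: {field}")
        elif not isinstance(entry[field], expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    # Anything not listed in the schema is by definition undocumented.
    for field in entry:
        if field not in PROJECT_SCHEMA:
            problems.append(f"undocumented field: {field}")
    return problems

print(validate_project({"name": "Apache CXF", "extra": True}))
```

The point is not the particular fields but that the schema lives in one place and every data file can be checked against it.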

It's usually obvious to the code/data developers who create the
initial codebase how everything hangs together, but as the codebase
matures the detailed knowledge will be lost unless it is documented.
It's usually possible to tweak existing code to make small fixes
without fully understanding the whole, but without a clear
understanding of the way the parts are designed to work together the
code (and data) tends to grow like spaghetti.

The way that the ASF used the DOAP files was not properly documented
originally (it's a bit better now), but that tends to be the way with
developers - documentation is done after the event, if at all. This is
true of many of the new JSON files.

Note that when we refer to DOAP in this context we are referring to
the XML representation.
There might be a different representation that is easier to use.

After all, we are trying to describe projects, so Description Of A
Project should be a good fit, even if using XML to define the DOAP is
not so suitable.

Do we really want to design a new DOAP schema using JSON?

> Then my first steps were:
> - improve projects new site and switch from projects old, as each project page
> on projects-new more clearly shows information that comes from the project's
> DOAP file (IMHO, projects old was failing at this, no pun intended): we'll see
> if ASF committers can improve their DOAP files (as some already did since the
> switch)

Yes, better presentation of the data should help to persuade PMCs to
fix/improve their data.

> - the new DOAP listings location, that is like projects old, but simplified
> since only focused on DOAP listings and content (no code):
> http://svn.apache.org/viewvc/comdev/projects.apache.org/data/
>
> These are only the first steps IMHO before deciding if we should continue with
> DOAP or find a better alternative (yet to be found/proposed).
>
> I see 2 other steps:
> - clarify what committee DOAP files (also called "PMC descriptors") are
> supposed to contain, and how projects (maintained by the committees) are
> supposed to link to the committee. As discussed previously, current convention
> [1] is really strange.

I expect there was a good reason for this at the time, but the "magic"
behaviour of the URL is a bad idea in retrospect. Less typing, but
lots more special-case code.

> And PMC member lists are easier to update automatically
> from committee-info.txt than manually.

Yes; that is waiting on INFRA-9942 which seems to have been ignored.
Perhaps you can prod infra as well.

> - prefer https://projects.apache.org/doap/ to
> https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/
> IMHO, /doap/ in projects site, with every ASF committer commit access, and its
> per-committee directory containing both PMC descriptor and projects DOAP
> descriptors would be easier to understand and maintain than an XML listing in
> svn then descriptors in a lot of different places
> And this would give a good canonical url for each DOAP file (easing work on
> previous item)

Agreed it would be easier to have a central location to maintain the DOAPs.
I tried something similar with Commons, however several people wanted to keep
the DOAPs with the project code.

But perhaps if all the DOAPs are together then the objection will be
overcome - at least there is a canonical location for them.

And it would be a lot easier to fix the typos and syntax errors if all
the files were co-located.

Note that this will require a good naming convention to avoid clashes
and keep track of everything.

But in the meantime, I still think it would be better to take local
copies and not commit them to SVN.

>
> I know this is a long post: sorry, could not make it shorter.
>
> Switching from projects old to projects new without changing much in the
> DOAP sources was only the beginning of a story: we need to define next steps.

Yes.
What data needs to be collected by PMCs?
In what format is it stored?
Where is it stored?

These would probably be better discussed on a Wiki.

> Regards,
>
> Hervé
>
>
> [1] https://projects-old.apache.org/guidelines.html see 2 last bullets:
> - PMCs can be referenced as an rdf:resource that points at
> http://<pmc>.apache.org/. e.g.
> <asfext:pmc rdf:resource="http://httpd.apache.org/" />.
> In this case, the PMC descriptor file must be called <pmc>.rdf and must be
> stored in the directory:
> http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/
> - PMCs descriptors can also be stored anywhere else (e.g. on the TLP website
> or in SVN), in which case they must be referenced using the full URL, for
> example
> <asfext:pmc rdf:resource="http://tlp.apache.org/pmc/tlp.rdf" />
>
> On Sunday 19 July 2015 09:48:17 sebb wrote:
>> On 15 July 2015 at 22:11,  <hb...@apache.org> wrote:
>> > Author: hboutemy
>> > Date: Wed Jul 15 21:11:32 2015
>> > New Revision: 1691273
>> >
>> > URL: http://svn.apache.org/r1691273
>> > Log:
>> > import projects DOAP files updates
>> >
>> > Modified:
>> >     comdev/projects.apache.org/site/doap/cxf/cxf.rdf
>> >     comdev/projects.apache.org/site/doap/httpd/httpd.rdf
>> >     comdev/projects.apache.org/site/json/foundation/projects.json
>> >     comdev/projects.apache.org/site/json/projects/cxf.json
>> >     comdev/projects.apache.org/site/json/projects/httpd.json
>>
>> Why are these copies being committed to SVN?
>>
>> Projects-old makes do with a local copy of the files which it keeps in
>> sync with the ones listed in files.xml
>>
>> It seems wasteful and unnecessary to create new backup copies in SVN.
>>
>> AFAICT they are bound to be out of date as they are committed manually.
>>
>> Furthermore there is also a  danger that the wrong copy may be updated
>> by someone.
>

Re: Unnecessary SVN commits [was: svn commit: r1691273 - in /comdev/projects.apache.org/site: doap/cxf/cxf.rdf doap/httpd/httpd.rdf json/foundation/projects.json json/projects/cxf.json json/projects/httpd.json]

Posted by Hervé BOUTEMY <he...@free.fr>.
time to explain what I have in mind, because I understand the reactions about 
these svn content questions: but I need to explain why I think that it's not a 
bug, it's a feature :)


1. generated json files in svn

even if they are generated, these files are IMHO useful for people who just
want to work on information rendering, i.e. the site's html+javascript.
Experience with releases.json, which was not in svn at first, taught me that
not having the whole json content in svn simply raised the barrier to
contributions from ASF committers to the projects directory visualization


2. doap files in svn (copies of parsed content or generated ones)

From the beginning of my work on projects-new, I had a question in mind: is
DOAP itself a problem (since not easy, not well understood), or are there just 
problems about the way DOAP is used and explained to ASF committers (= not 
DOAP experts, if DOAP experts exist)?

Any discussion on this list about that question led to some people wanting to
simply drop DOAP because, for them, the format itself was implicitly the only
problem, without answering the previous question (and without providing a
better alternative, which is the show stopper for me: no, simply saying "json"
is not a sufficient answer, there has to be at least a schema)

Then my first steps were:
- improve projects new site and switch from projects old, as each project page 
on projects-new more clearly shows information that comes from the project's 
DOAP file (IMHO, projects old was failing at this, no pun intended): we'll see 
if ASF committers can improve their DOAP files (as some already did since the 
switch)
- the new DOAP listings location, that is like projects old, but simplified 
since only focused on DOAP listings and content (no code): 
http://svn.apache.org/viewvc/comdev/projects.apache.org/data/

These are only the first steps IMHO before deciding if we should continue with 
DOAP or find a better alternative (yet to be found/proposed).

I see 2 other steps:
- clarify what committee DOAP files (also called "PMC descriptors") are 
supposed to contain, and how projects (maintained by the committees) are 
supposed to link to the committee. As discussed previously, current convention 
[1] is really strange. And PMC member lists are easier to update automatically
from committee-info.txt than manually.

- prefer https://projects.apache.org/doap/ to 
https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/
IMHO, /doap/ in projects site, with every ASF committer commit access, and its 
per-committee directory containing both PMC descriptor and projects DOAP 
descriptors would be easier to understand and maintain than an XML listing in 
svn then descriptors in a lot of different places
And this would give a good canonical url for each DOAP file (easing work on 
previous item)


I know this is a long post: sorry, could not make it shorter.

Switching from projects old to projects new without changing much in the
DOAP sources was only the beginning of a story: we need to define next steps.

Regards,

Hervé


[1] https://projects-old.apache.org/guidelines.html see 2 last bullets:
- PMCs can be referenced as an rdf:resource that points at 
http://<pmc>.apache.org/. e.g. 
<asfext:pmc rdf:resource="http://httpd.apache.org/" />. 
In this case, the PMC descriptor file must be called <pmc>.rdf and must be 
stored in the directory: 
http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/
- PMCs descriptors can also be stored anywhere else (e.g. on the TLP website 
or in SVN), in which case they must be referenced using the full URL, for 
example 
<asfext:pmc rdf:resource="http://tlp.apache.org/pmc/tlp.rdf" />
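
For illustration, the two bullets above imply a special case roughly like the following Python sketch. The function name and the regular expression (including which characters may appear in a PMC id) are assumptions made for this example; it is not the actual projects-old code.

```python
import re

# Directory named in the guidelines for descriptors referenced by the short form.
DATA_FILES = ("http://svn.apache.org/repos/asf/infrastructure/"
              "site-tools/trunk/projects/data_files/")

# Short form: http://<pmc>.apache.org/ with nothing after the host.
# The allowed characters for <pmc> are an assumption of this sketch.
SHORT_FORM = re.compile(r"^http://([a-z0-9-]+)\.apache\.org/?$")

def pmc_descriptor_url(resource: str) -> str:
    """Map an asfext:pmc rdf:resource value to the descriptor file location."""
    match = SHORT_FORM.match(resource)
    if match:
        # "Magic" case: the URL is not the file itself, only a key for <pmc>.rdf.
        return f"{DATA_FILES}{match.group(1)}.rdf"
    # Otherwise the resource is taken to be the full URL of the descriptor.
    return resource

print(pmc_descriptor_url("http://httpd.apache.org/"))
print(pmc_descriptor_url("http://tlp.apache.org/pmc/tlp.rdf"))
```

The same attribute thus means two different things depending on its shape, which is why a single canonical URL per file would be simpler.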

On Sunday 19 July 2015 09:48:17 sebb wrote:
> On 15 July 2015 at 22:11,  <hb...@apache.org> wrote:
> > Author: hboutemy
> > Date: Wed Jul 15 21:11:32 2015
> > New Revision: 1691273
> > 
> > URL: http://svn.apache.org/r1691273
> > Log:
> > import projects DOAP files updates
> > 
> > Modified:
> >     comdev/projects.apache.org/site/doap/cxf/cxf.rdf
> >     comdev/projects.apache.org/site/doap/httpd/httpd.rdf
> >     comdev/projects.apache.org/site/json/foundation/projects.json
> >     comdev/projects.apache.org/site/json/projects/cxf.json
> >     comdev/projects.apache.org/site/json/projects/httpd.json
> 
> Why are these copies being committed to SVN?
> 
> Projects-old makes do with a local copy of the files which it keeps in
> sync with the ones listed in files.xml
> 
> It seems wasteful and unnecessary to create new backup copies in SVN.
> 
> AFAICT they are bound to be out of date as they are committed manually.
> 
> Furthermore there is also a  danger that the wrong copy may be updated
> by someone.