Posted to docs@httpd.apache.org by Rich Bowen <rb...@rcbowen.com> on 2011/07/11 15:28:51 UTC
Re: modules.apache.org data
So ... Where are we on this? Has anybody put any time into it yet? Where could someone jump in?
--
Rich Bowen
rbowen@rcbowen.com
rbowen@apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: docs-unsubscribe@httpd.apache.org
For additional commands, e-mail: docs-help@httpd.apache.org
Re: modules.apache.org data
Posted by Mads Toftum <ma...@toftum.dk>.
On Mon, Jul 11, 2011 at 09:28:51AM -0400, Rich Bowen wrote:
> So ... Where are we on this? Has anybody put any time into it yet? Where could someone jump in?
>
I've had a quick run through the dump and made a hackish attempt that
got me something close to doap without much effort before $work got in
the way. Fairly simple to do.
vh
Mads Toftum
--
http://soulfood.dk
---------------------------------------------------------------------
Re: modules.apache.org data
Posted by Lee Fisher <bl...@gmail.com>.
On 7/11/11 12:15 PM, Mads Toftum wrote:
> On Mon, Jul 11, 2011 at 12:05:05PM -0700, Lee Fisher wrote:
>> If I can ignore the password fields, and the login history table,
>> and can manage to convert the Php/MySQL numeric date to a
>> human-readable XML date, then the work I've been doing in XML is
>> usable.
>
> If we're going to go with a simple site created using doap, then you
> shouldn't worry too much about usernames and passwords. If we want to
> bring the accounts back and have a management system again, then we can
> always extract the data at that point.
If no password fields are needed, then I need to figure out the date
field conversion (if the old database registration date is still needed).
If I can make it past that, then I think the XML work I've done so far
might be usable, after some more cleanup on user data.
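One way the date conversion could go, as a minimal Python sketch. It assumes the numeric date is either a 14-digit MySQL `YYYYMMDDHHMMSS` timestamp or a Unix epoch value in seconds; the thread does not say which format the dump uses, so both are guesses, and `to_xml_date` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_xml_date(raw):
    """Convert a numeric date string to an ISO 8601 (xsd:dateTime) string.

    Assumption: `raw` is either a 14-digit MySQL YYYYMMDDHHMMSS timestamp
    or a Unix epoch value in seconds. The real dump may use neither.
    """
    raw = str(raw).strip()
    if not raw.isdigit():
        raise ValueError(f"not a numeric date: {raw!r}")
    if len(raw) == 14:
        # Old MySQL TIMESTAMP(14) layout; the timezone is unknown, assume UTC.
        parsed = datetime.strptime(raw, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    else:
        # Anything else is treated as seconds since 1970-01-01 UTC.
        parsed = datetime.fromtimestamp(int(raw), tz=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")
```

If the column turns out to hold something else (local time, milliseconds), only the two branches above would need to change.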
If DOAP format is wanted, then we need Programming Language, which we
don't have. Optionally, it'd be nice to have other metadata: Operating
System, Bug Database, Category, and the Url field split into Homepage, Old
Homepage, and Download Page; for Apache-hosted (and other publicly
hosted) projects, we could use the Repository class. Besides Developer and
Maintainer contacts, perhaps also the Documenter, Translator, and Helper
contacts? So, I think we'd need a way to get new data from existing
projects, either before or after publishing this new format.
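To make the field mapping concrete, here is a rough Python sketch that turns one module record into a DOAP description. The dict keys (`name`, `description`, `language`, `created`, `homepage`, `download`) are hypothetical and do not come from the actual dump; the property names are from the DOAP vocabulary, and `module_to_doap` is an illustrative name only.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP_NS = "http://usefulinc.com/ns/doap#"

# Use readable prefixes instead of ns0/ns1 in the serialized output.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("doap", DOAP_NS)


def module_to_doap(row):
    """Build an RDF/XML string with one doap:Project from a module dict."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    project = ET.SubElement(root, f"{{{DOAP_NS}}}Project")

    # Literal-valued properties; missing or empty fields are skipped.
    for key, prop in (("name", "name"), ("description", "shortdesc"),
                      ("language", "programming-language"),
                      ("created", "created")):
        if row.get(key):
            ET.SubElement(project, f"{{{DOAP_NS}}}{prop}").text = row[key]

    # Resource-valued properties point at a URL through rdf:resource.
    for key, prop in (("homepage", "homepage"), ("download", "download-page")):
        if row.get(key):
            ET.SubElement(project, f"{{{DOAP_NS}}}{prop}",
                          {f"{{{RDF_NS}}}resource": row[key]})
    return ET.tostring(root, encoding="unicode")
```

Fields we don't have yet, such as Programming Language, are simply left out of the output until the data is collected.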
Is there an Apache-controlled vocabulary that can be used as DOAP
Keywords? Right now they're quite free-form, if they exist. Having some
existing 'tag cloud' could be helpful going forward.
Is this eventually going to be internationalized for non-English languages,
via xml:lang?
The Apache Labs DOAPizer seems to target only internal projects that
live in Svn, as the Svn folder is used as the Lab Identifier in its form.
Would the results of this new set of DOAP files be integrated with other
existing Apache-hosted DOAP files? Perhaps make the Apache Labs DOAPizer
something that httpd module authors could use in the new system?
http://labs.apache.org/doapizer.html
BTW, since the DOAP site is down (for me at least), there is a new,
similar project being worked on by the Linux Foundation that might be
worth looking at.
http://spdx.org/
---------------------------------------------------------------------
Re: modules.apache.org data
Posted by Mads Toftum <ma...@toftum.dk>.
On Mon, Jul 11, 2011 at 12:05:05PM -0700, Lee Fisher wrote:
> If I can ignore the password fields, and the login history table,
> and can manage to convert the Php/MySQL numeric date to a
> human-readable XML date, then the work I've been doing in XML is
> usable.
If we're going to go with a simple site created using doap, then you
shouldn't worry too much about usernames and passwords. If we want to
bring the accounts back and have a management system again, then we can
always extract the data at that point.
vh.
Mads Toftum
--
http://soulfood.dk
---------------------------------------------------------------------
Re: modules.apache.org data
Posted by Lee Fisher <bl...@gmail.com>.
> In my ideal world, we'd want something that the module authors
> can update themselves, and doesn't require a lot of maintenance
> on our end. Having something where they host some kind of data
> file on their end that we retrieve periodically seems good to
> me, but I don't know the details of making that happen. Having
> a database thingy that they update on our end seems fine, too.
If I can ignore the password fields, and the login history table, and
can manage to convert the Php/MySQL numeric date to a human-readable XML
date, then the work I've been doing in XML is usable.
If the login history and password fields need to be used in the
resulting system, I think I have to drop this XML work, and go back to
getting the Php/MySQL running as a new system.
A new system might be able to use the email fields to reset things, and
ask for a new password. If the solution is wiki-based, it seems likely
they'd have to create a new account on the wiki, and it might be ok if we
don't have to translate the old accounts (and passwords) over to new wiki
accounts.
So, besides the password field issue and the date field conversion, I can
continue to normalize the XML of the base data. But if login history or
passwords need to be carried forward, I should start over and figure out
how to rebuild the old Php/MySql web app, so 100% of its tables can be
migrated forward.
Is there any high-level Apache DOAP, FOAF policy that would help decide
the proper direction w/r/t hosted XML metadata?
If a DOAP file approach is used, shouldn't these DOAP files be
distributed with the module source, and thus probably be hosted in Svn,
or generated when generating a build?
PS: The current module data has a variety of data normalization issues
that need addressing. There are multiple broken email addresses: some
null, some in " at " and " dot " notation. There are multiple empty and
broken URLs. Some HTML markup is included in the SQL string fields. Some
records include multiple URLs (one for the .c module source, one for the
home page), and since there's only 1 URL field they overload the title or
description field. Unnecessary newlines and trailing and leading
whitespace are in many fields. And there are a few int'l names that will
need robust character set support in the final system.
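A few of these cleanups could be scripted. The Python sketch below is only an illustration: the helper names are hypothetical, the regexes are deliberately loose, and anything they reject would still need a manual look.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")
_URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def clean_text(value):
    """Strip HTML tags, unescape entities, and collapse whitespace."""
    if value is None:
        return ""
    text = html.unescape(_TAG_RE.sub(" ", value))
    return " ".join(text.split())


def clean_email(value):
    """Undo ' at ' / ' dot ' obfuscation; return None if still implausible."""
    if not value:
        return None
    addr = value.strip()
    addr = re.sub(r"\s+at\s+", "@", addr, flags=re.IGNORECASE)
    addr = re.sub(r"\s+dot\s+", ".", addr, flags=re.IGNORECASE)
    # Loose check: one '@', a dot in the domain part, no whitespace.
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", addr):
        return addr
    return None


def extract_urls(*fields):
    """Collect distinct URLs found in any field, in order of appearance."""
    seen = []
    for field in fields:
        for url in _URL_RE.findall(field or ""):
            url = url.rstrip(".,;")  # drop trailing sentence punctuation
            if url not in seen:
                seen.append(url)
    return seen
```

Running `extract_urls` over the title and description fields as well as the URL field would recover the overloaded URLs, which could then be sorted by hand into home page and source links.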
---------------------------------------------------------------------
Re: modules.apache.org data
Posted by Rich Bowen <rb...@rcbowen.com>.
On Jul 11, 2011, at 1:12 PM, Lee Fisher wrote:
> On 7/11/11 6:28 AM, Rich Bowen wrote:
> > So ... Where are we on this? Has anybody put any time into it yet?
> > Where could someone jump in?
>
> I've spent a few hours trying to make sense of the data.
>
> It'd help to know what the resulting goal is.
>
> The tarball includes php source and a .SQL file from MySQL. The .SQL file contains a few tables, the one interesting table being the modules table (including two password fields). There are also tables that appear to be more related to tracking logins, including a nice chunk of spam.
>
> I've started manually converting the .SQL file into an .XML file. But it's not ready for use yet; there's a lot of data normalization that needs doing. Right now, the main problem with this approach is that the MySQL/PHP-formatted date field is nonsense as ASCII and I'd need to convert it somehow to make it useful in XML.
>
> Another approach might be to set up Php and MySQL with this project, then update the Php to output the data.
>
> But what is the goal of this? To create a new web site that tracks users' passwords and login dates? Or to create a list of all the modules?
>
> If just the latter, the XML conversion might work. If the goal is to migrate the user accounts and their passwords, then the focus should probably be with the Php and a new MySQL site.
>
> Is there any opportunity during this transition to add some fields? There's a few things that could probably be improved in the current schema.
At a high level, the goal is to replace modules.apache.org with a new modules.apache.org which we host and operate. Specific implementation details are entirely up to us - this is a do-ocracy.
In my ideal world, we'd want something that the module authors can update themselves, and doesn't require a lot of maintenance on our end. Having something where they host some kind of data file on their end that we retrieve periodically seems good to me, but I don't know the details of making that happen. Having a database thingy that they update on our end seems fine, too.
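The "retrieve periodically" idea could look roughly like the Python sketch below. It is only an outline under assumptions: the registry of module names and URLs, the in-memory cache, and the function names are all hypothetical, and a real job would be run from cron or similar and write to disk.

```python
import urllib.request


def fetch_url(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def refresh_registry(registry, cache, fetch=fetch_url):
    """Fetch every registered data file, keeping the old copy on failure.

    registry: {module_name: url}; cache: {module_name: bytes}.
    Returns the list of module names whose fetch failed.
    """
    failed = []
    for name, url in registry.items():
        try:
            cache[name] = fetch(url)
        except (OSError, ValueError):
            # Network errors (URLError is an OSError) or a malformed URL:
            # leave the previously cached copy untouched.
            failed.append(name)
    return failed
```

Keeping the last good copy when a fetch fails means a module whose author's site goes down would still be listed, and the failure list gives us something to follow up on.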
We've always talked about how nice it would be to have a user rating/comment system attached to it, so that you can distinguish between mod_foo that was abandoned in 1997 and the one that was used successfully yesterday.
--
Rich Bowen
rbowen@rcbowen.com
rbowen@apache.org
---------------------------------------------------------------------
Re: modules.apache.org data
Posted by Lee Fisher <bl...@gmail.com>.
On 7/11/11 6:28 AM, Rich Bowen wrote:
> So ... Where are we on this? Has anybody put any time into it yet?
> Where could someone jump in?
I've spent a few hours trying to make sense of the data.
It'd help to know what the resulting goal is.
The tarball includes php source and a .SQL file from MySQL. The .SQL file
contains a few tables, the one interesting table being the modules table
(including two password fields). There are also tables that appear to be
more related to tracking logins, including a nice chunk of spam.
I've started manually converting the .SQL file into an .XML file. But
it's not ready for use yet; there's a lot of data normalization that
needs doing. Right now, the main problem with this approach is that the
MySQL/PHP-formatted date field is nonsense as ASCII and I'd need to
convert it somehow to make it useful in XML.
Another approach might be to set up Php and MySQL with this project, then
update the Php to output the data.
But what is the goal of this? To create a new web site that tracks users'
passwords and login dates? Or to create a list of all the modules?
If just the latter, the XML conversion might work. If the goal is to
migrate the user accounts and their passwords, then the focus should
probably be with the Php and a new MySQL site.
Is there any opportunity during this transition to add some fields?
There's a few things that could probably be improved in the current schema.
---------------------------------------------------------------------