Posted to docs@httpd.apache.org by Rich Bowen <rb...@rcbowen.com> on 2011/07/11 15:28:51 UTC

Re: modules.apache.org data

So ... Where are we on this? Has anybody put any time into it yet? Where could someone jump in?


--
Rich Bowen
rbowen@rcbowen.com
rbowen@apache.org







---------------------------------------------------------------------
To unsubscribe, e-mail: docs-unsubscribe@httpd.apache.org
For additional commands, e-mail: docs-help@httpd.apache.org


Re: modules.apache.org data

Posted by Mads Toftum <ma...@toftum.dk>.
On Mon, Jul 11, 2011 at 09:28:51AM -0400, Rich Bowen wrote:
> So ... Where are we on this? Has anybody put any time into it yet? Where could someone jump in?
> 
I've had a quick run through the dump and made a hackish attempt that
got me something close to doap without much effort before $work got in
the way. Fairly simple to do.

vh

Mads Toftum
-- 
http://soulfood.dk



Re: modules.apache.org data

Posted by Lee Fisher <bl...@gmail.com>.
On 7/11/11 12:15 PM, Mads Toftum wrote:
> On Mon, Jul 11, 2011 at 12:05:05PM -0700, Lee Fisher wrote:
>> If I can ignore the password fields, and the login history table,
>> and can manage to convert the Php/MySQL numeric date to a
>> human-readable XML date, then the work I've been doing in XML is
>> usable.
>
> If we're going to go with a simple site created using doap, then you
> shouldn't worry too much about usernames and passwords. If we want to
> bring the accounts back and have a management system again, then we can
> always extract the data at that point.

If no password fields are needed, then I need to figure out the date
field conversion (if the old database registration date is still needed).
If I can get past that, then I think the XML work I've done so far
might be usable, after some more cleanup of the user data.
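
A minimal sketch of the date conversion, assuming the numeric PHP/MySQL
date is a Unix epoch timestamp in seconds (the thread does not confirm
the actual encoding). The function name epoch_to_xsd_datetime is
hypothetical and only for illustration:

```python
from datetime import datetime, timezone


def epoch_to_xsd_datetime(raw):
    """Convert epoch seconds (int or numeric string) to an xsd:dateTime in UTC.

    ASSUMPTION: the old database stores the registration date as Unix
    epoch seconds. If it uses another encoding, this does not apply.
    """
    seconds = int(str(raw).strip())
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


# Example: the epoch itself
print(epoch_to_xsd_datetime(0))  # 1970-01-01T00:00:00Z
```

If the field turns out to be a packed YYYYMMDDHHMMSS number instead, a
strptime-based parse would be the analogous approach.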

If DOAP format is wanted, then we need Programming Language, which we
don't have. Optionally, it'd be nice to have other metadata: Operating
System, Bug Database, Category, and the Url field split into Homepage, Old
Homepage, and Download Page. For Apache-hosted (and other publicly
hosted) projects, we could use the Repository class. Besides Developer and
Maintainer contacts, perhaps also the Documenter, Translator, and Helper
contacts? So, I think we'd need a way to get new data from existing
projects, either before or after publishing this new format.
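
As a rough illustration of what one module record could look like in
DOAP, here is a sketch using Python's standard xml.etree.ElementTree.
The function name module_to_doap, its parameters, and the sample values
are hypothetical; only a handful of DOAP properties are shown, and the
real field mapping from the old database is not settled in this thread:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("doap", DOAP)


def module_to_doap(name, homepage, language=None, download_page=None):
    """Build a small DOAP/RDF document for one module and return it as text.

    ASSUMPTION: name and homepage come from the existing module table;
    language and download_page are new data that would have to be collected.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    project = ET.SubElement(root, f"{{{DOAP}}}Project")
    ET.SubElement(project, f"{{{DOAP}}}name").text = name
    ET.SubElement(project, f"{{{DOAP}}}homepage", {f"{{{RDF}}}resource": homepage})
    if language:
        ET.SubElement(project, f"{{{DOAP}}}programming-language").text = language
    if download_page:
        ET.SubElement(
            project, f"{{{DOAP}}}download-page", {f"{{{RDF}}}resource": download_page}
        )
    return ET.tostring(root, encoding="unicode")


# Placeholder values, not real module data
print(module_to_doap("mod_example", "http://example.org/mod_example", language="C"))
```

Optional fields are simply omitted when absent, so records that lack a
Programming Language could still be published and filled in later.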

Is there an Apache-controlled vocabulary that can be used for DOAP
Keywords? Right now they're quite free-form, if they exist at all. Having an
existing 'tag cloud' could be helpful going forward.

Is this eventually going to be internationalized into non-English languages,
via xml:lang?

The Apache Labs DOAPizer seems to target only internal projects that
live in Svn, since the form uses the Svn folder as the Lab Identifier.
Would the results of this new set of DOAP files be integrated with other
existing Apache-hosted DOAP files? Perhaps the Apache Labs DOAPizer could
be something that httpd module authors use in the new system?
http://labs.apache.org/doapizer.html

BTW, since the DOAP site is down (for me at least), there is a new,
similar project being worked on by the Linux Foundation that might be worth
looking at.
http://spdx.org/



Re: modules.apache.org data

Posted by Mads Toftum <ma...@toftum.dk>.
On Mon, Jul 11, 2011 at 12:05:05PM -0700, Lee Fisher wrote:
> If I can ignore the password fields, and the login history table,
> and can manage to convert the Php/MySQL numeric date to a
> human-readable XML date, then the work I've been doing in XML is
> usable.

If we're going to go with a simple site created using doap, then you
shouldn't worry too much about usernames and passwords. If we want to
bring the accounts back and have a management system again, then we can
always extract the data at that point.

vh.

Mads Toftum
-- 
http://soulfood.dk



Re: modules.apache.org data

Posted by Lee Fisher <bl...@gmail.com>.
 > In my ideal world, we'd want something that the module authors
 > can update themselves, and doesn't require a lot of maintenance
 > on our end. Having something where they host some kind of data
 > file on their end that we retrieve periodically seems good to
 > me, but I don't know the details of making that happen. Having
 > a database thingy that they update on our end seems fine, too.

If I can ignore the password fields, and the login history table, and 
can manage to convert the Php/MySQL numeric date to a human-readable XML 
date, then the work I've been doing in XML is usable.

If the login history and password fields need to be used in the 
resulting system, I think I have to drop this XML work, and go back to 
getting the Php/MySQL running as a new system.

A new system might be able to use the email fields to reset things, and
ask for a new password. If the solution is wiki-based, it seems likely they'd
have to create a new account on the wiki, and it might be OK if we don't
have to translate the old accounts (and passwords) over to new wiki
accounts.

So, besides the password field issue and the date field conversion, I can
continue to normalize the XML of the base data. But if login history or
passwords need to be carried forward, I should start over and figure out
how to rebuild the old Php/MySQL web app, so that 100% of its tables can be
migrated forward.

Is there any high-level Apache DOAP, FOAF policy that would help decide 
the proper direction w/r/t hosted XML metadata?

If a DOAP file approach is used, shouldn't these DOAP files be 
distributed with the module source, and thus probably be hosted in Svn, 
or generated when generating a build?

PS: The current module data has a variety of data normalization issues
that need addressing. There are multiple broken email addresses: some
null, some in " at " and " dot " notation. There are multiple empty and
broken URLs. Some HTML markup is included in the SQL string fields. Some
records include multiple URLs (one for the .c module source, one for the
home page), and since there's only one URL field, they overload the title or
description field. Many fields contain unnecessary newlines and trailing
or leading whitespace. And there are a few int'l names that will
need robust character set support in the final system.
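
A few of the cleanup steps above can be sketched as follows. This is an
illustration under assumptions, not a complete normalizer: the helper
names clean_text and clean_email are hypothetical, the email check is a
deliberately loose pattern rather than full address validation, and the
sample inputs are made up:

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")          # crude match for leftover HTML tags
_WHITESPACE = re.compile(r"\s+")       # runs of spaces, tabs, newlines
_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # loose sanity check only


def clean_text(value):
    """Strip HTML tags, decode entities, and collapse whitespace."""
    if value is None:
        return ""
    without_tags = _TAG.sub(" ", value)
    return _WHITESPACE.sub(" ", html.unescape(without_tags)).strip()


def clean_email(value):
    """Undo ' at ' / ' dot ' obfuscation; return None for null or broken input."""
    if not value:
        return None
    addr = value.strip()
    addr = re.sub(r"\s+at\s+", "@", addr, flags=re.IGNORECASE)
    addr = re.sub(r"\s+dot\s+", ".", addr, flags=re.IGNORECASE)
    return addr if _EMAIL.match(addr) else None


print(clean_email("jane at example dot org"))   # jane@example.org
print(clean_text("  <b>mod_foo</b>\n module ")) # mod_foo module
```

Returning None for addresses that cannot be repaired keeps the broken
records visible so they can be fixed by hand or re-requested from the
module authors.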




Re: modules.apache.org data

Posted by Rich Bowen <rb...@rcbowen.com>.
On Jul 11, 2011, at 1:12 PM, Lee Fisher wrote:

> On 7/11/11 6:28 AM, Rich Bowen wrote:
> > So ... Where are we on this? Has anybody put any time into it yet?
> > Where could someone jump in?
> 
> I've spent a few hours trying to make sense of the data.
> 
> It'd help to know what the resulting goal is.
> 
> The tarball includes php source and a .SQL file from MySQL. The .SQL file contains a few tables, the one interesting one being the table of modules (including two password fields). There are also tables that appear to be more related to tracking logins, including a nice chunk of spam.
> 
> I've started manually converting the .SQL file into an .XML file. But it's not ready for use yet; there's a lot of data normalization that needs doing. Right now, the main problem with this approach is that the MySQL/PHP-formatted date field is nonsense as ASCII and I'd need to convert it somehow to make it useful in XML.
> 
> Another approach might be to set up Php and MySQL with this project, then update the Php to output the data.
> 
> But what is the goal of this? To create a new web site that tracks users' passwords and login dates? Or to create a list of all the modules?
> 
> If just the latter, the XML conversion might work. If the goal is to migrate the user accounts and their passwords, then the focus should probably be with the Php and a new MySQL site.
> 
> Is there any opportunity during this transition to add some fields? There's a few things that could probably be improved in the current schema.

At a high level, the goal is to replace modules.apache.org with a new modules.apache.org which we host and operate. Specific implementation details are entirely up to us - this is a do-ocracy.

In my ideal world, we'd want something that the module authors can update themselves, and doesn't require a lot of maintenance on our end. Having something where they host some kind of data file on their end that we retrieve periodically seems good to me, but I don't know the details of making that happen. Having a database thingy that they update on our end seems fine, too.
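
One way the "data file on their end that we retrieve periodically" idea
could work is sketched below, assuming each author publishes a DOAP file
at a URL they register with us. The function names parse_doap and
fetch_module_record, the SAMPLE document, and the choice of fields are
all hypothetical; scheduling, caching, and error handling are left out:

```python
import urllib.request
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"


def parse_doap(xml_data):
    """Extract name and homepage from a DOAP document (str or bytes)."""
    root = ET.fromstring(xml_data)
    # The Project element may be the root or a child of rdf:RDF.
    if root.tag == f"{{{DOAP}}}Project":
        project = root
    else:
        project = root.find(f"{{{DOAP}}}Project")
    if project is None:
        raise ValueError("no doap:Project element found")
    homepage = project.find(f"{{{DOAP}}}homepage")
    return {
        "name": project.findtext(f"{{{DOAP}}}name"),
        "homepage": homepage.get(f"{{{RDF}}}resource") if homepage is not None else None,
    }


def fetch_module_record(url, timeout=10):
    """Download an author-hosted DOAP file and parse it (not called here)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_doap(response.read())


# Made-up sample standing in for a file hosted by a module author
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:doap="http://usefulinc.com/ns/doap#">
  <doap:Project>
    <doap:name>mod_example</doap:name>
    <doap:homepage rdf:resource="http://example.org/mod_example"/>
  </doap:Project>
</rdf:RDF>"""

print(parse_doap(SAMPLE))
```

A periodic job could call fetch_module_record for each registered URL
and keep the last good copy when a fetch or parse fails, so an author's
outage does not drop their module from the listing.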

We've always talked about how nice it would be to have a user rating/comment system attached to it, so that you can distinguish between mod_foo that was abandoned in 1997 and the one that was used successfully yesterday.

--
Rich Bowen
rbowen@rcbowen.com
rbowen@apache.org









Re: modules.apache.org data

Posted by Lee Fisher <bl...@gmail.com>.
On 7/11/11 6:28 AM, Rich Bowen wrote:
 > So ... Where are we on this? Has anybody put any time into it yet?
 > Where could someone jump in?

I've spent a few hours trying to make sense of the data.

It'd help to know what the resulting goal is.

The tarball includes php source and a .SQL file from MySQL. The .SQL file
contains a few tables, the one interesting one being the table of modules
(including two password fields). There are also tables that appear to be
more related to tracking logins, including a nice chunk of spam.

I've started manually converting the .SQL file into an .XML file. But
it's not ready for use yet; there's a lot of data normalization that
needs doing. Right now, the main problem with this approach is that the
MySQL/PHP-formatted date field is nonsense as ASCII and I'd need to
convert it somehow to make it useful in XML.

Another approach might be to set up Php and MySQL with this project, then
update the Php to output the data.

But what is the goal of this? To create a new web site that tracks users'
passwords and login dates? Or to create a list of all the modules?

If just the latter, the XML conversion might work. If the goal is to 
migrate the user accounts and their passwords, then the focus should 
probably be with the Php and a new MySQL site.

Is there any opportunity during this transition to add some fields? 
There's a few things that could probably be improved in the current schema.

