Posted to dev@community.apache.org by "Sebb (JIRA)" <ji...@apache.org> on 2018/07/29 14:15:00 UTC

[jira] [Commented] (COMDEV-292) Mailglomper does not handle renamed lists well

    [ https://issues.apache.org/jira/browse/COMDEV-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561121#comment-16561121 ] 

Sebb commented on COMDEV-292:
-----------------------------

It would save a lot of network traffic if the statistics were calculated on the mail-archive host.

> Mailglomper does not handle renamed lists well
> ----------------------------------------------
>
>                 Key: COMDEV-292
>                 URL: https://issues.apache.org/jira/browse/COMDEV-292
>             Project: Community Development
>          Issue Type: Bug
>          Components: Reporter Tool
>            Reporter: Sebb
>            Priority: Major
>
> The mailglomper script does not take account of renamed mailing lists.
> This can result in double counting the activity for a project.
> For example, commits@libcloud was renamed to notifications@libcloud in March 2014.
> However the data in the maildata_extended.json file includes weekly epoch entries
> for commits:
> 1507161600 2017-10-05 00:00:00 UTC
> to
> 1524096000 2018-04-19 00:00:00 UTC
> whereas notifications has:
> 1515024000 2018-01-04 00:00:00 UTC
> to
> 1531958400 2018-07-19 00:00:00 UTC
> The weekly counts agree for the overlap period.
> If the commits mbox files had still been present up to April 2018, there would be an index entry for the list; if a redirect was also in place, the code would see the redirected files.
> I think the code should probably ignore redirects if that's possible.
> When a list is renamed, the old data ought to be dropped; otherwise it may be double-counted.
> Also the obsolete entries will gradually accumulate.
> This applies to both the maildata_weekly.json and maildata_extended.json files.
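
To illustrate the double counting described above, here is a minimal sketch of one possible clean-up step. It is not the mailglomper code. The data layout (list name mapped to weekly epoch mapped to message count), the list keys, the rename table and the counts are all assumptions made up for the example; only the epoch values come from the issue. The hypothetical helper fold_renamed_lists() moves any weeks that exist only under the old name to the new name, keeps a single copy of weeks present under both, and then drops the old entry.

```python
from datetime import datetime, timezone


def fold_renamed_lists(weekly, renames):
    """Return a copy of `weekly` with renamed lists folded into their new names.

    `weekly` is ASSUMED to map list name -> {weekly epoch (str) -> count};
    the real layout of maildata_extended.json is not shown in the issue.
    `renames` is a hypothetical table mapping old list name -> new list name.
    """
    merged = {name: dict(weeks) for name, weeks in weekly.items()}
    for old, new in renames.items():
        old_weeks = merged.pop(old, None)  # drop the obsolete entry
        if old_weeks is None:
            continue
        target = merged.setdefault(new, {})
        for epoch, count in old_weeks.items():
            # Keep the new list's value when both have the week,
            # so overlapping weeks are counted only once.
            target.setdefault(epoch, count)
    return merged


def total(weekly):
    """Sum all weekly counts across all lists."""
    return sum(sum(weeks.values()) for weeks in weekly.values())


def as_utc(epoch):
    """Format a weekly epoch the way the issue does."""
    moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


# Epochs are taken from the issue; the counts are invented sample values.
sample = {
    "libcloud-commits": {"1507161600": 4, "1515024000": 7, "1524096000": 5},
    "libcloud-notifications": {"1515024000": 7, "1524096000": 5, "1531958400": 3},
}
renames = {"libcloud-commits": "libcloud-notifications"}

cleaned = fold_renamed_lists(sample, renames)
print(total(sample), total(cleaned))
for epoch in sorted(cleaned["libcloud-notifications"]):
    print(epoch, as_utc(epoch), cleaned["libcloud-notifications"][epoch])
```

Keeping the weeks that appear only under the old name is a design choice made for this sketch; simply deleting the old entry, as the issue suggests, would be the one-line alternative.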


