Posted to dev@crunch.apache.org by "Roman Shaposhnik (JIRA)" <ji...@apache.org> on 2012/09/19 18:53:07 UTC

[jira] [Created] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Roman Shaposhnik created CRUNCH-69:
--------------------------------------

             Summary: it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
                 Key: CRUNCH-69
                 URL: https://issues.apache.org/jira/browse/CRUNCH-69
             Project: Crunch
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.3.0
            Reporter: Roman Shaposhnik
            Assignee: Josh Wills
            Priority: Minor


Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of the examples' resources.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463986#comment-13463986 ] 

Josh Wills commented on CRUNCH-69:
----------------------------------

+1, thanks Brock!
                

[jira] [Commented] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461181#comment-13461181 ] 

Matthias Friedrich commented on CRUNCH-69:
------------------------------------------

I think we have to anonymize these logs before committing them, or preferably make up artificial data. I don't know about US laws, but in Germany IP addresses are considered private data; it would be illegal to store them for longer than a few days, much less publish them.
                

[jira] [Commented] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463983#comment-13463983 ] 

Roman Shaposhnik commented on CRUNCH-69:
----------------------------------------

+1 (non-binding ;-))
                

[jira] [Updated] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated CRUNCH-69:
-------------------------------

    Attachment: access_log.zip

I do, attached!
                

[jira] [Commented] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461468#comment-13461468 ] 

Brock Noland commented on CRUNCH-69:
------------------------------------

Not a bad idea. There is a script hanging around which does this. I'll get that done and then check it in.
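The script mentioned here is not part of this thread. Purely as an illustration of the idea, the following is a minimal sketch of one way to anonymize an access log while keeping per-IP aggregation meaningful. It assumes each log line starts with an IPv4 client address (as in the common Apache access log layout); the function names `pseudonymize_ip` and `anonymize_line` and the choice of a keyed hash are hypothetical, not taken from Crunch or from the attached files.

```python
import hashlib
import hmac
import re

# Assumption: the client address is a dotted-quad IPv4 at the start of the line.
LEADING_IPV4 = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\b")


def pseudonymize_ip(ip: str, secret: bytes) -> str:
    """Map an IP to a stable fake address in 10.0.0.0/8 using a keyed hash.

    The same input always yields the same output for a given secret, so
    grouping by IP still works. Only 24 bits of the digest are kept, so two
    distinct real addresses can occasionally map to the same fake one.
    """
    digest = hmac.new(secret, ip.encode("ascii"), hashlib.sha256).digest()
    return "10.{}.{}.{}".format(digest[0], digest[1], digest[2])


def anonymize_line(line: str, secret: bytes) -> str:
    """Replace the leading client IP of one log line; leave the rest untouched."""
    return LEADING_IPV4.sub(
        lambda m: pseudonymize_ip(m.group(1), secret), line, count=1
    )
```

Applied line by line with a secret that is discarded afterwards, this keeps the byte counts and request fields intact while removing the original addresses. Whether this kind of pseudonymization is sufficient for a given legal requirement is a separate question that the sketch does not answer.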
                

[jira] [Commented] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459213#comment-13459213 ] 

Josh Wills commented on CRUNCH-69:
----------------------------------

Cool, thanks! Roman, should this input go under src/main/resources in a tar.gz file?
                

[jira] [Commented] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459180#comment-13459180 ] 

Josh Wills commented on CRUNCH-69:
----------------------------------

[~brocknoland] do you have some sample inputs lying around?
                

[jira] [Assigned] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland reassigned CRUNCH-69:
----------------------------------

    Assignee: Brock Noland  (was: Josh Wills)
    

[jira] [Updated] (CRUNCH-69) it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples

Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated CRUNCH-69:
-------------------------------

    Attachment: access_logs.tar.gz

OK, the log is now anonymized. I'd like to place this in src/main/resources.

Let me know and I'll add it.
                

[jira] [Updated] (CRUNCH-69) AverageBytesByIP and TotalBytesByIP should have sample data

Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated CRUNCH-69:
-------------------------------

    Summary: AverageBytesByIP and TotalBytesByIP should have sample data  (was: it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples)
    

[jira] [Resolved] (CRUNCH-69) AverageBytesByIP and TotalBytesByIP should have sample data

Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland resolved CRUNCH-69.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4.0

Committed here in 1c58b6fb24bb178571510be681ca8a588a2aa022
                