Posted to dev@crunch.apache.org by "Roman Shaposhnik (JIRA)" <ji...@apache.org> on 2012/09/19 18:53:07 UTC
[jira] [Created] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Roman Shaposhnik created CRUNCH-69:
--------------------------------------
Summary: it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
Key: CRUNCH-69
URL: https://issues.apache.org/jira/browse/CRUNCH-69
Project: Crunch
Issue Type: Improvement
Components: Core
Affects Versions: 0.3.0
Reporter: Roman Shaposhnik
Assignee: Josh Wills
Priority: Minor
Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463986#comment-13463986 ]
Josh Wills commented on CRUNCH-69:
----------------------------------
+1, thanks Brock!
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Brock Noland
> Priority: Minor
> Attachments: access_logs.tar.gz, access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Commented] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461181#comment-13461181 ]
Matthias Friedrich commented on CRUNCH-69:
------------------------------------------
I think we have to anonymize these logs before committing them, or preferably make up artificial data. I don't know about US law, but in Germany IP addresses are considered private data; it would be illegal to store them for longer than a few days, much less publish them.
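
The anonymization suggested above could look roughly like the following. This is a minimal sketch, not the script that was actually used for this issue: it assumes each log line starts with the client's IPv4 address (as in Apache's Common Log Format, which the thread does not confirm), and the names `IpAnonymizer` and `anonymize_lines` are hypothetical. Each distinct address is replaced by a stable made-up address in 10.0.0.0/8, so per-IP totals and averages still come out the same.

```python
import ipaddress
import re

# Assumption: the client address is the first field on each line, as in
# Apache's Common Log Format. The thread does not state the log format.
LEADING_IPV4 = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})(?=\s)")


class IpAnonymizer:
    """Replaces each distinct client IP with a stable made-up address."""

    def __init__(self, base="10.0.0.0"):
        self._base = int(ipaddress.IPv4Address(base))
        self._mapping = {}

    def pseudonym(self, ip):
        # Hand out base+1, base+2, ... in order of first appearance.
        if ip not in self._mapping:
            offset = len(self._mapping) + 1
            self._mapping[ip] = str(ipaddress.IPv4Address(self._base + offset))
        return self._mapping[ip]

    def anonymize_line(self, line):
        # Lines that do not start with an IPv4 address pass through unchanged.
        return LEADING_IPV4.sub(
            lambda match: self.pseudonym(match.group(1)), line, count=1
        )


def anonymize_lines(lines):
    """Anonymizes a sequence of log lines with one shared mapping."""
    anonymizer = IpAnonymizer()
    return [anonymizer.anonymize_line(line) for line in lines]


if __name__ == "__main__":
    # Made-up sample lines using documentation-only addresses (RFC 5737).
    sample = [
        '192.0.2.7 - - [19/Sep/2012:18:53:07 +0000] "GET / HTTP/1.1" 200 512',
        '198.51.100.3 - - [19/Sep/2012:18:53:09 +0000] "GET /a HTTP/1.1" 200 100',
        '192.0.2.7 - - [19/Sep/2012:18:53:11 +0000] "GET /b HTTP/1.1" 404 20',
    ]
    for line in anonymize_lines(sample):
        print(line)
```

The mapping lives only in memory and is discarded when the run ends, so the published file alone cannot be used to recover the original addresses. Only the leading address is rewritten; addresses that appear elsewhere in a line (for example in a referrer URL) would need separate handling.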
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Josh Wills
> Priority: Minor
> Attachments: access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Commented] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463983#comment-13463983 ]
Roman Shaposhnik commented on CRUNCH-69:
----------------------------------------
+1 (non-binding ;-))
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Brock Noland
> Priority: Minor
> Attachments: access_logs.tar.gz, access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Updated] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland updated CRUNCH-69:
-------------------------------
Attachment: access_log.zip
I do, attached!
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Josh Wills
> Priority: Minor
> Attachments: access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Commented] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461468#comment-13461468 ]
Brock Noland commented on CRUNCH-69:
------------------------------------
Not a bad idea. There is a script hanging around which does this. I'll get that done and then check it in.
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Brock Noland
> Priority: Minor
> Attachments: access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Commented] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459213#comment-13459213 ]
Josh Wills commented on CRUNCH-69:
----------------------------------
Cool, thanks! Roman, should this input go under src/main/resources in a tar.gz file?
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Josh Wills
> Priority: Minor
> Attachments: access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Commented] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459180#comment-13459180 ]
Josh Wills commented on CRUNCH-69:
----------------------------------
[~brocknoland] do you have some sample inputs lying around?
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Josh Wills
> Priority: Minor
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Assigned] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland reassigned CRUNCH-69:
----------------------------------
Assignee: Brock Noland (was: Josh Wills)
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Brock Noland
> Priority: Minor
> Attachments: access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Updated] (CRUNCH-69) it would be useful to include sample
data for AverageBytesByIP and TotalBytesByIP examples
Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland updated CRUNCH-69:
-------------------------------
Attachment: access_logs.tar.gz
OK, the log is now anonymized. I'd like to place this in src/main/resources.
Let me know and I'll add it.
> it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples
> ------------------------------------------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Brock Noland
> Priority: Minor
> Attachments: access_logs.tar.gz, access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Updated] (CRUNCH-69) AverageBytesByIP and TotalBytesByIP
should have sample data
Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland updated CRUNCH-69:
-------------------------------
Summary: AverageBytesByIP and TotalBytesByIP should have sample data (was: it would be useful to include sample data for AverageBytesByIP and TotalBytesByIP examples)
> AverageBytesByIP and TotalBytesByIP should have sample data
> -----------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Brock Noland
> Priority: Minor
> Attachments: access_logs.tar.gz, access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.
[jira] [Resolved] (CRUNCH-69) AverageBytesByIP and TotalBytesByIP
should have sample data
Posted by "Brock Noland (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CRUNCH-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland resolved CRUNCH-69.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.4.0
Committed here in 1c58b6fb24bb178571510be681ca8a588a2aa022
> AverageBytesByIP and TotalBytesByIP should have sample data
> -----------------------------------------------------------
>
> Key: CRUNCH-69
> URL: https://issues.apache.org/jira/browse/CRUNCH-69
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.3.0
> Reporter: Roman Shaposhnik
> Assignee: Brock Noland
> Priority: Minor
> Fix For: 0.4.0
>
> Attachments: access_logs.tar.gz, access_log.zip
>
>
> Currently one has to wonder what kind of input to give those examples. It would be very nice if there existed a canonical set of input files as part of example's resources.