Posted to common-dev@hadoop.apache.org by "Runping Qi (JIRA)" <ji...@apache.org> on 2007/05/09 19:59:15 UTC
[jira] Created: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
A configurable limit on the number of unique values should be set on the UniqueValueCount and ValueHistogram aggregators
------------------------------------------------------------------------------------------------------------------------
Key: HADOOP-1342
URL: https://issues.apache.org/jira/browse/HADOOP-1342
Project: Hadoop
Issue Type: Improvement
Reporter: Runping Qi
In the current implementation, the number of unique values may grow without bound, eventually causing an out-of-memory error.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated HADOOP-1342:
---------------------------------
Fix Version/s: 0.14.0
Status: Open (was: Patch Available)
I'm not sure what's changed, but this no longer passes unit tests against trunk.
Testcase: testAggregates took 1.59 sec
FAILED
expected:<...5...> but was:<...9...>
> A configurable limit on the number of unique values should be set on the UniqueValueCount and ValueHistogram aggregators
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1342
> URL: https://issues.apache.org/jira/browse/HADOOP-1342
> Project: Hadoop
> Issue Type: Improvement
> Reporter: Runping Qi
> Assigned To: Runping Qi
> Fix For: 0.14.0
>
> Attachments: patch-1342.txt
>
>
> In the current implementation, the number of unique values may grow without bound, eventually causing an out-of-memory error.
[jira] Commented: (HADOOP-1342) A configurable limit on the number
of unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495185 ]
Runping Qi commented on HADOOP-1342:
------------------------------------
That explains why the unit test failed.
The patch failed to apply because r537300 made some formatting changes to TestAggregates.java, which caused conflicts.
I will re-generate the patch next.
[jira] Commented: (HADOOP-1342) A configurable limit on the number
of unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495962 ]
Hadoop QA commented on HADOOP-1342:
-----------------------------------
Integrated in Hadoop-Nightly #89 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/89/)
[jira] Updated: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-1342:
-------------------------------
Attachment: patch-1342.txt
This patch adds a limit on the number of unique values for the UniqueValueCount aggregator. If the actual number of unique values exceeds the limit, the reported count will be limit + 1.
The limit is configured through the attribute "aggregate.max.num.unique.values".
It can be set by calling job.setLong("aggregate.max.num.unique.values", 200).
The default is Long.MAX_VALUE (same as the current behavior).
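The capping behavior described above can be illustrated with a minimal, self-contained sketch. This is not the code from patch-1342.txt; the class name BoundedUniqueCounter and its methods are hypothetical, and the sketch only assumes that the aggregator stops recording new values once it holds limit + 1 of them.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of a capped unique-value counter.
// It is NOT the Hadoop UniqValueCount implementation.
class BoundedUniqueCounter {
    private final long maxUniqueValues;
    private final Set<String> seen = new HashSet<>();

    BoundedUniqueCounter(long maxUniqueValues) {
        this.maxUniqueValues = maxUniqueValues;
    }

    // Record a value only while the set holds at most maxUniqueValues
    // entries, so the set can reach limit + 1 entries and then stops growing.
    void add(Object value) {
        if (seen.size() <= maxUniqueValues) {
            seen.add(value.toString());
        }
    }

    // Exact count below the limit; limit + 1 once the limit is exceeded.
    long count() {
        return seen.size();
    }
}
```

With a limit of 3, feeding in five distinct values leaves count() at 4, which signals "more than 3" while keeping memory bounded. Passing Long.MAX_VALUE as the limit reproduces the unbounded behavior that the patch keeps as its default.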
[jira] Updated: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-1342:
-------------------------------
Attachment: patch-1342.txt
A new patch with the conflicts against trunk resolved.
[jira] Updated: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated HADOOP-1342:
---------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Runping!
[jira] Commented: (HADOOP-1342) A configurable limit on the number
of unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495191 ]
Hadoop QA commented on HADOOP-1342:
-----------------------------------
+1
http://issues.apache.org/jira/secure/attachment/12357147/patch-1342.txt applied and successfully tested against trunk revision r537295.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/136/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/136/console
[jira] Commented: (HADOOP-1342) A configurable limit on the number
of unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495183 ]
Doug Cutting commented on HADOOP-1342:
--------------------------------------
The patch simply fails to apply to trunk.
[jira] Commented: (HADOOP-1342) A configurable limit on the number
of unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495064 ]
Hadoop QA commented on HADOOP-1342:
-----------------------------------
+1
http://issues.apache.org/jira/secure/attachment/12357108/patch-1342.txt applied and successfully tested against trunk revision r536583.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/132/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/132/console
[jira] Updated: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-1342:
-------------------------------
Status: Patch Available (was: Open)
[jira] Assigned: (HADOOP-1342) A configurable limit on the number
of unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi reassigned HADOOP-1342:
----------------------------------
Assignee: Runping Qi
[jira] Commented: (HADOOP-1342) A configurable limit on the number
of unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495178 ]
Runping Qi commented on HADOOP-1342:
------------------------------------
It looks like the changes to TestAggregates were applied, but the changes to the aggregate code were not.
Can you try to re-apply the patch? Or send me the following files in your trunk:
ValueAggregatorBaseDescriptor.java and
UniqValueCount.java
so that I can take a look at them.
[jira] Updated: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-1342:
-------------------------------
Attachment: (was: patch-1342.txt)
[jira] Updated: (HADOOP-1342) A configurable limit on the number of
unique values should be set on the UniqueValueCount and ValueHistogram
aggregators
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-1342:
-------------------------------
Status: Patch Available (was: Open)