Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2010/07/29 01:14:16 UTC

[jira] Created: (PIG-1524) 'Proactive spill count' is misleading

'Proactive spill count' is misleading
-------------------------------------

                 Key: PIG-1524
                 URL: https://issues.apache.org/jira/browse/PIG-1524
             Project: Pig
          Issue Type: Bug
            Reporter: Thejas M Nair
             Fix For: 0.8.0


InternalCacheBag, InternalSortedBag, and InternalDistinctBag increment this counter for every record they write to disk once they exceed the memory limit. This number is misleading.

Instead, this counter should be incremented by 1 for each instance of these bags that has spilled to disk.
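
A minimal sketch of the difference between the two counting behaviours, for illustration only. The names SpillStats, SketchBag and memLimitRecords are hypothetical and are not taken from the Pig source; the memory limit is modelled as a simple record count and the spill file as an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a job-level counter; not Pig's counter API. */
class SpillStats {
    long perRecordCount; // current behaviour: one increment per record written to disk
    long perBagCount;    // proposed behaviour: one increment per bag that has spilled
}

/** Hypothetical bag that "spills" once it already holds memLimitRecords in memory. */
class SketchBag {
    private final int memLimitRecords;
    private final SpillStats stats;
    private final List<String> inMemory = new ArrayList<>();
    private final List<String> onDisk = new ArrayList<>(); // stands in for a spill file
    private boolean spilled = false;

    SketchBag(int memLimitRecords, SpillStats stats) {
        this.memLimitRecords = memLimitRecords;
        this.stats = stats;
    }

    void add(String record) {
        if (inMemory.size() < memLimitRecords) {
            inMemory.add(record);
            return;
        }
        // Past the memory limit: every further record goes to disk.
        onDisk.add(record);
        stats.perRecordCount++;      // what the counter measures today
        if (!spilled) {
            spilled = true;
            stats.perBagCount++;     // what this issue proposes instead
        }
    }
}

class SpillCountDemo {
    public static void main(String[] args) {
        SpillStats stats = new SpillStats();
        SketchBag bag = new SketchBag(2, stats);
        for (int i = 0; i < 5; i++) {
            bag.add("r" + i);
        }
        System.out.println("per-record counter: " + stats.perRecordCount);
        System.out.println("per-bag counter:    " + stats.perBagCount);
    }
}
```

With a limit of two records and five records added, the per-record counter grows with the data size, while the per-bag counter only says that one bag spilled.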


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900089#action_12900089 ] 

Olga Natkovich commented on PIG-1524:
-------------------------------------

I am reviewing this patch




[jira] Updated: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1524:
-------------------------------

    Status: Patch Available  (was: Open)




[jira] Updated: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1524:
-------------------------------

    Attachment: PIG-1524.patch

In this patch (PIG-1524.patch) I have also refactored the spill code of InternalSortedBag and InternalDistinctBag into a common superclass, SortedSpillBag.
I don't have any new test cases because the counter values will vary depending on the current max memory.
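
A rough sketch of what pulling shared spill logic into a common superclass can look like. This is not the code from the patch: SortedSpillBagSketch, SortedBagSketch, DistinctBagSketch and all of their methods are hypothetical, records are plain strings, and a "spill file" is modelled as an in-memory sorted list.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical superclass holding the spill code shared by both bag types. */
abstract class SortedSpillBagSketch {
    // Each inner list stands in for one sorted spill file on disk.
    private final List<List<String>> spillRuns = new ArrayList<>();

    /** Subclasses choose the in-memory collection (list vs. set). */
    protected abstract Collection<String> contents();

    /** Shared logic: copy to a list, sort, record the run, free the memory. */
    protected void proactiveSpill() {
        Collection<String> inMemory = contents();
        if (inMemory.isEmpty()) {
            return;
        }
        List<String> run = new ArrayList<>(inMemory);
        run.sort(Comparator.naturalOrder());
        spillRuns.add(run);
        inMemory.clear();
    }

    int spillRunCount() { return spillRuns.size(); }

    List<String> spillRun(int index) { return spillRuns.get(index); }
}

/** Keeps duplicates; only the container differs from the distinct variant. */
class SortedBagSketch extends SortedSpillBagSketch {
    private final List<String> records = new ArrayList<>();
    @Override protected Collection<String> contents() { return records; }
    void add(String record) { records.add(record); }
    void spill() { proactiveSpill(); }
}

/** Drops duplicates by holding records in a set. */
class DistinctBagSketch extends SortedSpillBagSketch {
    private final Set<String> records = new HashSet<>();
    @Override protected Collection<String> contents() { return records; }
    void add(String record) { records.add(record); }
    void spill() { proactiveSpill(); }
}

class SortedSpillDemo {
    public static void main(String[] args) {
        DistinctBagSketch bag = new DistinctBagSketch();
        bag.add("b");
        bag.add("a");
        bag.add("b");
        bag.spill();
        System.out.println(bag.spillRun(0));
    }
}
```

In this sketch the subclasses differ only in the container they use, so a counter update placed in proactiveSpill() would cover both bag types in one place.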





[jira] Assigned: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair reassigned PIG-1524:
----------------------------------

    Assignee: Thejas M Nair




[jira] Commented: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900066#action_12900066 ] 

Thejas M Nair commented on PIG-1524:
------------------------------------

All core and contrib tests pass.
Result of test-patch:
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     -1 release audit.  The applied patch generated 418 release audit warnings (more than the trunk's current 415 warnings).

As mentioned earlier, I don't have any new test cases because the counter values will vary depending on the current max memory.
The release audit -1 is caused by javadoc jdiff changes.

Patch is ready for review.





[jira] Updated: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1524:
-------------------------------

    Attachment: PIG-1524.2.patch

New patch with fix for issues found after more tests. 





[jira] Commented: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900107#action_12900107 ] 

Olga Natkovich commented on PIG-1524:
-------------------------------------

+1 with a couple of comment cleanups:

(1) The locking comment is misleading because we don't actually lock anything :)
(2) The comment about moving data from the list to an array for sorting also needs to be clarified.

Other than that, looks good. Please commit.




[jira] Updated: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1524:
-------------------------------

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Patch with modified comments as per Olga's recommendation committed to trunk.




[jira] Updated: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1524:
-------------------------------

    Attachment: PIG-1524.3.patch

Patch with fix for a javadoc warning.





[jira] Updated: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1524:
-------------------------------

    Status: Open  (was: Patch Available)




[jira] Updated: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1524:
-------------------------------

    Status: Patch Available  (was: Open)




[jira] Commented: (PIG-1524) 'Proactive spill count' is misleading

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893428#action_12893428 ] 

Thejas M Nair commented on PIG-1524:
------------------------------------

It would be useful to have a measure of the number of records being spilled as well as the number of bags that spilled,
i.e., have two counters: 'proactive spill count' and 'proactively spilled record count'.
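
One way the two-counter idea could be sketched, under assumptions: the enum constants SPILLED_BAGS and SPILLED_RECORDS and the SpillCounters class are hypothetical names chosen for this example, not Pig's counter names or its reporting API.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical counter names; one per measure mentioned above. */
enum SpillCounterName {
    SPILLED_BAGS,     // number of bags that spilled at least once
    SPILLED_RECORDS   // number of records written to disk by those bags
}

/** Hypothetical holder for counter values; stands in for a job-level counter service. */
class SpillCounters {
    private final Map<SpillCounterName, Long> values = new EnumMap<>(SpillCounterName.class);

    void increment(SpillCounterName name, long delta) {
        values.merge(name, delta, Long::sum);
    }

    long get(SpillCounterName name) {
        return values.getOrDefault(name, 0L);
    }

    /** Called once per bag that spilled, with the number of records it wrote. */
    void recordBagSpill(long recordsWritten) {
        increment(SpillCounterName.SPILLED_BAGS, 1);
        increment(SpillCounterName.SPILLED_RECORDS, recordsWritten);
    }
}

class TwoCounterDemo {
    public static void main(String[] args) {
        SpillCounters counters = new SpillCounters();
        counters.recordBagSpill(3); // first bag wrote 3 records to disk
        counters.recordBagSpill(7); // second bag wrote 7 records to disk
        System.out.println("bags spilled:    " + counters.get(SpillCounterName.SPILLED_BAGS));
        System.out.println("records spilled: " + counters.get(SpillCounterName.SPILLED_RECORDS));
    }
}
```

Keeping both values lets a reader tell many small spills apart from a few large ones, which a single counter cannot show.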


