Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2010/07/27 01:46:16 UTC

[jira] Created: (PIG-1519) Stop relying on finalize() to delete files, close filehandles in bag implementations

Stop relying on finalize() to delete files, close filehandles in bag implementations
------------------------------------------------------------------------------------

                 Key: PIG-1519
                 URL: https://issues.apache.org/jira/browse/PIG-1519
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.8.0
            Reporter: Thejas M Nair
            Priority: Minor


In DefaultAbstractBag and its subclasses, the files used for spilling to disk are deleted in finalize().
The iterators associated with these bags use DataInputStreams but never call close() on them, so the underlying FileInputStream.close() is called only through FileInputStream.finalize().

Relying on finalize() has performance implications and also makes it hard to predict when the resources will be freed.

WeakReferences can be used to avoid the use of finalize(). See http://java.sun.com/developer/technicalArticles/javase/finalization/ (look for "An Alternative to Finalization").
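
A rough sketch of the weak-reference approach, to make the idea concrete. This is not Pig code: SpillFileCleaner, register(), drain() and pendingCount() are hypothetical names chosen for illustration, and the sketch assumes each bag registers its spill file when the file is created.

```java
import java.io.File;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not part of Pig): deletes spill files once their owner is unreachable. */
final class SpillFileCleaner {
    // The garbage collector enqueues a reference here after its owner becomes unreachable.
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the reference objects themselves strongly reachable until they are processed.
    private final Set<FileRef> pending = ConcurrentHashMap.newKeySet();

    /** Weak reference to the owner (e.g. a bag) that also remembers the file to delete. */
    private static final class FileRef extends WeakReference<Object> {
        final File file;

        FileRef(Object owner, File file, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.file = file;
        }
    }

    /** Associates a spill file with its owner; the file is deleted after the owner is collected. */
    void register(Object owner, File spillFile) {
        pending.add(new FileRef(owner, spillFile, queue));
    }

    /** Deletes the files of all owners collected so far; returns how many references were processed. */
    int drain() {
        int processed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            FileRef fileRef = (FileRef) ref;
            pending.remove(fileRef);
            fileRef.file.delete(); // best effort; the return value is ignored in this sketch
            processed++;
        }
        return processed;
    }

    /** Number of registered files whose owners have not been processed yet. */
    int pendingCount() {
        return pending.size();
    }
}
```

drain() could be called whenever a new spill file is created, or a daemon thread could block on ReferenceQueue.remove() instead of polling. Either way no class needs a finalize() method, although the time of deletion still depends on when the garbage collector notices that the owner is unreachable.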

I have marked the priority as minor because the objects that rely on finalize() for these resources are allocated only for large bags that spill to disk (see related jira - PIG-1516), so the performance impact of the use of finalize() is not likely to be significant. Also, I haven't come across any case where we have run out of these resources because the finalizer thread has not freed them yet.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1519) Stop relying on finalize() to delete files, close filehandles in bag implementations

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892551#action_12892551 ] 

Thejas M Nair commented on PIG-1519:
------------------------------------

As part of these changes, we should consider keeping a (weak?) reference in the bags to all the iterators that have been created, and calling clear() on them (a new method in the iterator impl class) to close the DataInputStreams and invalidate the iterators.
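
A minimal sketch of that idea, under stated assumptions. TrackingBag and SpillIterator are hypothetical stand-ins for the bag and iterator implementation classes, nextInt() is an invented accessor, and a ByteArrayInputStream replaces the FileInputStream over the spill file so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical iterator over spilled data; stands in for the bag iterator impl class. */
final class SpillIterator {
    private DataInputStream in;

    SpillIterator(DataInputStream in) {
        this.in = in;
    }

    /** False once clear() has closed the stream. */
    boolean isValid() {
        return in != null;
    }

    int nextInt() {
        if (in == null) {
            throw new IllegalStateException("iterator was invalidated by clear()");
        }
        try {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the stream explicitly instead of waiting for finalize(), and invalidates the iterator. */
    void clear() {
        if (in == null) {
            return;
        }
        try {
            in.close();
        } catch (IOException e) {
            // Best effort in this sketch: nothing useful can be done here.
        }
        in = null;
    }
}

/** Hypothetical bag that remembers its iterators through weak references. */
final class TrackingBag {
    private final byte[] spilled; // assumption: stands in for the contents of a spill file
    private final List<WeakReference<SpillIterator>> iterators = new ArrayList<>();

    TrackingBag(byte[] spilled) {
        this.spilled = spilled;
    }

    synchronized SpillIterator iterator() {
        SpillIterator it = new SpillIterator(
                new DataInputStream(new ByteArrayInputStream(spilled)));
        // The weak reference does not keep an abandoned iterator alive.
        iterators.add(new WeakReference<>(it));
        return it;
    }

    /** Closes every iterator that is still reachable; returns how many were closed. */
    synchronized int clear() {
        int closed = 0;
        for (WeakReference<SpillIterator> ref : iterators) {
            SpillIterator it = ref.get();
            if (it != null && it.isValid()) {
                it.clear();
                closed++;
            }
        }
        iterators.clear();
        return closed;
    }
}
```

One caveat with this sketch: an iterator that is collected before clear() runs never has its stream closed explicitly, so weak references alone do not remove the dependence on FileInputStream.finalize(). Combining them with a ReferenceQueue, or closing the stream as soon as the iterator is exhausted, would be needed to cover that case.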


