Posted to issues@hawq.apache.org by "George Caragea (JIRA)" <ji...@apache.org> on 2016/04/06 21:00:27 UTC

[jira] [Commented] (HAWQ-633) Do not error out when cleaning up workfiles during AbortTransaction

    [ https://issues.apache.org/jira/browse/HAWQ-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228895#comment-15228895 ] 

George Caragea commented on HAWQ-633:
-------------------------------------

1) Yes, workfile_mgr_unlink_directory() is called in two scenarios: 
   A. When an operator is done with execution and wants to clean up its own spill files.
   B. When an error has occurred and the transaction is aborting, and we want to clean up all the spill files that were created.

2) Here are the stack traces for the two scenarios. A pointer to workfile_mgr_unlink_directory is stored in the ```Cache::cleanupEntry``` field. 
   A. Normal execution: 

{noformat}
           [Operator such as HJ, HashAgg]
                workfile_mgr_close_set()
                    Cache_Release()
                        Cache_ReleaseAcquired()
                              Cache::cleanupEntry()
{noformat}

   B. Abort xact: 

{noformat}
      AbortTransaction()
         AtEOXact_Files() or AtProcExit_Files()
              CleanupTempFiles()
                  workfile_mgr_cleanup()
                       Cache_SurrenderClientEntries()       
                            Cache_ReleaseAcquired()
                                  Cache::cleanupEntry()
{noformat}
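
To make the two call paths easier to follow, here is a minimal C sketch of the callback pattern. It is not the HAWQ source: the struct layouts and the function names close_set(), surrender_client_entries(), release_acquired() and unlink_directory_stub() are hypothetical stand-ins for the functions in the traces above. The only thing taken from the traces is that both paths end up invoking the function pointer stored in the cache's cleanupEntry field.

```c
#include <stdio.h>

/* Hypothetical, simplified types; the real HAWQ structures differ. */
typedef struct CacheEntry {
    const char *dir;   /* spill directory owned by this entry */
    int cleaned;       /* how many times cleanup ran (for illustration) */
} CacheEntry;

typedef void (*CleanupFn)(CacheEntry *entry);

typedef struct Cache {
    CleanupFn cleanupEntry;   /* would point at the unlink-directory routine */
} Cache;

/* Stand-in for workfile_mgr_unlink_directory(): only records the call. */
static void unlink_directory_stub(CacheEntry *entry)
{
    printf("cleaning up %s\n", entry->dir);
    entry->cleaned++;
}

/* Stand-in for Cache_ReleaseAcquired(): the common tail of both paths. */
static void release_acquired(Cache *cache, CacheEntry *entry)
{
    cache->cleanupEntry(entry);
}

/* Scenario A: an operator closes its own work set after normal execution. */
static void close_set(Cache *cache, CacheEntry *entry)
{
    release_acquired(cache, entry);
}

/* Scenario B: on abort, every entry still held by the client is released. */
static void surrender_client_entries(Cache *cache, CacheEntry *entries, int n)
{
    for (int i = 0; i < n; i++)
        release_acquired(cache, &entries[i]);
}
```

Because both scenarios share the same callback, any change to its error handling affects normal execution and abort alike, so the callback needs some way to tell which context it is running in.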




> Do not error out when cleaning up workfiles during AbortTransaction
> -------------------------------------------------------------------
>
>                 Key: HAWQ-633
>                 URL: https://issues.apache.org/jira/browse/HAWQ-633
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Query Execution
>            Reporter: George Caragea
>            Assignee: George Caragea
>
> If we run out of disk space when creating temporary spill files, we will error out while writing to disk and abort the transaction. 
> When aborting the transaction, as part of the cleanup code we call workfile_mgr_unlink_directory() to delete the directory containing all the work files. But in some cases that directory might not even have been created, because of the out-of-disk-space condition. 
> Instead of erroring out again, just give a warning and continue with the abort code. 
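
The behavior proposed in the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: unlink_workdir_tolerant(), the CleanupResult enum and the in_abort flag are hypothetical names, a plain POSIX rmdir() stands in for the real recursive directory removal, and fprintf() stands in for the backend's warning and error reporting.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
typedef enum CleanupResult {
    CLEANUP_OK,       /* directory removed */
    CLEANUP_WARNED,   /* removal failed during abort; warned and continued */
    CLEANUP_FAILED    /* removal failed outside abort; caller should raise an error */
} CleanupResult;

/*
 * Hypothetical helper, not the real workfile_mgr_unlink_directory().
 * During abort, a failed removal (for example, the directory was never
 * created because the disk was full) is downgraded to a warning so that
 * the abort code can continue.
 */
static CleanupResult unlink_workdir_tolerant(const char *path, bool in_abort)
{
    if (rmdir(path) == 0)
        return CLEANUP_OK;

    int saved_errno = errno;   /* capture before any other library call */

    if (in_abort) {
        fprintf(stderr, "WARNING: could not remove directory \"%s\": %s\n",
                path, strerror(saved_errno));
        return CLEANUP_WARNED;
    }

    fprintf(stderr, "ERROR: could not remove directory \"%s\": %s\n",
            path, strerror(saved_errno));
    return CLEANUP_FAILED;
}
```

In the real backend the non-abort branch would raise an error rather than return a code; the point of the sketch is only that the abort path never raises a second error on top of the one already being handled.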


