Posted to dev@hbase.apache.org by "Billy Pearson (JIRA)" <ji...@apache.org> on 2009/06/04 19:47:07 UTC

[jira] Created: (HBASE-1480) compaction file not cleaned up after a crash/OOME server

compaction file not cleaned up after a crash/OOME server
--------------------------------------------------------

                 Key: HBASE-1480
                 URL: https://issues.apache.org/jira/browse/HBASE-1480
             Project: Hadoop HBase
          Issue Type: Bug
          Components: master, regionserver
            Reporter: Billy Pearson
             Fix For: 0.20.0


We do not clean up compaction files after a crash/OOME of a region server.

I am not sure how compaction files are named anymore. If the names are not reproducible, 
we should let the master, or the server hosting the root region, check periodically and delete old files 
(say, older than 24 hours) in the tables' compaction dirs.
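
A minimal sketch of the age-based sweep suggested above, not code from HBase. It assumes a local filesystem and uses java.nio.file as a stand-in for the Hadoop FileSystem API; the class name StaleCompactionSweeper and the method sweep are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: not part of HBase. Illustrates "delete files older than N hours".
public class StaleCompactionSweeper {

    /**
     * Deletes regular files under compactionDir whose last-modified time is
     * older than maxAge relative to now, and returns the paths it deleted.
     */
    public static List<Path> sweep(Path compactionDir, Duration maxAge, Instant now)
            throws IOException {
        List<Path> deleted = new ArrayList<>();
        if (!Files.isDirectory(compactionDir)) {
            return deleted; // nothing to sweep
        }
        Instant cutoff = now.minus(maxAge);

        // Collect candidates first so the directory stream is closed before deleting.
        List<Path> candidates;
        try (Stream<Path> walk = Files.walk(compactionDir)) {
            candidates = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path file : candidates) {
            if (Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                Files.delete(file);
                deleted.add(file);
            }
        }
        return deleted;
    }
}
```

A periodic task could call sweep(compactionDir, Duration.ofHours(24), Instant.now()) once per table. The 24-hour threshold is the figure from the description; it would have to be longer than the longest expected compaction so that files of a compaction still in progress are not removed.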





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1480) compaction file not cleaned up after a crash/OOME server

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716468#action_12716468 ] 

Andrew Purtell commented on HBASE-1480:
---------------------------------------

Unless we're going to move to a master-less architecture, I think the master is a good place for this kind of housekeeping. It's the only entity allowed to delete dirs in HDFS already. 

[jira] Updated: (HBASE-1480) compaction file not cleaned up after a crash/OOME server

Posted by "Evgeny Ryabitskiy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evgeny Ryabitskiy updated HBASE-1480:
-------------------------------------

    Attachment: HBASE-1480.patch

There was a bug in the cleanup for HRegions.
At the beginning and end of each compaction, an HRegion deletes regionCompactionDir in the FS to clean up.

The name of regionCompactionDir is computed this way:

{code}
this.regionCompactionDir = new Path(getCompactionDir(basedir), encodedNameStr);

// e.g. hbase/myTable/compaction.dir/23143254
{code}

But the Stores of this HRegion create and use compactionDir, whose name is computed this way:

{code}
this.compactionDir = getCompactionDir(basedir);

// e.g. hbase/myTable/compaction.dir
{code}

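The difference is the per-HRegion subdirectory named by encodedNameStr: the region cleans up its own subdirectory, but its Stores write one level up, so their files are never removed.

So I changed the dir where Stores put compaction files. Now each region's Stores write their compaction files into that region's own subdirectory (regionCompactionDir), and cleanup is performed each time a compaction starts and ends.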
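
A small sketch of the mismatch described above, not the patch itself. It assumes a local filesystem and uses java.nio.file in place of Hadoop's Path and FileSystem; the class CompactionDirSketch, the method leftoverAfterCleanup, and the file name storefile.tmp are hypothetical. Only the directory layout (compaction.dir and compaction.dir/encodedNameStr) comes from the snippets above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration: not HBase code.
public class CompactionDirSketch {

    // e.g. hbase/myTable/compaction.dir
    public static Path getCompactionDir(Path tableDir) {
        return tableDir.resolve("compaction.dir");
    }

    // e.g. hbase/myTable/compaction.dir/23143254
    public static Path getRegionCompactionDir(Path tableDir, String encodedName) {
        return getCompactionDir(tableDir).resolve(encodedName);
    }

    // Deletes dir and everything below it; a no-op if dir does not exist.
    public static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /**
     * Writes a placeholder compaction file where a Store would put it, runs the
     * region-level cleanup, and reports whether the file is still there.
     * patched=false models the old layout, patched=true the new one.
     */
    public static boolean leftoverAfterCleanup(Path tableDir, String encodedName,
                                               boolean patched) throws IOException {
        Path regionCompactionDir = getRegionCompactionDir(tableDir, encodedName);
        Path storeCompactionDir = patched ? regionCompactionDir : getCompactionDir(tableDir);

        Files.createDirectories(storeCompactionDir);
        Path file = storeCompactionDir.resolve("storefile.tmp"); // hypothetical name
        Files.write(file, new byte[0]);

        deleteRecursively(regionCompactionDir); // what the HRegion cleans up
        return Files.exists(file);
    }
}
```

With the old layout the region deletes a directory its Stores never wrote to, so the file survives; with the patched layout both sides agree on the directory and the cleanup removes it.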


*One last note about migration:*

*No files should be stored directly in the upper compactionDir (they belong in the subdirs, the regionCompactionDirs). So we need to clean up the compactionDirs of all table dirs in the FS during migration.*

[jira] Commented: (HBASE-1480) compaction file not cleaned up after a crash/OOME server

Posted by "Evgeny Ryabitskiy (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718295#action_12718295 ] 

Evgeny Ryabitskiy commented on HBASE-1480:
------------------------------------------

The same problem can occur with the split dir.

I'm not sure the master is the only entity that deletes dirs in HDFS. A Region deletes its compaction dir each time it starts a compaction and again after it finishes successfully (twice per compaction).

The master is a good place; I will also look at whether the new HRS that gets the Region can do this cleanup.

[jira] Assigned: (HBASE-1480) compaction file not cleaned up after a crash/OOME server

Posted by "Evgeny Ryabitskiy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evgeny Ryabitskiy reassigned HBASE-1480:
----------------------------------------

    Assignee: Evgeny Ryabitskiy

[jira] Updated: (HBASE-1480) compaction file not cleaned up after a crash/OOME server

Posted by "Evgeny Ryabitskiy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evgeny Ryabitskiy updated HBASE-1480:
-------------------------------------

    Status: Patch Available  (was: Open)

So this patch should solve the problem, and there is no need for the master to parse compaction dirs.

I also improved the JUnit test to ensure that regionCompactionDir really is cleaned up after each compaction.

[jira] Updated: (HBASE-1480) compaction file not cleaned up after a crash/OOME server

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1480:
-------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I committed Evgeny's patch after watching it in action on a running test (with this patch, compactions go into a subdir of compaction.dir rather than the top level).

The original issue was about cleanup after a crash. With this patch, cleanup should happen when the Region is opened again.

I presume that is good enough. Let's open a new issue if not. Resolving this one.
