Posted to issues@hbase.apache.org by "Lars Hofhansl (Created) (JIRA)" <ji...@apache.org> on 2011/11/22 02:24:40 UTC

[jira] [Created] (HBASE-4844) Coprocessor hooks for log rolling

Coprocessor hooks for log rolling
---------------------------------

                 Key: HBASE-4844
                 URL: https://issues.apache.org/jira/browse/HBASE-4844
             Project: HBase
          Issue Type: New Feature
    Affects Versions: 0.94.0
            Reporter: Lars Hofhansl
            Priority: Minor


In order to eventually do point-in-time recovery, we need a way to reliably back up the logs. Rather than adding hard-coded changes, we can provide coprocessor hooks so folks can implement their own policies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164022#comment-13164022 ] 

Lars Hofhansl commented on HBASE-4844:
--------------------------------------

bq. The logroller could signal new log to copy?

Right, and it could trigger a coprocessor hook to do the actual work of archiving. The coprocessor would get the path to the old file and then copy it somewhere else.
Looking at the code, there are races, though. Until HLog.writer is set to the new writer, all writes still go to the old file. So if the coprocessor post hook runs before that and copies the file, some edits might be missed (those that go into the old file after it was copied, but before the writer was switched over).
Wouldn't it be nice if we had hardlinks in HDFS? :)

So I think the coprocessor post hook should be called after the HLog.writer assignment. If it did the copy synchronously it only needs to finish before the next log for the same regionserver is rolled (still a race, though).
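
A minimal sketch of the ordering argued for above: swap the writer first, then fire the post hook, so the old file can no longer receive edits by the time a hook copies it. ToyLog and its hook signature are made up for illustration; this is not the HLog code, and it ignores syncs and the remaining race with the next roll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Toy stand-in for a write-ahead log; hypothetical, not the HBase HLog class. */
class ToyLog {
    // Edits in the currently active "file".
    private List<String> current = new ArrayList<>();
    // Hypothetical post-roll hook: receives (old file, new file).
    private final BiConsumer<List<String>, List<String>> postRollHook;

    ToyLog(BiConsumer<List<String>, List<String>> postRollHook) {
        this.postRollHook = postRollHook;
    }

    /** Appends always go to whichever file is active at the time of the call. */
    synchronized void append(String edit) {
        current.add(edit);
    }

    /** Swap the active file under the lock, then notify the hook. */
    void roll() {
        List<String> old;
        List<String> fresh = new ArrayList<>();
        synchronized (this) {
            old = current;
            current = fresh; // from here on, appends land in the new file
        }
        // Called only after the swap: in this toy model no append can reach `old` any more.
        postRollHook.accept(old, fresh);
    }
}
```

With a hook that copies the old file, edits appended after roll() returns show up only in the new file, so the copy is complete. If the hook ran before the swap, an append between the copy and the swap would be missing from the copy.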

I'll attach a very simple patch tonight or tomorrow morning and then folks can poke holes in it.

                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163975#comment-13163975 ] 

stack commented on HBASE-4844:
------------------------------

We don't back up WALs.

The logroller could signal new log to copy?  Then the copying process could look in .logs (and in .oldlogs in case it got archived) for the log to copy.  You'd need to add a plugin to the logcleaner so the master doesn't clean the logs from .oldlogs before they've been backed up (can we add a generic plugin to the master that won't delete logs if enabled -- then backup can do the deletes instead?).
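
A rough sketch of the kind of cleaner plugin suggested above: it vetoes deletion of a log until the backup process has reported it as copied. ToyLogCleanerDelegate, BackupAwareCleaner and markBackedUp are hypothetical names for illustration; the actual log cleaner plugin interface in HBase may look different.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical plugin contract; not the real HBase log cleaner interface. */
interface ToyLogCleanerDelegate {
    /** Return true only if the cleaner may delete the named log. */
    boolean isLogDeletable(String logName);
}

/** Refuses deletion until the backup process has marked the log as copied. */
class BackupAwareCleaner implements ToyLogCleanerDelegate {
    // Names of logs the backup process has finished copying (assumed to be reported by it).
    private final Set<String> backedUp = ConcurrentHashMap.newKeySet();

    /** Called by the (hypothetical) backup process once a copy is complete. */
    void markBackedUp(String logName) {
        backedUp.add(logName);
    }

    @Override
    public boolean isLogDeletable(String logName) {
        return backedUp.contains(logName);
    }
}
```

In this sketch the set lives in memory only; a real plugin would have to persist it somewhere so that a master restart does not forget which logs were already copied.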

                

[jira] [Updated] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Jan Van Besien (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Van Besien updated HBASE-4844:
----------------------------------

    Attachment: HBASE-4844__postLogRoll_and_postLogArchive_functionality_in_WALObserver.patch
    

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Jan Van Besien (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487610#comment-13487610 ] 

Jan Van Besien commented on HBASE-4844:
---------------------------------------

I have never really given the combination of my use case with replication much thought, but I think they should be orthogonal to each other? Everything that ends up in the HLog should eventually get replicated and should also eventually be seen in my coprocessor for my use case.

But your remark made me realize, when thinking of failure scenarios, that the WALActionsListener.postLogRoll and/or the WALObserver.postLogRoll calls might fail to notify that an HLog file was rolled.

Am I correct that in the replication code, the recovery mechanism to recover HLog files from failed region servers takes care of this issue?

If so, I think I'll need a similar mechanism in the implementation for my use case.

Jan
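
One possible shape for such a fallback, as a sketch only: periodically compare a directory listing with the set of logs for which a roll notification was actually received, and treat everything else (except the active log) as missed. RollReconciler and its methods are hypothetical, and this is not a description of how the replication code recovers.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper that catches rolled logs whose notification never arrived. */
class RollReconciler {
    // Logs already handled, either via a notification or via an earlier reconcile pass.
    private final Set<String> handled = new HashSet<>();

    /** Normal path: the roll hook fired for this (now closed) log. */
    void onLogRolled(String oldLogName) {
        handled.add(oldLogName);
    }

    /**
     * Fallback path: returns the logs in the listing that are not the active log
     * and were never handled, in name order, and marks them as handled.
     */
    List<String> reconcile(Collection<String> listing, String activeLog) {
        List<String> missed = new ArrayList<>();
        for (String name : new TreeSet<>(listing)) {
            if (!name.equals(activeLog) && handled.add(name)) {
                missed.add(name);
            }
        }
        return missed;
    }
}
```

Since a log can reach the consumer through either path, whatever processes the logs should tolerate seeing the same file twice.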
                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486828#comment-13486828 ] 

Hadoop QA commented on HBASE-4844:
----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12551335/HBASE-4844__postLogRoll_and_postLogArchive_functionality_in_WALObserver.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 6 new or modified tests.

    {color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 85 warning messages.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3180//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3180//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3180//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3180//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3180//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3180//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3180//console

This message is automatically generated.
                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161808#comment-13161808 ] 

Lars Hofhansl commented on HBASE-4844:
--------------------------------------

@stack: The hooks here would be specific to log rolling. There are already hooks that are called per log edit.
My thought was that without much tighter integration into the core, there would be nothing useful the coprocessor hooks could do with the log *files* (can't move them, delete them, change them, as the rest of the system could not be altered accordingly to deal with that).
But maybe you're right and the pre hooks should be added as well.

Yes, adding a WALObserver would require a restart of the regionserver(s); there is currently no dynamic loading (as far as I can see) for WALObserver.

How do folks back up WALs now? Have some watcher process that watches the .logs and .oldlogs directories?
                

[jira] [Updated] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Jan Van Besien (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Van Besien updated HBASE-4844:
----------------------------------

    Release Note: Addition of two methods to the WALObserver to be notified of HLog rotation (postLogRoll) and HLog archiving (postLogArchive).
          Status: Patch Available  (was: Open)

I created a fairly straightforward patch to add the two discussed methods to the WALObserver. It simply makes the calls wherever the corresponding methods in WALActionsListener are called. It seems to work, but maybe I overlooked something?

I included a basic test for the postLogRoll method.
                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487008#comment-13487008 ] 

Ted Yu commented on HBASE-4844:
-------------------------------

In the patch, WALObserver methods are called after WALActionsListener methods are called.
When replication is enabled, it would be interesting to consider various failure scenarios in Jan Van Besien's use case.
                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161461#comment-13161461 ] 

stack commented on HBASE-4844:
------------------------------

A pre hook could alter the edit?   Remove it?  Accumulate a bunch of them and do some summing before letting them out into the WAL?
                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Jan Van Besien (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486745#comment-13486745 ] 

Jan Van Besien commented on HBASE-4844:
---------------------------------------

I have a use case which would benefit from this feature as well.

In a coprocessor, I want to do some "eventual execution" of actions related to Puts/Deletes on HBase. The actions should be executed "at least once" and should only execute if the actual HBase operation was successful.

I can use regular coprocessor functionality to get notified of Puts and Deletes etc, but this doesn't give me guaranteed (eventual) execution of my actions because the region server might go down in between the HLog write and the coprocessor being called.

So to solve this problem, I want to read from the HLog files. This will give me a guarantee that I "see" everything that happened on HBase.

In a way it is very similar to how HBase replication works. So I can also translate my needs like this: "what functionality would be required for coprocessors such that HBase replication can be implemented as a coprocessor rather than 'in' HBase".
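
To make the "at least once" part concrete, here is a minimal sketch of a log tailer that advances its checkpoint only after the action for an entry has completed, so a crash in between leads to a replay rather than a lost action. AtLeastOnceTailer is a made-up name, the log is modelled as a plain list of strings, and a real implementation would need a durable checkpoint.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical tailer over an append-only log, modelled here as a List. */
class AtLeastOnceTailer {
    // Index of the next entry to process; assumed to be stored durably in practice.
    private int checkpoint = 0;

    int checkpoint() {
        return checkpoint;
    }

    /**
     * Runs the action for every entry from the checkpoint onwards.
     * If the action throws, the checkpoint is not advanced for that entry,
     * so the next drain() runs the action for it again.
     */
    void drain(List<String> log, Consumer<String> action) {
        while (checkpoint < log.size()) {
            action.accept(log.get(checkpoint));
            checkpoint++; // only reached once the action completed
        }
    }
}
```

The price of this guarantee is that an action may run more than once for the same entry, so the actions have to be idempotent or otherwise tolerate replays.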
                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160698#comment-13160698 ] 

Lars Hofhansl commented on HBASE-4844:
--------------------------------------

In part that can already be done by using TimeToLiveLogCleaner and configuring a long TTL (looks like the default is 600000ms = 10mins), or with a custom log cleaner.

There are also WALActionsListeners, although it looks like they can only be added via code changes to HRegionServer (see getWALActionListeners()).

So I am on the fence with this now. On one hand it would be nice to be able to implement some arbitrary actions on log rolling without touching the HBase code, on the other hand it'll likely duplicate logic that already exists.

The interface in WALObserver would be very simple:
{code}
  void postLogRoll(ObserverContext<WALCoprocessorEnvironment> ctx,
      Path oldPath, Path newPath) throws IOException;

  void postLogArchive(ObserverContext<WALCoprocessorEnvironment> ctx,
      Path oldPath, Path newPath) throws IOException;
{code}
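
For illustration, an implementation of the roll hook could look roughly like the sketch below, which copies the old file into a backup directory. To keep it self-contained it uses java.nio.file.Path and a stand-in interface (ToyLogRollObserver) without the ObserverContext argument; a real coprocessor would implement WALObserver and go through the Hadoop FileSystem API instead. The null check for oldPath is an assumption about the first roll.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Stand-in for the proposed hook; the real one takes an ObserverContext and Hadoop Paths. */
interface ToyLogRollObserver {
    void postLogRoll(Path oldPath, Path newPath) throws IOException;
}

/** Copies every rolled log file into a backup directory. */
class CopyingLogRollObserver implements ToyLogRollObserver {
    private final Path backupDir;

    CopyingLogRollObserver(Path backupDir) {
        this.backupDir = backupDir;
    }

    @Override
    public void postLogRoll(Path oldPath, Path newPath) throws IOException {
        if (oldPath == null) {
            return; // assumed: nothing to archive when there was no previous log file
        }
        Files.createDirectories(backupDir);
        // Keep the original file name; overwrite a stale copy from an earlier attempt.
        Files.copy(oldPath, backupDir.resolve(oldPath.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The copy runs synchronously inside the hook here, which would hold up the caller for the duration of the copy; handing the path to a background worker is the obvious alternative.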

There would be no "pre" method, as there is really no way to do anything useful outside of the core HBase code (at least I cannot think of anything).

Opinions?

                

[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161462#comment-13161462 ] 

stack commented on HBASE-4844:
------------------------------

Adding a WALObserver would necessitate adding the implementation to classpath and restarting regionserver?
                