Posted to dev@hbase.apache.org by "Jim Kellerman (JIRA)" <ji...@apache.org> on 2009/01/19 21:15:59 UTC

[jira] Created: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Test that readers opened after a sync can see all data up to the sync
---------------------------------------------------------------------

                 Key: HBASE-1138
                 URL: https://issues.apache.org/jira/browse/HBASE-1138
             Project: Hadoop HBase
          Issue Type: Test
          Components: regionserver, test
    Affects Versions: 0.19.0
            Reporter: Jim Kellerman
            Assignee: Jim Kellerman
             Fix For: 0.19.1, 0.20.0


Test that readers opened on a file after it has been synced can see all the records prior to the sync.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HBASE-1138) Make HLog block size configurable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman resolved HBASE-1138.
----------------------------------

    Resolution: Fixed

Committed to branch and trunk.

> Make HLog block size configurable
> ---------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: master, regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>         Attachments: 1138.patch
>
>
> Make HLog blocksize configurable to mitigate possible data loss.
> The ideal size is (size of record (avg key length + avg value length)) * (number of records between sync())
>  



[jira] Updated: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-1138:
---------------------------------

    Attachment: 1138.patch

Patch for 0.19.1

> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>         Attachments: 1138.patch
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.



[jira] Commented: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665607#action_12665607 ] 

Jim Kellerman commented on HBASE-1138:
--------------------------------------

Committed temporary patch to trunk until HADOOP-4379 is resolved.

> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>         Attachments: 1138.patch
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.



[jira] Commented: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665594#action_12665594 ] 

Jim Kellerman commented on HBASE-1138:
--------------------------------------

If we make the default block size 1MB and write 819200 records of ~1KB, syncing after every 100 writes, we can read 815002 records if the writer crashes after writing all 819200.

This increases the fraction of records we can read back from 84% to 99%.

If users have a record length different from 1KB, they can adjust hbase.regionserver.hlog.blocksize (default = 1048576) to achieve similar results.
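
As a rough illustration of the override and of the 99% figure above, here is a minimal Java sketch. It uses java.util.Properties as a stand-in for the real configuration object, and the class and method names (HLogBlockSizeConfig, blockSize, percentReadable) are hypothetical and not part of the patch. Only the property name, its default, and the two record counts come from this comment; the override value is made up for the example.

```java
import java.util.Properties;

/** Illustrative stand-in for a configuration lookup; this is not HBase code. */
final class HLogBlockSizeConfig {
    // Property name and default value as stated in the comment above.
    static final String BLOCK_SIZE_KEY = "hbase.regionserver.hlog.blocksize";
    static final long DEFAULT_BLOCK_SIZE = 1048576L; // 1MB

    /** Returns the configured block size, or the default when the property is unset. */
    static long blockSize(Properties conf) {
        String raw = conf.getProperty(BLOCK_SIZE_KEY);
        return raw == null ? DEFAULT_BLOCK_SIZE : Long.parseLong(raw.trim());
    }

    /** Percentage of written records that a reader was able to read back. */
    static double percentReadable(long readable, long written) {
        return 100.0 * readable / written;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(blockSize(conf)); // falls back to the default

        // Hypothetical override for a deployment with smaller records.
        conf.setProperty(BLOCK_SIZE_KEY, "262144");
        System.out.println(blockSize(conf));

        // Record counts from the experiment described above.
        System.out.println(percentReadable(815002L, 819200L));
    }
}
```

With the counts above, 815002 of 819200 records is a little over 99%, which matches the figure quoted.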

Patch follows.


> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.



[jira] Assigned: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman reassigned HBASE-1138:
------------------------------------

    Assignee: Jim Kellerman

> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>         Attachments: 1138.patch
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.



[jira] Assigned: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman reassigned HBASE-1138:
------------------------------------

    Assignee:     (was: Jim Kellerman)

> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.



[jira] Commented: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665334#action_12665334 ] 

stack commented on HBASE-1138:
------------------------------

So it seems the tests posted to HADOOP-4379 have verified that this feature has not been implemented yet.  In https://issues.apache.org/jira/browse/HBASE-728?focusedCommentId=12642525#action_12642525, I asked whether we had tested crashing a server and recovering its logs to verify whether edits were lost, and was told it had been tested.

> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.



[jira] Commented: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665235#action_12665235 ] 

stack commented on HBASE-1138:
------------------------------

HADOOP-4379 says it's not implemented yet.

> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.



[jira] Updated: (HBASE-1138) Make HLog block size configurable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-1138:
---------------------------------

    Component/s: master
    Description: 
Make HLog blocksize configurable to mitigate possible data loss.

The ideal size is (size of record (avg key length + avg value length)) * (number of records between sync())
 

  was:Test that readers opened on a file after it has been synced can see all the records prior to the sync.

     Issue Type: Improvement  (was: Test)
        Summary: Make HLog block size configurable  (was: Test that readers opened after a sync can see all data up to the sync)

> Make HLog block size configurable
> ---------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: master, regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>         Attachments: 1138.patch
>
>
> Make HLog blocksize configurable to mitigate possible data loss.
> The ideal size is (size of record (avg key length + avg value length)) * (number of records between sync())
>  



[jira] Commented: (HBASE-1138) Test that readers opened after a sync can see all data up to the sync

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665462#action_12665462 ] 

Jim Kellerman commented on HBASE-1138:
--------------------------------------

We can mitigate this in the short term by making HLog block size configurable as there is a (new to me) createWriter method:
{code}
  /**
   * Construct the preferred type of SequenceFile Writer.
   * @param fs The configured filesystem.
   * @param conf The configuration.
   * @param name The name of the file.
   * @param keyClass The 'key' type.
   * @param valClass The 'value' type.
   * @param bufferSize buffer size for the underlying outputstream.
   * @param replication replication factor for the file.
   * @param blockSize block size for the file.
   * @param compressionType The compression type.
   * @param codec The compression codec.
   * @param progress The Progressable object to track progress.
   * @param metadata The metadata of the file.
   * @return Returns the handle to the constructed SequenceFile Writer.
   * @throws IOException
   */
  public static Writer
    createWriter(FileSystem fs, Configuration conf, Path name,
                 Class keyClass, Class valClass, int bufferSize,
                 short replication, long blockSize,
                 CompressionType compressionType, CompressionCodec codec,
                 Progressable progress, Metadata metadata) throws IOException {
{code}

If we did make HLog block size configurable, we'd have to provide guidelines for setting it based on average key and value length and the sync interval.
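
One possible shape for such a guideline, sketched in Java: block size is roughly the average record size (key plus value) multiplied by the number of records written between syncs. The class and method names are hypothetical, the sample lengths are invented for illustration, and the calculation ignores any per-record framing overhead the file format adds.

```java
/** Hypothetical helper illustrating a sizing guideline; not part of any patch. */
final class HLogBlockSizeGuideline {

    /**
     * Suggested block size in bytes:
     * (avg key length + avg value length) * (records between syncs).
     */
    static long suggestedBlockSize(long avgKeyLength, long avgValueLength,
                                   long recordsBetweenSyncs) {
        if (avgKeyLength < 0 || avgValueLength < 0 || recordsBetweenSyncs <= 0) {
            throw new IllegalArgumentException(
                "lengths must be non-negative and the sync interval positive");
        }
        // The exact-arithmetic methods throw instead of silently overflowing.
        long recordSize = Math.addExact(avgKeyLength, avgValueLength);
        return Math.multiplyExact(recordSize, recordsBetweenSyncs);
    }

    public static void main(String[] args) {
        // Invented example: 24-byte keys, 1000-byte values, sync every 100 writes.
        System.out.println(suggestedBlockSize(24, 1000, 100)); // prints 102400
    }
}
```

A value computed this way is only a starting point; it would still need to be checked against real crash-recovery results.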


> Test that readers opened after a sync can see all data up to the sync
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1138
>                 URL: https://issues.apache.org/jira/browse/HBASE-1138
>             Project: Hadoop HBase
>          Issue Type: Test
>          Components: regionserver, test
>    Affects Versions: 0.19.0
>            Reporter: Jim Kellerman
>             Fix For: 0.19.1, 0.20.0
>
>
> Test that readers opened on a file after it has been synced can see all the records prior to the sync.
