Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/07/29 22:00:10 UTC

[jira] [Created] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
--------------------------------------------------------

                 Key: HBASE-4148
                 URL: https://issues.apache.org/jira/browse/HBASE-4148
             Project: HBase
          Issue Type: Bug
          Components: mapreduce
    Affects Versions: 0.90.3
            Reporter: Todd Lipcon
             Fix For: 0.90.5


When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.
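
As a rough illustration of how such metadata can be used, the sketch below tracks the minimum and maximum timestamp written to a file and checks whether a time-restricted scan can skip it. This is a simplified stand-in, not HBase's actual implementation: the class `TimeRangeSketch`, its method names, and the half-open scan range `[scanMin, scanMax)` are assumptions made for the example.

```java
/**
 * Hypothetical sketch: records the timestamp range of the cells written to
 * one file so that a time-restricted scan can decide whether to open it.
 */
class TimeRangeSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Widen the recorded range to cover one more cell timestamp. */
    public void include(long timestamp) {
        min = Math.min(min, timestamp);
        max = Math.max(max, timestamp);
    }

    /** True if some recorded timestamp may fall in [scanMin, scanMax). */
    public boolean overlaps(long scanMin, long scanMax) {
        if (min > max) {
            return false; // nothing was ever written
        }
        return min < scanMax && max >= scanMin;
    }

    /**
     * A file may be skipped only when its range is known and misses the scan.
     * A null range models a file written without the metadata.
     */
    public static boolean canSkip(TimeRangeSketch range, long scanMin, long scanMax) {
        return range != null && !range.overlaps(scanMin, scanMax);
    }

    public static void main(String[] args) {
        TimeRangeSketch range = new TimeRangeSketch();
        range.include(1000L);
        range.include(2000L);
        System.out.println(canSkip(range, 1500L, 3000L)); // false: ranges overlap
        System.out.println(canSkip(range, 2001L, 3000L)); // true: file is entirely older
        System.out.println(canSkip(null, 2001L, 3000L));  // false: no metadata, must read
    }
}
```

The last call models the situation this issue describes: without a recorded range, a file can never be skipped, however narrow the scan's time range is.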

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4148:
--------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4148:
----------------------------------

    Attachment: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch

xxx.trunk2.patch is a version that can be applied to trunk.  It cleans up a conflict marker in a comment that I had missed.


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4148:
----------------------------------

    Attachment: 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4148:
----------------------------------

    Attachment: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch

Patch with nits addressed


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073230#comment-13073230 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
-----------------------------------------------------------

Review request for hbase and Todd Lipcon.


Summary
-------

When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


This addresses bug HBASE-4148.
    https://issues.apache.org/jira/browse/HBASE-4148


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 8ccdf4d 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 40efdda 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 89241eb 

Diff: https://reviews.apache.org/r/1229/diff


Testing
-------

Added unit test.  

I don't quite understand why the KeyValue with the larger timestamp (2000) must be written before the one with the smaller timestamp (1000). I can see the code that enforces this (HFile.checkKey) but not why keys are ordered from larger to smaller.  Is this an HFile data precondition?

I cannot get the full test suite to pass, with or without this patch.  The suite seems to time out on tests unrelated to this.  I would appreciate some hints or pointers on which tests are flaky or take a long time to run.


Thanks,

jmhsieh




[jira] [Assigned] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon reassigned HBASE-4148:
----------------------------------

    Assignee: Jonathan Hsieh


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073432#comment-13073432 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
-----------------------------------------------------------

(Updated 2011-08-01 04:31:42.869399)


Review request for hbase and Todd Lipcon.


Changes
-------

Tweaks to make the patch apply cleanly on trunk.


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 3c48d08 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 2f3f5df 

Diff: https://reviews.apache.org/r/1229/diff



[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073236#comment-13073236 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/#review1243
-----------------------------------------------------------

Ship it!


KVs come in descending timestamp order so that the most recent data is always the first thing on disk when you seek to a particular row-col. The default "get" mode wants to see the latest version rather than an older one, so with this ordering it has to skip over less data.
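
The ordering described above can be sketched with a small comparator. This is a simplified model, not HBase's KeyValue code: the `Key` class, its three fields, and `checkAppend` (a hypothetical analogue of the order check the thread attributes to HFile.checkKey) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the key ordering; not HBase's classes. */
class KeyOrderSketch {
    static final class Key {
        final String row;
        final String column;
        final long timestamp;

        Key(String row, String column, long timestamp) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    /** Row ascending, then column ascending, then timestamp DESCENDING. */
    public static int compare(Key a, Key b) {
        int c = a.row.compareTo(b.row);
        if (c != 0) {
            return c;
        }
        c = a.column.compareTo(b.column);
        if (c != 0) {
            return c;
        }
        // Arguments are swapped so that newer timestamps sort first.
        return Long.compare(b.timestamp, a.timestamp);
    }

    /** Returns a sorted copy, in the order a writer would have to append keys. */
    public static List<Key> sorted(List<Key> keys) {
        List<Key> copy = new ArrayList<>(keys);
        copy.sort(KeyOrderSketch::compare);
        return copy;
    }

    /** Rejects an append that would break the sort order. */
    public static void checkAppend(Key previous, Key next) {
        if (previous != null && compare(previous, next) > 0) {
            throw new IllegalArgumentException("key out of order");
        }
    }

    public static void main(String[] args) {
        List<Key> keys = new ArrayList<>();
        keys.add(new Key("row1", "cf:a", 1000L));
        keys.add(new Key("row1", "cf:a", 2000L));
        System.out.println(sorted(keys).get(0).timestamp); // prints 2000
    }
}
```

Under this model, appending timestamp 2000 and then 1000 for the same row and column passes `checkAppend`, while the reverse order is rejected. That matches the behaviour observed in the unit test.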

As for tests passing, so long as this new test passes, don't worry about the others. You can check the Apache Hudson instance to see what has failed recently.


src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
<https://reviews.apache.org/r/1229/#comment2840>

    assertEquals


- Todd


On 2011-07-30 18:40:49, jmhsieh wrote:
bq.  [...]




[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075975#comment-13075975 ] 

Hudson commented on HBASE-4148:
-------------------------------

Integrated in HBase-TRUNK #2066 (See [https://builds.apache.org/job/HBase-TRUNK/2066/])
    HBASE-4148  HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata (Jonathan Hsieh)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java



[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073368#comment-13073368 ] 

Todd Lipcon commented on HBASE-4148:
------------------------------------

Looks like the current patch only applies against 0.90. Can you make one for trunk too?


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073355#comment-13073355 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/#review1244
-----------------------------------------------------------

Ship it!


- Ted


On 2011-07-31 05:52:30, jmhsieh wrote:
bq.  [...]




[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4148:
----------------------------------

    Attachment:     (was: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch)


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4148:
----------------------------------

    Attachment: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073328#comment-13073328 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
-----------------------------------------------------------

(Updated 2011-07-31 05:52:30.608713)


Review request for hbase and Todd Lipcon.


Changes
-------

Updated to address nit.


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 8ccdf4d 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 40efdda 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 89241eb 

Diff: https://reviews.apache.org/r/1229/diff



[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076029#comment-13076029 ] 

Jonathan Hsieh commented on HBASE-4148:
---------------------------------------

Sorry about previous post; posted to wrong jira.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073632#comment-13073632 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
-----------------------------------------------------------

(Updated 2011-08-01 17:54:26.858153)


Review request for hbase and Todd Lipcon.


Changes
-------

Cleaned up nit.


Summary
-------

When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.
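
As an illustration of why the missing metadata matters, here is a minimal sketch of time-range culling. It is not taken from the patch: the class TimeRangeSketch and its methods are hypothetical names, and the half-open scan range [scanMin, scanMax) is an assumption of the sketch. A file whose range was never recorded cannot be skipped, which is the situation the summary describes for files written by HFileOutputFormat.

```java
// Illustrative sketch only; names are hypothetical, not HBase APIs.
final class TimeRangeSketch {
    // An "empty" tracker has min > max, meaning no range was recorded.
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Widen the tracked range to cover one cell timestamp (called per appended cell). */
    void include(long timestamp) {
        min = Math.min(min, timestamp);
        max = Math.max(max, timestamp);
    }

    /** Returns true if the file may hold cells inside the scan range [scanMin, scanMax). */
    boolean mayContain(long scanMin, long scanMax) {
        if (min > max) {
            // No range recorded (metadata missing): the file can never be culled.
            return true;
        }
        // Standard interval-overlap test between [min, max] and [scanMin, scanMax).
        return max >= scanMin && min < scanMax;
    }
}
```

With timestamps 1000 and 2000 recorded, a scan restricted to [2001, 3000) can skip the file, while a tracker that recorded nothing forces the file to be read for every scan.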


This addresses bug HBASE-4148.
    https://issues.apache.org/jira/browse/HBASE-4148


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 3c48d08 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 2f3f5df 

Diff: https://reviews.apache.org/r/1229/diff


Testing
-------

Added unit test.  

I don't quite understand why the KeyValue with the larger timestamp (2000) must be written before the one with the smaller timestamp (1000). I can see the code that enforces this (HFile.checkKey) but not why keys are ordered from larger to smaller timestamp.  Is this an HFile data precondition?
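
For context on the ordering question, a minimal sketch follows. It assumes keys compare by row ascending and then by timestamp descending, so the newest version of a cell sorts first. The real KeyValue comparison also involves family, qualifier, and type, which are omitted here. KeyOrderSketch, Key, and canAppendAfter are hypothetical names for illustration, not HBase APIs.

```java
// Illustrative sketch only; names are hypothetical, not HBase APIs.
final class KeyOrderSketch {
    /** Simplified key: a row and a timestamp (family, qualifier, and type omitted). */
    static final class Key {
        final String row;
        final long timestamp;

        Key(String row, long timestamp) {
            this.row = row;
            this.timestamp = timestamp;
        }
    }

    /** Row ascending, then timestamp descending (newer versions sort first). */
    static int compare(Key a, Key b) {
        int byRow = a.row.compareTo(b.row);
        if (byRow != 0) {
            return byRow;
        }
        // Arguments swapped on purpose: a larger timestamp is a "smaller" key.
        return Long.compare(b.timestamp, a.timestamp);
    }

    /** A writer that requires sorted input would accept next only if it does not sort before prev. */
    static boolean canAppendAfter(Key prev, Key next) {
        return compare(prev, next) <= 0;
    }
}
```

Under this ordering, appending (row1, 2000) and then (row1, 1000) is accepted, while the reverse order is rejected, which matches the behavior described above.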

I cannot get the full test suite to pass, with or without this patch.  The suite seems to time out on tests unrelated to this change.  I would appreciate hints or pointers on which tests are flaky or take a long time to run.


Thanks,

jmhsieh



> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4148:
----------------------------------

    Attachment: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch

This version has some tweaks to make it apply cleanly on trunk. (The previous version applies on the 0.90 branch.)

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073811#comment-13073811 ] 

Ted Yu commented on HBASE-4148:
-------------------------------

Integrated to branch and TRUNK.

Thanks for the patch, Jonathan.
Thanks for the review, Todd and Li.

I ran TestHFileOutputFormat on the 0.90 branch and it passed.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4148:
----------------------------------

    Status: Patch Available  (was: Open)

Up for review here: https://reviews.apache.org/r/1229/

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076027#comment-13076027 ] 

Jonathan Hsieh commented on HBASE-4148:
---------------------------------------

Patrick Hunt suggested using 'apt-get purge' instead of 'apt-get remove'.  This seems to have worked. The difference between 'purge' (removes the binaries and the configs) and 'remove' (removes only the binaries) wasn't clear to me until I looked it up.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073687#comment-13073687 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/#review1253
-----------------------------------------------------------

Ship it!


- Li


On 2011-08-01 17:54:26, jmhsieh wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1229/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-01 17:54:26)
bq.  
bq.  
bq.  Review request for hbase and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.
bq.  
bq.  
bq.  This addresses bug HBASE-4148.
bq.      https://issues.apache.org/jira/browse/HBASE-4148
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 3c48d08 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 2f3f5df 
bq.  
bq.  Diff: https://reviews.apache.org/r/1229/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Added unit test.  
bq.  
bq.  I don't quite understand why the KeyValue with the larger timestamp (2000) must be written before the one with the smaller timestamp (1000). I can see the code that enforces this (HFile.checkKey) but not why keys are ordered from larger to smaller timestamp.  Is this an HFile data precondition?
bq.  
bq.  I cannot get the full test suite to pass, with or without this patch.  The suite seems to time out on tests unrelated to this change.  I would appreciate hints or pointers on which tests are flaky or take a long time to run.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jmhsieh
bq.  
bq.



> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073619#comment-13073619 ] 

jiraposter@reviews.apache.org commented on HBASE-4148:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/#review1250
-----------------------------------------------------------


small nit (conflict marker in the patch)


src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
<https://reviews.apache.org/r/1229/#comment2862>

    conflict marker


- Todd


On 2011-08-01 04:31:42, jmhsieh wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1229/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-01 04:31:42)
bq.  
bq.  
bq.  Review request for hbase and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.
bq.  
bq.  
bq.  This addresses bug HBASE-4148.
bq.      https://issues.apache.org/jira/browse/HBASE-4148
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 3c48d08 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 2f3f5df 
bq.  
bq.  Diff: https://reviews.apache.org/r/1229/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Added unit test.  
bq.  
bq.  I don't quite understand why the KeyValue with the larger timestamp (2000) must be written before the one with the smaller timestamp (1000). I can see the code that enforces this (HFile.checkKey) but not why keys are ordered from larger to smaller timestamp.  Is this an HFile data precondition?
bq.  
bq.  I cannot get the full test suite to pass, with or without this patch.  The suite seems to time out on tests unrelated to this change.  I would appreciate hints or pointers on which tests are flaky or take a long time to run.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jmhsieh
bq.  
bq.



> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> --------------------------------------------------------
>
>                 Key: HBASE-4148
>                 URL: https://issues.apache.org/jira/browse/HBASE-4148
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.
