Posted to issues@hbase.apache.org by "Jan Lukavsky (JIRA)" <ji...@apache.org> on 2011/08/12 11:54:27 UTC

[jira] [Created] (HBASE-4196) TableRecordReader may skip first row of region

TableRecordReader may skip first row of region
----------------------------------------------

                 Key: HBASE-4196
                 URL: https://issues.apache.org/jira/browse/HBASE-4196
             Project: HBase
          Issue Type: Bug
          Components: mapreduce
    Affects Versions: 0.90.4
            Reporter: Jan Lukavsky


In the following scenario, the first row of a region is skipped and never sent to the Mapper:
 - the reader is initialized with TableRecordReader.init()
 - nextKeyValue is then called, which calls scanner.next(); a ScannerTimeoutException occurs here
 - the scanner is restarted by a call to restart(), and then *two* calls to scanner.next() follow, so the first row is lost
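
The scenario above can be reproduced in a small self-contained sketch. This is not the HBase code: ToyScanner, ScannerTimeout, readAll and the row keys are hypothetical stand-ins, and the "fixed" branch only assumes that the reader remembers the last successfully returned row and skips a row after restart only when such a row exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; none of these types are HBase classes. */
class SkipFirstRowSketch {

  /** Hypothetical unchecked stand-in for a scanner timeout. */
  static class ScannerTimeout extends RuntimeException {
    ScannerTimeout(String msg) { super(msg); }
  }

  /** Toy scanner over sorted row keys; can be told to fail once on the first next(). */
  static class ToyScanner {
    private final List<String> rows;
    private int pos = 0;
    private boolean failOnce;

    ToyScanner(List<String> rows, String startRow, boolean failOnce) {
      this.rows = rows;
      this.failOnce = failOnce;
      // Position on the first row >= startRow (start row is inclusive).
      while (pos < rows.size() && rows.get(pos).compareTo(startRow) < 0) {
        pos++;
      }
    }

    /** Returns the next row key, or null when the scan is exhausted. */
    String next() {
      if (failOnce) {
        failOnce = false;
        throw new ScannerTimeout("simulated timeout");
      }
      return pos < rows.size() ? rows.get(pos++) : null;
    }
  }

  /** Reads every row, recovering from one simulated timeout on the very first next(). */
  static List<String> readAll(List<String> rows, boolean fixed) {
    List<String> mapped = new ArrayList<>();
    String startRow = rows.get(0);
    String lastSuccessfulRow = null; // nothing has been handed to the mapper yet
    ToyScanner scanner = new ToyScanner(rows, startRow, true);
    while (true) {
      String value;
      try {
        value = scanner.next();
      } catch (ScannerTimeout e) {
        if (fixed && lastSuccessfulRow == null) {
          // No row was mapped yet: restart at the start row and skip nothing.
          scanner = new ToyScanner(rows, startRow, false);
        } else {
          String resume = lastSuccessfulRow != null ? lastSuccessfulRow : startRow;
          scanner = new ToyScanner(rows, resume, false);
          scanner.next(); // skip presumed already mapped row
        }
        value = scanner.next();
      }
      if (value == null) {
        break;
      }
      lastSuccessfulRow = value;
      mapped.add(value);
    }
    return mapped;
  }

  public static void main(String[] args) {
    List<String> rows = Arrays.asList("row1", "row2", "row3");
    System.out.println("buggy: " + readAll(rows, false)); // buggy: [row2, row3]
    System.out.println("fixed: " + readAll(rows, true));  // fixed: [row1, row2, row3]
  }
}
```

With the unconditional skip, the row consumed after the restart is the start row itself, which was never mapped; making the skip conditional on a previously mapped row avoids the loss in this toy model.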


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma reassigned HBASE-4196:
------------------------------

    Assignee: Ming Ma



[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084482#comment-13084482 ] 

Ted Yu commented on HBASE-4196:
-------------------------------

+1 on patch version 2.
Minor comment on formatting: indentation should increase for this.scanner.next():
{code}
+        restart(lastSuccessfulRow);
       this.scanner.next();    // skip presumed already mapped row
+      }
{code}



[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HBASE-4196:
---------------------------

    Attachment: HBASE-4196-trunk.patch

Thanks. Here is the updated patch. Note also that the mapred version previously handled only UnknownScannerException; it now handles IOException.




[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HBASE-4196:
---------------------------

    Attachment: HBASE-4196-trunk.patch

That was due to the svn "-w" flag I used. I have fixed it.



[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HBASE-4196:
---------------------------

    Attachment: HBASE-4196-trunk.patch

Here is the fix.



[jira] [Resolved] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-4196.
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.90.5
     Hadoop Flags: [Reviewed]

Committed to branch and trunk. Thanks for the patch, Ming (and for the review, Ted).



[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084735#comment-13084735 ] 

Hudson commented on HBASE-4196:
-------------------------------

Integrated in HBase-TRUNK #2113 (See [https://builds.apache.org/job/HBase-TRUNK/2113/])
    HBASE-4196 TableRecordReader may skip first row of region

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java




[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084421#comment-13084421 ] 

Ted Yu commented on HBASE-4196:
-------------------------------

Patch looks good.
There are two TableRecordReaderImpl.java files, one under mapred and one under mapreduce.
Both of them should be fixed.


