Posted to issues@hbase.apache.org by "Jan Lukavsky (JIRA)" <ji...@apache.org> on 2011/08/12 11:54:27 UTC
[jira] [Created] (HBASE-4196) TableRecordReader may skip first row of region
TableRecordReader may skip first row of region
----------------------------------------------
Key: HBASE-4196
URL: https://issues.apache.org/jira/browse/HBASE-4196
Project: HBase
Issue Type: Bug
Components: mapreduce
Affects Versions: 0.90.4
Reporter: Jan Lukavsky
In the following scenario, the first record of a region is skipped without being sent to the Mapper:
- the reader is initialized with TableRecordReader.init()
- nextKeyValue is then called, which calls scanner.next(); a ScannerTimeoutException occurs here
- the scanner is restarted by a call to restart(), and then *two* calls to scanner.next() occur, so the first row is lost
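The scenario above can be reproduced with a toy model. The following is a minimal sketch, not HBase code: FakeScanner, SimulatedTimeout and readAll are hypothetical names, and the timeout is modelled as an unchecked exception only to keep the example short. It assumes the recovery path always skips one row after restart(), as the report describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the reported behaviour; none of these types are real HBase classes. */
public class SkippedFirstRowDemo {

    /** Stand-in for the scanner timeout (unchecked here for brevity). */
    static class SimulatedTimeout extends RuntimeException {}

    /** Hypothetical scanner over an in-memory list of row keys. */
    static class FakeScanner {
        private final List<String> rows;
        private int pos;
        private boolean failOnNextCall;

        FakeScanner(List<String> rows, int startPos, boolean failOnNextCall) {
            this.rows = rows;
            this.pos = startPos;
            this.failOnNextCall = failOnNextCall;
        }

        /** Returns the next row key, or null when the rows are exhausted. */
        String next() {
            if (failOnNextCall) {
                failOnNextCall = false;
                throw new SimulatedTimeout();
            }
            return pos < rows.size() ? rows.get(pos++) : null;
        }
    }

    /** Collects the rows a mapper would see when recovery always skips one row. */
    public static List<String> readAll(List<String> rows, boolean failOnFirstCall) {
        List<String> mapped = new ArrayList<>();
        FakeScanner scanner = new FakeScanner(rows, 0, failOnFirstCall);
        int lastMapped = -1; // index of the last row handed to the mapper, -1 if none
        while (true) {
            String row;
            try {
                row = scanner.next();
            } catch (SimulatedTimeout e) {
                // Restart at the last mapped row, or at the region start if there is none.
                scanner = new FakeScanner(rows, Math.max(lastMapped, 0), false);
                scanner.next();       // first call: intended to skip an already mapped row
                row = scanner.next(); // second call: the row actually handed on
            }
            if (row == null) {
                return mapped;
            }
            mapped.add(row);
            lastMapped = rows.indexOf(row);
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2", "row3");
        System.out.println(readAll(rows, false)); // [row1, row2, row3]
        System.out.println(readAll(rows, true));  // [row2, row3]
    }
}
```

When the failure hits the very first call, no row has been mapped yet, so the unconditional skip discards row1.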
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma reassigned HBASE-4196:
------------------------------
Assignee: Ming Ma
[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084482#comment-13084482 ]
Ted Yu commented on HBASE-4196:
-------------------------------
+1 on patch version 2.
Minor comment on formatting: indentation should increase for this.scanner.next():
{code}
+ restart(lastSuccessfulRow);
this.scanner.next(); // skip presumed already mapped row
+ }
{code}
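Going by the snippet above and its "skip presumed already mapped row" comment, the patch appears to make the extra scanner.next() conditional on a row having already been mapped. The following is a guess at that shape on a toy model, not the patch itself: FakeScanner, SimulatedTimeout and readAll are hypothetical names, and only lastSuccessfulRow is taken from the snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a restart that skips only when a row was already mapped. */
public class RestartWithoutSkipDemo {

    /** Stand-in for the scanner failure (unchecked here for brevity). */
    static class SimulatedTimeout extends RuntimeException {}

    /** Hypothetical scanner over an in-memory list of unique row keys. */
    static class FakeScanner {
        private final List<String> rows;
        private int pos;
        private int callsUntilFailure; // 0 means never fail

        FakeScanner(List<String> rows, String startRow, int callsUntilFailure) {
            this.rows = rows;
            this.pos = startRow == null ? 0 : rows.indexOf(startRow);
            this.callsUntilFailure = callsUntilFailure;
        }

        /** Returns the next row key, or null when the rows are exhausted. */
        String next() {
            if (callsUntilFailure > 0 && --callsUntilFailure == 0) {
                throw new SimulatedTimeout();
            }
            return pos < rows.size() ? rows.get(pos++) : null;
        }
    }

    /** Collects the rows a mapper would see; failOnCall is the 1-based call that fails. */
    public static List<String> readAll(List<String> rows, int failOnCall) {
        List<String> mapped = new ArrayList<>();
        FakeScanner scanner = new FakeScanner(rows, null, failOnCall);
        String lastSuccessfulRow = null;
        while (true) {
            String row;
            try {
                row = scanner.next();
            } catch (SimulatedTimeout e) {
                // A null lastSuccessfulRow restarts the scan at the region start.
                scanner = new FakeScanner(rows, lastSuccessfulRow, 0);
                if (lastSuccessfulRow != null) {
                    scanner.next(); // skip presumed already mapped row
                }
                row = scanner.next();
            }
            if (row == null) {
                return mapped;
            }
            mapped.add(row);
            lastSuccessfulRow = row;
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2", "row3");
        System.out.println(readAll(rows, 1)); // failure on the first call
        System.out.println(readAll(rows, 3)); // failure in the middle of the scan
    }
}
```

In this model both runs hand every row to the mapper exactly once, whether the failure happens before or after the first mapped row.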
[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma updated HBASE-4196:
---------------------------
Attachment: HBASE-4196-trunk.patch
Thanks. Here is the update. Also, please note that the mapred version previously handled only UnknownScannerException. It now handles IOException.
[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma updated HBASE-4196:
---------------------------
Attachment: HBASE-4196-trunk.patch
That was due to the svn "-w" flag I used. I have fixed it.
[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "Ming Ma (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma updated HBASE-4196:
---------------------------
Attachment: HBASE-4196-trunk.patch
Here is the fix.
[jira] [Resolved] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-4196.
--------------------------
Resolution: Fixed
Fix Version/s: 0.90.5
Hadoop Flags: [Reviewed]
Committed to branch and trunk. Thanks for the patch, Ming (and for the review, Ted).
[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084735#comment-13084735 ]
Hudson commented on HBASE-4196:
-------------------------------
Integrated in HBase-TRUNK #2113 (See [https://builds.apache.org/job/HBase-TRUNK/2113/])
HBASE-4196 TableRecordReader may skip first row of region
stack :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java
[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084421#comment-13084421 ]
Ted Yu commented on HBASE-4196:
-------------------------------
Patch looks good.
There are two TableRecordReaderImpl.java files, one under mapred and one under mapreduce.
Both of them should be fixed.