Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2013/01/29 16:21:13 UTC
[jira] [Comment Edited] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565443#comment-13565443 ]
Ted Yu edited comment on HBASE-7495 at 1/29/13 3:20 PM:
--------------------------------------------------------
{code}
+ * Do StoreFileScanner.seek() in parallel
+ * @throws IOException
+ */
+ private void parallelSeek(final List<? extends KeyValueScanner>
+ scanners, final KeyValue keyValue) throws IOException {
{code}
Add javadoc for the new method.
{code}
+ if (scanner instanceof StoreFileScanner) {
+ storeFileScannerNum++;
{code}
Rename storeFileScannerNum to storeFileScannerCount.
{code}
+ for (ScannerSeekWorker worker : workers) {
+ worker.start();
+ }
{code}
Why use a separate loop to start the workers? They could be started in the loop that creates them.
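To illustrate the single-loop suggestion, here is a minimal sketch. It assumes the workers are plain threads; StartWorkersSketch and startAll are hypothetical names used only for illustration and are not part of the patch.

```java
import java.util.ArrayList;
import java.util.List;

public class StartWorkersSketch {
    /**
     * Creates one thread per task and starts it in the same loop iteration,
     * so no second pass over the worker list is needed.
     */
    static List<Thread> startAll(List<Runnable> tasks) {
        List<Thread> workers = new ArrayList<>(tasks.size());
        for (Runnable task : tasks) {
            Thread worker = new Thread(task);
            worker.start();      // start immediately after creation
            workers.add(worker); // keep a handle so the caller can join later
        }
        return workers;
    }
}
```

The caller still needs one loop afterwards to join the workers and collect their errors.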
{code}
+ } catch (InterruptedException e) {
+ LOG.error("", e);
+ }
{code}
Restore interrupt status or throw InterruptedIOException.
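A sketch of what this could look like, assuming the workers are joined as plain threads. InterruptHandlingSketch and joinAll are hypothetical names, not code from the patch. The sketch does both things (restores the flag and throws); either one alone would already address the comment.

```java
import java.io.InterruptedIOException;
import java.util.List;

public class InterruptHandlingSketch {
    /** Waits for all workers; an interrupt surfaces as InterruptedIOException. */
    static void joinAll(List<Thread> workers) throws InterruptedIOException {
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                // join() cleared the flag; set it again so callers can observe it.
                Thread.currentThread().interrupt();
                InterruptedIOException iioe =
                    new InterruptedIOException("interrupted while waiting for seek workers");
                iioe.initCause(e);
                throw iioe;
            }
        }
    }
}
```

Logging the exception and carrying on, as the patch does, hides the interrupt from the code that requested it.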
{code}
+ for (ScannerSeekWorker worker : workers) {
+ if (worker.getErr() != null) {
+ throw new IOException(worker.getErr());
+ }
+ }
{code}
Use MultipleIOException so that the user learns about every error, not just the first one.
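The idea is to collect every worker error before throwing. The sketch below uses only the JDK (suppressed exceptions) as a stand-in so that it runs on its own; AggregateErrorsSketch and combine are hypothetical names. In the patch itself, Hadoop's MultipleIOException.createIOException(List<IOException>) should play the role of combine, if I recall that API correctly.

```java
import java.io.IOException;
import java.util.List;

public class AggregateErrorsSketch {
    /**
     * Returns null when there are no errors, the error itself when there is
     * exactly one, and otherwise a wrapper carrying all of them as suppressed.
     */
    static IOException combine(List<IOException> errors) {
        if (errors.isEmpty()) {
            return null;
        }
        if (errors.size() == 1) {
            return errors.get(0);
        }
        IOException combined = new IOException(errors.size() + " seek workers failed");
        for (IOException e : errors) {
            combined.addSuppressed(e); // keep every failure visible in the stack trace
        }
        return combined;
    }
}
```

The caller would loop over all workers, add each non-null getErr() to a list, and throw the combined exception once after the loop.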
{code}
+ } catch (IOException e) {
+ LOG.info("", e);
{code}
Change the log level to error.
was (Author: yuzhihong@gmail.com):
{code}
+ * Do StoreFileScanner.seek() in parallel
+ * @throws IOException
+ */
+ private void parallelSeek(final List<? extends KeyValueScanner>
+ scanners, final KeyValue keyValue) throws IOException {
{code}
Add javadoc for the new method.
{code}
+ if (scanner instanceof StoreFileScanner) {
+ storeFileScannerNum++;
{code}
Rename storeFileScannerNum to storeFileScannerCount.
{code}
+ for (ScannerSeekWorker worker : workers) {
+ worker.start();
+ }
{code}
Why use a separate loop to start the workers?
{code}
+ for (KeyValueScanner scanner : scanners) {
+ if (!(scanner instanceof StoreFileScanner)) {
+ scanner.seek(keyValue);
+ }
+ }
{code}
Why use a third loop to seek the non-StoreFileScanner scanners?
{code}
+ } catch (InterruptedException e) {
+ LOG.error("", e);
+ }
{code}
Restore interrupt status or throw InterruptedIOException.
{code}
+ for (ScannerSeekWorker worker : workers) {
+ if (worker.getErr() != null) {
+ throw new IOException(worker.getErr());
+ }
+ }
{code}
Use MultipleIOException so that user knows about more than one error.
{code}
+ } catch (IOException e) {
+ LOG.info("", e);
{code}
Change to error level.
> parallel seek in StoreScanner
> -----------------------------
>
> Key: HBASE-7495
> URL: https://issues.apache.org/jira/browse/HBASE-7495
> Project: HBase
> Issue Type: Bug
> Components: Scanners
> Affects Versions: 0.94.3, 0.96.0
> Reporter: Liang Xie
> Assignee: Liang Xie
> Attachments: HBASE-7495.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495-v2.txt, HBASE-7495-v3.txt, HBASE-7495-v4.txt, HBASE-7495-v4.txt
>
>
> There seems to be room for improvement before calling scanner.next:
> {code:title=StoreScanner.java|borderStyle=solid}
> if (explicitColumnQuery && lazySeekEnabledGlobally) {
> for (KeyValueScanner scanner : scanners) {
> scanner.requestSeek(matcher.getStartKey(), false, true);
> }
> } else {
> for (KeyValueScanner scanner : scanners) {
> scanner.seek(matcher.getStartKey());
> }
> }
> {code}
> We could call scanner.requestSeek or scanner.seek in parallel, instead of serially as is done now, to reduce latency in special cases.
> Any ideas? I'll give it a try if the comments and suggestions are positive :)
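For reference, the proposal in the description can be sketched roughly as follows. This is not the patch: it uses an ExecutorService rather than the patch's ScannerSeekWorker threads, and Seeker is a hypothetical stand-in for KeyValueScanner with a String key so that the example is self-contained.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class ParallelSeekSketch {
    /** Hypothetical stand-in for KeyValueScanner; only seek matters here. */
    interface Seeker {
        boolean seek(String key) throws IOException;
    }

    /** Submits one seek per scanner, then waits for all of them to finish. */
    static void parallelSeek(List<? extends Seeker> scanners, final String key,
                             ExecutorService pool) throws IOException {
        List<Future<Boolean>> pending = new ArrayList<>();
        for (final Seeker scanner : scanners) {
            Callable<Boolean> task = () -> scanner.seek(key);
            pending.add(pool.submit(task));
        }
        for (Future<Boolean> f : pending) {
            try {
                f.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible
                throw new IOException("interrupted during parallel seek", e);
            } catch (ExecutionException e) {
                Throwable cause = e.getCause();
                throw cause instanceof IOException
                    ? (IOException) cause
                    : new IOException(cause);
            }
        }
    }
}
```

This version stops at the first failed seek. Whether the extra threads pay off depends on how many store files a scan touches and on how much of the seek time is spent in I/O.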