Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2013/01/29 16:21:13 UTC

[jira] [Comment Edited] (HBASE-7495) parallel seek in StoreScanner

    [ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565443#comment-13565443 ] 

Ted Yu edited comment on HBASE-7495 at 1/29/13 3:20 PM:
--------------------------------------------------------

{code}
+   * Do StoreFileScanner.seek() in parallel
+   * @throws IOException
+   */
+  private void parallelSeek(final List<? extends KeyValueScanner>
+      scanners, final KeyValue keyValue) throws IOException {
{code}
Add javadoc for the new method.
{code}
+      if (scanner instanceof StoreFileScanner) {
+        storeFileScannerNum++;
{code}
Rename storeFileScannerNum to storeFileScannerCount.
{code}
+    for (ScannerSeekWorker worker : workers) {
+      worker.start();
+    }
{code}
Why use a separate loop to start the workers? They could be started in the loop that creates them.
{code}
+    } catch (InterruptedException e) {
+      LOG.error("", e);
+    }
{code}
Restore interrupt status or throw InterruptedIOException.
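A minimal sketch of what that could look like, assuming the caller joins plain worker threads. The names InterruptHandlingSketch and joinWorker are hypothetical and not part of the patch; the sketch does both things at once, restoring the flag and rethrowing:

```java
import java.io.InterruptedIOException;

final class InterruptHandlingSketch {
  /**
   * Waits for one worker thread (hypothetical helper). An interrupt is not
   * swallowed: the flag is restored and an InterruptedIOException is thrown.
   */
  static void joinWorker(Thread worker) throws InterruptedIOException {
    try {
      worker.join();
    } catch (InterruptedException e) {
      // join() cleared the interrupt flag; set it again so callers can see it.
      Thread.currentThread().interrupt();
      InterruptedIOException iioe =
          new InterruptedIOException("interrupted while waiting for a seek worker");
      iioe.initCause(e);
      throw iioe;
    }
  }
}
```

Since parallelSeek already declares IOException, throwing InterruptedIOException (a subclass) needs no signature change.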
{code}
+    for (ScannerSeekWorker worker : workers) {
+      if (worker.getErr() != null) {
+        throw new IOException(worker.getErr());
+      }
+    }
{code}
Use MultipleIOException so that the user learns about every worker error, not just the first one.
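To illustrate the idea without pulling in Hadoop classes, here is a stdlib-only sketch that collects every error before throwing. It is a stand-in, not the real thing: in the patch, Hadoop's MultipleIOException.createIOException(List<IOException>) would do the aggregation. SeekErrorAggregationSketch and throwIfAny are hypothetical names, and attaching the errors via addSuppressed is my substitution:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class SeekErrorAggregationSketch {
  /**
   * Throws nothing for an empty list, the sole error if there is exactly one,
   * and otherwise one IOException carrying all errors as suppressed exceptions.
   */
  static void throwIfAny(List<? extends Throwable> errors) throws IOException {
    if (errors.isEmpty()) {
      return;
    }
    List<IOException> wrapped = new ArrayList<>();
    for (Throwable t : errors) {
      // Keep IOExceptions as they are; wrap anything else.
      wrapped.add(t instanceof IOException ? (IOException) t : new IOException(t));
    }
    if (wrapped.size() == 1) {
      throw wrapped.get(0);
    }
    IOException aggregate = new IOException(wrapped.size() + " seek workers failed");
    for (IOException e : wrapped) {
      aggregate.addSuppressed(e);
    }
    throw aggregate;
  }
}
```

The caller would gather each non-null worker.getErr() into a list first and call the helper once after the loop, instead of throwing on the first hit.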
{code}
+      } catch (IOException e) {
+        LOG.info("", e);
{code}
Log this at error level instead of info.
                
      was (Author: yuzhihong@gmail.com):
    {code}
+   * Do StoreFileScanner.seek() in parallel
+   * @throws IOException
+   */
+  private void parallelSeek(final List<? extends KeyValueScanner>
+      scanners, final KeyValue keyValue) throws IOException {
{code}
Add javadoc for the new method.
{code}
+      if (scanner instanceof StoreFileScanner) {
+        storeFileScannerNum++;
{code}
Rename storeFileScannerNum to storeFileScannerCount.
{code}
+    for (ScannerSeekWorker worker : workers) {
+      worker.start();
+    }
{code}
Why use a separate loop to start the workers? They could be started in the loop that creates them.
{code}
+    for (KeyValueScanner scanner : scanners) {
+      if (!(scanner instanceof StoreFileScanner)) {
+        scanner.seek(keyValue);
+      }
+    }
{code}
Why use a third loop to seek the non-StoreFileScanner scanners?
{code}
+    } catch (InterruptedException e) {
+      LOG.error("", e);
+    }
{code}
Restore interrupt status or throw InterruptedIOException.
{code}
+    for (ScannerSeekWorker worker : workers) {
+      if (worker.getErr() != null) {
+        throw new IOException(worker.getErr());
+      }
+    }
{code}
Use MultipleIOException so that the user learns about every worker error, not just the first one.
{code}
+      } catch (IOException e) {
+        LOG.info("", e);
{code}
Log this at error level instead of info.
                  
> parallel seek in StoreScanner
> -----------------------------
>
>                 Key: HBASE-7495
>                 URL: https://issues.apache.org/jira/browse/HBASE-7495
>             Project: HBase
>          Issue Type: Bug
>          Components: Scanners
>    Affects Versions: 0.94.3, 0.96.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HBASE-7495.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495-v2.txt, HBASE-7495-v3.txt, HBASE-7495-v4.txt, HBASE-7495-v4.txt
>
>
> There seems to be room for improvement before calling scanner.next:
> {code:title=StoreScanner.java|borderStyle=solid}
>     if (explicitColumnQuery && lazySeekEnabledGlobally) {
>       for (KeyValueScanner scanner : scanners) {
>         scanner.requestSeek(matcher.getStartKey(), false, true);
>       }
>     } else {
>       for (KeyValueScanner scanner : scanners) {
>         scanner.seek(matcher.getStartKey());
>       }
>     }
> {code} 
> We could issue scanner.requestSeek or scanner.seek in parallel, instead of serially as the code does now, to reduce latency in special cases.
> Any ideas? I'll give it a try if the comments/suggestions are positive :)
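
For reference, the proposal quoted above can be sketched with an ExecutorService rather than hand-started worker threads. This is only an illustration under stated assumptions, not what the attached patches do: SeekableScanner is a hypothetical stand-in for KeyValueScanner, the key is a plain String instead of a KeyValue, and the pool is supplied by the caller.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class ParallelSeekSketch {
  /** Hypothetical stand-in for KeyValueScanner; only seek matters here. */
  interface SeekableScanner {
    boolean seek(String key) throws IOException;
  }

  /** Seeks every scanner to the same key, one task per scanner. */
  static void parallelSeek(List<? extends SeekableScanner> scanners, String key,
      ExecutorService pool) throws IOException {
    List<Callable<Boolean>> tasks = new ArrayList<>();
    for (SeekableScanner scanner : scanners) {
      tasks.add(() -> scanner.seek(key));
    }
    try {
      // invokeAll blocks until every task has finished or failed.
      for (Future<Boolean> result : pool.invokeAll(tasks)) {
        result.get(); // surfaces the first failure in list order
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      InterruptedIOException iioe = new InterruptedIOException("parallel seek interrupted");
      iioe.initCause(e);
      throw iioe;
    } catch (ExecutionException e) {
      Throwable cause = e.getCause();
      throw cause instanceof IOException ? (IOException) cause : new IOException(cause);
    }
  }
}
```

Whether this beats the serial loop depends on how many scanners there are and how expensive each seek is, so it would need measuring.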
