Posted to yarn-issues@hadoop.apache.org by "Vrushali C (JIRA)" <ji...@apache.org> on 2016/06/11 01:44:21 UTC

[jira] [Comment Edited] (YARN-5070) upgrade HBase version for first merge

    [ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325613#comment-15325613 ] 

Vrushali C edited comment on YARN-5070 at 6/11/16 1:43 AM:
-----------------------------------------------------------

bq. l.120: From the javadoc, it appears that ScannerContext keeps track of the progress towards the limits.  If the progress should be monitored across multiple invocations of nextRaw(List<Cell>) .  I'm not sure if this will do that.

Yes, my understanding is that the progress is tracked within the context of a single invocation of the "next" call, not across invocations. The ScannerContext class does have a keepProgress setting in case we want to track progress across RPCs, but this patch does not use it.
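If we ever did want to carry progress across calls, my reading of the ScannerContext builder is that it would look roughly like the sketch below (just a sketch from my reading of the class, with batchSize as a placeholder; an HBase person should confirm):

{code}
// Sketch only (not what the patch does): passing keepProgress=true to the
// builder asks ScannerContext to keep accumulated progress across next()
// calls instead of resetting it on every invocation.
ScannerContext acrossCallsContext =
    ScannerContext.newBuilder(true).setBatchLimit(batchSize).build();
{code}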
 
The documentation in the ScannerContext class says
https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=blob;f=hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScannerContext.java;h=29bffd26753795f33b90f31e9b77a5d1387e5cd7;hb=refs/heads/branch-1.1

{code}
 * ScannerContext instances encapsulate limit tracking AND progress towards those limits during
 * invocations of {@link InternalScanner#next(java.util.List)} and
 * {@link RegionScanner#next(java.util.List)}.

{code}

For the flow run coprocessor, the nextRaw/next functions call the nextInternal function, which is the one that actually does the iteration; hence the batch limit is set up so that it is enforced there.
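To make that concrete, the shape of the usage in the flow run coprocessor's scanner is roughly the following (a simplified sketch rather than the exact patch code; batchSize and regionScanner are placeholder names):

{code}
// Build a per-call ScannerContext carrying the batch limit, then let the
// underlying region scanner's nextRaw (and hence nextInternal) enforce it
// while iterating.
ScannerContext scannerContext =
    ScannerContext.newBuilder().setBatchLimit(batchSize).build();
List<Cell> cells = new ArrayList<>();
boolean hasMore = regionScanner.nextRaw(cells, scannerContext);
{code}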

bq. Are we even supposed to create instances of ScannerContext? Am I off? I'm basically not sure what is the correct way of using the ScannerContext.
An example of how ScannerContext is used in the HBase region server code:
https://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/StoreFlusher.html
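Paraphrasing the relevant part of that StoreFlusher code (a paraphrase of the linked source, not a verbatim copy):

{code}
// Create a context with a batch limit, then drain the scanner in batches:
// each next() call fills kvs with at most compactionKVMax cells and returns
// whether there is more to read.
ScannerContext scannerContext =
    ScannerContext.newBuilder().setBatchLimit(compactionKVMax).build();
List<Cell> kvs = new ArrayList<>();
boolean hasMore;
do {
  hasMore = scanner.next(kvs, scannerContext);
  // ... process/write out kvs ...
  kvs.clear();
} while (hasMore);
{code}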

bq. Are we certain that batchLimit is the correct one to use in ScannerContext? 
batchLimit is the limit that tracks the batch size during the scan's next call, hence we are using it. There are other settings, such as max results per column family or max result size, which I believe have corresponding limits in ScannerContext; we are not tracking those in FlowScanner.

That said, all of this is what I have gathered from reading the code myself. I would appreciate feedback from an HBase person.




> upgrade HBase version for first merge
> -------------------------------------
>
>                 Key: YARN-5070
>                 URL: https://issues.apache.org/jira/browse/YARN-5070
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-5070-YARN-2928.01.patch, YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, YARN-5070-YARN-2928.04.patch
>
>
> Currently we set the HBase version for the timeline service storage to 1.0.1. This is a fairly old version, and there are reasons to upgrade to a newer version. We should upgrade it.


