Posted to issues@hbase.apache.org by "Liyin Tang (JIRA)" <ji...@apache.org> on 2012/12/04 01:25:58 UTC

[jira] [Created] (HBASE-7266) [89-fb] Using pread for non-compaction read request

Liyin Tang created HBASE-7266:
---------------------------------

             Summary: [89-fb] Using pread for non-compaction read request
                 Key: HBASE-7266
                 URL: https://issues.apache.org/jira/browse/HBASE-7266
             Project: HBase
          Issue Type: Improvement
            Reporter: Liyin Tang


There are two kinds of read operations in HBase: pread and seek+read.
Pread (positional read) is stateless and creates a new connection between the DFSClient and the DataNode for each operation. Seek+read, by contrast, seeks to a specific position and prefetches blocks from the data nodes. The benefit of seek+read is that it caches the prefetched result; the downside is that it is stateful and needs to be synchronized.
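
As a rough illustration of the stateless versus stateful distinction, here is a minimal sketch on a local file using java.nio's FileChannel. It is an analogy only: the class ReadModes and its methods are hypothetical, it is not HBase or HDFS code, and it does not model DataNode connections or DFSClient prefetching.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration on a local file; not HBase or HDFS code.
class ReadModes {
    private final FileChannel channel;
    private final Object seekLock = new Object();

    ReadModes(FileChannel channel) {
        this.channel = channel;
    }

    // Positional read: the offset travels with the call and the channel's
    // own position is left untouched, so callers need no shared lock.
    byte[] pread(long offset, int length) {
        try {
            ByteBuffer buf = ByteBuffer.allocate(length);
            long pos = offset;
            while (buf.hasRemaining()) {
                int n = channel.read(buf, pos);
                if (n < 0) break; // end of file
                pos += n;
            }
            return Arrays.copyOf(buf.array(), buf.position());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // seek+read: moves the shared cursor and then reads from it, so the
    // two steps must be serialized across threads.
    byte[] seekAndRead(long offset, int length) {
        synchronized (seekLock) {
            try {
                channel.position(offset);
                ByteBuffer buf = ByteBuffer.allocate(length);
                while (buf.hasRemaining()) {
                    if (channel.read(buf) < 0) break; // end of file
                }
                return Arrays.copyOf(buf.array(), buf.position());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Runs both read modes against a ten-byte temporary file and reports
    // the bytes read plus the channel position after each call.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("readmodes", ".bin");
            try {
                Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    ReadModes r = new ReadModes(ch);
                    String a = new String(r.pread(2, 3), StandardCharsets.US_ASCII);
                    long afterPread = ch.position();
                    String b = new String(r.seekAndRead(5, 4), StandardCharsets.US_ASCII);
                    long afterSeek = ch.position();
                    return a + "," + afterPread + "," + b + "," + afterSeek;
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, ReadModes.demo() returns "234,0,5678,9": the positional read leaves the channel position at 0, while seek+read advances the shared cursor, which is why the latter needs the lock.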

So far, both compaction and scan are using pread, which causes some resource contention. So using pread for the scan request can avoid the resource contention. In addition, the region server is able to do the prefetch for the scan request (HBASE-6874), so it will no longer be necessary to let the DFSClient prefetch the data.

I will run the scan benchmark (with no block cache) to verify the performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-7266) [89-fb] Using pread for non-compaction read request

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509392#comment-13509392 ] 

Ted Yu commented on HBASE-7266:
-------------------------------

bq. So far, both compaction and scan are using pread, which caused some resource contention. So using the pread for the scan request can avoid the resource contention.
Looks like the above description needs some rephrasing.
                


[jira] [Commented] (HBASE-7266) [89-fb] Using pread for non-compaction read request

Posted by "binlijin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509466#comment-13509466 ] 

binlijin commented on HBASE-7266:
---------------------------------

[~lhofhansl] I don't get it: for a scan, the 'pread' parameter is determined by Scan.isGetScan(), which is
{code}
  // From the Scan class:
  public boolean isGetScan() {
    return this.startRow != null && this.startRow.length > 0 &&
      Bytes.equals(this.startRow, this.stopRow);
  }
{code}
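
To make the predicate concrete, here is a minimal standalone sketch of the same check. The class GetScanCheck and the helper row() are hypothetical stand-ins, and java.util.Arrays.equals is swapped in for Bytes.equals so that the example runs without HBase on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the quoted check; not the HBase Scan class.
class GetScanCheck {
    // True only when the scan covers exactly one non-empty row,
    // i.e. the start row and stop row are the same bytes.
    static boolean isGetScan(byte[] startRow, byte[] stopRow) {
        return startRow != null && startRow.length > 0
            && Arrays.equals(startRow, stopRow);
    }

    // Helper (an assumption for this sketch) to build row keys from strings.
    static byte[] row(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Under this check, isGetScan(row("r1"), row("r1")) is true, while isGetScan(row("r1"), row("r9")) and isGetScan(new byte[0], new byte[0]) are both false, so only a single-row scan would be treated as a get.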
                


[jira] [Updated] (HBASE-7266) [89-fb] Using pread for non-compaction read request

Posted by "Liyin Tang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-7266:
------------------------------

    Description: 
There are two kinds of read operations in HBase: pread and seek+read.
Pread (positional read) is stateless and creates a new connection between the DFSClient and the DataNode for each operation. Seek+read, by contrast, seeks to a specific position and prefetches blocks from the data nodes. The benefit of seek+read is that it caches the prefetched result; the downside is that it is stateful and needs to be synchronized.

So far, both compaction and scan are using seek+read, which causes some resource contention. So using pread for the scan request can avoid the resource contention. In addition, the region server is able to do the prefetch for the scan request (HBASE-6874), so it will no longer be necessary to let the DFSClient prefetch the data.

I will run the scan benchmark (with no block cache) to verify the performance.

  was:
There are two kinds of read operations in HBase: pread and seek+read.
Pread (positional read) is stateless and creates a new connection between the DFSClient and the DataNode for each operation. Seek+read, by contrast, seeks to a specific position and prefetches blocks from the data nodes. The benefit of seek+read is that it caches the prefetched result; the downside is that it is stateful and needs to be synchronized.

So far, both compaction and scan are using pread, which causes some resource contention. So using pread for the scan request can avoid the resource contention. In addition, the region server is able to do the prefetch for the scan request (HBASE-6874), so it will no longer be necessary to let the DFSClient prefetch the data.

I will run the scan benchmark (with no block cache) to verify the performance.

    


[jira] [Commented] (HBASE-7266) [89-fb] Using pread for non-compaction read request

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509398#comment-13509398 ] 

Lars Hofhansl commented on HBASE-7266:
--------------------------------------

Looks like in 0.94+ we're already doing this (see HFileReaderV2)
                
