Posted to issues@hbase.apache.org by "Anoop Sam John (Commented) (JIRA)" <ji...@apache.org> on 2012/02/06 11:46:05 UTC
[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

    [ https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201205#comment-13201205 ] 

Anoop Sam John commented on HBASE-2038:
---------------------------------------

Hi Lars,
          I am also working on a secondary index, and the IHBase concept looks good to me. But we need it moved to a coprocessor-based design so that the core HBase code need not be modified for the secondary index. IHBase makes the scan go through all the regions (as you said), but within each region it skips ahead and seeks to later positions in the heap, which avoids a lot of unnecessary data reads from HDFS.
Looking at the current coprocessor framework, preScannerNext() is called from HRegionServer next(final long scannerId, int nbRows), and the RegionScanner is passed to the coprocessor there. In the IHBase approach, the coprocessor should be able to seek to the row where the indexed column value equals our value. We cannot do this as of now, because RegionScanner has no seek().

Also, preScannerNext() is called only once, before the actual next(final long scannerId, int nbRows) call happens on the region. Depending on the caching value set at the client side, nbRows may be more than one. Suppose nbRows=2 and the region has 2 matching rows, one somewhere in the middle of an HFile and the other in another HFile. As per IHBase, we should first seek to the position of the first row and, after reading it, seek to the position of the next one. With the current way preScannerNext() is called, this won't be possible. So I think we might need some changes in this area. What do you say?
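
To make the seek, read, seek pattern concrete, here is a minimal Java sketch. It does not use any HBase API: SeekableScanner, nextWithIndex and the in-memory TreeMap standing in for a region are all hypothetical names invented for illustration, and seek() is exactly the operation that RegionScanner does not offer today.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class IndexedSeekSketch {

    /** Hypothetical stand-in for a scanner that can seek; NOT an HBase class. */
    static final class SeekableScanner {
        private final NavigableMap<String, String> rows;
        private String position;   // key of the next row to return, null when exhausted
        int rowsRead = 0;          // how many rows were actually read

        SeekableScanner(NavigableMap<String, String> rows) {
            this.rows = rows;
            this.position = rows.isEmpty() ? null : rows.firstKey();
        }

        /** Jump to the first row whose key is greater than or equal to rowKey. */
        void seek(String rowKey) {
            position = rows.ceilingKey(rowKey);
        }

        /** Return the key of the current row and advance, or null if exhausted. */
        String next() {
            if (position == null) {
                return null;
            }
            String current = position;
            position = rows.higherKey(current);
            rowsRead++;
            return current;
        }
    }

    /**
     * Assumed shape of an index-driven next(): for each row key supplied by the
     * index, seek to it, read that one row, then seek to the following key,
     * stopping once nbRows rows have been collected.
     */
    static List<String> nextWithIndex(SeekableScanner scanner,
                                      List<String> indexedRowKeys,
                                      int nbRows) {
        List<String> out = new ArrayList<>();
        for (String key : indexedRowKeys) {
            if (out.size() >= nbRows) {
                break;
            }
            scanner.seek(key);
            String row = scanner.next();
            if (key.equals(row)) {
                out.add(row);
            }
        }
        return out;
    }

    /** Builds 100 sample rows, row-00 to row-99, as a toy region. */
    static NavigableMap<String, String> sampleRows() {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (int i = 0; i < 100; i++) {
            rows.put("row-" + (i < 10 ? "0" : "") + i, "value-" + i);
        }
        return rows;
    }

    public static void main(String[] args) {
        SeekableScanner scanner = new SeekableScanner(sampleRows());
        // Pretend the in-memory index said these two rows match the indexed value.
        List<String> matches = Arrays.asList("row-10", "row-75");
        List<String> result = nextWithIndex(scanner, matches, 2);
        System.out.println(result + " rowsRead=" + scanner.rowsRead);
    }
}
```

With nbRows=2 the sketch touches only the two indexed rows and skips the other 98. A hook that fires once before next(), with no seek() available, has no way to perform the second jump.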

Meanwhile, what is your plan: will you continue with the IHBase way of storing the index in memory for each region, or make some change to this?
                
> Coprocessors: Region level indexing
> -----------------------------------
>
>                 Key: HBASE-2038
>                 URL: https://issues.apache.org/jira/browse/HBASE-2038
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors
>            Reporter: Andrew Purtell
>            Priority: Minor
>
> HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as a good goalpost for coprocessor environment design -- there should be enough of it so region level indexing can be reimplemented as a coprocessor without any loss of functionality.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira