Posted to issues@hbase.apache.org by "Jerry He (JIRA)" <ji...@apache.org> on 2016/02/06 21:25:39 UTC

[jira] [Commented] (HBASE-15223) Make convertScanToString public for Spark

    [ https://issues.apache.org/jira/browse/HBASE-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135991#comment-15135991 ] 

Jerry He commented on HBASE-15223:
----------------------------------

The main change in the patch is to make convertScanToString and convertStringToScan in TableMapReduceUtil public so that external users can call them.
Users don't need to be concerned with the internals of the conversion.
The other part of the patch just uses the Scan JSON in toString() instead of the encoded string.
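
For the Spark use case described in the issue, the call pattern might look roughly like the sketch below. It assumes an HBase build that includes this patch (convertScanToString is package private without it), with HBase and Spark on the classpath. The class name SparkScanExample, the table name "my_table", and the row-key prefix "row-" are hypothetical placeholders, not anything taken from the patch.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkScanExample {

  // Builds a configuration that carries the Scan as a string property,
  // which is how TableInputFormat expects to receive it.
  static Configuration buildConf(String tableName, Scan scan) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableInputFormat.INPUT_TABLE, tableName);
    // Only callable from outside the package once the method is public.
    conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan));
    return conf;
  }

  // Hands the configuration to Spark; TableInputFormat decodes the Scan itself.
  static JavaPairRDD<ImmutableBytesWritable, Result> loadRows(
      JavaSparkContext sc, Configuration conf) {
    return sc.newAPIHadoopRDD(conf, TableInputFormat.class,
        ImmutableBytesWritable.class, Result.class);
  }

  public static void main(String[] args) throws IOException {
    Scan scan = new Scan();
    scan.setFilter(new PrefixFilter(Bytes.toBytes("row-"))); // hypothetical prefix

    Configuration conf = buildConf("my_table", scan);         // hypothetical table

    JavaSparkContext sc = new JavaSparkContext("local[*]", "hbase-scan-example");
    try {
      System.out.println("rows matched: " + loadRows(sc, conf).count());
    } finally {
      sc.stop();
    }
  }
}
```

The caller only builds the Scan and passes the resulting string along; the encoding itself stays an implementation detail of TableMapReduceUtil.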

> Make convertScanToString public for Spark
> -----------------------------------------
>
>                 Key: HBASE-15223
>                 URL: https://issues.apache.org/jira/browse/HBASE-15223
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jerry He
>            Assignee: Jerry He
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: HBASE-15223-master.patch
>
>
> One way to access HBase from Spark is to use newAPIHadoopRDD, which can take TableInputFormat as the input format class.  But there is no way to set a Scan object there, for example to set an HBase filter.
> In MR, the public API TableMapReduceUtil.initTableMapperJob() or an equivalent, which can take a Scan object, is used.  But this call cannot be used conveniently from Spark.
> We need to make TableMapReduceUtil.convertScanToString() public, so that a Scan object can be created, populated, and then converted to the property value and used by Spark.  The conversion methods are currently package private.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)