Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/01/07 03:17:34 UTC

[jira] [Commented] (SPARK-5090) The improvement of python converter for hbase

    [ https://issues.apache.org/jira/browse/SPARK-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267184#comment-14267184 ] 

Apache Spark commented on SPARK-5090:
-------------------------------------

User 'GenTang' has created a pull request for this issue:
https://github.com/apache/spark/pull/3920

> The improvement of python converter for hbase
> ---------------------------------------------
>
>                 Key: SPARK-5090
>                 URL: https://issues.apache.org/jira/browse/SPARK-5090
>             Project: Spark
>          Issue Type: Improvement
>          Components: Examples
>    Affects Versions: 1.2.0
>            Reporter: Gen TANG
>              Labels: hbase, python
>             Fix For: 1.2.1
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The Python converter `HBaseResultToStringConverter` provided in HBaseConverter.scala returns only the value of the first column in the result. This limits the utility of the converter: it returns only one value per row (even though HBase may hold several versions), and it loses the other information in the record, such as column:cell and timestamp.
> We propose improving the Python converter so that it returns all the records in the result (in a single string) with more complete information. We would also like to make some improvements to hbase_inputformat.py.
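
To illustrate the limitation described above, here is a minimal sketch in plain Python. It does not use HBase or Spark. The `Cell` tuple, the function names, and the newline-separated JSON encoding are all assumptions made for illustration; the issue only says the records would be returned "in a single string" and does not specify a format.

```python
import json
from typing import List, NamedTuple, Optional


class Cell(NamedTuple):
    """Hypothetical stand-in for one HBase cell (not an HBase or Spark type)."""
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


def first_value_only(cells: List[Cell]) -> Optional[str]:
    """Mimic the behavior described in the issue: keep only the first value."""
    return cells[0].value if cells else None


def all_cells_as_string(cells: List[Cell]) -> str:
    """Assumed encoding: one JSON object per cell, joined by newlines."""
    return "\n".join(json.dumps(c._asdict(), sort_keys=True) for c in cells)


def parse_cells(encoded: str) -> List[dict]:
    """Recover the per-cell records on the Python side."""
    return [json.loads(line) for line in encoded.split("\n") if line]


# Made-up sample data: two versions of f1:a and one version of f1:b.
cells = [
    Cell("row1", "f1", "a", 1420000000000, "value1"),
    Cell("row1", "f1", "a", 1410000000000, "value0"),
    Cell("row1", "f1", "b", 1420000000000, "value2"),
]

lossy = first_value_only(cells)                  # only "value1" survives
records = parse_cells(all_cells_as_string(cells))  # all three cells survive
```

With the first function, the older version of f1:a and the whole f1:b column are dropped, along with every timestamp. With the second, each cell keeps its row, column, timestamp, and value, at the cost of one extra parsing step in Python.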



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
