Posted to user@phoenix.apache.org by Hafiz Mujadid <ha...@gmail.com> on 2015/10/02 17:50:01 UTC

Append data in HBase using spark-phoenix

Hi all!

I want to append data to an HBase table using the spark-phoenix connector. How
can I append data to an existing table?


thanks

Re: Append data in HBase using spark-phoenix

Posted by Konstantinos Kougios <ko...@googlemail.com>.
Hi,

Use rdd.saveToPhoenix(....), where rdd must be an RDD of tuples. Under the 
hood this issues Phoenix UPSERTs, so running it against an existing table 
appends new rows and overwrites rows whose primary key already exists.
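
In its simplest form it looks like this (a minimal sketch, assuming sc is an 
existing SparkContext and zk-host:2181 stands in for your ZooKeeper quorum):

import org.apache.phoenix.spark._   // adds saveToPhoenix to RDDs of tuples

// each tuple maps positionally onto the column list
sc.parallelize(Seq((100L, "new row", 1), (101L, "another row", 2)))
   .saveToPhoenix("OUTPUT_TEST_TABLE", Seq("ID", "COL1", "COL2"),
      zkUrl = Some("zk-host:2181"))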

For a complete example, create a table:

CREATE TABLE OUTPUT_TEST_TABLE (
    id BIGINT NOT NULL PRIMARY KEY,
    col1 VARCHAR,
    col2 INTEGER
) SALT_BUCKETS = 8;


and run this job:

package com.aktit.phoenix

import org.apache.spark.{Logging, SparkConf, SparkContext}

/**
 * Populates the table:
 *
 * CREATE TABLE OUTPUT_TEST_TABLE (id BIGINT NOT NULL PRIMARY KEY, col1 VARCHAR, col2 INTEGER) SALT_BUCKETS = 8;
 *
 * @author kostas.kougios
 *         Date: 21/09/15
 */
object SamplePopulateJob extends Logging
{
   def main(args: Array[String]): Unit = {
      // number of chunks to split the row range into
      val Divider = 4096
      val conf = new SparkConf().setAppName(getClass.getName)
      val hbaseZookeeper = conf.get("spark.hbase.zookeeper")
      val numOfRows = conf.getLong("spark.num-of-rows", 10)

      val sc = new SparkContext(conf)

      try {
         // brings saveToPhoenix into scope for RDDs of tuples
         import org.apache.phoenix.spark._

         // one start key per chunk of rows
         val seq = for (i <- 1L to numOfRows by numOfRows / Divider) yield i

         sc.parallelize(seq).flatMap {
            k =>
               logInfo(s"at $k")
               // expand each start key into a chunk of (ID, COL1, COL2) tuples
               val it = (k to (k + numOfRows / Divider)).iterator.map(i => (i, s"row $i", i % (1024 * 1024)))
               it
         }.saveToPhoenix(
            "OUTPUT_TEST_TABLE", Seq("ID", "COL1", "COL2"), zkUrl = Some(hbaseZookeeper)
         )
      } finally {
         sc.stop()
      }
   }
}
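
To run it, submit the job with the two configuration properties it reads. 
Something along these lines (the jar name, ZooKeeper quorum, and row count are 
placeholders, and the Phoenix client jar must be on the classpath):

spark-submit \
   --class com.aktit.phoenix.SamplePopulateJob \
   --conf spark.hbase.zookeeper=zk-host:2181 \
   --conf spark.num-of-rows=1000000 \
   sample-populate-job.jar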

On 02/10/15 16:50, Hafiz Mujadid wrote:
> Hi all!
>
> I want to append data to an HBase table using the spark-phoenix connector.
> How can I append data to an existing table?
>
>
> thanks