Posted to user@spark.apache.org by SparkUser6 <al...@gmail.com> on 2018/05/18 00:21:26 UTC
Getting Data From HBase using Spark is Extremely Slow
I have written a simple four-line Spark program to process data in a Phoenix
table:
    // Build the query that reads from the Phoenix table: select col from table
    queryString = getQueryFullString();
    JavaPairRDD<NullWritable, TestWritable> phRDD = jsc.newAPIHadoopRDD(
            configuration,
            PhoenixInputFormat.class,
            NullWritable.class,
            TestWritable.class);
    // The goal is only to scan all the data, so every record is mapped to 1L.
    JavaRDD<Long> rdd = phRDD.map(
            new Function<Tuple2<NullWritable, TestWritable>, Long>() {
        @Override
        public Long call(Tuple2<NullWritable, TestWritable> tuple) throws Exception {
            return 1L;
        }
    });
    // count() is the action that triggers the full scan.
    System.out.println(rdd.count());
This program takes 2 hours to process 2 million records. Can anyone help
me understand what is wrong?