Posted to user@hbase.apache.org by Mich Talebzadeh <mi...@gmail.com> on 2016/12/29 12:10:02 UTC
Reading specific column family and columns in Hbase table through spark
Hi,
I have a routine in Spark that iterates through HBase rows and tries to
read columns.
My question is: how can I read the columns in the correct order?
Example:
val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])
val parsed = hBaseRDD.map { case (b, a) =>
  // Cells come back in whatever order the Result holds them
  val iter = a.list().iterator()
  (Bytes.toString(a.getRow()),
   Bytes.toString(iter.next().getValue()),
   Bytes.toString(iter.next().getValue()),
   Bytes.toString(iter.next().getValue()),
   Bytes.toString(iter.next().getValue()))
}
The above reads the columns of the column family sequentially. How can I
force it to read specific columns only?
Thanks
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
Re: Reading specific column family and columns in Hbase table through spark
Posted by Nkechi Achara <nk...@googlemail.com>.
Hey Mich,
Are you setting the column family / qualifier values in the config?
e.g.
config.set(TableInputFormat.SCAN_COLUMN_FAMILY, "cf") // column family
config.set(TableInputFormat.SCAN_COLUMNS, "cf1:cq1 cf1:cq2") // column qualifiers
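As a small illustration of the format the second setting takes:
TableInputFormat.SCAN_COLUMNS is a space-separated list of
family:qualifier entries. The helper below is hypothetical (it is not part
of HBase), and cf1, cq1 and cq2 are placeholder names.

```scala
// Hypothetical helper, not an HBase API: builds the space-separated
// "family:qualifier" string expected by TableInputFormat.SCAN_COLUMNS.
def scanColumns(family: String, qualifiers: Seq[String]): String =
  qualifiers.map(q => s"$family:$q").mkString(" ")

val columns = scanColumns("cf1", Seq("cq1", "cq2"))
println(columns) // cf1:cq1 cf1:cq2

// In the job setup the result would then be passed as:
//   config.set(TableInputFormat.SCAN_COLUMNS, columns)
```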
Since newAPIHadoopRDD already gives you a Result for each row, you can
also look up individual columns by name in your conversion function, like:
r.getValue(<column family as bytes>, <column qualifier as bytes>) // r is the Result
This returns the value as bytes, or null if the column does not exist.
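Putting that together, here is a minimal sketch for spark-shell. It assumes
a table called mytable with a family cf1 and qualifiers cq1 and cq2 (all
placeholder names), plus the sc that spark-shell provides. I have not run
it against your cluster, so treat it as a starting point.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "mytable")          // assumed table name
conf.set(TableInputFormat.SCAN_COLUMNS, "cf1:cq1 cf1:cq2") // assumed columns

// `sc` is the SparkContext created by spark-shell.
val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result])

val parsed = hBaseRDD.map { case (_, result) =>
  // Everything is built inside the closure so nothing non-serializable
  // from the driver is captured.
  val family = Bytes.toBytes("cf1")
  // getValue returns null for a missing column; Option turns that into None.
  def column(qualifier: String): Option[String] =
    Option(result.getValue(family, Bytes.toBytes(qualifier)))
      .map(bytes => Bytes.toString(bytes))

  (Bytes.toString(result.getRow), column("cq1"), column("cq2"))
}
```

Because each column is fetched by name, the tuple positions no longer
depend on the order in which the cells come back, and a missing column
shows up as None instead of shifting the remaining values.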
Thanks,
K