Posted to user@hbase.apache.org by mi...@gmail.com on 2008/11/11 10:55:10 UTC

Question on Multimap

Hi, all

Suppose there is a "People" table whose column family "Personal Details"
contains name, address, email, etc. For instance, the "Personal Details:name"
column contains the value "Michael". Now I have to support multiple names for
the same person: I want the "Personal Details:name" column to contain a
list of values such as "Michael", "Mike", "Misha", etc. I would rather not
use timestamps/versions for that, and I would not like to implement the list
as a single string with delimiters (e.g. "Michael,Mike,Misha") either.

How would you suggest implementing that?

Thank you for your cooperation,
M.

Re: Question on Multimap

Posted by Michael Dagaev <mi...@gmail.com>.
Jonathan

    Let's consider the "Person" example. There is a table "Person" with a
"PersonalDetails" column family.
The column family contains many different key-value pairs (name,
address, email, etc.), and I do not know the keys up front.
Each table row contains data about a single person and is keyed by
person ID.
We store and fetch the "PersonalDetails" data from the table by person
ID as Map<String, String>.

   So far so good.

   Now I have discovered that the "PersonalDetails" data is really
Map<String, List<String>> rather than Map<String, String>,
i.e. there can be several values for each key.

   I would not like to use versions for that. I am thinking about a
separate table "PersonalDetails" with row key = person ID + key + md5
of value. To read all values of a given key, I would
use the Scanner API.
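
[Editor's sketch] The row-key idea above can be tried out without a cluster. The Java below is only an in-memory illustration under stated assumptions: a TreeMap stands in for the separate table (HBase also keeps rows sorted by row key), a NUL character is assumed as a separator that never occurs in person IDs or keys, and the class and method names (MultimapRowKeys, rowKey, add, values) are hypothetical. The range lookup plays the role of a scan bounded by a start row and a stop row; it is not the HBase Scanner API itself.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory sketch only: the sorted map stands in for a table sorted by row key.
final class MultimapRowKeys {
    // Assumed separator between row-key parts; must not occur in IDs or keys.
    static final char SEP = '\u0000';

    // Hex-encoded MD5 of the value, used as the last row-key component.
    static String md5Hex(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Row key = person ID + key + md5(value), joined by the separator.
    static String rowKey(String personId, String key, String value) {
        return personId + SEP + key + SEP + md5Hex(value);
    }

    // Storing the same value twice hits the same row key, so it is idempotent.
    static void add(SortedMap<String, String> table, String personId, String key, String value) {
        table.put(rowKey(personId, key, value), value);
    }

    // Range lookup over all row keys that share the "personId + key" prefix.
    static List<String> values(SortedMap<String, String> table, String personId, String key) {
        String start = personId + SEP + key + SEP;
        String stop = personId + SEP + key + (char) (SEP + 1); // first key past the prefix
        return new ArrayList<>(table.subMap(start, stop).values());
    }

    public static void main(String[] args) {
        SortedMap<String, String> table = new TreeMap<>();
        add(table, "42", "name", "Michael");
        add(table, "42", "name", "Mike");
        add(table, "42", "name", "Misha");
        add(table, "42", "email", "m@example.org");
        // Values come back ordered by their MD5 hex, not by insertion order.
        System.out.println(values(table, "42", "name"));
    }
}
```

One consequence of this layout is that values come back in hash order, and deleting a value only needs the value itself, since the row key can be recomputed from it.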

Thank you for your cooperation,
M.


On Tue, Nov 11, 2008 at 8:18 PM, Jonathan Gray <jl...@streamy.com> wrote:
> Michael,
>
> Can you be more specific with what you would like to have?
>
> In what way would you like to be able to store the results, and then fetch
> them?
>
> Would these primitives suffice?
>
> add_name(user, "New Name")
> del_name(user, "Name to Del")
> get_names(user) -> ["Name1", "Name2", ...]
>
> Seems you could do that with cell versions, unless you need versions for
> something else.
>
> Another option would be to create a separate family for names, but that would
> be too much if you need to support multiple values for many fields.
>
> I will think more about this... Seems there should be an easy enough
> solution.
>
> JG

RE: Question on Multimap

Posted by Jonathan Gray <jl...@streamy.com>.
Michael,

Can you be more specific with what you would like to have?

In what way would you like to be able to store the results, and then fetch
them?

Would these primitives suffice?

add_name(user, "New Name")
del_name(user, "Name to Del")
get_names(user) -> ["Name1", "Name2", ...]

Seems you could do that with cell versions, unless you need versions for
something else.
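
[Editor's sketch] To make the three primitives and the cell-versions idea concrete, here is a minimal in-memory model in Java. It is a sketch under assumptions, not HBase client code: each user's "name" cell is modelled as a map from timestamp to value, a counter stands in for real timestamps, and the class name VersionedNames is hypothetical. Bear in mind that HBase keeps only a configurable maximum number of versions per column family, which this model does not enforce.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model of one versioned "name" cell per user: timestamp -> value.
final class VersionedNames {
    private final Map<String, TreeMap<Long, String>> cells = new HashMap<>();
    private long clock = 0; // stand-in for write timestamps

    // add_name: write the name as a new version of the user's cell.
    void addName(String user, String name) {
        cells.computeIfAbsent(user, u -> new TreeMap<>()).put(++clock, name);
    }

    // del_name: drop every version whose value equals the given name.
    void delName(String user, String name) {
        TreeMap<Long, String> versions = cells.get(user);
        if (versions != null) {
            versions.values().removeIf(name::equals);
        }
    }

    // get_names: all stored versions, newest first.
    List<String> getNames(String user) {
        TreeMap<Long, String> versions = cells.get(user);
        if (versions == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(versions.descendingMap().values());
    }

    public static void main(String[] args) {
        VersionedNames store = new VersionedNames();
        store.addName("user1", "Michael");
        store.addName("user1", "Mike");
        store.addName("user1", "Misha");
        store.delName("user1", "Mike");
        System.out.println(store.getNames("user1"));
    }
}
```

The model also shows why versions are an awkward fit for a multi-valued field: the timestamp dimension is used up, so it is no longer available for tracking the history of a value.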

Another option would be to create a separate family for names, but that would
be too much if you need to support multiple values for many fields.

I will think more about this... Seems there should be an easy enough
solution.

JG 

-----Original Message-----
From: rathi [mailto:riteshrathi@gmail.com] 
Sent: Tuesday, November 11, 2008 2:22 AM
To: hbase-user@hadoop.apache.org
Subject: Re: Question on Multimap

You can use the label support in HBase, where you can add data like
name:label1, name:label2, etc. A label can be added to any column family at
any time while adding data, without declaring it in advance. In your case it
can be name:first, name:second, etc.

Hope it helps.

rathi


Re: Question on Multimap

Posted by rathi <ri...@gmail.com>.
You can use the label support in HBase, where you can add data like
name:label1, name:label2, etc. A label can be added to any column family at
any time while adding data, without declaring it in advance. In your case it
can be name:first, name:second, etc.
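
[Editor's sketch] The label approach can be pictured as one row whose columns are sorted by "family:label". The Java below is an in-memory illustration only: a TreeMap stands in for a single row, the family name "name" follows the example above, and the class and method names (LabelColumns, put, all) are hypothetical. Reading every name is then a range lookup over the columns that start with "name:".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory model of a single row: "family:label" -> value, sorted by column name.
final class LabelColumns {
    static final String FAMILY = "name"; // family taken from the example above

    // Store one value under its own label, e.g. name:first or name:second.
    static void put(SortedMap<String, String> row, String label, String value) {
        row.put(FAMILY + ":" + label, value);
    }

    // Collect the values of every column in the family, ordered by label.
    static List<String> all(SortedMap<String, String> row) {
        // ';' is the character right after ':', so this range covers exactly "name:*".
        return new ArrayList<>(row.subMap(FAMILY + ":", FAMILY + ";").values());
    }

    public static void main(String[] args) {
        SortedMap<String, String> row = new TreeMap<>();
        put(row, "first", "Michael");
        put(row, "second", "Mike");
        put(row, "third", "Misha");
        row.put("address:home", "unknown"); // a column from another family
        System.out.println(all(row));
    }
}
```

The family itself has to exist in the table schema, while labels are free-form, so the open question with this layout is how to pick a label for each new value.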

Hope it helps.

rathi

On Tue, Nov 11, 2008 at 3:25 PM, <mi...@gmail.com> wrote:

> Hi, all
>
> Suppose there is a "People" table whose column family "Personal Details"
> contains name, address, email, etc. For instance, the "Personal Details:name"
> column contains the value "Michael". Now I have to support multiple names for
> the same person: I want the "Personal Details:name" column to contain a
> list of values such as "Michael", "Mike", "Misha", etc. I would rather not
> use timestamps/versions for that, and I would not like to implement the list
> as a single string with delimiters (e.g. "Michael,Mike,Misha") either.
>
> How would you suggest implementing that?
>
> Thank you for your cooperation,
> M.
>