Posted to user@hbase.apache.org by Ramasubramanian Narayanan <ra...@gmail.com> on 2012/12/24 11:11:03 UTC

Regarding Rowkey and Column Family

Hi,

*Table Name : Customer*

*Field Name          Column Family*
Customer Number      CF1
DOB                  CF1
FName                CF1
MName                CF1
LName                CF1
Address Type         CF2
Address Line1        CF2
Address Line2        CF2
Address Line3        CF2
Address Line4        CF2
State                CF2
City                 CF2
Country              CF2

Is it good to have rowkey as follows for the same table?

Rowkey Design:
--------------
For CF1 : Customer Number + YYYYMMDD (business date)
For CF2 : Customer Number + Address Type

Note :
Address Type can be any of HOME/OFFICE/OTHERS
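
To make the two proposed layouts concrete, here is a minimal sketch that builds both composite keys as byte arrays, which is the form an HBase row key takes. The "|" separator, the class name and the sample date are assumptions made for illustration only. Since a row is identified by its key, the two layouts produce two separate rows for the same customer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Builds composite row keys as byte arrays (hypothetical helper, not an HBase class). */
final class RowKeySketch {

    /** Joins the key parts with an assumed "|" separator and encodes them as UTF-8. */
    static byte[] compositeKey(String... parts) {
        return String.join("|", parts).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Layout proposed for CF1: customer number + business date (sample date is made up).
        byte[] byDate = compositeKey("10000000001", "20121224");
        // Layout proposed for CF2: customer number + address type.
        byte[] byAddressType = compositeKey("10000000001", "HOME");

        // The byte arrays differ, so these keys address two different rows.
        System.out.println(Arrays.equals(byDate, byAddressType));
        System.out.println(new String(byAddressType, StandardCharsets.UTF_8));
    }
}
```

Both keys share the customer number as a prefix, so they sort next to each other, but they remain distinct rows.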

regards,
Rams

Re: Regarding Rowkey and Column Family

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Rams,

I don't create the JSON in the same place where I store it in HBase, so
the code I can provide will not really be helpful.

JSON gives you a string.

So a put will "simply" look like this:
	// key is the row key as a byte[]; CF_DUMMY and C_DUMMY are the column
	// family and column qualifier, also as byte[].
	Put entry = new Put(key);
	entry.add(CF_DUMMY, C_DUMMY, Bytes.toBytes(yourJsonString));
	// put_entry is a list of Put objects that I give to my HTable later.
	put_entry.add(entry);
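
The round trip behind that snippet can be tried without an HBase cluster. The sketch below uses only the JDK: it encodes a JSON string as UTF-8 bytes (which, as far as I know, is what Bytes.toBytes(String) does), queues the bytes in a list standing in for the list of puts, and decodes them again. The class and method names are hypothetical, and the JSON literal is a shortened sample.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Round-trips a JSON string through the byte[] form that a cell value takes. */
final class JsonCellSketch {

    /** Encodes a JSON string as UTF-8 bytes, ready to be used as a cell value. */
    static byte[] toCellValue(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes a cell value back into the JSON string. */
    static String fromCellValue(byte[] value) {
        return new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String json = "{\"AddressType\":\"Home\",\"City\":\"1.1.City\"}";

        // Stand-in for the list of Put objects that is handed to the table later.
        List<byte[]> pendingValues = new ArrayList<>();
        pendingValues.add(toCellValue(json));

        // Reading back yields the same JSON text, ready for any JSON parser.
        System.out.println(fromCellValue(pendingValues.get(0)).equals(json));
    }
}
```

Parsing the decoded string into objects is left to whichever JSON library is already in use.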

JM

2012/12/26, Ramasubramanian Narayanan <ra...@gmail.com>:
> Hi,
>
> It would be helpful if you could send a sample program or script.
>
> regards,
> Rams
>
> On Wed, Dec 26, 2012 at 9:09 PM, Mohammad Tariq <do...@gmail.com> wrote:
>
>> I would rather serialize the JSON object into a byte array and then store
>> it into an HBase cell. Later whenever I need to pull out some value, I can
>> deserialize it and get the result.
>>
>> If you know the column name in advance, you can use the QualifierFilter to
>> get the rows.
>>
>> Best Regards,
>> Tariq
>> +91-9741563634
>> https://mtariq.jux.com/
>>
>>
>> On Wed, Dec 26, 2012 at 8:35 PM, Ramasubramanian Narayanan <
>> ramasubramanian.narayanan@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > Thanks a lot. Could you please share sample code showing how to insert
>> > and read a JSON object in HBase?
>> >
>> > Also, how do I select a particular row based on a column (inserted
>> > through the JSON object)?
>> >
>> > Below is the script we put together after some googling. Please help us
>> > use it from the HBase shell and also from Java.
>> >
>> >
>> > -----------------------------
>> > {
>> >   "Customer": {
>> >     "Customer Detail": [
>> >       {
>> >         "CustomerNumber": "10000000001",
>> >         "DOB": "01/01/01",
>> >         "Fname": "Fname1",
>> >         "Mname": "Mname1",
>> >         "Lname": "Lname1",
>> >         "address": {
>> >           "AddressType": "Home",
>> >           "AddressLine1": "1.1.Address Line1",
>> >           "AddressLine2": "1.1.Address Line2",
>> >           "AddressLine3": "1.1.Address Line3",
>> >           "AddressLine4": "1.1.Address Line4",
>> >           "State": "1.1.State",
>> >           "City": "1.1.City",
>> >           "Country": "1.1.Country"
>> >         }
>> >       },
>> >       {
>> >         "CustomerNumber": "10000000002",
>> >         "DOB": "01/02/01",
>> >         "Fname": "Fname2",
>> >         "Mname": "Mname2",
>> >         "Lname": "Lname2",
>> >         "address": [
>> >           {
>> >             "AddressType": "Home",
>> >             "AddressLine1": "2.1.Address Line1",
>> >             "AddressLine2": "2.1.Address Line2",
>> >             "AddressLine3": "2.1.Address Line3",
>> >             "AddressLine4": "2.1.Address Line4",
>> >             "State": "2.1.State",
>> >             "City": "2.1.City",
>> >             "Country": "2.1.Country"
>> >           },
>> >           {
>> >             "AddressType": "Office",
>> >             "AddressLine1": "2.2.Address Line1",
>> >             "AddressLine2": "2.2.Address Line2",
>> >             "AddressLine3": "2.2.Address Line3",
>> >             "AddressLine4": "2.2.Address Line4",
>> >             "State": "2.2.State",
>> >             "City": "2.2.City",
>> >             "Country": "2.2.Country"
>> >           }
>> >         ]
>> >       },
>> >       {
>> >         "CustomerNumber": "10000000003",
>> >         "DOB": "01/03/01",
>> >         "Fname": "Fname3",
>> >         "Mname": "Mname3",
>> >         "Lname": "Lname3",
>> >         "address": [
>> >           {
>> >             "AddressType": "Home",
>> >             "AddressLine1": "3.1.Address Line1",
>> >             "AddressLine2": "3.1.Address Line2",
>> >             "AddressLine3": "3.1.Address Line3",
>> >             "AddressLine4": "3.1.Address Line4",
>> >             "State": "3.1.State",
>> >             "City": "3.1.City",
>> >             "Country": "3.1.Country"
>> >           },
>> >           {
>> >             "AddressType": "Office",
>> >             "AddressLine1": "3.2.Address Line1",
>> >             "AddressLine2": "3.2.Address Line2",
>> >             "AddressLine3": "3.2.Address Line3",
>> >             "AddressLine4": "3.2.Address Line4",
>> >             "State": "3.2.State",
>> >             "City": "3.2.City",
>> >             "Country": "3.2.Country"
>> >           },
>> >           {
>> >             "AddressType": "Others",
>> >             "AddressLine1": "3.3.Address Line1",
>> >             "AddressLine2": "3.3.Address Line2",
>> >             "AddressLine3": "3.3.Address Line3",
>> >             "AddressLine4": "3.3.Address Line4",
>> >             "State": "3.3.State",
>> >             "City": "3.3.City",
>> >             "Country": "3.3.Country"
>> >           }
>> >         ]
>> >       },
>> >       {
>> >         "CustomerNumber": "10000000004",
>> >         "DOB": "01/04/01",
>> >         "Fname": "Fname4",
>> >         "Mname": "Mname4",
>> >         "Lname": "Lname4",
>> >         "address": [
>> >           {
>> >             "AddressType": "Home",
>> >             "AddressLine1": "4.1.Address Line1",
>> >             "AddressLine2": "4.1.Address Line2",
>> >             "AddressLine3": "4.1.Address Line3",
>> >             "AddressLine4": "4.1.Address Line4",
>> >             "State": "4.1.State",
>> >             "City": "4.1.City",
>> >             "Country": "4.1.Country"
>> >           },
>> >           {
>> >             "AddressType": "Office",
>> >             "AddressLine1": "4.2.Address Line1",
>> >             "AddressLine2": "4.2.Address Line2",
>> >             "AddressLine3": "4.2.Address Line3",
>> >             "AddressLine4": "4.2.Address Line4",
>> >             "State": "4.2.State",
>> >             "City": "4.2.City",
>> >             "Country": "4.2.Country"
>> >           },
>> >           {
>> >             "AddressType": "Office2",
>> >             "AddressLine1": "4.3.Address Line1",
>> >             "AddressLine2": "4.3.Address Line2",
>> >             "AddressLine3": "4.3.Address Line3",
>> >             "AddressLine4": "4.3.Address Line4",
>> >             "State": "4.3.State",
>> >             "City": "4.3.City",
>> >             "Country": "4.3.Country"
>> >           },
>> >           {
>> >             "AddressType": "Others",
>> >             "AddressLine1": "4.4.Address Line1",
>> >             "AddressLine2": "4.4.Address Line2",
>> >             "AddressLine3": "4.4.Address Line3",
>> >             "AddressLine4": "4.4.Address Line4",
>> >             "State": "4.4.State",
>> >             "City": "4.4.City",
>> >             "Country": "4.4.Country"
>> >           }
>> >         ]
>> >       }
>> >     ]
>> >   }
>> > }
>> >
>> > --------------------------------------------------------------
>> >
>> > regards,
>> > Rams
>> >
>> > On Mon, Dec 24, 2012 at 9:15 PM, Jean-Marc Spaggiari <
>> > jean-marc@spaggiari.org> wrote:
>> >
>> > > Hi Rams,
>> > >
>> > > Even if a customer can have multiple addresses, you can still simply
>> > > put them all in the same field.
>> > >
>> > > An ArrayList of addresses, converted to a JSON string and stored in a
>> > > single HBase cell, will still do it.
>> > >
>> > > You can keep them in separate cells if you think you will access them
>> > > separately. You can also use a different column identifier for each
>> > > type of address.
>> > >
>> > > For example, you could have CF1 for all your fields, with C=Infos for
>> > > the customer info, C=PHY for the physical address, C=HOM for the home
>> > > address, C=OFF for the office address, and so on.
>> > >
>> > > The idea is to drop any CFs that are not required, and to really think
>> > > about the way you access your data.
>> > >
>> > > If you access all the addresses at the same time, then simply put all
>> > > of them in the same cell, as an array of addresses converted to a JSON
>> > > string. So simple ;)
>> > >
>> > > JM
>> > >
>> > > 2012/12/24, Ramasubramanian <ra...@gmail.com>:
>> > > > Hi,
>> > > >
>> > > > Let me explain the scenario.
>> > > >
>> > > > For the customer's address we have designed 3 tables (in the
>> > > > relational way):
>> > > >
>> > > > 1. Address link table
>> > > >     Will have key columns like:
>> > > >       - Address type (physical, or email/fax/phone/URL, etc.)
>> > > >       - Address category (home/work)
>> > > >       - Primary address indicator
>> > > >       - Bad address indicator
>> > > >       - etc.
>> > > > 2. Physical address
>> > > >     This will contain the actual physical address. A customer can
>> > > >     have any number of addresses.
>> > > >     Fields:
>> > > >       - address type (physical)
>> > > >       - address category (home/work, etc.)
>> > > >       - address 1
>> > > >       - address 2
>> > > >       .........
>> > > > 3. Electronic address
>> > > >     This will contain the email/fax/phone/URL, etc., and its value.
>> > > >     Fields:
>> > > >       - address type (email/fax/phone/URL, etc.)
>> > > >       - address category (home/work, etc.)
>> > > >       - value (the actual value for the address type, such as the
>> > > >         actual phone number)
>> > > >
>> > > >
>> > > > Now, in the above scenario, while designing in HBase, I am going to
>> > > > eliminate the link table and have those fields in both the physical
>> > > > and electronic address.
>> > > >
>> > > > So both tables have common fields like address type and address
>> > > > category. Hence I thought of having these two fields common to both
>> > > > sets of fields (in a single table).
>> > > >
>> > > > Regards,
>> > > > Rams
>> > > >
>> > > > On 24-Dec-2012, at 6:45 PM, Mohammad Tariq <do...@gmail.com> wrote:
>> > > >
>> > > >> It is, but why do you want to do that? You will run into issues once
>> > > >> your data starts growing. Each cell, along with the actual value,
>> > > >> stores a few additional things: the *row*, *column* and *version*.
>> > > >> As a result you will lose space if you do that.
>> > > >>
>> > > >> Best Regards,
>> > > >> Tariq
>> > > >> +91-9741563634
>> > > >> https://mtariq.jux.com/
>> > > >>
>> > > >>
>> > > >> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
>> > > >> ramasubramanian.narayanan@gmail.com> wrote:
>> > > >>
>> > > >>> Hi,
>> > > >>>
>> > > >>> Is it OK to have the same column in different column families?
>> > > >>>
>> > > >>> regards,
>> > > >>> Rams
>> > > >>>
>> > > >>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <dontariq@gmail.com>
>> > > >>> wrote:
>> > > >>>
>> > > >>>> You are creating 2 different rows here. A CF describes how columns
>> > > >>>> are grouped together as a single entity, which is represented by
>> > > >>>> that CF. But here you are creating 2 different rows having one CF
>> > > >>>> each, CF1 and CF2 respectively. If you want to have 1 row with 2
>> > > >>>> CFs, you have to use the same rowkey for both CFs.
>> > > >>>>
>> > > >>>>
>> > > >>>>
>> > > >>>> Best Regards,
>> > > >>>> Tariq
>> > > >>>> +91-9741563634
>> > > >>>> https://mtariq.jux.com/
>> > > >>>>
>> > > >>>>
>> > > >>>> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>> > > >>>> ramasubramanian.narayanan@gmail.com> wrote:
>> > > >>>>
>> > > >>>>> Hi,
>> > > >>>>>
>> > > >>>>> *Table Name : Customer*
>> > > >>>>>
>> > > >>>>> *Field Name          Column Family*
>> > > >>>>> Customer Number      CF1
>> > > >>>>> DOB                  CF1
>> > > >>>>> FName                CF1
>> > > >>>>> MName                CF1
>> > > >>>>> LName                CF1
>> > > >>>>> Address Type         CF2
>> > > >>>>> Address Line1        CF2
>> > > >>>>> Address Line2        CF2
>> > > >>>>> Address Line3        CF2
>> > > >>>>> Address Line4        CF2
>> > > >>>>> State                CF2
>> > > >>>>> City                 CF2
>> > > >>>>> Country              CF2
>> > > >>>>>
>> > > >>>>> Is it good to have rowkey as follows for the same table?
>> > > >>>>>
>> > > >>>>> Rowkey Design:
>> > > >>>>> --------------
>> > > >>>>> For CF1 : Customer Number + YYYYMMDD (business date)
>> > > >>>>> For CF2 : Customer Number + Address Type
>> > > >>>>>
>> > > >>>>> Note :
>> > > >>>>> Address Type can be any of HOME/OFFICE/OTHERS
>> > > >>>>>
>> > > >>>>> regards,
>> > > >>>>> Rams

Re: Regarding Rowkey and Column Family

Posted by Ramasubramanian Narayanan <ra...@gmail.com>.
Hi,

It would be helpful if you could send a sample program or script.

regards,
Rams

Re: Regarding Rowkey and Column Family

Posted by Mohammad Tariq <do...@gmail.com>.
I would rather serialize the JSON object into a byte array and then store
it into an HBase cell. Later whenever I need to pull out some value, I can
deserialize it and get the result.

If you know the column name in advance, you can use the QualifierFilter to
get the rows.
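
As a rough illustration of what matching on a known column name means, here is a JDK-only sketch that models a table as row key -> (column qualifier -> value) and returns the rows that contain a given qualifier. It is only an in-memory model of the effect, not the HBase QualifierFilter API; the class name, method name and qualifier names (INFO, HOME, OFFICE) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory model of rows (row key -> qualifier -> value) with a qualifier match. */
final class QualifierMatchSketch {

    /** Returns the row keys, in sorted order, whose row contains the given qualifier. */
    static List<String> rowsWithQualifier(Map<String, Map<String, String>> table, String qualifier) {
        List<String> matches = new ArrayList<>();
        // Copy into a TreeMap so rows are visited in row key order.
        for (Map.Entry<String, Map<String, String>> row : new TreeMap<>(table).entrySet()) {
            if (row.getValue().containsKey(qualifier)) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = new TreeMap<>();
        // Cell values would be serialized JSON; short placeholders keep the sketch readable.
        table.put("10000000001", Map.of("INFO", "info-json", "HOME", "home-json"));
        table.put("10000000002", Map.of("INFO", "info-json", "HOME", "home-json", "OFFICE", "office-json"));

        // Only the second customer has an OFFICE column.
        System.out.println(rowsWithQualifier(table, "OFFICE"));
    }
}
```

Against a real table the same selection would be expressed as a scan with a filter on the qualifier, so the matching happens on the server side.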

Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/


Re: Regarding Rowkey and Column Family

Posted by Ramasubramanian Narayanan <ra...@gmail.com>.
Hi,

Thanks a lot. Could you please help me with sample code showing how to
insert and read a JSON object in HBase?

Also, how do I select a particular row based on a column (inserted
through the JSON object)?

Below is the script we put together after some googling. Please help us
use it from the HBase shell and also from Java.


-----------------------------
{
  "Customer": {
    "Customer Detail": [
      {
        "CustomerNumber": "10000000001",
        "DOB": "01/01/01",
        "Fname": "Fname1",
        "Mname": "Mname1",
        "Lname": "Lname1",
        "address": {
          "AddressType": "Home",
          "AddressLine1": "1.1.Address Line1",
          "AddressLine2": "1.1.Address Line2",
          "AddressLine3": "1.1.Address Line3",
          "AddressLine4": "1.1.Address Line4",
          "State": "1.1.State",
          "City": "1.1.City",
          "Country": "1.1.Country"
        }
      },
      {
        "CustomerNumber": "10000000002",
        "DOB": "01/02/01",
        "Fname": "Fname2",
        "Mname": "Mname2",
        "Lname": "Lname2",
        "address": [
          {
            "AddressType": "Home",
            "AddressLine1": "2.1.Address Line1",
            "AddressLine2": "2.1.Address Line2",
            "AddressLine3": "2.1.Address Line3",
            "AddressLine4": "2.1.Address Line4",
            "State": "2.1.State",
            "City": "2.1.City",
            "Country": "2.1.Country"
          },
          {
            "AddressType": "Office",
            "AddressLine1": "2.2.Address Line1",
            "AddressLine2": "2.2.Address Line2",
            "AddressLine3": "2.2.Address Line3",
            "AddressLine4": "2.2.Address Line4",
            "State": "2.2.State",
            "City": "2.2.City",
            "Country": "2.2.Country"
          }
        ]
      },
      {
        "CustomerNumber": "10000000003",
        "DOB": "01/03/01",
        "Fname": "Fname3",
        "Mname": "Mname3",
        "Lname": "Lname3",
        "address": [
          {
            "AddressType": "Home",
            "AddressLine1": "3.1.Address Line1",
            "AddressLine2": "3.1.Address Line2",
            "AddressLine3": "3.1.Address Line3",
            "AddressLine4": "3.1.Address Line4",
            "State": "3.1.State",
            "City": "3.1.City",
            "Country": "3.1.Country"
          },
          {
            "AddressType": "Office",
            "AddressLine1": "3.2.Address Line1",
            "AddressLine2": "3.2.Address Line2",
            "AddressLine3": "3.2.Address Line3",
            "AddressLine4": "3.2.Address Line4",
            "State": "3.2.State",
            "City": "3.2.City",
            "Country": "3.2.Country"
          },
          {
            "AddressType": "Others",
            "AddressLine1": "3.3.Address Line1",
            "AddressLine2": "3.3.Address Line2",
            "AddressLine3": "3.3.Address Line3",
            "AddressLine4": "3.3.Address Line4",
            "State": "3.3.State",
            "City": "3.3.City",
            "Country": "3.3.Country"
          }
        ]
      },
      {
        "CustomerNumber": "10000000004",
        "DOB": "01/04/01",
        "Fname": "Fname4",
        "Mname": "Mname4",
        "Lname": "Lname4",
        "address": [
          {
            "AddressType": "Home",
            "AddressLine1": "4.1.Address Line1",
            "AddressLine2": "4.1.Address Line2",
            "AddressLine3": "4.1.Address Line3",
            "AddressLine4": "4.1.Address Line4",
            "State": "4.1.State",
            "City": "4.1.City",
            "Country": "4.1.Country"
          },
          {
            "AddressType": "Office",
            "AddressLine1": "4.2.Address Line1",
            "AddressLine2": "4.2.Address Line2",
            "AddressLine3": "4.2.Address Line3",
            "AddressLine4": "4.2.Address Line4",
            "State": "4.2.State",
            "City": "4.2.City",
            "Country": "4.2.Country"
          },
          {
            "AddressType": "Office2",
            "AddressLine1": "4.3.Address Line1",
            "AddressLine2": "4.3.Address Line2",
            "AddressLine3": "4.3.Address Line3",
            "AddressLine4": "4.3.Address Line4",
            "State": "4.3.State",
            "City": "4.3.City",
            "Country": "4.3.Country"
          },
          {
            "AddressType": "Others",
            "AddressLine1": "4.4.Address Line1",
            "AddressLine2": "4.4.Address Line2",
            "AddressLine3": "4.4.Address Line3",
            "AddressLine4": "4.4.Address Line4",
            "State": "4.4.State",
            "City": "4.4.City",
            "Country": "4.4.Country"
          }
        ]
      }
    ]
  }
}

--------------------------------------------------------------

regards,
Rams

On Mon, Dec 24, 2012 at 9:15 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Rams,
>
> Even if a customer can have multiple addresses, you can still simply
> put them all on the same field...
>
> A ArrayList of address, converted in a JSon sting, in a single HBase
> cell will still do it.
>
> You can have them on separated cells if you think you will access them
> separatly. You can also have different columns identifiers for each
> type of address you can have.
>
> Like you have CF1 for all you fields, C=Infos for the customer info,
> C=PHY for Physical address, C=HOM for home address, C=OFF for office
> address, and so on?
>
> The idea is to reduce the CFs if not required, and really think about
> the way you access your data.
>
> If you access all the address at the same time, then simply put all of
> them on the same cell, on a Array of Address converted in String with
> JSon. So simple ;)
>
> JM
>
> 2012/12/24, Ramasubramanian <ra...@gmail.com>:
> > Hi,
> >
> > Let me explain the scenario.
> >
> > For address of the customer we have designed 3 tables (in relational way)
> >
> > 1. Address link table
> >     Will have key columns like
> >       Address type - physical or email /fax/phone/URL/ etc.,
> >       Address category- (home/work)
> >       Primary address indicator
> >       Bad address indicator
> >       Etc.,
> > 2. Physical address
> >     This will contain the actual physical address. A customer can have n
> > Number of addresses.
> >    Fields :
> >        - address type (physical)
> >        - address category (home/work/etc.,)
> >         - address1
> >         - address 2
> >         .........
> > 3. Electronic address
> >     It will contain email/fax/phone/URL etc, and it's value
> >      Fields :
> >        - address type (email /fax/phone/URL/ etc.,)
> >        - address category (home/work/etc.,)
> >        - value ( actual value based on address type. Like actual phone
> > number)
> >
> >
> > Now in the above scenario, while designing in hbase, I am going to
> eliminate
> > link table and have those fields in both physical and electronic address.
> >
> > So both the tables has common fields like address type and address
> category.
> > Hence thought of having these two fields common for both the set of
> fields.
> > (In a single table)
> >
> > Regards,
> > Rams
> >
> > On 24-Dec-2012, at 6:45 PM, Mohammad Tariq <do...@gmail.com> wrote:
> >
> >> it is. but why do you want  to do that? you will run into issues once
> >> your
> >> data starts growing. each cell, along with the actual value stores few
> >> additional things, *row, column *and the *version. *as a result you will
> >> loose space if you do that.
> >>
> >> Best Regards,
> >> Tariq
> >> +91-9741563634
> >> https://mtariq.jux.com/
> >>
> >>
> >> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
> >> ramasubramanian.narayanan@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> Is it ok to have same column into different column familes?
> >>>
> >>> regards,
> >>> Rams
> >>>
> >>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <do...@gmail.com>
> >>> wrote:
> >>>
> >>>> you are creating 2 different rows here. cf means how column are
> clubbed
> >>>> together as a single entity which is represented by that cf. but here
> >>>> you
> >>>> are creating 2 different rows having one cf each, CF1 and CF2
> >>> respectively.
> >>>> if you want to have 1 row with 2 cf, you have to do use same rowkey
> for
> >>>> both the cf.
> >>>>
> >>>>
> >>>>
> >>>> Best Regards,
> >>>> Tariq
> >>>> +91-9741563634
> >>>> https://mtariq.jux.com/
> >>>>
> >>>>
> >>>> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
> >>>> ramasubramanian.narayanan@gmail.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> *Table Name : Customer*
> >>>>> *
> >>>>> *
> >>>>> *Field Name         Column Family*
> >>>>> Customer Number      CF1
> >>>>> DOB                  CF1
> >>>>> FName                CF1
> >>>>> MName                CF1
> >>>>> LName                CF1
> >>>>> Address Type         CF2
> >>>>> Address Line1        CF2
> >>>>> Address Line2        CF2
> >>>>> Address Line3        CF2
> >>>>> Address Line4        CF2
> >>>>> State                CF2
> >>>>> City                 CF2
> >>>>> Country              CF2
> >>>>>
> >>>>> Is it good to have rowkey as follows for the same table?
> >>>>>
> >>>>> Rowkey Design:
> >>>>> --------------
> >>>>> For CF1 : Customer Number + YYYYMMD (business date)
> >>>>> For CF2 : Customer Number + Address Type
> >>>>>
> >>>>> Note :
> >>>>> Address Type can be any of HOME/OFFICE/OTHERS
> >>>>>
> >>>>> regards,
> >>>>> Rams
> >>>
> >
>

Re: Regarding Rowkey and Column Family

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Rams,

Even if a customer can have multiple addresses, you can still simply
put them all in the same field.

An ArrayList of addresses, converted to a JSON string and stored in a
single HBase cell, will still do it.

You can have them in separate cells if you think you will access them
separately. You can also have a different column identifier for each
type of address.

For example, you have CF1 for all your fields, C=Infos for the customer
info, C=PHY for the physical address, C=HOM for the home address, C=OFF
for the office address, and so on.

The idea is to reduce the number of CFs when they are not required, and
to really think about the way you access your data.

If you access all the addresses at the same time, then simply put all of
them in the same cell, as an array of addresses converted to a string
with JSON. So simple ;)
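
A minimal sketch of the "all addresses in one cell" idea, using only the
Java standard library. The class AddressCell and its methods are
hypothetical names made up for this illustration and are not part of
HBase. The hand-written serializer escapes only backslashes and double
quotes, so a real JSON library would be the safer choice in practice.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of HBase: builds the value for one "address" cell.
class AddressCell {
    // Build an address from alternating field names and values, keeping their order.
    static Map<String, String> address(String... namesAndValues) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            fields.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return fields;
    }

    // Quote a string for JSON; only backslash and double quote are escaped here.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // One address as a JSON object.
    static String objectJson(Map<String, String> address) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> field : address.entrySet()) {
            if (sb.length() > 1) sb.append(',');
            sb.append(quote(field.getKey())).append(':').append(quote(field.getValue()));
        }
        return sb.append('}').toString();
    }

    // All addresses of one customer as a JSON array.
    static String arrayJson(List<Map<String, String>> addresses) {
        StringBuilder sb = new StringBuilder("[");
        for (Map<String, String> address : addresses) {
            if (sb.length() > 1) sb.append(',');
            sb.append(objectJson(address));
        }
        return sb.append(']').toString();
    }

    // The UTF-8 bytes that would be stored as the value of the single cell.
    static byte[] cellValue(List<Map<String, String>> addresses) {
        return arrayJson(addresses).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<Map<String, String>> addresses = Arrays.asList(
            address("AddressType", "Home", "City", "2.1.City"),
            address("AddressType", "Office", "City", "2.2.City"));
        System.out.println(arrayJson(addresses));
    }
}
```

The byte array returned by cellValue is what would be handed to a Put as
the cell value. Reading it back means fetching the one cell and parsing
the whole array, which is why this layout only pays off when the
addresses are always read together.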

JM

2012/12/24, Ramasubramanian <ra...@gmail.com>:
> Hi,
>
> Let me explain the scenario.
>
> For address of the customer we have designed 3 tables (in relational way)
>
> 1. Address link table
>     Will have key columns like
>       Address type - physical or email /fax/phone/URL/ etc.,
>       Address category- (home/work)
>       Primary address indicator
>       Bad address indicator
>       Etc.,
> 2. Physical address
>     This will contain the actual physical address. A customer can have n
> Number of addresses.
>    Fields :
>        - address type (physical)
>        - address category (home/work/etc.,)
>         - address1
>         - address 2
>         .........
> 3. Electronic address
>     It will contain email/fax/phone/URL etc, and it's value
>      Fields :
>        - address type (email /fax/phone/URL/ etc.,)
>        - address category (home/work/etc.,)
>        - value ( actual value based on address type. Like actual phone
> number)
>
>
> Now in the above scenario, while designing in hbase, I am going to eliminate
> link table and have those fields in both physical and electronic address.
>
> So both the tables has common fields like address type and address category.
> Hence thought of having these two fields common for both the set of fields.
> (In a single table)
>
> Regards,
> Rams
>
> On 24-Dec-2012, at 6:45 PM, Mohammad Tariq <do...@gmail.com> wrote:
>
>> it is. but why do you want  to do that? you will run into issues once
>> your
>> data starts growing. each cell, along with the actual value stores few
>> additional things, *row, column *and the *version. *as a result you will
>> loose space if you do that.
>>
>> Best Regards,
>> Tariq
>> +91-9741563634
>> https://mtariq.jux.com/
>>
>>
>> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
>> ramasubramanian.narayanan@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Is it ok to have same column into different column familes?
>>>
>>> regards,
>>> Rams
>>>
>>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <do...@gmail.com>
>>> wrote:
>>>
>>>> you are creating 2 different rows here. cf means how column are clubbed
>>>> together as a single entity which is represented by that cf. but here
>>>> you
>>>> are creating 2 different rows having one cf each, CF1 and CF2
>>> respectively.
>>>> if you want to have 1 row with 2 cf, you have to do use same rowkey for
>>>> both the cf.
>>>>
>>>>
>>>>
>>>> Best Regards,
>>>> Tariq
>>>> +91-9741563634
>>>> https://mtariq.jux.com/
>>>>
>>>>
>>>> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>>>> ramasubramanian.narayanan@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> *Table Name : Customer*
>>>>> *
>>>>> *
>>>>> *Field Name         Column Family*
>>>>> Customer Number      CF1
>>>>> DOB                  CF1
>>>>> FName                CF1
>>>>> MName                CF1
>>>>> LName                CF1
>>>>> Address Type         CF2
>>>>> Address Line1        CF2
>>>>> Address Line2        CF2
>>>>> Address Line3        CF2
>>>>> Address Line4        CF2
>>>>> State                CF2
>>>>> City                 CF2
>>>>> Country              CF2
>>>>>
>>>>> Is it good to have rowkey as follows for the same table?
>>>>>
>>>>> Rowkey Design:
>>>>> --------------
>>>>> For CF1 : Customer Number + YYYYMMD (business date)
>>>>> For CF2 : Customer Number + Address Type
>>>>>
>>>>> Note :
>>>>> Address Type can be any of HOME/OFFICE/OTHERS
>>>>>
>>>>> regards,
>>>>> Rams
>>>
>

Re: Regarding Rowkey and Column Family

Posted by Ramasubramanian <ra...@gmail.com>.
Hi,

Let me explain the scenario.

For the customer's address we have designed 3 tables (in a relational way):

1. Address link table
    Will have key columns like:
      Address type - physical or email/fax/phone/URL etc.
      Address category - (home/work)
      Primary address indicator
      Bad address indicator
      Etc.
2. Physical address
    This will contain the actual physical address. A customer can have
    any number of addresses.
    Fields:
       - address type (physical)
       - address category (home/work/etc.)
       - address1
       - address2
       .........
3. Electronic address
    It will contain email/fax/phone/URL etc. and its value.
    Fields:
       - address type (email/fax/phone/URL etc.)
       - address category (home/work/etc.)
       - value (the actual value based on address type, e.g. the actual
         phone number)


In the above scenario, while designing in HBase, I am going to eliminate
the link table and have those fields in both the physical and electronic
address.

Both tables have common fields like address type and address category, so
I thought of having these two fields in common for both sets of fields
(in a single table).
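
One possible way to carry the shared address type and address category
in a single table is to fold them into the column qualifier. The sketch
below is only an illustration under that assumption: the class
AddressQualifier and the "type:category" naming scheme are hypothetical
and come neither from this thread nor from HBase.

```java
import java.util.Locale;

// Hypothetical naming scheme: "<address type>:<address category>", e.g. "physical:home".
class AddressQualifier {
    // Combine type and category into one lower-case column qualifier.
    static String of(String type, String category) {
        return type.toLowerCase(Locale.ROOT) + ":" + category.toLowerCase(Locale.ROOT);
    }

    // Recover the address type (text before the first ':').
    static String typeOf(String qualifier) {
        return qualifier.substring(0, qualifier.indexOf(':'));
    }

    // Recover the address category (text after the first ':').
    static String categoryOf(String qualifier) {
        return qualifier.substring(qualifier.indexOf(':') + 1);
    }

    public static void main(String[] args) {
        System.out.println(of("Physical", "Home"));
        System.out.println(of("Email", "Work"));
    }
}
```

With such a scheme, physical and electronic addresses would sit side by
side in the same row, and the link table's type and category columns
would no longer need to be stored as separate values.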

Regards,
Rams

On 24-Dec-2012, at 6:45 PM, Mohammad Tariq <do...@gmail.com> wrote:

> it is. but why do you want  to do that? you will run into issues once your
> data starts growing. each cell, along with the actual value stores few
> additional things, *row, column *and the *version. *as a result you will
> loose space if you do that.
> 
> Best Regards,
> Tariq
> +91-9741563634
> https://mtariq.jux.com/
> 
> 
> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
> ramasubramanian.narayanan@gmail.com> wrote:
> 
>> Hi,
>> 
>> Is it ok to have same column into different column familes?
>> 
>> regards,
>> Rams
>> 
>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <do...@gmail.com>
>> wrote:
>> 
>>> you are creating 2 different rows here. cf means how column are clubbed
>>> together as a single entity which is represented by that cf. but here you
>>> are creating 2 different rows having one cf each, CF1 and CF2
>> respectively.
>>> if you want to have 1 row with 2 cf, you have to do use same rowkey for
>>> both the cf.
>>> 
>>> 
>>> 
>>> Best Regards,
>>> Tariq
>>> +91-9741563634
>>> https://mtariq.jux.com/
>>> 
>>> 
>>> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>>> ramasubramanian.narayanan@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> *Table Name : Customer*
>>>> *
>>>> *
>>>> *Field Name         Column Family*
>>>> Customer Number      CF1
>>>> DOB                  CF1
>>>> FName                CF1
>>>> MName                CF1
>>>> LName                CF1
>>>> Address Type         CF2
>>>> Address Line1        CF2
>>>> Address Line2        CF2
>>>> Address Line3        CF2
>>>> Address Line4        CF2
>>>> State                CF2
>>>> City                 CF2
>>>> Country              CF2
>>>> 
>>>> Is it good to have rowkey as follows for the same table?
>>>> 
>>>> Rowkey Design:
>>>> --------------
>>>> For CF1 : Customer Number + YYYYMMD (business date)
>>>> For CF2 : Customer Number + Address Type
>>>> 
>>>> Note :
>>>> Address Type can be any of HOME/OFFICE/OTHERS
>>>> 
>>>> regards,
>>>> Rams
>> 

Re: Regarding Rowkey and Column Family

Posted by Ramasubramanian <ra...@gmail.com>.
Hi,

Thanks for your detailed explanation.

There will be multiple addresses for a single customer. For example, the
same customer can have a home address, an office address, etc., hence I
grouped them into a different column family.

1. Is my approach correct?

2. What can we use as a rowkey for both these column families?

3. I think the customer number is a sequence, hence I am planning to
include YYYYMMDD along with the customer number in the rowkey. Is that fine?

Regards,
Rams

On 24-Dec-2012, at 7:54 PM, Jean-Marc Spaggiari <je...@spaggiari.org> wrote:

> Hi Rams,
> 
> How are you going to access you data?
> 
> HBase will create one cell (Which mean rowkey+timestamp+...+data) for
> eache cell.
> 
> Are you really going to sometime access Address Line1 without
> accessing Address Line2?
> 
> Are you really going to access the City wihtout accessing the State?
> 
> If not, why not just put a JSon object with all this data in a single cell?
> 
> So at the end your table will look llike:
> 
> *Table Name : Customer*
> *
> *
> *Field Name         Column Family*
> Customer Information CF1
> Address CF1
> 
> 
> In Customer Information you bundle:
> Customer Number      CF1
> DOB                  CF1
> FName                CF1
> MName                CF1
> LName                CF1
> 
> And in Address you bundle:
> Address Type         CF2
> Address Line1        CF2
> Address Line2        CF2
> Address Line3        CF2
> Address Line4        CF2
> State                CF2
> City                 CF2
> Country              CF2
> 
> But if you always access the address when you access the customer
> information, then the best way might be to just put all those field in
> a single JSon object, and have just one CF and on C in your table...
> 
> Regarding the key, if you customer number is sequential and you insert
> based on this field, you will hotspot one server at a time... If the
> number is "random", then it's ok.
> 
> HTH.
> 
> JM
> 
> 2012/12/24, Mohammad Tariq <do...@gmail.com>:
>> it is. but why do you want  to do that? you will run into issues once your
>> data starts growing. each cell, along with the actual value stores few
>> additional things, *row, column *and the *version. *as a result you will
>> loose space if you do that.
>> 
>> Best Regards,
>> Tariq
>> +91-9741563634
>> https://mtariq.jux.com/
>> 
>> 
>> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
>> ramasubramanian.narayanan@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> Is it ok to have same column into different column familes?
>>> 
>>> regards,
>>> Rams
>>> 
>>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <do...@gmail.com>
>>> wrote:
>>> 
>>>> you are creating 2 different rows here. cf means how column are clubbed
>>>> together as a single entity which is represented by that cf. but here
>>>> you
>>>> are creating 2 different rows having one cf each, CF1 and CF2
>>> respectively.
>>>> if you want to have 1 row with 2 cf, you have to do use same rowkey for
>>>> both the cf.
>>>> 
>>>> 
>>>> 
>>>> Best Regards,
>>>> Tariq
>>>> +91-9741563634
>>>> https://mtariq.jux.com/
>>>> 
>>>> 
>>>> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>>>> ramasubramanian.narayanan@gmail.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> *Table Name : Customer*
>>>>> *
>>>>> *
>>>>> *Field Name         Column Family*
>>>>> Customer Number      CF1
>>>>> DOB                  CF1
>>>>> FName                CF1
>>>>> MName                CF1
>>>>> LName                CF1
>>>>> Address Type         CF2
>>>>> Address Line1        CF2
>>>>> Address Line2        CF2
>>>>> Address Line3        CF2
>>>>> Address Line4        CF2
>>>>> State                CF2
>>>>> City                 CF2
>>>>> Country              CF2
>>>>> 
>>>>> Is it good to have rowkey as follows for the same table?
>>>>> 
>>>>> Rowkey Design:
>>>>> --------------
>>>>> For CF1 : Customer Number + YYYYMMD (business date)
>>>>> For CF2 : Customer Number + Address Type
>>>>> 
>>>>> Note :
>>>>> Address Type can be any of HOME/OFFICE/OTHERS
>>>>> 
>>>>> regards,
>>>>> Rams
>> 

Re: Regarding Rowkey and Column Family

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Rams,

How are you going to access your data?

HBase will create one cell (which means rowkey+timestamp+...+data) for
each field.

Are you really going to sometimes access Address Line1 without
accessing Address Line2?

Are you really going to access the City without accessing the State?

If not, why not just put a JSON object with all this data in a single cell?

So in the end your table will look like:

*Table Name : Customer*
*
*
*Field Name         Column Family*
Customer Information CF1
Address CF1


In Customer Information you bundle:
Customer Number      CF1
DOB                  CF1
FName                CF1
MName                CF1
LName                CF1

And in Address you bundle:
Address Type         CF2
Address Line1        CF2
Address Line2        CF2
Address Line3        CF2
Address Line4        CF2
State                CF2
City                 CF2
Country              CF2

But if you always access the address when you access the customer
information, then the best way might be to just put all those fields in
a single JSON object, and have just one CF and one C in your table.

Regarding the key, if your customer number is sequential and you insert
based on this field, you will hotspot one server at a time. If the
number is "random", then it's OK.
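
A common way to avoid hotspotting on a sequential key is to prefix it
with a small "salt" derived from the key itself. The sketch below
assumes 16 buckets and a "<bucket>-<customer number>" layout. Both are
arbitrary choices for illustration, and SaltedRowKey is a hypothetical
helper, not an HBase API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an HBase API: spreads sequential customer numbers over buckets.
class SaltedRowKey {
    static final int BUCKETS = 16; // assumed bucket count, to be tuned to the cluster

    // The bucket is derived from the key itself, so a reader can recompute it.
    static int bucket(String customerNumber) {
        return Math.floorMod(customerNumber.hashCode(), BUCKETS);
    }

    // Row key of the form "<2-digit bucket>-<customer number>".
    static String salted(String customerNumber) {
        int b = bucket(customerNumber);
        return (b < 10 ? "0" : "") + b + "-" + customerNumber;
    }

    // Row keys are stored as bytes; UTF-8 is assumed here.
    static byte[] toBytes(String rowKey) {
        return rowKey.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        for (String n : new String[] {"10000000001", "10000000002", "10000000003"}) {
            System.out.println(salted(n));
        }
    }
}
```

Consecutive customer numbers land in different buckets, so inserts are
spread over several regions instead of one. The trade-off is that a
range scan over customer numbers then needs one scan per bucket.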

HTH.

JM

2012/12/24, Mohammad Tariq <do...@gmail.com>:
> it is. but why do you want  to do that? you will run into issues once your
> data starts growing. each cell, along with the actual value stores few
> additional things, *row, column *and the *version. *as a result you will
> loose space if you do that.
>
> Best Regards,
> Tariq
> +91-9741563634
> https://mtariq.jux.com/
>
>
> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
> ramasubramanian.narayanan@gmail.com> wrote:
>
>> Hi,
>>
>> Is it ok to have same column into different column familes?
>>
>> regards,
>> Rams
>>
>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <do...@gmail.com>
>> wrote:
>>
>> > you are creating 2 different rows here. cf means how column are clubbed
>> > together as a single entity which is represented by that cf. but here
>> > you
>> > are creating 2 different rows having one cf each, CF1 and CF2
>> respectively.
>> > if you want to have 1 row with 2 cf, you have to do use same rowkey for
>> > both the cf.
>> >
>> >
>> >
>> > Best Regards,
>> > Tariq
>> > +91-9741563634
>> > https://mtariq.jux.com/
>> >
>> >
>> > On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>> > ramasubramanian.narayanan@gmail.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > *Table Name : Customer*
>> > > *
>> > > *
>> > > *Field Name         Column Family*
>> > > Customer Number      CF1
>> > > DOB                  CF1
>> > > FName                CF1
>> > > MName                CF1
>> > > LName                CF1
>> > > Address Type         CF2
>> > > Address Line1        CF2
>> > > Address Line2        CF2
>> > > Address Line3        CF2
>> > > Address Line4        CF2
>> > > State                CF2
>> > > City                 CF2
>> > > Country              CF2
>> > >
>> > > Is it good to have rowkey as follows for the same table?
>> > >
>> > > Rowkey Design:
>> > > --------------
>> > > For CF1 : Customer Number + YYYYMMD (business date)
>> > > For CF2 : Customer Number + Address Type
>> > >
>> > > Note :
>> > > Address Type can be any of HOME/OFFICE/OTHERS
>> > >
>> > > regards,
>> > > Rams
>> > >
>> >
>>
>

Re: Regarding Rowkey and Column Family

Posted by Mohammad Tariq <do...@gmail.com>.
It is. But why do you want to do that? You will run into issues once your
data starts growing. Each cell, along with the actual value, stores a few
additional things: the *row*, the *column* and the *version*. As a result
you will lose space if you do that.
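
To put a rough number on that overhead, here is a small sketch. It
assumes the classic KeyValue layout as I understand it (two 4-byte
lengths, a 2-byte row length, a 1-byte family length, an 8-byte
timestamp and a 1-byte type) and ignores tags, compression and block
encoding, so treat the result as an estimate only. CellSizeEstimate is
a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

// Rough per-cell size estimate; a sketch under the assumptions stated above.
class CellSizeEstimate {
    // key length + value length + row length + family length + timestamp + type
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Every cell repeats the row key, family and qualifier next to its value.
    static int cellBytes(String row, String family, String qualifier, String value) {
        return FIXED_OVERHEAD + utf8Length(row) + utf8Length(family)
            + utf8Length(qualifier) + utf8Length(value);
    }

    public static void main(String[] args) {
        int once = cellBytes("10000000001", "CF2", "AddressType", "Home");
        System.out.println("stored in one family:  " + once + " bytes");
        System.out.println("stored in two families: " + (2 * once) + " bytes");
    }
}
```

For a 4-byte value such as "Home", most of the cell is key material, so
storing the same column in two families roughly doubles a cost that is
already dominated by overhead.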

Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/


On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
ramasubramanian.narayanan@gmail.com> wrote:

> Hi,
>
> Is it ok to have same column into different column familes?
>
> regards,
> Rams
>
> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <do...@gmail.com>
> wrote:
>
> > you are creating 2 different rows here. cf means how column are clubbed
> > together as a single entity which is represented by that cf. but here you
> > are creating 2 different rows having one cf each, CF1 and CF2
> respectively.
> > if you want to have 1 row with 2 cf, you have to do use same rowkey for
> > both the cf.
> >
> >
> >
> > Best Regards,
> > Tariq
> > +91-9741563634
> > https://mtariq.jux.com/
> >
> >
> > On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
> > ramasubramanian.narayanan@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > *Table Name : Customer*
> > > *
> > > *
> > > *Field Name         Column Family*
> > > Customer Number      CF1
> > > DOB                  CF1
> > > FName                CF1
> > > MName                CF1
> > > LName                CF1
> > > Address Type         CF2
> > > Address Line1        CF2
> > > Address Line2        CF2
> > > Address Line3        CF2
> > > Address Line4        CF2
> > > State                CF2
> > > City                 CF2
> > > Country              CF2
> > >
> > > Is it good to have rowkey as follows for the same table?
> > >
> > > Rowkey Design:
> > > --------------
> > > For CF1 : Customer Number + YYYYMMD (business date)
> > > For CF2 : Customer Number + Address Type
> > >
> > > Note :
> > > Address Type can be any of HOME/OFFICE/OTHERS
> > >
> > > regards,
> > > Rams
> > >
> >
>

Re: Regarding Rowkey and Column Family

Posted by Ramasubramanian Narayanan <ra...@gmail.com>.
Hi,

Is it OK to have the same column in different column families?

regards,
Rams

On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <do...@gmail.com> wrote:

> you are creating 2 different rows here. cf means how column are clubbed
> together as a single entity which is represented by that cf. but here you
> are creating 2 different rows having one cf each, CF1 and CF2 respectively.
> if you want to have 1 row with 2 cf, you have to do use same rowkey for
> both the cf.
>
>
>
> Best Regards,
> Tariq
> +91-9741563634
> https://mtariq.jux.com/
>
>
> On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
> ramasubramanian.narayanan@gmail.com> wrote:
>
> > Hi,
> >
> > *Table Name : Customer*
> > *
> > *
> > *Field Name         Column Family*
> > Customer Number      CF1
> > DOB                  CF1
> > FName                CF1
> > MName                CF1
> > LName                CF1
> > Address Type         CF2
> > Address Line1        CF2
> > Address Line2        CF2
> > Address Line3        CF2
> > Address Line4        CF2
> > State                CF2
> > City                 CF2
> > Country              CF2
> >
> > Is it good to have rowkey as follows for the same table?
> >
> > Rowkey Design:
> > --------------
> > For CF1 : Customer Number + YYYYMMD (business date)
> > For CF2 : Customer Number + Address Type
> >
> > Note :
> > Address Type can be any of HOME/OFFICE/OTHERS
> >
> > regards,
> > Rams
> >
>

Re: Regarding Rowkey and Column Family

Posted by Mohammad Tariq <do...@gmail.com>.
You are creating 2 different rows here. A CF describes how columns are
grouped together as a single entity, which is represented by that CF. But
here you are creating 2 different rows having one CF each, CF1 and CF2
respectively. If you want to have 1 row with 2 CFs, you have to use the
same rowkey for both CFs.
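
The difference can be illustrated with a toy in-memory model of a table
(row key -> column family -> qualifier -> value). This is only a sketch
using nested maps to mimic the logical layout. TinyTable is a
hypothetical class and does not use the HBase client API.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of a table's logical layout; not the HBase client API.
class TinyTable {
    // TreeMap keeps keys sorted, loosely mirroring HBase's sorted row keys.
    final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    // Store one value under (row key, column family, qualifier).
    void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(qualifier, value);
    }

    int rowCount() {
        return rows.size();
    }

    // Number of column families that hold data for the given row.
    int familyCount(String rowKey) {
        Map<String, Map<String, String>> families = rows.get(rowKey);
        return families == null ? 0 : families.size();
    }

    public static void main(String[] args) {
        // Different row key per CF: two separate rows, one CF each.
        TinyTable split = new TinyTable();
        split.put("10000000001-20121224", "CF1", "FName", "Fname1");
        split.put("10000000001-HOME", "CF2", "City", "1.1.City");
        System.out.println("rows with different keys: " + split.rowCount());

        // Same row key for both CFs: one row holding two CFs.
        TinyTable single = new TinyTable();
        single.put("10000000001", "CF1", "FName", "Fname1");
        single.put("10000000001", "CF2", "City", "1.1.City");
        System.out.println("rows with the same key: " + single.rowCount());
    }
}
```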



Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/


On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
ramasubramanian.narayanan@gmail.com> wrote:

> Hi,
>
> *Table Name : Customer*
> *
> *
> *Field Name         Column Family*
> Customer Number      CF1
> DOB                  CF1
> FName                CF1
> MName                CF1
> LName                CF1
> Address Type         CF2
> Address Line1        CF2
> Address Line2        CF2
> Address Line3        CF2
> Address Line4        CF2
> State                CF2
> City                 CF2
> Country              CF2
>
> Is it good to have rowkey as follows for the same table?
>
> Rowkey Design:
> --------------
> For CF1 : Customer Number + YYYYMMD (business date)
> For CF2 : Customer Number + Address Type
>
> Note :
> Address Type can be any of HOME/OFFICE/OTHERS
>
> regards,
> Rams
>