Posted to user@hbase.apache.org by aa a <am...@gmail.com> on 2014/02/07 20:02:15 UTC

modify prePut() behavior

Hi,

I'd like to perform a put but have it succeed only if certain conditions
based on the values of some columns are met.
I am on HBase 0.94.6.

my first option was:
write a prePut() observer, perform the checks in there, and then call
bypass() if needed.
However, calling bypass() seems to have no effect. I thought that bypass()
would cause HBase to skip the put.
Is that not correct?
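For reference, here is a minimal sketch of what I mean by such an observer on 0.94. The class name and the family/column it checks are made up for illustration; only the prePut() signature and the bypass() call come from the coprocessor API:

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical observer: skips any Put that lacks a required column.
public class ConditionalPutObserver extends BaseRegionObserver {
  private static final byte[] CF = Bytes.toBytes("all_cf");   // assumed family
  private static final byte[] COL = Bytes.toBytes("custId");  // assumed column

  @Override
  public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
      Put put, WALEdit edit, boolean writeToWAL) throws IOException {
    // If the condition is not met, tell HBase to skip this Put entirely.
    if (!put.has(CF, COL)) {
      e.bypass();
    }
  }
}
```

My understanding is that once this class is loaded by the region server, a bypassed Put should simply not appear in the table.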

second option:
write a preCheckAndPut() and perform the validations there. The problem is
that I have been running the process via MapReduce and using
TableMapReduceUtil.initTableReducerJob(
        TABLE_NAME,
        null,
        job);
And there seems to be no way to tell this to use "checkAndPut" instead of
"put", is there?
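For comparison, this is roughly what a direct client-side checkAndPut looks like on 0.94 (table, row key, and values here are borrowed from my test data; the expected value is a placeholder):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical client-side conditional put: the Put is applied only if the
// stored value of all_cf:subsrptnStsCd matches the expected value.
public class CheckAndPutExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "subsrptn");
    byte[] row = Bytes.toBytes("8322910300|111021583220");

    Put put = new Put(row);
    put.add(Bytes.toBytes("all_cf"), Bytes.toBytes("subsrptnStsCd"),
        Bytes.toBytes("YYYY"));

    // Atomic on the region server: returns false and writes nothing
    // if the current value does not equal the expected one.
    boolean applied = table.checkAndPut(row, Bytes.toBytes("all_cf"),
        Bytes.toBytes("subsrptnStsCd"), Bytes.toBytes("XXXX"), put);
    System.out.println("put applied: " + applied);
    table.close();
  }
}
```

But as far as I can tell, TableMapReduceUtil's sink emits plain puts, so this only works if I issue the call myself.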


third option:
I could try to do my own htable.checkAndPut from the mapper. However, since
I am on a kerberized cluster, I am not able to get an htable in the setup()
method of the mapper.
GSS initiate failed [Caused by GSSException: No valid credentials provided
(Mechanism level: Failed to find any Kerberos tgt)]
Trying to perform UserGroupInformation.loginUserFromKeytab() fails too;
I get: Caused by: javax.security.auth.login.LoginException: Unable to
obtain password from user
So I am not able to make any HTable calls from the mapper.
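What I was attempting in setup() was roughly the following; the principal and keytab path are placeholders. Note that loginUserFromKeytab() can only succeed if the keytab file is actually readable at that local path on the task node, which may be the real cause of the "Unable to obtain password from user" error:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical helper for a mapper's setup(): log in from a keytab that has
// been shipped to the task node. Principal and path are placeholders.
public class KerberosLogin {
  public static void login() throws IOException {
    Configuration conf = HBaseConfiguration.create();
    UserGroupInformation.setConfiguration(conf);
    UserGroupInformation.loginUserFromKeytab("app-user@EXAMPLE.COM",
        "/local/path/app-user.keytab");
  }
}
```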

The only M/R setup that has worked for me, reading an HDFS text file as the
source and writing to an HBase sink, is
TableMapReduceUtil.initTableReducerJob() with null for the reducer class.

Any thoughts?

Thank you very much in advance

ameet

Re: modify prePut() behavior

Posted by ameet c <am...@gmail.com>.
Thank you Ted; yes, I should have gone there first.
I will check that page for errors first from now on.
Thanks


On Fri, Feb 7, 2014 at 10:51 PM, Ted Yu <yu...@gmail.com> wrote:

> In region server status screen, Coprocessors currently loaded by the
> regionserver are shown.
>
> FYI
>
>
> On Fri, Feb 7, 2014 at 7:22 PM, ameet c <am...@gmail.com> wrote:
>
> > Thanks Ted,
> >
> > finally figured what the problem was, my oversight.
> > After poring over logs from individual region servers, it was much more
> > straightforward.
> > Due to permissions of 640 on the jar file that I was deploying to HDFS,
> the
> > class was not getting loaded.
> > Once I made the permissions 777, The region was able to load and run the
> > class.
> > and like a promise, bypass() worked fine.
> >
> > Thank you for persisting Ted...
> >
> >
> >
> >
> > On Fri, Feb 7, 2014 at 10:12 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> > > If you can capture your scenario in a unit test, I will investigate.
> > >
> > > Thanks
> > >
> > >
> > > On Fri, Feb 7, 2014 at 5:30 PM, ameet c <am...@gmail.com>
> wrote:
> > >
> > > > nope, at this point, I just have a basic test table and my prePut()
> has
> > > > just one line e.bypass()
> > > > I am just inserting a simple row with one column.
> > > > I am expecting nothing to be inserted in the table when I do a scan.
> > > > But that's not what I see.
> > > > all puts are going through.
> > > > Is this the correct behavior?
> > > >
> > > >
> > > >
> > > > On Fri, Feb 7, 2014 at 6:59 PM, Ted Yu <yu...@gmail.com> wrote:
> > > >
> > > > > bq. I have disabled all processing in my observer
> > > > >
> > > > > Was it possible that there was other modification going on (such as
> > > > Append)
> > > > > ?
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On Fri, Feb 7, 2014 at 11:26 AM, aa a <am...@gmail.com>
> > wrote:
> > > > >
> > > > > > Here is my pre job run result,
> > > > > >
> > > > > > hbase(main):035:0> get 'subsrptn','8322910300|111021583220'
> > > > > > COLUMN                               CELL
> > > > > >
> > > > > >  all_cf:custId                       timestamp=1391797558233,
> > > > > > value=111021583220
> > > > > >
> > > > > >  all_cf:dataDt                       timestamp=1391797558233,
> > > > > > value=2013-11-14
> > > > > >
> > > > > >  all_cf:mktCd                        timestamp=1391797558233,
> > > value=hcl
> > > > > >
> > > > > >  all_cf:rplctLogSeqNbr               timestamp=1391797558233,
> > > value=100
> > > > > >
> > > > > >  all_cf:rplctOprnType                timestamp=1391797558233,
> > > > > value=UPDATE
> > > > > >
> > > > > >  all_cf:subsrptnNbrTxt               timestamp=1391797558233,
> > > > > > value=8322910300
> > > > > >
> > > > > >  all_cf:subsrptnStsCd                timestamp=1391797558233,
> > > > value=XXXX
> > > > > >
> > > > > >  all_cf:subsrptnStsRsnCd             timestamp=1391797558233,
> > > > value=SOCHI
> > > > > >
> > > > > > 8 row(s) in 0.0570 seconds
> > > > > >
> > > > > > for now, I have disabled all processing in my observer: by simply
> > > > having
> > > > > > this,
> > > > > > public void prePut(ObserverContext<RegionCoprocessorEnvironment>
> e,
> > > > > > Put put, WALEdit edit, boolean writeToWAL) throws IOException {
> > > > > > e.bypass();
> > > > > > }
> > > > > > What this means is that when I try to do a put, it should fail,
> > > right?
> > > > > >
> > > > > > my job is comprised of HDFS source and HBase sink via,
> > > > > > TableMapReduceUtil.initTableReducerJob( Statics.TABLE_NAME, null,
> > > job);
> > > > > > ---
> > > > > > so the table puts are taken care of by TableMapReduceUtil.
> > > > > >
> > > > > > When I run my job, the "get" returns this:
> > > > > > hbase(main):036:0> get 'subsrptn','8322910300|111021583220'
> COLUMN
> > > CELL
> > > > > ...
> > > > > > ..
> > > > > > [ omitted for brevity ]
> > > > > > all_cf:subsrptnStsRsnCd timestamp=1391801091387, value=SOCHI-1
> > > > > >
> > > > > > Notice that the value and timestamp have changed.
> > > > > >
> > > > > > Why do I see this?
> > > > > >
> > > > > > ameet
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 7, 2014 at 2:09 PM, Ted Yu <yu...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > bq. However, calling bypass() seems to have no effect.
> > > > > > >
> > > > > > > In HRegion :
> > > > > > >
> > > > > > >           if (coprocessorHost.prePut((Put) m, walEdit,
> > > > > > m.getWriteToWAL()))
> > > > > > > {
> > > > > > >             // pre hook says skip this Put
> > > > > > >             // mark as success and skip in doMiniBatchMutation
> > > > > > >             batchOp.retCodeDetails[i] =
> OperationStatus.SUCCESS;
> > > > > > >
> > > > > > > Can you be a bit more specific about your observation ?
> > > > > > >
> > > > > > > Cheers
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Feb 7, 2014 at 11:02 AM, aa a <ameet.chaubal@gmail.com
> >
> > > > wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I'd like to perform a put but have it succeed only if certain
> > > > > > conditions
> > > > > > > > based on values of some columns are met.
> > > > > > > > I am on 0.94.6
> > > > > > > >
> > > > > > > > my first option was:
> > > > > > > > write a prePut() Observer, perform the checks in there and
> then
> > > > call
> > > > > a
> > > > > > > > bypass() if needed.
> > > > > > > > However, calling bypass() seems to have no effect. I thought
> > that
> > > > > > > bypass()
> > > > > > > > would cause HBase to not perform the put.
> > > > > > > > Is that not correct?
> > > > > > > >
> > > > > > > > second option:
> > > > > > > > write a preCheckAndPut() and perform the validations there.
> > > problem
> > > > > is
> > > > > > > that
> > > > > > > > I have been running the process via a map/reduce and using
> > > > > > > > TableMapReduceUtil.initTableReducerJob(
> > > > > > > >         TABLE_NAME,
> > > > > > > >         null,
> > > > > > > >         job);
> > > > > > > > And there seems to be no way to tell this to use
> "checkAndPut"
> > > > > instead
> > > > > > of
> > > > > > > > "put", is there?
> > > > > > > >
> > > > > > > >
> > > > > > > > third option:
> > > > > > > > I could try to do my own htable.checkAndPut from the mapper.
> > > > However,
> > > > > > > since
> > > > > > > > I am on a kerberized cluster, I am not able to get an htable
> in
> > > the
> > > > > > > setup()
> > > > > > > > method of the mapper.
> > > > > > > > GSS initiate failed [Caused by GSSException: No valid
> > credentials
> > > > > > > provided
> > > > > > > > (Mechanism level: Failed to find any Kerberos tgt)]
> > > > > > > >  trying to perform UserGroupInformation.loginUserFromKeytab()
> > > fails
> > > > > > too;
> > > > > > > > I  get Caused by: javax.security.auth.login.LoginException:
> > > Unable
> > > > to
> > > > > > > > obtain password from user
> > > > > > > > So I am not able to do any htable call from the mapper.
> > > > > > > >
> > > > > > > > The only M/R that has worked for me to read from HDFS text
> file
> > > as
> > > > > > source
> > > > > > > > and write to HBase sink is
> > > TableMapReduceUtil.initTableReducerJob()
> > > > > > with
> > > > > > > > null for reducer class.
> > > > > > > >
> > > > > > > > Any thoughts?
> > > > > > > >
> > > > > > > > Thank you much in advance
> > > > > > > >
> > > > > > > > ameet
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: modify prePut() behavior

Posted by Ted Yu <yu...@gmail.com>.
On the region server status screen, the coprocessors currently loaded by
the region server are shown.

FYI


On Fri, Feb 7, 2014 at 7:22 PM, ameet c <am...@gmail.com> wrote:

> Thanks Ted,
>
> finally figured what the problem was, my oversight.
> After poring over logs from individual region servers, it was much more
> straightforward.
> Due to permissions of 640 on the jar file that I was deploying to HDFS, the
> class was not getting loaded.
> Once I made the permissions 777, The region was able to load and run the
> class.
> and like a promise, bypass() worked fine.
>
> Thank you for persisting Ted...
>
>
>
>
> On Fri, Feb 7, 2014 at 10:12 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > If you can capture your scenario in a unit test, I will investigate.
> >
> > Thanks
> >
> >
> > On Fri, Feb 7, 2014 at 5:30 PM, ameet c <am...@gmail.com> wrote:
> >
> > > nope, at this point, I just have a basic test table and my prePut() has
> > > just one line e.bypass()
> > > I am just inserting a simple row with one column.
> > > I am expecting nothing to be inserted in the table when I do a scan.
> > > But that's not what I see.
> > > all puts are going through.
> > > Is this the correct behavior?
> > >
> > >
> > >
> > > On Fri, Feb 7, 2014 at 6:59 PM, Ted Yu <yu...@gmail.com> wrote:
> > >
> > > > bq. I have disabled all processing in my observer
> > > >
> > > > Was it possible that there was other modification going on (such as
> > > Append)
> > > > ?
> > > >
> > > > Cheers
> > > >
> > > >
> > > > On Fri, Feb 7, 2014 at 11:26 AM, aa a <am...@gmail.com>
> wrote:
> > > >
> > > > > Here is my pre job run result,
> > > > >
> > > > > hbase(main):035:0> get 'subsrptn','8322910300|111021583220'
> > > > > COLUMN                               CELL
> > > > >
> > > > >  all_cf:custId                       timestamp=1391797558233,
> > > > > value=111021583220
> > > > >
> > > > >  all_cf:dataDt                       timestamp=1391797558233,
> > > > > value=2013-11-14
> > > > >
> > > > >  all_cf:mktCd                        timestamp=1391797558233,
> > value=hcl
> > > > >
> > > > >  all_cf:rplctLogSeqNbr               timestamp=1391797558233,
> > value=100
> > > > >
> > > > >  all_cf:rplctOprnType                timestamp=1391797558233,
> > > > value=UPDATE
> > > > >
> > > > >  all_cf:subsrptnNbrTxt               timestamp=1391797558233,
> > > > > value=8322910300
> > > > >
> > > > >  all_cf:subsrptnStsCd                timestamp=1391797558233,
> > > value=XXXX
> > > > >
> > > > >  all_cf:subsrptnStsRsnCd             timestamp=1391797558233,
> > > value=SOCHI
> > > > >
> > > > > 8 row(s) in 0.0570 seconds
> > > > >
> > > > > for now, I have disabled all processing in my observer: by simply
> > > having
> > > > > this,
> > > > > public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
> > > > > Put put, WALEdit edit, boolean writeToWAL) throws IOException {
> > > > > e.bypass();
> > > > > }
> > > > > What this means is that when I try to do a put, it should fail,
> > right?
> > > > >
> > > > > my job is comprised of HDFS source and HBase sink via,
> > > > > TableMapReduceUtil.initTableReducerJob( Statics.TABLE_NAME, null,
> > job);
> > > > > ---
> > > > > so the table puts are taken care of by TableMapReduceUtil.
> > > > >
> > > > > When I run my job, the "get" returns this:
> > > > > hbase(main):036:0> get 'subsrptn','8322910300|111021583220' COLUMN
> > CELL
> > > > ...
> > > > > ..
> > > > > [ omitted for brevity ]
> > > > > all_cf:subsrptnStsRsnCd timestamp=1391801091387, value=SOCHI-1
> > > > >
> > > > > Notice that the value and timestamp have changed.
> > > > >
> > > > > Why do I see this?
> > > > >
> > > > > ameet
> > > > >
> > > > >
> > > > > On Fri, Feb 7, 2014 at 2:09 PM, Ted Yu <yu...@gmail.com>
> wrote:
> > > > >
> > > > > > bq. However, calling bypass() seems to have no effect.
> > > > > >
> > > > > > In HRegion :
> > > > > >
> > > > > >           if (coprocessorHost.prePut((Put) m, walEdit,
> > > > > m.getWriteToWAL()))
> > > > > > {
> > > > > >             // pre hook says skip this Put
> > > > > >             // mark as success and skip in doMiniBatchMutation
> > > > > >             batchOp.retCodeDetails[i] = OperationStatus.SUCCESS;
> > > > > >
> > > > > > Can you be a bit more specific about your observation ?
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 7, 2014 at 11:02 AM, aa a <am...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I'd like to perform a put but have it succeed only if certain
> > > > > conditions
> > > > > > > based on values of some columns are met.
> > > > > > > I am on 0.94.6
> > > > > > >
> > > > > > > my first option was:
> > > > > > > write a prePut() Observer, perform the checks in there and then
> > > call
> > > > a
> > > > > > > bypass() if needed.
> > > > > > > However, calling bypass() seems to have no effect. I thought
> that
> > > > > > bypass()
> > > > > > > would cause HBase to not perform the put.
> > > > > > > Is that not correct?
> > > > > > >
> > > > > > > second option:
> > > > > > > write a preCheckAndPut() and perform the validations there.
> > problem
> > > > is
> > > > > > that
> > > > > > > I have been running the process via a map/reduce and using
> > > > > > > TableMapReduceUtil.initTableReducerJob(
> > > > > > >         TABLE_NAME,
> > > > > > >         null,
> > > > > > >         job);
> > > > > > > And there seems to be no way to tell this to use "checkAndPut"
> > > > instead
> > > > > of
> > > > > > > "put", is there?
> > > > > > >
> > > > > > >
> > > > > > > third option:
> > > > > > > I could try to do my own htable.checkAndPut from the mapper.
> > > However,
> > > > > > since
> > > > > > > I am on a kerberized cluster, I am not able to get an htable in
> > the
> > > > > > setup()
> > > > > > > method of the mapper.
> > > > > > > GSS initiate failed [Caused by GSSException: No valid
> credentials
> > > > > > provided
> > > > > > > (Mechanism level: Failed to find any Kerberos tgt)]
> > > > > > >  trying to perform UserGroupInformation.loginUserFromKeytab()
> > fails
> > > > > too;
> > > > > > > I  get Caused by: javax.security.auth.login.LoginException:
> > Unable
> > > to
> > > > > > > obtain password from user
> > > > > > > So I am not able to do any htable call from the mapper.
> > > > > > >
> > > > > > > The only M/R that has worked for me to read from HDFS text file
> > as
> > > > > source
> > > > > > > and write to HBase sink is
> > TableMapReduceUtil.initTableReducerJob()
> > > > > with
> > > > > > > null for reducer class.
> > > > > > >
> > > > > > > Any thoughts?
> > > > > > >
> > > > > > > Thank you much in advance
> > > > > > >
> > > > > > > ameet
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: modify prePut() behavior

Posted by ameet c <am...@gmail.com>.
Thanks Ted,

I finally figured out what the problem was; it was my oversight.
After poring over the logs from the individual region servers, it was much
more straightforward to diagnose.
Due to permissions of 640 on the jar file that I was deploying to HDFS, the
class was not getting loaded.
Once I made the permissions 777, the region server was able to load and run
the class, and as promised, bypass() worked fine.

Thank you for persisting Ted...
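For anyone hitting the same thing: the table-attribute deployment I was using looks roughly like this on 0.94 (class name and jar path are placeholders). The key point is that the jar on HDFS must be readable by the user the region server runs as:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Coprocessor;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical deployment: attach the observer jar to the table as a
// coprocessor attribute. The jar must be readable by the region server user.
public class DeployObserver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    byte[] table = Bytes.toBytes("subsrptn");

    admin.disableTable(table);
    HTableDescriptor desc = admin.getTableDescriptor(table);
    desc.addCoprocessor("com.example.ConditionalPutObserver",   // placeholder
        new Path("hdfs:///coprocessors/observer.jar"),          // placeholder
        Coprocessor.PRIORITY_USER, null);
    admin.modifyTable(table, desc);
    admin.enableTable(table);
    admin.close();
  }
}
```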




On Fri, Feb 7, 2014 at 10:12 PM, Ted Yu <yu...@gmail.com> wrote:

> If you can capture your scenario in a unit test, I will investigate.
>
> Thanks
>
>
> On Fri, Feb 7, 2014 at 5:30 PM, ameet c <am...@gmail.com> wrote:
>
> > nope, at this point, I just have a basic test table and my prePut() has
> > just one line e.bypass()
> > I am just inserting a simple row with one column.
> > I am expecting nothing to be inserted in the table when I do a scan.
> > But that's not what I see.
> > all puts are going through.
> > Is this the correct behavior?
> >
> >
> >
> > On Fri, Feb 7, 2014 at 6:59 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> > > bq. I have disabled all processing in my observer
> > >
> > > Was it possible that there was other modification going on (such as
> > Append)
> > > ?
> > >
> > > Cheers
> > >
> > >
> > > On Fri, Feb 7, 2014 at 11:26 AM, aa a <am...@gmail.com> wrote:
> > >
> > > > Here is my pre job run result,
> > > >
> > > > hbase(main):035:0> get 'subsrptn','8322910300|111021583220'
> > > > COLUMN                               CELL
> > > >
> > > >  all_cf:custId                       timestamp=1391797558233,
> > > > value=111021583220
> > > >
> > > >  all_cf:dataDt                       timestamp=1391797558233,
> > > > value=2013-11-14
> > > >
> > > >  all_cf:mktCd                        timestamp=1391797558233,
> value=hcl
> > > >
> > > >  all_cf:rplctLogSeqNbr               timestamp=1391797558233,
> value=100
> > > >
> > > >  all_cf:rplctOprnType                timestamp=1391797558233,
> > > value=UPDATE
> > > >
> > > >  all_cf:subsrptnNbrTxt               timestamp=1391797558233,
> > > > value=8322910300
> > > >
> > > >  all_cf:subsrptnStsCd                timestamp=1391797558233,
> > value=XXXX
> > > >
> > > >  all_cf:subsrptnStsRsnCd             timestamp=1391797558233,
> > value=SOCHI
> > > >
> > > > 8 row(s) in 0.0570 seconds
> > > >
> > > > for now, I have disabled all processing in my observer: by simply
> > having
> > > > this,
> > > > public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
> > > > Put put, WALEdit edit, boolean writeToWAL) throws IOException {
> > > > e.bypass();
> > > > }
> > > > What this means is that when I try to do a put, it should fail,
> right?
> > > >
> > > > my job is comprised of HDFS source and HBase sink via,
> > > > TableMapReduceUtil.initTableReducerJob( Statics.TABLE_NAME, null,
> job);
> > > > ---
> > > > so the table puts are taken care of by TableMapReduceUtil.
> > > >
> > > > When I run my job, the "get" returns this:
> > > > hbase(main):036:0> get 'subsrptn','8322910300|111021583220' COLUMN
> CELL
> > > ...
> > > > ..
> > > > [ omitted for brevity ]
> > > > all_cf:subsrptnStsRsnCd timestamp=1391801091387, value=SOCHI-1
> > > >
> > > > Notice that the value and timestamp have changed.
> > > >
> > > > Why do I see this?
> > > >
> > > > ameet
> > > >
> > > >
> > > > On Fri, Feb 7, 2014 at 2:09 PM, Ted Yu <yu...@gmail.com> wrote:
> > > >
> > > > > bq. However, calling bypass() seems to have no effect.
> > > > >
> > > > > In HRegion :
> > > > >
> > > > >           if (coprocessorHost.prePut((Put) m, walEdit,
> > > > m.getWriteToWAL()))
> > > > > {
> > > > >             // pre hook says skip this Put
> > > > >             // mark as success and skip in doMiniBatchMutation
> > > > >             batchOp.retCodeDetails[i] = OperationStatus.SUCCESS;
> > > > >
> > > > > Can you be a bit more specific about your observation ?
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On Fri, Feb 7, 2014 at 11:02 AM, aa a <am...@gmail.com>
> > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'd like to perform a put but have it succeed only if certain
> > > > conditions
> > > > > > based on values of some columns are met.
> > > > > > I am on 0.94.6
> > > > > >
> > > > > > my first option was:
> > > > > > write a prePut() Observer, perform the checks in there and then
> > call
> > > a
> > > > > > bypass() if needed.
> > > > > > However, calling bypass() seems to have no effect. I thought that
> > > > > bypass()
> > > > > > would cause HBase to not perform the put.
> > > > > > Is that not correct?
> > > > > >
> > > > > > second option:
> > > > > > write a preCheckAndPut() and perform the validations there.
> problem
> > > is
> > > > > that
> > > > > > I have been running the process via a map/reduce and using
> > > > > > TableMapReduceUtil.initTableReducerJob(
> > > > > >         TABLE_NAME,
> > > > > >         null,
> > > > > >         job);
> > > > > > And there seems to be no way to tell this to use "checkAndPut"
> > > instead
> > > > of
> > > > > > "put", is there?
> > > > > >
> > > > > >
> > > > > > third option:
> > > > > > I could try to do my own htable.checkAndPut from the mapper.
> > However,
> > > > > since
> > > > > > I am on a kerberized cluster, I am not able to get an htable in
> the
> > > > > setup()
> > > > > > method of the mapper.
> > > > > > GSS initiate failed [Caused by GSSException: No valid credentials
> > > > > provided
> > > > > > (Mechanism level: Failed to find any Kerberos tgt)]
> > > > > >  trying to perform UserGroupInformation.loginUserFromKeytab()
> fails
> > > > too;
> > > > > > I  get Caused by: javax.security.auth.login.LoginException:
> Unable
> > to
> > > > > > obtain password from user
> > > > > > So I am not able to do any htable call from the mapper.
> > > > > >
> > > > > > The only M/R that has worked for me to read from HDFS text file
> as
> > > > source
> > > > > > and write to HBase sink is
> TableMapReduceUtil.initTableReducerJob()
> > > > with
> > > > > > null for reducer class.
> > > > > >
> > > > > > Any thoughts?
> > > > > >
> > > > > > Thank you much in advance
> > > > > >
> > > > > > ameet
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: modify prePut() behavior

Posted by Ted Yu <yu...@gmail.com>.
If you can capture your scenario in a unit test, I will investigate.

Thanks


On Fri, Feb 7, 2014 at 5:30 PM, ameet c <am...@gmail.com> wrote:

> nope, at this point, I just have a basic test table and my prePut() has
> just one line e.bypass()
> I am just inserting a simple row with one column.
> I am expecting nothing to be inserted in the table when I do a scan.
> But that's not what I see.
> all puts are going through.
> Is this the correct behavior?
>
>
>
> On Fri, Feb 7, 2014 at 6:59 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > bq. I have disabled all processing in my observer
> >
> > Was it possible that there was other modification going on (such as
> Append)
> > ?
> >
> > Cheers
> >
> >
> > On Fri, Feb 7, 2014 at 11:26 AM, aa a <am...@gmail.com> wrote:
> >
> > > Here is my pre job run result,
> > >
> > > hbase(main):035:0> get 'subsrptn','8322910300|111021583220'
> > > COLUMN                               CELL
> > >
> > >  all_cf:custId                       timestamp=1391797558233,
> > > value=111021583220
> > >
> > >  all_cf:dataDt                       timestamp=1391797558233,
> > > value=2013-11-14
> > >
> > >  all_cf:mktCd                        timestamp=1391797558233, value=hcl
> > >
> > >  all_cf:rplctLogSeqNbr               timestamp=1391797558233, value=100
> > >
> > >  all_cf:rplctOprnType                timestamp=1391797558233,
> > value=UPDATE
> > >
> > >  all_cf:subsrptnNbrTxt               timestamp=1391797558233,
> > > value=8322910300
> > >
> > >  all_cf:subsrptnStsCd                timestamp=1391797558233,
> value=XXXX
> > >
> > >  all_cf:subsrptnStsRsnCd             timestamp=1391797558233,
> value=SOCHI
> > >
> > > 8 row(s) in 0.0570 seconds
> > >
> > > for now, I have disabled all processing in my observer: by simply
> having
> > > this,
> > > public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
> > > Put put, WALEdit edit, boolean writeToWAL) throws IOException {
> > > e.bypass();
> > > }
> > > What this means is that when I try to do a put, it should fail, right?
> > >
> > > my job is comprised of HDFS source and HBase sink via,
> > > TableMapReduceUtil.initTableReducerJob( Statics.TABLE_NAME, null, job);
> > > ---
> > > so the table puts are taken care of by TableMapReduceUtil.
> > >
> > > When I run my job, the "get" returns this:
> > > hbase(main):036:0> get 'subsrptn','8322910300|111021583220' COLUMN CELL
> > ...
> > > ..
> > > [ omitted for brevity ]
> > > all_cf:subsrptnStsRsnCd timestamp=1391801091387, value=SOCHI-1
> > >
> > > Notice that the value and timestamp have changed.
> > >
> > > Why do I see this?
> > >
> > > ameet
> > >
> > >
> > > On Fri, Feb 7, 2014 at 2:09 PM, Ted Yu <yu...@gmail.com> wrote:
> > >
> > > > bq. However, calling bypass() seems to have no effect.
> > > >
> > > > In HRegion :
> > > >
> > > >           if (coprocessorHost.prePut((Put) m, walEdit,
> > > m.getWriteToWAL()))
> > > > {
> > > >             // pre hook says skip this Put
> > > >             // mark as success and skip in doMiniBatchMutation
> > > >             batchOp.retCodeDetails[i] = OperationStatus.SUCCESS;
> > > >
> > > > Can you be a bit more specific about your observation ?
> > > >
> > > > Cheers
> > > >
> > > >
> > > > On Fri, Feb 7, 2014 at 11:02 AM, aa a <am...@gmail.com>
> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'd like to perform a put but have it succeed only if certain
> > > conditions
> > > > > based on values of some columns are met.
> > > > > I am on 0.94.6
> > > > >
> > > > > my first option was:
> > > > > write a prePut() Observer, perform the checks in there and then
> call
> > a
> > > > > bypass() if needed.
> > > > > However, calling bypass() seems to have no effect. I thought that
> > > > bypass()
> > > > > would cause HBase to not perform the put.
> > > > > Is that not correct?
> > > > >
> > > > > second option:
> > > > > write a preCheckAndPut() and perform the validations there. problem
> > is
> > > > that
> > > > > I have been running the process via a map/reduce and using
> > > > > TableMapReduceUtil.initTableReducerJob(
> > > > >         TABLE_NAME,
> > > > >         null,
> > > > >         job);
> > > > > And there seems to be no way to tell this to use "checkAndPut"
> > instead
> > > of
> > > > > "put", is there?
> > > > >
> > > > >
> > > > > third option:
> > > > > I could try to do my own htable.checkAndPut from the mapper.
> However,
> > > > since
> > > > > I am on a kerberized cluster, I am not able to get an htable in the
> > > > setup()
> > > > > method of the mapper.
> > > > > GSS initiate failed [Caused by GSSException: No valid credentials
> > > > provided
> > > > > (Mechanism level: Failed to find any Kerberos tgt)]
> > > > >  trying to perform UserGroupInformation.loginUserFromKeytab() fails
> > > too;
> > > > > I  get Caused by: javax.security.auth.login.LoginException: Unable
> to
> > > > > obtain password from user
> > > > > So I am not able to do any htable call from the mapper.
> > > > >
> > > > > The only M/R that has worked for me to read from HDFS text file as
> > > source
> > > > > and write to HBase sink is TableMapReduceUtil.initTableReducerJob()
> > > with
> > > > > null for reducer class.
> > > > >
> > > > > Any thoughts?
> > > > >
> > > > > Thank you much in advance
> > > > >
> > > > > ameet
> > > > >
> > > >
> > >
> >
>

Re: modify prePut() behavior

Posted by ameet c <am...@gmail.com>.
Nope; at this point I just have a basic test table, and my prePut() has
just one line: e.bypass().
I am just inserting a simple row with one column.
I am expecting nothing to be inserted in the table when I do a scan,
but that's not what I see:
all the puts are going through.
Is this the correct behavior?



On Fri, Feb 7, 2014 at 6:59 PM, Ted Yu <yu...@gmail.com> wrote:

> bq. I have disabled all processing in my observer
>
> Was it possible that there was other modification going on (such as Append)
> ?
>
> Cheers
>
>
> On Fri, Feb 7, 2014 at 11:26 AM, aa a <am...@gmail.com> wrote:
>
> > Here is my pre job run result,
> >
> > hbase(main):035:0> get 'subsrptn','8322910300|111021583220'
> > COLUMN                               CELL
> >
> >  all_cf:custId                       timestamp=1391797558233,
> > value=111021583220
> >
> >  all_cf:dataDt                       timestamp=1391797558233,
> > value=2013-11-14
> >
> >  all_cf:mktCd                        timestamp=1391797558233, value=hcl
> >
> >  all_cf:rplctLogSeqNbr               timestamp=1391797558233, value=100
> >
> >  all_cf:rplctOprnType                timestamp=1391797558233,
> value=UPDATE
> >
> >  all_cf:subsrptnNbrTxt               timestamp=1391797558233,
> > value=8322910300
> >
> >  all_cf:subsrptnStsCd                timestamp=1391797558233, value=XXXX
> >
> >  all_cf:subsrptnStsRsnCd             timestamp=1391797558233, value=SOCHI
> >
> > 8 row(s) in 0.0570 seconds
> >
> > for now, I have disabled all processing in my observer: by simply having
> > this,
> > public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
> > Put put, WALEdit edit, boolean writeToWAL) throws IOException {
> > e.bypass();
> > }
> > What this means is that when I try to do a put, it should fail, right?
> >
> > my job is comprised of HDFS source and HBase sink via,
> > TableMapReduceUtil.initTableReducerJob( Statics.TABLE_NAME, null, job);
> > ---
> > so the table puts are taken care of by TableMapReduceUtil.
> >
> > When I run my job, the "get" returns this:
> > hbase(main):036:0> get 'subsrptn','8322910300|111021583220' COLUMN CELL
> ...
> > ..
> > [ omitted for brevity ]
> > all_cf:subsrptnStsRsnCd timestamp=1391801091387, value=SOCHI-1
> >
> > Notice that the value and timestamp have changed.
> >
> > Why do I see this?
> >
> > ameet
> >
> >
> > On Fri, Feb 7, 2014 at 2:09 PM, Ted Yu <yu...@gmail.com> wrote:
> >

Re: modify prePut() behavior

Posted by Ted Yu <yu...@gmail.com>.
bq. I have disabled all processing in my observer

Was it possible that there was some other modification going on (such as an
Append)?

Cheers


On Fri, Feb 7, 2014 at 11:26 AM, aa a <am...@gmail.com> wrote:


Re: modify prePut() behavior

Posted by aa a <am...@gmail.com>.
Here is my pre-job-run result:

hbase(main):035:0> get 'subsrptn','8322910300|111021583220'
COLUMN                   CELL
 all_cf:custId           timestamp=1391797558233, value=111021583220
 all_cf:dataDt           timestamp=1391797558233, value=2013-11-14
 all_cf:mktCd            timestamp=1391797558233, value=hcl
 all_cf:rplctLogSeqNbr   timestamp=1391797558233, value=100
 all_cf:rplctOprnType    timestamp=1391797558233, value=UPDATE
 all_cf:subsrptnNbrTxt   timestamp=1391797558233, value=8322910300
 all_cf:subsrptnStsCd    timestamp=1391797558233, value=XXXX
 all_cf:subsrptnStsRsnCd timestamp=1391797558233, value=SOCHI
8 row(s) in 0.0570 seconds

For now, I have disabled all processing in my observer by simply having
this:

@Override
public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
    Put put, WALEdit edit, boolean writeToWAL) throws IOException {
  e.bypass();
}

What this means is that when I try to do a put, it should have no effect,
right?

My job is comprised of an HDFS source and an HBase sink via:
TableMapReduceUtil.initTableReducerJob(Statics.TABLE_NAME, null, job);
so the table puts are taken care of by TableMapReduceUtil.

When I run my job, the "get" returns this:

hbase(main):036:0> get 'subsrptn','8322910300|111021583220'
COLUMN                   CELL
[ omitted for brevity ]
 all_cf:subsrptnStsRsnCd timestamp=1391801091387, value=SOCHI-1

Notice that the value and timestamp have changed.

Why do I see this?

ameet


On Fri, Feb 7, 2014 at 2:09 PM, Ted Yu <yu...@gmail.com> wrote:


Re: modify prePut() behavior

Posted by Ted Yu <yu...@gmail.com>.
bq. However, calling bypass() seems to have no effect.

In HRegion :

          if (coprocessorHost.prePut((Put) m, walEdit, m.getWriteToWAL())) {
            // pre hook says skip this Put
            // mark as success and skip in doMiniBatchMutation
            batchOp.retCodeDetails[i] = OperationStatus.SUCCESS;
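A toy model of the quoted flow (a simplified sketch, not the real HRegion
code) may make the semantics clearer: when the pre hook returns true for a
mutation, its slot is marked SUCCESS and the write is skipped, so the client
sees no error even though nothing was stored:

```java
import java.util.Arrays;

public class MiniBatchModel {
    public enum Status { NOT_RUN, SUCCESS, WRITTEN }

    // preHookBypassed[i] stands in for coprocessorHost.prePut(...)
    // returning true for mutation i (i.e. the observer called bypass()).
    public static Status[] apply(boolean[] preHookBypassed) {
        Status[] retCodeDetails = new Status[preHookBypassed.length];
        Arrays.fill(retCodeDetails, Status.NOT_RUN);
        for (int i = 0; i < preHookBypassed.length; i++) {
            if (preHookBypassed[i]) {
                // pre hook says skip this Put: mark as success, do not write
                retCodeDetails[i] = Status.SUCCESS;
                continue;
            }
            // the real code would apply the mutation here
            retCodeDetails[i] = Status.WRITTEN;
        }
        return retCodeDetails;
    }
}
```

Note that a bypassed put and a written put both look like "success" to the
caller, which is why a client-side check cannot distinguish them.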

Can you be a bit more specific about your observation?

Cheers


On Fri, Feb 7, 2014 at 11:02 AM, aa a <am...@gmail.com> wrote:

> Hi,
>
> I'd like to perform a put but have it succeed only if certain conditions
> based on values of some columns are met.
> I am on 0.94.6
>
> my first option was:
> write a prePut() Observer, perform the checks in there and then call a
> bypass() if needed.
> However, calling bypass() seems to have no effect. I thought that bypass()
> would cause HBase to not perform the put.
> Is that not correct?
>
> second option:
> write a preCheckAndPut() and perform the validations there. The problem is that
> I have been running the process via a map/reduce and using
> TableMapReduceUtil.initTableReducerJob(
>         TABLE_NAME,
>         null,
>         job);
> And there seems to be no way to tell this to use "checkAndPut" instead of
> "put", is there?
>
>
> third option:
> I could try to do my own htable.checkAndPut from the mapper. However, since
> I am on a kerberized cluster, I am not able to get an htable in the setup()
> method of the mapper.
> GSS initiate failed [Caused by GSSException: No valid credentials provided
> (Mechanism level: Failed to find any Kerberos tgt)]
> Trying to perform UserGroupInformation.loginUserFromKeytab() fails too; I
> get: Caused by: javax.security.auth.login.LoginException: Unable to
> obtain password from user
> So I am not able to do any htable call from the mapper.
>
> The only M/R that has worked for me to read from HDFS text file as source
> and write to HBase sink is TableMapReduceUtil.initTableReducerJob() with
> null for reducer class.
>
> Any thoughts?
>
> Thank you much in advance
>
> ameet
>