Posted to user@hbase.apache.org by Kristoffer Sjögren <st...@gmail.com> on 2013/04/19 19:53:45 UTC

Overwrite a row

Hi

Is it possible to completely overwrite/replace a row in a single _atomic_
action? Already existing columns and qualifiers should be removed if they
do not exist in the data inserted into the row.

The only way to do this is to first delete the row then insert new data in
its place, correct? Or is there an operation to do this?

Cheers,
-Kristoffer

Re: Overwrite a row

Posted by Mohamed Ibrahim <mi...@mibrahim.net>.

Hello Kristoffer,

HBase row mutations are atomic ( http://hbase.apache.org/acid-semantics.html ),
and that includes put. So when you overwrite a row, it is not possible for
another process to read half old / half new data. It will either read
all old or all new data if the put succeeds. It is also not possible for a
put to fail in the middle, leaving a partly modified row.
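
Atomicity alone does not give the "replace" behaviour asked about, because a put only writes the cells it names. A minimal sketch of that, using a plain in-memory sorted map as a stand-in for one row (the class and method names are hypothetical, and no HBase classes are involved):

```java
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of one row: qualifier -> value.
// Illustration only; this is not the HBase client API.
class PutMergeSketch {

    // Applies a put to a copy of the row: every named cell is written,
    // and cells that are not named are left untouched.
    static Map<String, String> applyPut(Map<String, String> row,
                                        Map<String, String> put) {
        Map<String, String> result = new TreeMap<>(row);
        result.putAll(put);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> row = new TreeMap<>();
        row.put("a", "old-a");
        row.put("b", "old-b");

        Map<String, String> put = new TreeMap<>();
        put.put("a", "new-a");
        put.put("c", "new-c");

        // "b" survives the put, so the row has been merged, not replaced.
        System.out.println(applyPut(row, put)); // prints {a=new-a, b=old-b, c=new-c}
    }
}
```

Qualifier "b" is still present after the put, which is why removing columns that are missing from the new data needs an explicit delete.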

Best,
Mohamed

Re: Overwrite a row

Posted by Ted Yu <yu...@gmail.com>.

If the maximum number of versions is set to 1 for your table, you would
already have what you wanted.

Normally max versions being 1 is not desired, that was why I asked about
your use case.

Cheers

On Fri, Apr 19, 2013 at 12:44 PM, Kristoffer Sjögren <st...@gmail.com>wrote:

> What would you suggest? I want the operation to be atomic.
>
>
> On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > What is the maximum number of versions do you allow for the underlying
> > table ?
> >
> > Thanks

number of zookeeper connections, how many is too many?

Posted by kaveh minooie <ka...@plutoz.com>.

Hi

I was just wondering if what I am seeing in my cluster makes sense. I
have a Hadoop cluster with 10 nodes and I am running 10 regionservers on
top of them as well. In my ZooKeeper configuration I chose to allow an
unlimited number of connections, mostly to see how high it actually goes.
Now, I run 8 map tasks on each of my nodes, for a total of 80 concurrent map
tasks, and my HBase regionservers each have a bit short of 200 regions,
for a total of 1838 (or something), all belonging to only
one table.

Right after bringing up HBase, or when no MapReduce job (or any other
client) is using HBase, the number of connections is always 23. When I
run a MapReduce job that basically goes over the entire table (it has 1800
something map tasks), I see (in the zk_dump on the HBase master web
interface) that the number of connections goes up to about 390ish.

I am new to this, so my main question is: first, does this make sense? Or
am I doing something wrong? Because I don't understand why each region
server has to establish more than one connection.

thanks,

Re: Overwrite a row

Posted by Kristoffer Sjögren <st...@gmail.com>.

Interesting. RowMutations is a great API improvement, I shall have a look at
MultiRowMutationEndpoint as well.

Thanks!



Re: Overwrite a row

Posted by Anoop John <an...@gmail.com>.

You can use MultiRowMutationEndpoint for atomic op on multiple rows (within
same region)..
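
The "within same region" condition can be checked up front if the region start keys are known. A rough sketch, assuming each region covers the keys from its start key (inclusive) up to the next region's start key (exclusive); the region names and keys are made up, and Java string ordering stands in for HBase's byte-wise row key ordering, which matches only for plain ASCII keys:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Illustration only: plain Java collections, no HBase classes.
class SameRegionSketch {

    // Finds the region whose start key is the greatest one <= rowKey.
    // Assumes the map contains the empty start key "" for the first region,
    // so a floor entry always exists.
    static String regionFor(TreeMap<String, String> regionsByStartKey, String rowKey) {
        return regionsByStartKey.floorEntry(rowKey).getValue();
    }

    // True when every row key falls into one and the same region.
    static boolean sameRegion(TreeMap<String, String> regionsByStartKey, List<String> rowKeys) {
        String first = null;
        for (String rowKey : rowKeys) {
            String region = regionFor(regionsByStartKey, rowKey);
            if (first == null) {
                first = region;
            } else if (!first.equals(region)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "region-1");   // covers ["", "m")
        regions.put("m", "region-2");  // covers ["m", end of table)

        System.out.println(sameRegion(regions, Arrays.asList("apple", "banana"))); // prints true
        System.out.println(sameRegion(regions, Arrays.asList("apple", "melon")));  // prints false
    }
}
```

Rows that land in different regions would still need separate calls.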



Re: Overwrite a row

Posted by Ted Yu <yu...@gmail.com>.

Here is code from 0.94 code base:

  public void mutateRow(final RowMutations rm) throws IOException {
    new ServerCallable<Void>(connection, tableName, rm.getRow(),
        operationTimeout) {
      public Void call() throws IOException {
        server.mutateRow(location.getRegionInfo().getRegionName(), rm);
        return null;

where RowMutations has the following check:

  private void internalAdd(Mutation m) throws IOException {
    int res = Bytes.compareTo(this.row, m.getRow());
    if(res != 0) {
      throw new IOException("The row in the recently added Put/Delete " +
          Bytes.toStringBinary(m.getRow()) + " doesn't match the original one " +
          Bytes.toStringBinary(this.row));

This means you need to issue multiple mutateRow() calls for different rows.

I think you should consider the potential impact on performance due to this
limitation.
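
One way to live with the one-row-per-call check is to group pending changes by row key first and then issue one call per group. A minimal sketch of the grouping step only; `Change` and the operation labels are hypothetical stand-ins for Put/Delete objects, and no HBase classes are used:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: groups pending changes so that each group could be
// turned into one RowMutations object and one mutateRow() call.
class GroupByRowSketch {

    // A pending change: the row key plus a label describing the operation.
    static final class Change {
        final String row;
        final String op;

        Change(String row, String op) {
            this.row = row;
            this.op = op;
        }
    }

    // Groups operations by row key, keeping first-seen row order and the
    // original order of operations within each row.
    static Map<String, List<String>> groupByRow(List<Change> changes) {
        Map<String, List<String>> byRow = new LinkedHashMap<>();
        for (Change change : changes) {
            byRow.computeIfAbsent(change.row, k -> new ArrayList<>()).add(change.op);
        }
        return byRow;
    }

    public static void main(String[] args) {
        List<Change> changes = Arrays.asList(
            new Change("row1", "put a"),
            new Change("row2", "put a"),
            new Change("row1", "delete b"));

        // Two groups, so two separate calls; each call is atomic for its own
        // row only, not across rows.
        System.out.println(groupByRow(changes)); // prints {row1=[put a, delete b], row2=[put a]}
    }
}
```

The number of groups is the number of round trips, which is the performance cost mentioned above.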

For advanced usage, take a look at MultiRowMutationEndpoint:

 * This class demonstrates how to implement atomic multi row transactions using
 * {@link HRegion#mutateRowsWithLocks(java.util.Collection, java.util.Collection)}
 * and Coprocessor endpoints.

Cheers


Re: Overwrite a row

Posted by Kristoffer Sjögren <st...@gmail.com>.

Just to be absolutely clear, is this also true for a batch that spans
multiple rows?



Re: Overwrite a row

Posted by Ted Yu <yu...@gmail.com>.

Operations within each batch are atomic. 
They would either all succeed or all fail. 

Time stamps would all refer to the latest cell (KeyVal). 

Cheers


Re: Overwrite a row

Posted by Kristoffer Sjögren <st...@gmail.com>.

The schema is known beforehand so this is exactly what I need. Great!

One more question. What guarantees does the batch operation have? Are the
operations contained within each batch atomic? I.e. will all mutations be
given the same timestamp? If something fails, do all operations fail, or can
it fail partially?

Thanks for your help, much appreciated.

Cheers,
-Kristoffer



Re: Overwrite a row

Posted by Ted Yu <yu...@gmail.com>.

I don't know details about Kristoffer's schema.
If all the column qualifiers are known a priori, mutateRow() should serve
his needs.

HBase allows an arbitrary number of columns in a column family. If the schema
is dynamic, mutateRow() wouldn't suffice.
If the column qualifiers are known but the row is very wide (and a few
columns are updated per call), performance would degrade.

Just some factors to consider.
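
One way to read the dynamic-schema point: the client has to find out which qualifiers the row currently holds before it can decide what to delete. Below is a minimal sketch of that bookkeeping in plain Java, with no HBase dependency. The class RowDiff and the method staleQualifiers are hypothetical names for illustration, not part of the HBase API, and qualifiers are modelled as strings rather than byte arrays. Note that reading the row first and mutating it afterwards is not atomic as a whole: another writer could add a column between the two steps.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not HBase code): plans a "replace row" as deletes plus puts. */
final class RowDiff {

    /** Returns the qualifiers stored in the row that are absent from the new data. */
    static Set<String> staleQualifiers(Set<String> existing, Map<String, String> newData) {
        Set<String> stale = new TreeSet<>(existing); // sorted copy; the input is left untouched
        stale.removeAll(newData.keySet());
        return stale;
    }

    public static void main(String[] args) {
        Set<String> existing = Set.of("a", "b", "c");            // qualifiers read back from the row
        Map<String, String> newData = Map.of("b", "2", "d", "4"); // data that should replace the row
        // Each stale qualifier would become a delete and each newData entry a put,
        // all submitted together in one mutateRow() call for that row.
        System.out.println(staleQualifiers(existing, newData));  // prints [a, c]
    }
}
```

On a very wide row the existing set is large, which is where the extra read starts to cost.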

Cheers

On Fri, Apr 19, 2013 at 1:41 PM, Mohamed Ibrahim <mi...@mibrahim.net>wrote:

> Actually I do see it in the 0.94 JavaDocs (
>
> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
> ),
> so may be it was added in 0.94.6 even though the jira says fixed in 0.95 .
> I haven't used it though, but it seems that's what you're looking for.
>
> Sorry for confusion.
>
> Mohamed
>
>
> On Fri, Apr 19, 2013 at 4:35 PM, Mohamed Ibrahim <mibrahim@mibrahim.net
> >wrote:
>
> > It seems that 0.95 is not released yet, mutateRow won't be a solution for
> > now. I saw it in the downloads and I thought it was released.
> >
> >
> > On Fri, Apr 19, 2013 at 4:18 PM, Mohamed Ibrahim <mibrahim@mibrahim.net
> >wrote:
> >
> >> Just noticed you want to delete as well. I think that's supported since
> >> 0.95 in mutateRow (
> >>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)).
> >> You can do multiple puts and deletes and they will be performed
> atomically.
> >> So you can remove qualifiers and put new ones.
> >>
> >> Mohamed
> >>
> >>
> >> On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren <stoffe@gmail.com
> >wrote:
> >>
> >>> What would you suggest? I want the operation to be atomic.
> >>>
> >>>
> >>> On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu <yu...@gmail.com> wrote:
> >>>
> >>> > What is the maximum number of versions do you allow for the
> underlying
> >>> > table ?
> >>> >
> >>> > Thanks
> >>> >
> >>> > On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren <
> stoffe@gmail.com
> >>> > >wrote:
> >>> >
> >>> > > Hi
> >>> > >
> >>> > > Is it possible to completely overwrite/replace a row in a single
> >>> _atomic_
> >>> > > action? Already existing columns and qualifiers should be removed
> if
> >>> they
> >>> > > do not exist in the data inserted into the row.
> >>> > >
> >>> > > The only way to do this is to first delete the row then insert new
> >>> data
> >>> > in
> >>> > > its place, correct? Or is there an operation to do this?
> >>> > >
> >>> > > Cheers,
> >>> > > -Kristoffer
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>

Re: Overwrite a row

Posted by Mohamed Ibrahim <mi...@mibrahim.net>.

Actually I do see it in the 0.94 JavaDocs (
http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
),
so maybe it was added in 0.94.6, even though the JIRA says it was fixed in 0.95.
I haven't used it myself, but it seems to be what you're looking for.

Sorry for the confusion.

Mohamed


On Fri, Apr 19, 2013 at 4:35 PM, Mohamed Ibrahim <mi...@mibrahim.net>wrote:

> It seems that 0.95 is not released yet, mutateRow won't be a solution for
> now. I saw it in the downloads and I thought it was released.
>
>
> On Fri, Apr 19, 2013 at 4:18 PM, Mohamed Ibrahim <mi...@mibrahim.net>wrote:
>
>> Just noticed you want to delete as well. I think that's supported since
>> 0.95 in mutateRow (
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations) ).
>> You can do multiple puts and deletes and they will be performed atomically.
>> So you can remove qualifiers and put new ones.
>>
>> Mohamed
>>
>>
>> On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren <st...@gmail.com>wrote:
>>
>>> What would you suggest? I want the operation to be atomic.
>>>
>>>
>>> On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu <yu...@gmail.com> wrote:
>>>
>>> > What is the maximum number of versions do you allow for the underlying
>>> > table ?
>>> >
>>> > Thanks
>>> >
>>> > On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren <stoffe@gmail.com
>>> > >wrote:
>>> >
>>> > > Hi
>>> > >
>>> > > Is it possible to completely overwrite/replace a row in a single
>>> _atomic_
>>> > > action? Already existing columns and qualifiers should be removed if
>>> they
>>> > > do not exist in the data inserted into the row.
>>> > >
>>> > > The only way to do this is to first delete the row then insert new
>>> data
>>> > in
>>> > > its place, correct? Or is there an operation to do this?
>>> > >
>>> > > Cheers,
>>> > > -Kristoffer
>>> > >
>>> >
>>>
>>
>>
>

Re: Overwrite a row

Posted by Mohamed Ibrahim <mi...@mibrahim.net>.

It seems that 0.95 is not released yet, so mutateRow won't be a solution for
now. I saw it in the downloads and thought it had been released.


On Fri, Apr 19, 2013 at 4:18 PM, Mohamed Ibrahim <mi...@mibrahim.net>wrote:

> Just noticed you want to delete as well. I think that's supported since
> 0.95 in mutateRow (
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations) ).
> You can do multiple puts and deletes and they will be performed atomically.
> So you can remove qualifiers and put new ones.
>
> Mohamed
>
>
> On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren <st...@gmail.com>wrote:
>
>> What would you suggest? I want the operation to be atomic.
>>
>>
>> On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu <yu...@gmail.com> wrote:
>>
>> > What is the maximum number of versions do you allow for the underlying
>> > table ?
>> >
>> > Thanks
>> >
>> > On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren <stoffe@gmail.com
>> > >wrote:
>> >
>> > > Hi
>> > >
>> > > Is it possible to completely overwrite/replace a row in a single
>> _atomic_
>> > > action? Already existing columns and qualifiers should be removed if
>> they
>> > > do not exist in the data inserted into the row.
>> > >
>> > > The only way to do this is to first delete the row then insert new
>> data
>> > in
>> > > its place, correct? Or is there an operation to do this?
>> > >
>> > > Cheers,
>> > > -Kristoffer
>> > >
>> >
>>
>
>

Re: Overwrite a row

Posted by Mohamed Ibrahim <mi...@mibrahim.net>.

I just noticed you want to delete as well. I think that has been supported
since 0.95 via mutateRow (
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
).
You can combine multiple puts and deletes for a single row, and they will be
performed atomically, so you can remove old qualifiers and put new ones.
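
To make "performed atomically" concrete, here is a toy in-memory model in plain Java. It is not HBase code: ToyRow is a hypothetical class, the row is a sorted map of qualifier to value, and a single lock stands in for whatever the region server does internally. It only illustrates the contract described above, namely that a reader sees the row either before all of the deletes and puts or after all of them.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of a single row (not HBase code): deletes and puts are applied as one unit. */
final class ToyRow {
    private Map<String, String> cells = new TreeMap<>();

    /** Applies all deletes and then all puts while holding the row's lock. */
    synchronized void mutateRow(Set<String> deletes, Map<String, String> puts) {
        Map<String, String> next = new TreeMap<>(cells); // work on a copy of the row
        next.keySet().removeAll(deletes);                // drop qualifiers that are no longer wanted
        next.putAll(puts);                               // write the new values
        cells = next;                                    // publish the whole change in one step
    }

    /** Returns a snapshot; it never reflects a half-applied mutation. */
    synchronized Map<String, String> read() {
        return new TreeMap<>(cells);
    }

    public static void main(String[] args) {
        ToyRow row = new ToyRow();
        row.mutateRow(Set.of(), Map.of("a", "1", "b", "2"));    // initial contents
        row.mutateRow(Set.of("a"), Map.of("b", "3", "c", "4")); // replace: drop a, set b and c
        System.out.println(row.read());                         // prints {b=3, c=4}
    }
}
```

In the real client, the analogue would be adding Put and Delete objects for the same row key to one RowMutations and passing it to mutateRow().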

Mohamed

On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren <st...@gmail.com>wrote:

> What would you suggest? I want the operation to be atomic.
>
>
> On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > What is the maximum number of versions do you allow for the underlying
> > table ?
> >
> > Thanks
> >
> > On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren <stoffe@gmail.com
> > >wrote:
> >
> > > Hi
> > >
> > > Is it possible to completely overwrite/replace a row in a single
> _atomic_
> > > action? Already existing columns and qualifiers should be removed if
> they
> > > do not exist in the data inserted into the row.
> > >
> > > The only way to do this is to first delete the row then insert new data
> > in
> > > its place, correct? Or is there an operation to do this?
> > >
> > > Cheers,
> > > -Kristoffer
> > >
> >
>

Re: Overwrite a row

Posted by Kristoffer Sjögren <st...@gmail.com>.

What would you suggest? I want the operation to be atomic.


On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu <yu...@gmail.com> wrote:

> What is the maximum number of versions do you allow for the underlying
> table ?
>
> Thanks
>
> On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren <stoffe@gmail.com
> >wrote:
>
> > Hi
> >
> > Is it possible to completely overwrite/replace a row in a single _atomic_
> > action? Already existing columns and qualifiers should be removed if they
> > do not exist in the data inserted into the row.
> >
> > The only way to do this is to first delete the row then insert new data
> in
> > its place, correct? Or is there an operation to do this?
> >
> > Cheers,
> > -Kristoffer
> >
>

Re: Overwrite a row

Posted by Ted Yu <yu...@gmail.com>.

What is the maximum number of versions you allow for the underlying
table?

Thanks

On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren <st...@gmail.com>wrote:

> Hi
>
> Is it possible to completely overwrite/replace a row in a single _atomic_
> action? Already existing columns and qualifiers should be removed if they
> do not exist in the data inserted into the row.
>
> The only way to do this is to first delete the row then insert new data in
> its place, correct? Or is there an operation to do this?
>
> Cheers,
> -Kristoffer
>