You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/07/15 02:18:02 UTC

[jira] [Created] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HBASE-4102
                 URL: https://issues.apache.org/jira/browse/HBASE-4102
             Project: HBase
          Issue Type: New Feature
            Reporter: stack


Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.

It could be done as a coprocessor but this feels more like a fundamental feature.

Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-4102:
---------------------------------

    Attachment: 4102.txt

Here's a first patch.
I hopefully minimized the copying and creation of byte[] in the regionserver.
I added a new constructor to KeyValue to be able to create an empty but pre-sized KeyValue that can be filled later.

There's a lot of boilerplate that is very similar between Put, Increment, and now Append. Could think about factoring some more of it out.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>         Attachments: 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126274#comment-13126274 ] 

Hudson commented on HBASE-4102:
-------------------------------

Integrated in HBase-TRUNK #2322 (See [https://builds.apache.org/job/HBase-TRUNK/2322/])
    HBASE-4102 atomicAppend: A put that appends to the latest version of a cell

larsh : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Append.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Put.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestIncrement.java

                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126203#comment-13126203 ] 

Lars Hofhansl commented on HBASE-4102:
--------------------------------------

Created HBASE-4583 for RWCC discussion.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Resolved] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-4102.
----------------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed

Committed to trunk
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Jonathan Gray (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126141#comment-13126141 ] 

Jonathan Gray commented on HBASE-4102:
--------------------------------------

I think unifying Put and Append is not support important.  It would be good to unify Increment and Append, maybe even CheckAndPut/Delete?  A generic atomic op thing.

For the attributes, I think we just need a convention for system attributes, for example, they are preceded by an _ underscore.  And then we can put all the used attributes into HConstants for easy tracking.

Let's open another JIRA to integrate RWCC w/ Append and possibly Increment as well.  We can discuss there.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-4102:
---------------------------------

    Assignee: Lars Hofhansl

This seems like a simple useful addition.
Should this be in 0.92 or 0.94?
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Work started] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Work started) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-4102 started by Lars Hofhansl.

> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-4102:
---------------------------------

    Fix Version/s: 0.94.0
    
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125134#comment-13125134 ] 

Lars Hofhansl commented on HBASE-4102:
--------------------------------------

All tests pass.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-4102:
---------------------------------

    Attachment: 4102-v1.txt

Thanks Ted...
New change addressing all your points.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123876#comment-13123876 ] 

Ted Yu commented on HBASE-4102:
-------------------------------

{code}
+/**
+ * Testing of HRegion.incrementColumnValue
+ *
+ */
+public class TestAtomicOperation extends HBaseTestCase {
{code}
Javadoc should be updated for the new test.
For Mutation.java:
{code}
+   * @return the number of different families included in this put
+   */
+  public int numFamilies() {
{code}
I think put in javadoc should be mutation.
For Append.readFields():
{code}
+    if (version > APPEND_VERSION) {
+      throw new IOException("version not supported");
{code}
Value of version should be included in the exception.
For RegionCoprocessorHost.postAppend():
{code}
+   * @param appent Append object
{code}
there was a typo above.
For HRegion.append():
{code}
+  public Result append(Append append, Integer lockid, boolean writeToWAL)
+      throws IOException {
...
+    checkRow(row, "increment");
{code}
Second parameter above should be "append"

                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>         Attachments: 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123572#comment-13123572 ] 

Lars Hofhansl commented on HBASE-4102:
--------------------------------------

I have this working now.

But now I realized two things:
1. I modeled it after the old ICV. I assume we want something like the new Increment API.
2. Is this something that even want to build into HBase? Or should a user implement this with a coprocessor endpoint? (It would be possible to do with a coprocessor, albeit not quite as efficient as an endpoint would have no access to the Stores.

                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125374#comment-13125374 ] 

Lars Hofhansl commented on HBASE-4102:
--------------------------------------

Agree with trunk only (otherwise never get 0.92 out of the door).

bq. Let's put all the polish on before putting it in an official release.

So you're saying the RWCC integration and refactoring should go in before this is committed to trunk?
I would offer to do the RWCC integration right away, but this seems like a special case where data in the memstore is changed in place (same as Increment), and I am not sure about the RWCC semantics in that case.

I'd say we put this in as is, and then think about RWCC and refactoring for this and Increment.

                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125991#comment-13125991 ] 

stack commented on HBASE-4102:
------------------------------

Fix copyright on commit (Remove it I'd say.. this line "Copyright 2010 The Apache Software Foundation"... these are no longer necessary ... just the license).

I like your no-cluster-spinup test of testAppend in HRegion and the "Ultimate Answer to the Ultimate Question of Life,"

This seems like a good idea '+  // TODO: refactor to derive from Put?' but I'm ok w/ it going in as is.

Add Append class comment when you commit.  Say basically what its about.

+1 on commit.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123618#comment-13123618 ] 

stack commented on HBASE-4102:
------------------------------

@Lars On whether it should be 0.92 or 0.94, I'm thinking 0.94 (because people are watching me -- J-D and Todd will kill me if I commit a new feature to 0.92).  That said, I think at least SU will patch their 0.92 with this patch to get this feature; we need it.

I think we want it like new increment API.

On doing it as a CP, the argument is that this is a fundamental rather than something to do as a CP.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126137#comment-13126137 ] 

Lars Hofhansl commented on HBASE-4102:
--------------------------------------

At closer look Append and Put are subtly different when it comes to version handling (so they probably will still need to have separate implementations of readFields and write (which is not the bulk of the class, as I moved the other common code into Mutation and OperationWithAttributes as part of HBASE-4347).

There're still opportunities to unify some common code between Put and Increment (and now Append), and as you say that is probably for another jira.

Yet another question is the use of Attributes to add flags to Operations (get/scan/put/increment/append). It's nice because it allows adding flags without changing RPC versions. On the other hand it relies on a user never using the attribute that we're using for a flag. I am wondering whether should eat an incompatibility once and add "system attributes" (or something) that are independent from the user attributes. Then we can add flags to these operation without RPC changes and with interfering with user attributes.
If we do this, it might be good to do that with 0.92.

I'd also like to understand what RWCC means for the upsert store operation.

That all said... I fixed the copyright notice and added more comments, and will check this into trunk soon.

                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125144#comment-13125144 ] 

Ted Yu commented on HBASE-4102:
-------------------------------

+1 on patch v1.
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry

Posted by "Jonathan Gray (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125225#comment-13125225 ] 

Jonathan Gray commented on HBASE-4102:
--------------------------------------

This is really nice Lars.  I'd love to see integration with RWCC and to somehow unify the code with Increment.  But I'm okay with committing this and filing a follow-up JIRA.

I'm also going to backport this into my local 92 branch but I think it should only be committed to trunk.  Let's put all the polish on before putting it in an official release.

Nice work!
                
> atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entry
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4102
>                 URL: https://issues.apache.org/jira/browse/HBASE-4102
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>         Attachments: 4102-v1.txt, 4102.txt
>
>
> Its come up a few times that clients want to add to an existing cell rather than make a new cell each time.  At our place, the frontend keeps a list of urls a user has visited -- their md5s -- and updates it as user progresses.  Rather than read, modify client-side, then write new value back to hbase, it would be sweet if could do it all in one operation in hbase server.  TSDB aims to be space efficient.  Rather than pay the cost of the KV wrapper per metric, it would rather have a KV for an interval an in this KV have a value that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the client wants to see the resulting cell; often a client won't want to see the result and in this case, why pay the price formulating and delivering a response that client just drops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira