Posted to notifications@accumulo.apache.org by "Drew Farris (JIRA)" <ji...@apache.org> on 2012/10/11 15:31:03 UTC

[jira] [Created] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Drew Farris created ACCUMULO-803:
------------------------------------

             Summary: Add Reverse Logical Time as a Time Type
                 Key: ACCUMULO-803
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-803
             Project: Accumulo
          Issue Type: Improvement
          Components: tserver
    Affects Versions: 1.4.2
            Reporter: Drew Farris
            Assignee: Keith Turner
            Priority: Minor
         Attachments: ACCUMULO-803.patch

In a context where we are doing aggregation/combination of multiple values for a given key it may be useful to iterate over the values associated with that key in the order in which the mutations were applied (FIFO), instead of the FILO order that seems to occur when using {{TimeType.LOGICAL}}. 
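
To illustrate the ordering difference, here is a minimal sketch, assuming only that versions of a key sort by timestamp in descending order and that logical time is a one-up counter. The class and method names are made up for illustration and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

public class LogicalTimeOrdering {
  // Returns values in scan order, assuming versions of one key sort by
  // timestamp descending (largest timestamp first).
  public static List<String> scanOrder(List<String> arrivals, boolean reverse) {
    TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());
    long counter = 0;
    for (String value : arrivals) {
      counter++; // one-up logical counter, incremented per mutation
      // Assumption: "reverse logical time" means counting down from Long.MAX_VALUE.
      long timestamp = reverse ? Long.MAX_VALUE - counter : counter;
      versions.put(timestamp, value);
    }
    return new ArrayList<>(versions.values());
  }

  public static void main(String[] args) {
    List<String> arrivals = List.of("first", "second", "third");
    System.out.println(scanOrder(arrivals, false)); // [third, second, first]
    System.out.println(scanOrder(arrivals, true));  // [first, second, third]
  }
}
```

Counting up yields the newest value first (FILO); counting down yields the values in the order they were applied (FIFO).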

I encountered this when implementing a checkAndPut operation that would ensure that the previous value was as expected before putting a new value. In this case, if the previous value was not as expected, the mutation would be ignored. 
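
A rough sketch of why application order matters for the checkAndPut case, assuming a hypothetical encoding in which each stored value carries the expected previous value and the replacement. `Request` and `combine` are illustration-only names, not part of the patch or of Accumulo.

```java
import java.util.List;

public class CheckAndPutSketch {
  // Hypothetical encoding of one checkAndPut request.
  public static final class Request {
    final String expected;    // value the writer expected to find
    final String replacement; // value the writer wants to store

    public Request(String expected, String replacement) {
      this.expected = expected;
      this.replacement = replacement;
    }
  }

  // Folds requests in the order they were applied (oldest first). A request
  // whose expected value does not match the current value is ignored.
  public static String combine(String initial, List<Request> fifoRequests) {
    String current = initial;
    for (Request r : fifoRequests) {
      if (r.expected.equals(current)) {
        current = r.replacement;
      }
    }
    return current;
  }

  public static void main(String[] args) {
    List<Request> requests = List.of(
        new Request("a", "b"),  // applies: a -> b
        new Request("a", "c"),  // ignored: current value is b, not a
        new Request("b", "d")); // applies: b -> d
    System.out.println(combine("a", requests)); // d
  }
}
```

Folding the same three requests newest-first would instead end at `c`, which is why the iteration order seen by the combiner matters here.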

Perhaps it is useful in a general case?


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "David Medinets (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485186#comment-13485186 ] 

David Medinets commented on ACCUMULO-803:
-----------------------------------------

I feel mucking with timestamps is asking for trouble. I'd rather see an effort to make the Key semantics changeable in a controlled fashion using a plug-in architecture. Then changes to support FIFO or other sorting mechanisms would use a well-defined API. Changes (and bugs) for each mechanism would be isolated.
                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477029#comment-13477029 ] 

Keith Turner commented on ACCUMULO-803:
---------------------------------------

bq.  does that imply that we might want a different timestamp scheme on different column families?

Currently each tablet's timestamp is stored in the metadata table.  If tablets had to persist an indeterminate number of timestamps, then I do not think we could safely store that in the metadata table.  We would need to store it somewhere else.  It's nice storing the info in the metadata table because you can atomically update the timestamp on minor compaction and bulk import by putting it in the same mutation.  So where would this info be stored, and how would it be atomically updated after minor compaction and bulk import?

We would also have to be careful not to blow out memory on tservers, e.g. if each tablet on a tserver is keeping track of timestamps for many columns in memory.  We would need to cache timestamp info from the persisted store to avoid this problem.
                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477246#comment-13477246 ] 

Drew Farris commented on ACCUMULO-803:
--------------------------------------

bq. looks good a few comments.

Sounds good, I'll update the patch, docs and MockTable per suggestions.

                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478075#comment-13478075 ] 

Keith Turner commented on ACCUMULO-803:
---------------------------------------

bq.  do you think people might want to set multiple different transformations for different column updates in the same mutation?

One more consideration.  This would add another set of put methods to Mutation, which already has a lot of put methods.  
                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Adam Fuchs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477018#comment-13477018 ] 

Adam Fuchs commented on ACCUMULO-803:
-------------------------------------

As a thought exercise, let's think about how far this might go. Since Combiners work on a column family, does that imply that we might want a different timestamp scheme on different column families? Should this turn into another plugin framework for assigning timestamps on tablet servers, or perhaps a timestamp data definition language for column families?
                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485136#comment-13485136 ] 

Drew Farris commented on ACCUMULO-803:
--------------------------------------

I agree, I don't feel comfortable allowing users to set column family ordering on a per mutation basis. 

With the API you propose, do you feel that fifo ordering would be best achieved by manipulating timestamps as is done in the original patch? If so, does this introduce any strangeness in the relationship between timeType and fifoColumnFamilies? Could it make sense to use TimeType.MILLIS and a fifo column family? Perhaps yes.

I feel a little uncomfortable with the fact that column families are dynamic but we would require the user to specify a set of column families at creation time if they want fifo behavior. This seems to run counter to the spirit of column families (and how they're used) in Accumulo. 

One way to solve this problem would be to add setColumnFamilyOrdering(String cf, boolean fifo) (or something like it) to TableOperations. This might not be a good idea because we run into the same problem we have with Mutations: the user could shoot themselves in the foot if they set the ordering on a column family to be different than the default for a table that already contains data. 
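
To make the foot-gun concrete, here is a small sketch of what could happen if the ordering for a column family were switched after data had already been written. It assumes timestamps sort descending and that FIFO is implemented by subtracting a one-up counter from Long.MAX_VALUE; the class is purely illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

public class OrderingSwitchSketch {
  // Writes two versions with count-up time, then two more after the
  // (hypothetical) ordering switch, and returns the resulting scan order.
  public static List<String> scanAfterSwitch() {
    TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());
    long counter = 0;

    // Written while the column family used the default (count-up) ordering.
    versions.put(++counter, "v1");
    versions.put(++counter, "v2");

    // Written after the ordering was switched to FIFO (count-down).
    versions.put(Long.MAX_VALUE - (++counter), "v3");
    versions.put(Long.MAX_VALUE - (++counter), "v4");

    return new ArrayList<>(versions.values());
  }

  public static void main(String[] args) {
    System.out.println(scanAfterSwitch()); // [v3, v4, v2, v1]
  }
}
```

The result is neither FIFO nor FILO across the whole history, which is the inconsistency the user would then have to deal with.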

So, I have to admit that I lean closest to the original mechanism, but I could be biased because I'm the patch author :)

Thoughts?








                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476990#comment-13476990 ] 

Keith Turner commented on ACCUMULO-803:
---------------------------------------

Looks good, a few comments.

   * can you make this patch against trunk/1.5 instead of 1.4?  I do not think this is something we would want to add to 1.4.
   * the user manual would need some mention of this new feature, in docs/src/user_manual/chapters/table_configuration.tex I think
   * in MockTable I think you should subtract from Long.MAX_VALUE instead of Integer.MAX_VALUE for consistency, even though it's an int that is being subtracted
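
A minimal sketch of the MockTable suggestion, assuming the mock keeps an int counter (as the comment implies). The class, field and method names here are hypothetical and do not reflect the actual MockTable code.

```java
public class MockReverseTimeSketch {
  // Hypothetical mock counter, kept as an int as the review comment implies.
  private int mutationCount = 0;

  // Next reverse logical timestamp. The int operand is widened to long
  // before the subtraction, so the result falls in the same range that a
  // long-based counter subtracted from Long.MAX_VALUE would produce.
  public long nextReverseTimestamp() {
    mutationCount++;
    return Long.MAX_VALUE - mutationCount;
  }

  public static void main(String[] args) {
    MockReverseTimeSketch mock = new MockReverseTimeSketch();
    long first = mock.nextReverseTimestamp();
    long second = mock.nextReverseTimestamp();
    System.out.println(first > second); // true
  }
}
```

Subtracting from Integer.MAX_VALUE would also count down, but the timestamps would then sit in a different range from ones derived from Long.MAX_VALUE.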
                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Adam Fuchs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477087#comment-13477087 ] 

Adam Fuchs commented on ACCUMULO-803:
-------------------------------------

You could do both count up and count down logical time on many different groups of keys in the same tablet while persisting only one timestamp. If the plugin framework supports a one-up counter, a plugin can then subtract that from max long or transform it otherwise as necessary for each timestamp group. This would maintain the monotonicity property of logical time on any of the groups of entries at the cost of introducing sparsity (which doesn't really matter).
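
A sketch of the single-counter idea, assuming a "group" is identified by a string and that count-down groups subtract the shared counter from Long.MAX_VALUE. All names are hypothetical; no such plugin framework exists in the patch.

```java
import java.util.HashSet;
import java.util.Set;

public class SharedCounterSketch {
  private long counter = 0; // the only value that would need to be persisted
  private final Set<String> reversedGroups;

  public SharedCounterSketch(Set<String> reversedGroups) {
    this.reversedGroups = new HashSet<>(reversedGroups);
  }

  // Every call advances the one shared counter; count-down groups see it
  // subtracted from Long.MAX_VALUE, count-up groups see it unchanged.
  public long nextTimestamp(String group) {
    long next = ++counter;
    return reversedGroups.contains(group) ? Long.MAX_VALUE - next : next;
  }

  public static void main(String[] args) {
    SharedCounterSketch time = new SharedCounterSketch(Set.of("down"));
    System.out.println(time.nextTimestamp("up"));                     // 1
    System.out.println(Long.MAX_VALUE - time.nextTimestamp("down"));  // 2
    System.out.println(time.nextTimestamp("up"));                     // 3
    System.out.println(Long.MAX_VALUE - time.nextTimestamp("down"));  // 4
  }
}
```

Each group stays monotonic (1, 3 going up; MAX-2, MAX-4 going down), and the skipped values are the sparsity mentioned above.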
                

[jira] [Updated] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner updated ACCUMULO-803:
----------------------------------

    Assignee: Drew Farris  (was: Keith Turner)
      Status: Patch Available  (was: Open)
    

[jira] [Updated] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris updated ACCUMULO-803:
---------------------------------

    Attachment: ACCUMULO-803.patch

Updated patch against trunk (1.5.x) incorporating recommended changes.
                

[jira] [Updated] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris updated ACCUMULO-803:
---------------------------------

    Affects Version/s:     (was: 1.4.2)
                       1.5.0
    


[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477130#comment-13477130 ] 

Keith Turner commented on ACCUMULO-803:
---------------------------------------

bq. You could do both count up and count down logical time on many different groups of keys in the same tablet with only persisting one timestamp

That would work.  You can go up or down in a sparse manner.  Does not seem like plugins are needed.  Are there other operations we want to support besides FIFO and FILO?  Seems operations like multiplication, addition, modulo, and division would not be useful.  Randomizing the timestamp could be accomplished on the client side.  Would be nice to have use cases for plugin framework.
                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477969#comment-13477969 ] 

Keith Turner commented on ACCUMULO-803:
---------------------------------------

bq.  Since Combiners work on a column family, does that imply that we might want a different timestamp scheme on different column families?

We might also consider setting an option on mutations to reverse system set time.  This gives users the ability to target this in very tailored ways.  Users could apply it to certain column family prefixes, to column qualifiers, etc.  It gives a lot of flexibility w/o any server side code.  Could add something like the following to mutation.

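{code:java}
// transformations that can be applied to system set timestamps
enum TimestampTransformations {
   NONE,
   REVERSE
}

class Mutation {
  private TimestampTransformations timestampTransformation = TimestampTransformations.NONE;

  //transformation that will be applied to system set timestamps
  public void setTimestampTransformation(TimestampTransformations tt) {
    this.timestampTransformation = tt;
  }
}
{code}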

I think the only drawback with this is that it does not support setting system time for bulk insert.  The current patch will support that.
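
As a rough illustration of how such an option might be honored when the system sets the time, here is a sketch. It assumes REVERSE means subtracting the system-assigned value from Long.MAX_VALUE; the `apply` helper is hypothetical and the enum is redeclared so the example stands alone.

```java
public class TransformSketch {
  // Redeclared here so the sketch is self-contained.
  public enum TimestampTransformations { NONE, REVERSE }

  // Hypothetical helper: applies the mutation's requested transformation
  // to a system-assigned timestamp.
  public static long apply(TimestampTransformations tt, long systemTime) {
    switch (tt) {
      case REVERSE:
        return Long.MAX_VALUE - systemTime;
      case NONE:
      default:
        return systemTime;
    }
  }

  public static void main(String[] args) {
    System.out.println(apply(TimestampTransformations.NONE, 5L)); // 5
    System.out.println(apply(TimestampTransformations.REVERSE, 5L) == Long.MAX_VALUE - 5L); // true
  }
}
```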
                

[jira] [Updated] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris updated ACCUMULO-803:
---------------------------------

    Attachment: ACCUMULO-803.patch

Rough cut at reverse logical time
                

[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485111#comment-13485111 ] 

Keith Turner commented on ACCUMULO-803:
---------------------------------------

Thinking about setting the order in the mutation, I realized it gives users rope to hang themselves with.  A user could write two programs, one that sets the order of column foo to FIFO and another that sets it to LIFO.  The user did not intend to do this, but after running both, the data is in the system and they have to deal with it.  The table below shows what has been proposed so far and what I see as the pros and cons.

|Timestamp ordering method|Pros|Cons|
|Set order at table level| Simple and works w/ bulk import | May force user to store data they want commingled in separate tables |
|Set order per column in table config | Ordering for a column is always consistent. Can provide granularity w/ bulk import. | Complex to set order in complex ways (e.g. via a plugin). Introduces some server-side computation overhead |
|Set order per column in mutation | Easy to set order in complex ways | Can easily write code that sets per-column order inconsistently. Does not have the same granularity w/ bulk import; the user can only specify whether an entire file is FIFO or LIFO |

The table above assumes the order decision method in the table config is immutable.  If the config were mutable, it would suffer from the same problem as setting the order in the mutation.  The complexity of configuring order in the table config can be avoided if a really simple config is used.  For example, with no plugins, the user just specifies at table creation time which column families are FIFO.  This does not give the user the same flexibility as a plugin or setting it on the mutation, but it's better than setting FIFO for a whole table.

So I am now thinking a simple, immutable config at table creation time may be best.  We could add the following to table operations.

{code:java}
  public void create(String tableName, boolean versioningIter, TimeType timeType, Set<Text> fifoColumnFamilies);
{code}
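
To illustrate how such a creation-time config might be consulted when a timestamp is assigned, here is a minimal sketch. The class FifoFamilyConfig and its method timestampFor are hypothetical names, not part of the Accumulo API, and plain String stands in for Hadoop's Text so the example is self-contained. It assumes logical time is a non-negative counter and that entries with larger timestamps sort first.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of an immutable per-table FIFO config; not Accumulo code. */
final class FifoFamilyConfig {
  private final Set<String> fifoFamilies;

  FifoFamilyConfig(Set<String> fifoFamilies) {
    // Defensive copy: the set of FIFO families cannot change after creation.
    this.fifoFamilies = Collections.unmodifiableSet(new HashSet<>(fifoFamilies));
  }

  /**
   * Returns the timestamp to store for an update. For FIFO families the
   * logical time is reversed, so earlier writes get larger timestamps.
   */
  long timestampFor(String columnFamily, long logicalTime) {
    return fifoFamilies.contains(columnFamily)
        ? Long.MAX_VALUE - logicalTime
        : logicalTime;
  }
}
```

Because the set is copied once and never modified, two client programs cannot disagree about the order of a column family, which is the inconsistency the mutation-level approach allows.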

                
> Add Reverse Logical Time as a Time Type
> ---------------------------------------
>
>                 Key: ACCUMULO-803
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-803
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.5.0
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>         Attachments: ACCUMULO-803.patch, ACCUMULO-803.patch
>
>
> In a context where we are doing aggregation/combination of multiple values for a given key it may be useful to iterate over the values associated with that key in the order in which the mutations were applied (FIFO), instead of the FILO order that seems to occur when using {{TimeType.LOGICAL}}. 
> I encountered this when implementing a checkAndPut operation that would ensure that the previous value was expected before putting a new value. In this case, if the previous value was not as expected, the mutation would be ignored.
> Perhaps it is useful in a general case?


[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Adam Fuchs (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478059#comment-13478059 ] 

Adam Fuchs commented on ACCUMULO-803:
-------------------------------------

I like the setting of timestamp transformation in the Mutation, but do you think people might want to set multiple different transformations for different column updates in the same mutation?
                
> Add Reverse Logical Time as a Time Type
> ---------------------------------------
>
>                 Key: ACCUMULO-803
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-803
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.5.0
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>         Attachments: ACCUMULO-803.patch, ACCUMULO-803.patch
>
>
> In a context where we are doing aggregation/combination of multiple values for a given key it may be useful to iterate over the values associated with that key in the order in which the mutations were applied (FIFO), instead of the FILO order that seems to occur when using {{TimeType.LOGICAL}}. 
> I encountered this when implementing a checkAndPut operation that would ensure that the previous value was expected before putting a new value. In this case, if the previous value was not as expected, the mutation would be ignored.
> Perhaps it is useful in a general case?


[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478072#comment-13478072 ] 

Keith Turner commented on ACCUMULO-803:
---------------------------------------

bq. I like the setting of timestamp transformation in the Mutation, but do you think people might want to set multiple different transformations for different column updates in the same mutation?

I like it too, but the bulk import aspect is still tormenting me.  I was being miserly with computation and storage when thinking about setting it at the mutation level.  Eric just modified mutation so that the system timestamp is set once per mutation.  It used to set the same timestamp for each column.  I wanted to maintain this slight boost in ingest performance.  I suppose we could still set the timestamp once per mutation and just interpret it differently if a flag is set for the column when getTimestamp() is called.

For bulk import, if the user requests logical time, it will set the same timestamp for all keys in the file.  We could make it reverse this timestamp for all keys in the file.  But I have not thought of a simple method to do something more granular.  It seems the two options are to modify the Key written by bulk import or to modify the special iterator that sets timestamps on bulk imported files when they are read.  I don't really like these options.  But we don't have to support anything more granular for bulk import.
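
A rough sketch of the "set once per mutation, interpret per column" idea follows. ColumnUpdateSketch is a hypothetical stand-in and is not Accumulo's actual ColumnUpdate class. The per-column flag and the Long.MAX_VALUE reversal are assumptions made only for illustration.

```java
/** Hypothetical stand-in for a column update; not Accumulo's ColumnUpdate. */
final class ColumnUpdateSketch {
  // Assumed per-column flag: true means this column wants reversed time.
  private final boolean reverseTime;

  ColumnUpdateSketch(boolean reverseTime) {
    this.reverseTime = reverseTime;
  }

  /**
   * The mutation carries a single system timestamp; each column only decides
   * how to interpret it, so nothing extra is stored per column.
   */
  long getTimestamp(long mutationTimestamp) {
    return reverseTime ? Long.MAX_VALUE - mutationTimestamp : mutationTimestamp;
  }
}
```

The timestamp is still assigned once per mutation, so the ingest saving mentioned above is kept, at the cost of one branch per column when the timestamp is read.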


                
> Add Reverse Logical Time as a Time Type
> ---------------------------------------
>
>                 Key: ACCUMULO-803
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-803
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.5.0
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>         Attachments: ACCUMULO-803.patch, ACCUMULO-803.patch
>
>
> In a context where we are doing aggregation/combination of multiple values for a given key it may be useful to iterate over the values associated with that key in the order in which the mutations were applied (FIFO), instead of the FILO order that seems to occur when using {{TimeType.LOGICAL}}. 
> I encountered this when implementing a checkAndPut operation that would ensure that the previous value was expected before putting a new value. In this case, if the previous value was not as expected, the mutation would be ignored.
> Perhaps it is useful in a general case?


[jira] [Commented] (ACCUMULO-803) Add Reverse Logical Time as a Time Type

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485329#comment-13485329 ] 

Drew Farris commented on ACCUMULO-803:
--------------------------------------

I feel that reversing the assignment of timestamps is the safest approach. The alternative would involve modifying the code that sorts entries in rfiles, which I am a little reluctant to touch at this point. I don't know that there's an alternative that would be less invasive than reverse logical time.
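
To make the effect concrete, here is a small self-contained sketch, not taken from the attached patch. It relies on the fact that versions of the same key sort by timestamp in descending order, and it assumes logical time is a non-negative counter. ReverseLogicalTimeDemo and scanOrder are hypothetical names.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical demo of scan order under logical vs. reverse logical time. */
final class ReverseLogicalTimeDemo {
  /** Returns the values in the order a scan over one key would return them. */
  static List<String> scanOrder(List<String> valuesInWriteOrder, boolean reverse) {
    List<Map.Entry<Long, String>> entries = new ArrayList<>();
    long counter = 0;
    for (String value : valuesInWriteOrder) {
      long logical = counter++;
      // Reverse logical time counts down from Long.MAX_VALUE.
      long ts = reverse ? Long.MAX_VALUE - logical : logical;
      entries.add(new AbstractMap.SimpleEntry<>(ts, value));
    }
    // Sort unchanged: newest (largest) timestamp first.
    entries.sort((a, b) -> Long.compare(b.getKey(), a.getKey()));
    List<String> out = new ArrayList<>();
    for (Map.Entry<Long, String> e : entries) {
      out.add(e.getValue());
    }
    return out;
  }
}
```

With values written as a, b, c, scanOrder returns c, b, a under plain logical time and a, b, c under reverse logical time. Only the timestamp assignment changes. The sort itself is untouched.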
                
> Add Reverse Logical Time as a Time Type
> ---------------------------------------
>
>                 Key: ACCUMULO-803
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-803
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.5.0
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>         Attachments: ACCUMULO-803.patch, ACCUMULO-803.patch
>
>
> In a context where we are doing aggregation/combination of multiple values for a given key it may be useful to iterate over the values associated with that key in the order in which the mutations were applied (FIFO), instead of the FILO order that seems to occur when using {{TimeType.LOGICAL}}. 
> I encountered this when implementing a checkAndPut operation that would ensure that the previous value was expected before putting a new value. In this case, if the previous value was not as expected, the mutation would be ignored.
> Perhaps it is useful in a general case?
