Posted to dev@hbase.apache.org by "Lars Francke (JIRA)" <ji...@apache.org> on 2009/12/05 00:39:20 UTC

[jira] Updated: (HBASE-1744) Thrift server to match the new java api.

     [ https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Francke updated HBASE-1744:
--------------------------------

    Attachment: Hbase.thrift

This is my first stab at a completely new (again, based on Tim's experimental idea) Thrift API. Quite a few points came up and I'd be interested in feedback:

* So far I have not included Filters in any way, as there is no struct subclassing in Thrift, so I'd need to come up with one data structure for all Filters. I thought of using a list of structs, each containing a FilterType (enum) and a list or map of string options that would have to be parsed. That is probably not the best solution, so if anyone has a better idea, go ahead.
* A few things that are currently missing that I plan to add: Row lock, Scanner caching, maxVersions, incrementColumnValue and checkAndPut
* A few things that I'm not yet sure about: the async version of createTable, closeRegion (both from HBaseAdmin), getServerInfo from ClusterStatus, and quite a few methods from HTable that are not in HTableInterface (getRegionLocation, getStartKeys, getEndKeys, getRegionsInfo)
* What to do with autoFlush? I could easily add a method disableAutoflush or something like that, but what if flushCommits is never called? How does HBase behave? I'd have to return an identifier for the table so that it can be reused in subsequent calls (just like the scanner interface)
* I changed scanners to return longs before realizing that scanners time out after 60 seconds (default), so I'll probably change it back to ints
* Currently scanners are not cleaned up. I plan on adding a cleanup thread that automatically closes/removes old scanners from the list
* Results are returned in TResult objects. You currently have to check isSetRow or whether values.size == 0. I am thinking of adding a boolean field "empty", as this should be more reliable, but as TResults will be sent quite often this would probably cause a noticeable overhead (I'm not sure how well an optional field is handled in Thrift)
* Some structs have the "T" as a prefix to avoid name clashes with HBase (Get -> TGet) but some don't (HTableDescriptor -> TableDescriptor, HColumnDescriptor -> ColumnDescriptor). Should I use the "T" everywhere to be consistent?
* The "delete" method had to be called something else in Thrift as "delete" is a reserved keyword. Tim used "delet" but I think that's quite confusing (I thought it was a typo :) ), so I opted for "deleteSingle" instead, but I'm open to suggestions
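
To make the Filter idea from the first bullet concrete, here is a minimal Python sketch of the "one struct for all filters" encoding: a type tag plus a map of string options that the server has to parse. The enum members, the option keys ("prefix", "pageSize") and the parse_filter helper are hypothetical names chosen for illustration; none of them come from the attached Hbase.thrift.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class FilterType(Enum):
    # Hypothetical members; a real enum would list whichever filters get exposed.
    PREFIX = 1
    PAGE = 2


@dataclass
class TFilter:
    # One generic struct for every filter: a type tag plus string options.
    type: FilterType
    options: Dict[str, str] = field(default_factory=dict)


def parse_filter(f: TFilter) -> dict:
    """Sketch of the server-side step: turn string options into typed values."""
    if f.type is FilterType.PREFIX:
        return {"kind": "prefix", "prefix": f.options["prefix"].encode("utf-8")}
    if f.type is FilterType.PAGE:
        return {"kind": "page", "page_size": int(f.options["pageSize"])}
    raise ValueError(f"unsupported filter type: {f.type}")


# A scan would carry a list of such entries.
filters: List[TFilter] = [
    TFilter(FilterType.PREFIX, {"prefix": "row-"}),
    TFilter(FilterType.PAGE, {"pageSize": "25"}),
]
parsed = [parse_filter(f) for f in filters]
```

The sketch also shows the weak spot of this encoding: every option travels as a string, so a missing key or a malformed number only surfaces when the server parses it.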

Well as I said: My first stab at this - I'm not very familiar with HBase so I hope I got most of it right. If there are no major concerns I'll go ahead and implement the rest of it (parts are done already) and upload a first patch.

Thanks for all the help on IRC (larsgeorge, dj_ryan, jdcryans and the others)!

> Thrift server to match the new java api.
> ----------------------------------------
>
>                 Key: HBASE-1744
>                 URL: https://issues.apache.org/jira/browse/HBASE-1744
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: thrift
>            Reporter: Tim Sell
>            Assignee: Tim Sell
>             Fix For: 0.21.0
>
>         Attachments: Hbase.thrift, thriftexperiment.patch
>
>
> The mutateRows, etc. API is a little confusing compared to the new, cleaner Java client.
> Thinking of ways to make a Thrift client that is just as elegant, something like:
> void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
> with:
> struct TColumn {
>   1:Bytes family,
>   2:Bytes qualifier,
>   3:i64 timestamp
> }
> struct TPut {
>   1:Bytes row,
>   2:map<TColumn, Bytes> values
> }
> This creates a more verbose RPC than if the columns in TPut were just map<Bytes, map<Bytes, Bytes>>, but that form is harder to fit timestamps into while staying intuitive from, say, Python.
> Presumably the goal of a thrift gateway is to be easy first.
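
The trade-off in the quoted description can be sketched from the Python side. The dataclasses below are hand-written stand-ins for the TColumn and TPut structs, not Thrift-generated code, and the to_nested helper and sample values are hypothetical. TColumn is declared frozen so that it can serve as a dict key; whether generated Thrift classes can be used that way is not checked here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)  # frozen makes instances hashable, so they can be dict keys
class TColumn:
    family: bytes
    qualifier: bytes
    timestamp: int


@dataclass
class TPut:
    row: bytes
    values: Dict[TColumn, bytes]


def to_nested(put: TPut) -> Dict[bytes, Dict[bytes, bytes]]:
    """Convert to the family -> qualifier -> value shape; timestamps are dropped."""
    nested: Dict[bytes, Dict[bytes, bytes]] = {}
    for column, value in put.values.items():
        nested.setdefault(column.family, {})[column.qualifier] = value
    return nested


# Struct-keyed form: more verbose, but each cell carries its own timestamp.
put = TPut(
    row=b"row1",
    values={
        TColumn(b"cf", b"a", 1000): b"v1",
        TColumn(b"cf", b"b", 2000): b"v2",
    },
)

# Nested-map form: compact, but with no slot for a timestamp.
nested = to_nested(put)
```

Converting the struct-keyed form to the nested map loses the per-cell timestamps, which is the "harder to fit timestamps into" point made above.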
