Posted to issues@hbase.apache.org by "zezhou (JIRA)" <ji...@apache.org> on 2011/08/02 15:21:29 UTC

[jira] [Created] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

the problem in hbase thrift client when scan/get rows by timestamp
------------------------------------------------------------------

                 Key: HBASE-4155
                 URL: https://issues.apache.org/jira/browse/HBASE-4155
             Project: HBase
          Issue Type: Bug
          Components: thrift
    Affects Versions: 0.90.0
            Reporter: zezhou


I want to scan rows by a specified timestamp. I use the following hbase shell command:

scan 'testcrawl',{TIMESTAMP=>1312268202071} 
ROW                                         COLUMN+CELL                                                                                                                   
 put1.com                                   column=crawl:data, timestamp=1312268202071, value=<html>put1</html>                                                          
 put1.com                                   column=crawl:type, timestamp=1312268202071, value=html                                                                        
 put1.com                                   column=links:outlinks, timestamp=1312268202071, value=www.163.com;www.sina.com 


As expected, I get the rows whose timestamp is 1312268202071.
But when I use the Thrift client to do the same thing, the returned data is the rows whose timestamp is before the specified timestamp, which is not the same as the hbase shell. The following are the timestamps of the returned data:

1312179170000
1312268202059

I looked at the source in hbase/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java; it uses the following code to set the time parameter:

scan.setTimeRange(Long.MIN_VALUE, timestamp);

This causes the Thrift client to return rows from before the specified timestamp, not the rows with the specified timestamp.
But the hbase client and the avro client use the following code to set the time parameter:

scan.setTimeStamp(timestamp);

This returns the rows with the specified timestamp.

Is this a feature or a bug in the Thrift client?
If this is a feature, which method in the Thrift client can get the rows by a specified timestamp?
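
The difference between the two calls can be illustrated without a running cluster. The following is a minimal sketch, not HBase code: it assumes that a time range is half-open, [min, max), and that setting a single timestamp is equivalent to the range [timestamp, timestamp + 1), which is how the HBase client's TimeRange is understood to behave. The class and method names (TimestampSemantics, inRange, rangeUpTo, exactly) are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class TimestampSemantics {
    // Keeps the timestamps that fall in the half-open interval [min, max).
    // Assumption: this mirrors how a time range filters cell versions.
    static List<Long> inRange(long[] cellTimestamps, long min, long max) {
        List<Long> kept = new ArrayList<>();
        for (long ts : cellTimestamps) {
            if (ts >= min && ts < max) {
                kept.add(ts);
            }
        }
        return kept;
    }

    // Models scan.setTimeRange(Long.MIN_VALUE, timestamp):
    // everything strictly before the timestamp.
    static List<Long> rangeUpTo(long[] cellTimestamps, long timestamp) {
        return inRange(cellTimestamps, Long.MIN_VALUE, timestamp);
    }

    // Models scan.setTimeStamp(timestamp): only cells at exactly that timestamp.
    // (Overflow for timestamp == Long.MAX_VALUE is ignored in this sketch.)
    static List<Long> exactly(long[] cellTimestamps, long timestamp) {
        return inRange(cellTimestamps, timestamp, timestamp + 1);
    }

    public static void main(String[] args) {
        // The three timestamps that appear in this report.
        long[] cells = {1312179170000L, 1312268202059L, 1312268202071L};
        long requested = 1312268202071L;

        System.out.println(rangeUpTo(cells, requested));
        System.out.println(exactly(cells, requested));
    }
}
```

With the three timestamps from this report, the range form prints [1312179170000, 1312268202059] and the exact form prints [1312268202071], which matches what the Thrift client and the shell returned respectively.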


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "zezhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zezhou updated HBASE-4155:
--------------------------

    Attachment: patch.txt

This patch is based on the hbase-0.90.1-cdh3u0 version.


[jira] [Updated] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4155:
-------------------------

    Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078865#comment-13078865 ] 

Ted Yu commented on HBASE-4155:
-------------------------------

Looks like a certain test depends on the current behavior of getRowTs():
{code}
testAll(org.apache.hadoop.hbase.thrift.TestThriftServer)  Time elapsed: 31.038 sec  <<< ERROR!
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
        at org.apache.hadoop.hbase.thrift.TestThriftServer.doTestTableTimestampsAndColumns(TestThriftServer.java:202)
        at org.apache.hadoop.hbase.thrift.TestThriftServer.testAll(TestThriftServer.java:67)
{code}

We need to decide whether modifying the test accordingly is the way to go.


[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078442#comment-13078442 ] 

Jean-Daniel Cryans commented on HBASE-4155:
-------------------------------------------

Yeah that's an option... and then we deprecate it for the next major version? That means we also have to print out in bold letters, when people call that function, that it's going to be deprecated... although you might want to print that out only once. In any case we need to find a good way to communicate our intent to the users.

But now, looking at the IDL, I'm starting to think we should just change the behavior:

{noformat}
  /** 
   * Get the specified number of versions for the specified table,
   * row, and column.  Only versions less than or equal to the specified
   * timestamp will be returned.
   *
   * @return list of cells for specified row/column
   */
  list<TCell> getVerTs(
...
  /** 
   * Get all the data for the specified table and row at the specified
   * timestamp. Returns an empty list if the row does not exist.
   * 
   * @return TRowResult containing the row and map of columns to TCells
   */
  list<TRowResult> getRowTs(
...
  /** 
   * Get a scanner on the current table starting at the specified row and
   * ending at the last row in the table.  Return the specified columns.
   * Only values with the specified timestamp are returned.
   *
   * @return scanner id to be used with other scanner procedures
   */
  ScannerID scannerOpenTs(
{noformat}

getVerTs has the right documentation, but not scannerOpenTs and getRowTs since they also use the TimeRange (but the behavior we want is documented).


[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078663#comment-13078663 ] 

Ted Yu commented on HBASE-4155:
-------------------------------

@zezhou:
Except for getVerTs(), all the other getXX() methods should be changed.
Your patch was a plain diff which cannot be applied to the Apache repository.
Please generate patch using subversion or git.

Thanks


[jira] [Updated] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4155:
--------------------------

    Attachment: 4155.txt

This is the version I plan to commit.


[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078446#comment-13078446 ] 

Ted Yu commented on HBASE-4155:
-------------------------------

We should correct the implementation of scannerOpenTs and getRowTs.


[jira] [Updated] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "zezhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zezhou updated HBASE-4155:
--------------------------

    Attachment: patch.txt.svn

@Ted, sorry, this is my first time submitting a patch. I have re-generated the patch using svn.


[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440702#comment-13440702 ] 

stack commented on HBASE-4155:
------------------------------

@Ted Modify the test to match the new behavior I'd say.
                

[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078356#comment-13078356 ] 

Jean-Daniel Cryans commented on HBASE-4155:
-------------------------------------------

I dug back in the commits to see when this was introduced, and it seems that this was done as part of the uber refactoring done in HBASE-1304 two years ago. The behavior does seem broken tho, it should set the timestamp and not a time range (that would be another method). How it works right now seems counter-intuitive.

So I would be +1 on fixing it, but the main issue would be that some users might already rely on the current behavior (I know that we don't here).

Others using Thrift want to comment?


[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078365#comment-13078365 ] 

Ted Yu commented on HBASE-4155:
-------------------------------

I suggest we keep the current behavior.
ThriftServer.HBaseHandler has a reference to the Configuration object. We can introduce a new config option for setting an exact timestamp instead of a time range.
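
One way such an option could look is sketched below. This is only an illustration of the idea, not the attached patch: the key name hbase.thrift.scan.exact.timestamp is hypothetical, java.util.Properties stands in for the Hadoop Configuration object, and the helper returns the {min, max} pair that would then be passed to scan.setTimeRange(min, max). The default keeps the current range behavior.

```java
import java.util.Properties;

// Hypothetical sketch of a config switch between "time range" and
// "exact timestamp" behavior. Not part of HBase.
final class ThriftTimestampMode {
    // Hypothetical key name, chosen for this example only.
    static final String EXACT_TS_KEY = "hbase.thrift.scan.exact.timestamp";

    // Returns {minInclusive, maxExclusive} for the scan's time range.
    static long[] resolveRange(Properties conf, long timestamp) {
        boolean exact = Boolean.parseBoolean(conf.getProperty(EXACT_TS_KEY, "false"));
        if (exact) {
            // Exact timestamp, expressed as the half-open range [ts, ts + 1).
            return new long[] {timestamp, timestamp + 1};
        }
        // Current behavior: everything strictly before the timestamp.
        return new long[] {Long.MIN_VALUE, timestamp};
    }

    public static void main(String[] args) {
        long ts = 1312268202071L;
        Properties conf = new Properties();

        long[] range = resolveRange(conf, ts);
        System.out.println(range[0] + " .. " + range[1]);

        conf.setProperty(EXACT_TS_KEY, "true");
        range = resolveRange(conf, ts);
        System.out.println(range[0] + " .. " + range[1]);
    }
}
```

Keeping the default at false means existing Thrift clients see no change unless the option is set explicitly.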


[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440580#comment-13440580 ] 

Lars Hofhansl commented on HBASE-4155:
--------------------------------------

Where are we with this? Can we close?
                
> the problem in hbase thrift client when scan/get rows by timestamp
> ------------------------------------------------------------------
>
>                 Key: HBASE-4155
>                 URL: https://issues.apache.org/jira/browse/HBASE-4155
>             Project: HBase
>          Issue Type: Bug
>          Components: thrift
>    Affects Versions: 0.90.0
>            Reporter: zezhou
>         Attachments: 4155.txt, patch.txt, patch.txt.svn
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>

[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078848#comment-13078848 ] 

Ted Yu commented on HBASE-4155:
-------------------------------

+1 on patch version 2.
Minor comment: the following line is called twice in scannerOpenWithStopTs():
{code}
        scan.setTimeStamp(timestamp);
{code}
But this is due to a double call that already exists in the current class.

> the problem in hbase thrift client when scan/get rows by timestamp
> ------------------------------------------------------------------
>
>                 Key: HBASE-4155
>                 URL: https://issues.apache.org/jira/browse/HBASE-4155
>             Project: HBase
>          Issue Type: Bug
>          Components: thrift
>    Affects Versions: 0.90.0
>            Reporter: zezhou
>         Attachments: patch.txt, patch.txt.svn
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>