Posted to issues@hbase.apache.org by "Jerry Du (JIRA)" <ji...@apache.org> on 2011/06/07 13:17:59 UTC

[jira] [Created] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

use Scan with setCaching() and PageFilter have a problem
--------------------------------------------------------

                 Key: HBASE-3958
                 URL: https://issues.apache.org/jira/browse/HBASE-3958
             Project: HBase
          Issue Type: Bug
          Components: filters, regionserver
    Affects Versions: 0.90.3
         Environment: Linux testbox 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
            Reporter: Jerry Du
            Priority: Minor


I have a table with 3 ranges, and I scan the table across all 3 ranges.

Scan scan = new Scan();
scan.setCaching(10);
scan.setFilter(new PageFilter(21));
[result rows count = 63]
The result has 63 rows: each range has been scanned and limited locally to page_size. That is the expected result.

But if page_size = N * caching_size, the result has only page_size rows, and only the first range has been scanned.
If page_size is a multiple of caching_size, one range's result exactly fills the cache, and the client then does NOT trigger the scan of the next range.
Example:
Scan scan = new Scan();
scan.setCaching(10);
scan.setFilter(new PageFilter(20));
[result rows count = 20]
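
The two cases above can be replayed without a cluster by modelling the client loop. The following is only a minimal sketch, not HBase code: the class PageFilterScanModel, its Region helper and the FILTER_DONE marker are hypothetical names, and the model assumes that a region server answers with null once its filter is exhausted and the batch is empty, which the client then treats as the end of the whole scan.

```java
/** Simplified, cluster-free model of the scan loop; not actual HBase code. */
public class PageFilterScanModel {

    /** Stands in for the assumed null reply of a region whose filter is exhausted. */
    static final int FILTER_DONE = -1;

    /** One region: its own row cursor and its own PageFilter counter. */
    static final class Region {
        private final int pageSize;
        private int remaining;     // rows not yet returned by this region
        private int accepted = 0;  // rows accepted by this region's filter

        Region(int rows, int pageSize) {
            this.remaining = rows;
            this.pageSize = pageSize;
        }

        /** Returns how many rows the batch holds, or FILTER_DONE (assumption). */
        int nextBatch(int limit) {
            int batch = Math.min(limit, Math.min(remaining, pageSize - accepted));
            remaining -= batch;
            accepted += batch;
            return (batch == 0 && accepted >= pageSize) ? FILTER_DONE : batch;
        }
    }

    /** Total rows the client receives for the given layout and settings. */
    static int scan(int regionCount, int rowsPerRegion, int caching, int pageSize) {
        int total = 0;
        int regionIndex = 0;
        Region region = new Region(rowsPerRegion, pageSize);
        boolean open = true;
        while (open) {
            int countdown = caching; // free slots for this cache refill
            while (true) {
                int batch = region.nextBatch(countdown);
                boolean done = (batch == FILTER_DONE);
                if (!done) {
                    countdown -= batch;
                    total += batch;
                }
                if (countdown == 0) {
                    break; // cache full: the next refill asks the same region again
                }
                // Cache not full: advance to the next region, unless the
                // server signalled "done" or this was the last region.
                if (done || regionIndex == regionCount - 1) {
                    open = false;
                    break;
                }
                regionIndex++;
                region = new Region(rowsPerRegion, pageSize); // fresh filter state
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(scan(3, 100, 10, 21)); // prints 63
        System.out.println(scan(3, 100, 10, 20)); // prints 20
    }
}
```

In this model the difference between the two runs comes only from whether the last batch of a region leaves free slots in the cache: a partial batch lets the client move on to the next range, while an exactly full cache leads to one more request to the same range, whose empty "done" reply ends the scan.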



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113568#comment-13113568 ] 

stack commented on HBASE-3958:
------------------------------

I'm not sure I am completely understanding the problem but the javadoc on PageFilter says that it will not work across region boundaries:  http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/PageFilter.html
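
Since the page size is only applied per region, one way to get exactly one page is to cap the row count on the client as well. A minimal sketch, assuming the scan results are consumed through a plain java.util.Iterator (ResultScanner is an Iterable of Result, so its iterator would fit); ClientSidePageLimit and firstPage are hypothetical names, and the list of integers only stands in for scanned rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ClientSidePageLimit {

    /** Collects at most pageSize items, however many the source would yield. */
    static <T> List<T> firstPage(Iterator<T> rows, int pageSize) {
        List<T> page = new ArrayList<>();
        while (page.size() < pageSize && rows.hasNext()) {
            page.add(rows.next());
        }
        return page;
    }

    public static void main(String[] args) {
        // Stand-in for a scan over 3 regions that each returned 21 rows.
        List<Integer> scanned = new ArrayList<>();
        for (int i = 0; i < 63; i++) {
            scanned.add(i);
        }
        System.out.println(firstPage(scanned.iterator(), 21).size()); // prints 21
    }
}
```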

Filters have row scope only.  If your scope is beyond a single row, the results will be indeterminate.  We say this in Filter javadoc but we don't say it enough and we don't say it on the main Filter page.  We need to make this more clear.

Is this about indeterminate behavior because filter is working across rows?


[jira] [Commented] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

Posted by "subramanian raghunathan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109279#comment-13109279 ] 

subramanian raghunathan commented on HBASE-3958:
------------------------------------------------

As per Jerry Du:
"Ranges" means a cross-region (multi-region) scan.

The issue came up in my first HBase program. The following is pseudocode:
 
create a table pre-split into 100 regions;
each region has 100 rows;

fill data with row keys [0,9999]
 
scan with a startKey and stopKey that span all regions: [0,9999)
scan.setCaching(3);
scan.setFilter(new PageFilter(5));
 
The output is:
Row key:
0
1
2
caching border
3
4
region_0 with filter border
5
caching border
6
7
8
caching border
9
region_1 with filter border
10
11
caching border
12
13
14
caching border AND region_2 with filter border
 
 
 
Another case:
scan.setCaching(2);
scan.setFilter(new PageFilter(5));
The output will be:
Row key:
0
1
caching border
2
3
caching border
4
region_0 with filter border
5
caching border
6
7
caching border
8
9
caching border AND region_1 with filter border
 
The scan stops where a caching border and a region's filter border coincide.
 
There are two reasons:
A Filter instance exists only within the scan of a single region;
in the method org.apache.hadoop.hbase.client.HTable.ClientScanner.next()
do {} while (remainingResultSize > 0 && countdown > 0 && nextScanner(countdown, values == null));
the stop condition does NOT take a scan with a Filter into account.
This is NOT limited to PageFilter: any filter will be a problem in a cross-region (multi-region) scan.
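
The two border listings above can be replayed with a small trace. This is a sketch under stated assumptions, not HBase code: ScanBorderTrace and trace are hypothetical names, rows are numbered by output position rather than by real row key, and a region whose filter is exhausted is assumed to answer an otherwise empty batch with null, which ends the scan on the client.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanBorderTrace {

    /** Replays the scan and records the rows and the borders in output order. */
    static List<String> trace(int regionCount, int rowsPerRegion, int caching, int pageSize) {
        List<String> out = new ArrayList<>();
        int row = 0;                    // output position, not a real row key
        int region = 0;
        int accepted = 0;               // rows accepted by the current region's filter
        int remaining = rowsPerRegion;  // rows left in the current region
        int countdown = caching;        // free slots in the client cache
        while (true) {
            // Server side: one batch, bounded by cache slots, region rows and page size.
            int batch = Math.min(countdown, Math.min(remaining, pageSize - accepted));
            if (batch == 0 && accepted >= pageSize) {
                break; // modelled null reply: the client stops the whole scan
            }
            for (int i = 0; i < batch; i++) {
                out.add(String.valueOf(row++));
            }
            accepted += batch;
            remaining -= batch;
            countdown -= batch;
            boolean filterBorder = batch > 0 && accepted >= pageSize;
            boolean cachingBorder = countdown == 0;
            if (cachingBorder && filterBorder) {
                out.add("caching border AND region_" + region + " with filter border");
            } else if (cachingBorder) {
                out.add("caching border");
            } else if (filterBorder) {
                out.add("region_" + region + " with filter border");
            }
            if (cachingBorder) {
                countdown = caching; // next refill asks the same region again
                continue;
            }
            if (region == regionCount - 1) {
                break; // no region left to scan
            }
            // Cache not full: move on to the next region with a fresh filter.
            region++;
            accepted = 0;
            remaining = rowsPerRegion;
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : trace(100, 100, 3, 5)) {
            System.out.println(line);
        }
    }
}
```

Calling trace(100, 100, 2, 5) replays the second case in the same way.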


[jira] [Commented] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

Posted by "Jerry Du (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045781#comment-13045781 ] 

Jerry Du commented on HBASE-3958:
---------------------------------

No, I have no patch yet. I live in China and cannot always connect to sites outside China, so my first step is to check out the 0.90.3 source code and set up the development environment.


[jira] [Resolved] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3958.
--------------------------

    Resolution: Fixed

Resolving because the boys didn't come back w/ answers to questions posted (w/o responses to say otherwise, it looks like hbase is working as advertised)
                

[jira] [Commented] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444294#comment-13444294 ] 

Lars Hofhansl commented on HBASE-3958:
--------------------------------------

I think we can close this, no?
                

[jira] [Commented] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045711#comment-13045711 ] 

stack commented on HBASE-3958:
------------------------------

Do you have a patch to fix this Jerry?
