Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/10/14 02:15:31 UTC
[jira] Created: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
FilterList of prefix and columnvalue not working properly with deletes and multiple values
------------------------------------------------------------------------------------------
Key: HBASE-1906
URL: https://issues.apache.org/jira/browse/HBASE-1906
Project: Hadoop HBase
Issue Type: Bug
Reporter: stack
Fix For: 0.20.2, 0.21.0
Attached are some unit tests from client and region that demonstrate the failing issues.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1906:
-------------------------
Attachment: filterlist.patch
> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-1906
> URL: https://issues.apache.org/jira/browse/HBASE-1906
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.20.2, 0.21.0
>
> Attachments: filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.
[jira] Resolved: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-1906.
--------------------------
Resolution: Fixed
Assignee: stack
Hadoop Flags: [Reviewed]
Applied to branch and trunk.
> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-1906
> URL: https://issues.apache.org/jira/browse/HBASE-1906
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Fix For: 0.20.2, 0.21.0
>
> Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.
[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765707#action_12765707 ]
Jonathan Gray commented on HBASE-1906:
--------------------------------------
+1 for commit. I reviewed the patch but did not test it; if all existing (and new) filter tests pass, it should be okay.
The new HRegion.nextInternal() looks great. Thanks for cleaning up that mess, stack.
> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-1906
> URL: https://issues.apache.org/jira/browse/HBASE-1906
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.20.2, 0.21.0
>
> Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.
[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1906:
-------------------------
Attachment: 1906-v4.patch
This patch passes all tests.
[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1906:
-------------------------
Attachment: 1906-v2.patch
Just formatting clean up. No fix yet.
> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-1906
> URL: https://issues.apache.org/jira/browse/HBASE-1906
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.20.2, 0.21.0
>
> Attachments: 1906-v2.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.
[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766435#action_12766435 ]
stack commented on HBASE-1906:
------------------------------
Any filter that depends on filterRow will give odd results, because this final step in the filter process may not get called if a row has more than one column.
[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766420#action_12766420 ]
stack commented on HBASE-1906:
------------------------------
Here is some more detail on this issue.
The illustrative code put up a table with 5 column families and added values. It then set up a scanner that used a FilterList of two Filters against one of the column families. The first filter was a prefix filter. The second was a test on the cell content. The wanted behavior was that only rows matching both the prefix and the supplied cell value would be returned.
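The wanted result can be pinned down with a small in-memory model. This is only a sketch of the intended semantics, not the HBase client API: the class name, method names, row keys and column names below are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of the wanted scan result; not the HBase client API. */
class WantedScanResult {
    /** Keys of rows whose key starts with prefix AND whose column holds expected. */
    static List<String> matchingRows(Map<String, Map<String, String>> table,
                                     String prefix, String column, String expected) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            boolean prefixMatches = row.getKey().startsWith(prefix);
            boolean valueMatches = expected.equals(row.getValue().get(column));
            // Both conditions must hold, as with a "must pass all" filter list.
            if (prefixMatches && valueMatches) {
                keys.add(row.getKey());
            }
        }
        return keys;
    }

    /** Hypothetical sample data: three rows with two columns each. */
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("row-1", columns("a", "x", "b", "wanted")); // prefix and value match
        table.put("row-2", columns("a", "x", "b", "other"));  // prefix matches, value does not
        table.put("zzz-3", columns("a", "x", "b", "wanted")); // value matches, prefix does not
        return table;
    }

    /** Builds a column map from alternating name/value arguments. */
    private static Map<String, String> columns(String... pairs) {
        Map<String, String> cols = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            cols.put(pairs[i], pairs[i + 1]);
        }
        return cols;
    }
}
```

With the sample data, `matchingRows(sampleTable(), "row-", "b", "wanted")` yields only `row-1`; `row-2` is the kind of row that must never appear in the output, whole or in part.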
Before the fix was applied, we would do the right thing -- return rows that matched on prefix and cell value -- but then we'd tack part of a row onto the result set: its row id would match the prefix filter, but it would not have the required cell content. We'd return all columns that sorted before the column that had the cell the filter was testing.
The illustrating code then threw in deletes of the cell we were testing on, but we were still returning the partial row (IIRC).
What was happening was that there was a code path whereby we could leave the internal next loop without calling the filter's filterRow method. That method, if given the chance, was knocking out rows that didn't match on both supplied filters. Skipping out without invoking it was letting through candidate results that should have been suppressed.
Here is the old code:
{code}
1745   private boolean nextInternal() throws IOException {
1746     // This method should probably be reorganized a bit... has gotten messy
1747     KeyValue kv;
1748     byte[] currentRow = null;
1749     boolean filterCurrentRow = false;
1750     while (true) {
1751       kv = this.storeHeap.peek();
1752       if (kv == null) {
1753         return false;
1754       }
1755       byte [] row = kv.getRow();
1756       if (filterCurrentRow && Bytes.equals(currentRow, row)) {
1757         // filter all columns until row changes
1758         this.storeHeap.next(results);
1759         results.clear();
1760         continue;
1761       }
1762       // see if current row should be filtered based on row key
1763       if ((filter != null && filter.filterRowKey(row, 0, row.length)) ||
1764           (oldFilter != null && oldFilter.filterRowKey(row, 0, row.length))) {
1765         if(!results.isEmpty() && !Bytes.equals(currentRow, row)) {
1766           return true;
1767         }
1768         this.storeHeap.next(results);
1769         results.clear();
1770         resetFilters();
1771         filterCurrentRow = true;
1772         currentRow = row;
1773         continue;
1774       }
1775       if(!Bytes.equals(currentRow, row)) {
1776         // Continue on the next row:
1777         currentRow = row;
1778         filterCurrentRow = false;
1779         // See if we passed stopRow
1780         if(stopRow != null &&
1781             comparator.compareRows(stopRow, 0, stopRow.length,
1782             currentRow, 0, currentRow.length) <= 0) {
1783           return false;
1784         }
1785         // if there are _no_ results or current row should be filtered
1786         if (results.isEmpty() || filter != null && filter.filterRow()) {
1787           // make sure results is empty
1788           results.clear();
1789           resetFilters();
1790           continue;
1791         }
1792         return true;
1793       }
1794       this.storeHeap.next(results);
1795     }
1796   }
1797
1798   public void close() {
1799     storeHeap.close();
1800   }
{code}
We would exit at #1766 without calling filter.filterRow rather than at #1792.
The above method was rewritten so we don't skip out without calling filterRow.
{code}
  private boolean nextInternal() throws IOException {
    byte [] currentRow = null;
    boolean filterCurrentRow = false;
    while (true) {
      KeyValue kv = this.storeHeap.peek();
      if (kv == null) return false;
      byte [] row = kv.getRow();
      boolean samerow = Bytes.equals(currentRow, row);
      if (samerow && filterCurrentRow) {
        // Filter all columns until row changes
        readAndDumpCurrentResult();
        continue;
      }
      if (!samerow) {
        // Continue on the next row:
        currentRow = row;
        filterCurrentRow = false;
        // See if we passed stopRow
        if (this.stopRow != null &&
            comparator.compareRows(this.stopRow, 0, this.stopRow.length,
            currentRow, 0, currentRow.length) <= 0) {
          return false;
        }
        if (hasResults()) return true;
      }
      // See if current row should be filtered based on row key
      if (this.filter != null && this.filter.filterRowKey(row, 0, row.length)) {
        readAndDumpCurrentResult();
        resetFilters();
        filterCurrentRow = true;
        currentRow = row;
        continue;
      }
      this.storeHeap.next(results);
    }
  }
{code}
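The difference between the two control flows can be reproduced in a small self-contained model. This is a simplified sketch, not the HRegion code: the `Cell` and `PrefixAndValueFilter` classes, the `scan` and `flush` methods, and the sample data are all hypothetical, the model walks one cell at a time, and it leaks a whole row rather than reproducing the exact partial-row shape described above. The `buggy` flag stands in for the old early return, where a row key rejected by filterRowKey caused the previously gathered results to be handed back without consulting filterRow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the scan loop; all names here are hypothetical, not HBase's. */
class FilterRowSketch {
    /** One cell: row key, column name, value. */
    static final class Cell {
        final String row, column, value;
        Cell(String row, String column, String value) {
            this.row = row; this.column = column; this.value = value;
        }
    }

    /** Combined prefix + column-value filter; "filter" methods return true to EXCLUDE. */
    static final class PrefixAndValueFilter {
        private final String prefix, column, expected;
        private boolean matched = false;
        PrefixAndValueFilter(String prefix, String column, String expected) {
            this.prefix = prefix; this.column = column; this.expected = expected;
        }
        boolean filterRowKey(String row) { return !row.startsWith(prefix); }
        void observe(Cell c) {
            if (c.column.equals(column) && c.value.equals(expected)) matched = true;
        }
        /** Final per-row veto: exclude the row unless the wanted value was seen. */
        boolean filterRow() { return !matched; }
        void reset() { matched = false; }
    }

    /** Returns the keys of emitted rows. buggy=true mimics the old early exit. */
    static List<String> scan(List<Cell> cells, PrefixAndValueFilter f, boolean buggy) {
        List<String> emitted = new ArrayList<>();
        List<Cell> results = new ArrayList<>();
        String currentRow = null;
        for (Cell c : cells) {
            boolean sameRow = c.row.equals(currentRow);
            boolean keyExcluded = f.filterRowKey(c.row);
            if (!sameRow) {
                // Row boundary. The old flow skipped filterRow when the NEW row
                // was rejected by its key; the rewritten flow always checks it.
                flush(results, emitted, f, buggy && keyExcluded);
                currentRow = c.row;
            }
            if (keyExcluded) continue; // drop every cell of a key-rejected row
            f.observe(c);
            results.add(c);
        }
        flush(results, emitted, f, false); // end of data
        return emitted;
    }

    private static void flush(List<Cell> results, List<String> emitted,
                              PrefixAndValueFilter f, boolean skipRowCheck) {
        if (!results.isEmpty() && (skipRowCheck || !f.filterRow())) {
            emitted.add(results.get(0).row);
        }
        results.clear();
        f.reset();
    }

    /** Hypothetical sample data, sorted by row key as a scanner would see it. */
    static List<Cell> sampleCells() {
        return Arrays.asList(
            new Cell("row-1", "a", "x"), new Cell("row-1", "b", "wanted"),
            new Cell("row-2", "a", "x"), new Cell("row-2", "b", "other"),
            new Cell("zzz-3", "a", "x"), new Cell("zzz-3", "b", "wanted"));
    }
}
```

With the sample data and a filter built as `new PrefixAndValueFilter("row-", "b", "wanted")`, the buggy variant emits `row-1` and `row-2`, while the fixed variant emits only `row-1`. The leak depends on `row-2` being followed by a row that fails the prefix test, which is the situation in which the old code returned early.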
[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765412#action_12765412 ]
stack commented on HBASE-1906:
------------------------------
@jgray What seems to be happening is that we can exit the loop in HRegion#nextInternal without a call to filterRow. Without this call, stuff is left in the results when we exit via:
{code}
if (filter != null && filter.filterRowKey(row, 0, row.length)) {
  if (!results.isEmpty() && !Bytes.equals(currentRow, row)) {
    return true;
  }
{code}
It's as though this test should be done first:
{code}
if (!Bytes.equals(currentRow, row)) {
{code}
... before we see if a row should be filtered out based off row key.
If a row is filtered out by filterRowKey, we then need to run filterRow on the results already accumulated somehow.
Will keep digging, but input, if any, is appreciated.
That deletes can come out of the peek seems fine after looking at it some...
[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue
not working properly with deletes and multiple values
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1906:
-------------------------
Attachment: 1906-v3.patch
Here's a fix. Most of the patch is just formatting changes (apart from the addition of the two tests: one client-side and the other on HRegion). The fix is in HRegion#nextInternal. I halved its size. It was duplicating functionality using near-duplicate code. Importantly, there was a code path where we could exit with results without calling filterRow. The tests had filters that would rule out a whole row if filterRow was called.
> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-1906
> URL: https://issues.apache.org/jira/browse/HBASE-1906
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.20.2, 0.21.0
>
> Attachments: 1906-v2.patch, 1906-v3.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.