Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/10/14 02:15:31 UTC

[jira] Created: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

FilterList of prefix and columnvalue not working properly with deletes and multiple values
------------------------------------------------------------------------------------------

                 Key: HBASE-1906
                 URL: https://issues.apache.org/jira/browse/HBASE-1906
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack
             Fix For: 0.20.2, 0.21.0


Attached are some unit tests from client and region that demonstrate the failing issues.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1906:
-------------------------

    Attachment: filterlist.patch

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Resolved: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1906.
--------------------------

      Resolution: Fixed
        Assignee: stack
    Hadoop Flags: [Reviewed]

Applied to branch and trunk.

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765707#action_12765707 ] 

Jonathan Gray commented on HBASE-1906:
--------------------------------------

+1 for commit.  Reviewed the patch but did not test it; if all existing (and new) filter tests pass, it should be okay.

New HRegion.nextInternal() looks great; thanks for cleaning up that mess, stack.

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1906:
-------------------------

    Attachment: 1906-v4.patch

This patch passes all tests.

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1906:
-------------------------

    Attachment: 1906-v2.patch

Just formatting clean up.  No fix yet.

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766435#action_12766435 ] 

stack commented on HBASE-1906:
------------------------------

Any filter that depends on filterRow will give odd results, because this final step in the filter process may not get called if a row has more than one column.
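
To make that dependency concrete, here is a minimal plain-Java model of such a filter. It is a sketch only: the class `RowVerdictFilterSketch` and its `includeCell`/`collectRow` methods are hypothetical names, not the HBase Filter interface. The filter accepts every cell as it streams by and can veto the row only once all of its cells have been seen, so if the caller never asks for the row-level verdict, nothing excludes the row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; the names only loosely mirror the HBase filter callbacks.
class RowVerdictFilterSketch {
    private final String column;
    private final String wanted;
    private boolean matched = false;

    RowVerdictFilterSketch(String column, String wanted) {
        this.column = column;
        this.wanted = wanted;
    }

    // Per-cell callback: always includes the cell and only records state.
    boolean includeCell(String col, String value) {
        if (col.equals(column) && value.equals(wanted)) {
            matched = true;
        }
        return true;
    }

    // Row-level callback: true means "drop the whole row".
    boolean filterRow() {
        return !matched;
    }

    void reset() {
        matched = false;
    }

    // Collects one row's cells; consults filterRow() only when askRowVerdict is set.
    static List<String> collectRow(RowVerdictFilterSketch f, String[][] cells,
                                   boolean askRowVerdict) {
        f.reset();
        List<String> results = new ArrayList<>();
        for (String[] cell : cells) {
            if (f.includeCell(cell[0], cell[1])) {
                results.add(cell[0] + "=" + cell[1]);
            }
        }
        if (askRowVerdict && f.filterRow()) {
            results.clear();
        }
        return results;
    }

    public static void main(String[] args) {
        RowVerdictFilterSketch f = new RowVerdictFilterSketch("q", "x");
        String[][] row = { {"p", "1"}, {"q", "y"} };
        System.out.println(collectRow(f, row, true));   // row-level verdict applied
        System.out.println(collectRow(f, row, false));  // row-level verdict skipped
    }
}
```

With the row-level verdict skipped, both cells of the non-matching row come back, which is the kind of odd result described above.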

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766420#action_12766420 ] 

stack commented on HBASE-1906:
------------------------------

Here is some more detail on this issue.

The illustrative code created a table with 5 column families and added values.  It then set up a scanner that used a FilterList of two Filters against one of the column families.  The first filter was a prefix filter.  The second was a test on the cell content.  The desired behavior was that only rows matching both the prefix and the supplied cell value should be returned.
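
As a reference for the expected outcome, the wanted behavior can be written down as a small plain-Java oracle. This is a sketch under assumptions, not the attached unit tests and not HBase API: `ExpectedScanSketch`, `Row`, `row`, `prefix`, `columnEquals` and `scan` are hypothetical names, rows are modeled as in-memory maps, and a row simply lacks a column when it has no live cell there.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory model of "prefix AND cell value must both match".
class ExpectedScanSketch {
    static final class Row {
        final String key;
        final Map<String, String> columns;

        Row(String key, Map<String, String> columns) {
            this.key = key;
            this.columns = columns;
        }
    }

    // Builds a row from alternating column/value strings.
    static Row row(String key, String... columnValuePairs) {
        Map<String, String> cols = new HashMap<>();
        for (int i = 0; i + 1 < columnValuePairs.length; i += 2) {
            cols.put(columnValuePairs[i], columnValuePairs[i + 1]);
        }
        return new Row(key, cols);
    }

    // Passes rows whose key starts with the given prefix.
    static Predicate<Row> prefix(String p) {
        return r -> r.key.startsWith(p);
    }

    // Passes rows whose column holds exactly the given value.
    static Predicate<Row> columnEquals(String column, String value) {
        return r -> value.equals(r.columns.get(column));
    }

    // Returns the keys of rows that pass every predicate in the list.
    static List<String> scan(List<Row> rows, List<Predicate<Row>> mustPassAll) {
        List<String> keys = new ArrayList<>();
        for (Row r : rows) {
            boolean keep = true;
            for (Predicate<Row> p : mustPassAll) {
                keep = keep && p.test(r);
            }
            if (keep) {
                keys.add(r.key);
            }
        }
        return keys;
    }
}
```

A row whose key matches the prefix but whose tested column holds a different value, or no value at all, must not appear in the output of `scan`.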

Before the fix was applied, we would do the right thing -- return rows that matched on prefix and cell value -- but then we'd also tack part of a row onto the result set: a row whose row id matched the prefix filter but which did not have the required cell content.  For that row we'd return all columns that sorted before the column holding the cell the filter was testing.

The illustrating code then threw in deletes of the cell we were testing on, but we were still returning the partial row (IIRC).

What was happening was that there was a code path whereby we could leave the internal next loop without calling the filter's filterRow method.  That method, if given the chance, was knocking out rows that didn't match both supplied filters.  Exiting without invoking it let through candidate results that should have been suppressed.

Here is the old code:

{code}
1745     private boolean nextInternal() throws IOException {
1746       // This method should probably be reorganized a bit... has gotten messy
1747       KeyValue kv;
1748       byte[] currentRow = null;
1749       boolean filterCurrentRow = false;
1750       while (true) {
1751         kv = this.storeHeap.peek();
1752         if (kv == null) {
1753           return false;
1754         }
1755         byte [] row = kv.getRow();
1756         if (filterCurrentRow && Bytes.equals(currentRow, row)) {
1757           // filter all columns until row changes
1758           this.storeHeap.next(results);
1759           results.clear();
1760           continue;
1761         }
1762         // see if current row should be filtered based on row key
1763         if ((filter != null && filter.filterRowKey(row, 0, row.length)) ||
1764             (oldFilter != null && oldFilter.filterRowKey(row, 0, row.length))) {
1765           if(!results.isEmpty() && !Bytes.equals(currentRow, row)) {
1766             return true;
1767           }
1768           this.storeHeap.next(results);
1769           results.clear();
1770           resetFilters();
1771           filterCurrentRow = true;
1772           currentRow = row;
1773           continue;
1774         }
1775         if(!Bytes.equals(currentRow, row)) {
1776           // Continue on the next row:
1777           currentRow = row;
1778           filterCurrentRow = false;
1779           // See if we passed stopRow
1780           if(stopRow != null &&
1781               comparator.compareRows(stopRow, 0, stopRow.length,
1782                   currentRow, 0, currentRow.length) <= 0) {
1783             return false;
1784           }
1785           // if there are _no_ results or current row should be filtered
1786           if (results.isEmpty() || filter != null && filter.filterRow()) {
1787             // make sure results is empty
1788             results.clear();
1789             resetFilters();
1790             continue;
1791           }
1792           return true;
1793         }
1794         this.storeHeap.next(results);
1795       }
1796     }
1797 
1798     public void close() {
1799       storeHeap.close();
1800     }
{code}

We would exit at #1766 without calling filter.filterRow rather than at #1792.

The above method was rewritten so we don't skip out without calling filterRow.

{code}
    private boolean nextInternal() throws IOException {
      byte [] currentRow = null;
      boolean filterCurrentRow = false;
      while (true) {
        KeyValue kv = this.storeHeap.peek();
        if (kv == null) return false;
        byte [] row = kv.getRow();
        boolean samerow = Bytes.equals(currentRow, row);
        if (samerow && filterCurrentRow) {
          // Filter all columns until row changes
          readAndDumpCurrentResult();
          continue;
        }
        if (!samerow) {
          // Continue on the next row:
          currentRow = row;
          filterCurrentRow = false;
          // See if we passed stopRow
          if (this.stopRow != null &&
              comparator.compareRows(this.stopRow, 0, this.stopRow.length,
                currentRow, 0, currentRow.length) <= 0) {
            return false;
          }
          if (hasResults()) return true;
        }
        // See if current row should be filtered based on row key
        if (this.filter != null && this.filter.filterRowKey(row, 0, row.length)) {
          readAndDumpCurrentResult();
          resetFilters();
          filterCurrentRow = true;
          currentRow = row;
          continue;
        }
        this.storeHeap.next(results);
      }
    }
{code}
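
The difference between the two exit paths can be modeled outside HBase. The following is a simplified sketch, not the HRegion code: `NextInternalExitSketch`, `Cell`, `scan` and the `oldExitPath` flag are hypothetical, the filter callbacks are reduced to two predicates that return true to filter something out, and the handling at end of input is an assumption of the sketch. With `oldExitPath` set, a pending row is emitted without its row-level check whenever the next row is rejected by its key, as with the exit at #1766.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the scan loop; predicates return true to filter out.
class NextInternalExitSketch {
    static final class Cell {
        final String row;
        final String column;
        final String value;

        Cell(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    // Returns the row keys that would be handed back to the client.
    static List<String> scan(List<Cell> sorted,
                             Predicate<String> filterRowKey,
                             Predicate<List<Cell>> filterRow,
                             boolean oldExitPath) {
        List<String> emitted = new ArrayList<>();
        List<Cell> results = new ArrayList<>();
        String currentRow = null;
        for (Cell cell : sorted) {
            boolean rowChanged = !cell.row.equals(currentRow);
            boolean keyFiltered = filterRowKey.test(cell.row);
            if (rowChanged) {
                if (!results.isEmpty()) {
                    // Old path: leave with the pending row unchecked when the
                    // next row is rejected by its key.
                    boolean skipVerdict = oldExitPath && keyFiltered;
                    if (skipVerdict || !filterRow.test(results)) {
                        emitted.add(currentRow);
                    }
                }
                results.clear();
                currentRow = cell.row;
            }
            if (!keyFiltered) {
                results.add(cell);
            }
        }
        // End of input: this sketch applies the row check on both paths.
        if (!results.isEmpty() && !filterRow.test(results)) {
            emitted.add(currentRow);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("a1", "p", "1"), new Cell("a1", "q", "x"),
            new Cell("a2", "p", "1"), new Cell("a2", "q", "y"),
            new Cell("b1", "p", "1"), new Cell("b1", "q", "x"));
        Predicate<String> keyFilter = r -> !r.startsWith("a");
        Predicate<List<Cell>> rowFilter = cs -> cs.stream()
            .noneMatch(c -> c.column.equals("q") && c.value.equals("x"));
        System.out.println(scan(cells, keyFilter, rowFilter, true));
        System.out.println(scan(cells, keyFilter, rowFilter, false));
    }
}
```

In this model the old path lets row `a2` through because the following row `b1` fails the prefix check, while the rewritten ordering checks the pending row before looking at the next row's key.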

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765412#action_12765412 ] 

stack commented on HBASE-1906:
------------------------------

@jgray What seems to be happening is that we can exit the loop in HRegion#nextInternal without a call to filterRow.  Without this call, results that should have been filtered are left in when we exit via:

{code}
        if (filter != null && filter.filterRowKey(row, 0, row.length)) {
          if (!results.isEmpty() && !Bytes.equals(currentRow, row)) {
            return true;
          }
{code}

It's as though this test should be done first:

{code}
        if (!Bytes.equals(currentRow, row)) {
{code}

... before we see if a row should be filtered out based on row key.

If the row is filtered out by filterRowKey, we then need to somehow run filterRow on the results already accumulated.

Will keep digging, but input, if any, is appreciated.

That deletes can come out of the peek seems fine after looking at it some... 


> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.



[jira] Updated: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1906:
-------------------------

    Attachment: 1906-v3.patch

Here's a fix.  Most of the patch is just formatting changes (apart from the addition of the two tests -- one client-side and the other on HRegion).  The fix is in HRegion#nextInternal.  I halved its size; it was duplicating functionality using near-duplicate code.  Importantly, there was a code path where we could exit with results without calling filterRow.  The tests had filters that would rule out a whole row if filterRow was called.

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, 1906-v3.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.
