Posted to issues@hbase.apache.org by "Eran Kutner (Created) (JIRA)" <ji...@apache.org> on 2011/10/18 16:30:10 UTC

[jira] [Created] (HBASE-4612) Allow ColumnPrefixFilter to support multiple refixes

Allow ColumnPrefixFilter to support multiple refixes
----------------------------------------------------

                 Key: HBASE-4612
                 URL: https://issues.apache.org/jira/browse/HBASE-4612
             Project: HBase
          Issue Type: Improvement
          Components: filters
    Affects Versions: 0.90.4
            Reporter: Eran Kutner
            Priority: Minor


When a row has many columns grouped by name, it would be very useful to scan them using multiple prefixes, fetching specific groups in one scan without fetching the entire row. This is impossible to achieve using a FilterList, so I've added such support to the existing ColumnPrefixFilter while keeping backward compatibility.
The attached patch is based on 0.90.4. I noticed that the 0.92 branch has a new method to support instantiating filters using Thrift. I'm not sure how the serialization works there, so I didn't implement that, but the rest of my code should work in 0.92 as well.
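
To make the requested behavior concrete, here is a minimal, self-contained sketch of the matching rule: a column qualifier is kept if it starts with any one of the given prefixes. It does not use HBase classes and is not the attached patch; the class name MultiPrefixMatchSketch, its methods, and the sample qualifiers are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the proposed semantics; not an HBase class. */
final class MultiPrefixMatchSketch {

    /** Returns true if the qualifier begins with the given prefix. */
    static boolean startsWith(byte[] qualifier, byte[] prefix) {
        if (qualifier.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (qualifier[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** OR condition: the qualifier matches if any single prefix matches. */
    static boolean matchesAny(byte[] qualifier, byte[][] prefixes) {
        for (byte[] prefix : prefixes) {
            if (startsWith(qualifier, prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Keeps only the qualifiers that start with at least one prefix. */
    static List<String> select(List<String> qualifiers, String... prefixes) {
        byte[][] raw = new byte[prefixes.length][];
        for (int i = 0; i < prefixes.length; i++) {
            raw[i] = prefixes[i].getBytes(StandardCharsets.UTF_8);
        }
        List<String> kept = new ArrayList<>();
        for (String qualifier : qualifiers) {
            if (matchesAny(qualifier.getBytes(StandardCharsets.UTF_8), raw)) {
                kept.add(qualifier);
            }
        }
        return kept;
    }
}
```

For example, selecting with the prefixes "addr:" and "phone:" from the qualifiers addr:city, addr:zip, name:first and phone:home would keep the two addr: columns and phone:home, and drop name:first.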


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Eran Kutner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133586#comment-13133586 ] 

Eran Kutner commented on HBASE-4612:
------------------------------------

OK, I uploaded a patch for trunk. Hopefully what I've done with the createFilterFromArguments method makes sense.
                
> Allow ColumnPrefixFilter to support multiple prefixes
> -----------------------------------------------------
>
>                 Key: HBASE-4612
>                 URL: https://issues.apache.org/jira/browse/HBASE-4612
>             Project: HBase
>          Issue Type: Improvement
>          Components: filters
>    Affects Versions: 0.90.4
>            Reporter: Eran Kutner
>            Assignee: Eran Kutner
>            Priority: Minor
>             Fix For: 0.94.0
>
>         Attachments: HBASE-4612-0.90.patch, HBASE-4612.patch

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217964#comment-13217964 ] 

Lars Hofhansl commented on HBASE-4612:
--------------------------------------

We have MultipleColumnPrefixFilter (in trunk at least), which does exactly what is described here.
Is this different?
                

        

[jira] [Updated] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Eran Kutner (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eran Kutner updated HBASE-4612:
-------------------------------

    Attachment: HBASE-4612-0.90.patch
    

        

[jira] [Assigned] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Ted Yu (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu reassigned HBASE-4612:
-----------------------------

    Assignee: Eran Kutner
    

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Eran Kutner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130023#comment-13130023 ] 

Eran Kutner commented on HBASE-4612:
------------------------------------

@Ted:
{quote}Improvements go to TRUNK.{quote}
I know, but see my initial comment regarding the new Thrift initialization method. I'm just not sure how it's supposed to work or what I'm supposed to do there.
                

        

[jira] [Updated] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Eran Kutner (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eran Kutner updated HBASE-4612:
-------------------------------

    Attachment: HBASE-4612.patch
    

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130033#comment-13130033 ] 

Ted Yu commented on HBASE-4612:
-------------------------------

As far as HBASE-4176 is concerned, take a look at this in TRUNK:
{code}
  public static Filter createFilterFromArguments(ArrayList<byte []> filterArguments) {
    Preconditions.checkArgument(filterArguments.size() == 1,
                                "Expected 1 but got: %s", filterArguments.size());
    byte [] columnPrefix = ParseFilter.removeQuotesFromByteArray(filterArguments.get(0));
    return new ColumnPrefixFilter(columnPrefix);
  }
{code}
You can relax the check above and call filterArguments.toArray() so that the new ctor can be used.
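
One way to read this suggestion is sketched below without any HBase dependency: accept one or more arguments, strip the quotes from each, and hand the result to a constructor that takes a byte[][]. The class PrefixArgumentsSketch and the helper stripQuotes are hypothetical; stripQuotes only approximates what ParseFilter.removeQuotesFromByteArray is used for in the snippet above, under the assumption that arguments arrive wrapped in single quotes. Note that a bare filterArguments.toArray() returns Object[], so the sketch uses the typed toArray(new byte[0][]) overload.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of relaxed argument parsing; not the committed HBase code. */
final class PrefixArgumentsSketch {

    /** Assumed stand-in for quote removal: expects 'value' and returns value. */
    static byte[] stripQuotes(byte[] quoted) {
        if (quoted == null || quoted.length < 2
                || quoted[0] != '\'' || quoted[quoted.length - 1] != '\'') {
            throw new IllegalArgumentException("Expected a single-quoted argument");
        }
        return Arrays.copyOfRange(quoted, 1, quoted.length - 1);
    }

    /** Accepts one or more quoted prefixes instead of exactly one. */
    static byte[][] parsePrefixes(List<byte[]> filterArguments) {
        if (filterArguments == null || filterArguments.isEmpty()) {
            throw new IllegalArgumentException("Expected at least 1 argument but got none");
        }
        List<byte[]> prefixes = new ArrayList<>(filterArguments.size());
        for (byte[] argument : filterArguments) {
            prefixes.add(stripQuotes(argument));
        }
        // Typed overload so the result is byte[][] rather than Object[].
        return prefixes.toArray(new byte[0][]);
    }
}
```

The resulting byte[][] could then be passed to whatever multi-prefix constructor the patch introduces.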
                

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Jonathan Gray (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129955#comment-13129955 ] 

Jonathan Gray commented on HBASE-4612:
--------------------------------------

Hey Eran.  Thanks for the contribution!  A few comments..

- There's no explanation of the behavior anywhere.  In the constructors and addPrefix() methods, you should document that this creates an OR condition across all of the prefixes, correct?
- No need to instantiate a new comparator all the time (use Bytes.BYTES_COMPARATOR)
- Something seems odd when you keep adding to the end of a List and then sort.  How about a TreeSet?  You can easily ignore dupes that way.
- There's no input verification so, for example, you could pass a null to the constructor or an empty byte[][] and have some strange behavior.  Like it will instantiate okay but then you'll get server-side NPEs or IOOB.
- this.prefixes.size() == 0 -> this.prefixes.isEmpty()
- your comment at the top of filterColumn, i wouldn't exactly call it a workaround, but it's a good comment.  looking at the logic, it seems like correct behavior would be that it can be called with current == size() but it would be a bug if current > size(), right?  should you add an assert or throw an exception?
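
The TreeSet and input-verification points above can be sketched as follows. This is only an illustration of the review suggestions, not code from the patch: PrefixSetSketch is a hypothetical class, and its LEXICOGRAPHIC comparator is a hand-written stand-in for Bytes.BYTES_COMPARATOR so that the example has no HBase dependency.

```java
import java.util.Comparator;
import java.util.TreeSet;

/** Hypothetical sketch: validated, sorted, duplicate-free prefix storage. */
final class PrefixSetSketch {

    /** Unsigned lexicographic order over byte arrays (stand-in comparator). */
    static final Comparator<byte[]> LEXICOGRAPHIC = (a, b) -> {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    /** Rejects null or empty input up front instead of failing later on the server. */
    static TreeSet<byte[]> build(byte[][] prefixes) {
        if (prefixes == null || prefixes.length == 0) {
            throw new IllegalArgumentException("At least one prefix is required");
        }
        TreeSet<byte[]> sorted = new TreeSet<>(LEXICOGRAPHIC);
        for (byte[] prefix : prefixes) {
            if (prefix == null) {
                throw new IllegalArgumentException("Prefix must not be null");
            }
            sorted.add(prefix); // equal byte sequences are ignored by the set
        }
        return sorted;
    }
}
```

Because the set is ordered by content rather than by array identity, adding the same prefix twice leaves a single entry, which is the duplicate handling mentioned above.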
                

        

[jira] [Updated] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Eran Kutner (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eran Kutner updated HBASE-4612:
-------------------------------

    Attachment:     (was: HBASE-4612-0.90.patch)
    

        

[jira] [Resolved] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Lars Hofhansl (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-4612.
----------------------------------

    Resolution: Won't Fix

I am closing this, because we already have MultipleColumnPrefixFilter.
Please reopen if I misunderstood.
                

        

[jira] [Updated] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4612:
--------------------------

    Fix Version/s: 0.94.0
    

        

[jira] [Updated] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Eran Kutner (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eran Kutner updated HBASE-4612:
-------------------------------

    Summary: Allow ColumnPrefixFilter to support multiple prefixes  (was: Allow ColumnPrefixFilter to support multiple refixes)
    

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133611#comment-13133611 ] 

Ted Yu commented on HBASE-4612:
-------------------------------

+1 on HBASE-4612.patch, assuming test suite passes.

Nice work Eran.
                

        

[jira] [Updated] (HBASE-4612) Allow ColumnPrefixFilter to support multiple refixes

Posted by "Eran Kutner (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eran Kutner updated HBASE-4612:
-------------------------------

    Attachment: HBASE-4612-0.90.patch
    

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Eran Kutner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130020#comment-13130020 ] 

Eran Kutner commented on HBASE-4612:
------------------------------------

Hi Jonathan, thanks for the feedback! See answers inline:

{quote}There's no explanation of the behavior anywhere. In the constructors and addPrefix() methods, you should document that this creates an OR condition across all of the prefixes, correct?{quote} - good point, added some more explanations.
{quote}No need to instantiate a new comparator all the time (use Bytes.BYTES_COMPARATOR){quote} - Didn't know it existed. Changed.
{quote}Something seems odd when you keep adding to the end of a List and then sort. How about a TreeSet? You can easily ignore dupes that way.{quote} - This is intentional. Sorting is done only during initialization, but accessing an ArrayList, which is actually based on an array, is much more efficient than accessing a tree, so I sacrifice the aesthetics of the code for better runtime performance.
{quote}There's no input verification so, for example, you could pass a null to the constructor or an empty byte[][] and have some strange behavior. Like it will instantiate okay but then you'll get server-side NPEs or IOOB.{quote} - It's a good point, but I've looked and no other filter is validating its input either. I can throw an IllegalArgumentException, but I don't know if it's a good idea considering it's not the norm.
{quote}this.prefixes.size() == 0 -> this.prefixes.isEmpty(){quote} - ok, changed.
{quote}your comment at the top of filterColumn, i wouldn't exactly call it a workaround, but it's a good comment. looking at the logic, it seems like correct behavior would be that it can be called with current == size() but it would be a bug if current > size(), right? should you add an assert or throw an exception?{quote} - Well, it is kind of a workaround, because as an individual filter I expect not to be called again after returning NEXT_ROW. However, when used with FilterList the filter does get called again, which puts it in an illegal state, so it has to explicitly handle that case. That is also why it can't throw an exception in that scenario, because it seems to be happening normally when used with FilterList. As for "current", it has to be smaller than size() or it would be outside the bounds of the array.
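
The answers above describe a sorted list of prefixes, a "current" index into it, and a filter that must tolerate being called again after it has returned NEXT_ROW. The sketch below is one possible reading of that design, not the patch itself. It assumes column qualifiers within a row arrive in ascending byte order; the class SortedPrefixCursorSketch, its Code enum, and its method names are hypothetical and only loosely mirror the filter's return codes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a cursor over sorted prefixes; not the HBase filter. */
final class SortedPrefixCursorSketch {

    enum Code { INCLUDE, SKIP, NEXT_ROW }

    private final List<byte[]> prefixes = new ArrayList<>();
    private int current = 0; // index of the prefix currently being matched

    SortedPrefixCursorSketch(String... unsortedPrefixes) {
        for (String prefix : unsortedPrefixes) {
            prefixes.add(prefix.getBytes(StandardCharsets.UTF_8));
        }
        // Sort once at construction; lookups afterwards are plain index access.
        prefixes.sort((a, b) -> {
            int diff = firstDifference(a, b);
            return diff != 0 ? diff : a.length - b.length;
        });
    }

    /** Unsigned difference at the first differing byte within the common length, else 0. */
    private static int firstDifference(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    /** Negative: qualifier sorts before the prefix range; 0: it matches; positive: it is past. */
    private static int compareToPrefix(byte[] qualifier, byte[] prefix) {
        int diff = firstDifference(qualifier, prefix);
        if (diff != 0) {
            return diff;
        }
        return qualifier.length >= prefix.length ? 0 : -1;
    }

    Code filterColumn(byte[] qualifier) {
        // The bound check runs on every call, so a repeated call after the
        // prefixes are exhausted returns NEXT_ROW again instead of throwing.
        while (current < prefixes.size()) {
            int cmp = compareToPrefix(qualifier, prefixes.get(current));
            if (cmp == 0) {
                return Code.INCLUDE;
            }
            if (cmp < 0) {
                return Code.SKIP; // not yet reached the current prefix
            }
            current++; // qualifier is past this prefix; try the next one
        }
        return Code.NEXT_ROW;
    }

    /** Called at the start of each row. */
    void reset() {
        current = 0;
    }
}
```

In this reading, "current" may legitimately equal size() once every prefix has been passed, and the loop condition keeps it from ever indexing outside the list.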


                

        

[jira] [Commented] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130016#comment-13130016 ] 

Ted Yu commented on HBASE-4612:
-------------------------------

Improvements go to TRUNK.
@Eran:
Please prepare patch for TRUNK and run test suite.

Nice work.
                