Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/04/15 18:09:04 UTC

[jira] Created: (HBASE-581) Allow adding fitlers to TIF (At same time, ensure TIF and TOF are subclassable)

Allow adding fitlers to TIF (At same time, ensure TIF and TOF are subclassable)
-------------------------------------------------------------------------------

                 Key: HBASE-581
                 URL: https://issues.apache.org/jira/browse/HBASE-581
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: stack
            Priority: Critical
             Fix For: 0.2.0




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591712#action_12591712 ] 

stack commented on HBASE-581:
-----------------------------

Hey, David, where's the patch (smile).  Also, you are talking about TRUNK, right, when you say it falls over with ten maps?  I can run 8 and probably more MR maps against a single node without it falling over.  Maybe our boxes are just bigger than yours?



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590163#action_12590163 ] 

David Alves commented on HBASE-581:
-----------------------------------

Hi.
The main reason I couldn't subclass TIF is that the HTable instance in TIF is package-protected.
I then implemented an inner FilteredTableRecordReader that *can* receive a RowFilterInterface as an argument. Since the filter is not mandatory (it can be null), this could be a nice addition even to the HBase TIF, which would receive the filter class (and somehow its constructor arguments) through configuration. Even if this is not possible, just allowing the record reader instance to be set in TIF would make overriding getRecordReader (and the TableSplit casts) unnecessary.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591955#action_12591955 ] 

David Alves commented on HBASE-581:
-----------------------------------

Thanks stack. No problem about the accessors. The patch fully works in my tests; actually, it makes my setup work (crawl and index) on my modest boxes :) I'll apply your patch and add the javadoc. Do you use a different Eclipse formatter (I formatted with the official Java standards)?



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-581:
------------------------

    Attachment: tif-v2.patch

Your patch looks good David.  Here's a v2 with some mild formatting.  It also changes the data members back to private and adds accessors instead -- would this work for you?  Regarding getSplits, the way you've redone the method, if numSplits is < number of regions, then you make a best effort at divvying up the regions, so some splits will have more than one region?  It works in your testing?  Add javadoc explaining how numSplits is now actually acted on in getSplits, then upload a new patch and we'll get it committed.  Thanks.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591714#action_12591714 ] 

David Alves commented on HBASE-581:
-----------------------------------

Yes, I'm talking about TRUNK. As I have 10 regions (for a given table), I always get 10 maps, which completely kills my setup; maybe your boxes are better.
Anyhow, before I submit the extension/row filter patch, I'm just finishing up the patch that solves the splits question. The getSplits() method will return a number of splits equal to the lesser of numSplits and startKeys.length, and will divide the regions into the most even groups possible, making the first splits bigger when the division is uneven.
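
The grouping described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the patch: the class name SplitPlanner and the method plan are hypothetical, and regions are represented by their index in the startKeys array rather than by real row keys.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the grouping; not the code from the patch. */
final class SplitPlanner {

    /**
     * Groups region indexes 0..regionCount-1 into contiguous splits.
     * Returns one int[]{firstRegion, lastRegionInclusive} per split.
     */
    static List<int[]> plan(int regionCount, int numSplits) {
        List<int[]> splits = new ArrayList<>();
        if (regionCount <= 0) {
            return splits;
        }
        // Never create more splits than regions; treat a non-positive hint as 1.
        int realNumSplits = Math.min(Math.max(numSplits, 1), regionCount);
        int base = regionCount / realNumSplits;      // regions that every split gets
        int remainder = regionCount % realNumSplits; // leading splits that get one extra
        int start = 0;
        for (int i = 0; i < realNumSplits; i++) {
            int size = base + (i < remainder ? 1 : 0);
            splits.add(new int[] {start, start + size - 1});
            start += size;
        }
        return splits;
    }

    public static void main(String[] args) {
        for (int[] s : plan(10, 3)) {
            System.out.println("regions " + s[0] + ".." + s[1]);
        }
    }
}
```

With 10 regions and a hint of 3, this yields groups of 4, 3 and 3 regions, so the leftover region goes to the first split.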



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591736#action_12591736 ] 

David Alves commented on HBASE-581:
-----------------------------------

Actually, in my case I use Base64 conversion to pass the dynamically built filters in the JobConf, but subclassing should solve the problem for now.
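
A rough sketch of the Base64 approach, under stated assumptions: java.util.Properties stands in for the JobConf, PrefixFilter is a hypothetical filter that serializes itself the way a Hadoop Writable does (write/readFields), and the key name is made up. It uses java.util.Base64 from current JDKs, which is not necessarily what was used at the time.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.Properties;

/** Hypothetical codec: stores a serialized filter as a Base64 string in a configuration. */
final class FilterConfCodec {

    /** Hypothetical stand-in for a filter that serializes itself like a Writable. */
    static final class PrefixFilter {
        String prefix = "";

        void write(DataOutput out) throws IOException {
            out.writeUTF(prefix);
        }

        void readFields(DataInput in) throws IOException {
            prefix = in.readUTF();
        }
    }

    /** Hypothetical configuration key. */
    static final String KEY = "example.tif.rowfilter";

    /** Serializes the filter to bytes and stores them as text under KEY. */
    static void store(Properties conf, PrefixFilter filter) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            filter.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        conf.setProperty(KEY, Base64.getEncoder().encodeToString(bytes.toByteArray()));
    }

    /** Rebuilds the filter from the configuration, or returns null if none was set. */
    static PrefixFilter load(Properties conf) {
        String encoded = conf.getProperty(KEY);
        if (encoded == null) {
            return null; // the filter is optional
        }
        byte[] raw = Base64.getDecoder().decode(encoded);
        PrefixFilter filter = new PrefixFilter();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            filter.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return filter;
    }
}
```

The job client would call store before submitting, and the input format would call load when it is configured on the task side.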



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591715#action_12591715 ] 

David Alves commented on HBASE-581:
-----------------------------------

By the way, any ideas on how we should pass the filter arguments through the job conf? It's kind of complicated (different filters take different arguments, and RegExpRowFilter has complex arguments). Maybe we should only solve the extension problem for now? If you agree, I'll submit the patch right now.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "Clint Morgan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591734#action_12591734 ] 

Clint Morgan commented on HBASE-581:
------------------------------------

Can you put binary data in a Configuration? If so then we could use Writable to get the filter's bytes. Otherwise, write the filter to HDFS, and pass along the path in the JobConf?



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Alves updated HBASE-581:
------------------------------

    Attachment: tif-0.patch

tif patch.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592074#action_12592074 ] 

David Alves commented on HBASE-581:
-----------------------------------

The extension problem is not correctly solved. I'm trying to use it and it is not friendly, to say the least. I'm in the process of improving it.



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF is subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-581:
------------------------

    Summary: Allow adding filters to TableInputFormat (At same time, ensure TIF is subclassable)  (was: Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable))

Just have this issue be about TIF, not TOF.



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated HBASE-581:
--------------------------------

    Summary: Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)  (was: Allow adding fitlers to TIF (At same time, ensure TIF and TOF are subclassable))



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592100#action_12592100 ] 

David Alves commented on HBASE-581:
-----------------------------------

Some thoughts before I submit the patch:
TIF should support filters.
Extensions of TIF might not need to get the columns and table name from the JobConf exactly like the current TIF does.

In light of this, I extracted from TableInputFormat (which still exists and has the same API) an abstract class that supports filters.
Subclasses just have to provide the table and the input columns (and optionally the rowFilter).
Also, the table record reader was altered to be extensible, and TIF was changed to support other TableRecordReaders.
Finally, in order to subclass the TIF to add, for example, a filter, one just has to do the following:

[code]

// Extends the new abstract TIF parent class (name elided here).
class myTIFSubClass extends ... {

    @Override
    public void configure(JobConf job) {
        Text myTableName = ...;
        Text[] myCols = ...;
        RowFilterInterface myFilter = ...; // optional, may be null
        super.setHTable(new HTable(new HBaseConfiguration(job), myTableName));
        super.setInputColumns(myCols);
        super.setRowFilter(myFilter);
    }

}
[/code]

What do you think? Shall I submit the patch?



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Alves updated HBASE-581:
------------------------------

    Attachment: tif-v5.patch

Added a new abstract parent class.
Corrected javadoc.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591710#action_12591710 ] 

stack commented on HBASE-581:
-----------------------------

It could.  Or at a minimum it should be overridable (if it's not, that should be fixed as part of this issue).



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592121#action_12592121 ] 

stack commented on HBASE-581:
-----------------------------

At a minimum, the way that the current TIF specifies columns -- a long string that you do a split on spaces -- is a little imperfect.

What are you thinking?  Changing TIF so it's subclassable and providing a new, more flexible abstract TIF?  That sounds good to me (suggest you add an example of how to use it to the class javadoc).
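
To illustrate the space-delimited column specification mentioned above, here is a minimal sketch. The class name ColumnSpec and its parse method are hypothetical and only mimic the idea of splitting one long string on spaces; they are not the TIF code.

```java
/** Hypothetical parser for a space-delimited column list. */
final class ColumnSpec {

    /** Splits a string such as "info:title info:body" into column names. */
    static String[] parse(String spec) {
        String trimmed = spec.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        // Any run of whitespace separates two columns, so a column name
        // that itself contains a space cannot be expressed in this format.
        return trimmed.split("\\s+");
    }
}
```

Passing the columns as an array through an accessor, as the abstract TIF proposes, avoids this limitation.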



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591737#action_12591737 ] 

stack commented on HBASE-581:
-----------------------------

You could, but it'd be a little perverse loading binary data into the Configuration.

I was thinking the filter would just be bundled up in the job jar so it's available on the CLASSPATH (or the user would add it to the cluster CLASSPATH).
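
One way to read this suggestion is that only the filter's class name travels in the configuration and the class itself is loaded from the CLASSPATH. The sketch below assumes that reading. java.util.Properties stands in for the JobConf, RowPredicate is a hypothetical stand-in for RowFilterInterface, and the key name is made up. It only handles filters with a no-argument constructor, which is exactly where filters with complex arguments become awkward.

```java
import java.util.Properties;

/** Hypothetical loader: instantiates a filter class named in the configuration. */
final class FilterLoader {

    /** Hypothetical configuration key holding the filter's class name. */
    static final String KEY = "example.tif.rowfilter.class";

    /** Hypothetical stand-in for RowFilterInterface. */
    interface RowPredicate {
        boolean accept(String rowKey);
    }

    /** Trivial example filter with a no-argument constructor. */
    static final class AcceptAll implements RowPredicate {
        public AcceptAll() {
        }

        @Override
        public boolean accept(String rowKey) {
            return true;
        }
    }

    /** Returns a new filter instance, or null if no filter class is configured. */
    static RowPredicate load(Properties conf) {
        String className = conf.getProperty(KEY);
        if (className == null) {
            return null; // the filter is optional
        }
        try {
            Class<?> cls = Class.forName(className);
            return cls.asSubclass(RowPredicate.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate filter " + className, e);
        }
    }
}
```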



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Alves updated HBASE-581:
------------------------------

    Attachment: tif-v3.patch

Corrected javadoc.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592099#action_12592099 ] 

Jim Kellerman commented on HBASE-581:
-------------------------------------

David,

With respect to formatting, please see http://wiki.apache.org/hadoop/Hbase/HowToContribute about 1/2 way down the page.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592128#action_12592128 ] 

David Alves commented on HBASE-581:
-----------------------------------

I think I finished it. The patch has an abstract parent TIF that contains a default TableRecordReader and receives a table, the columns, and optionally a row filter through accessors. Subclasses only have to set the necessary fields by whatever means they choose.
I just have one doubt: whether I should make the abstract class implement JobConfigurable or expect the subclasses to do so.
I'll include the example subclass in the javadoc as suggested.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590183#action_12590183 ] 

David Alves commented on HBASE-581:
-----------------------------------

Sure. I'll submit it as soon as I can.



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590178#action_12590178 ] 

stack commented on HBASE-581:
-----------------------------

Any chance of your making a patch David -- if only to help the discussion along?



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Alves updated HBASE-581:
------------------------------

    Status: Patch Available  (was: Open)

My first patch :). It solves the extension issue (table, tablename, and cols are now protected, as is the TableRecordReader, to allow for extension) and the number-of-splits issue (the number of splits is now the lesser of numSplits and startKeys.length; splits can cover more than one region and are created as evenly as possible). TestTableIndex is failing (updated yesterday), but I'm almost sure it's not because of my changes.



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF is subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-581:
------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Applied patch to TRUNK (after minor formatting changes).  Thanks for the patch David!



[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591671#action_12591671 ] 

David Alves commented on HBASE-581:
-----------------------------------

On another subject, getSplits always returns the same number of splits, independent of "mapred.map.tasks". This means applications will always have as many maps as there are splits, which can overload HBase (in my experience, 10 maps that use TIF on just two nodes completely overload HBase, meaning not only that the M/R job will fail but also that regionservers go down).
So getSplits() should take "mapred.map.tasks" into account, right?






[jira] Commented: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591725#action_12591725 ] 

stack commented on HBASE-581:
-----------------------------

Yeah, now you mention it, filters are ornery.  Coming up w/ some means of specifying, via config, which filter to use and its args is probably not going to happen.  Making TIF subclassable is probably the way to go here.



[jira] Updated: (HBASE-581) Allow adding filters to TableInputFormat (At same time, ensure TIF and TOF are subclassable)

Posted by "David Alves (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Alves updated HBASE-581:
------------------------------

    Attachment: tif-v4.patch

Corrected javadoc some more.
