Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/07/08 07:06:14 UTC

[jira] Created: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Allow emitting Deletes out of new TableReducer
----------------------------------------------

                 Key: HBASE-1626
                 URL: https://issues.apache.org/jira/browse/HBASE-1626
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack
             Fix For: 0.20.0


Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney updated HBASE-1626:
---------------------------------

    Attachment: table-reduce.patch

I couldn't figure out how to put HTable in Context :) so I added HTable as an extra argument to reduce. This way is probably way too confusing but may give you an idea of what I am talking about.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: deletes.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728874#action_12728874 ] 

Lars George commented on HBASE-1626:
------------------------------------

I checked the above, and if you put the fully qualified class name into the @link it works too, but then you also have to add a short name as the label, like so:

{code}
{@link org.apache.hadoop.hbase.client.Put Put}
{code}

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: reducer.patch

reducer.patch fixes the usage of the old outdated reduce() call.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728842#action_12728842 ] 

Jonathan Gray commented on HBASE-1626:
--------------------------------------

Can we add more documentation / class comments / etc?

Our TableReduce now just takes a Writable but there's not really any documentation in the class about what the intent is / how it should be used.

We ignore Key in both Map input and Reduce output?  If so, let's advertise that in javadocs all over the place.  Our comments are super generic but our usage/implementation is not generic.

We just need to be explicit and verbose in the comments about how this should be used.

For example:

{noformat}
+   * Writes the reducer output to an HBase table.
+   * 
+   * @param <KEY>  The type of the key.
{noformat}

This KEY is actually not used for anything, anywhere.  Why don't we say that?  Are there actually cases it could be used?

And we don't make mention that the output Value is fixed as a Writable, and that this Writable must be either a Put or Delete.


Comments aside, I think the patch is good.  Base class for Put/Delete would be more explicit, but easier to just use Writable with lots of javadoc.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Assigned: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-1626:
----------------------------

    Assignee: Lars George

Assigning lars so he'll take a look at this issue.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: deletes.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment:     (was: 1626-v5.patch)

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v3.patch

Patch 1626-v3.patch adds different generics structure as discussed with Stack.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728695#action_12728695 ] 

Lars George commented on HBASE-1626:
------------------------------------

Seems like there are two issues buried here. One is being able to "generalize" the class that is handed into the reduce phase. The other is how to access a table. For the latter - correct me if I am wrong, Doğacan - you seem to have grabbed the wrong end of the stick. Instead of extending TableReducer and making use of a table in the IdentityTableReducer, you leave that as is and simply add a custom TableReducer that creates the table in the "setup()" method, does the puts etc. in the "reduce()" call, and closes/flushes in the "cleanup()" method.

In other words, you do not need to do anything but create a simple job that uses IdentityTableReducer together with TableOutputFormat, which takes care of the table.put(). Unless I am missing something, that is pretty much what you are doing. Use the TableMapReduceUtil class to set up the job and also the name of the table etc.

The crucial part is abstracting the type of the class the reducer actually receives, so instead of assuming a Put it should accept a Delete as well if possible. I think Stack has that down 100% in his patch. So with his patch, together with the above classes, you should be fine.

Question for Stack
{code}
+      if (value instanceof Put) this.table.put(new Put((Put)value));
+      else if (value instanceof Delete) this.table.delete(new Delete((Delete)value));
{code}

Why do that and not

{code}
+      if (value instanceof Put) this.table.put((Put) value);
+      else if (value instanceof Delete) this.table.delete((Delete) value);
{code}

Just wondering if there is a reason to create a new object. Are they cached in the framework, so that holding the object reference causes them to be modified before they are written? They are already written to intermediate output during the map/reduce crossover, so they are already copies.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: deletes.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728918#action_12728918 ] 

Doğacan Güney commented on HBASE-1626:
--------------------------------------

Sorry guys, I was out all day.

Last patch looks great!

+1 from me.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: 1626-v10.patch, 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626-v9.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1626:
-------------------------

    Attachment: deletes.patch

Doğacan had an interesting idea where hbase subclassed the new hadoop context.

A lesser suggestion was to add a Marker interface to Delete and Put.

Playing around with it, a Marker interface would be a bit messy.  It'd have to be in the client package and be public.  It'd have to be something like Update or UpdateMarker (yuck).  But maybe this is the way to go?

Otherwise, make the hbase reducer generic so we can pass anything -- e.g. a Delete or a Put -- and not specify types, which might not be that bad.  Here's why.

Looking at the TableOutputFormat, it turns out the key is not used at all, so there is no need to specify a type for it; i.e. ImmutableBytesWritable.  And for the value, we could leave it as Writable.  If it is not a Put or Delete, throw an IOE.  It means we lose some of the power of generics.

This is what the attached patch does.

Lars, any chance you'd take a look?  Do we even need a TableReducer if it's all generic types?  I can see the point of the TableMapper because the value will be Result (here too, Key is redundant...).  Is it wrong to specialize the TableOutputFormat?  Rather, leave it generic (though value must be a Delete or a Put)?
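
The dispatch described above (ignore the key, accept any Writable, throw an IOE for anything that is not a Put or a Delete) might look roughly like the following. This is only a sketch, not the code from deletes.patch: Writable, Put, Delete, Other and FakeTable are hypothetical stand-ins so the example compiles without Hadoop or HBase on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class WritableDispatchSketch {
  // Hypothetical stand-ins for org.apache.hadoop.io.Writable and the HBase client classes.
  interface Writable {}
  static final class Put implements Writable {
    final String row;
    Put(String row) { this.row = row; }
  }
  static final class Delete implements Writable {
    final String row;
    Delete(String row) { this.row = row; }
  }
  static final class Other implements Writable {}

  // Hypothetical stand-in for the table; it only records what it was asked to do.
  static final class FakeTable {
    final List<String> log = new ArrayList<>();
    void put(Put p) { log.add("put:" + p.row); }
    void delete(Delete d) { log.add("delete:" + d.row); }
  }

  // The key is accepted but never looked at, so its type is left fully generic.
  static <KEY> void write(FakeTable table, KEY key, Writable value) throws IOException {
    if (value instanceof Put) {
      table.put((Put) value);
    } else if (value instanceof Delete) {
      table.delete((Delete) value);
    } else {
      String type = (value == null) ? "null" : value.getClass().getName();
      throw new IOException("Pass a Delete or a Put, got " + type);
    }
  }

  public static void main(String[] args) throws IOException {
    FakeTable table = new FakeTable();
    write(table, "any key type", new Put("row1"));
    write(table, 42, new Delete("row2"));
    System.out.println(table.log); // [put:row1, delete:row2]
  }
}
```

The cost is the one already named: a wrong value type is only caught at run time, not by the compiler.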

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.0
>
>         Attachments: deletes.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Resolved: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1626.
--------------------------

      Resolution: Fixed
    Release Note: Undoes a lot of the stipulation regarding types coming out of TableMapper and in and out of TableReducer going into TableOutputFormat.  See javadoc.
    Hadoop Flags: [Reviewed]

Committed.  Thanks for the patch Lars.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v10.patch, 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626-v9.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728699#action_12728699 ] 

Lars George commented on HBASE-1626:
------------------------------------

Oops, looking at the IdentityTableReducer, I think the issue is that it has the same bug I had in my own classes, i.e. it is missing the @Override on "reduce()", so nobody noticed that the method is never called at all! That means, I think, that Doğacan extended the classes because they did not produce the expected results. I'll attach a patch to fix this in the existing mapreduce code; please apply that first before applying the generics patch.
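
To illustrate why a missing @Override hides this kind of bug, here is a small sketch. It assumes the mismatch was an old-style Iterator parameter where the new API expects an Iterable; Reducer, BrokenReducer and FixedReducer are hypothetical stand-ins, not the Hadoop or HBase classes.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class OverrideSketch {
  // Hypothetical stand-in for a framework reducer whose default reduce() is an identity.
  static class Reducer<K, V> {
    final StringBuilder out = new StringBuilder();

    protected void reduce(K key, Iterable<V> values) {
      for (V v : values) out.append("identity:").append(v).append(' ');
    }

    // The "framework" only ever calls the Iterable variant.
    final void run(K key, List<V> values) { reduce(key, values); }
  }

  static class BrokenReducer extends Reducer<String, Integer> {
    // Different parameter type, so this is a new overload, not an override.
    // With @Override on it the compiler would reject the class.
    protected void reduce(String key, Iterator<Integer> values) {
      while (values.hasNext()) out.append("custom:").append(values.next()).append(' ');
    }
  }

  static class FixedReducer extends Reducer<String, Integer> {
    @Override
    protected void reduce(String key, Iterable<Integer> values) {
      for (Integer v : values) out.append("custom:").append(v).append(' ');
    }
  }

  public static void main(String[] args) {
    BrokenReducer broken = new BrokenReducer();
    broken.run("k", Arrays.asList(1, 2));
    System.out.println(broken.out.toString().trim()); // identity:1 identity:2

    FixedReducer fixed = new FixedReducer();
    fixed.run("k", Arrays.asList(1, 2));
    System.out.println(fixed.out.toString().trim()); // custom:1 custom:2
  }
}
```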

Also, while the key is not used, users could still extend our classes and make use of it, so I am not sure whether we should touch it or not. If you leave TableReducer generic then there is really no need for it anymore. Could it be left like this?

{code}
public abstract class TableReducer<KEYIN, VALUEIN>
extends Reducer<KEYIN, VALUEIN, ImmutableBytesWritable, Writable> {
{code}

Then defining a reducer class 

{code}
public class IdentityTableReducer 
extends TableReducer<ImmutableBytesWritable, Writable> {
{code}

You still have Puts in there, so I guess those should also be changed to Writables?
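
For what it is worth, the two declarations above can be tried out in isolation with something like the sketch below. All types here are hypothetical stand-ins that only mirror the generic parameters (the real Reducer takes a Context and is not modelled); the point is merely that an identity reducer typed on Writable can pass through Puts and Deletes alike.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TableReducerGenericsSketch {
  // Hypothetical stand-ins for the Hadoop/HBase types named in the declarations.
  interface Writable {}
  static final class Put implements Writable {
    final String row;
    Put(String row) { this.row = row; }
  }
  static final class Delete implements Writable {
    final String row;
    Delete(String row) { this.row = row; }
  }
  static final class ImmutableBytesWritable {
    final byte[] bytes;
    ImmutableBytesWritable(byte[] bytes) { this.bytes = bytes; }
  }

  // Simplified reducer: collects emitted values in a list instead of writing to a Context.
  static abstract class Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    final List<VALUEOUT> emitted = new ArrayList<>();
    void write(KEYOUT key, VALUEOUT value) { emitted.add(value); }
    abstract void reduce(KEYIN key, Iterable<VALUEIN> values);
  }

  // Input types stay open; output types are pinned as in the declaration above.
  static abstract class TableReducer<KEYIN, VALUEIN>
      extends Reducer<KEYIN, VALUEIN, ImmutableBytesWritable, Writable> {}

  static final class IdentityTableReducer
      extends TableReducer<ImmutableBytesWritable, Writable> {
    @Override
    void reduce(ImmutableBytesWritable key, Iterable<Writable> values) {
      for (Writable value : values) write(key, value);
    }
  }

  public static void main(String[] args) {
    IdentityTableReducer reducer = new IdentityTableReducer();
    List<Writable> values = Arrays.<Writable>asList(new Put("row1"), new Delete("row2"));
    reducer.reduce(new ImmutableBytesWritable(new byte[] {1}), values);
    System.out.println(reducer.emitted.size()); // 2
  }
}
```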


> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: deletes.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Assigned: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-1626:
----------------------------

    Assignee: Lars George  (was: Doğacan Güney)

Assigning back to Lars, the man that did the work.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v10.patch, 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626-v9.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v6.patch

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Assigned: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-1626:
----------------------------

    Assignee: Doğacan Güney  (was: Lars George)

So Doğacan will take a look

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v5.patch

Patch 1626-v5.patch combines Stack details with proper ITR class, using solely Writable.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v5.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728803#action_12728803 ] 

Lars George commented on HBASE-1626:
------------------------------------

Doğacan please let me know if I (we) can help you get the MR job sorted out. 

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728870#action_12728870 ] 

stack commented on HBASE-1626:
------------------------------

TR imports Put and Delete though doesn't use them.

Do you think we should cache the new TableOutputCommitter?  That seems to be what FOF does.  Synchronizes the method and hands out same each time (probably minor).

Otherwise, +1 on patch, good stuff Lars.  I can fix above when I commit.  Let's see what Doğacan says before commit.  Will assign it to him for comment.
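
The caching idea (a synchronized method that creates the committer once and hands out the same instance each time) could be sketched as below. OutputFormat and Committer are hypothetical stand-ins, not the patch's TableOutputFormat or TableOutputCommitter.

```java
class CachedCommitterSketch {
  // Hypothetical stand-in for an output committer.
  static final class Committer {}

  // Hypothetical stand-in for an output format that caches its committer.
  static final class OutputFormat {
    private Committer committer;

    // Synchronized so concurrent callers cannot each create their own instance.
    synchronized Committer getOutputCommitter() {
      if (committer == null) {
        committer = new Committer();
      }
      return committer;
    }
  }

  public static void main(String[] args) {
    OutputFormat format = new OutputFormat();
    System.out.println(format.getOutputCommitter() == format.getOutputCommitter()); // true
  }
}
```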

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1626:
-------------------------

    Attachment: 1626-v5.patch

Add in the delete.  Got rid of a warning in eclipse about not specifying type.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v2.patch

Patch 1626-v2.patch is the same as before but fixes a really outdated JavaDoc comment.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728873#action_12728873 ] 

Lars George commented on HBASE-1626:
------------------------------------

Stack, I imported Put and Delete into TR so the JavaDoc warning would go away in Eclipse. I think this is for ambiguity and could be done also with a fully specified class reference. What do you prefer?

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v9.patch

Patch 1626-v9.patch removes the unneeded imports and adds fully qualified paths to the Put and Delete links.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626-v9.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Commented: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728707#action_12728707 ] 

Lars George commented on HBASE-1626:
------------------------------------

Ah, I see: HBASE-940 added the cloning because the BatchUpdate was being reused. This was copied as is to Put/Delete. All I am asking is whether this is still needed or whether the copy could be saved.
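
As a reminder of what the cloning guards against, here is a minimal sketch of the reuse problem. It assumes the caller refills one mutable instance per record and the receiver buffers what it is handed; Mutation is a hypothetical stand-in, not BatchUpdate, Put or Delete.

```java
import java.util.ArrayList;
import java.util.List;

class ReuseCopySketch {
  // Hypothetical mutable record with a copy constructor, standing in for an update object.
  static final class Mutation {
    String row;
    Mutation(String row) { this.row = row; }
    Mutation(Mutation other) { this.row = other.row; }
  }

  // Buffers one entry per row, either by reference or as a copy, and returns the buffered rows.
  static List<String> buffer(String[] rows, boolean copy) {
    Mutation reused = new Mutation(null); // single instance that is refilled for every record
    List<Mutation> buffered = new ArrayList<>();
    for (String row : rows) {
      reused.row = row;
      buffered.add(copy ? new Mutation(reused) : reused);
    }
    List<String> result = new ArrayList<>();
    for (Mutation m : buffered) result.add(m.row);
    return result;
  }

  public static void main(String[] args) {
    String[] rows = {"a", "b", "c"};
    System.out.println(buffer(rows, false)); // [c, c, c]
    System.out.println(buffer(rows, true));  // [a, b, c]
  }
}
```

If nothing holds on to the object between calls, the copy buys nothing, which is the case being asked about.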

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1626:
-------------------------

    Attachment: 1626-v4.patch

Made the TableReducer key generic, since we don't use it and so don't care what it is.
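
A rough sketch of what a generic output key allows, using hypothetical stand-in types only (`Mutation`, `ValueOnlySink`, `GenericKeyReducer` and its subclasses are not the classes in the patch). It assumes the sink looks at the value and never at the key:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table mutation; NOT an HBase class.
final class Mutation {
  final String row;
  Mutation(String row) { this.row = row; }
}

// Stand-in sink: receives (key, value) pairs but only ever reads the value.
final class ValueOnlySink<K> {
  final List<String> rows = new ArrayList<>();
  void write(K ignoredKey, Mutation value) { rows.add(value.row); }
}

// The output key is a type parameter, so subclasses may pick any key type.
abstract class GenericKeyReducer<KEYIN, VALUEIN, KEYOUT> {
  abstract void reduce(KEYIN key, List<VALUEIN> values, ValueOnlySink<KEYOUT> out);
}

// One reducer passes its String input key through as the output key...
final class StringKeyReducer extends GenericKeyReducer<String, String, String> {
  @Override
  void reduce(String key, List<String> values, ValueOnlySink<String> out) {
    for (String v : values) {
      out.write(key, new Mutation(v));
    }
  }
}

// ...another emits no meaningful key at all.
final class NullKeyReducer extends GenericKeyReducer<Long, String, Void> {
  @Override
  void reduce(Long key, List<String> values, ValueOnlySink<Void> out) {
    for (String v : values) {
      out.write(null, new Mutation(v));
    }
  }
}
```

Both reducers leave the same rows in the sink, because the key never reaches the table.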

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v7.patch

Patch 1626-v7.patch adds the missing Delete change (sorry, missed that earlier) and adds an empty TableOutputCommitter. Tests are passing now:

{code}
>ant -Dtestcase=mapreduce/Test* compile-core-test test
...
test-core:
   [delete] Deleting directory C:\workspace\hbase-trunk\build\test\logs
    [mkdir] Created dir: C:\workspace\hbase-trunk\build\test\logs
    [junit] Running org.apache.hadoop.hbase.mapreduce.TestTableIndex
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 66.765 sec
    [junit] Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 71.688 sec
    [junit] Running org.apache.hadoop.hbase.mapreduce.TestTimeRangeMapRed
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 28.64 sec

test:

BUILD SUCCESSFUL
Total time: 2 minutes 57 seconds
{code}

Also added comments to the classes about what the class types should be, as per Jon's comment, though I assume this does not yet cover every place where they could be added. Please suggest where else you would like to see them.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626.patch

Patch 1626.patch combines the fix to IdentityTableReduce and applies the changes Stack did to allow Puts and Deletes to be passed to the TOF. I kept the key type as is and changed only the value type. That way the change is less intrusive - it is to be discussed whether that is what we want or not. We could still decide to do away with TableReducer - but the TableMapper would stay, so that change would introduce a slight imbalance.

As for no longer enforcing the value type and using a generic Writable instead - I think that is OK, given that an exception is thrown if someone really stuffs that up. As long as we do not have a common base class for Put and Delete, we have to go down this route.
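
The runtime check described above could look roughly like the following sketch. Everything here is a hypothetical stand-in (`Writable`, `Put`, `Delete`, `FakeTable`, and `SketchRecordWriter` are declared locally and are not the HBase or Hadoop classes), and the exception type and message are assumptions chosen for the illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the real classes.
interface Writable {}

final class Put implements Writable {
  final String row;
  Put(String row) { this.row = row; }
}

final class Delete implements Writable {
  final String row;
  Delete(String row) { this.row = row; }
}

// Stand-in for the table handle; it only records what it was asked to do.
final class FakeTable {
  final List<String> log = new ArrayList<>();
  void put(Put p) { log.add("put:" + p.row); }
  void delete(Delete d) { log.add("delete:" + d.row); }
}

// Sketch of a record writer that accepts any Writable value and checks
// its concrete type at run time, since Put and Delete share no base class.
final class SketchRecordWriter {
  private final FakeTable table;

  SketchRecordWriter(FakeTable table) { this.table = table; }

  // The key is typed as Object only to keep the sketch short; it is not inspected.
  void write(Object key, Writable value) {
    if (value instanceof Put) {
      table.put((Put) value);
    } else if (value instanceof Delete) {
      table.delete((Delete) value);
    } else {
      // Anything else is a programming error in the job, so fail loudly.
      throw new IllegalArgumentException("Pass a Delete or a Put, got: " + value);
    }
  }
}
```

The trade-off is the one noted above: the compiler no longer rejects a wrong value type, so the mistake only surfaces when the job runs and the writer throws.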


> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Lars George
>             Fix For: 0.20.0
>
>         Attachments: 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v10.patch

Patch 1626-v10.patch adds an improved JavaDoc comment to ITR as per Jon.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: 1626-v10.patch, 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626-v9.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.



[jira] Updated: (HBASE-1626) Allow emitting Deletes out of new TableReducer

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated HBASE-1626:
-------------------------------

    Attachment: 1626-v8.patch

Patch 1626-v8.patch adds better JavaDocs with basic examples. Jon, I did not add the comment that the key is not used in ITR, as that is decided in the TOF.

> Allow emitting Deletes out of new TableReducer
> ----------------------------------------------
>
>                 Key: HBASE-1626
>                 URL: https://issues.apache.org/jira/browse/HBASE-1626
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Doğacan Güney
>             Fix For: 0.20.0
>
>         Attachments: 1626-v2.patch, 1626-v3.patch, 1626-v4.patch, 1626-v5.patch, 1626-v6.patch, 1626-v7.patch, 1626-v8.patch, 1626.patch, deletes.patch, reducer.patch, table-reduce.patch
>
>
> Doğacan Güney (nutch) wants to emit Delete from TableReduce.  Currently we only do Put.
