You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Dick King (JIRA)" <ji...@apache.org> on 2006/04/05 01:54:43 UTC

[jira] Created: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Reading an ArrayWriter does not work because valueClass does not get initialized
--------------------------------------------------------------------------------

         Key: HADOOP-120
         URL: http://issues.apache.org/jira/browse/HADOOP-120
     Project: Hadoop
        Type: Bug

  Components: io  
 Environment: Red Hat  
    Reporter: Dick King


If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527308 ] 

Hadoop QA commented on HADOOP-120:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12365761/HADOOP-120-3.patch
against trunk revision r575472.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/754/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/754/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/754/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/754/console

This message is automatically generated.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Doug Cutting
>         Attachments: HADOOP-120-3.patch, hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Dick King (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-120?page=comments#action_12373386 ] 

Dick King commented on HADOOP-120:
----------------------------------

The big deal is probably not so much the space it takes but the TIME it takes to look up a class.

Owen, I'll stop by at 11 to talk about this.  I think we need to generalize the notion of an input ot output class to an input or output _descriptor_ .

Gotta run now -- literally.

-dk


> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>          Key: HADOOP-120
>          URL: http://issues.apache.org/jira/browse/HADOOP-120
>      Project: Hadoop
>         Type: Bug

>   Components: io
>  Environment: Red Hat  
>     Reporter: Dick King
>  Attachments: hadoop-120-fix.patch
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Dick King (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-120?page=comments#action_12373232 ] 

Dick King commented on HADOOP-120:
----------------------------------

I realized that, of course, but I'm not expecting a whole lot of small ArrayWritable's .

Owen O'Malley's solution has the problem that ArrayWritables can themselves contain ArrayWritables.  We might, for example, want to have a two-component ArrayWritable where there are some integers and some UTF8s,  so we need a class

  public class ArrayOfArrayOfSomeIntsAndSomeUTF8s extends ArrayWritable
  {
     ... filling this in gives me a headache
  }

-dk


> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>          Key: HADOOP-120
>          URL: http://issues.apache.org/jira/browse/HADOOP-120
>      Project: Hadoop
>         Type: Bug

>   Components: io
>  Environment: Red Hat  
>     Reporter: Dick King
>  Attachments: hadoop-120-fix.patch
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-120?page=comments#action_12373224 ] 

Owen O'Malley commented on HADOOP-120:
--------------------------------------

After thinking about it for a bit, the problem with this patch is that this is going to encode the typename in each and every record. So if your value type is ArrayWriter<UTF8>, you are going to spend an extra 2+strlen("org.apache.hadoop.io.UTF8") bytes per a record. That's a fair amount of overhead.

We also have to be a careful with the serialization of ArrayWritable because it is used in the DFS name node logs.

I'm not sure what the right solution is. Probably for right now, I would derive a subclass of ArrayWritable that is specific for your type. It isn't pretty, but it is guaranteed to be safe.

public class UTF8Array extends ArrayWritable {
  public UTF8Array() {
     super(UTF8.class);
  }
}

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>          Key: HADOOP-120
>          URL: http://issues.apache.org/jira/browse/HADOOP-120
>      Project: Hadoop
>         Type: Bug

>   Components: io
>  Environment: Red Hat  
>     Reporter: Dick King
>  Attachments: hadoop-120-fix.patch
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-120?page=comments#action_12373234 ] 

Andrzej Bialecki  commented on HADOOP-120:
------------------------------------------

Regarding the class name encoding: you could use the same trick that we use in Nutch, org.apache.nutch.crawl.MapWritable, which uses a sort of dictionary encoding, and as long as you use only the "standard" types the overhead is just 1 byte. For non-standard types, the type name is put into a dictionary (once), and henceforth only 1 byte is used, too.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>          Key: HADOOP-120
>          URL: http://issues.apache.org/jira/browse/HADOOP-120
>      Project: Hadoop
>         Type: Bug

>   Components: io
>  Environment: Red Hat  
>     Reporter: Dick King
>  Attachments: hadoop-120-fix.patch
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-120?page=comments#action_12373408 ] 

Andrzej Bialecki  commented on HADOOP-120:
------------------------------------------

Dictionary is stored together inside the file, so I don't think it's a problem - you can always read it later and restore the data.

Caller may be unaware of exact types stored in a Map-like structure, but it should not prevent him from knowing (and using) portions of data that he knows about, so I think this doesn't violate the convention... ok, maybe bends it a little ;)

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>          Key: HADOOP-120
>          URL: http://issues.apache.org/jira/browse/HADOOP-120
>      Project: Hadoop
>         Type: Bug

>   Components: io
>  Environment: Red Hat  
>     Reporter: Dick King
>  Attachments: hadoop-120-fix.patch
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12528606 ] 

Hudson commented on HADOOP-120:
-------------------------------

Integrated in Hadoop-Nightly #241 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/241/])

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Cameron Pope
>             Fix For: 0.15.0
>
>         Attachments: HADOOP-120-3.patch, hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Dick King (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-120?page=all ]

Dick King updated HADOOP-120:
-----------------------------

    Attachment: hadoop-120-fix.patch

This patch appears to fix the problem.  I don't know whether you will object to the exception modification on ClassNotFoundException .

-dk


> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>          Key: HADOOP-120
>          URL: http://issues.apache.org/jira/browse/HADOOP-120
>      Project: Hadoop
>         Type: Bug

>   Components: io
>  Environment: Red Hat  
>     Reporter: Dick King
>  Attachments: hadoop-120-fix.patch
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Rohan A Mehta (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495070 ] 

Rohan A Mehta commented on HADOOP-120:
--------------------------------------

Hello,
This bug is still not fixed in the latest versions of Hadoop. I wrote a fix similar to the one submitted. But it hasnt been updated in the respository.
Is there some issue with this fix ?

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>         Environment: Red Hat  
>            Reporter: Dick King
>         Assigned To: Owen O'Malley
>         Attachments: hadoop-120-fix.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Dick King (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-120?page=comments#action_12373388 ] 

Dick King commented on HADOOP-120:
----------------------------------

Regarding the dictionary approach ... remember the life of a dictionary is the lifetime of a DFS, not merely a particular run or a DFS file.  Also it does violate the apparent convention that the callER knows the type of the object.  Now maybe that should be rethought, but we should do so explicitly, not back into it.

-dk


> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>          Key: HADOOP-120
>          URL: http://issues.apache.org/jira/browse/HADOOP-120
>      Project: Hadoop
>         Type: Bug

>   Components: io
>  Environment: Red Hat  
>     Reporter: Dick King
>  Attachments: hadoop-120-fix.patch
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Issue Comment Edited: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Cameron Pope (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527245 ] 

camerooni edited comment on HADOOP-120 at 9/13/07 1:27 PM:
--------------------------------------------------------------

Yes, that's definitely a better fix and if anyone is using ArrayWritable's default constructor, it should be easy to change.

      was (Author: camerooni):
    Yes, that's definitely a better fix and if anyone is using ArrayWritable's default constructor, it sould be easy to change.
  
> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Doug Cutting
>         Attachments: HADOOP-120-3.patch, hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Cameron Pope (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527245 ] 

Cameron Pope commented on HADOOP-120:
-------------------------------------

Yes, that's definitely a better fix and if anyone is using ArrayWritable's default constructor, it sould be easy to change.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Doug Cutting
>         Attachments: HADOOP-120-3.patch, hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495075 ] 

Hadoop QA commented on HADOOP-120:
----------------------------------

-1, could not apply patch.

The patch command could not apply the latest attachment http://issues.apache.org/jira/secure/attachment/12324953/hadoop-120-fix.patch as a patch to trunk revision r536583.

Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/133/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.12.3
>         Environment: Red Hat  
>            Reporter: Dick King
>         Assigned To: Owen O'Malley
>         Attachments: hadoop-120-fix.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Cameron Pope (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cameron Pope updated HADOOP-120:
--------------------------------

    Affects Version/s:     (was: 0.12.3)
                       0.15.0
               Status: Patch Available  (was: Open)

I ran into this issue this morning. I'm new to Hadoop so forgive me if I failed to understand the big picture in submitting this patch.

To address, I added some documentation to ArrayWritable indicating that it needs to be subclassed to be used as input to Reducers. I also added a check for valueClass being undefined when calling readFields() - as a relatively new Hadoop user, it took me a while to figure out what the NPE I got in ReflectionUtils was and what to do with it. 

It looks like some DFS datastructures depend on ArrayWritable being the way that it is, so it seems that changing the serialization format would break stuff. It might make sense to create a different ArrayWritable that serialized type information for Hadoop users specifically for MapReduce applications, but this class should probably document its contract with the outside world and throw a specific exception when it's violated. I've also added a unit test to verify the exceptions thrown or not thrown are as expected.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Owen O'Malley
>         Attachments: hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525858 ] 

Hadoop QA commented on HADOOP-120:
----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12365371/hadoop-120-workaround.patch applied and successfully tested against trunk revision r573708.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/715/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/715/console

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Owen O'Malley
>         Attachments: hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-120:
--------------------------------

    Attachment: HADOOP-120-3.patch

Here's the new version that prohibits creation of ArrayWritable's with a null valueClass.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Doug Cutting
>         Attachments: HADOOP-120-3.patch, hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-120:
--------------------------------

    Status: Open  (was: Patch Available)

This patch no longer applies to trunk.  It's also not formatted according to Hadoop's conventions.  Finally, this is not an acceptable approach.  Rather, one could use the workaround suggested by Owen, or we could add a new Writable class that writes the name of the class in each instance, but it would not be back-compatible to make this change to ArrayWritable.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.12.3
>         Environment: Red Hat  
>            Reporter: Dick King
>         Assigned To: Owen O'Malley
>         Attachments: hadoop-120-fix.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-120:
--------------------------------

    Assignee: Doug Cutting  (was: Owen O'Malley)
      Status: Open  (was: Patch Available)

I think we can do this more simply by prohibiting the creation of ArrayWritable's whose valueClass is null.  I'll attach a patch.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Doug Cutting
>         Attachments: hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-120:
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.15.0
         Assignee: Cameron Pope  (was: Doug Cutting)
           Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Cameron.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Cameron Pope
>             Fix For: 0.15.0
>
>         Attachments: HADOOP-120-3.patch, hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Rohan A Mehta (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohan A Mehta updated HADOOP-120:
---------------------------------

    Affects Version/s: 0.12.3
               Status: Patch Available  (was: Open)

Could anyone commit this patch ?

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.12.3
>         Environment: Red Hat  
>            Reporter: Dick King
>         Assigned To: Owen O'Malley
>         Attachments: hadoop-120-fix.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Frédéric Bertin (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-120?page=comments#action_12421827 ] 
            
Frédéric Bertin commented on HADOOP-120:
----------------------------------------

Hello all,

what is current thinking about this bug and the right way to fix it?

What about adding a simple javadoc comment in this class to warn users that it must be derived to be used as a Reducer value? 

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: http://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>         Environment: Red Hat  
>            Reporter: Dick King
>         Attachments: hadoop-120-fix.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Cameron Pope (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cameron Pope updated HADOOP-120:
--------------------------------

    Attachment: hadoop-120-workaround.patch

Patch to document ArrayWritable and throw a specific exception when readFields is undefine.

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Owen O'Malley
>         Attachments: hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-120) Reading an ArrayWriter does not work because valueClass does not get initialized

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-120:
--------------------------------

    Status: Patch Available  (was: Open)

> Reading an ArrayWriter does not work because valueClass does not get initialized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-120
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.15.0
>         Environment: Red Hat  
>            Reporter: Dick King
>            Assignee: Doug Cutting
>         Attachments: HADOOP-120-3.patch, hadoop-120-fix.patch, hadoop-120-workaround.patch
>
>
> If you have a Reducer whose value type is an ArrayWriter it gets enstreamed alright but at reconstruction type when ArrayWriter::readFields(DataInput in) runs on a DataInput that has a nonempty ArrayWriter , newInstance fails trying to instantiate the null class.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.