Posted to dev@hive.apache.org by "Marcin Kurczych (JIRA)" <ji...@apache.org> on 2011/08/23 22:22:29 UTC

[jira] [Created] (HIVE-2400) Update unittests Hadoop version

Update unittests Hadoop version
-------------------------------

                 Key: HIVE-2400
                 URL: https://issues.apache.org/jira/browse/HIVE-2400
             Project: Hive
          Issue Type: Improvement
            Reporter: Marcin Kurczych


Hadoop 0.20.1, which is used in the unit tests, contains bugs that were fixed in later versions of Hadoop. For example:
* har:// connections cannot be indexed by (scheme, authority, username) - the path is significant as well. Caching them in this way limits a hadoop client to opening one archive per filesystem. It seems to be safe not to cache them, since they wrap another connection that does the actual networking.
This was fixed in https://issues.apache.org/jira/browse/HADOOP-6231 .
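
To illustrate the caching problem described above, here is a minimal, self-contained sketch. It does not use Hadoop's actual classes: HarCacheSketch, Handle, and the includePath switch are hypothetical names invented for this example, and the har:// URIs are made up. It only shows why a cache keyed on (scheme, authority, username) hands back the first archive's handle when a second archive on the same filesystem is opened. The includePath variant is there for contrast; the issue text itself suggests simply not caching har connections.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Hadoop code.
final class HarCacheSketch {
    // Stand-in for a filesystem handle; it only remembers which archive it was opened for.
    static final class Handle {
        final String archivePath;
        Handle(String archivePath) { this.archivePath = archivePath; }
    }

    private final Map<String, Handle> cache = new HashMap<>();
    private final boolean includePath;

    // includePath = false mimics a key of (scheme, authority, username) only.
    HarCacheSketch(boolean includePath) { this.includePath = includePath; }

    Handle open(URI uri, String user) {
        String key = uri.getScheme() + "|" + uri.getAuthority() + "|" + user;
        if (includePath) {
            key += "|" + uri.getPath();
        }
        // A cache hit returns the existing handle, even if it belongs to another archive.
        return cache.computeIfAbsent(key, k -> new Handle(uri.getPath()));
    }

    public static void main(String[] args) {
        URI one = URI.create("har://hdfs-namenode:8020/archives/one.har");
        URI two = URI.create("har://hdfs-namenode:8020/archives/two.har");

        HarCacheSketch coarse = new HarCacheSketch(false);
        coarse.open(one, "hive");
        // With the coarse key, the second open is served from the cache entry for one.har.
        System.out.println(coarse.open(two, "hive").archivePath);

        HarCacheSketch withPath = new HarCacheSketch(true);
        withPath.open(one, "hive");
        // With the path in the key, each archive gets its own handle.
        System.out.println(withPath.open(two, "hive").archivePath);
    }
}
```

With the coarse key, a client can effectively open only one archive per (scheme, authority, username) combination, which matches the limitation described in the bullet above.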

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2400) Update unittests Hadoop version

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093950#comment-13093950 ] 

Ning Zhang commented on HIVE-2400:
----------------------------------

@Marcin, can you download Hadoop 0.20.3 and test whether it indeed solves the har issue you mentioned?


        

[jira] [Commented] (HIVE-2400) Update unittests Hadoop version

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093907#comment-13093907 ] 

Ning Zhang commented on HIVE-2400:
----------------------------------

+1 on upgrading to a higher Hadoop version. It seems the Apache repo only holds versions up to 0.20.3, so maybe we need to upgrade to that version. We'll also need to upgrade all the mirrors (Facebook's and Cloudera's).

Any other thoughts?  


        

[jira] [Updated] (HIVE-2400) Update unittests Hadoop version

Posted by "Marcin Kurczych (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcin Kurczych updated HIVE-2400:
----------------------------------

    Description: 
Hadoop 0.20.1, which is used in the unit tests, contains bugs that were fixed in later versions of Hadoop. For example:
* har:// connections cannot be indexed by (scheme, authority, username) - the path is significant as well. Caching them in this way limits a hadoop client to opening one archive per filesystem. It seems to be safe not to cache them, since they wrap another connection that does the actual networking.

This was fixed in https://issues.apache.org/jira/browse/HADOOP-6231 .

  was:
Hadoop 0.20.1, which is used in the unit tests, contains bugs that were fixed in later versions of Hadoop. For example:
* har:// connections cannot be indexed by (scheme, authority, username) - the path is significant as well. Caching them in this way limits a hadoop client to opening one archive per filesystem. It seems to be safe not to cache them, since they wrap another connection that does the actual networking.
This was fixed in https://issues.apache.org/jira/browse/HADOOP-6231 .



        

[jira] [Assigned] (HIVE-2400) Update unittests Hadoop version

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang reassigned HIVE-2400:
--------------------------------

    Assignee: Marcin Kurczych


        

[jira] [Commented] (HIVE-2400) Update unittests Hadoop version

Posted by "Marcin Kurczych (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094999#comment-13094999 ] 

Marcin Kurczych commented on HIVE-2400:
---------------------------------------

I've manually replaced the hadoop-core and hadoop-tools jars with the Hadoop 0.20.3 ones, and almost everything worked: all tests passed, including the new ones that were failing because of Hadoop 0.20.1 bugs. I say "almost" because I ran into one problem: VersionInfo.getVersion() was returning "Unknown", so for testing I hardcoded something like if("Unknown".equals(vers)) vers="0.20.3"; and then everything went perfectly. This must be a problem with the jars; I used the ones from https://repository.apache.org/index.html#nexus-search;quick~hadoop .
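
For reference, the temporary workaround mentioned above could look roughly like the sketch below. This is only an illustration: the class VersionFallbackSketch and the method resolveVersion are hypothetical names, and the reported version is passed in as a plain string so the example runs without any Hadoop jars on the classpath. In the actual test setup the value would come from VersionInfo.getVersion(). The null check is an extra assumption that the comment does not mention.

```java
// Hypothetical sketch of the testing-only workaround; not part of Hive or Hadoop.
final class VersionFallbackSketch {
    /**
     * Returns the reported version, unless it is "Unknown" (or null, an added
     * assumption), in which case the supplied fallback is used instead.
     */
    static String resolveVersion(String reported, String fallback) {
        if (reported == null || "Unknown".equals(reported)) {
            return fallback;
        }
        return reported;
    }

    public static void main(String[] args) {
        // Simulates jars whose version metadata is missing.
        System.out.println(resolveVersion("Unknown", "0.20.3"));
        // A properly reported version is passed through unchanged.
        System.out.println(resolveVersion("0.20.1", "0.20.3"));
    }
}
```

A hardcoded fallback like this is only suitable for local testing; it hides the underlying issue that the jars do not report their version.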
