Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/08/25 03:20:59 UTC

[jira] Created: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

support "add archive" in addition to "add files" and "add jars"
---------------------------------------------------------------

                 Key: HIVE-792
                 URL: https://issues.apache.org/jira/browse/HIVE-792
             Project: Hadoop Hive
          Issue Type: New Feature
            Reporter: Zheng Shao


In JobClient.java, we have:
{code}
    if (commandConf != null) {
      files = commandConf.get("tmpfiles");
      libjars = commandConf.get("tmpjars");
      archives = commandConf.get("tmparchives");
    }
{code}

The good thing about "tmparchives" is that the TaskTracker (TT) will automatically unarchive the files, because "tmparchives" goes through DistributedCache.addCacheArchive. The TT won't do that for "tmpfiles".


We should add an "add archive" command that sets "tmparchives".



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-792:
------------------------------

    Attachment: hive-792-2009-08-25.patch


[jira] Commented: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747746#action_12747746 ] 

He Yongqiang commented on HIVE-792:
-----------------------------------

I think it is line 36 of AddResourceProcessor:
if (tokens.length < 2 || (t = SessionState.find_resource_type(tokens[0])) == null)

But I cannot find a way to add a test case that integrates easily with Hive's test framework. Is there an existing test case for "add file" or "add jar"? If so, I can add a similar one.
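
For readers following along, the check quoted above can be illustrated with a minimal Java sketch. This is not the actual AddResourceProcessor or SessionState code: the class AddCommandSketch, its ResourceType enum, and the case-insensitive lookup are assumptions made only to show how a token such as "archive" could be resolved without a dedicated code path per resource type.

```java
import java.util.Locale;

// Hypothetical stand-in for the command check quoted above; not Hive's real classes.
class AddCommandSketch {
    // Assumed set of resource types; the thread only names file, jar and archive.
    enum ResourceType { FILE, JAR, ARCHIVE }

    // Assumed lookup: case-insensitive match of the token against the enum names.
    static ResourceType findResourceType(String token) {
        try {
            return ResourceType.valueOf(token.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return null; // unknown resource type
        }
    }

    // Takes the text that follows "add" and returns the resource type,
    // or null when the type is unknown or no path was given.
    static ResourceType parse(String afterAdd) {
        String[] tokens = afterAdd.trim().split("\\s+");
        ResourceType t;
        if (tokens.length < 2 || (t = findResourceType(tokens[0])) == null) {
            return null;
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(parse("archive /tmp/dict.tar.gz"));
        System.out.println(parse("archive"));
    }
}
```

Under these assumptions, supporting a new resource type only requires a new enum constant, which matches the observation that the generic lookup already covers "add archive".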


[jira] Commented: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747225#action_12747225 ] 

Zheng Shao commented on HIVE-792:
---------------------------------

https://svn.apache.org/viewvc/hadoop/common/branches/branch-0.17/src/java/org/apache/hadoop/mapred/JobClient.java?view=co
In Hadoop 0.17, "add jars" uses "tmpjars" which goes to "DistributedCache.addArchiveToClassPath".


[jira] Updated: (HIVE-792) support "add archive" in addition to "add file" and "add jar"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-792:
----------------------------

    Summary: support "add archive" in addition to "add file" and "add jar"  (was: support "add archive" in addition to "add files" and "add jars")


[jira] Commented: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747224#action_12747224 ] 

Zheng Shao commented on HIVE-792:
---------------------------------

https://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapred/JobClient.java?view=co
In Hadoop trunk, "add jars" uses "tmpjars" which goes to "DistributedCache.addFileToClassPath".


[jira] Commented: (HIVE-792) support "add archive" in addition to "add file" and "add jar"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747773#action_12747773 ] 

Zheng Shao commented on HIVE-792:
---------------------------------

+1. Will commit if the tests pass.


[jira] Commented: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747218#action_12747218 ] 

Todd Lipcon commented on HIVE-792:
----------------------------------

Note that a lot of the time you want to add a jar to the classpath without unpacking it. In that case you want "addFileToClassPath". Unpacking is reasonably slow, as is the recursive delete when you're done.


[jira] Assigned: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao reassigned HIVE-792:
-------------------------------

    Assignee: He Yongqiang


[jira] Commented: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747562#action_12747562 ] 

Zheng Shao commented on HIVE-792:
---------------------------------

@hive-792-2009-08-25.patch:
Line 9: archive -> archives: to keep all confs in sync.

How is the user going to add an archive? We need an "add archive" command.
We also need a unit test (if possible).


[jira] Commented: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747744#action_12747744 ] 

Zheng Shao commented on HIVE-792:
---------------------------------

Can you point me to where the string "archive" is matched in the code?
Also, is it possible to add a test case?


[jira] Commented: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747169#action_12747169 ] 

Zheng Shao commented on HIVE-792:
---------------------------------

Details: (from ExecDriver.initialize(...) and JobClient.java)

"add file" uses "tmpfiles" which goes to "DistributedCache.addCacheFile"
"add jars" uses "tmpjars" which goes to "DistributedCache.addArchiveToClassPath"
"add archives" should use "tmparchives" which goes to "DistributedCache.addCacheArchive"
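
The mapping above can be illustrated with a minimal Java sketch. It is not the Hive or Hadoop code: the class ResourceConfSketch, the enum ResourceKind, and the example paths are hypothetical, and a plain Map stands in for the job configuration. It also assumes that each of the three conf values is a comma-separated list of paths.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: ties each "add <type>" resource kind to the conf key named in this thread.
class ResourceConfSketch {
    enum ResourceKind {
        FILE("tmpfiles"),       // per the thread: goes to DistributedCache.addCacheFile
        JAR("tmpjars"),         // per the thread: added to the task classpath
        ARCHIVE("tmparchives"); // per the thread: goes to DistributedCache.addCacheArchive

        final String confKey;

        ResourceKind(String confKey) {
            this.confKey = confKey;
        }
    }

    // Appends a path to the comma-separated value stored under the kind's conf key
    // (the comma-separated format is an assumption of this sketch).
    static void addResource(Map<String, String> conf, ResourceKind kind, String path) {
        conf.merge(kind.confKey, path, (existing, added) -> existing + "," + added);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>(); // stand-in for the job conf
        addResource(conf, ResourceKind.FILE, "/tmp/lookup.txt");
        addResource(conf, ResourceKind.ARCHIVE, "/tmp/dict.tar.gz");
        addResource(conf, ResourceKind.ARCHIVE, "/tmp/models.zip");
        System.out.println(conf);
    }
}
```

Keeping the conf key on the enum constant means the three resource kinds cannot drift out of sync with their keys, which is the same concern raised in the review of the first patch.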


[jira] Updated: (HIVE-792) support "add archive" in addition to "add files" and "add jars"

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-792:
------------------------------

    Attachment: hive-792-2009-08-26.patch

hive-792-2009-08-26.patch integrates Zheng's comments (thanks, Zheng!). I think "add archive" is already supported by AddResourceProcessor, so we do not need to add extra code for that.


[jira] Resolved: (HIVE-792) support "add archive" in addition to "add file" and "add jar"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao resolved HIVE-792.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.5.0
     Release Note: HIVE-792. Support "add archive" in addition to "add file" and "add jar". (He Yongqiang via zshao)
     Hadoop Flags: [Reviewed]

Committed. Thanks Yongqiang!
