Posted to dev@hive.apache.org by "He Yongqiang (JIRA)" <ji...@apache.org> on 2011/08/31 02:33:09 UTC
[jira] [Created] (HIVE-2422) remove the intermediate dir of one hive query when it finish
remove the intermediate dir of one hive query when it finish
-------------------------------------------------------------
Key: HIVE-2422
URL: https://issues.apache.org/jira/browse/HIVE-2422
Project: Hive
Issue Type: Bug
Reporter: He Yongqiang
Currently, a Hive query may be compiled into two MR jobs, with the first job's output feeding the second job. When the query finishes, the first job's output should be removed.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-2422) remove the intermediate dir of one hive query when it finish
Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
He Yongqiang reassigned HIVE-2422:
----------------------------------
Assignee: He Yongqiang
[jira] [Updated] (HIVE-2422) remove the intermediate dir when the hive query finish
Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
He Yongqiang updated HIVE-2422:
-------------------------------
Summary: remove the intermediate dir when the hive query finish (was: remove the intermediate dir of one hive query when it finish )
[jira] [Commented] (HIVE-2422) remove the intermediate dir when the hive query finish
Posted by "Anurag Tangri (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400063#comment-13400063 ]
Anurag Tangri commented on HIVE-2422:
-------------------------------------
It looks like the scratch dir is not being cleaned up in a lot of locations. Another such location:
1. ExecDriver.java's execute() function.
Here, if the scratch dir is created before launching a job and the job launch fails, it is not cleaned up in the exception handler before returning:
try {
  if (ctx == null) {
    ctx = new Context(job);
    ctxCreated = true;
  }
  emptyScratchDirStr = ctx.getMRTmpFileURI();
  emptyScratchDir = new Path(emptyScratchDirStr);
  FileSystem fs = emptyScratchDir.getFileSystem(job);
  fs.mkdirs(emptyScratchDir);
} catch (IOException e) {
  e.printStackTrace();
  console.printError("Error launching map-reduce job", "\n"
      + org.apache.hadoop.util.StringUtils.stringifyException(e));
  return 5;
}
Here, ctx.clear() needs to be called in the exception handler.
-Anurag Tangri
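The cleanup suggested in the comment above can be illustrated with a minimal, self-contained sketch. It does not use Hive's classes: ScratchContext and LaunchSketch are hypothetical stand-ins built on java.nio.file, the failSetup flag is an assumed switch that simulates a launch error, and only the return code 5 and the idea of calling clear() in the exception handler come from the quoted snippet.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for a query context that owns one scratch directory.
// It is NOT Hive's Context class; only the clear() idea is borrowed.
final class ScratchContext {
    private final Path scratchDir;

    ScratchContext(Path parent) throws IOException {
        this.scratchDir = Files.createTempDirectory(parent, "scratch-");
    }

    Path getScratchDir() {
        return scratchDir;
    }

    // Recursively deletes the scratch directory; a no-op if it is already gone.
    void clear() throws IOException {
        if (!Files.exists(scratchDir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(scratchDir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}

final class LaunchSketch {
    // Returns 0 if setup succeeds and 5 if it fails, mirroring the quoted snippet.
    // failSetup is a hypothetical switch that simulates an error during job launch.
    static int execute(Path parent, boolean failSetup) {
        ScratchContext ctx = null;
        try {
            ctx = new ScratchContext(parent);
            Files.createDirectories(ctx.getScratchDir().resolve("empty"));
            if (failSetup) {
                throw new IOException("simulated launch failure");
            }
            return 0; // the scratch dir stays in place for the job to use
        } catch (IOException e) {
            System.err.println("Error launching map-reduce job: " + e.getMessage());
            if (ctx != null) {
                try {
                    ctx.clear(); // the cleanup that the comment says is missing
                } catch (IOException cleanupError) {
                    e.addSuppressed(cleanupError);
                }
            }
            return 5;
        }
    }
}
```

In this sketch the success path deliberately leaves the scratch dir in place, since a caller would still need it while the job runs and would be expected to clear the context afterwards.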
[jira] [Commented] (HIVE-2422) remove the intermediate dir when the hive query finish
Posted by "Priyadarshini (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160067#comment-13160067 ]
Priyadarshini commented on HIVE-2422:
-------------------------------------
I executed this query:
select a.rollNo,b.rollNo from student a join student b on a.rollNo=b.rollNo group by a.rollNo,b.rollNo;
The above query spawned 2 MR jobs.
After the query finishes executing, the org.apache.hadoop.hive.ql.Context.clear() method deletes the ScratchDir of the query.
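The behaviour this issue asks for, where the first stage's output is removed once the whole query is done, can be sketched with a try/finally around two stages. This is only an illustration under stated assumptions: TwoStageQuery, stageOne, stageTwo, and the part-00000 file name are hypothetical, local files stand in for MR job output, and none of it is Hive's implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical two-stage "query": stage one writes intermediate output,
// stage two consumes it, and the intermediate dir is removed afterwards.
final class TwoStageQuery {

    // Stand-in for the first job: writes its output into the intermediate dir.
    static void stageOne(List<String> input, Path intermediateDir) throws IOException {
        Files.write(intermediateDir.resolve("part-00000"), input);
    }

    // Stand-in for the second job: reads stage one's output and groups it
    // into distinct, sorted values.
    static List<String> stageTwo(Path intermediateDir) throws IOException {
        return Files.readAllLines(intermediateDir.resolve("part-00000")).stream()
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    // Runs both stages and always removes the intermediate dir,
    // whether the query succeeds or throws.
    static List<String> run(List<String> input, Path scratchRoot) throws IOException {
        Path intermediateDir = Files.createTempDirectory(scratchRoot, "stage1-");
        try {
            stageOne(input, intermediateDir);
            return stageTwo(intermediateDir);
        } finally {
            deleteRecursively(intermediateDir);
        }
    }

    static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

Putting the delete in a finally block means the intermediate dir is removed on both the success path and the failure path, which covers the case raised elsewhere in this thread where an error during launch leaves the scratch dir behind.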