Posted to dev@hive.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2009/05/17 07:17:45 UTC

[jira] Created: (HIVE-489) compiler should check for validity of output paths before job submission

compiler should check for validity of output paths before job submission
------------------------------------------------------------------------

                 Key: HIVE-489
                 URL: https://issues.apache.org/jira/browse/HIVE-489
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
            Reporter: Joydeep Sen Sarma


A couple of hours after a job had run, I found that the move operation failed (because the output directory did not exist and move doesn't create parent directories automatically). No Trash, so the output is gone. Hive should have barfed on this in the compile phase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710177#action_12710177 ] 

Joydeep Sen Sarma commented on HIVE-489:
----------------------------------------

ok - that covers my problem - i can't think of very many reasons other than the parent dir issue why things would fail - perhaps things like permissions, if the hadoop instance is using them.

otherwise - let's dup this one ..


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713345#action_12713345 ] 

Prasad Chakka commented on HIVE-489:
------------------------------------

yeah, i didn't realize 489 was for directories and not partitions. 488 doesn't help with directories.


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713337#action_12713337 ] 

Joydeep Sen Sarma commented on HIVE-489:
----------------------------------------

i don't see any code for the insert overwrite case:
- genFileSinkPlan generates a loadFileDesc
- MoveTask does not mkdir on parent for loadFileDesc

the changes in 488 only seem to cover loads into tables and partitions, if i am not mistaken?
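
A minimal sketch of the "mkdir on parent" step mentioned above, for illustration only. It uses java.nio.file on a local filesystem rather than the Hadoop FileSystem API, and the class and method names (MoveWithParents, moveCreatingParents) are hypothetical, not Hive code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MoveWithParents {
    // Hypothetical helper: make sure the destination's parent directories
    // exist, then move the source there. A plain move fails when the
    // parent directory is missing.
    public static void moveCreatingParents(Path src, Path dst) throws IOException {
        Path parent = dst.toAbsolutePath().getParent();
        if (parent != null) {
            // Creates all missing ancestors; does nothing if they already exist.
            Files.createDirectories(parent);
        }
        Files.move(src, dst);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("move-sketch");
        Path src = Files.createFile(tmp.resolve("job-output"));
        Path dst = tmp.resolve("missing/parent/final-output");
        moveCreatingParents(src, dst);
        System.out.println("moved: " + Files.exists(dst));
    }
}
```

Creating the parents only at move time still lets the job run before any path problem surfaces, which is the gap this issue is about.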


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713115#action_12713115 ] 

Prasad Chakka commented on HIVE-489:
------------------------------------

488 only creates the parent directory before moving the data (both insert overwrite and load), but it doesn't validate the output path before executing the job. Hive should validate the output path while generating the plan so that the job isn't executed unnecessarily.
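
A rough sketch of what such a plan-time check could look like, for illustration only. It assumes a local filesystem via java.nio.file instead of the Hadoop FileSystem API, and OutputPathCheck / validateOutputPath are hypothetical names, not part of Hive:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public class OutputPathCheck {
    // Hypothetical check to run before a job is submitted.
    // Returns an error message if the output path is unusable, empty otherwise.
    public static Optional<String> validateOutputPath(Path output) {
        Path parent = output.toAbsolutePath().getParent();
        if (parent == null) {
            return Optional.of("output path has no parent directory: " + output);
        }
        if (!Files.isDirectory(parent)) {
            return Optional.of("parent directory does not exist: " + parent);
        }
        if (!Files.isWritable(parent)) {
            return Optional.of("parent directory is not writable: " + parent);
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempDirectory("check-sketch");
        // Parent exists and is writable: no error expected.
        System.out.println(validateOutputPath(tmp.resolve("out")));
        // Parent is missing: fail here instead of after the job has run.
        System.out.println(validateOutputPath(tmp.resolve("missing/out")));
    }
}
```

The point of checking at this stage is that a bad path is rejected in seconds rather than after hours of job execution.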


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710168#action_12710168 ] 

Prasad Chakka commented on HIVE-489:
------------------------------------

HIVE-488 should fix the issue of not creating the parent directories.


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710172#action_12710172 ] 

Prasad Chakka commented on HIVE-489:
------------------------------------

shouldn't matter. both queries use MoveTask underneath.


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710171#action_12710171 ] 

Joydeep Sen Sarma commented on HIVE-489:
----------------------------------------

this happened on insert overwrite directory though ..


[jira] Commented: (HIVE-489) compiler should check for validity of output paths before job submission

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712890#action_12712890 ] 

Joydeep Sen Sarma commented on HIVE-489:
----------------------------------------

the fix for 488 doesn't seem to cover the loadFile case, which applies to the 'insert overwrite directory' clause - correct?
