Posted to dev@hive.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2010/05/20 03:05:53 UTC

[jira] Created: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Hive should use NullOutputFormat for hadoop jobs
------------------------------------------------

                 Key: HIVE-1355
                 URL: https://issues.apache.org/jira/browse/HIVE-1355
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
            Reporter: Joydeep Sen Sarma


See https://issues.apache.org/jira/browse/MAPREDUCE-1802

Hive doesn't depend on the Hadoop job output folder; it produces output exclusively via side-effect folders. We should use an OutputFormat that can request that Hadoop skip job setup/cleanup. This could be NullOutputFormat (unless there are objections in Hadoop to changing NullOutputFormat's behavior).

As a small side effect, this also avoids some unnecessary file creates and deletes in HDFS.
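
The reasoning above can be illustrated with a toy Java model. This is only a sketch, not Hadoop or Hive code: the class `SetupCleanupSketch`, the `FakeFileSystem` recorder, the `runJob` method, and both paths are hypothetical names invented for illustration. It assumes that job setup creates a temporary folder under the job output folder and that job cleanup deletes it, and it shows that a job writing only to a side-effect folder issues fewer file system operations once setup/cleanup is skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are hypothetical stand-ins, not Hadoop or Hive classes.
final class SetupCleanupSketch {

    /** Records the file system operations a job would issue instead of performing them. */
    static final class FakeFileSystem {
        final List<String> ops = new ArrayList<>();

        void mkdirs(String path) { ops.add("mkdirs " + path); }

        void delete(String path) { ops.add("delete " + path); }
    }

    /** Runs one modeled job and returns the operations it issued, in order. */
    static List<String> runJob(boolean setupCleanupNeeded) {
        FakeFileSystem fs = new FakeFileSystem();
        String jobTmp = "/job-output/_temporary"; // hypothetical job output temp folder

        if (setupCleanupNeeded) {
            fs.mkdirs(jobTmp); // job setup: create the temp folder nobody will use
        }

        // The tasks write their results to a side-effect folder managed by the
        // query engine itself, so the job output folder is never needed.
        fs.mkdirs("/side-effect/query-1"); // hypothetical side-effect folder

        if (setupCleanupNeeded) {
            fs.delete(jobTmp); // job cleanup: remove the unused temp folder
        }
        return fs.ops;
    }

    public static void main(String[] args) {
        System.out.println("with setup/cleanup:    " + runJob(true));
        System.out.println("without setup/cleanup: " + runJob(false));
    }
}
```

In this model the skipped run keeps only the side-effect write, which is the saving the description refers to.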

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-1355:
------------------------------------

    Attachment: 1355.1.patch

Set NullOutputFormat as the Hive output format.
Set the option to omit setup/cleanup for Hive-submitted jobs.




[jira] Updated: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1355:
---------------------------------

    Issue Type: Improvement  (was: Bug)




[jira] Updated: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1355:
-----------------------------

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Committed. Thanks, Joydeep.




[jira] Assigned: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma reassigned HIVE-1355:
---------------------------------------

    Assignee: Joydeep Sen Sarma




[jira] Updated: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1355:
---------------------------------

    Fix Version/s: 0.6.0




[jira] Commented: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874452#action_12874452 ] 

Namit Jain commented on HIVE-1355:
----------------------------------

+1

Will commit if the tests pass.




[jira] Updated: (HIVE-1355) Hive should use NullOutputFormat for hadoop jobs

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-1355:
------------------------------------

    Status: Patch Available  (was: Open)

