Posted to dev@hive.apache.org by "He Yongqiang (JIRA)" <ji...@apache.org> on 2011/02/09 00:29:58 UTC

[jira] Created: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Hive SymlinkTextInputFormat does not estimate input size correctly
------------------------------------------------------------------

                 Key: HIVE-1978
                 URL: https://issues.apache.org/jira/browse/HIVE-1978
             Project: Hive
          Issue Type: Improvement
            Reporter: He Yongqiang
            Assignee: He Yongqiang




-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992335#comment-12992335 ] 

Namit Jain commented on HIVE-1978:
----------------------------------

It might be simpler to add a .q file test case.
Just load two files (say a1.q and a2.q) into an HDFS directory.
Then load a new file, say foo, for the table 'T'. The contents of the file 'foo' are:

a1.q
a2.q


Then 'T' can be queried.
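
To make the size problem in the issue title concrete, here is a minimal Java sketch of the layout described above. It is not Hive code: local files stand in for HDFS paths, the class and method names (SymlinkSizeSketch, naiveSize, resolvedSize, buildExample) are hypothetical, and keeping a1.q, a2.q and foo in one directory is a simplification of this sketch. It only shows that the size of 'foo' itself says little about the size of the data it points to.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical sketch: local files stand in for HDFS paths.
class SymlinkSizeSketch {

    // Size of the symlink file itself, which is what a naive estimate reports.
    static long naiveSize(Path symlinkFile) throws IOException {
        return Files.size(symlinkFile);
    }

    // Sum of the sizes of the files listed in the symlink file, one path per line.
    static long resolvedSize(Path symlinkFile) throws IOException {
        long total = 0;
        for (String line : Files.readAllLines(symlinkFile, StandardCharsets.UTF_8)) {
            String target = line.trim();
            if (target.isEmpty()) {
                continue; // skip blank lines
            }
            // Assumption of this sketch: relative entries are resolved
            // against the directory that holds the symlink file.
            total += Files.size(symlinkFile.resolveSibling(target));
        }
        return total;
    }

    // Builds the layout from the comment: data files a1.q and a2.q, plus a
    // file 'foo' whose contents are the two file names.
    static Path buildExample(Path dir, int a1Bytes, int a2Bytes) throws IOException {
        Files.write(dir.resolve("a1.q"), new byte[a1Bytes]);
        Files.write(dir.resolve("a2.q"), new byte[a2Bytes]);
        Path foo = dir.resolve("foo");
        Files.write(foo, Arrays.asList("a1.q", "a2.q"), StandardCharsets.UTF_8);
        return foo;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-sketch");
        Path foo = buildExample(dir, 1000, 2500);
        System.out.println("naive=" + naiveSize(foo) + " resolved=" + resolvedSize(foo));
    }
}
```

An estimate based on naiveSize sees only the few bytes of 'foo', while resolvedSize sees the combined size of a1.q and a2.q.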


        

[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1978:
-----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed. Thanks Yongqiang!


        

[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1978:
-------------------------------

    Attachment: HIVE-1978.1.patch


        

[jira] Commented: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992644#comment-12992644 ] 

He Yongqiang commented on HIVE-1978:
------------------------------------

Namit, a .q test file cannot cover what this JIRA does. From a .q file, it is very difficult to tell whether SymlinkTextInputFormat gets the input size correctly.

>> 'getContentSummary' in all existing input formats.
There is no guarantee that the input format comes from Hive, and it is very difficult to change every input format.
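
One way to read this objection is as an argument for an opt-in check: only formats that declare they can report their own size are asked, and every other format, including ones Hive does not own, keeps the plain file-system answer. The Java sketch below illustrates that shape under stated assumptions. It is not the committed patch; the names (SizeAware, SymlinkLikeFormat, ThirdPartyFormat, estimate) are hypothetical, and a map from path to byte count stands in for the file system.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: a "file system" is just a map from path to size in bytes.
class OptionalSummarySketch {

    // Hypothetical opt-in interface for formats whose path size is misleading.
    interface SizeAware {
        long inputSize(String path, Map<String, Long> fs);
    }

    // Stand-in for an input format that Hive does not own and cannot change.
    static class ThirdPartyFormat { }

    // Stand-in for a symlink-style format: a path maps to a list of targets.
    static class SymlinkLikeFormat implements SizeAware {
        private final Map<String, String[]> targets = new HashMap<>();

        void link(String symlinkPath, String... targetPaths) {
            targets.put(symlinkPath, targetPaths);
        }

        @Override
        public long inputSize(String path, Map<String, Long> fs) {
            long total = 0;
            for (String t : targets.getOrDefault(path, new String[0])) {
                total += fs.getOrDefault(t, 0L);
            }
            return total;
        }
    }

    // Only formats that opted in are special-cased; everything else keeps
    // the plain file-system answer, so no third-party format has to change.
    static long estimate(Object format, String path, Map<String, Long> fs) {
        if (format instanceof SizeAware) {
            return ((SizeAware) format).inputSize(path, fs);
        }
        return fs.getOrDefault(path, 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> fs = new HashMap<>();
        fs.put("/data/a1.q", 1000L);
        fs.put("/data/a2.q", 2500L);
        fs.put("/warehouse/t/foo", 22L);

        SymlinkLikeFormat symlink = new SymlinkLikeFormat();
        symlink.link("/warehouse/t/foo", "/data/a1.q", "/data/a2.q");

        System.out.println(estimate(symlink, "/warehouse/t/foo", fs));
        System.out.println(estimate(new ThirdPartyFormat(), "/data/a1.q", fs));
    }
}
```

The cost of this design is the single instanceof check in the caller, which is the "special checking" the alternative proposal tries to avoid.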


        

[jira] Commented: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993267#comment-12993267 ] 

Ning Zhang commented on HIVE-1978:
----------------------------------

+1


        

[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1978:
-------------------------------

    Status: Patch Available  (was: Open)


        

[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1978:
-------------------------------

    Attachment: HIVE-1978.2.patch

Fixed a typo.


        

[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1978:
-----------------------------

    Status: Open  (was: Patch Available)


        

[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1978:
-------------------------------

    Status: Patch Available  (was: Open)


        

[jira] Commented: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992338#comment-12992338 ] 

Namit Jain commented on HIVE-1978:
----------------------------------

Also, it might be simpler to add the new function 'getContentSummary' to all existing
input formats.

You can create a dummy class which all other Hive input formats (other than SymlinkTextInputFormat) extend.
The existing definition can live in that abstract dummy class:

    FileSystem fs = p.getFileSystem(ctx.getConf());
    cs = fs.getContentSummary(p);


That way, you don't need any special checking in Utilities.java: it calls getContentSummary(),
which is implemented by all input formats that Hive supports.
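
The proposal above could be sketched roughly as follows. This is a toy model rather than Hive code: a map from path to byte count replaces FileSystem, and the names SummaryProvider, DefaultSummaryFormat, PlainTextFormat and SymlinkLikeFormat are hypothetical. The shared SummaryProvider interface is an assumption of this sketch, added so the caller has one type to call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: a "file system" is just a map from path to size in bytes.
class BaseClassSketch {

    // Hypothetical common type, so the caller needs no per-format check.
    interface SummaryProvider {
        long getContentSummary(String path, Map<String, Long> fs);
    }

    // The "dummy" abstract class: it holds the existing definition,
    // which simply asks the file system for the size of the path.
    static abstract class DefaultSummaryFormat implements SummaryProvider {
        @Override
        public long getContentSummary(String path, Map<String, Long> fs) {
            return fs.getOrDefault(path, 0L);
        }
    }

    // Ordinary formats extend the dummy class and inherit the default.
    static class PlainTextFormat extends DefaultSummaryFormat { }

    // The symlink-style format does not extend the dummy class; it supplies
    // its own definition that sums the sizes of the files it points to.
    static class SymlinkLikeFormat implements SummaryProvider {
        private final Map<String, List<String>> targets;

        SymlinkLikeFormat(Map<String, List<String>> targets) {
            this.targets = targets;
        }

        @Override
        public long getContentSummary(String path, Map<String, Long> fs) {
            long total = 0;
            for (String t : targets.getOrDefault(path, Collections.<String>emptyList())) {
                total += fs.getOrDefault(t, 0L);
            }
            return total;
        }
    }

    // The caller just calls getContentSummary() on whatever format it has.
    static long estimate(SummaryProvider format, String path, Map<String, Long> fs) {
        return format.getContentSummary(path, fs);
    }

    public static void main(String[] args) {
        Map<String, Long> fs = new HashMap<>();
        fs.put("/data/a1.q", 1000L);
        fs.put("/data/a2.q", 2500L);
        fs.put("/warehouse/t/foo", 22L);

        Map<String, List<String>> links = new HashMap<>();
        links.put("/warehouse/t/foo", java.util.Arrays.asList("/data/a1.q", "/data/a2.q"));

        System.out.println(estimate(new PlainTextFormat(), "/data/a1.q", fs));
        System.out.println(estimate(new SymlinkLikeFormat(links), "/warehouse/t/foo", fs));
    }
}
```

The caller stays free of special cases, but every format passed to it must implement the common type, which only works for input formats that can be modified.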



