Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2010/04/22 03:03:49 UTC

[jira] Created: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

CombineHiveInputFormat throws exception when partition name contains special characters to URI
----------------------------------------------------------------------------------------------

                 Key: HIVE-1317
                 URL: https://issues.apache.org/jira/browse/HIVE-1317
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Ning Zhang
            Assignee: Ning Zhang


If a partition name contains characters such as ':' and '|', which have special meaning in a URI (HDFS uses URI internally for Path), CombineHiveInputFormat throws an exception. The URI was created in CombineHiveInputFormat to check whether a path belongs to a partition in partitionToPathInfo. We should bypass URI creation and use plain string comparison instead.
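
The difference between the two approaches can be illustrated with a minimal sketch. It is not the code from the patch: the class and method names (PartitionPathCheck, parsesAsUri, isUnderPartition) and the sample paths are hypothetical, and it uses plain java.net.URI rather than Hadoop's Path. It only shows that parsing an unescaped '|' as a URI fails, while a string prefix check does not care about the character.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PartitionPathCheck {
    // Returns true if java.net.URI accepts the string as-is (no escaping applied).
    public static boolean parsesAsUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Returns true if `path` is the partition directory itself or lies beneath it.
    // Only string operations are used, so special characters are irrelevant.
    public static boolean isUnderPartition(String path, String partitionDir) {
        // Append a separator so that ".../ds=1" does not match ".../ds=10".
        String prefix = partitionDir.endsWith("/") ? partitionDir : partitionDir + "/";
        return path.equals(partitionDir) || path.startsWith(prefix);
    }

    public static void main(String[] args) {
        String partition = "/warehouse/t/ds=2010-04-21|a"; // hypothetical partition directory
        String file = partition + "/part-00000";
        System.out.println("parses as URI: " + parsesAsUri(partition));
        System.out.println("file under partition: " + isUnderPartition(file, partition));
    }
}
```

Appending the separator before the prefix check is the important detail: without it, a sibling partition whose name merely starts with the same characters would be matched as well.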

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861925#action_12861925 ] 

Ning Zhang commented on HIVE-1317:
----------------------------------

I found a bug in my local testing. I will upload a new patch once it is fixed.

> CombineHiveInputFormat throws exception when partition name contains special characters to URI
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-1317
>                 URL: https://issues.apache.org/jira/browse/HIVE-1317
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>             Fix For: 0.6.0
>
>         Attachments: HIVE-1317.2.patch, HIVE-1317.patch
>
>
> If a partition name contains characters such as ':' and '|' which have special meaning in URI (hdfs uses URI internally for Path), CombineHiveInputFormat throws an exception. URI was created in CombineHiveInputFormat to compare a path belongs to a partition in partitionToPathInfo. We should bypass URI creation by just string comparisons. 



[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

    Attachment: HIVE-1317.4.patch

Uploading a new patch, HIVE-1317.4.patch. Changes include:
1) Removed the code in CombineHiveInputFormatShim that removed hdfs:// when getting splits.
2) The hdfs:// prefix is stripped when constructing CombineFilter, to satisfy the requirement of Hadoop's CombineFileInputFormat.
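
As an illustration of change 2, stripping the scheme and authority can itself be done with string operations only. This is a sketch under assumptions, not the patch: SchemeStripper and stripSchemeAndAuthority are hypothetical names, and the host and port in the example are made up.

```java
public class SchemeStripper {
    // Returns the path part of a location string, e.g.
    // "hdfs://host:port/a/b" -> "/a/b". Bare paths are returned unchanged.
    public static String stripSchemeAndAuthority(String location) {
        int schemeEnd = location.indexOf("://");
        int firstSlash = location.indexOf('/');
        // No "://", or a '/' appears before it: treat the input as a bare path.
        // The second check protects partition names that happen to contain "://".
        if (schemeEnd < 0 || firstSlash < schemeEnd) {
            return location;
        }
        // Skip "://" and the authority; the path starts at the next '/'.
        int pathStart = location.indexOf('/', schemeEnd + 3);
        return pathStart < 0 ? "/" : location.substring(pathStart);
    }

    public static void main(String[] args) {
        System.out.println(
            stripSchemeAndAuthority("hdfs://namenode:9000/warehouse/t/ds=2010-04-21|a"));
    }
}
```

Because no URI object is built, a partition name containing '|' or ':' passes through untouched.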




[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

               Status: Patch Available  (was: Open)
    Affects Version/s: 0.6.0
        Fix Version/s: 0.6.0



[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

    Attachment: HIVE-1317.3.patch

Attaching a new patch, HIVE-1317.3.patch, that fixes the previous bug, which caused a failure when grouping on partitioning columns.



[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

    Attachment: HIVE-1317.patch

Uploading a patch changing the way paths are compared. 



[jira] Resolved: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain resolved HIVE-1317.
------------------------------

    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Committed. Thanks Ning



[jira] Commented: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861824#action_12861824 ] 

Namit Jain commented on HIVE-1317:
----------------------------------

Ning, I am getting a compilation error after applying the patch -

    [javac] symbol  : variable File
    [javac] location: class org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.CombineFilter
    [javac]       pString = p.toString() + File.separator;
    [javac]                                ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 1 error

BUILD FAILED
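
The "symbol : variable File" message suggests that the name File cannot be resolved inside CombineFilter. A likely cause, stated here as an assumption because the patch is not shown, is a missing import of java.io.File. A minimal sketch of the same expression compiling once the import is present (SeparatorDemo and withTrailingSeparator are hypothetical names):

```java
import java.io.File; // without this import, "File.separator" does not compile

public class SeparatorDemo {
    // Appends the platform separator unless the string already ends with it.
    public static String withTrailingSeparator(String dir) {
        return dir.endsWith(File.separator) ? dir : dir + File.separator;
    }

    public static void main(String[] args) {
        System.out.println(withTrailingSeparator("/warehouse/t/ds=1"));
    }
}
```

Note that File.separator depends on the local platform, whereas HDFS paths always use "/" (Hadoop exposes this as Path.SEPARATOR).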




[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

    Attachment:     (was: HIVE-1317.2.patch)



[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

    Attachment: HIVE-1317.2.patch

Updated HIVE-1317.2.patch with the fix. 



[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

    Attachment: HIVE-1317.2.patch

Uploading a new patch HIVE-1317.2.patch. This fixes the test. 

I also talked with Namit offline. CombineHiveInputFormat gets all paths under a partition directory because of CombineFileInputFormat.createPool(job, CombineFilter), in which CombineFilter accepts all files under the partition directory. That is why, in CombineHiveInputFormat, we get the full path names from FileInputFormat.getInputPaths().




[jira] Commented: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859660#action_12859660 ] 

Namit Jain commented on HIVE-1317:
----------------------------------

The test outputs are not deterministic.

About the change in CombineHiveInputFormat: can't the change be pushed to HiveInputFormat instead? CHIF is a subclass of HIF anyway.



[jira] Commented: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864885#action_12864885 ] 

Namit Jain commented on HIVE-1317:
----------------------------------

+1


Will commit if the tests pass.



[jira] Updated: (HIVE-1317) CombineHiveInputFormat throws exception when partition name contains special characters to URI

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1317:
-----------------------------

    Status: Open  (was: Patch Available)
