Posted to dev@hive.apache.org by "Johan Oskarsson (JIRA)" <ji...@apache.org> on 2009/02/24 18:00:15 UTC

[jira] Created: (HIVE-302) Implement "LINES TERMINATED BY"

Implement "LINES TERMINATED BY"
-------------------------------

                 Key: HIVE-302
                 URL: https://issues.apache.org/jira/browse/HIVE-302
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Serializers/Deserializers
            Reporter: Johan Oskarsson
             Fix For: 0.3.0


Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HIVE-302:
---------------------------------

    Fix Version/s:     (was: 0.4.0)
                   0.5.0

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>             Fix For: 0.5.0
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Commented: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793512#action_12793512 ] 

Namit Jain commented on HIVE-302:
---------------------------------

ctas.q has a CTAS with a line delimiter specified, which also breaks; the tests need to be modified.

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch, HIVE-302.2.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Assigned: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-302:
-------------------------------

    Assignee: Zheng Shao

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Commented: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793457#action_12793457 ] 

He Yongqiang commented on HIVE-302:
-----------------------------------

+1. Looks good. Will commit if tests pass.

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-302:
----------------------------

    Attachment: HIVE-302.3.patch

Fixed ctas.q and ran all test cases.

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch, HIVE-302.2.patch, HIVE-302.3.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-302:
----------------------------

    Attachment: HIVE-302.2.patch

This one keeps the old syntax but will throw an error in case anything breaks.


> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch, HIVE-302.2.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-302:
----------------------------

    Status: Patch Available  (was: Open)

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch, HIVE-302.2.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HIVE-302:
---------------------------------

    Fix Version/s:     (was: 0.3.0)
                   0.4.0

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>             Fix For: 0.4.0
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-302:
----------------------------

      Resolution: Fixed
    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Incompatible change])
          Status: Resolved  (was: Patch Available)

Committed. Thanks, Zheng.

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch, HIVE-302.2.patch, HIVE-302.3.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Commented: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793408#action_12793408 ] 

Namit Jain commented on HIVE-302:
---------------------------------

Modified ctas.q to add the following:

create table nzhang_ctas5 row format delimited fields terminated by ',' lines terminated by '.' stored as textfile as select key, value from src sort by key, value limit 10;

select * from nzhang_ctas5;

Got only 1 row as the output.


> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>             Fix For: 0.5.0
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Commented: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793446#action_12793446 ] 

Namit Jain commented on HIVE-302:
---------------------------------

Talked with Zheng offline: it might be easier to raise a semantic error instead, for compatibility.

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Commented: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793435#action_12793435 ] 

Zheng Shao commented on HIVE-302:
---------------------------------

This is not currently fixable because the line terminator is determined by LineRecordReader.LineReader, which lives in Hadoop land.
However, we do support writing to such tables.

To avoid confusion, I will just drop this from the syntax.
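
The read/write asymmetry described above can be illustrated with a small sketch. It is not Hadoop's LineReader code: split_records is a hypothetical helper and the sample rows are made up for illustration. It only shows why data written with '.' as the line terminator comes back as a single row when the reader splits on newlines alone.

```python
def split_records(data, terminator="\n"):
    """Split data into records on the terminator, dropping a trailing empty record.

    Hypothetical helper for illustration only; not part of Hive or Hadoop.
    """
    records = data.split(terminator)
    if records and records[-1] == "":
        records.pop()
    return records


# Made-up rows, written with fields terminated by ',' and lines terminated by '.'
written = "0,val_0.10,val_10.100,val_100."

# A reader that only knows about '\n' sees the whole file as one record.
as_read = split_records(written)

# A reader that honoured the declared terminator would recover three rows.
as_intended = split_records(written, ".")

print(len(as_read), len(as_intended))
```

Under these assumptions the newline-only reader yields one record, which matches the "only 1 row" symptom reported for the ctas.q test.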


> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-302:
----------------------------

    Attachment: HIVE-302.1.patch

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Commented: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793761#action_12793761 ] 

Namit Jain commented on HIVE-302:
---------------------------------

+1

Looks good. Will commit if the tests pass.

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch, HIVE-302.2.patch, HIVE-302.3.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-302:
----------------------------

    Hadoop Flags: [Incompatible change]

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy updated HIVE-302:
----------------------------------

    Issue Type: Bug  (was: Improvement)

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>             Fix For: 0.3.0
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.



[jira] Commented: (HIVE-302) Implement "LINES TERMINATED BY"

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793464#action_12793464 ] 

He Yongqiang commented on HIVE-302:
-----------------------------------

oh, sorry, ignore my previous comment.

> Implement "LINES TERMINATED BY"
> -------------------------------
>
>                 Key: HIVE-302
>                 URL: https://issues.apache.org/jira/browse/HIVE-302
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Johan Oskarsson
>            Assignee: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: HIVE-302.1.patch
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do anything when querying that data. It needs to be implemented to support various datasets that end lines with characters other than just a line break.
