Posted to dev@hive.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2011/08/05 04:57:27 UTC

[jira] [Commented] (HIVE-2303) files with control-A,B are not delimited correctly.

    [ https://issues.apache.org/jira/browse/HIVE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079767#comment-13079767 ] 

Jakob Homan commented on HIVE-2303:
-----------------------------------

+1 on patch.  Always escaping seems reasonable.

> files with control-A,B are not delimited correctly.
> ---------------------------------------------------
>
>                 Key: HIVE-2303
>                 URL: https://issues.apache.org/jira/browse/HIVE-2303
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.8.0
>
>         Attachments: patch-2303.txt
>
>
> The following is from one of our users:
>  
> create external table impressions (imp string, msg string)
>   row format delimited
>     fields terminated by '\t'
>     lines terminated by '\n'
>   stored as textfile                 
>   location '/xxx';
>  
> Some strings in my data contain Control-A, Control-B, etc. as internal delimiters.  If I do a
>  
> Select * from impressions limit 10;
>  
> All fields print correctly.  However, if I do a
>  
> Select * from impressions where msg regexp '.*' limit 10;
>  
> The fields are broken apart at the control characters.  The difference between the two commands is that the latter requires a map-reduce job.
>  
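The failure mode described above can be reproduced outside Hive with a small sketch. It is not Hive's code and not the attached patch. It assumes the map-reduce job writes its intermediate rows as text with Control-A as the field separator (Hive's default text delimiter) and that "always escaping" means prefixing any separator or escape character inside a field with a backslash. The names FIELD_SEP, ESCAPE, escape_field, split_escaped and the sample row are hypothetical.

```python
# Hypothetical illustration only; not taken from Hive or from patch-2303.txt.
FIELD_SEP = "\x01"  # Control-A, assumed to be the intermediate field separator
ESCAPE = "\\"       # assumed escape character


def escape_field(value, sep=FIELD_SEP, esc=ESCAPE):
    """Prefix every separator or escape character inside a field with esc."""
    out = []
    for ch in value:
        if ch == sep or ch == esc:
            out.append(esc)
        out.append(ch)
    return "".join(out)


def split_escaped(line, sep=FIELD_SEP, esc=ESCAPE):
    """Split a line on sep, treating the character after esc as literal data."""
    fields, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == esc:
            literal = next(chars, None)
            if literal is not None:
                current.append(literal)
        elif ch == sep:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


# A two-column row whose second column contains Control-A and Control-B.
row = ["imp-1", "a\x01b\x02c"]

# Without escaping, the embedded Control-A is read back as a column boundary.
naive_fields = FIELD_SEP.join(row).split(FIELD_SEP)

# With escaping on write and unescaping on read, the row round-trips intact.
escaped_line = FIELD_SEP.join(escape_field(f) for f in row)
safe_fields = split_escaped(escaped_line)
```

Under these assumptions, naive_fields comes back with three columns instead of two, while safe_fields equals the original row. That would be consistent with the plain "select *" printing correctly and the query that goes through a map-reduce job breaking the fields.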
