Posted to dev@hive.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2011/07/29 09:32:09 UTC
[jira] [Updated] (HIVE-2303) files with control-A,B are not delimited correctly.
[ https://issues.apache.org/jira/browse/HIVE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HIVE-2303:
------------------------------------------
Attachment: patch-2303.txt
The patch adds an escape property to the default output table.
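
To illustrate why an escape property helps, here is a minimal Python sketch of the failure described in the report quoted below. It is not the code in the patch. It assumes the query result is written with Control-A (\001, Hive's default field delimiter) between columns and split on that character when read back. The helper names serialize and deserialize and the choice of backslash as the escape character are hypothetical and used only for illustration.

```python
CTRL_A = "\x01"  # Hive's default field delimiter
ESC = "\\"       # escape character (an assumption made for this sketch)


def serialize(fields, escape=False):
    """Join the column values of one row into a delimited line."""
    if escape:
        # Escape the escape character first, then any embedded delimiter.
        fields = [
            f.replace(ESC, ESC + ESC).replace(CTRL_A, ESC + CTRL_A)
            for f in fields
        ]
    return CTRL_A.join(fields)


def deserialize(line, escape=False):
    """Split a delimited line back into column values."""
    if not escape:
        return line.split(CTRL_A)
    fields, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == ESC and i + 1 < len(line):
            # An escaped character is taken literally.
            current.append(line[i + 1])
            i += 2
        elif ch == CTRL_A:
            # An unescaped delimiter ends the current field.
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


# A row whose msg column uses Control-A and Control-B internally.
row = ["imp-1", "a\x01b\x02c"]

# Without escaping, the embedded Control-A is read as a column boundary.
broken = deserialize(serialize(row))

# With escaping, the row round-trips unchanged.
fixed = deserialize(serialize(row, escape=True), escape=True)
```

In this sketch the unescaped round trip returns three fields instead of two, which matches the symptom in the report, while the escaped round trip returns the original row.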
> files with control-A,B are not delimited correctly.
> ---------------------------------------------------
>
> Key: HIVE-2303
> URL: https://issues.apache.org/jira/browse/HIVE-2303
> Project: Hive
> Issue Type: Bug
> Reporter: Amareshwari Sriramadasu
> Assignee: Amareshwari Sriramadasu
> Attachments: patch-2303.txt
>
>
> The following is from one of our users:
>
> create external table impressions (imp string, msg string)
> row format delimited
> fields terminated by '\t'
> lines terminated by '\n'
> stored as textfile
> location '/xxx';
>
> Some strings in my data contain Control-A, Control-B, etc. as internal delimiters. If I run
>
> Select * from impressions limit 10;
>
> All fields print correctly. However, if I run
>
> Select * from impressions where msg regexp '.*' limit 10;
>
> The fields are broken apart at the control characters. The difference between the two commands is that the latter requires a map-reduce job.
>
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira