Posted to dev@hive.apache.org by "David Phillips (JIRA)" <ji...@apache.org> on 2008/11/28 16:33:44 UTC

[jira] Commented: (HIVE-40) Hive Deserializer for plain text with separators simple support for quoting

    [ https://issues.apache.org/jira/browse/HIVE-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651588#action_12651588 ] 

David Phillips commented on HIVE-40:
------------------------------------

Is this fixed now? This example is from http://wiki.apache.org/hadoop/Hive/UserGuide:

Apache Access Log Tables

{noformat}CREATE TABLE apachelog (
ipaddress STRING, identd STRING, user STRING, finishtime STRING,
requestline STRING, returncode INT, size INT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe'
WITH SERDEPROPERTIES (
'serialization.format'=
'org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol',
'quote.delim'='("|\\[|\\])',
'field.delim'=' ',
'serialization.null.format'='-')
STORED AS TEXTFILE;{noformat} 
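
As a rough illustration of what the three properties above are meant to achieve together, here is a minimal Python sketch of quote-aware splitting. It is not Hive's implementation and says nothing about how TCTLSeparatedProtocol actually interprets 'quote.delim'. The function name split_quoted, its parameters, and the sample log line are all hypothetical, and it assumes that '"' closes '"' and '[' is closed by ']'.

```python
def split_quoted(line, field_delim=" ",
                 quotes=(('"', '"'), ("[", "]")),
                 null_format="-"):
    """Split a line on field_delim, keeping quoted spans as single fields.

    Illustrative only: the quote pairs and the null marker mirror the
    SERDEPROPERTIES in the DDL above, but the semantics are assumed.
    """
    closers = dict(quotes)   # opening quote -> matching closing quote
    fields = []
    buf = []
    closing = None           # closing quote we are waiting for, if any
    for ch in line:
        if closing is not None:
            if ch == closing:
                closing = None       # end of quoted span, drop the quote
            else:
                buf.append(ch)       # delimiters inside quotes are literal
        elif ch in closers:
            closing = closers[ch]    # start of quoted span, drop the quote
        elif ch == field_delim:
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    fields.append("".join(buf))
    # Map the null marker to None, as 'serialization.null.format' suggests.
    return [None if f == null_format else f for f in fields]


# Hypothetical sample line in Apache common log format.
sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
print(split_quoted(sample))
```

With the sample line, the split yields seven fields, one per column of the apachelog table, with the bracketed timestamp and the quoted request line each kept whole and the lone '-' mapped to a null.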


> Hive Deserializer for plain text with separators simple support for quoting 
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-40
>                 URL: https://issues.apache.org/jira/browse/HIVE-40
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Pete Wyckoff
>
> Hive does currently support things like Apache log format where the separator is " " but strings are quoted. But, to do this, the field separator specified on the command line has to be horrific.
> TCTLSeparatedProtocol could take another parameter QuoteCharacter and then when this is set, respect quoting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.